Again, we load these libraries at the start of almost every script.
# Load Libraries ----
# this is done each time you run a script
library("readxl") # read in excel files
library("tidyverse") # dplyr and piping and ggplot etc
library("lubridate") # dates and times
library("scales") # scales on ggplot axes
library("skimr") # quick summary stats
library("janitor") # clean up excel imports
library("patchwork") # multipanel graphs
Read in the file. This example is a sonde deployment in part of Lake Tanganyika; it is a short cast covering only the upper depths.
# So now we have seen how to look at the data
# What if we wanted to modify the data in terms of columns or rows?
# Let's read in a new file to add some complexity for fun
exo.df <- read_csv("data/lt_exo_2017_01_23_datetimes.csv")
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## date = col_character(),
## time = col_time(format = ""),
## datetime = col_logical(),
## site = col_character(),
## ph = col_double(),
## wtemp_c = col_double(),
## spcond_uscm = col_double(),
## odo_pctsat = col_double(),
## odo_mgl = col_double(),
## turb_ntu = col_double(),
## tss_mgl = col_double(),
## psi = col_double(),
## depth_m = col_double()
## )
head(exo.df)
## # A tibble: 6 x 13
## date time datetime site ph wtemp_c spcond_uscm odo_pctsat odo_mgl
## <chr> <time> <lgl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1/23/2017 12:43:31 NA LTK 6.91 24.2 15.5 104. 8.73
## 2 1/23/2017 12:43:32 NA LTK 6.91 24.2 15.5 104. 8.73
## 3 1/23/2017 12:43:34 NA LTK 6.91 24.2 15.5 104 8.72
## 4 1/23/2017 12:43:36 NA LTK 6.92 24.3 15.5 104 8.71
## 5 1/23/2017 12:43:38 NA LTK 6.93 24.3 15.5 104. 8.71
## 6 1/23/2017 12:43:40 NA LTK 6.93 24.3 15.5 104 8.71
## # … with 4 more variables: turb_ntu <dbl>, tss_mgl <dbl>, psi <dbl>,
## # depth_m <dbl>
Using the mutate command we can overwrite the datetime variable by pasting together the date and time variables with a space as a separator. This creates a character variable, which then needs to be converted to a datetime.
# So when this comes in
# what type of variable is date?
# what type of variable is time?
# What if we wanted to make a datetime column?
# Mutate and paste ----
# sep is the separator and you just list the variables you want to paste together
exo.df <- exo.df %>%
  mutate(datetime = paste(date, time, sep = " "))
You can also go the other way and separate one variable into two.
# what if you wanted to separate these variables?
exo.df <- exo.df %>%
  separate(datetime, c("newdate", "newtime"), sep = " ", remove = FALSE)
# note: if you wanted to separate newdate into "year", "month", "day" what would you do?
exo.df <- exo.df
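One possible approach to the question above, sketched on a small made-up tibble rather than on exo.df: call separate() again with "/" as the separator. The name toy.df and its values are hypothetical and exist only for illustration. Because the dates are written month/day/year, the new column names are listed in that order.

```r
library("tidyverse")

# toy.df is a hypothetical stand-in for exo.df with only the newdate column
toy.df <- tibble(newdate = c("1/23/2017", "1/24/2017"))

# The text is written month/day/year, so name the pieces in that order.
# convert = TRUE turns the pieces from character into integers.
toy.df <- toy.df %>%
  separate(newdate, c("month", "day", "year"),
           sep = "/", remove = FALSE, convert = TRUE)

toy.df
```

With remove = FALSE the original newdate column is kept alongside the three new columns, which makes it easy to check the split by eye.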
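When you want to convert a variable into a Date or datetime (POSIXct) variable, lubridate provides parsing functions whose names spell out the order of the components in the text, for example mdy_hms() for month-day-year followed by hours, minutes, and seconds. The abbreviations are: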
y = year
m = month
d = day
h = hour
m = minute
s = second
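As a small illustration of how these letters combine into function names, here is a sketch using literal strings instead of the sonde data. The example strings are made up; the parser name simply has to match the order in which the components appear in the text.

```r
library("lubridate")

# year-month-day text is parsed with ymd() and gives a Date
d1 <- ymd("2017-01-23")

# day/month/year text is parsed with dmy() and gives the same Date
d2 <- dmy("23/01/2017")

# adding _hm or _hms parses a time as well and gives a POSIXct datetime (UTC by default)
dt1 <- dmy_hm("23/01/2017 12:43")
dt2 <- ymd_hms("2017-01-23 12:43:31")

class(d1)
class(dt2)
```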
# Dates and times -----
# Once you know how to mutate data you can now use lubridate to work with dates
# Sometimes dates and times come in as characters rather than date format
# So we have date and we have datetime but how do we make R understand
# that these are not characters and are POSIXct date times or Dates
# for datetime we do...
exo.df <- exo.df %>%
  mutate(datetime = mdy_hms(datetime))
In R, as in UNIX, a datetime is stored as the number of seconds since 1970-01-01 00:00:00 UTC, and that will come in handy in a few minutes.
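To see this for yourself, the sketch below converts two datetimes to plain numbers and back again. The object names t0 and t1 are made up for illustration.

```r
library("lubridate")

t0 <- ymd_hms("1970-01-01 00:00:00")  # the start of the count (the epoch)
t1 <- ymd_hms("2017-01-23 12:43:31")  # first reading in the sonde file

# as.numeric() exposes the underlying count of seconds
as.numeric(t0)
as.numeric(t1)

# as_datetime() goes the other way, from seconds back to a datetime (UTC)
t1.back <- as_datetime(as.numeric(t1))
t1.back
```

Because a datetime is just a number underneath, ordinary arithmetic such as dividing, rounding, and multiplying works on it.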
# What do you think we would do for the date column?
# Modify the code below
exo.df <- exo.df %>%
  mutate(date = (date))
Sometimes you need to match up data that were recorded within a minute or so of each other. It is usually not possible to match them up perfectly, so rounding the times to the nearest common interval is sometimes necessary. The code below does this by rounding to the nearest 300 seconds (5 minutes).
# How can we modify the datetime to round it to the nearest 5 minutes?
exo.df <- exo.df %>%
  mutate(datetime = ymd_hms(format(
    strptime("1970-01-01", "%Y-%m-%d", tz = "UTC") +
      round(as.numeric(ymd_hms(datetime)) / 300) * 300)))
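The same kind of rounding can also be sketched with lubridate's round_date(), shown here as an alternative on a few made-up timestamps rather than as the method used above. The vector x is hypothetical. The two approaches can differ for a time that falls exactly halfway between two 5-minute marks, because base round() and round_date() do not treat ties the same way.

```r
library("lubridate")

# three hypothetical readings
x <- ymd_hms(c("2017-01-23 12:43:31",
               "2017-01-23 12:47:29",
               "2017-01-23 12:47:31"))

# lubridate version: round to the nearest 5 minutes
rounded <- round_date(x, unit = "5 mins")

# manual version: convert to seconds, round to the nearest 300, convert back
manual <- as_datetime(round(as.numeric(x) / 300) * 300)

rounded
manual
```

Inside a pipeline this would look like mutate(datetime = round_date(datetime, unit = "5 mins")), assuming datetime has already been converted to POSIXct.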