This website should help you get up and running with R and RStudio using the tidyverse. I focus on aquatic sciences and environmental monitoring data, but the approach should be flexible enough for other uses. We will follow the approach of “R for Data Science” by Garrett Grolemund and Hadley Wickham, where we import data and focus on graphing. Statistics will be secondary, but we will cover some of the basics.
This is also a selfish way to organize my code and snippets in a central place where I can find them, and to make them available to others who are struggling with R and dataframes.
Links to many of the example dataframes that I will use.
Example data
Before working on a project, it's useful to sketch out what the data will look like, the names of the variables, and what the final graph and analysis will be. The basics of organization are:
- folders and structure
- file formats
- file structure
- output formats
- installing R and RStudio
- basics of the R interface
- R projects
- installing libraries
The first step after installing R and RStudio is to install packages and learn how to load them. Here I cover the packages I use most often: how to install them, briefly what they do, and how to load them in each session.
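As a rough illustration of the install-then-load pattern, here is a minimal sketch. The package list is only an assumption based on the packages named elsewhere on this page, and the CRAN mirror URL is one common choice, not a requirement.

```r
# Hypothetical package list; swap in whatever your project needs
pkgs <- c("tidyverse", "lubridate", "patchwork")

# Find the packages that are not installed yet
to_install <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

# Install only the missing ones (needed once per computer)
if (length(to_install) > 0) {
  install.packages(to_install, repos = "https://cloud.r-project.org")
}

# Load the libraries (needed at the start of every session)
library(tidyverse)
library(lubridate)
library(patchwork)
```

Installing happens once, while the `library()` calls go at the top of each script so that a fresh session has everything it needs.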
How to load CSV (comma-separated values) files, Excel files, and other formats. Later on I will cover how to import many files and manipulate them as they are being loaded, but that is beyond what we need when starting out.
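A minimal sketch of reading a CSV file, assuming the readr package (part of the tidyverse). The file contents and column names below are made up so that the example runs on its own; in practice you would point `read_csv()` at your own file.

```r
library(readr)

# Write a tiny, made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("lake,depth_m,temp_c",
    "A,1,21.5",
    "A,5,18.2",
    "B,1,22.1"),
  csv_path
)

# Read the file into a dataframe (tibble); column types are guessed from the data
lakes <- read_csv(csv_path, show_col_types = FALSE)
lakes

# Excel files can be read in a similar way with readxl::read_excel("file.xlsx")
```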
This page will cover the basics of graphing with ggplot2 and how it works.
How to make simple statistical graphs, including mean and standard error plots.
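One way a mean and standard error plot might be built is sketched below, using R's built-in `iris` data as a stand-in for real monitoring data. Summarizing first with dplyr and then plotting is an assumption about the workflow; ggplot2 can also compute summaries inside the plot.

```r
library(dplyr)
library(ggplot2)

# Mean and standard error (sd / sqrt(n)) of sepal length for each species
iris_summary <- iris %>%
  group_by(Species) %>%
  summarize(
    mean_length = mean(Sepal.Length),
    se_length = sd(Sepal.Length) / sqrt(n()),
    .groups = "drop"
  )

# Points for the means, error bars for mean +/- one standard error
p <- ggplot(iris_summary, aes(x = Species, y = mean_length)) +
  geom_point(size = 3) +
  geom_errorbar(
    aes(ymin = mean_length - se_length, ymax = mean_length + se_length),
    width = 0.2
  ) +
  labs(x = "Species", y = "Mean sepal length (cm)")

p
```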
This page covers how to do advanced plotting and layouts using Patchwork.
This page covers some of the basics of doing math and summarizing data.
Removing and reordering columns and then filtering data.
Working with dates is never fun unless you use Lubridate.
This will show you how to reshape a dataframe from wide to long and back. It also covers how to spread and unite values…
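A small sketch of the wide-to-long round trip, assuming tidyr's `pivot_longer()` and `pivot_wider()` (the current names for what older code does with `gather()` and `spread()`). The site names and temperatures are invented for illustration.

```r
library(tidyr)

# Made-up wide data: one row per site, one column per year
wide <- data.frame(
  site = c("A", "B"),
  temp_2019 = c(20.1, 18.4),
  temp_2020 = c(21.3, 19.0)
)

# Wide to long: one row per site-year combination
long <- pivot_longer(
  wide,
  cols = starts_with("temp_"),
  names_to = "year",
  names_prefix = "temp_",
  values_to = "temp_c"
)

# Long back to wide: one column per year again
wide_again <- pivot_wider(
  long,
  names_from = year,
  values_from = temp_c,
  names_prefix = "temp_"
)
```

The long form is usually what ggplot2 wants, since the year can then be mapped to an axis, color, or facet.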
Modifying factor order is necessary for specialized graphing and also for statistics.
Bringing two dataframes together is often necessary, and this covers some of the basics of how to do it.
How to flag or categorize data using ifelse and case_when.
I am trying to add in some of the stats that I have done in the past. This is very much a work in progress, and I will update it in the next few weeks as the field season slows down.
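As a sketch of the difference between the two, `ifelse()` handles a single yes/no test while dplyr's `case_when()` handles several conditions checked in order. The data, thresholds, and labels below are hypothetical and chosen only for illustration.

```r
library(dplyr)

# Made-up profile data: depth and dissolved oxygen
sonde <- data.frame(
  depth_m = c(0.5, 3, 12, 40),
  do_mg_l = c(8.9, 7.5, 1.8, -0.2)
)

flagged <- sonde %>%
  mutate(
    # Two-way flag: negative oxygen readings are physically impossible
    do_flag = ifelse(do_mg_l < 0, "suspect", "ok"),
    # Multi-way category: conditions are tested top to bottom
    layer = case_when(
      depth_m < 5 ~ "surface",
      depth_m < 20 ~ "middle",
      TRUE ~ "deep"
    )
  )

flagged
```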
Correlations
Regression
T-tests
One-way ANOVAs
Two-way ANOVAs
This will cover some of the tricks for preparing data to be analyzed in rLakeAnalyzer.
This will cover how to work with sonde data and then graph that data. This data is from Lake Tanganyika and is not to be published or used without permission.
This provides a neat trick on how to extract certain datetime chunks from a larger dataframe.
This example expands on what was presented at GLEON 18 in New Paltz, New York by GLEON graduate students. I have put that code up here for public use. I learned so much at this meeting - it was GREAT!!