# load the libraries each time you restart R
install.packages("tidyverse")
The following package(s) will be installed:
- tidyverse [2.0.0]
These packages will be installed into "~/Documents/r_projects/web_stats/renv/library/macos/R-4.4/aarch64-apple-darwin20".
# Installing packages --------------------------------------------------------
- Installing tidyverse ... OK [linked from cache]
Successfully installed 1 package in 7.5 milliseconds.
install.packages("skimr")
The following package(s) will be installed:
- skimr [2.1.5]
These packages will be installed into "~/Documents/r_projects/web_stats/renv/library/macos/R-4.4/aarch64-apple-darwin20".
# Installing packages --------------------------------------------------------
- Installing skimr ... OK [linked from cache]
Successfully installed 1 package in 95 milliseconds.
install.packages("janitor")
The following package(s) will be installed:
- janitor [2.2.0]
These packages will be installed into "~/Documents/r_projects/web_stats/renv/library/macos/R-4.4/aarch64-apple-darwin20".
# Installing packages --------------------------------------------------------
- Installing janitor ... OK [linked from cache]
Successfully installed 1 package in 65 milliseconds.
install.packages("patchwork")
The following package(s) will be installed:
- patchwork [1.2.0]
These packages will be installed into "~/Documents/r_projects/web_stats/renv/library/macos/R-4.4/aarch64-apple-darwin20".
# Installing packages --------------------------------------------------------
- Installing patchwork ... OK [linked from cache]
Successfully installed 1 package in 55 milliseconds.
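library(tidyverse) # attaches ggplot2, used for the plots below
library(readxl) # provides read_excel()
library(skimr)
library(janitor)
library(patchwork)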
Read in the file
Read in Excel files
Note that you can read in Excel files in the same way.
# Note you can read in Excel files just as easily
mm.df <- read_excel("data/mms.xlsx")
head(mm.df)
# A tibble: 6 × 4
  center        color  diameter  mass
  <chr>         <chr>     <dbl> <dbl>
1 peanut butter blue       16.2  2.18
2 peanut butter brown      16.5  2.01
3 peanut butter orange     15.5  1.78
4 peanut butter brown      16.3  1.98
5 peanut butter yellow     15.6  1.62
6 peanut butter brown      17.4  2.59
GGPlot
This script covers the basics of creating graphs in GGPlot; later on we will go over how to do more specialized things. It is by no means a complete guide to GGPlot, but it covers most of what you will need. Any suggestions or recommendations of things to add would be welcome.
Graphing data
I feel that graphing is the key to all data analysis. If you can look at your data, you can begin to see patterns that you may have predicted and want to test statistically. You will also spot outliers that might affect your results faster than you would by looking at summary statistics. You can also judge whether the data are normally distributed and how the variances compare from one group to another.
In proper GGPlot code you are supposed to name the arguments: data = , y = and x = ….
I have found that these names are not necessary most of the time, because the arguments can be matched by their position instead. We can talk about this later.
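As a rough illustration of the point about argument names, the sketch below builds the same plot twice, once with every argument named and once relying on position. It assumes the ggplot2 package is installed. The small data frame is only a stand-in copied from the six rows printed by head(mm.df) above, so the sketch runs without the Excel file, and the names p_named and p_short are made up for this example.

```r
library(ggplot2)

# Stand-in data (assumption): the six rows shown by head(mm.df) above
mm.df <- data.frame(
  center   = rep("peanut butter", 6),
  color    = c("blue", "brown", "orange", "brown", "yellow", "brown"),
  diameter = c(16.2, 16.5, 15.5, 16.3, 15.6, 17.4),
  mass     = c(2.18, 2.01, 1.78, 1.98, 1.62, 2.59)
)

# Every argument named explicitly
p_named <- ggplot(data = mm.df, mapping = aes(x = color, y = diameter)) +
  geom_point()

# Same plot, relying on argument order: data first, then aes(x, y)
p_short <- ggplot(mm.df, aes(color, diameter)) +
  geom_point()

p_named
p_short
```

The short form saves typing, but the named form is easier to read when a call has many arguments, so either choice is reasonable.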
How to plot an XY plot
This is a basic XY plot of the data and is one of the first steps in exploring data. Later on we will look at how to modify this plot to be easier to interpret.
The data frame comes first. The aes() statement sets the aesthetics, that is, the x and y data you want to see. You can also map colors, shapes, fills, line types and some other things to the data in this statement.
Try adding color = color inside the aes(x = color, y = diameter) statement. You can also try shape = color.
# GGPlot uses layers to build a graph
ggplot(data = mm.df, aes(x = color, y = diameter)) + # this sets up the data
  geom_point() # this adds a geometry to present the data from above
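The suggestion to try color = color and shape = color might look something like the sketch below. It assumes ggplot2 is installed and uses a stand-in data frame copied from the six rows printed by head(mm.df), so it does not need the Excel file. The object names p_color and p_shape are hypothetical.

```r
library(ggplot2)

# Stand-in data (assumption): the six rows shown by head(mm.df) above
mm.df <- data.frame(
  center   = rep("peanut butter", 6),
  color    = c("blue", "brown", "orange", "brown", "yellow", "brown"),
  diameter = c(16.2, 16.5, 15.5, 16.3, 15.6, 17.4),
  mass     = c(2.18, 2.01, 1.78, 1.98, 1.62, 2.59)
)

# Map the color column to the point color as well as the x axis
p_color <- ggplot(mm.df, aes(x = color, y = diameter, color = color)) +
  geom_point()

# Map the same column to the point shape instead
p_shape <- ggplot(mm.df, aes(x = color, y = diameter, shape = color)) +
  geom_point()

p_color
p_shape
```

Note that the point colors come from the default palette, so they will not match the color names stored in the column.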
Because GGPlot builds things in layers, you can add other geoms to the plot. Try the code below and see what happens when you put a + after geom_point() and then add geom_boxplot(fill = "blue"). Also try putting the boxplot before or after the geom_point() line.
# Add geom_point() -----
# Add points to the graph below using geom_point()
ggplot(mm.df, aes(x = color, y = diameter)) +
  geom_point()
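To see why the order of the layers matters, here is a small sketch that draws the boxplot after the points and then before them. It assumes ggplot2 is installed and again uses a stand-in data frame copied from the six rows printed by head(mm.df). The names p_box_on_top and p_points_on_top are made up for this example.

```r
library(ggplot2)

# Stand-in data (assumption): the six rows shown by head(mm.df) above
mm.df <- data.frame(
  center   = rep("peanut butter", 6),
  color    = c("blue", "brown", "orange", "brown", "yellow", "brown"),
  diameter = c(16.2, 16.5, 15.5, 16.3, 15.6, 17.4),
  mass     = c(2.18, 2.01, 1.78, 1.98, 1.62, 2.59)
)

# Points first, boxplot second: the filled box is drawn over the points
p_box_on_top <- ggplot(mm.df, aes(x = color, y = diameter)) +
  geom_point() +
  geom_boxplot(fill = "blue")

# Boxplot first, points second: the points are drawn over the box
p_points_on_top <- ggplot(mm.df, aes(x = color, y = diameter)) +
  geom_boxplot(fill = "blue") +
  geom_point()

p_box_on_top
p_points_on_top
```

Layers are drawn in the order they are added, so whichever geom comes last ends up on top.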
Adding axis labels
You can add plain axis labels, with no special formatting, using the labs(x = " ", y = " ") statement. You can add a line break by putting \n in the label text.
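A minimal sketch of plain labels with a line break is shown here. It assumes ggplot2 is installed and uses a stand-in data frame copied from the six rows printed by head(mm.df). The label text and the name p_plain are only examples.

```r
library(ggplot2)

# Stand-in data (assumption): the six rows shown by head(mm.df) above
mm.df <- data.frame(
  center   = rep("peanut butter", 6),
  color    = c("blue", "brown", "orange", "brown", "yellow", "brown"),
  diameter = c(16.2, 16.5, 15.5, 16.3, 15.6, 17.4),
  mass     = c(2.18, 2.01, 1.78, 1.98, 1.62, 2.59)
)

# Plain text labels; the \n splits the x label over two lines
p_plain <- ggplot(mm.df, aes(x = color, y = diameter)) +
  geom_point() +
  labs(x = "Candy\ncolor", y = "Diameter")

p_plain
```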
What I find really nice is being able to create formatted axis labels. You can do this a few ways, but I have found that the expression() statement works the best for my needs. Inside it, a ~ adds a space between symbols and a * connects things without a space.
# Label expressions -----
# Adding special formatting to labels
ggplot(mm.df, aes(x = color, y = diameter)) +
  geom_boxplot() +
  geom_point() +
  labs(x = "color", y = expression(bold("Diameter (" * mu * "*1000)")))
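To compare ~ and * side by side, the sketch below puts one on each axis. It assumes ggplot2 is installed and uses a stand-in data frame copied from the six rows printed by head(mm.df). The Greek letters are arbitrary and only there to show the spacing; they are not meant as real units for these data. The name p_expr is hypothetical.

```r
library(ggplot2)

# Stand-in data (assumption): the six rows shown by head(mm.df) above
mm.df <- data.frame(
  center   = rep("peanut butter", 6),
  color    = c("blue", "brown", "orange", "brown", "yellow", "brown"),
  diameter = c(16.2, 16.5, 15.5, 16.3, 15.6, 17.4),
  mass     = c(2.18, 2.01, 1.78, 1.98, 1.62, 2.59)
)

p_expr <- ggplot(mm.df, aes(x = color, y = diameter)) +
  geom_point() +
  labs(
    x = expression("Color" ~ alpha),  # ~ leaves a space before the symbol
    y = expression("Diameter" * mu)   # * joins the pieces with no space
  )

p_expr
```

Quoted text inside expression() is drawn as typed, while unquoted names such as alpha and mu are turned into Greek symbols.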