# load the libraries each time you restart R
library(tidyverse)
library(lubridate)
library(readxl)
library(scales)
library(skimr)
library(janitor)
library(patchwork)
Reading and writing data
Objective
How to read in data and write data back to a CSV file
The first and most important skill is to read in a file, work with the data, and then save the result to a file in the output directory. We will practice reading in CSV and Excel files.
Data for the exercise
This page has a link to all of the data files
We will use a mock data set of M&M's measurements, which is available as both the M&M CSV file and the M&M Excel file.
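Load Libraries
Load the packages listed at the top of this page before running the code below. read_csv() and write_csv() come from readr, which is loaded with tidyverse, and read_excel() comes from readxl.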
Read in the file
# Read in file using tidyverse code -----
mm.df <- read_csv("data/mms.csv")
Rows: 816 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): center, color
dbl (2): diameter, mass
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
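The message above offers two options: specify the column types, or set `show_col_types = FALSE`. A minimal sketch of both follows. To keep it runnable without `data/mms.csv`, it writes a small stand-in file containing only the first three rows shown in the output on this page; the names `tmp_csv`, `mm_typed.df`, and `mm_quiet.df` are made up for illustration.

```r
library(readr)

# Stand-in file (assumption): the first three rows shown on this page,
# written to a temporary location so the example runs anywhere
tmp_csv <- tempfile(fileext = ".csv")
write_lines(
  c("center,color,diameter,mass",
    "peanut butter,blue,16.20,2.18",
    "peanut butter,brown,16.50,2.01",
    "peanut butter,orange,15.48,1.78"),
  tmp_csv
)

# Option 1: declare the column types up front, which also silences the message
mm_typed.df <- read_csv(
  tmp_csv,
  col_types = cols(
    center   = col_character(),
    color    = col_character(),
    diameter = col_double(),
    mass     = col_double()
  )
)

# Option 2: keep the guessed types and only hide the message
mm_quiet.df <- read_csv(tmp_csv, show_col_types = FALSE)
```

With the real file, replace `tmp_csv` with `"data/mms.csv"`. Declaring the types is the stricter choice because a column that does not match the declared type is reported as a parsing problem instead of being silently guessed.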
Read in excel files
Excel files can be read in the same way, using read_excel() from the readxl package.
# Excel files can be read in just as easily
mm_excel.df <- read_excel("data/mms.xlsx")
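Excel workbooks often hold more than one sheet, and `read_excel()` reads the first sheet unless told otherwise. The sketch below shows how to list the sheets and pick one by name. It uses the example workbook that ships with readxl rather than `mms.xlsx`, whose sheet layout is not shown on this page; `example_path` and `chickwts.df` are illustrative names.

```r
library(readxl)

# Path to an example workbook bundled with the readxl package
example_path <- readxl_example("datasets.xlsx")

# List the sheet names in the workbook
sheet_names <- excel_sheets(example_path)
print(sheet_names)

# Read one sheet by name instead of the default first sheet
chickwts.df <- read_excel(example_path, sheet = "chickwts")
head(chickwts.df)
```

The same two calls work on your own workbook: run `excel_sheets("data/mms.xlsx")` first, then pass the sheet you want to `read_excel()`.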
Look at dataframe structure
One way is to click the blue triangle next to the dataframe name in the Environment tab in the upper right.
You can also use code to inspect the structure of the dataset:
# Data structure
str(mm.df)
spc_tbl_ [816 × 4] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ center : chr [1:816] "peanut butter" "peanut butter" "peanut butter" "peanut butter" ...
$ color : chr [1:816] "blue" "brown" "orange" "brown" ...
$ diameter: num [1:816] 16.2 16.5 15.5 16.3 15.6 ...
$ mass : num [1:816] 2.18 2.01 1.78 1.98 1.62 2.59 1.9 2.55 2.07 2.26 ...
- attr(*, "spec")=
.. cols(
.. center = col_character(),
.. color = col_character(),
.. diameter = col_double(),
.. mass = col_double()
.. )
- attr(*, "problems")=<externalptr>
# or
glimpse(mm.df)
Rows: 816
Columns: 4
$ center <chr> "peanut butter", "peanut butter", "peanut butter", "peanut bu…
$ color <chr> "blue", "brown", "orange", "brown", "yellow", "brown", "yello…
$ diameter <dbl> 16.20, 16.50, 15.48, 16.32, 15.59, 17.43, 15.45, 17.30, 16.37…
$ mass <dbl> 2.18, 2.01, 1.78, 1.98, 1.62, 2.59, 1.90, 2.55, 2.07, 2.26, 1…
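Beyond `str()` and `glimpse()`, a few other calls give a quick picture of a dataframe. This is a sketch only: it rebuilds the first three rows shown above as `mm_sample.df` (an illustrative name) so that it runs on its own, and it uses `skim()` from the skimr package loaded at the top of the page.

```r
library(tibble)
library(skimr)

# Stand-in dataframe (assumption): the first three rows shown in the output above
mm_sample.df <- tibble(
  center   = c("peanut butter", "peanut butter", "peanut butter"),
  color    = c("blue", "brown", "orange"),
  diameter = c(16.20, 16.50, 15.48),
  mass     = c(2.18, 2.01, 1.78)
)

# Number of rows and columns
dim(mm_sample.df)

# Column names
names(mm_sample.df)

# Per-column summary: missing values, and basic statistics for numeric columns
skim(mm_sample.df)
```

On the full data, call the same functions on `mm.df`.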
Saving files
Before we go too far, it is often important to save the modified data.
We can do this with write_csv() from the readr package.
# Saving dataframes -----
# We can save the file we just read in using write_csv()
# Let's say you have made a lot of changes and it's now time to save the dataframe
write_csv(mm.df, "output/mm_output.csv")
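`write_csv()` does not create folders, so the call above fails if the `output` directory does not exist yet. The sketch below creates the folder first, saves a dataframe, and reads it back as a check. It writes under `tempdir()` and uses a three-row stand-in dataframe so that it runs anywhere; `out_dir`, `out_file`, `mm_sample.df`, and `mm_check.df` are illustrative names.

```r
library(readr)

# Stand-in dataframe (assumption): the first three rows shown earlier on this page
mm_sample.df <- data.frame(
  center   = c("peanut butter", "peanut butter", "peanut butter"),
  color    = c("blue", "brown", "orange"),
  diameter = c(16.20, 16.50, 15.48),
  mass     = c(2.18, 2.01, 1.78)
)

# Create the output folder if it is missing (a temporary location here)
out_dir <- file.path(tempdir(), "output")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Save the dataframe
out_file <- file.path(out_dir, "mm_output.csv")
write_csv(mm_sample.df, out_file)

# Read the file back and confirm the contents survived the round trip
mm_check.df <- read_csv(out_file, show_col_types = FALSE)
identical(mm_check.df$color, mm_sample.df$color)
all.equal(mm_check.df$mass, mm_sample.df$mass)
```

In a project, replace `file.path(tempdir(), "output")` with `"output"` so the file lands in the project's output directory.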