Reading and writing data

Objective

How to read in data and write data back to a CSV file

The first and most important skill is to read in a file, work with the data, and then save the result to a file in the output directory. We will practice reading in CSV and Excel files.

Data for the exercise

This page has a link to all of the data files

We will use a mock data file of M&M’s measurements.

M&M CSV file and also the M&M Excel file

Load Libraries

# load the libraries each time you restart R
library(tidyverse)
library(lubridate)
library(readxl)
library(scales)
library(skimr)
library(janitor)
library(patchwork)

Read in the file

# Read in file using tidyverse code-----
mm.df <- read_csv("data/mms.csv")
Rows: 816 Columns: 4
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): center, color
dbl (2): diameter, mass

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
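
The message above suggests two ways to quiet it. The sketch below shows both on a small stand-in file written to a temporary location, so it runs without the course data. The file contents and the names tmp_csv, toy.df, and toy_quiet.df are illustrative assumptions, not part of the exercise.

```r
library(readr)

# Build a tiny stand-in CSV so the sketch runs anywhere (values are illustrative)
tmp_csv <- tempfile(fileext = ".csv")
writeLines(
  c("center,color,diameter,mass",
    "plain,red,13.2,0.9",
    "peanut butter,blue,16.2,2.18"),
  tmp_csv
)

# Option 1: declare each column's type up front, so nothing is guessed
toy.df <- read_csv(
  tmp_csv,
  col_types = cols(
    center   = col_character(),
    color    = col_character(),
    diameter = col_double(),
    mass     = col_double()
  )
)

# Option 2: keep the guessed types and only silence the message
toy_quiet.df <- read_csv(tmp_csv, show_col_types = FALSE)
```

Declaring the types is more typing, but it makes the read fail loudly if a column ever arrives in an unexpected format.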

Read in Excel files

Note that you can read in Excel files in the same way, using read_excel() from the readxl package.

# Note you can read in Excel files just as easily
mm_excel.df <- read_excel("data/mms.xlsx")
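
Excel workbooks can hold several sheets, and read_excel() reads the first one unless told otherwise. As a minimal sketch, the code below uses the example workbook that ships with readxl (not the M&M file) to list the sheets and read one by name. The object names xlsx_path, sheets, and mtcars.df are assumptions for illustration.

```r
library(readxl)

# Example workbook bundled with the readxl package (not the course data)
xlsx_path <- readxl_example("datasets.xlsx")

# List the worksheet names in the workbook
sheets <- excel_sheets(xlsx_path)

# Read one sheet by name; without `sheet`, the first sheet is read
mtcars.df <- read_excel(xlsx_path, sheet = "mtcars")
```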

Look at dataframe structure

One way is to click the blue triangle next to the dataframe's name in the Environment tab in the upper right.
You can also use code to inspect the structure of the dataset.

# data Structure
str(mm.df)
spc_tbl_ [816 × 4] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ center  : chr [1:816] "peanut butter" "peanut butter" "peanut butter" "peanut butter" ...
 $ color   : chr [1:816] "blue" "brown" "orange" "brown" ...
 $ diameter: num [1:816] 16.2 16.5 15.5 16.3 15.6 ...
 $ mass    : num [1:816] 2.18 2.01 1.78 1.98 1.62 2.59 1.9 2.55 2.07 2.26 ...
 - attr(*, "spec")=
  .. cols(
  ..   center = col_character(),
  ..   color = col_character(),
  ..   diameter = col_double(),
  ..   mass = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 
# or
glimpse(mm.df)
Rows: 816
Columns: 4
$ center   <chr> "peanut butter", "peanut butter", "peanut butter", "peanut bu…
$ color    <chr> "blue", "brown", "orange", "brown", "yellow", "brown", "yello…
$ diameter <dbl> 16.20, 16.50, 15.48, 16.32, 15.59, 17.43, 15.45, 17.30, 16.37…
$ mass     <dbl> 2.18, 2.01, 1.78, 1.98, 1.62, 2.59, 1.90, 2.55, 2.07, 2.26, 1…
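
Beyond str() and glimpse(), a few quick checks can confirm that a file was read as expected. The sketch below runs on a small made-up tibble with the same column names as the M&M data; the values and the names toy.df, color_counts, and n_missing are illustrative assumptions.

```r
library(dplyr)

# Made-up stand-in with the same columns as the M&M data
toy.df <- tibble(
  center   = c("plain", "plain", "peanut butter"),
  color    = c("red", "blue", "blue"),
  diameter = c(13.2, 13.4, 16.2),
  mass     = c(0.9, 0.88, 2.18)
)

dim(toy.df)      # number of rows and columns
names(toy.df)    # column names
head(toy.df, 2)  # first two rows

# How many rows fall in each color category
color_counts <- count(toy.df, color)

# Number of missing values in each column
n_missing <- colSums(is.na(toy.df))
```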

Saving files

Before we go too far, it is often important to save the modified data.
We can use write_csv() from the readr package to do this.

# Saving dataframes -----
# Let's say you have made a lot of changes and it's now time to save the dataframe
write_csv(mm.df, "output/mm_output.csv")
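
write_csv() does not create missing folders, so the call above assumes that an output directory already exists in the project. A minimal sketch of a safer pattern is below: create the folder if needed, write the file, then read it back to confirm it saved. It writes to a temporary location and uses a made-up dataframe so it runs anywhere; out_dir, out_file, toy.df, and check.df are hypothetical names.

```r
library(readr)

# Made-up stand-in for mm.df
toy.df <- tibble::tibble(
  center   = c("plain", "peanut butter"),
  color    = c("red", "blue"),
  diameter = c(13.2, 16.2),
  mass     = c(0.9, 2.18)
)

# In a real project this would simply be "output"
out_dir <- file.path(tempdir(), "output")

# Create the folder only if it is missing
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

# Write the dataframe, then read it back as a quick check
out_file <- file.path(out_dir, "mm_output.csv")
write_csv(toy.df, out_file)
check.df <- read_csv(out_file, show_col_types = FALSE)
```

Note that write_csv() overwrites an existing file of the same name without warning, so use a new file name if you want to keep earlier versions.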