6  Import/Export

6.1 CSV

6.1.1 {utils}::read.csv()

{utils} is part of base R

im_utils <- utils::read.csv("data/import.csv")
im_utils
  Name.1 Name..2. Name..3
1    ch1        1      11
2    ch2        2      12
3    ch3        3      13
4    ch4       NA      14
5    ch5        5      15
6               6      16
7    ch7        7      17
class(im_utils)
[1] "data.frame"
  • output is data frame (base R)
  • missing for character is blank
  • missing for numeric is ‘NA’
  • spaces or special characters in headers are replaced with period
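  • Sometimes ï.. is added to the beginning of the first column name (caused by a byte order mark at the start of the file); to remove it use read.csv("example.csv", fileEncoding = "UTF-8-BOM"). Source: roelpeters | remove i umlaut
  • if the file has no header row, use header = FALSE; the first line is then read as data and column names will be V1, V2, V3, etc.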
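The behaviors listed above can be checked with a small self-contained sketch. The temporary file and its contents are made up for illustration (they only mimic the headers of data/import.csv); check.names and na.strings are standard read.csv() arguments not used in the original example.

```r
# Write a small file with the same awkward headers as data/import.csv
# (illustrative contents, not the original data set)
tmp <- tempfile(fileext = ".csv")
writeLines(c("Name 1,Name (2),Name #3",
             "ch1,1,11",
             ",2,12",
             "ch3,,13"), tmp)

# Defaults: headers are sanitized, blank character cells stay as ""
d_default <- utils::read.csv(tmp)
names(d_default)
d_default$Name.1

# check.names = FALSE keeps the headers as written;
# na.strings = c("NA", "") also turns blank character cells into NA
d_keep <- utils::read.csv(tmp, check.names = FALSE, na.strings = c("NA", ""))
names(d_keep)
is.na(d_keep[["Name 1"]])

# header = FALSE reads the first line as data and names columns V1, V2, V3
d_nohead <- utils::read.csv(tmp, header = FALSE)
names(d_nohead)
```

Setting na.strings this way is one option for making missing character values behave like those in the readr import described in the next section.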

6.1.2 {readr}::read_csv()

im_readr <- readr::read_csv("data/import.csv")
Rows: 7 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Name 1
dbl (2): Name (2), Name #3

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
im_readr
# A tibble: 7 × 3
  `Name 1` `Name (2)` `Name #3`
  <chr>         <dbl>     <dbl>
1 ch1               1        11
2 ch2               2        12
3 ch3               3        13
4 ch4              NA        14
5 ch5               5        15
6 <NA>              6        16
7 ch7               7        17
class(im_readr)
[1] "spec_tbl_df" "tbl_df"      "tbl"         "data.frame" 
  • output is tibble (tidyverse)
  • a message reports the column types that readr is using for the import
  • missing for character and numeric is NA; outputs (html) will show <NA> for missing character and NA for missing numeric
  • headers that have spaces or special characters are kept as written and are shown within back ticks (``)
  • if the file has no header row, use col_names = FALSE; the first line is then read as data and column names will be X1, X2, X3, etc.
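A short sketch of the header handling described above, assuming the readr package is installed. The temporary file, its contents, and the replacement names ("name", "value_2", "value_3") are hypothetical and only for illustration.

```r
# Illustrative file with non-syntactic headers and one blank character cell
tmp <- tempfile(fileext = ".csv")
writeLines(c("Name 1,Name (2),Name #3",
             "ch1,1,11",
             ",2,12"), tmp)

# Headers are kept verbatim, so they must be quoted with back ticks
d <- readr::read_csv(tmp, show_col_types = FALSE)
d$`Name 1`

# Replace the headers on import: skip the header line and supply new names
d_renamed <- readr::read_csv(
  tmp,
  skip = 1,
  col_names = c("name", "value_2", "value_3"),
  show_col_types = FALSE
)
names(d_renamed)

# col_names = FALSE reads the first line as data; columns become X1, X2, X3
d_nohead <- readr::read_csv(tmp, col_names = FALSE, show_col_types = FALSE)
names(d_nohead)
```

Supplying syntactic names at import avoids the back ticks in later code, at the cost of no longer matching the headers in the file.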

6.1.2.1 Specify Column Types

readr does a pretty good job of guessing what the column types should be, but if you need to set them yourself, use the col_types argument as shown below. If you only want to hide the column types message, use show_col_types = FALSE.

test <- readr::read_csv("data/import.csv"
         , col_types = readr::cols(
            `Name 1`   = readr::col_character()
          , `Name (2)` = readr::col_double()
          , `Name #3`  = readr::col_double()
           )
)
test
# A tibble: 7 × 3
  `Name 1` `Name (2)` `Name #3`
  <chr>         <dbl>     <dbl>
1 ch1               1        11
2 ch2               2        12
3 ch3               3        13
4 ch4              NA        14
5 ch5               5        15
6 <NA>              6        16
7 ch7               7        17
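As a hedged alternative to the full readr::cols() call above, column types can also be given as a compact string or for a single column only. The sketch below assumes readr is installed and uses a made-up temporary file rather than data/import.csv.

```r
# Illustrative file with the same headers as the example above
tmp <- tempfile(fileext = ".csv")
writeLines(c("Name 1,Name (2),Name #3",
             "ch1,1,11",
             "ch2,2,12"), tmp)

# Compact form: one letter per column, in order
# c = character, d = double
d_compact <- readr::read_csv(tmp, col_types = "cdd")
sapply(d_compact, class)

# Override one column and let readr guess the rest;
# show_col_types = FALSE hides the column specification message
d_partial <- readr::read_csv(
  tmp,
  col_types = readr::cols(`Name (2)` = readr::col_integer()),
  show_col_types = FALSE
)
sapply(d_partial, class)
```

The compact string is quicker to type but depends on column order, so the named form is safer when the file layout may change.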

6.1.3 Export

Use utils::write.csv() or readr::write_csv() to export; the two differ slightly in how they handle quoting, row names, and missing values.

utils::write.csv(test, "data/export1_utils.txt")
"","Name 1","Name (2)","Name #3"
"1","ch1",1,11
"2","ch2",2,12
"3","ch3",3,13
"4","ch4",NA,14
"5","ch5",5,15
"6",NA,6,16
"7","ch7",7,17
readr::write_csv(test, "data/export2_readr.txt")
Name 1,Name (2),Name #3
ch1,1,11
ch2,2,12
ch3,3,13
ch4,NA,14
ch5,5,15
NA,6,16
ch7,7,17

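Row names; in utils::write.csv() use row.names = TRUE to include and row.names = FALSE to exclude

  • utils::write.csv() default includes row names (usually row number)
  • readr::write_csv() never includes row names and has no argument to add them; convert the row names to a regular column first if they are needed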
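A minimal sketch of the row name handling above, assuming the readr and tibble packages are installed. The data frame and the column name "id" are hypothetical.

```r
# Small data frame with meaningful row names (illustrative data)
df <- data.frame(value = c(1, 2), row.names = c("a", "b"))

tmp_utils <- tempfile(fileext = ".csv")
tmp_readr <- tempfile(fileext = ".csv")

# write.csv: drop the row names explicitly
utils::write.csv(df, tmp_utils, row.names = FALSE)
readLines(tmp_utils)

# write_csv has no row.names argument, so move the row names
# into an ordinary column before exporting
df_named <- tibble::rownames_to_column(df, var = "id")
readr::write_csv(df_named, tmp_readr)
readLines(tmp_readr)
```

Keeping the identifier as a real column also means it survives a later import with either reader.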

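NA values; na = "" to have missing data be exported as a blank cell

  • Both write.csv() and write_csv() default to na = "NA", and the na argument applies to numeric and character columns alike
  • If the data set was imported with utils::read.csv() (base R data frame):
    • missing numeric values are NA and are exported as NA (or as the na string)
    • missing character values were imported as empty strings (""), not NA, so they are always exported as blank and the na argument does not affect them
  • If the data set was imported with readr::read_csv() (tibble):
    • missing numeric and character values are both NA, so both are exported as NA (or as the na string)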
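The effect of the na argument can be seen by reading the exported files back as plain text. This is a small sketch with made-up data, assuming readr is installed; the object names are hypothetical.

```r
# Illustrative data with a true NA in a character and a numeric column
d <- data.frame(name = c("ch1", NA, "ch3"), value = c(1, 2, NA))

tmp_default <- tempfile(fileext = ".csv")
tmp_utils   <- tempfile(fileext = ".csv")
tmp_readr   <- tempfile(fileext = ".csv")

# Default: NA is written as the literal text NA
utils::write.csv(d, tmp_default, row.names = FALSE)
readLines(tmp_default)

# na = "" writes missing values as blank cells in both functions
utils::write.csv(d, tmp_utils, row.names = FALSE, na = "")
readLines(tmp_utils)

readr::write_csv(d, tmp_readr, na = "")
readLines(tmp_readr)

# An empty string is not NA, so write.csv exports it as "" whatever na is
tmp_blank <- tempfile(fileext = ".csv")
utils::write.csv(data.frame(name = c("ch1", "")), tmp_blank, row.names = FALSE)
readLines(tmp_blank)
```

If blank character cells should follow the na setting, one option is to convert them to NA before exporting, for example with d$name[d$name == ""] <- NA.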