Loading Files From Other Programs
Site: Saylor Academy
Course: PRDV420: Introduction to R Programming
Book: Loading Files From Other Programs
Description
User-contributed packages provide tools for loading data saved in many other formats into R. Often several packages can load the same file format; you can find them by searching online.
Loading Files from Other Programs
You should follow the same advice I gave you for Excel files whenever you wish to work with file formats native to other programs: open the file in the original program and export the data as a plain-text file, usually a CSV. This will ensure the most faithful transcription of the data in the file, and it will usually give you the most options for customizing how the data is transcribed.
Sometimes, however, you may acquire a file but not the program it came from, so you won't be able to open the file in its native program and export it as a text file. In this case, you can use one of the functions in Table D.4 to open the file. Most of these functions come from R's foreign package. Each attempts to read a different file format with as few hiccups as possible.
Table D.4: A number of functions will attempt to read the file types of other data-analysis programs
File format | Function | Library |
---|---|---|
ESRI ArcGIS | read.shapefile | shapefiles |
Matlab | readMat | R.matlab |
Minitab | read.mtp | foreign |
SAS (permanent data set) | read.ssd | foreign |
SAS (XPORT format) | read.xport | foreign |
SPSS | read.spss | foreign |
Stata | read.dta | foreign |
Systat | read.systat | foreign |
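As a rough illustration of how the functions in Table D.4 are called, the sketch below round-trips a small data frame through Stata's .dta format with the foreign package. The data frame `scores` and the temporary file path are made up for the example; in practice you would point `read.dta` at the file you were given. Note that `read.dta` only understands older Stata formats (versions 5 to 12), so files saved by newer Stata releases may need a different package.

```r
# Sketch: write and re-read a Stata file with the foreign package.
library(foreign)

# Hypothetical example data, standing in for a file you received.
scores <- data.frame(id = 1:3, score = c(90.5, 82.0, 77.25))

# Write a temporary .dta file so the example is self-contained.
dta_path <- tempfile(fileext = ".dta")
write.dta(scores, dta_path)

# Read it back the way you would read a file from a Stata user.
loaded <- read.dta(dta_path)
str(loaded)
```

The other readers in the table follow the same pattern: pass a file path and receive a data frame or list, although each has its own optional arguments.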
Connecting to Databases
You can also use R to connect to a database and read in data.
Use the RODBC package to connect to databases through an ODBC connection.
Use the DBI package to connect to databases through individual drivers. The DBI package provides a common syntax for working with different databases. You will have to download a database-specific package to use in conjunction with DBI. These packages provide the API for the native drivers of different database programs. For MySQL use RMySQL, for SQLite use RSQLite, for Oracle use ROracle, for PostgreSQL use RPostgreSQL, and for databases that use drivers based on the Java Database Connectivity (JDBC) API use RJDBC. Once you have loaded the appropriate driver package, you can use the commands provided by DBI to access your database.
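The DBI workflow described above can be sketched with an in-memory SQLite database, which needs no server. This assumes the DBI and RSQLite packages are installed; the table name `pets` and its contents are invented for the example. With another database you would swap in that driver's connection object and supply its connection details.

```r
# Sketch: connect, write a table, query it, and disconnect using DBI.
library(DBI)

# Assumption: RSQLite is installed; ":memory:" creates a throwaway database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical example table.
pets <- data.frame(
  name = c("Rex", "Tom", "Ivy"),
  species = c("dog", "cat", "cat"),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "pets", pets)

# Run a SQL query; the result comes back as a data frame.
cats <- dbGetQuery(con, "SELECT name FROM pets WHERE species = 'cat' ORDER BY name")
print(cats)

# Always close the connection when you are done.
dbDisconnect(con)
```

Because DBI supplies the common functions (`dbConnect`, `dbWriteTable`, `dbGetQuery`, `dbDisconnect`), only the first argument to `dbConnect` should need to change when you move to a different backend.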
Source: G. Grolemund, https://rstudio-education.github.io/hopr/dataio.html
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
Other types of data
To get other types of data into R, we recommend starting with the tidyverse packages listed below. They're certainly not perfect, but they are a good place to start. For rectangular data:
- haven reads SPSS, Stata, and SAS files.
- readxl reads Excel files (both .xls and .xlsx).
- DBI, along with a database-specific backend (e.g. RMySQL, RSQLite, RPostgreSQL etc.) allows you to run SQL queries against a database and return a data frame.
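To give a feel for the first of these packages, here is a minimal sketch that writes a small data frame to an SPSS .sav file with haven and reads it back. It assumes haven is installed; the `survey` data and the temporary path are made up for illustration. haven also offers `read_dta()` for Stata and `read_sas()` for SAS files.

```r
# Sketch: round-trip a data frame through an SPSS .sav file with haven.
library(haven)

# Hypothetical example data.
survey <- data.frame(id = c(1, 2, 3), age = c(34, 51, 28))

# Write a temporary file so the example is self-contained.
sav_path <- tempfile(fileext = ".sav")
write_sav(survey, sav_path)

# read_sav() returns a tibble, the tidyverse flavor of a data frame.
loaded <- read_sav(sav_path)
print(loaded)
```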
For hierarchical data: use jsonlite (by Jeroen Ooms) for JSON, and xml2 for XML. Jenny Bryan has some excellent worked examples at https://jennybc.github.io/purrr-tutorial/.
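As a small illustration of the JSON case, the sketch below parses a JSON string with jsonlite, assuming that package is installed. The JSON content is invented for the example. An array of objects is simplified into a data frame by default, while nested arrays of differing lengths are kept as a list column.

```r
# Sketch: parse hierarchical JSON into R structures with jsonlite.
library(jsonlite)

# Hypothetical JSON: two records, each with a nested array.
json_text <- '[
  {"name": "Ada", "langs": ["R", "SQL"]},
  {"name": "Bo",  "langs": ["Python"]}
]'

people <- fromJSON(json_text)

# One row per record; the nested arrays live in a list column.
str(people)
people$langs[[1]]
```

`fromJSON()` also accepts a file path or URL in place of a JSON string.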
For other file types, try the R data import/export manual and the rio package.
Source: H. Wickham and G. Grolemund, https://r4ds.had.co.nz/data-import.html
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.