R Data Visualization Cookbook

Importing data in R


Data comes in various formats. Most of the data available online can be downloaded as text documents (.txt extension) or as comma-separated values (.csv). We also encounter data in tab-delimited, XLS, HTML, JSON, XML, and other formats. If you are interested in working with data in either JSON or XML, refer to the recipe Constructing a bar plot using XML in R in Chapter 10, Creating Applications in R.

How to do it...

To import a CSV file into R, we can use the read.csv() function:

test = read.csv("raw.csv", sep = ",", header = TRUE)

Alternatively, the read.table() function allows us to import data with different separators and formats, such as tab-delimited or semicolon-separated files.

How it works...

The first argument in the read.csv() function is the filename, followed by the separator used in the file. The header = TRUE argument tells R that the first line of the file contains the column headers. Please note that R searches for the file in its current working directory. If the file is stored elsewhere, we have to point R to the directory containing it using the setwd() function. Alternatively, in RStudio, we can set the working directory from the menu by navigating to Session | Set Working Directory | Choose Directory.
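The working-directory behavior described above can be sketched as follows. This is a minimal illustration rather than the book's own listing: the sample rows and the use of a temporary directory are assumptions made so that the snippet runs on its own.

```r
# Hypothetical sample data written to a temporary directory,
# so the sketch does not depend on any existing raw.csv file.
data_dir <- tempdir()
writeLines(c("Name,Score", "Ann,90", "Bob,85"),
           file.path(data_dir, "raw.csv"))

# setwd() invisibly returns the previous working directory,
# which lets us restore it afterwards.
old_wd <- setwd(data_dir)
getwd()   # shows where R now looks for relative filenames

# The relative filename is resolved against the working directory.
test <- read.csv("raw.csv", sep = ",", header = TRUE)
str(test)

# Restore the original working directory.
setwd(old_wd)
```

Passing a full path to read.csv(), for example one built with file.path(), avoids changing the working directory at all.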

The first argument in the read.table() function is the name of the file that contains the data, the second argument states whether the data contains a header, and the third argument specifies the separator. If our data uses a semicolon (;), a tab, or the @ symbol as a separator, we can specify it with the sep argument. Note that, to specify a tab as the separator, users would have to substitute sep = "," with sep = "\t" in the read.table() function.
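As a rough sketch of the separators mentioned above, the following reads the same small table stored with a tab, a semicolon, and the @ symbol. The file contents and variable names are hypothetical and exist only to make the example self-contained.

```r
# Hypothetical sample files, one per separator, written to temporary paths.
tab_file  <- tempfile(fileext = ".txt")
semi_file <- tempfile(fileext = ".txt")
at_file   <- tempfile(fileext = ".txt")
writeLines(c("Name\tScore", "Ann\t90", "Bob\t85"), tab_file)
writeLines(c("Name;Score", "Ann;90", "Bob;85"), semi_file)
writeLines(c("Name@Score", "Ann@90", "Bob@85"), at_file)

# Only the sep argument changes between the three calls.
tab_data  <- read.table(tab_file,  header = TRUE, sep = "\t")
semi_data <- read.table(semi_file, header = TRUE, sep = ";")
at_data   <- read.table(at_file,   header = TRUE, sep = "@")

# All three calls should produce the same data frame.
print(tab_data)
```

Naming the header and sep arguments explicitly, rather than relying on their position, keeps each call readable when more options are added.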

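Another useful argument is row.names. If we omit row.names, R will generally number the rows and use those serial numbers (1, 2, 3, and so on) as the row names. We can instead take the row names from a column of our data by naming that column, for example row.names = c("Name"); the values in the Name column then become the row names.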
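The effect of row.names can be seen in a short sketch like the one below. The Name and Score columns and their values are made up for illustration.

```r
# Hypothetical sample data written to a temporary CSV file.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("Name,Score", "Ann,90", "Bob,85"), csv_file)

# Without row.names, the rows are simply numbered.
default_rows <- read.csv(csv_file, header = TRUE)
rownames(default_rows)

# With row.names = "Name", the Name column supplies the row names
# and is no longer kept as a regular column.
named_rows <- read.csv(csv_file, header = TRUE, row.names = "Name")
rownames(named_rows)

# Rows can now be selected by name.
named_rows["Bob", "Score"]
```

Note that the column used for row names must contain unique values; otherwise R reports an error about duplicate row names.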