In fixed-width formatted files, each column has a fixed width; if a data element is shorter than its allotted column width, it is padded with spaces to fill the column. To read fixed-width text files, specify the columns by their widths or by their starting positions.
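As a small illustration of the padding described above, the following sketch builds one fixed-width record by hand. The values and the widths (4 and 15) are made up for the example and are not taken from the chapter's data files.

```r
# Pad two made-up values to a fixed width of 15 characters each
# (left-justified, filled with trailing spaces).
name  <- formatC("Ann Lee", width = 15, flag = "-")
major <- formatC("Physics", width = 15, flag = "-")

# A 4-character id followed by the two padded fields forms one record.
record <- paste0("1001", name, major)

nchar(name)            # 15
nchar(record)          # 34
substr(record, 5, 19)  # the name field, recovered purely by position
```

Because every field sits at a known position, no delimiter is needed to tell the columns apart.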
Download the files for this chapter and store the student-fwf.txt file in your R working directory.
Read the fixed-width formatted file as follows:
> student <- read.fwf("student-fwf.txt", widths=c(4,15,20,15,4), col.names=c("id","name","email","major","year"))
In the student-fwf.txt file, the first column occupies 4 character positions, the second 15, and so on. The c(4,15,20,15,4) expression specifies the widths of the five columns in the data file.
We can use the optional col.names argument to supply our own variable names.
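If the chapter files are not at hand, the same call can be tried on a small stand-in file. The sketch below writes two made-up records to a temporary file using the recipe's widths and then reads them back; the sample rows and the tmp file are assumptions for illustration only, not the contents of student-fwf.txt.

```r
# Two made-up records laid out with the recipe's widths: 4, 15, 20, 15, 4.
records <- sprintf("%-4d%-15s%-20s%-15s%-4d",
                   c(1001L, 1002L),
                   c("Ann Lee", "Raj Patel"),
                   c("ann@example.com", "raj@example.com"),
                   c("Physics", "History"),
                   c(2014L, 2015L))

# Write the records to a temporary stand-in file.
tmp <- tempfile(fileext = ".txt")
writeLines(records, tmp)

# Read with our own variable names.
student <- read.fwf(tmp, widths = c(4, 15, 20, 15, 4),
                    col.names = c("id", "name", "email", "major", "year"))

# Read again without col.names to see the default variable names.
student_default <- read.fwf(tmp, widths = c(4, 15, 20, 15, 4))

names(student)          # "id" "name" "email" "major" "year"
names(student_default)  # "V1" "V2" "V3" "V4" "V5"

# Remove any padding left in a text column.
trimws(as.character(student$name))
```

Without col.names, R falls back to the default names V1 through V5. The trimws() call is a defensive step in case the padding spaces are retained in the character columns.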
The read.fwf() function has several optional arguments that come in handy. We discuss a few of these as follows:
For files with headers, use the following command:
> student <- read.fwf("student-fwf-header.txt", widths=c(4,15,20,15,4), header=TRUE, sep="\t",skip=2)
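If header=TRUE, the first row read from the file is interpreted as the column headers. Column headers, if present, must be separated by the character given in the sep argument. In the input file, the sep argument applies only to the header row; the data rows are still split by the given widths. Because read.fwf() also uses sep internally, choose a character that does not occur in the data.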
The skip argument denotes the number of lines to skip at the start of the file; in this recipe, the first two lines are skipped and the header is read from the third line.
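The interaction of header, sep, and skip can be checked on a small stand-in file. The sketch below assumes a layout with two leading note lines, a tab-separated header on the third line, and fixed-width data after that; the file contents are made up and may differ from student-fwf-header.txt.

```r
# Made-up file: two note lines, a tab-separated header, then fixed-width data.
notes  <- c("Student records", "Exported for illustration")
header <- paste(c("id", "name", "email", "major", "year"), collapse = "\t")
records <- sprintf("%-4d%-15s%-20s%-15s%-4d",
                   c(1001L, 1002L),
                   c("Ann Lee", "Raj Patel"),
                   c("ann@example.com", "raj@example.com"),
                   c("Physics", "History"),
                   c(2014L, 2015L))

# Write everything to a temporary stand-in file.
tmp <- tempfile(fileext = ".txt")
writeLines(c(notes, header, records), tmp)

# skip = 2 drops the note lines; the next line supplies the column names.
student <- read.fwf(tmp, widths = c(4, 15, 20, 15, 4),
                    header = TRUE, sep = "\t", skip = 2)

names(student)  # "id" "name" "email" "major" "year"
nrow(student)   # 2
```

Note that the header line is not fixed-width: its names are split on sep, while the data lines below it are split by position.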