Reading data from a delimited text file
File handling with Python is an important topic for GIS programmers. Text files have long been used as an interchange format between systems. They are simple, cross-platform, and easy to process. Comma- and tab-delimited text files are among the most commonly used text formats, so we'll take an extensive look at the Python tools available to process these files. A common task for GIS programmers is to read comma-delimited text files containing x and y coordinates, along with other attribute information. This information is then converted into GIS data formats, such as shapefiles or geodatabases.
Getting ready
To use Python's built-in file processing functionality, you must first open the file. Once open, data within the file is processed using functions provided by Python, and finally, the file is closed.
Tip
Always remember to close the file when you're done. Python does not necessarily close files for you, so leaving them open can exhaust system resources or cause data to be overwritten unexpectedly. Also, some operating systems won't let the same file be open for reading and writing at the same time.
In this recipe, you will learn how to open, read, and process a comma-delimited text file.
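One way to make sure a file is always closed is Python's with statement, which closes the file when the block ends, even if an error occurs inside it. The following is a minimal sketch rather than part of the recipe's script; the temporary sample file and the names sample_path and first_line are assumptions made so that the example runs on any machine.

```python
import os
import tempfile

# Hypothetical sample file so the sketch runs anywhere; in the recipe you
# would use the path to N_America.A2007275.txt instead.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_fires.txt")
with open(sample_path, "w") as out:
    out.write("18.102,-94.353,310.7,1.3,1.1,10/02/2007,0420,T,72\n")

# The with statement closes the file when the block ends, even on error,
# so no explicit call to close() is needed.
with open(sample_path, "r") as f:
    first_line = f.readline()

# The file object still exists here, but it reports that it is closed.
print(f.closed)
print(first_line.strip())
```

The recipe itself uses explicit open() and close() calls, which behave the same way as long as close() is always reached.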
How to do it…
Follow these steps to create a Python script that reads a comma-delimited text file:
In your C:\ArcpyBook\data folder, you will find a file called N_America.A2007275.txt. Open this file in a text editor. It should appear as follows:
    18.102,-94.353,310.7,1.3,1.1,10/02/2007,0420,T,72
    19.300,-89.925,313.6,1.1,1.0,10/02/2007,0420,T,82
    19.310,-89.927,309.9,1.1,1.0,10/02/2007,0420,T,68
    26.888,-101.421,307.3,2.7,1.6,10/02/2007,0425,T,53
    26.879,-101.425,306.4,2.7,1.6,10/02/2007,0425,T,45
    36.915,-97.132,342.4,1.0,1.0,10/02/2007,0425,T,100
Note
This file contains data related to wildfire incidents that was derived from a satellite sensor from a single day in 2007. Each row contains latitude and longitude information for the fire along with additional information, including the date and time, satellite type, confidence value, and other details. In this recipe, you are going to pull out only the latitude, longitude, and confidence value. The first item contains the latitude, the second contains longitude, and the final value contains the confidence value.
Open IDLE and create a file called C:\ArcpyBook\Appendix2\ReadDelimitedTextFile.py.
Use the Python open() function to open the file in order to read it:
    f = open("c:/ArcpyBook/data/N_America.A2007275.txt", "r")
Add a for loop to iterate over all the rows:
    for fire in f:
Use the split() function to split the values into a list, using a comma as the delimiter. The list will be assigned to a variable called lstValues. Make sure that you indent this line of code inside the for loop you just created:
        lstValues = fire.split(",")
Using the index values that reference latitude, longitude, and confidence values, create new variables:
        latitude = float(lstValues[0])
        longitude = float(lstValues[1])
        confid = int(lstValues[8])
Print the values of each with the print statement:
        print("The latitude is: " + str(latitude) + " The longitude is: " + str(longitude) + " The confidence value is: " + str(confid))
Close the file:
f.close()
The entire script should appear as follows:
    f = open("c:/ArcpyBook/data/N_America.A2007275.txt", "r")
    for fire in f:
        lstValues = fire.split(",")
        latitude = float(lstValues[0])
        longitude = float(lstValues[1])
        confid = int(lstValues[8])
        print("The latitude is: " + str(latitude) + " The longitude is: " + str(longitude) + " The confidence value is: " + str(confid))
    f.close()
You can check your work by examining the C:\ArcpyBook\code\Appendix2\ReadDelimitedTextFile.py solution file.
Save and run the script. You should see the following output:
    The latitude is: 18.102 The longitude is: -94.353 The confidence value is: 72
    The latitude is: 19.3 The longitude is: -89.925 The confidence value is: 82
    The latitude is: 19.31 The longitude is: -89.927 The confidence value is: 68
    The latitude is: 26.888 The longitude is: -101.421 The confidence value is: 53
    The latitude is: 26.879 The longitude is: -101.425 The confidence value is: 45
    The latitude is: 36.915 The longitude is: -97.132 The confidence value is: 100
How it works…
Python's open() function creates a file object, which serves as a link to a file residing on your computer. You must call the open() function on a file before reading or writing data in it. The first parameter of the open() function is the path to the file you'd like to open. The second parameter is the mode, which is typically read (r), write (w), or append (a). A value of r indicates that you'd like to open the file for read-only operations, while a value of w indicates that you'd like to open the file for write operations. If a file that you open in write mode already exists, its existing data will be overwritten, so be careful when using this mode. Append (a) mode also opens a file for write operations, but instead of overwriting any existing data, it appends data to the end of the file. In this recipe, we have opened the N_America.A2007275.txt file in read-only mode.
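The difference between the three modes is easiest to see on a scratch file. The following sketch is only an illustration; the temporary file modes_demo.txt and the variable names are assumptions, not part of the recipe's data.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'w' creates the file, or empties it if it already exists.
f = open(path, "w")
f.write("first\n")
f.close()

# Opening in 'w' again discards the previous contents.
f = open(path, "w")
f.write("second\n")
f.close()

# 'a' keeps the existing contents and adds new data to the end.
f = open(path, "a")
f.write("third\n")
f.close()

# 'r' opens the file for reading only.
f = open(path, "r")
contents = f.read()
f.close()

# Only the lines written by the second 'w' and the 'a' remain.
print(contents)
```

After the script runs, the line written by the first open() call in write mode is gone, which is why write mode needs care when the file already holds data you want to keep.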
Inside the for loop, which steps through the text file one line at a time, the split() function is used to create a list object from a line of text that is delimited in some way. Our file is comma-delimited, so we can use split(","). You can also split on other delimiters, such as tabs or spaces. The new list object created by split() is stored in a variable called lstValues. This variable contains each of the wildfire values. This is illustrated in the following screenshot. You'll notice that latitude is located in the first position, longitude is located in the second position, and so on. Lists are zero-based:
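As a minimal sketch of how list positions map to values, the following snippet splits one record copied from the sample data and prints each index next to its value. The tab-delimited line and the name tab_values are assumptions added only to show a different delimiter.

```python
# One record copied from the sample data shown earlier in the recipe,
# including the newline character that ends each line in the file.
fire = "18.102,-94.353,310.7,1.3,1.1,10/02/2007,0420,T,72\n"

lstValues = fire.split(",")

# Print each position next to its value; indexes start at 0.
for index, value in enumerate(lstValues):
    print(index, repr(value))

latitude = float(lstValues[0])    # first item
longitude = float(lstValues[1])   # second item
confid = int(lstValues[8])        # ninth item; int() ignores the newline

# Hypothetical tab-delimited line: pass "\t" as the delimiter instead.
tab_values = "19.300\t-89.925\t82".split("\t")
```

Note that the last item still carries the newline from the end of the line. Both float() and int() ignore surrounding whitespace, but you would call strip() before storing a value as text.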
Using the index values that reference latitude, longitude, and confidence, we create new variables called latitude, longitude, and confid. Finally, we print each of the values. A more robust geoprocessing script might write this information into a feature class using an InsertCursor object. We actually did this in a previous recipe in Chapter 8, Using the ArcPy Data Access Module with Feature Classes and Tables.
You could also use the readlines() function to read the entire contents of the file into a Python list, which could then be iterated. Each row in the text file becomes one item in the list. Since this function reads the entire file into memory, use it with caution, as large files can cause significant performance problems.
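The two approaches can be compared side by side. This sketch assumes a small temporary file holding three records from the sample data; the names path, records, all_lines, and count are illustrative only.

```python
import os
import tempfile

# Hypothetical three-record file standing in for the wildfire data.
path = os.path.join(tempfile.mkdtemp(), "fires_subset.txt")
records = [
    "18.102,-94.353,310.7,1.3,1.1,10/02/2007,0420,T,72\n",
    "19.300,-89.925,313.6,1.1,1.0,10/02/2007,0420,T,82\n",
    "19.310,-89.927,309.9,1.1,1.0,10/02/2007,0420,T,68\n",
]
f = open(path, "w")
f.writelines(records)
f.close()

# readlines(): the whole file is loaded into a list in memory at once.
f = open(path, "r")
all_lines = f.readlines()
f.close()

# Iterating over the file object: only one line is held at a time.
count = 0
f = open(path, "r")
for fire in f:
    count += 1
f.close()

print(len(all_lines), count)
```

Both loops visit the same lines in the same order, so for large files iterating over the file object directly is usually the safer choice.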
There's more...
As with reading files, there are a number of methods that you can use to write data to a file. The write() function is probably the easiest to use. It takes a single string argument and writes it to a file. The writelines() function can be used to write the contents of a list structure to a file. Before writing data to a text file, you will need to open the file in either write or append mode.
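A short sketch of both functions follows. The output file fires_out.txt, the header line, and the rows list are assumptions chosen for illustration. Neither function adds line endings for you, so each string includes its own newline character.

```python
import os
import tempfile

# Hypothetical output file in a temporary folder.
out_path = os.path.join(tempfile.mkdtemp(), "fires_out.txt")

f = open(out_path, "w")

# write() takes a single string.
f.write("latitude,longitude,confidence\n")

# writelines() takes a list of strings and writes them one after another.
rows = ["18.102,-94.353,72\n", "19.3,-89.925,82\n"]
f.writelines(rows)

f.close()

# Read the file back to confirm what was written.
f = open(out_path, "r")
written = f.read()
f.close()

print(written)
```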