Creating ZIP files
GIS work often involves large files that are compressed into the .zip format for ease of sharing. Python includes a module that you can use to compress and decompress files in this format.
Getting ready
ZIP is a common compression and archive format and is implemented in Python through the zipfile module. The ZipFile class can be used to create, read, and write .zip files. To create a new .zip file, provide the filename along with a mode of w, which indicates that you want to write data to the file. In the following code example, we are creating a .zip file called dataFile.zip. The second parameter, w, opens the file in write mode: a new file is created, or an existing file with the same name is truncated. An optional compression parameter can also be used when creating the file. This value can be set to ZIP_STORED (no compression, the default) or ZIP_DEFLATED (which requires the zlib module); Python 3.3 and later also accept ZIP_BZIP2 and ZIP_LZMA:
zipfile.ZipFile('dataFile.zip', 'w', zipfile.ZIP_STORED)
In this exercise, you will use Python to create a .zip file, add files to it, and apply compression. You'll be archiving all the shapefiles located in the C:\ArcpyBook\data directory.
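To make the truncating behavior of write mode concrete, here is a minimal sketch. It assumes a throwaway temporary directory and uses writestr() and the with statement, neither of which appears in the recipe itself; the member names first.txt and second.txt are hypothetical.

```python
import os
import tempfile
import zipfile

# Work in a throwaway directory so nothing on disk is overwritten.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "dataFile.zip")

# First pass: create the archive and add one member (hypothetical name).
with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("first.txt", "first member")

# Second pass: opening the same path with "w" truncates the archive,
# so only the member written here survives.
with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("second.txt", "second member")

# Reopen in read mode and list what is left.
with zipfile.ZipFile(archive_path, "r") as zf:
    names_after_rewrite = zf.namelist()

print(names_after_rewrite)
```

If you need to keep the existing members, the zipfile module also offers an append mode, a, which adds to an archive instead of replacing it.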
How to do it…
Follow these steps to learn how to create a script that builds a .zip file:
Open IDLE and create a script called C:\ArcpyBook\Appendix2\CreateZipfile.py.
Import the zipfile and os modules:
import os
import zipfile
Create a new .zip file called shapefiles.zip in write mode and add a compression parameter:
zfile = zipfile.ZipFile("shapefiles.zip", "w", zipfile.ZIP_STORED)
Next, we'll use the os.listdir() function to create a list of files in the data directory:
files = os.listdir("c:/ArcpyBook/data")
Loop through the list of files and write each one to the .zip file if its name ends with shp, dbf, or shx:
for f in files:
    if f.endswith("shp") or f.endswith("dbf") or f.endswith("shx"):
        zfile.write("C:/ArcpyBook/data/" + f)
Print out a list of all the files that were added to the ZIP archive. You can use the ZipFile.namelist() method to create a list of files in the archive:
for f in zfile.namelist():
    print("Added %s" % f)
Close the .zip archive:
zfile.close()
The entire script should appear as follows:
import os
import zipfile

# create the zip file
zfile = zipfile.ZipFile("shapefiles.zip", "w", zipfile.ZIP_STORED)
files = os.listdir("c:/ArcpyBook/data")
for f in files:
    if f.endswith("shp") or f.endswith("dbf") or f.endswith("shx"):
        zfile.write("C:/ArcpyBook/data/" + f)

# list files in the archive
for f in zfile.namelist():
    print("Added %s" % f)

zfile.close()
You can check your work by examining the C:\ArcpyBook\code\Appendix2\CreateZipfile_Step1.py solution file.
Save and run the script. You should see the following output:
Added ArcpyBook/data/Burglaries_2009.dbf
Added ArcpyBook/data/Burglaries_2009.shp
Added ArcpyBook/data/Burglaries_2009.shx
Added ArcpyBook/data/Streams.dbf
Added ArcpyBook/data/Streams.shp
Added ArcpyBook/data/Streams.shx
In Windows Explorer, you should be able to see the output .zip file, as shown in the following screenshot. Note the size of the archive. This file was created without compression.
Now, we're going to create a compressed version of the .zip file to see the difference. Make the following change to the line of code that creates the .zip file:
zfile = zipfile.ZipFile("shapefiles2.zip", "w", zipfile.ZIP_DEFLATED)
You can check your work by examining the C:\ArcpyBook\code\Appendix2\CreateZipfile_Step2.py solution file.
Save and rerun the script.
Take a look at the size of the new shapefiles2.zip file that you just created. Note the decreased size of the file due to compression.
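The size difference can also be checked from Python rather than Windows Explorer. The following is a rough sketch, not part of the recipe: it assumes a temporary directory and a hypothetical stand-in file, sample.dbf, filled with repetitive bytes, and the helper build_archive() is invented for illustration. Real shapefiles will compress by a different ratio.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()

# Hypothetical stand-in for a shapefile component: highly repetitive bytes.
sample_path = os.path.join(workdir, "sample.dbf")
with open(sample_path, "wb") as fh:
    fh.write(b"0123456789" * 10000)


def build_archive(zip_name, compression):
    """Write sample_path into a new archive and return the archive size in bytes."""
    zip_path = os.path.join(workdir, zip_name)
    with zipfile.ZipFile(zip_path, "w", compression) as zf:
        zf.write(sample_path, arcname="sample.dbf")
    return os.path.getsize(zip_path)


# Same input, two compression settings, as in the two runs of the recipe.
stored_size = build_archive("shapefiles.zip", zipfile.ZIP_STORED)
deflated_size = build_archive("shapefiles2.zip", zipfile.ZIP_DEFLATED)

print("Stored:   %d bytes" % stored_size)
print("Deflated: %d bytes" % deflated_size)
```

With ZIP_STORED the archive is slightly larger than the input because of the ZIP headers, while ZIP_DEFLATED shrinks repetitive data considerably.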
How it works…
In this recipe, you created a new .zip file called shapefiles.zip in write mode. In the first iteration of this script, you didn't compress the contents of the file. In the second iteration, you compressed them by passing the ZIP_DEFLATED constant to the constructor for the ZipFile object and wrote the result to shapefiles2.zip. The script then obtained a list of files in the data directory and looped through each of them. Each file that has an extension of .shp, .dbf, or .shx was then written to the archive file using the write() method. Finally, the name of each file written to the archive was printed to the screen.
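The same flow can be packaged as a reusable function. This is one possible variation rather than the recipe's own script: the function name zip_shapefiles, the SHAPEFILE_EXTENSIONS constant, the case-insensitive extension check, the with statement, and the arcname argument are all assumptions introduced here for illustration.

```python
import os
import zipfile

# Extensions of the shapefile components the recipe archives.
SHAPEFILE_EXTENSIONS = (".shp", ".dbf", ".shx")


def zip_shapefiles(data_dir, zip_path, compression=zipfile.ZIP_DEFLATED):
    """Archive the .shp/.dbf/.shx files in data_dir and return the member names."""
    # The with statement closes the archive even if an error occurs.
    with zipfile.ZipFile(zip_path, "w", compression) as zfile:
        for name in sorted(os.listdir(data_dir)):
            # str.endswith accepts a tuple, so one call covers all extensions.
            if name.lower().endswith(SHAPEFILE_EXTENSIONS):
                # arcname stores just the file name instead of the full path.
                zfile.write(os.path.join(data_dir, name), arcname=name)
        return zfile.namelist()
```

Calling zip_shapefiles("C:/ArcpyBook/data", "shapefiles.zip") would return names such as Streams.shp rather than ArcpyBook/data/Streams.shp, because arcname drops the directory portion that the recipe's script keeps.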
There's more…
The contents of a file stored in an existing ZIP archive can be read by using the read() method. The archive should first be opened in read mode (r), and then you can call the read() method, passing the name of the archive member that should be read. The method returns the member's contents (as bytes in Python 3), which can then be printed to the screen, written to another file, or stored in a variable such as a list or dictionary.
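As a small illustration of read(), the sketch below first builds a tiny archive in a temporary directory and then reads one member back. The archive name notes.zip, the member name readme.txt, and its contents are hypothetical, and the UTF-8 decoding step assumes the member holds text.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "notes.zip")

# Build a small archive to read from (hypothetical member name and contents).
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("readme.txt", "line one\nline two\n")

# Open the archive in read mode and fetch one member's contents by name.
with zipfile.ZipFile(zip_path, "r") as zf:
    raw = zf.read("readme.txt")  # bytes in Python 3

# Decode the bytes and store the lines in a list.
text = raw.decode("utf-8")
lines = text.splitlines()
print(lines)
```

For binary members such as .shp files, you would skip the decoding step and work with the returned bytes directly.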