NumPy arrays can be saved to disk using built-in NumPy functions such as np.savetxt(), np.save(), or np.savez(), and loaded back into memory using the analogous functions np.loadtxt() and np.load(). Common file formats for data arrays include raw binary files as in the previous recipe, the NPY file format implemented by NumPy (raw binary files with a header containing the metadata), and the Hierarchical Data Format (HDF5).
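As a quick illustration of the NumPy functions mentioned above, here is a minimal sketch of a round trip through the text, NPY, and NPZ formats. The file names and the array x are arbitrary choices made for this example, and the files are written to a temporary directory.

```python
import os
import tempfile

import numpy as np

# Example array (arbitrary values chosen for this illustration).
x = np.arange(12, dtype=np.float64).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmpdir:
    # Plain text: human-readable, but larger and slower than binary.
    txt_path = os.path.join(tmpdir, "x.txt")
    np.savetxt(txt_path, x)
    x_txt = np.loadtxt(txt_path)

    # NPY: raw binary data preceded by a header (shape, dtype).
    npy_path = os.path.join(tmpdir, "x.npy")
    np.save(npy_path, x)
    x_npy = np.load(npy_path)

    # NPZ: an archive holding several named arrays.
    npz_path = os.path.join(tmpdir, "arrays.npz")
    np.savez(npz_path, x=x, y=2 * x)
    with np.load(npz_path) as archive:
        names = sorted(archive.files)
        y_npz = archive["y"]
```

The NPY and NPZ formats preserve the shape and data type of the arrays, whereas the text format only stores the values.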
An HDF5 file contains one or several datasets (arrays or heterogeneous tables) organized into a POSIX-like hierarchy. Datasets may be accessed lazily with memory mapping. In this recipe, we will use h5py, a Python package designed to deal with HDF5 files with a NumPy-like programming interface.
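To make the idea of a hierarchy of datasets more concrete, here is a minimal sketch using h5py. The file name, the group name /experiment1, the dataset name array1, and the description attribute are all hypothetical and chosen only for this illustration.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.h5")  # hypothetical file name

    # Write a dataset under a group; the intermediate group
    # /experiment1 is created automatically from the path.
    with h5py.File(path, "w") as f:
        f.create_dataset("/experiment1/array1",
                         data=np.arange(100).reshape(10, 10))
        f["/experiment1"].attrs["description"] = "hypothetical example group"

    # Reopen the file in read-only mode. Getting the dataset object
    # does not load the data; slicing reads the requested part.
    with h5py.File(path, "r") as f:
        dset = f["/experiment1/array1"]
        shape = dset.shape
        block = dset[:2, :3]  # a NumPy array holding only this selection
        description = f["/experiment1"].attrs["description"]
```

Here, block is a regular NumPy array that remains usable after the file is closed, whereas dset is only valid while the file is open.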
You need h5py for this recipe and the next one. It should be included with Anaconda, but you can also install it with conda install h5py.