Book Image

IPython Interactive Computing and Visualization Cookbook - Second Edition

By : Cyrille Rossant
Book Image

IPython Interactive Computing and Visualization Cookbook - Second Edition

By: Cyrille Rossant

Overview of this book

Python is one of the leading open source platforms for data science and numerical computing. IPython and the associated Jupyter Notebook offer efficient interfaces to Python for data analysis and interactive visualization, and they constitute an ideal gateway to the platform. IPython Interactive Computing and Visualization Cookbook, Second Edition contains many ready-to-use, focused recipes for high-performance scientific computing and data analysis, from the latest IPython/Jupyter features to the most advanced tricks, to help you write better and faster code. You will apply these state-of-the-art methods to various real-world examples, illustrating topics in applied mathematics, scientific modeling, and machine learning. The first part of the book covers programming techniques: code quality and reproducibility, code optimization, high-performance computing through just-in-time compilation, parallel computing, and graphics card programming. The second part tackles data science, statistics, machine learning, signal and image processing, dynamical systems, and pure and applied mathematics.
Table of Contents (19 chapters)
IPython Interactive Computing and Visualization CookbookSecond Edition
Contributors
Preface
Index

Manipulating large arrays with HDF5


NumPy arrays can be persistently saved on disk using built-in functions in NumPy such as np.savetxt(), np.save(), or np.savez(), and loaded in memory using analogous functions. Common file formats for data arrays include raw binary files as in the previous recipe, the NPY file format implemented by NumPy (raw binary files with a header containing the metadata), and Hierarchical Data Format (HDF5).

An HDF5 file contains one or several datasets (arrays or heterogeneous tables) organized into a POSIX-like hierarchy. Datasets may be accessed lazily with memory mapping. In this recipe, we will use h5py, a Python package designed to deal with HDF5 files with a NumPy-like programming interface.

Getting ready

You need h5py for this recipe and the next one. It should be included with Anaconda, but you can also install it with conda install h5py.

How to do it...

  1. First, we need to import NumPy and h5py:

    >>> import numpy as np
        import h5py
  2. Let's create a new...