IPython Interactive Computing and Visualization Cookbook - Second Edition

By: Cyrille Rossant
Overview of this book

Python is one of the leading open source platforms for data science and numerical computing. IPython and the associated Jupyter Notebook offer efficient interfaces to Python for data analysis and interactive visualization, and they constitute an ideal gateway to the platform. IPython Interactive Computing and Visualization Cookbook, Second Edition contains many ready-to-use, focused recipes for high-performance scientific computing and data analysis, from the latest IPython/Jupyter features to the most advanced tricks, to help you write better and faster code. You will apply these state-of-the-art methods to various real-world examples, illustrating topics in applied mathematics, scientific modeling, and machine learning. The first part of the book covers programming techniques: code quality and reproducibility, code optimization, high-performance computing through just-in-time compilation, parallel computing, and graphics card programming. The second part tackles data science, statistics, machine learning, signal and image processing, dynamical systems, and pure and applied mathematics.
Fitting a probability distribution to data with the maximum likelihood method


A good way to explain a dataset is to apply a probabilistic model to it. Finding an adequate model can be a job in its own right. Once a model is chosen, it is necessary to compare it to the data. This is what statistical estimation is about. In this recipe, we apply the maximum likelihood method to a dataset of survival times after a heart transplant (1967-1974 study).
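
Before turning to the real dataset, the idea of maximum likelihood estimation can be sketched on a toy problem. The example below is only an illustration under stated assumptions: it uses synthetic data drawn from an exponential distribution (not the heart transplant dataset), and the names true_rate, samples, and neg_log_likelihood are hypothetical and introduced here for the example. It estimates the rate parameter in three ways: with the closed-form estimator (the inverse of the sample mean), by numerically minimizing the negative log-likelihood, and with SciPy's built-in fit() method.

```python
import numpy as np
import scipy.stats as st
from scipy.optimize import minimize_scalar

# Synthetic stand-in data (an assumption for illustration only,
# NOT the heart transplant survival times used in this recipe).
rng = np.random.default_rng(0)
true_rate = 0.01
samples = rng.exponential(scale=1.0 / true_rate, size=1000)


def neg_log_likelihood(rate, x):
    """Negative log-likelihood of an exponential model with the given rate.

    For i.i.d. observations x_1, ..., x_n, the log-likelihood is
    n * log(rate) - rate * sum(x_i).
    """
    return -(len(x) * np.log(rate) - rate * x.sum())


# 1. Closed-form maximum likelihood estimate: inverse of the sample mean.
rate_closed = 1.0 / samples.mean()

# 2. Numerical estimate: minimize the negative log-likelihood over the rate.
result = minimize_scalar(
    neg_log_likelihood,
    bounds=(1e-6, 1.0),
    args=(samples,),
    method="bounded",
    options={"xatol": 1e-10},
)
rate_numeric = result.x

# 3. SciPy's built-in fit. Fixing loc to 0 leaves only the scale
#    parameter (the inverse of the rate) to be estimated.
loc, scale = st.expon.fit(samples, floc=0)
rate_scipy = 1.0 / scale

print(rate_closed, rate_numeric, rate_scipy)
```

The three estimates should agree up to numerical precision, and all of them should be close to true_rate. A closed-form estimator exists for the exponential model; for models where it does not, only the numerical route is available.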

Getting ready

As usual in this chapter, a background in probability theory and real analysis is recommended. In addition, you need the statsmodels package to retrieve the test dataset. It should be included in Anaconda, but you can always install it with the conda install statsmodels command.

How to do it...

  1. statsmodels is a Python package for conducting statistical data analyses. It also contains real-world datasets that we can use when experimenting with new methods. Here, we load the heart dataset:

    >>> import numpy as np
        import scipy.stats as...