SciKits, short for SciPy Toolkits, are add-on packages for SciPy. They provide a wide range of analytics modules, and scikit-learn is one of them; it is by far the most comprehensive machine learning module for Python. scikit-learn provides simple and efficient tools for data mining and data analysis, and it has a very active user community.
You can download and install scikit-learn by following the instructions on its official website at http://scikit-learn.org/stable/. If you are using a Python scientific distribution, such as Anaconda, scikit-learn is already included.
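To confirm that the installation worked, a short check like the following can be run. This is only a sketch: the helper name `is_installed` is made up for illustration, and the suggested `pip` and `conda` commands assume one of those package managers is available on your system.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None


# scikit-learn is imported under the name "sklearn".
if is_installed("sklearn"):
    import sklearn
    print("scikit-learn version:", sklearn.__version__)
else:
    # Assumption: pip or conda is the package manager in use.
    print("scikit-learn not found; try `pip install scikit-learn` "
          "or `conda install scikit-learn`")
```

Note that the package is installed as `scikit-learn` but imported as `sklearn`.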
Now, it's time for some machine learning using scikit-learn. One of the advantages of scikit-learn is that it provides some sample datasets (demo datasets) for practice. Let's load the diabetes dataset first.
In [1]: from sklearn.datasets import load_diabetes
In [2]: diabetes = load_diabetes()
In [3]: diabetes.data
Out[3]:
array([[ 0.03807591, 0.05068012, 0.06169621, ..., -0.00259226, 0.01990842...
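Beyond printing the raw array, it can help to look at the overall structure of the loaded object. The following is a minimal sketch, assuming a reasonably recent scikit-learn release; the variable names `X` and `y` are chosen here for illustration and do not come from the session above.

```python
from sklearn.datasets import load_diabetes

# load_diabetes() returns a dictionary-like object holding the data and metadata.
diabetes = load_diabetes()

X = diabetes.data      # feature matrix: one row per patient, one column per feature
y = diabetes.target    # target vector: one value per patient

print("data shape:  ", X.shape)
print("target shape:", y.shape)
print("features:    ", diabetes.feature_names)
```

Checking the shapes first is a quick way to confirm that the feature matrix and the target vector have the same number of rows before any model is fitted.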