Hands-On Data Science and Python Machine Learning

By: Frank Kane

Overview of this book

Join Frank Kane, who worked on Amazon and IMDb’s machine learning algorithms, as he guides you on your first steps into the world of data science. Hands-On Data Science and Python Machine Learning gives you the tools that you need to understand and explore the core topics in the field, and the confidence and practice to build and analyze your own machine learning models. With the help of interesting and easy-to-follow practical examples, Frank Kane explains potentially complex topics such as Bayesian methods and K-means clustering in a way that anybody can understand. Based on Frank’s successful data science course, Hands-On Data Science and Python Machine Learning empowers you to conduct data analysis and perform efficient machine learning using Python. Let Frank help you unearth the value in your data using the various data mining and data analysis techniques available in Python, and develop efficient predictive models to forecast future results. You will also learn how to perform large-scale machine learning on Big Data using Apache Spark. The book covers preparing your data for analysis, training machine learning models, and visualizing the final data analysis.

Running Python scripts

Throughout this book, we'll be using the IPython/Jupyter Notebook format (.ipynb files) that we've been looking at so far. It's a great format for a book like this because it lets me put little blocks of code in there, put some text around them explaining what they're doing, and let you experiment with things live.

Of course, it's great from that standpoint, but in the real world, you're probably not going to be using IPython/Jupyter Notebooks to run your Python scripts in production. So let me briefly go through the other ways you can run Python code, including other interactive ways. It's a pretty flexible system. Let's take a look.

More options than just the IPython/Jupyter Notebook

I want to make sure that you know there's more than one way to run Python code. Throughout this book, we'll be using the IPython/Jupyter Notebook format, but in the real world, you're more likely to run your code as a standalone Python script than as a notebook. So I just want to make sure you know how to do that and see how it works.

So let's go back to the first example that we ran in the book, the one that illustrated the importance of whitespace. We can just select and copy that code out of the notebook and paste it into a new file.

This can be done by clicking on the New button at the extreme left. So let's make a new file, paste the code in, save the file, and call it test.py, where .py is the usual extension that we give to Python scripts. Now, I can run this in a few different ways.
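The notebook example itself isn't reproduced in this section, so here is a minimal sketch of what test.py might contain. The function name describe and the sample numbers are assumptions made for illustration, and the code uses print() calls so that it runs under Python 3.

```python
# test.py: a hypothetical stand-in for the whitespace example copied from the notebook.
# Indentation (whitespace) decides which statements belong to the loop and to each branch.


def describe(numbers):
    """Return one line of text per number, labelling it even or odd."""
    lines = []
    for number in numbers:
        if number % 2 == 0:
            lines.append("{} is even".format(number))
        else:
            lines.append("{} is odd".format(number))
    return lines


if __name__ == "__main__":
    # This part runs when the file is executed as a script, e.g. `python test.py`.
    for line in describe([1, 2, 3, 4, 5, 6]):
        print(line)
    print("All done.")
```

The `if __name__ == "__main__":` guard is a common convention for script files: the code under it runs when the file is executed directly, but not when the file is imported from somewhere else.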

Running Python scripts in command prompt

I can actually run the script in a command prompt. If I go to Tools, I can go to Canopy Command Prompt, and that will open up a command window that has all the necessary environment variables already in place for running Python. I can just type python test.py to run the script, and out comes my result.

So in the real world, you'd probably do something like that. The script might be scheduled through a crontab entry or something like that, who knows? But running a real script in production is just that simple. You can now close the command prompt.
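To make that concrete, here is a rough sketch of what a scheduler or another program does when it launches a script: it starts a Python interpreter and hands it the script file, just like typing python test.py. The one-line script body and the temporary directory are assumptions for illustration only, and the sketch assumes Python 3.5 or later.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical one-line script standing in for test.py.
script_source = 'print("All done.")\n'

# Write the script to a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as workdir:
    script_path = os.path.join(workdir, "test.py")
    with open(script_path, "w") as handle:
        handle.write(script_source)

    # Equivalent to typing `python test.py` at the prompt; sys.executable is
    # the interpreter that is currently running this code.
    completed = subprocess.run(
        [sys.executable, script_path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )

# Whatever the script printed comes back as text.
output = completed.stdout.strip()
print(output)
```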

Using the Canopy IDE

Moving back, I can also run the script from within the IDE. From within Canopy, I can go to the Run menu and choose Run | Run File, or click on the little play icon. Either one will execute my script, and I can see the results at the bottom in the output window.

So that's another way to do it. Finally, you can also run code interactively in the prompt at the bottom. I can type in Python commands one at a time down there, and they execute and stay within that environment.

For example, I could create a variable called stuff and make it a list containing 1, 2, 3, 4, that is, stuff = [1, 2, 3, 4]. Now I can say len(stuff), and that will give me 4.

I can say for x in stuff: print x, and we get 1, 2, 3, 4 as output. (That's Python 2 syntax; under Python 3 you would write print(x).)

So you can see that you can make up scripts as you go in the interactive prompt at the bottom and execute things one at a time. In this example, stuff is a variable we created, a list that stays in memory. Within this environment, it's kind of like a global variable in other languages.
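Collected in one place, the interactive session just described looks roughly like the following. This is a sketch using Python 3's print() function, with each statement standing for one line typed at the prompt.

```python
# Each statement below would be typed at the interactive prompt one at a time.
stuff = [1, 2, 3, 4]   # create a list; it stays in memory for the session

print(len(stuff))      # number of elements in the list

for x in stuff:        # loop over the list
    print(x)           # prints each element on its own line
```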

Now if I do want to reset this environment, if I want to get rid of stuff and start all over, I go up to the Run menu and choose Restart Kernel, and that will start me over with a blank slate.

So now I have a new Python environment that's a clean slate. If I type stuff, I get an error, because stuff doesn't exist yet in the new environment. But I can assign it something else, such as [4, 5, 6], run it, and there it is.
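You can imitate the effect on a single variable without restarting anything. In the sketch below, del is only a stand-in for Restart Kernel (an assumption made for illustration): it removes one name, whereas a real restart throws away everything in the session.

```python
stuff = [1, 2, 3, 4]

# Stand-in for Restart Kernel: `del` removes only this one name, whereas a
# real restart discards every variable in the session.
del stuff

try:
    stuff
    exists_after_reset = True
except NameError:
    # The name is gone, just as it would be in a fresh environment.
    exists_after_reset = False

print(exists_after_reset)

# The name can be reused for something new.
stuff = [4, 5, 6]
print(stuff)
```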

So there you have it, three ways of running Python code: the IPython/Jupyter Notebook, which we'll use throughout this book because it's a good learning tool; standalone script files; and the interactive command prompt.

So keep that in mind. We'll be using notebooks throughout the rest of this book, but you have those other options for experimenting and for running things in production when the time comes.