Hands-On Data Science and Python Machine Learning

By: Frank Kane

Overview of this book

Join Frank Kane, who worked on Amazon and IMDb’s machine learning algorithms, as he guides you on your first steps into the world of data science. Hands-On Data Science and Python Machine Learning gives you the tools that you need to understand and explore the core topics in the field, and the confidence and practice to build and analyze your own machine learning models. With the help of interesting and easy-to-follow practical examples, Frank Kane explains potentially complex topics such as Bayesian methods and K-means clustering in a way that anybody can understand. Based on Frank’s successful data science course, Hands-On Data Science and Python Machine Learning empowers you to conduct data analysis and perform efficient machine learning using Python. Let Frank help you unearth the value in your data using the various data mining and data analysis techniques available in Python, and develop efficient predictive models to predict future results. You will also learn how to perform large-scale machine learning on Big Data using Apache Spark. The book covers preparing your data for analysis, training machine learning models, and visualizing the final data analysis.
Table of Contents (11 chapters)

Understanding Python code

The first example of Python code that I want to give you is a short block of real Python code that we can actually run right within the notebook page. Let's zoom in now and look at that code.


Let's take a look at what's going on. We have a list of numbers, and a list in Python is kind of like an array in other languages. It is designated by square brackets, as in [1, 2, 3, 4, 5, 6].

We have this data structure of a list that contains the numbers 1 through 6, and then to iterate through every number in that list, we'll say for number in listOfNumbers:. That's the Python syntax for iterating through a list of stuff: the for clause, ending with a colon.
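As a minimal sketch of the list described above (the names firstNumber, howMany, and doubled are made up for this illustration), the square brackets create the list, and it supports the array-like operations you would expect, such as indexing and asking for its length:

```python
# A list literal: square brackets around comma-separated values.
listOfNumbers = [1, 2, 3, 4, 5, 6]

# Lists behave much like arrays in other languages.
firstNumber = listOfNumbers[0]   # indexing starts at 0
howMany = len(listOfNumbers)     # number of elements in the list

# Iterating visits each element in order; note the colon ending the for line.
doubled = []
for number in listOfNumbers:
    doubled.append(number * 2)

print(firstNumber, howMany, doubled)
```

Running this prints the first element, the length, and a new list with every number doubled.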

Tabs and whitespace have real meaning in Python, so you can't just format things the way you want to. You have to pay attention to them.

The point that I want to make is that in other languages, it's pretty typical to have a bracket or a brace of some sort to denote that I'm inside a for loop, an if block, or some other block of code, but in Python, that's all designated with whitespace. Indentation is what tells Python which statements belong to which block of code:

listOfNumbers = [1, 2, 3, 4, 5, 6]

for number in listOfNumbers:
    # end=" " keeps the next print on the same line (Python 3 syntax)
    print(number, end=" ")
    if (number % 2 == 0):
        print("is even")
    else:
        print("is odd")

print("Hooray! We're all done.")

The first line just defines the list again so that the block runs on its own. You'll notice that within this for block, everything is indented by one tab stop, and for every number in listOfNumbers we will execute all of the code that's indented by that one tab stop. We'll print the number, and the end=" " argument just means that we're not going to start a new line afterwards (in Python 2, a trailing comma, as in print number, did the same job). We'll print something else right after it: if (number % 2 == 0), we'll say it's even. Otherwise, we'll say it's odd, and when we're done, we'll print out Hooray! We're all done.

You can see the output right below the code. The output was already there because I had saved it within my notebook, but if you want to run it yourself, you can just click within that block and click on the Play button, and the notebook will execute it again. Just to convince yourself that it's really doing something, change the final print line to say something else, say, Hooray! We're all done. Let's party! If I run this now, sure enough, the message has changed.

So again, the point I want to make is that whitespace is important. You designate blocks of code that run together, such as a for loop or an if/else statement, using indentation, so remember that. Also, pay attention to your colons. You'll notice that a lot of these clauses end with a colon.
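To make the whitespace point concrete, here is a small sketch (the function names are hypothetical, chosen only for this illustration) in which the only difference between two functions is the indentation of one line. In the first, the line is inside the for block and runs once per number. In the second, it is outside the block and runs once, after the loop finishes:

```python
def done_inside_loop(numbers):
    messages = []
    for number in numbers:
        messages.append(str(number))
        messages.append("done")   # indented: part of the for block
    return messages


def done_after_loop(numbers):
    messages = []
    for number in numbers:
        messages.append(str(number))
    messages.append("done")       # dedented: runs once, after the loop
    return messages


print(done_inside_loop([1, 2, 3]))
print(done_after_loop([1, 2, 3]))
```

Leaving off the colon at the end of the for line, or indenting lines inconsistently within one block, makes Python raise a SyntaxError or IndentationError before any of the code runs.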