Data Science Using Python and R

By: Chantal D. Larose, Daniel T. Larose

Overview of this book

Data science is hot: Bloomberg named data scientist the ‘hottest job in America’. Python and R are the top two open-source data science tools, and with them you can produce hands-on solutions to real-world business problems using state-of-the-art techniques. Each chapter presents step-by-step instructions and walkthroughs for solving data science problems with Python and R. You’ll learn how to prepare data, perform exploratory data analysis, and prepare to model the data. As you progress, you’ll explore what decision trees are and how to use them. You’ll also learn about model evaluation, misclassification costs, naïve Bayes classification, and neural networks. The later chapters cover clustering, regression modeling, dimension reduction, and association rule mining. The book also sheds light on newer topics such as random forests and general linear models. It emphasizes data-driven error costs to enhance profitability, which helps avoid common pitfalls that may cost a company millions of dollars. By the end of this book, you’ll have enough knowledge and confidence to start providing solutions to data science problems using R and Python.

EXERCISES

CLARIFYING THE CONCEPTS

  1. Neural network classification represents an attempt to imitate what?
  2. Using Figure 9.1, explain how an artificial neuron model imitates the actions of real neurons.
  3. What is the main benefit of neural networks for modeling? What gives neural networks this power?
  4. Describe the main drawback of neural network modeling.
  5. Explain what we mean when we say that a neural network is completely connected.
  6. Describe the benefits and drawbacks of using more or fewer nodes in the hidden layer.
  7. Referring to the example in the text, calculate $net_B = 1.5$ and $f(net_B) = \frac{1}{1 + e^{-1.5}} = 0.8176$.
  8. Explain how the sigmoid function combines nearly linear behavior, curvilinear behavior, and nearly constant behavior.
  9. Describe the process of backpropagation.
  10. The essential problem for the neural network is to construct a set of weights that will minimize what?
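
The sigmoid arithmetic in exercise 7 can be checked numerically. The following is a minimal sketch, assuming the activation is the standard logistic sigmoid f(x) = 1 / (1 + e^(-x)); the names `sigmoid` and `net_b` are illustrative and do not come from the book's own code.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid f(x) = 1 / (1 + e^(-x)); assumed form of the activation."""
    return 1.0 / (1.0 + math.exp(-x))


# Value of net_B as stated in exercise 7.
net_b = 1.5
print(round(sigmoid(net_b), 4))  # 0.8176

# Evaluating a few inputs gives a feel for exercise 8: outputs change
# roughly linearly near 0 and flatten out for large |x|.
for x in (-10.0, -2.0, -0.5, 0.0, 0.5, 2.0, 10.0):
    print(f"f({x:+.1f}) = {sigmoid(x):.4f}")
```

Printing the function over a range of inputs is one quick way to see where its behavior shifts from nearly linear to nearly constant.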

WORKING WITH THE DATA

For the following exercises, work with the Framingham_training and Framingham_test data sets. Use either...