Statistical Application Development with R and Python - Second Edition

Overview of this book

Statistical analysis involves collecting and examining data to describe the nature of the data that needs to be analyzed. It helps you explore relationships in data and build models to make better decisions. This book explores statistical concepts along with R and Python, which are integrated from the very start. Almost every concept is accompanied by R code that exemplifies the strength of R and its applications, and the R programs are complemented by equivalent Python programs. You will first understand data characteristics, descriptive statistics, and the exploratory attitude, which will give you a firm footing in data analysis. Statistical inference completes the technical footing of statistical methods. Linear regression, logistic regression, and CART build the essential toolkit, which will help you solve complex real-world problems. You will begin with a brief understanding of the nature of data and end with modern and advanced statistical models such as CART. Every step is taken with data and R code, and further enhanced by Python. The data analysis journey begins with exploratory analysis, which is more than simple descriptive data summaries. You will then apply linear regression modeling, and end with logistic regression, CART, and spatial statistics. By the end of this book, you will be able to apply your statistical learning in major domains at work or in your projects.

IDEs for R and Python


These days, most users do not work directly with the software's own frontend; they use an Integrated Development Environment (IDE) instead. IDEs are convenient for many reasons, and the uninitiated reader can search for the keyword. In very simple terms, the IDE may be thought of as the showroom and the core software as the factory. RStudio appears to be the most popular IDE for R, and Jupyter Notebook the most popular for Python.

The website for RStudio is https://www.rstudio.com/ and for Jupyter Notebook, it is http://jupyter.org/. The authors of the RStudio version are shown in the following screenshot:

We will not delve into the details of IDEs and the role they play; it is enough to use them. More about the importance of IDEs can easily be found on the web, especially on Wikipedia. An important Python distribution is Anaconda, and there are many amusing stories about the anaconda and python predators and how their names fascinate software programmers. The Anaconda distribution is available at https://www.continuum.io/downloads, and we recommend that the reader use it. All the Python programs are run in the Jupyter Notebook IDE. The Anaconda Prompt is shown in the following screenshot:

The command jupyter notebook shown at the prompt has not yet been run. If you enter it at your Anaconda Prompt and press the Return key, the IDE will start. The frontend of the Jupyter Notebook, which opens in your default web browser, looks like the following:

Now, an important question is whether different software requires different IDEs. Of course, it does not. R can be integrated with the Anaconda distribution so that it becomes available as an option in the Jupyter IDE. To do this, we need to run conda install -c r r-essentials in the Anaconda Prompt. Now, if you click on the New drop-down button, you will see two options under Notebook: one is Python 3 and the other is R. Thus, you can now run Python as well as R in the Jupyter Notebook IDE.
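As a quick check that the R option really was registered alongside Python 3, the installed Jupyter kernels can also be listed from Python. The following is only a minimal sketch, not part of the book's own code: it assumes the jupyter command is on your PATH, that jupyter kernelspec list --json reports its result under a "kernelspecs" key, and that the R kernel is registered under a name such as ir. The function names kernel_display_names and installed_kernels are hypothetical helpers introduced here for illustration.

```python
import json
import shutil
import subprocess


def kernel_display_names(kernelspec_json):
    """Map kernel name -> display name from `jupyter kernelspec list --json` text.

    Assumes the JSON has the shape {"kernelspecs": {name: {"spec": {...}}}}.
    """
    data = json.loads(kernelspec_json)
    specs = data.get("kernelspecs", {})
    return {
        name: info.get("spec", {}).get("display_name", name)
        for name, info in specs.items()
    }


def installed_kernels():
    """Return the installed kernels, or None if they cannot be determined."""
    # Hypothetical helper: bail out quietly when jupyter is not installed.
    if shutil.which("jupyter") is None:
        return None
    try:
        result = subprocess.run(
            ["jupyter", "kernelspec", "list", "--json"],
            capture_output=True, text=True, check=True, timeout=30,
        )
        return kernel_display_names(result.stdout)
    except (subprocess.SubprocessError, OSError, ValueError):
        # The command failed, timed out, or printed something that is not JSON.
        return None


if __name__ == "__main__":
    kernels = installed_kernels()
    if kernels is None:
        print("Could not query Jupyter kernels (is jupyter on your PATH?)")
    else:
        for name, display in sorted(kernels.items()):
            print(f"{name}: {display}")
```

If the installation of r-essentials succeeded, the printed list should include an entry for R in addition to the Python 3 kernel; the exact kernel names depend on your installation.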

Python IDLE is another popular IDE, and the Windows version looks like this: