Learning R Programming

By: Kun Ren

Overview of this book

R is a high-level functional language and one of the must-know tools for data science and statistics. Powerful but complex, R can be challenging for beginners and those unfamiliar with its unique behaviors. Learning R Programming is the solution: an easy and practical way to learn R and develop a broad and consistent understanding of the language. Through hands-on examples you'll discover powerful R tools and R best practices that will give you a deeper understanding of working with data. You'll get to grips with R's data structures and data processing techniques, as well as the most popular R packages to boost your productivity from the outset. Start with the basics of R, then dive deep into the programming techniques and paradigms to make your R code excel. Advance quickly to a deeper understanding of R's behavior as you learn common tasks including data analysis, working with databases, web scraping, high-performance computing, and writing documents. By the end of the book, you'll be a confident R programmer adept at solving problems with the right techniques.

The need for R


R stands out from a wide variety of statistical software for the following reasons:

  • Free of charge: R is totally free. You don't need to buy a license, so there is no financial barrier to using it or most of its extension packages.

  • Open-source: R and most of its packages are fully open source. Thousands of developers are constantly reviewing the source code of the packages to check whether there are bugs to fix or things to improve. If you encounter exceptions, you can even dig into the source code, find where the problem is, and contribute to fixing it.

  • Popular: R is a very popular, if not the most popular, statistical programming language and platform to perform data mining, analysis, and visualization. High popularity often means easier communication between you and other users because you "speak" the same language.

  • Flexible: R is a dynamic scripting language. It is flexible enough to allow programming styles in multiple paradigms, including functional programming and object-oriented programming. It also supports flexible metaprogramming. This flexibility enables you to perform highly customized and comprehensive data transformation and visualization.

  • Reproducible: When using software based on a graphical user interface, you only need to choose from menus and click buttons. However, without a script it is hard to reproduce accurately and automatically what you have done.

    In most scientific research areas and many industrial applications, reproducibility is necessary for many reasons. R scripts can precisely describe what you do with the computing environment and data, so that your work is fully reproducible from scratch.

  • Rich resources: R has a huge, rapidly increasing number of online resources. One type of resource is extension packages. There are, at the time of writing this, more than 7,500 packages available at CRAN (short for Comprehensive R Archive Network), a world-wide network of mirror servers from which you can get identical, up-to-date, R distributions and packages.

    These packages are created and maintained by more than 4,500 package developers in almost all data-related areas, such as multivariate analysis, time series analysis, econometrics, Bayesian inference, optimization, finance, genetics, chemometrics, computational physics, and many others. Take a look at CRAN Task Views (https://cran.r-project.org/web/views/) for a good summary.

    In addition to the enormous number of packages, there are also a great number of authors who regularly write personal blogs and Stack Overflow answers and share their thoughts, experiences, and recommended practices. Plus, there are a lot of websites specializing in R, such as R-bloggers (http://www.r-bloggers.com/), R documentation (http://www.rdocumentation.org/), and METACRAN (http://www.r-pkg.org/).

  • Strong community: The R community consists of not only R developers but also, for the most part, R users from a wide range of backgrounds such as statistics, econometrics, finance, bioinformatics, mechanical engineering, physics, medicine, and so on.

    A great number of R developers actively contribute to open source projects or packages written in R. The goal of the community is to make data analysis, exploration, and visualization easier and more interesting.

    If you are stuck on a problem in R, just search online for what puzzles you; there are probably already some answers to your question. If not, ask a question on Stack Overflow and you will often get a response in a very short time.

  • Cutting-edge: Many R users are professional researchers in statistics, econometrics, or other disciplines. Quite often, authors publish their new papers along with a new package that includes the cutting-edge techniques presented in the paper. Maybe it's a new statistical test, a pattern recognition method, or a better-tuned optimization algorithm.

    No matter what it is, the R community has the privilege of applying cutting-edge data science knowledge in the real world, often ahead of everyone else, improving its functionality and revealing its potential.
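
As a minimal sketch of the multi-paradigm flexibility mentioned in the list above, the following base R snippet shows a functional-style call, a small S3 class, and an expression captured as data and evaluated later. The names describe, circle, and shape are made up for illustration and do not come from the book; only base R functions are used.

```r
# Functional style: functions are values that can be passed to other functions.
squares <- sapply(1:5, function(x) x^2)

# Object-oriented style (S3): a generic function dispatches on the class attribute.
# `describe` and the "circle" class are hypothetical names for this sketch.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "a generic object"
describe.circle <- function(x, ...) paste("a circle of radius", x$radius)
shape <- structure(list(radius = 2), class = "circle")
description <- describe(shape)

# Metaprogramming: code can be captured as data and evaluated later
# with values supplied at evaluation time.
expr <- quote(a + b)
result <- eval(expr, list(a = 1, b = 2))
```

Here describe(shape) dispatches to describe.circle because of the object's class attribute, while any other object would fall back to describe.default.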