Learning R Programming

By: Kun Ren

Overview of this book

R is a high-level functional language and one of the must-know tools for data science and statistics. Powerful but complex, R can be challenging for beginners and those unfamiliar with its unique behaviors. Learning R Programming is the solution: an easy and practical way to learn R and develop a broad and consistent understanding of the language. Through hands-on examples you'll discover powerful R tools and R best practices that will give you a deeper understanding of working with data. You'll get to grips with R's data structures and data processing techniques, as well as the most popular R packages to boost your productivity from the outset. Start with the basics of R, then dive deep into the programming techniques and paradigms to make your R code excel. Advance quickly to a deeper understanding of R's behavior as you learn common tasks including data analysis, databases, web scraping, high performance computing, and writing documents. By the end of the book, you'll be a confident R programmer adept at solving problems with the right techniques.

Chapter 14. Web Scraping

R provides a platform with easy access to statistical computing and data analysis. Given a data set, it is straightforward to transform the data and to apply analytic models and numeric methods, using either flexible data structures or high-performance tools, as discussed in previous chapters.

However, input data is not always as readily available as the tables in a well-organized commercial database. Sometimes we have to collect the data ourselves. Web content is an important source of data for a wide range of research fields. To collect (scrape or harvest) data from the Internet, we need appropriate techniques and tools. In this chapter, we'll introduce the basic knowledge and tools of web scraping, including:

  • Looking inside web pages

  • Learning CSS and XPath selectors

  • Analyzing HTML code and extracting data
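
To give a flavor of how these three topics fit together, here is a minimal sketch. It assumes the rvest package (version 1.0.0 or later; older versions name the same function html_nodes()) is installed. The HTML snippet, its class names, and the variable names are all invented for illustration, and the page is parsed from a string so that no network access is needed.

```r
library(rvest)  # re-exports read_html() from the xml2 package

# A tiny page invented for this illustration; a real page would be
# loaded by passing a URL or a file path to read_html() instead.
page <- read_html('<html><body>
  <h1 id="title">Fruit prices</h1>
  <ul class="items">
    <li class="item"><span class="name">Apple</span><span class="price">1.20</span></li>
    <li class="item"><span class="name">Banana</span><span class="price">0.50</span></li>
  </ul>
</body></html>')

# CSS selector: elements with class "name" inside <ul class="items">
name_nodes <- html_elements(page, "ul.items .name")
fruit_names <- html_text(name_nodes)

# XPath selector: <span class="price"> directly under each <li class="item">
price_nodes <- html_elements(
  page,
  xpath = "//li[@class='item']/span[@class='price']"
)
fruit_prices <- as.numeric(html_text(price_nodes))

# Combine the extracted pieces into a data frame for further analysis
fruits <- data.frame(name = fruit_names, price = fruit_prices)
print(fruits)
```

The CSS and XPath queries here do the same kind of job. Which one to use is largely a matter of convenience: CSS selectors tend to be shorter, while XPath can express conditions that CSS cannot.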