Practical Data Analysis

By: Hector Cuesta

Overview of this book

Many small businesses face large amounts of data but lack the in-house skills to support quantitative analysis. Understanding how to harness data analysis with the latest open source technology can help them provide better customer service, visualize customer needs, or gain fresh insights into the performance of previous products. Practical Data Analysis is ideal for home and small business users who want to slice and dice the data they have on hand with minimum hassle.

Practical Data Analysis is a hands-on guide to understanding the nature of your data and turning it into insight. It introduces machine learning techniques, social network analytics, and econometrics to help your clients get insights from the pool of data they have at hand. It also covers data preparation and processing for several kinds of data, such as text, images, graphs, documents, and time series.

Practical Data Analysis presents a detailed exploration of the current work in data analysis through self-contained projects. First, you will explore the basics of data preparation and transformation with OpenRefine. Then you will get started with exploratory data analysis using the D3.js visualization framework. You will also be introduced to machine learning techniques such as classification, regression, and clustering through practical projects such as spam classification, predicting gold prices, and finding clusters in your Facebook friends' network. You will learn how to solve problems in text classification, simulation, time series forecasting, social media, and MapReduce through detailed projects. Finally, you will work with large amounts of Twitter data, using MapReduce to perform a sentiment analysis implemented in Python and MongoDB. Practical Data Analysis combines carefully selected algorithms with data scrubbing techniques that enable you to turn your data into insight.
Table of Contents (24 chapters)
Practical Data Analysis
Credits
Foreword
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Index

About the Reviewers

Mark Kerzner holds degrees in Law, Math, and Computer Science. He has been designing software for many years, and Hadoop-based systems since 2008. He is the President of SHMsoft, a provider of Hadoop applications for various verticals, and a co-author of the Hadoop Illuminated book/project. He has authored and co-authored books and patents.

Dr. Sampath Kumar works as an assistant professor and head of the Department of Applied Statistics at Telangana University. He holds M.Sc., M.Phil., and Ph.D. degrees in Statistics. He has five years of experience teaching postgraduate courses and more than four years of experience in the corporate sector. His expertise is in statistical data analysis using SPSS, SAS, R, Minitab, MATLAB, and so on. He is an advanced programmer in SAS and MATLAB. He has taught a range of applied and pure statistics subjects to M.Sc. students, including forecasting models, applied regression analysis, multivariate data analysis, and operations research. He is currently supervising Ph.D. scholars.

Ricky J. Sethi is currently the Director of Research for The Madsci Network and a research scientist at University of Massachusetts Medical Center and UMass Amherst. Dr. Sethi's research tends to be interdisciplinary in nature, relying on machine-learning methods and physics-based models to examine issues in computer vision, social computing, and science learning. He received his B.A. in Molecular and Cellular Biology (Neurobiology)/Physics from the University of California, Berkeley, M.S. in Physics/Business (Information Systems) from the University of Southern California, and Ph.D. in Computer Science (Artificial Intelligence/Computer Vision) from the University of California, Riverside. He has authored or co-authored over 30 peer-reviewed papers or book chapters and was also chosen as an NSF Computing Innovation Fellow at both UCLA and USC's Information Sciences Institute.

Dr. Suchita Tripathi earned her Ph.D. and M.Sc. in Anthropology at Allahabad University. She also has skills in computer applications and SPSS data analysis software. She is proficient in Hindi, English, and Japanese. She studied primary- and intermediate-level Japanese at the ICAS Japanese language training school in Sendai, Japan, and received various certificates. She is the author of six articles and one book. She has two years of teaching experience in the Department of Anthropology and Tribal Development, GGV Central University, Bilaspur (C.G.). Her major areas of research are Urban Anthropology, Anthropology of Disasters, and Linguistic and Archeological Anthropology.

Dr. Jarrell Waggoner is a software engineer at Groupon, working on internal tools to perform sales analytics and demand forecasting. He completed his Ph.D. in Computer Science and Engineering from the University of South Carolina and has worked on numerous projects in the areas of computer vision and image processing, including an NEH-funded document image processing project, a DARPA competition to build an event recognition system, and an interdisciplinary AFOSR-funded materials science image processing project. He is an ardent supporter of free software, having used a variety of open source languages, operating systems, and frameworks in his research. His open source projects and contributions, along with his research work, can be found on GitHub (https://github.com/malloc47) and on his website (http://www.malloc47.com).