Statistics for Data Science

Overview of this book

Data science is an ever-evolving field that is growing in popularity at an exponential rate. It includes techniques and theories drawn from the fields of statistics, computer science, and, most importantly, machine learning, databases, data visualization, and so on. This book takes you through an entire journey of statistics, from knowing very little to becoming comfortable using various statistical methods for data science tasks. It starts off with simple statistics and then moves on to the statistical methods that are used in data science algorithms. The R programs for statistical computation are clearly explained, along with the logic behind them. You will come across various mathematical concepts, such as variance, standard deviation, probability, matrix calculations, and more. You will learn only what is required to implement statistics in data science tasks such as data cleaning, mining, and analysis. You will learn the statistical techniques required to perform tasks such as linear regression, regularization, model assessment, boosting, SVMs, and working with neural networks. By the end of the book, you will be comfortable performing various statistical computations for data science programmatically.
Table of Contents (19 chapters)
Title Page
Credits
About the Author
About the Reviewer
www.PacktPub.com
Customer Feedback
Preface

About the Author

James D. Miller is an IBM certified expert, creative innovator, and accomplished Director, Sr. Project Leader, and Application/System Architect with more than 35 years of extensive applications and system design and development experience across multiple platforms and technologies. His experience includes introducing customers to new and sometimes disruptive technologies and platforms; integrating with IBM Watson Analytics, Cognos BI, and TM1; web architecture design; systems analysis; GUI design and testing; database modelling; and the design and development of OLAP, client/server, web, and mainframe applications and systems utilizing IBM Watson Analytics, IBM Cognos BI and TM1 (TM1 rules, TI, TM1Web, and Planning Manager), Cognos Framework Manager, dynaSight-ArcPlan, ASP, DHTML, XML, IIS, MS Visual Basic and VBA, Visual Studio, PERL, SPLUNK, WebSuite, MS SQL Server, ORACLE, SYBASE Server, and so on.

Responsibilities have also included all aspects of Windows and SQL solution development and design, including analysis; GUI (and website) design; data modelling; table, screen/form, and script development; SQL (and remote stored procedures and triggers) development/testing; test preparation and management; and training of programming staff. Other experience includes the development of Extract, Transform, and Load (ETL) infrastructure, such as data transfer automation between mainframe (DB2, Lawson, Great Plains, and so on) systems and client/server SQL Server and web-based applications, and the integration of enterprise applications and data sources.

Mr. Miller has acted as Internet Applications Development Manager, responsible for the design, development, QA, and delivery of multiple websites, including online trading applications, warehouse process control and scheduling systems, and administrative and control applications. He was also responsible for the design, development, and administration of a web-based financial reporting system for a 450-million-dollar organization, reporting directly to the CFO and his executive team.

He has also been responsible for managing and directing multiple resources in various management roles including project and team leader, lead developer and applications development director.

He has authored the following books published by Packt:

  • Mastering Predictive Analytics with R – Second Edition 
  • Big Data Visualization 
  • Learning IBM Watson Analytics 
  • Implementing Splunk – Second Edition 
  • Mastering Splunk 
  • IBM Cognos TM1 Developer's Certification Guide 

He has also authored a number of whitepapers on best practices, such as Establishing a Center of Excellence, and continues to post blogs on a number of relevant topics based on personal experiences and industry best practices.

He is a perpetual learner who continues to pursue experiences and certifications, and currently holds the following technical certifications:

  • IBM Certified Developer Cognos TM1
  • IBM Certified Analyst Cognos TM1
  • IBM Certified Administrator Cognos TM1
  • IBM Cognos TM1 Master 385 Certification
  • IBM Certified Advanced Solution Expert Cognos TM1
  • IBM OpenPages Developer Fundamentals C2020-001-ENU
  • IBM Cognos 10 BI Administrator C2020-622
  • IBM Cognos 10 BI Author C2090-620-ENU
  • IBM Cognos BI Professional C2090-180-ENU
  • IBM Cognos 10 BI Metadata Model Developer C2090-632
  • IBM Certified Solution Expert - Cognos BI

Specialties: The evaluation and introduction of innovative and disruptive technologies, cloud migration, IBM Watson Analytics, big data, data visualizations, Cognos BI and TM1 application design and development, OLAP, Visual Basic, SQL Server, forecasting and planning, international application development, business intelligence, project development and delivery, and process improvement.

To Nanette L. Miller: "Like a river flows surely to the sea, darling so it goes, some things are meant to be."