Python: End-to-end Data Analysis

By: Ivan Idris, Luiz Felipe Martins, Martin Czygan, Phuong Vo.T.H, Magnus Vilhelm Persson

Overview of this book

Data analysis is the process of applying logical and analytical reasoning to study each component of the data present in a system. Python is a multi-domain, high-level programming language that offers a wide range of tools and libraries, and it has steadily become one of the primary languages for data science. If you want to become an expert at approaching data analysis problems, solving them, and extracting all of the available information from your data, this is the course you need.

In this course, we will get you started with Python data analysis by introducing the basics of data analysis and the supporting Python libraries, such as matplotlib, NumPy, and pandas. You will create visualizations by choosing color maps, shapes, sizes, and palettes, then delve into statistical data analysis using distribution algorithms and correlations. You'll then find your way around different data and numerical problems, get to grips with Spark and HDFS, and set up migration scripts for web mining. You'll be able to quickly and accurately perform hands-on sorting, reduction, and subsequent analysis, and fully appreciate how data analysis methods can support business decision-making. Finally, you will delve into advanced techniques such as performing regression, quantifying cause and effect using Bayesian methods, and using Python's tools for supervised machine learning.

The course provides highly practical content explaining data analysis with Python, drawn from the following Packt books:

1. Getting Started with Python Data Analysis
2. Python Data Analysis Cookbook
3. Mastering Python Data Analysis

By the end of this course, you will have the knowledge you need to analyze data of varying complexity and turn it into actionable insights.

Parameter	Value	Description
`dtype`	Type name or dictionary mapping columns to types	Sets the data type for the data or for individual columns. By default, the most appropriate data type is inferred.
`skiprows`	List-like or integer	The line numbers to skip (0-indexed) if list-like, or the number of lines to skip at the start of the file if an integer.
`na_values`	List-like or dict, default None	Values to recognize as `NA`/`NaN`. If a dict is passed, this can be set on a per-column basis.
`true_values`	List	A list of values to be converted to Boolean True as well.
`false_values`	List	A list of values to be converted to Boolean False as well.
`keep_default_na`	Boolean, default True	If the `na_values` parameter is present and `keep_default_na` is `False`, the default NaN values are ignored; otherwise the `na_values` entries are appended to the defaults.
`thousands`	String, default None	The thousands separator.
`nrows`	Integer, default None	Limits the number of rows to read from the file.
`error_bad_lines`	Boolean, default True	If set to `True`, a line with too many fields raises an exception and no DataFrame is returned. If set to `False`, such lines are dropped and a DataFrame is still returned. Newer pandas releases replace this parameter with `on_bad_lines`.
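
As a minimal sketch of how several of these parameters combine in one `pandas.read_csv` call, the example below parses a small in-memory CSV. The file content, column names, and the `"missing"` marker are invented for illustration; only the parameter names come from the table above.

```python
import io

import pandas as pd

# Hypothetical CSV content, invented for this illustration.
raw = io.StringIO(
    "# exported by a hypothetical tool\n"
    "id,amount,active,note\n"
    '1,"1,200",yes,ok\n'
    '2,"3,450",no,missing\n'
    '3,"10,000",yes,ok\n'
    "4,7,no,ok\n"
)

df = pd.read_csv(
    raw,
    skiprows=1,                       # skip the leading comment line
    dtype={"id": "int64"},            # force the type of one column
    thousands=",",                    # parse "1,200" as 1200
    na_values={"note": ["missing"]},  # per-column NA marker
    true_values=["yes"],              # map "yes" to True
    false_values=["no"],              # map "no" to False
    nrows=3,                          # read only the first three data rows
)

print(df)
print(df.dtypes)
```

Passing `na_values` as a dict keeps the `"missing"` marker from affecting other columns, which is usually the safer choice when the same string could be a legitimate value elsewhere.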

Function	Description
`read_table`	Read a general delimited file into a DataFrame.
`read_fwf`	Read a table of fixed-width formatted lines into a DataFrame.
`read_clipboard`	Read text from the clipboard and pass it to `read_table`. This is useful for converting tables from web pages.
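
The following sketch shows `read_table` and `read_fwf` on small in-memory strings. The sample data and the column widths are assumptions made up for the example. `read_clipboard` is left out because it depends on a system clipboard being available.

```python
import io

import pandas as pd

# Tab-delimited text: read_table uses a tab as its default separator.
tab_text = "city\ttemp\nOslo\t4\nLima\t19\n"
table_df = pd.read_table(io.StringIO(tab_text))

# Fixed-width text: the first field is 5 characters wide, the second 4.
fwf_text = (
    "name  qty\n"
    "bolt   12\n"
    "nut     7\n"
)
fwf_df = pd.read_fwf(io.StringIO(fwf_text), widths=[5, 4])

print(table_df)
print(fwf_df)
```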

Update Method	Description
`inc()`	Increment a numeric field
`set()`	Set certain fields to new values
`unset()`	Remove a field from the document
`push()`	Append a value onto an array in the document
`pushAll()`	Append several values onto an array in the document
`addToSet()`	Add a value to an array, only if it does not exist
`pop()`	Remove the last value of an array
`pull()`	Remove all occurrences of a value from an array
`pullAll()`	Remove all occurrences of any set of values from an array
`rename()`	Rename a field
`bit()`	Update a value by bitwise operation
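
To make the effect of these update operations concrete, here is a plain-Python model that applies a few of them to a dictionary standing in for a document. This is not database client code: `apply_update` is a hypothetical helper written for this illustration, and it covers only a subset of the operations in the table.

```python
def apply_update(doc, op, field, value=None):
    """Return a copy of doc with one update operation applied (toy model)."""
    # Copy lists so the original document is left untouched.
    out = {k: (list(v) if isinstance(v, list) else v) for k, v in doc.items()}
    if op == "inc":
        out[field] = out.get(field, 0) + value
    elif op == "set":
        out[field] = value
    elif op == "unset":
        out.pop(field, None)
    elif op == "push":
        out.setdefault(field, []).append(value)
    elif op == "addToSet":
        arr = out.setdefault(field, [])
        if value not in arr:
            arr.append(value)
    elif op == "pop":
        if out.get(field):
            out[field].pop()
    elif op == "pull":
        out[field] = [x for x in out.get(field, []) if x != value]
    else:
        raise ValueError(f"unsupported operation: {op}")
    return out


# Hypothetical document, invented for this illustration.
doc = {"name": "sensor-1", "reads": 2, "tags": ["a", "b", "a"]}
doc = apply_update(doc, "inc", "reads", 3)        # numeric increment
doc = apply_update(doc, "addToSet", "tags", "b")  # "b" already present
doc = apply_update(doc, "pull", "tags", "a")      # drop every "a"
doc = apply_update(doc, "push", "tags", "c")      # append to the array
print(doc)
```

The difference between `push` and `addToSet` is the one most likely to matter in practice: the first always appends, while the second leaves the array unchanged when the value is already there.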

Function	Description
`rpushx(name, value)`	Push value onto the tail of the list name if name exists
`rpop(name)`	Remove and return the last item of the list name
`lset(name, index, value)`	Set item at the index position of the list name to input value
`lpushx(name, value)`	Push value onto the head of the list name if name exists
`lpop(name)`	Remove and return the first item of the list name
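
The semantics of these list functions can be sketched with an in-memory stand-in. `ToyListStore` below is a hypothetical class written for this illustration, not a real client; it assumes that a plain `rpush` creates a list, that the `x` variants return 0 and do nothing when the list is missing, and that a list is removed once it becomes empty.

```python
class ToyListStore:
    """In-memory model of the list functions in the table (not a real client)."""

    def __init__(self):
        self._lists = {}

    def rpush(self, name, value):
        # Creates the list if needed; included so the example has a list to use.
        self._lists.setdefault(name, []).append(value)
        return len(self._lists[name])

    def rpushx(self, name, value):
        # Append to the tail only if the list already exists.
        if name not in self._lists:
            return 0
        self._lists[name].append(value)
        return len(self._lists[name])

    def lpushx(self, name, value):
        # Insert at the head only if the list already exists.
        if name not in self._lists:
            return 0
        self._lists[name].insert(0, value)
        return len(self._lists[name])

    def lset(self, name, index, value):
        # Raises KeyError or IndexError for a missing list or bad index.
        self._lists[name][index] = value

    def rpop(self, name):
        return self._pop(name, -1)

    def lpop(self, name):
        return self._pop(name, 0)

    def _pop(self, name, index):
        items = self._lists.get(name)
        if not items:
            return None
        value = items.pop(index)
        if not items:
            del self._lists[name]  # drop the list once it is empty
        return value


store = ToyListStore()
skipped = store.rpushx("jobs", "a")  # list does not exist yet, nothing pushed
store.rpush("jobs", "a")
store.rpushx("jobs", "b")
store.lpushx("jobs", "z")
store.lset("jobs", 1, "A")
first = store.lpop("jobs")
last = store.rpop("jobs")
print(skipped, first, last)
```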
