Python Real-World Projects

By Steven F. Lott

Overview of this book

In today's competitive job market, a project portfolio often outshines a traditional resume. Python Real-World Projects empowers you to get to grips with crucial Python concepts while building complete modules and applications. With two dozen meticulously designed projects to explore, this book will help you showcase your Python mastery and refine your skills. Tailored for beginners with a foundational understanding of class definitions, module creation, and Python's inherent data structures, this book is your gateway to programming excellence. You’ll learn how to harness the potential of the standard library and key external projects like JupyterLab, Pydantic, pytest, and requests. You’ll also gain experience with enterprise-oriented methodologies, including unit and acceptance testing, and an agile development approach. Additionally, you’ll dive into the software development lifecycle, starting with a minimum viable product and seamlessly expanding it to add innovative features. By the end of this book, you’ll be armed with a myriad of practical Python projects and all set to accelerate your career as a Python programmer.

What this book covers

We can decompose this book into five general topics:

  • We’ll start with Acquiring Data From Sources. The first six projects cover acquiring data for analytic processing from a variety of sources.

  • Once we have data, we often need to Inspect and Survey it. The next five projects look at ways to inspect data to make sure it’s usable and to diagnose odd problems, outliers, and exceptions.

  • The general analytics pipeline moves on to Cleaning, Converting, and Normalizing. There are eight projects that tackle these closely related problems.

  • The useful results begin with Presenting Summaries. There’s a lot of variability here, so we’ll only present two project ideas. In many cases, you will want to provide your own, unique solutions for presenting the data you’ve gathered.

  • This book winds up with two small projects covering some basics of Statistical Modeling. In some organizations, this may be the start of more sophisticated data science and machine learning applications. We encourage you to continue your study of Python applications in the data science realm.

The first part has two preliminary chapters to help define what the deliverables are and what the broad sweep of the projects will include. Chapter 1, Project Zero: A Template for Other Projects, is a baseline project. The functionality is a “Hello, World!” application. However, the focus is the supporting infrastructure: unit tests, acceptance tests, and the use of a tool like tox or nox to execute the tests.
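
As a taste of that infrastructure, here is a minimal sketch of a noxfile.py that runs a test suite; the session name and the tests/ layout are illustrative assumptions rather than the book’s actual template:

    # noxfile.py -- a minimal sketch; the session name and the tests/
    # layout are assumptions, not the book's actual template.
    import nox

    @nox.session
    def tests(session: nox.Session) -> None:
        """Install pytest and run the whole test suite."""
        session.install("pytest")
        session.run("pytest", "tests/")

Running nox from the project root then creates a fresh virtual environment, installs pytest into it, and executes the suite.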

The next chapter, Chapter 2, Overview of the Projects, shows the general approach this book will follow. This will present the flow of data from acquisition through cleaning to analysis and reporting. This chapter decomposes the large problem of “data analytics” into a number of smaller problems that can be solved in isolation.

The sequence of chapters starting with Chapter 3, Project 1.1: Data Acquisition Base Application, builds a number of distinct data acquisition applications. This sequence starts with acquiring data from CSV files. The first variation, in Chapter 4, Data Acquisition Features: Web APIs and Scraping, looks at ways to get data from web pages.
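
To suggest the shape of that base application, here is a minimal sketch of CSV acquisition using only the standard library; the file name, function name, and row handling are illustrative assumptions:

    # A minimal sketch of CSV acquisition with the standard library.
    # The file name and the row handling are invented for illustration.
    import csv
    from pathlib import Path

    def acquire_csv(source: Path) -> list[dict[str, str]]:
        """Read each row of a CSV file as a dict keyed by column name."""
        with source.open(newline="") as data_file:
            return list(csv.DictReader(data_file))

    rows = acquire_csv(Path("source.csv"))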

The next two projects are combined into Chapter 5, Data Acquisition Features: SQL Database. This chapter builds an example SQL database, and then extracts data from it. The example database lets us explore enterprise database management concepts to more fully understand some of the complexities of working with relational data.
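
As a hint of what the extraction side looks like, here is a small sketch using the standard library’s sqlite3 module; the database path, table, and column names are invented for illustration:

    # A small sketch of extracting rows from a SQL database.
    # The table and column names are invented for illustration.
    import sqlite3

    def extract(db_path: str) -> list[tuple[str, float]]:
        """Run a query and return all of the result rows."""
        connection = sqlite3.connect(db_path)
        try:
            cursor = connection.execute(
                "SELECT name, value FROM measurements"
            )
            return cursor.fetchall()
        finally:
            connection.close()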

Once data has been acquired, the projects transition to data inspection. Chapter 6, Project 2.1: Data Inspection Notebook, creates an initial inspection notebook. In Chapter 7, Data Inspection Features, a series of projects adds features to the basic inspection notebook for different categories of data.
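
One flavor of inspection, sketched with the standard library, is to survey the distinct values in a single column of acquired rows; the column name here is an invented example:

    # A sketch of one inspection technique: survey the distinct values
    # in one column of acquired rows. The column name is invented.
    from collections import Counter

    def survey_column(rows: list[dict[str, str]], column: str) -> Counter:
        """Count how often each distinct value appears in a column."""
        return Counter(row[column] for row in rows)

    # For example: survey_column(rows, "species").most_common(10)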

This topic finishes with Chapter 8, Project 2.5: Schema and Metadata, a project to create a formal schema for a data source and for the acquired data. The JSON Schema standard is used because it adapts readily to enterprise data processing. This schema formalization will become part of later projects.
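
As a small illustration of what such a formal schema can look like, the sketch below validates one row against a JSON Schema using the third-party jsonschema package; the fields shown are invented:

    # A sketch of validating one acquired row against a JSON Schema,
    # using the third-party jsonschema package. The fields are invented.
    from jsonschema import validate

    ROW_SCHEMA = {
        "type": "object",
        "properties": {
            "x": {"type": "number"},
            "y": {"type": "number"},
        },
        "required": ["x", "y"],
    }

    validate(instance={"x": 10.0, "y": 8.04}, schema=ROW_SCHEMA)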

The third topic, cleaning, starts with Chapter 9, Project 3.1: Data Cleaning Base Application. This is the base application to clean the acquired data. It introduces the Pydantic package as a way to provide explicit data validation rules.
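
To show the style of validation Pydantic enables, here is a minimal model written for Pydantic v2; the field names and the validation rule are invented for illustration:

    # A minimal Pydantic (v2-style) model with one explicit validation
    # rule. The fields and the rule are invented examples.
    import math

    from pydantic import BaseModel, field_validator

    class Sample(BaseModel):
        x: float
        y: float

        @field_validator("x", "y")
        @classmethod
        def must_be_finite(cls, value: float) -> float:
            if not math.isfinite(value):
                raise ValueError("value must be finite")
            return value

    good = Sample(x=10.0, y=8.04)  # validates cleanly
    # Sample(x=float("nan"), y=1.0) would raise a ValidationError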

Chapter 10, Data Cleaning Features, has a number of projects to add features to the core data cleaning application. Many of the example datasets in the previous chapters provide very clean data, which can make this chapter seem like needless over-engineering. It can help to extract sample data and then manually corrupt it so that you have examples of both valid and invalid data.

In Chapter 11, Project 3.7: Interim Data Persistence, we’ll look at saving the cleaned data for further use.

The acquire-and-clean pipeline is often packaged as a web service. In Chapter 12, Project 3.8: Integrated Data Acquisition Web Service, we’ll create a web server to offer the cleaned data for subsequent processing. This kind of web service wrapper around a long-running acquire-and-clean process presents a number of interesting design problems.
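
One common way to frame that design problem, independent of any particular web framework, is the “start a job, poll its status” pattern: the service accepts a request, launches the long-running pipeline in the background, and returns a job identifier that clients poll. The sketch below shows that pattern with the standard library only; every name in it is an invented illustration, not the book’s actual design:

    # Sketch: the start/status pattern for a long-running
    # acquire-and-clean job. All names are invented illustrations.
    import uuid
    from concurrent.futures import Future, ThreadPoolExecutor

    executor = ThreadPoolExecutor(max_workers=2)
    jobs: dict[str, Future] = {}

    def acquire_and_clean(source: str) -> str:
        """Stand-in for the long-running pipeline."""
        return f"cleaned data from {source}"

    def start_job(source: str) -> str:
        """What a POST /jobs handler might do: return an id at once."""
        job_id = str(uuid.uuid4())
        jobs[job_id] = executor.submit(acquire_and_clean, source)
        return job_id

    def job_status(job_id: str) -> str:
        """What a GET /jobs/<id> handler might do."""
        return "done" if jobs[job_id].done() else "running"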

The next topic is the analysis of the data. In Chapter 13, Project 4.1: Visual Analysis Techniques, we’ll look at ways to produce reports, charts, and graphs using the power of JupyterLab.
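
Here is a minimal sketch of the kind of chart such a notebook cell might produce, using matplotlib (a common companion to JupyterLab, though the choice here is an assumption); the data and labels are invented:

    # A sketch of a notebook cell producing a scatter chart with
    # matplotlib. The data points and labels are invented.
    import matplotlib.pyplot as plt

    x = [10.0, 8.0, 13.0, 9.0, 11.0]
    y = [8.04, 6.95, 7.58, 8.81, 8.33]

    fig, ax = plt.subplots()
    ax.scatter(x, y)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_title("Acquired data, after cleaning")
    plt.show()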

In many organizations, data analysis may lead to a formal document, or report, showing the results. This may have a large audience of stakeholders and decision-makers. In Chapter 14, Project 4.2: Creating Reports we’ll look at ways to produce elegant reports from the raw data using computations in a JupyterLab notebook.

The final topic is statistical modeling. This starts with Chapter 15, Project 5.1: Modeling Base Application, to create an application that embodies lessons learned in the Inspection Notebook and Analysis Notebook projects. Sometimes we can share Python programming among these projects. In other cases, however, we can only share the lessons learned; as our understanding evolves, we often change data structures and apply other optimizations, making it difficult to simply share a function or class definition.

In Chapter 16, Project 5.2: Simple Multivariate Statistics, we expand on univariate modeling to add multivariate statistics. This modeling is kept simple to emphasize foundational design and architectural details. If you’re interested in more advanced statistics, we suggest building the basic application project, getting it to work, and then adding more sophisticated modeling to an already-working baseline project.
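
As a taste of the building blocks involved, the standard library’s statistics module (Python 3.10 and later) already provides covariance, correlation, and a least-squares fit; the data points below are invented:

    # Bivariate building blocks from the standard library (3.10+).
    # The data points are invented for illustration.
    from statistics import correlation, covariance, linear_regression

    x = [10.0, 8.0, 13.0, 9.0, 11.0]
    y = [8.04, 6.95, 7.58, 8.81, 8.33]

    print(covariance(x, y))
    print(correlation(x, y))
    slope, intercept = linear_regression(x, y)
    print(f"y = {slope:.3f} * x + {intercept:.3f}")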

The final chapter, Chapter 17, Next Steps, provides some pointers for more sophisticated applications. In many cases, a project evolves from exploration to monitoring and maintenance. There will be a long tail where the model continues to be confirmed and refined. In some cases, the long tail ends when a model is replaced. Seeing this long tail can help an analyst understand the value of time invested in creating robust, reliable software at each stage of their journey.