Python Real-World Projects

By : Steven F. Lott

Overview of this book

In today's competitive job market, a project portfolio often outshines a traditional resume. Python Real-World Projects empowers you to get to grips with crucial Python concepts while building complete modules and applications. With two dozen meticulously designed projects to explore, this book will help you showcase your Python mastery and refine your skills. Tailored for beginners with a foundational understanding of class definitions, module creation, and Python's inherent data structures, this book is your gateway to programming excellence. You’ll learn how to harness the potential of the standard library and key external projects like JupyterLab, Pydantic, pytest, and requests. You’ll also gain experience with enterprise-oriented methodologies, including unit and acceptance testing, and an agile development approach. Additionally, you’ll dive into the software development lifecycle, starting with a minimum viable product and seamlessly expanding it to add innovative features. By the end of this book, you’ll be armed with a myriad of practical Python projects and all set to accelerate your career as a Python programmer.

Preface

Who this book is for

What this book covers

A note on skills required

To get the most out of this book

Conventions used

Get in touch

Share your thoughts

Download a free PDF copy of this book

Chapter 1: Project Zero: A Template for Other Projects

1.1 On quality

1.2 Suggested project sprints

1.3 List of deliverables

1.4 Development tool installation

1.5 Project 0 – Hello World with test cases

1.6 Summary

1.7 Extras

Chapter 2: Overview of the Projects

2.1 General data acquisition

2.2 Acquisition via Extract

2.3 Inspection

2.4 Clean, validate, standardize, and persist

2.5 Summarize and analyze

2.6 Statistical modeling

2.7 Data contracts

2.8 Summary

Chapter 3: Project 1.1: Data Acquisition Base Application

3.1 Description

3.2 Architectural approach

3.3 Deliverables

3.4 Summary

3.5 Extras

Chapter 4: Data Acquisition Features: Web APIs and Scraping

4.1 Project 1.2: Acquire data from a web service

4.2 Project 1.3: Scrape data from a web page

4.3 Summary

4.4 Extras

Chapter 5: Data Acquisition Features: SQL Database

5.1 Project 1.4: A local SQL database

5.2 Project 1.5: Acquire data from a SQL extract

5.3 Summary

5.4 Extras

Chapter 6: Project 2.1: Data Inspection Notebook

Chapter 7: Data Inspection Features

7.1 Project 2.2: Validating cardinal domains — measures, counts, and durations

7.1.1 Description

7.1.2 Approach

7.1.3 Deliverables

7.2 Project 2.3: Validating text and codes — nominal data and ordinal numbers

7.2.1 Description

7.2.2 Approach

7.2.3 Deliverables

7.3 Project 2.4: Finding reference domains

7.4 Summary

7.5 Extras

Chapter 8: Project 2.5: Schema and Metadata

Chapter 9: Project 3.1: Data Cleaning Base Application

Chapter 10: Data Cleaning Features

10.1 Project 3.2: Validate and convert source fields

10.2 Project 3.3: Validate text fields (and numeric coded fields)

10.3 Project 3.4: Validate references among separate data sources

10.4 Project 3.5: Standardize data to common codes and ranges

10.5 Project 3.6: Integration to create an acquisition pipeline

10.6 Summary

10.7 Extras

Chapter 11: Project 3.7: Interim Data Persistence

11.1 Description

11.2 Overall approach

11.3 Deliverables

11.4 Summary

11.5 Extras

Chapter 12: Project 3.8: Integrated Data Acquisition Web Service

12.1 Description

12.2 Overall approach

12.3 Deliverables

12.4 Summary

12.5 Extras

Chapter 13: Project 4.1: Visual Analysis Techniques

13.1 Description

13.2 Overall approach

13.3 Deliverables

13.4 Summary

13.5 Extras

Chapter 14: Project 4.2: Creating Reports

14.1 Description

14.2 Overall approach

14.3 Deliverables

14.4 Summary

14.5 Extras

Chapter 15: Project 5.1: Modeling Base Application

Chapter 16: Project 5.2: Simple Multivariate Statistics

Chapter 17: Next Steps

17.1 Overall data wrangling

17.2 The concept of “decision support”

17.3 Concept of metadata and provenance

17.4 Next steps toward machine learning

Why subscribe?

Other Books You Might Enjoy

Packt is searching for authors like you

Share your thoughts

Download a free PDF copy of this book

Index

2.3 Inspection

Data inspection needs to be done when starting development. It’s essential to survey new data to be sure it really is what’s needed to solve the user’s problems. A common frustration is incomplete or inconsistent data, and these problems need to be exposed as soon as possible to avoid wasting time and effort creating software to process data that doesn’t really exist.

Additionally, data is inspected manually to uncover problems. It’s important to recognize that data sources are in a constant state of flux. As applications evolve and mature, the data provided for analysis will change. In many cases, data analytics applications discover other enterprise changes only after the fact, when invalid data shows up. Good data inspection tools make it possible to understand this evolution.

Inspection is an inherently manual process. Therefore, we’re going to use JupyterLab to create notebooks to look at the data and determine some basic features.
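As a minimal sketch of what one such notebook cell might look like, the following counts missing values per column. The sample data, the column names x and y, and the missing_counts() function are assumptions made for illustration; they are not taken from the book's projects. It also assumes each row has exactly as many fields as the header.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for an acquired file.
# The column names "x" and "y" are assumptions for this sketch.
SAMPLE = """x,y
10.0,8.04
8.0,
13.0,7.58
,8.81
"""


def missing_counts(rows):
    """Count the empty or absent values found in each column."""
    counts = Counter()
    for row in rows:
        for name, value in row.items():
            # DictReader supplies None when a row is shorter than the header.
            if value is None or not value.strip():
                counts[name] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(len(rows), "rows; missing values:", dict(missing_counts(rows)))
```

A cell like this exposes incomplete data early, before any processing software is written around it.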

In rare cases where privacy is important, developers may not be allowed to do data inspection. Instead, more privileged people (those with permission to see payment card or healthcare details) may need to take part in data inspection. This means an inspection notebook may be something created by a developer for use by stakeholders.

In many cases, a data inspection notebook can be the start of a fully automated data cleansing application. A developer can extract notebook cells as functions, building a module that’s usable from both notebook and application. The cell results can be used to create unit test cases.
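The idea of turning a cell into a function can be sketched as follows. The value_counts() function, the series column, and the sample rows are hypothetical; the expected result in the test stands in for what a notebook cell displayed. This is one possible shape for such a module, not the book's own code.

```python
from collections import Counter
from collections.abc import Iterable


def value_counts(rows: Iterable[dict[str, str]], column: str) -> Counter[str]:
    """Frequency of each distinct value in one column.

    This began as the body of a notebook cell; as a function it can be
    imported by the notebook and by a cleaning application alike.
    """
    return Counter(row[column] for row in rows)


def test_value_counts() -> None:
    # The expected counts are copied from what the cell displayed
    # for this (hypothetical) sample.
    rows = [{"series": "I"}, {"series": "II"}, {"series": "I"}]
    assert value_counts(rows, "series") == Counter({"I": 2, "II": 1})
```

A test function written this way can be collected and run by pytest, so the notebook's observations become a repeatable check.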

This stage in the pipeline requires a number of inspection projects:

Project 2.1: “Inspect Data”. This will build a core data inspection notebook with enough features to confirm that some of the acquired data is likely to be valid.
Project 2.2: “Inspect Data: Cardinal Domains”. This project will add analysis features for measurements, dates, and times. These are cardinal domains that reflect measures and counts.
Project 2.3: “Inspect Data: Nominal and Ordinal Domains”. This project will add analysis features for text or coded numeric data. This includes nominal data and ordinal numeric domains. It’s important to recognize that US Zip Codes are digit strings, not numbers.
Project 2.4: “Inspect Data: Reference Data”. This notebook will include features to find reference domains when working with data that has been normalized and decomposed into subsets with references via coded “key” values.
Project 2.5: “Define a Reusable Schema”. As a final step, it can help to define a formal schema, and related metadata, using the JSON Schema standard.
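Two of the points in this list, Zip Codes as digit strings and a schema written with JSON Schema keywords, can be illustrated together. In this sketch the field names zip and count, the five-digit pattern, and the zip_is_valid() function are assumptions. The check is hand-rolled with the standard library for illustration; it is not a full JSON Schema validator.

```python
import re

# A JSON Schema fragment written as a Python dict.
# The field names and the five-digit pattern are assumptions for this sketch.
SCHEMA = {
    "type": "object",
    "properties": {
        "zip": {"type": "string", "pattern": "^[0-9]{5}$"},
        "count": {"type": "integer", "minimum": 0},
    },
    "required": ["zip", "count"],
}


def zip_is_valid(value: object) -> bool:
    """Hand-rolled check of the "zip" rules only: a string matching the pattern."""
    spec = SCHEMA["properties"]["zip"]
    return isinstance(value, str) and re.fullmatch(spec["pattern"], value) is not None


# Converting a Zip Code to a number silently drops the leading zero.
print(int("02134"))

# The string form is accepted; the numeric form and the damaged string are not.
print(zip_is_valid("02134"), zip_is_valid(2134), zip_is_valid("2134"))
```

Treating the code as a number loses information that cannot be recovered later, which is why the schema declares the field as a string with a pattern.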

While some of these projects seem to be one-time efforts, they often need to be written with some care. In many cases, a notebook will need to be reused when there’s a problem. It helps to provide adequate explanations and test cases to refresh someone’s memory on the details of the data and its known problem areas. Additionally, notebooks may serve as examples for test cases and the design of Python classes or functions to automate cleaning, validating, or standardizing data.

After a detailed inspection, we can then build applications to automate cleaning, validating, and normalizing the values. The next batch of projects will address this stage of the pipeline.
