Python Real-World Projects

By: Steven F. Lott

Overview of this book

In today's competitive job market, a project portfolio often outshines a traditional resume. Python Real-World Projects empowers you to get to grips with crucial Python concepts while building complete modules and applications. With two dozen meticulously designed projects to explore, this book will help you showcase your Python mastery and refine your skills. Tailored for beginners with a foundational understanding of class definitions, module creation, and Python's inherent data structures, this book is your gateway to programming excellence. You’ll learn how to harness the potential of the standard library and key external projects like JupyterLab, Pydantic, pytest, and requests. You’ll also gain experience with enterprise-oriented methodologies, including unit and acceptance testing, and an agile development approach. Additionally, you’ll dive into the software development lifecycle, starting with a minimum viable product and seamlessly expanding it to add innovative features. By the end of this book, you’ll be armed with a myriad of practical Python projects and all set to accelerate your career as a Python programmer.

2.8 Summary

This data analysis pipeline moves data from sources through a series of stages to create clean, valid, standardized data. The general flow supports a variety of needs and permits a great deal of customization and extension.

For developers with an interest in data science or machine learning, these projects cover what is sometimes called the “data wrangling” part of that work. Wrangling can be a significant complication, because the data must be understood and the differences among data sources must be explored and resolved. These are the sometimes difficult preparatory steps prior to building a model that can be used for AI decision-making.

For readers with an interest in the web, this kind of data processing and extraction is part of presenting data via a web application API or website. Project 3.7 creates a web server, and will be of particular interest. Because the web service requires clean data, the preceding projects are helpful for creating data that can be published.

For folks with an automation or IoT interest, Part 2 explains how to use Jupyter Notebooks to gather and inspect data. This is a common need, and the various steps to clean, validate, and standardize data become all the more important when dealing with real-world devices subject to the vagaries of temperature and voltage.

We’ve looked at the following multi-stage approach to doing data analysis:

  • Data Acquisition

  • Inspection of Data

  • Clean, Validate, Standardize, and Persist

  • Summarize and Analyze

  • Create a Statistical Model
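The stages above can be sketched as a chain of small functions, each consuming the output of the previous one. This is only an illustration under stated assumptions, not the book's implementation: the function names (`acquire`, `clean`, `summarize`), the two-column `x,y` line format, and the sample values are all hypothetical, and the inspection, persistence, and modeling stages are left out to keep the sketch short.

```python
import statistics
from collections.abc import Iterable, Iterator

# Hypothetical row types for this sketch only.
RawRow = dict[str, str]
CleanRow = dict[str, float]


def acquire(lines: Iterable[str]) -> Iterator[RawRow]:
    """Acquisition: turn "x,y" source lines into raw, string-valued rows."""
    for line in lines:
        x, y = line.strip().split(",")
        yield {"x": x, "y": y}


def clean(rows: Iterable[RawRow]) -> Iterator[CleanRow]:
    """Clean, validate, standardize: convert to float and skip invalid rows."""
    for row in rows:
        try:
            yield {"x": float(row["x"]), "y": float(row["y"])}
        except ValueError:
            # A real project would probably log or count the rejected row.
            continue


def summarize(rows: Iterable[CleanRow]) -> dict[str, float]:
    """Summarize: compute a count and a mean over the clean rows."""
    values = [row["y"] for row in rows]
    return {"count": len(values), "mean_y": statistics.mean(values)}


# Made-up sample data; the third line fails validation and is dropped.
source = ["1,2.0", "2,4.0", "3,not-a-number", "4,8.0"]
summary = summarize(clean(acquire(source)))
print(summary)
```

Because each stage is a generator or a plain function, a stage can be replaced or extended without touching the others, which is one way to get the customization the general flow permits.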

This pipeline follows the Extract-Transform-Load (ETL) concept. The terms have been changed because the legacy words are sometimes misleading. Our acquisition stage overlaps with what is understood as the “Extract” operation. For some developers, Extract is limited to database extracts; we’d like to go beyond that to include other data source transformations. Our cleaning, validating, and standardizing stages are usually combined into the “Transform” operation. Saving the clean data is generally the objective of “Load”; we’re not emphasizing a database load, but instead, we’ll use files.
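A file-based “Load” step might look like the following minimal sketch. The choice of newline-delimited JSON, the helper names `save_rows` and `load_rows`, and the file name are assumptions made for illustration; the text above says only that files are used in place of a database load.

```python
import json
import tempfile
from pathlib import Path


def save_rows(rows: list[dict[str, float]], target: Path) -> int:
    """Write one JSON document per line and return the number of rows written."""
    count = 0
    with target.open("w", encoding="utf-8") as output:
        for row in rows:
            output.write(json.dumps(row) + "\n")
            count += 1
    return count


def load_rows(source: Path) -> list[dict[str, float]]:
    """Read the rows back, ignoring any blank lines."""
    with source.open(encoding="utf-8") as text:
        return [json.loads(line) for line in text if line.strip()]


# Made-up clean data, written to a temporary directory and read back.
clean_rows = [{"x": 1.0, "y": 2.0}, {"x": 2.0, "y": 4.0}]
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "clean.ndjson"
    written = save_rows(clean_rows, target)
    restored = load_rows(target)
```

A line-oriented text format keeps the persisted data easy to inspect with ordinary tools, and a later stage can read it one row at a time.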

Throughout the book, we’ll describe each project’s objective and provide the foundation of a sound technical approach. The details of the implementation are up to you. We’ll enumerate the deliverables; this may repeat some of the information from Chapter 1, Project Zero: A Template for Other Projects. The book provides a great deal of information on acceptance test cases and unit test cases, which together form the definition of done. By covering the approach rather than a finished implementation, we’ve left room for you to design and implement the needed application software.

In the next chapter, we’ll build the first data acquisition project. This will work with CSV-format files. Later projects will work with database extracts and web services.
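As a small preview of working with CSV-format data, the standard library's `csv.DictReader` turns each data line into a dictionary keyed by the header row. The column names and values below are invented for this sketch, and an in-memory `io.StringIO` stands in for a real file.

```python
import csv
import io

# Hypothetical CSV content; a real project would open a file instead.
sample = io.StringIO("x,y\n10.0,8.04\n8.0,6.95\n")

reader = csv.DictReader(sample)
rows = list(reader)

# Every value arrives as a string; conversion belongs to a later cleaning stage.
for row in rows:
    print(row["x"], row["y"])
```

Keeping the values as strings at this point matches the pipeline described above: acquisition only gathers the data, and validation and standardization happen in later stages.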