Modern Python Cookbook

Overview of this book

Python is the preferred choice of developers, engineers, data scientists, and hobbyists everywhere. It is a great scripting language that can power your applications and provide great speed, safety, and scalability. By presenting Python as a series of simple recipes, this book gives you insight into specific language features in a particular context. Having a tangible context makes a language or standard library feature easier to understand. This book comes with over 100 recipes on the latest version of Python. The recipes will benefit everyone from beginners to experts. The book is broken down into 13 chapters that build from simple language concepts to more complex applications of the language. The recipes touch upon all the necessary Python concepts related to data structures, OOP, functional programming, and statistical programming. You will get acquainted with the nuances of Python syntax and learn how to use its advantages effectively. You will end the book equipped with knowledge of testing, web services, configuration, and application integration tips and tricks. The recipes take a problem-solution approach to resolving issues commonly faced by Python programmers across the globe. You will be armed with the knowledge to create applications with flexible logging, powerful configuration and command-line options, automated unit tests, and good documentation.

Reading complex formats using regular expressions


There are many file formats that lack the elegant regularity of a CSV file. One common file format that's rather difficult to parse is a web server log file. These files tend to have complex data without a single separator character or consistent quoting rules.

When we looked at a simplified log file in the Writing generator functions with the yield statement recipe in online Chapter 12, Functional And Reactive Programming Features (link provided in the Preface), we saw that the rows look as follows:

[2016-05-08 11:08:18,651] INFO in ch09_r09: Sample Message One 
[2016-05-08 11:08:18,651] DEBUG in ch09_r09: Debugging 
[2016-05-08 11:08:18,652] WARNING in ch09_r09: Something might have gone wrong

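This file uses a variety of punctuation marks: square brackets around the timestamp, a comma inside the timestamp, spaces between fields and within the message, and a colon after the module name. The csv module can't handle this complexity.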
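To see why the csv module struggles here, the following minimal sketch feeds two of the sample rows to csv.reader, first with the default comma delimiter and then with a space delimiter. The names sample, comma_rows, and space_rows are made up for this illustration and are not part of the recipe.

```python
import csv
import io

# Two of the sample rows shown above (illustrative copy).
sample = (
    "[2016-05-08 11:08:18,651] INFO in ch09_r09: Sample Message One\n"
    "[2016-05-08 11:08:18,651] DEBUG in ch09_r09: Debugging\n"
)

# Default dialect: the only comma is inside the timestamp's milliseconds,
# so each row is cut into two meaningless halves.
comma_rows = list(csv.reader(io.StringIO(sample)))

# Space delimiter: the timestamp is split in two and the message is split
# word by word, so rows end up with different numbers of fields.
space_rows = list(csv.reader(io.StringIO(sample), delimiter=" "))

print(comma_rows[0])
print([len(row) for row in space_rows])
```

Neither delimiter yields a consistent set of columns, which suggests that a single separator character is not enough to describe this format.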

How can we process this kind of data with the elegant simplicity of a CSV file? Can we transform these irregular rows to a more regular data structure...
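One way to approach this is sketched below, under the assumption that every row follows the layout of the three sample rows: a bracketed timestamp, a severity level, the word "in", a module name followed by a colon, and a free-form message. The names LogRow, LOG_PATTERN, and parse_log are hypothetical and chosen only for this illustration; the recipe's own solution may differ in its details.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class LogRow(NamedTuple):
    """Hypothetical row structure; field names are assumptions."""
    date: str
    level: str
    module: str
    message: str


# Named groups capture each field; the literal punctuation between
# them (brackets, "in", colon) is matched but not kept.
LOG_PATTERN = re.compile(
    r"\[(?P<date>[^\]]+)\]\s+"   # [2016-05-08 11:08:18,651]
    r"(?P<level>\w+)\s+in\s+"    # INFO in
    r"(?P<module>\w+):\s+"       # ch09_r09:
    r"(?P<message>.*)"           # rest of the line
)


def parse_log(lines: Iterable[str]) -> Iterator[LogRow]:
    """Yield a LogRow for each line that matches; skip lines that don't."""
    for line in lines:
        match = LOG_PATTERN.match(line.rstrip())
        if match:
            yield LogRow(**match.groupdict())


sample = [
    "[2016-05-08 11:08:18,651] INFO in ch09_r09: Sample Message One ",
    "[2016-05-08 11:08:18,651] DEBUG in ch09_r09: Debugging ",
    "[2016-05-08 11:08:18,652] WARNING in ch09_r09: Something might have gone wrong",
]
rows = list(parse_log(sample))
for row in rows:
    print(row.level, row.message)
```

Because each LogRow has the same four fields, the parsed rows can be processed much like the rows produced by a CSV reader. Silently skipping non-matching lines is a design choice made for brevity here; a real application might prefer to log or raise on them.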