Book Image

Secret Recipes of the Python Ninja

Book Image

Secret Recipes of the Python Ninja

Overview of this book

This book covers the unexplored secrets of Python, delve into its depths, and uncover its mysteries. You’ll unearth secrets related to the implementation of the standard library, by looking at how modules actually work. You’ll understand the implementation of collections, decimals, and fraction modules. If you haven’t used decorators, coroutines, and generator functions much before, as you make your way through the recipes, you’ll learn what you’ve been missing out on. We’ll cover internal special methods in detail, so you understand what they are and how they can be used to improve the engineering decisions you make. Next, you’ll explore the CPython interpreter, which is a treasure trove of secret hacks that not many programmers are aware of. We’ll take you through the depths of the PyPy project, where you’ll come across several exciting ways that you can improve speed and concurrency. Finally, we’ll take time to explore the PEPs of the latest versions to discover some interesting hacks.
Table of Contents (17 chapters)
Title Page
Copyright and Credits
Packt Upsell
Foreword
Contributors
Preface
Index

Comparing source code to bytecode


Interpreted languages, such as Python, typically take raw source code and generate bytecode. Bytecode is encoded instructions that are on a lower level than source code but not quite as optimized as machine code, that is, assembly language.

Bytecode is often executed within the interpreter (which is a type of virtual machine), though it can also be compiled further into assembly language. Bytecode is used primarily to allow easy, cross-platform compatibility. Python, Java, Ruby, Perl, and similar languages, are examples of languages that use bytecode interpreters for different architectures while the source code stays the same.

While Python automatically compiles source code into bytecode, there are some options and features that can be used to modify how the interpreter works with bytecode. These options can improve the performance of Python programs, a key feature as interpreted languages are, by nature, slower than compiled languages

How to do it...

  1. To create bytecode, simply execute a Python program via python <program>.py.
  2. When running a Python command from the command line, there are a couple of switches that can reduce the size of the compiled bytecode. Be aware that some programs may expect the statements that are removed from the following examples to function correctly, so only use them if you know what to expect.

-O removes assertstatements from the compiled code. These statements provide some debugging help when testing the program, but generally aren't required for production code.

-OO removes both assert and __doc__ strings for even more size reduction.

  1. Loading programs from bytecode into memory is faster than with source code, but actual program execution is no faster (due to the nature of the Python interpreter).
  2. The compileall module can generate bytecode for all modules within a directory. More information on the command can be found at https://docs.python.org/3.6/library/compileall.html.

How it works...

When source code (.py) is read by the Python interpreter, the bytecode is generated and stored in __pycache__ as <module_name>.<version>.pyc. The .pyc extension indicates that it is compiled Python code. This naming convention is what allows different versions of Python code to exist simultaneously on the system.

When source code is modified, Python will automatically check the date with the compiled version in cache and, if it's out of date, will automatically recompile the bytecode. However, a module that is loaded directly from the command line will not be stored in __pycache__ and is recompiled every time. In addition, if there is no source module, the cache can't be checked, that is, a bytecode-only package won't have a cache associated with it.

There's more...

Because bytecode is platform-independent (due to being run through the platform's interpreter), Python code can be released either as .py source files or as .pyc bytecode. This is where bytecode-only packages come into play; to provide a bit of obfuscation and (subjective) security, Python programs can be released without the source code and only the pre-compiled .pyc files are provided. In this case, the compiled code is placed in the source directory rather than the source-code files.