Mastering Object-oriented Python

By: Steven F. Lott

Some Preliminaries

To make the design issues in the rest of the book clearer, we need to look at some of our motivational problems. One of these is the game of Blackjack. Specifically, we're interested in simulating strategies for playing Blackjack. We don't want to endorse gambling. Indeed, a bit of study will show that the game is stacked heavily against the player. This should reveal that most casino gambling is little more than a tax on the innumerate.

Simulation, however, was one of the early problem domains for object-oriented programming. This is an area where object-oriented programming works out particularly elegantly. For more information, see http://en.wikipedia.org/wiki/Simula. Also see An Introduction to Programming in Simula by Rob Pooley.

This chapter will provide some background in tools that are essential for writing complete Python programs and packages. We'll use these tools in later chapters.

We'll make use of the timeit module to compare various object-oriented designs to see which has better performance. It's important to weigh objective evidence along with the more subjective consideration of how well the code seems to reflect the problem domain.

We'll look at the object-oriented use of the unittest and doctest modules. These are essential ingredients in writing software that is known to actually work.

A good object-oriented design should be clear and understandable. In order to assure that it is understood and used as well as maintained properly, writing Pythonic documentation is essential. Docstrings in modules, classes, and methods are very important. We'll touch on RST markup here and cover it in depth in Chapter 18, Quality and Documentation.

Apart from this, we'll address the Integrated Development Environment (IDE) question. A common question regards the best IDE for Python development.

Finally, we'll introduce the concepts behind Python's special method names. The subject of special methods fills the first seven chapters. Here, we'll provide some background that may be of help in understanding Part 1, Pythonic Classes via Special Methods.

We will try to avoid digressing into the foundations of Python object-oriented programming. We're assuming that you've already read the Python 3 Object Oriented Programming book by Packt Publishing. We don't want to repeat things that have been thoroughly stated elsewhere. In this book, we will focus solely on Python 3.

We'll refer to a number of common, object-oriented design patterns. We'll try to avoid repeating the presentation in Packt's Learning Python Design Patterns.

About casino Blackjack

If you're unfamiliar with the casino game of Blackjack, here's an overview.

The objective is to accept cards from the dealer to create a hand with a point total that is higher than the dealer's total without exceeding 21.

The number cards (2 to 10) have point values equal to the number. The face cards (jack, queen, and king) are worth 10 points. The ace is worth either 11 points or one point. When using an ace as 11 points, the value of the hand is soft. When using an ace as one point, the value is hard.

A hand with an ace and seven, therefore, has a hard total of 8 and a soft total of 18.

There are four two-card combinations that total twenty-one. These are all called blackjack even though only one of the four combinations involves a jack.
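
The point rules above can be checked with a minimal sketch. The rank encoding (1 for an ace through 13 for a king) and the function names are assumptions made for this illustration; they are not the class design developed later in the book.

```python
def card_points(rank):
    """Return the hard point value of a rank, where 1 is an ace and 13 a king."""
    # Ranks 11, 12, and 13 stand for jack, queen, and king: all worth 10.
    return min(rank, 10)


def hand_totals(ranks):
    """Return the (hard, soft) totals for a hand given as a list of ranks."""
    hard = sum(card_points(r) for r in ranks)
    # Only one ace can usefully count as 11; doing so adds 10 to the hard total.
    if 1 in ranks and hard + 10 <= 21:
        return hard, hard + 10
    return hard, hard


print(hand_totals([1, 7]))  # (8, 18)

# Second cards that make a two-card twenty-one with an ace: 10, J, Q, K.
blackjacks = [r for r in range(1, 14) if hand_totals([1, r])[1] == 21]
print(blackjacks)  # [10, 11, 12, 13]
```

When counting the ace as 11 would push the hand over 21, this sketch reports the hard total in both positions.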

Playing the game

The game of Blackjack can vary from casino to casino, but the outline is similar. The mechanics of play work as follows:

  • First, the player and dealer each get two cards. The player, of course, knows the value of both of their cards. They're dealt face up in a casino.

  • One of the dealer's cards is face up and the other is face down. The player therefore knows a little bit about the dealer's hand, but not everything.

  • If the dealer has an ace showing, there's a 4:13 chance that the hidden card is worth 10 and the dealer has 21. The player can elect to make an additional insurance bet.

  • Next, the player can elect to either receive cards or stop receiving cards. These two most common choices are called hit or stand.

  • There are some additional choices too. If the player's cards match, the hand can be split. This is an additional bet, and the two hands are played separately.

  • Finally, the players can double their bet before taking one last card. This is called doubling down. If the player's cards total 10 or 11, this is a common bet to make.
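
The 4:13 figure mentioned in the list above can be reproduced with a short calculation. This is a sketch that assumes each of the 13 ranks is equally likely for the hidden card, which ignores the cards already visible on the table.

```python
from fractions import Fraction

# Point values for the 13 ranks: ace, 2 through 10, then jack, queen, king.
rank_points = [11] + list(range(2, 11)) + [10, 10, 10]

# Count the ranks that would give the dealer 21 alongside a visible ace.
ten_valued = sum(1 for points in rank_points if points == 10)
chance = Fraction(ten_valued, len(rank_points))
print(chance)  # 4/13
```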

The final evaluation of the hand works as follows:

  • If the player went over 21, the hand is a bust, the player loses, and the dealer's facedown card is irrelevant.

  • If the player's total is 21 or under, then the dealer takes cards according to a simple, fixed rule. The dealer must hit a hand that totals less than 17. The dealer must stand on a hand that totals 17 or more. There are some small variations here that we can ignore for the moment.

  • If the dealer goes bust, the player wins.

  • If both the dealer and player are 21 or under, the hands are compared to see if the player has won or lost.
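
The evaluation rules above can be sketched as two small functions. The names are hypothetical, the stand_on threshold is a parameter set here to the widely used casino value of 17, and the "push" result for a tie is an assumption that the text does not spell out.

```python
def dealer_should_hit(total, stand_on=17):
    """Apply the dealer's fixed rule: hit below the threshold, else stand."""
    return total < stand_on


def outcome(player_total, dealer_total):
    """Return 'win', 'lose', or 'push' from the player's point of view."""
    if player_total > 21:
        return "lose"  # Player bust: the dealer's hand is irrelevant.
    if dealer_total > 21:
        return "win"  # Dealer bust.
    if player_total > dealer_total:
        return "win"
    if player_total < dealer_total:
        return "lose"
    return "push"  # Assumed name for a tie.


print(dealer_should_hit(16), dealer_should_hit(17))  # True False
print(outcome(20, 18), outcome(22, 25), outcome(19, 19))  # win lose push
```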

The amounts of the final payoffs aren't too relevant for now. For a more accurate simulation of various play and betting strategies, the payoffs will matter quite a bit.

Blackjack player strategies

In the case of Blackjack (which is different from a game such as Roulette), there are actually two kinds of strategies that the player must use, as follows:

  • A strategy to decide what game play to make: take insurance, hit, stand, split, or double down.

  • A strategy to decide what amount to bet. A common statistical fallacy leads players to raise and lower their bets in an attempt to preserve their winnings and minimize their losses. Any software to emulate casino games must also emulate these more complex betting strategies. These are interesting algorithms that are often stateful and lead to the learning of some advanced Python programming techniques.

These two sets of strategies are the prime examples of the STRATEGY design pattern.
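
As a rough illustration of the STRATEGY pattern applied to betting, here is a minimal sketch. The class names, the bet() signature, and the doubling rule are assumptions chosen for this example; they are not the designs presented later in the book.

```python
class BettingStrategy:
    """Hypothetical interface: decide the next bet from the last result."""

    def bet(self, last_won):
        raise NotImplementedError


class Flat(BettingStrategy):
    """Stateless: always bet the same amount."""

    def __init__(self, stake=1):
        self.stake = stake

    def bet(self, last_won):
        return self.stake


class Martingale(BettingStrategy):
    """Stateful: double the bet after a loss, reset it after a win."""

    def __init__(self, stake=1):
        self.base = stake
        self.current = stake

    def bet(self, last_won):
        self.current = self.base if last_won else self.current * 2
        return self.current


def bets_for(strategy, results):
    """Ask one strategy object for a bet after each win (True) or loss (False)."""
    return [strategy.bet(won) for won in results]


print(bets_for(Flat(), [True, False, False, True]))  # [1, 1, 1, 1]
print(bets_for(Martingale(), [True, False, False, True]))  # [1, 2, 4, 1]
```

The caller, bets_for(), depends only on the bet() method, so strategy objects can be swapped without changing the simulation code.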

Object design for simulating Blackjack

We'll use elements of the game like the player hand and card as examples of object modeling. However, we won't design the entire simulation. We'll focus on elements of this game because they have some nuance but aren't terribly complex.

We have a simple container: one hand object will contain zero or more card objects.

We'll take a look at the subclasses of Card for NumberCard, FaceCard, and Ace. We'll take a look at a wide variety of ways to define this simple class hierarchy. Because the hierarchy is so small (and simple), we can easily try a number of implementation alternatives.

We'll take a look at a variety of ways to implement the player's hand. This is a simple collection of cards with some additional features.

We also need to look at the player as a whole. A player will have a sequence of hands as well as a betting strategy and a Blackjack play strategy. This is a rather complex composite object.

We'll also take a quick look at the deck of cards that cards are shuffled and dealt from.
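
To make the objects in this section concrete, here is one possible first draft. It is only a sketch: the attribute names (rank, suit, hard, soft), the make_card() helper, and the use of plain lists for the deck and hand are assumptions for illustration, and later chapters explore several alternatives.

```python
import random


class Card:
    """Base class: a rank label, a suit, and hard and soft point values."""

    def __init__(self, rank, suit, hard, soft):
        self.rank, self.suit, self.hard, self.soft = rank, suit, hard, soft


class NumberCard(Card):
    def __init__(self, rank, suit):
        super().__init__(str(rank), suit, rank, rank)


class FaceCard(Card):
    def __init__(self, rank, suit):
        names = {11: "J", 12: "Q", 13: "K"}
        super().__init__(names[rank], suit, 10, 10)


class Ace(Card):
    def __init__(self, rank, suit):
        super().__init__("A", suit, 1, 11)


def make_card(rank, suit):
    """Hypothetical helper: pick the subclass for a numeric rank 1 to 13."""
    if rank == 1:
        return Ace(rank, suit)
    if 2 <= rank <= 10:
        return NumberCard(rank, suit)
    return FaceCard(rank, suit)


# A deck is a shuffled list of 52 cards; a hand is a list dealt from it.
deck = [make_card(rank, suit) for suit in "CDHS" for rank in range(1, 14)]
random.shuffle(deck)
hand = [deck.pop(), deck.pop()]
print(len(deck), [card.rank + card.suit for card in hand])
```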

Performance – the timeit module

We'll make use of the timeit module to compare the actual performance of different object-oriented designs and Python constructs. The timeit module contains a number of functions. The one we'll focus on is named timeit. This function creates a Timer object for some statement. It can also include some setup code that prepares the environment. It then calls the timeit() method of Timer to execute the setup just once and the target statement repeatedly. The return value is the total time, in seconds, required to run the statement the given number of times.

The default count is 1,000,000. This provides a meaningful time that averages out other OS-level activity on the computer that is performing the measurement. For complex or long-running statements, a lower count may be prudent.
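
A lower count can be requested with the number argument of timeit.timeit(). The following is a small sketch; the statement, the setup string, and the count of 10,000 are arbitrary choices for illustration.

```python
import timeit

# Everything the statement needs must be defined in the setup string.
setup = """
def f():
    pass
"""

count = 10_000
elapsed = timeit.timeit("f()", setup, number=count)

# The result is the total for all runs; divide to get a per-call figure.
print(elapsed, elapsed / count)
```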

The following is a simple interaction with timeit:

>>> import timeit
>>> timeit.timeit( "obj.method()", """
... class SomeClass:
...     def method(self):
...        pass
... obj= SomeClass()
... """)
0.1980541350058047

Tip

Downloading the example code

You can download the example code files for all Packt Publishing books you have purchased from your account at http://www.packtpub.com. If you have purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

The statement obj.method() is provided to timeit() as a string. The setup is the class definition and is provided as a string as well. It's important to note that everything required by the statement must be in the setup. This includes all imports as well as all variable definitions and object creation. Everything.

It can take a few tries to get the setup complete. When using interactive Python, we often lose track of global variables and imports that have scrolled off the top of the terminal window. This example showed that 1,000,000 method calls that do nothing take 0.198 seconds.

The following is another example of using timeit:

>>> timeit.timeit( "f()","""
... def f():
...     pass
... """ )
0.13721893899491988

This shows us that a do-nothing function call is less expensive than a do-nothing method invocation. In this case, the method invocation takes about 44 percent longer.
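
The overhead figure follows directly from the two timings shown above. As a quick check:

```python
method_time = 0.1980541350058047  # obj.method() timing from the first example
function_time = 0.13721893899491988  # f() timing from the second example

# Extra time for the method call, relative to the plain function call.
overhead = method_time / function_time - 1
print(f"{overhead:.1%}")  # 44.3%
```

These values come from a single run on one machine, so the exact ratio will vary elsewhere.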

In some cases, OS overheads may be a measurable component of the performance. These tend to vary based on factors that are hard to control. We can use the repeat() function instead of the timeit() function in this module. It will collect multiple samples of the basic timing to allow further analysis of OS effects on performance.
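
A minimal sketch of timeit.repeat() follows. The sample count of five and the run count of 10,000 are arbitrary; taking the minimum is one common way to reduce the influence of other OS activity, though other summaries are possible.

```python
import timeit

# Collect five independent samples, each timing 10,000 calls of f().
samples = timeit.repeat("f()", "def f(): pass", repeat=5, number=10_000)

# The fastest sample is the one least disturbed by other processes.
best = min(samples)
print(len(samples), best)
```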

For our purposes, the timeit() function will provide all the feedback we need to measure the various object-oriented design considerations objectively.

Testing – unittest and doctest

Unit testing is absolutely essential. If there's no automated test to show a particular element's functionality, then the feature doesn't really exist. Put another way, it's not done until there's a test that shows that it's done.

We'll touch, tangentially, on testing. If we were to delve into testing each object-oriented design feature, the book would be twice as big as it is. Omitting the details of testing has the disadvantage that it makes good unit tests seem optional. They're emphatically not optional.

Tip

Unit testing is essential

When in doubt, design the tests first. Fit the code to the test cases.

Python offers two built-in testing frameworks. Most applications and libraries will make use of both. The general wrapper for all testing is the unittest module. In addition, many public API docstrings will have examples that can be found and used by the doctest module. Also, unittest can incorporate modules of doctest.

One lofty ideal is that every class and function has at least one unit test. More importantly, visible classes, functions, and modules will have doctests too. There are other lofty ideals: 100 percent code coverage, 100 percent logic path coverage, and so on.

Pragmatically, some classes don't need testing. A class created by namedtuple(), for example, doesn't really need a unit test, unless you don't trust the namedtuple() implementation in the first place. If you don't trust your Python implementation, you can't really write applications with it.

Generally, we want to develop the test cases first and then write code that fits these test cases. The test cases formalize the API for the code. This book will reveal numerous ways to write code that has the same interface. This is important. Once we've defined an interface, there are still numerous candidate implementations that fit the interface. One set of tests should apply to several different object-oriented designs.

One general approach to using the unittest tools is to create at least three parallel directories for your project as follows:

  • myproject: This directory is the final package that will be installed in lib/site-packages for your package or application. It has an __init__.py file, and we'll put our files in here for each module.

  • test: This directory has the test scripts. In some cases, the scripts will parallel the modules. In some cases, the scripts may be larger and more complex than the modules themselves.

  • doc: This directory has other documentation. We'll touch on this in the next section as well as in Chapter 18, Quality and Documentation.

In some cases, we'll want to run the same test suite on multiple candidate classes so that we can be sure that each candidate works. There's no point in doing timeit comparisons on code that doesn't actually work.

Unit testing and technology spikes

As part of object-oriented design, we'll often create technology spike modules that look like the code shown in this section. We'll break it down into three sections. First, we have the overall abstract test as follows:

import types
import unittest

class TestAccess( unittest.TestCase ):
    def test_should_add_and_get_attribute( self ):
        self.object.new_attribute= True
        self.assertTrue( self.object.new_attribute )
    def test_should_fail_on_missing( self ):
        self.assertRaises( AttributeError, lambda: self.object.undefined )

This abstract TestCase subclass defines a few tests that we're expecting a class to pass. The actual object being tested is omitted. It's referenced as self.object, but no definition is provided, making this TestCase subclass abstract. A setUp() method is required by each concrete subclass.

The following are three concrete TestAccess subclasses that will exercise three different kinds of objects:

class SomeClass:
    pass
class Test_EmptyClass( TestAccess ):
    def setUp( self ):
       self.object= SomeClass()
class Test_Namespace( TestAccess ):
    def setUp( self ):
       self.object= types.SimpleNamespace()
class Test_Object( TestAccess ):
    def setUp( self ):
       self.object= object()

The subclasses of the TestAccess classes each provide the required setUp() method. Each method builds a different kind of object for testing. One is an instance of an otherwise empty class. The second is an instance of types.SimpleNamespace. The third is an instance of object.

In order to run these tests, we'll need to build a suite that includes only the concrete subclasses, so that the abstract TestAccess class is never run directly.

The following is the rest of the spike:

def suite():
    s= unittest.TestSuite()
    s.addTests( unittest.defaultTestLoader.loadTestsFromTestCase(Test_EmptyClass) )
    s.addTests( unittest.defaultTestLoader.loadTestsFromTestCase(Test_Namespace) )
    s.addTests( unittest.defaultTestLoader.loadTestsFromTestCase(Test_Object) )
    return s

if __name__ == "__main__":
    t= unittest.TextTestRunner()
    t.run( suite() )

We now have concrete evidence that the object class can't be used the same way the types.SimpleNamespace class can be used. Further, we have a simple test class that we can use to demonstrate other designs that work (or don't work). The tests, for example, demonstrate that types.SimpleNamespace behaves like an otherwise empty class.
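
The same difference can be seen without the unittest machinery. The following sketch uses a hypothetical helper, accepts_new_attribute(), to try the assignment on each kind of object.

```python
import types


class SomeClass:
    pass


def accepts_new_attribute(obj):
    """Return True if a new attribute can be set on obj and read back."""
    try:
        obj.new_attribute = True
    except AttributeError:
        # Plain object instances have no __dict__ to hold new attributes.
        return False
    return obj.new_attribute


print(accepts_new_attribute(SomeClass()))  # True
print(accepts_new_attribute(types.SimpleNamespace()))  # True
print(accepts_new_attribute(object()))  # False
```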

We have omitted numerous details of potential unit test cases. We'll look at testing in depth in Chapter 15, Designing for Testability.

Docstrings – RST markup and documentation tools

All Python code should have docstrings at the module, class, and method levels. Not every single method requires a docstring. Some method names are really well chosen, and little more needs to be said about them. Most times, however, documentation is essential for clarity.

Python documentation is often written using reStructuredText (RST) markup.

Throughout the code examples in the book, however, we'll omit docstrings. It keeps the book to a reasonable size. This gap has the disadvantage that it makes docstrings seem optional. They're emphatically not optional.

We'll emphasize this again. Docstrings are essential.

The docstring material is used by Python in the following three ways:

  • The built-in help() function displays the docstrings

  • The doctest tool can find examples in docstrings and run them as test cases

  • External tools such as Sphinx and epydoc can produce elegant documentation extracts

Because of the relative simplicity of RST, it's quite easy to write good docstrings. We'll take a look at documentation and the expected markup in detail in Chapter 18, Quality and Documentation. For now, however, we'll provide a quick example of what a docstring might look like:

def factorial( n ):
    """Compute n! recursively.

    :param n: an integer >= 0
    :returns: n!

    Because of Python's stack limitation, this won't
    compute a value larger than about 1000!.

    >>> factorial(5)
    120
    """
    if n == 0: return 1
    return n*factorial(n-1)

This shows RST markup for parameters and return values. It includes an additional note about a profound limitation. It also includes the doctest output that can be used to validate the implementation using the doctest tool. There are numerous markup features that can be used to provide additional structure and semantic information.
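
As a sketch of how the doctest tool uses such an example, the following repeats a shortened version of the function and runs its docstring example with doctest's DocTestFinder and DocTestRunner classes. In everyday use, doctest.testmod() or the command python -m doctest is more common; the finder and runner are used here only to keep the example in one self-contained piece.

```python
import doctest


def factorial(n):
    """Compute n! recursively.

    >>> factorial(5)
    120
    """
    if n == 0:
        return 1
    return n * factorial(n - 1)


# Locate the examples in the docstring, then run and compare each one.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(factorial, "factorial", globs={"factorial": factorial}):
    runner.run(test)

# The summary reports how many examples were attempted and how many failed.
results = runner.summarize(verbose=False)
print(results.attempted, results.failed)
```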

The IDE question

A common question regards the best IDE for Python development. The short answer is that the IDE choice doesn't matter at all. The number of development environments that support Python is vast.

All the examples in this book show interactive examples from the Python >>> prompt. Running examples interactively makes a profound statement. Well-written Python should be simple enough to run from the command line.

Note

We should be able to demonstrate a design at the >>> prompt.

Exercising code from the >>> prompt is an important quality test for Python design complexity. If the classes or functions are too complex, then there's no easy way to exercise them from the >>> prompt. For some complex classes, we may need to provide appropriate mock objects to permit easy, interactive use.

About special method names

Python has multiple layers of implementation. We're interested in just two of them.

On the surface, we have Python's source text. This source text is a mixture of a traditional object-oriented notation and procedural function call notation. The postfix object-oriented notation includes object.method() or object.attribute constructs. The prefix notation involves function(object) constructs that are more typical of procedural programming languages. We also have an infix notation such as object+other. Plus, of course, some statements such as for and with invoke object methods.

The presence of function(object) prefix constructs leads some programmers to question the "purity" of Python's object orientation. It's not clear that a fastidiously strict adherence to the object.method() notation is necessary or even helpful. Python uses a mixture of prefix and suffix notations. The prefix notations are stand-ins for special method suffix notations. The presence of the prefix, infix, and postfix notations is based on choices of expressiveness and esthetics. One goal of well-written Python is that it should read more or less like English. Underneath the hood, the syntax variations are implemented consistently by Python's special methods.

Everything in Python is an object. This is unlike Java or C++ where there are "primitive" types that avoid the object paradigm. Every Python object offers an array of special methods that provide implementation details for the surface features of the language. We might, for example, write str(x) in an application program. This prefix surface notation is implemented as x.__str__() under the hood.

A construct such as a+b may be implemented as a.__add__(b) or b.__radd__(a) depending on the type of compatibility rules that were built into the class definitions for objects a and b.
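
A small sketch shows both mappings at work. The Points class is hypothetical; it exists only to show str() reaching __str__() and the + operator reaching __add__() or __radd__().

```python
class Points:
    """Hypothetical class wrapping a point value."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        # Used by str(x) and print(x).
        return f"{self.value} points"

    def __add__(self, other):
        # Used for self + other.
        if isinstance(other, Points):
            return Points(self.value + other.value)
        if isinstance(other, int):
            return Points(self.value + other)
        return NotImplemented  # Let Python try the other operand.

    def __radd__(self, other):
        # Used for other + self when other's own __add__() declines.
        return self.__add__(other)


p = Points(10)
print(str(p))  # 10 points
print(str(p + 1))  # 11 points, via p.__add__(1)
print(str(1 + p))  # 11 points, via p.__radd__(1)
```

In the last line, the int object's own addition returns NotImplemented for a Points operand, so Python falls back to the reflected method on the right-hand operand.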

The mapping between surface syntax and the implementation of special methods is emphatically not a trivial rewrite from function(x) to x.__function__(). There are numerous language features that have interesting special methods to support that feature. Some special methods have default implementations inherited from the base class, object, while other special methods have no default implementation and will raise an exception.
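
The difference between inherited defaults and missing special methods can be seen with an otherwise empty class. This is a minimal sketch; the class name Empty is arbitrary.

```python
class Empty:
    pass


e = Empty()

# __repr__() has a default implementation inherited from object.
text = repr(e)
print(text)

# __len__() has no default, so the len() function raises an exception.
try:
    len(e)
except TypeError as ex:
    error = str(ex)
    print(error)

print(hasattr(e, "__repr__"), hasattr(e, "__len__"))  # True False
```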

Throughout Part 1, Pythonic Classes via Special Methods, we'll introduce the special methods and show how we can implement these special methods to provide seamless integration between Python and our class definitions.

Summary

We've looked at one of our sample problem domains: the casino game of Blackjack. We like it because it has some algorithmic complexity, but isn't too sophisticated or esoteric. We've also introduced three important modules that we'll be using throughout the book:

  • The timeit module is something we'll use to compare performance of alternative implementations

  • The unittest and doctest modules will be used to confirm that our software works correctly

We've also looked at some of the ways we'll add documentation to our Python programs. We'll be using docstrings in modules, classes, and functions. To save space, not every example will show the docstrings. In spite of this, they should be considered as essential.

The use of an integrated development environment (IDE) isn't essential. Any IDE or text editor that works for you will be fine for advanced Python development.

The seven chapters that follow will address different subsets of the special method names. These chapters show how we can create our own Python classes that integrate seamlessly with the built-in library modules.

In the next chapter, we'll focus on the __init__() method and the various ways we can use it. The __init__() method is profound because initialization is the first big step in an object's life; every object must be initialized properly to work properly. More important than that, the argument values for __init__() can take on many forms. We'll look at a variety of ways to design __init__().