Hands-On Software Engineering with Python

By: Brian Allbee, Nimesh Verma

Overview of this book

Software Engineering is about more than just writing code—it includes a host of soft skills that apply to almost any development effort, no matter what the language, development methodology, or scope of the project. Being a senior developer all but requires awareness of how those skills, along with their expected technical counterparts, mesh together through a project's life cycle. This book walks you through that discovery by going over the entire life cycle of a multi-tier system and its related software projects. You'll see what happens before any development takes place, and what impact the decisions and designs made at each step have on the development process. The development of the entire project, over the course of several iterations based on real-world Agile iterations, will be executed, sometimes starting from nothing, in one of the fastest growing languages in the world—Python. Application of practices in Python will be laid out, along with a number of Python-specific capabilities that are often overlooked. Finally, the book will implement a high-performance computing solution, from first principles through complete foundation.

Asking questions

Almost any chunk of code can raise questions, and there can be as many distinct questions as there are chunks of code to ask about. Even very simple code, living in a complex system, can raise questions in response to questions, and more questions in response to those.

If there isn't an obvious starting point, starting with the following really basic questions is a good first step:

Who will be using the functionality?
What will they be doing with it?
When, and where, will they have access to it?
What problem is it trying to solve? In other words, why do they need it?
How does it have to work? If detail is lacking, breaking this one down into two separate questions is useful:
- What should happen if it executes successfully?
- What should happen if the execution fails?

Teasing out more information about the whole system usually starts with something as basic as the following questions:

What other parts of the system does this code interact with?
How does it interact with them?

Having identified all of the moving parts, thinking about "What happens if…" scenarios is a good way to identify potential points where things will break, risks, and dangerous interactions. You can ask questions such as the following:

What happens if this argument, which expects a number, is handed a string?
What happens if that property isn't the object that's expected?
What happens if some other object tries to change this object while it's already being changed?
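As a small illustration of the first scenario, the following sketch uses two hypothetical functions, double_rating and double_rating_checked, which are not part of the system discussed here. It relies only on built-in Python behavior: multiplying a string by an integer repeats the string instead of failing, so the bad input goes unnoticed unless the function checks for it.

```python
def double_rating(value):
    # Hypothetical function: expects a number, but never checks
    return value * 2


def double_rating_checked(value):
    # Same operation, but anything that isn't a real number is rejected
    # (bool is excluded explicitly, because it is a subclass of int)
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(
            'double_rating_checked expects a number, but was passed '
            '%r (%s)' % (value, type(value).__name__)
        )
    return value * 2


print(double_rating(0.25))    # 0.5
print(double_rating('0.25'))  # 0.250.25 (no error, but a wrong result)

try:
    double_rating_checked('0.25')
except TypeError as error:
    # The checked version fails loudly, with a descriptive message
    print(error)
```

The unchecked version is the more dangerous of the two, because the string case produces a plausible-looking value that only causes trouble somewhere further along.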

Whenever one question has been answered, simply ask, "What else?" This can be useful for verifying whether the current answer is reasonably complete.

Let's see this process in action. To provide some context, a new function is being written for a system that keeps track of mineral resources on a map-grid, for three resources: gold, silver, and copper. Grid locations are measured in meters from a common origin point, and each grid location keeps track of a floating-point number for each resource, from 0.0 to 1.0, which indicates how likely it is that the resource will be found in the grid square. The developmental dataset already includes four default nodes, at (0,0), (0,1), (1,0), and (1,1), with no values, as follows:

The system already has some classes defined to represent individual map nodes, and functions to provide basic access to those nodes and their properties, from whatever central data store they live in:

Constants, exceptions, and functions for various purposes already exist, as follows:

node_resource_names: This contains all of the resource names that the system is concerned with, and can be thought of and treated as a list of strings: ['gold','silver','copper']
NodeAlreadyExistsError: An exception that will be raised if an attempt is made to create a MapNode that already exists
NonexistentNodeError: An exception that will be raised if a request is made for a MapNode that doesn't exist
OutOfMapBoundsError: An exception that will be raised if a request is made for a MapNode that isn't allowed to exist in the map area
create_node(x,y): Creates and returns a new, default MapNode, registering it in the global dataset of nodes in the process
get_node(x,y): Finds and returns a MapNode at the specified (x, y) coordinate location in the global dataset of available nodes

A developer makes an initial attempt at writing the code to set a value for a single resource at a given node, as a part of a project. The resulting code looks as follows (assume that all necessary imports already exist):

def SetNodeResource(x, y, z, r, v):
    n = get_node(x,y)
    n.z = z
    n.resources.add(r, v)

This code is functional, from the perspective that it will do what it's supposed to (and what the developer expected) for a set of simple tests; for example, executing the following:

SetNodeResource(0,0,None,'gold',0.25)
print(get_node(0,0))
SetNodeResource(0,0,None,'silver',0.25)
print(get_node(0,0))
SetNodeResource(0,0,None,'copper',0.25)
print(get_node(0,0))

The results are in the following output:

By that measure, there's nothing wrong with the code; it functions, after all. Now, let's ask some of our questions, as follows:

Who will be using this functionality?: The function may be called by either of two different application front-ends: one used by on-site surveyors, and one used by post-survey assayers. The surveyors probably won't use it often, but if they see obvious signs of a deposit during the survey, they're expected to log it with a 100% certainty of finding the resource(s) at that grid location; otherwise, they'll leave the resource rating completely alone.

What will they be doing with it?: Between the base requirements (to set a value for a single resource at a given node) and the preceding answer, this feels like it's already been answered.

When, and where, do they have access to it?: Through a library that's used by the surveyor and assayer applications. No one will use it directly, but it will be integrated into those applications.

How should it work?: This has already been answered, but raises the question: Will there ever be a need to add more than one resource rating at a time? That's probably worth noting, if there's a good place to implement it.

What other parts of the system does this code interact with?: There's not much here that isn't obvious from the code; it uses MapNode objects, those objects' resources, and the get_node function.

What happens if an attempt is made to alter an existing MapNode?: With the code as it was originally written, this behaves as expected. This is the happy path that the code was written to handle, and it works.

What happens if a node doesn't already exist?: The fact that there is a NonexistentNodeError defined is a good clue that at least some map operations require a node to exist before they can complete. Execute a quick test against that by calling the existing function, as follows:

SetNodeResource(0,6,None,'gold',0.25)

The preceding command results in the following:

This is the result because the development data doesn't have a MapNode at that location yet.

What happens if a node can't exist at a given location?: Similarly, there's an OutOfMapBoundsError defined. Since there are no out-of-bounds nodes in the development data, and the code won't currently get past the fact that an out-of-bounds node doesn't exist, there's no good way to see what happens if this is attempted.

What happens if the z-value isn't known at the time?: Since the create_node function doesn't even expect a z-value, but MapNode instances have one, there's a real risk that calling this function on an existing node, without knowing its altitude, would overwrite an existing z-altitude value. That, in the long run, could be a critical bug.
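The risk can be reproduced in a few lines. The sketch below is an assumption-laden illustration, not the system's code: Node is a pared-down, hypothetical stand-in for a MapNode, a plain dictionary stands in for its resources, and the altitude of 1250 is an arbitrary example value.

```python
class Node:
    # Hypothetical, pared-down stand-in for a MapNode
    def __init__(self, x, y, z=None):
        self.x, self.y, self.z = x, y, z
        self.resources = {}


def set_resource_unguarded(node, z, name, value):
    # Mirrors the first attempt: z is always assigned, even when None
    node.z = z
    node.resources[name] = value


def set_resource_guarded(node, name, value, z=None):
    # Only touches z when the caller actually supplies a value
    if z is not None:
        node.z = z
    node.resources[name] = value


surveyed = Node(0, 0, z=1250)

set_resource_guarded(surveyed, 'gold', 0.25)
print(surveyed.z)   # 1250: the recorded altitude survives

set_resource_unguarded(surveyed, None, 'silver', 0.25)
print(surveyed.z)   # None: the recorded altitude has been lost
```

Making z optional, and skipping the assignment when it is omitted, means a caller who only knows a resource value cannot damage altitude data by accident.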

Does this meet all of the various developmental standards that apply?: Without any details about standards, it's probably fair to assume that any standards that were defined would include, at a minimum, the following:
- Naming conventions for code elements, such as function names and arguments; with an existing function at the same logical level named get_node, using SetNodeResource as the name of the new function, while perfectly legal syntactically, may violate a naming convention standard.
- At least some effort towards documentation, of which there's none.
- Some inline comments (maybe), if there is a need to explain parts of the code to future readers. There are none of these either, although, given the amount of code in this version and the relatively straightforward approach, it's arguable whether there would be any need.

What should happen if the execution fails?: It should probably raise explicit errors, with reasonably detailed error messages, if something fails during execution.

What happens if an invalid value is passed for any of the arguments?: Some of them can be tested by executing the current function (as was done previously) while supplying invalid arguments: an out-of-range number first, then an invalid resource name.

Consider the following code, executed with an invalid number:

SetNodeResource(0,0,None,'gold',2)

The preceding code results in the following output:

Also, consider the following code, with an invalid resource type:

SetNodeResource(0,0,None,'tin',0.25)

The preceding code results in the following:

The function itself can either succeed or raise an error during execution, judging by these examples; so, ultimately, all that really needs to happen is that those potential errors have to be accounted for, in some fashion.

Other questions may come to mind, but the preceding questions are enough to implement some significant changes. The final version of the function, after considering the implications of the preceding answers and working out how to handle the issues that those answers exposed, is as follows:

def set_node_resource(x, y, resource_name, 
    resource_value, z=None):
    """
Sets the value of a named resource for a specified 
node, creating that node in the process if it doesn't 
exist.

Returns the MapNode instance.

Arguments:
 - x ................ (int, required, non-negative) The
                      x-coordinate location of the node 
                      that the resource type and value is 
                      to be associated with.
 - y ................ (int, required, non-negative) The 
                      y-coordinate location of the node 
                      that the resource type and value is 
                      to be associated with.
 - z ................ (int, optional, defaults to None) 
                      The z-coordinate (altitude) of the 
                      node.
 - resource_name .... (str, required, member of 
                      node_resource_names) The name of the 
                      resource to associate with the node.
 - resource_value ... (float, required, between 0.0 and 1.0, 
                      inclusive) The presence of the 
                      resource at the node's location.

Raises
 - RuntimeError if any errors are detected.
"""
    # Get the node, if it exists
    try:
        node = get_node(x,y)
    except NonexistentNodeError:
        # The node doesn't exist, so create it and 
        # populate it as applicable
        node = create_node(x, y)
    # If z is specified, set it
    if z is not None:
        node.z = z
# TODO: Determine if there are other exceptions that we can 
#       do anything about here, and if so, do something 
#       about them. For example:
#    except Exception as error:
#        # Handle this exception
    # FUTURE: If there's ever a need to add more than one 
    #    resource-value at a time, we could add **resources 
    #    to the signature, and call node.resources.add once 
    #    for each resource.
    # All our values are checked and validated by the add 
    # method, so set the node's resource-value
    try:
        node.resources.add(resource_name, resource_value)
        # Return the newly-modified/created node in case 
        # we need to keep working with it.
        return node
    except Exception as error:
        # Chain the original error so its details aren't lost
        raise RuntimeError(
            'set_node_resource could not set %s to %0.3f '
            'on the node at (%d,%d).' 
            % (resource_name, resource_value, node.x, 
            node.y)
        ) from error

Stripping out the comments and documentation for the moment, this may not look much different from the original code—only nine lines of code were added—but the differences are significant, as follows:

It doesn't assume that a node will always be available.
If the requested node doesn't exist, it creates a new one to operate on, using the existing function defined for that purpose.
It doesn't assume that every attempt to add a new resource will succeed.
When such an attempt fails, it raises an error that shows what happened.

All of these additional items are direct results of the questions asked earlier, and of making conscious decisions on how to deal with the answers to those questions. That kind of end result is where the difference between the programming and software engineering mindsets really appears.
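To see those differences at work, the sketch below exercises a condensed copy of the final function against hypothetical stand-ins. The names get_node, create_node, NonexistentNodeError, and node_resource_names come from the text. Their implementations here, along with the Resources class and the _nodes dictionary, are assumptions made purely so that the example can run on its own.

```python
# Hypothetical stand-ins for the system's existing pieces
node_resource_names = ['gold', 'silver', 'copper']


class NonexistentNodeError(Exception):
    pass


class Resources(dict):
    def add(self, name, value):
        # Assumed validation, since the final function relies on add
        # to check its values
        if name not in node_resource_names:
            raise ValueError('unknown resource: %r' % (name,))
        if not 0.0 <= value <= 1.0:
            raise ValueError('value out of range: %r' % (value,))
        self[name] = value


class MapNode:
    def __init__(self, x, y):
        self.x, self.y, self.z = x, y, None
        self.resources = Resources()


_nodes = {(0, 0): MapNode(0, 0)}


def get_node(x, y):
    if (x, y) not in _nodes:
        raise NonexistentNodeError('no node at (%d,%d)' % (x, y))
    return _nodes[(x, y)]


def create_node(x, y):
    _nodes[(x, y)] = MapNode(x, y)
    return _nodes[(x, y)]


def set_node_resource(x, y, resource_name, resource_value, z=None):
    # Condensed copy of the final version, without its documentation
    try:
        node = get_node(x, y)
    except NonexistentNodeError:
        node = create_node(x, y)
    if z is not None:
        node.z = z
    try:
        node.resources.add(resource_name, resource_value)
        return node
    except Exception as error:
        raise RuntimeError(
            'set_node_resource could not set %s to %0.3f '
            'on the node at (%d,%d).'
            % (resource_name, resource_value, node.x, node.y)
        ) from error


# 1. The happy path: an existing node is altered
assert set_node_resource(0, 0, 'gold', 0.25).resources == {'gold': 0.25}

# 2. A node that doesn't exist yet is created instead of failing
assert set_node_resource(0, 6, 'gold', 0.25) is get_node(0, 6)

# 3. Invalid values fail with an explicit, detailed error
for name, value in [('gold', 2), ('tin', 0.25)]:
    try:
        set_node_resource(0, 0, name, value)
    except RuntimeError as error:
        print(error)
```

Each of the three checks corresponds to one of the questions asked earlier: altering an existing node, handling a node that doesn't exist, and passing invalid argument values.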
