Mastering Python Scientific Computing

The background of the Python programming language


Python is a general-purpose, high-level programming language that supports most programming paradigms, including procedural, object-oriented, imperative, aspect-oriented, and functional programming. It also supports logic programming through an extension. It is an interpreted language that lets programmers express a concept in fewer lines of code than C++, Java, or similar languages typically require. Python supports dynamic typing and automatic memory management. It has a large and comprehensive standard library, and a large number of third-party libraries are now available for many specific tasks. Packages are easy to install using package managers such as pip, easy_install, homebrew (OS X), apt-get (Linux), and others.

Python is an open source language; its interpreters are available for most operating systems, including Windows, Linux, OS X, and others. There are a number of tools available to convert a Python program into an executable form for different operating systems, for example, py2exe and PyInstaller. The resulting executable is standalone: it does not require a separately installed Python interpreter to run.

The guiding principles of the Python language

Python's guiding principles, set out by Guido van Rossum, who is also known as the Benevolent Dictator For Life (BDFL), have been distilled into a set of aphorisms by Tim Peters and are available at https://www.python.org/dev/peps/pep-0020/. Let's discuss them with some explanations, as follows:

  • Beautiful is better than ugly: The philosophy behind this is to write programs for human readers, with simple expression syntax and consistent syntax and behavior for all programs.

  • Explicit is better than implicit: Most concepts are kept explicit, such as the explicit Boolean type. Boolean variables take an explicit literal value, True or False, instead of depending on zero or nonzero integers. Python still supports the integer-based Boolean concept: nonzero values are treated as true and zero as false. Similarly, its for loop can iterate over data structures without managing an index variable. The same loop can iterate through the items of a tuple and the characters of a string.

  • Simple is better than complex: Automatic memory allocation and the garbage collector manage the allocation and deallocation of memory, which removes that complexity from programs. Another simplification is the simple print statement, which avoids the use of file descriptors for simple printing. Moreover, objects passed to it as comma-separated values are automatically converted to a printable form.

  • Complex is better than complicated: Scientific computing concepts are complex, but this doesn't mean that the program will be complicated. Python programs are not complicated, even for very complex applications. The "Pythonic" way is inherently simple, and the SciPy and NumPy packages are very good examples of this.

  • Flat is better than nested: Python provides a wide variety of modules in its standard library. Namespaces in Python are kept in a flat structure, so there is no need to use very long names, such as java.net.Socket, instead of a simple socket in Python. Python's standard library follows the batteries-included philosophy and provides tools suitable for many tasks. For example, modules for various network protocols support the development of rich Internet applications. Similarly, modules for graphical user interface programming, database programming, regular expressions, high-precision arithmetic, unit testing, and much more are bundled in the standard library. Some of the modules in the library include networking (socket, select, SocketServer, BaseHTTPServer, asyncore, asynchat, xmlrpclib, and SimpleXMLRPCServer), Internet protocols (urllib, httplib, ftplib, smtpd, smtplib, poplib, imaplib, and json), database (anydbm, pickle, shelve, and sqlite3), and parallel processing (subprocess, threading, multiprocessing, and queue).

  • Sparse is better than dense: The Python standard library is kept shallow, and the Python Package Index (PyPI) maintains an extensive list of third-party packages that support in-depth operations on a topic. We can use pip to install these packages.

  • Readability counts: The block structure of a program is expressed using whitespace, and Python uses minimal punctuation in its syntax. As colons introduce blocks and indentation delimits them, no semicolons are needed at the end of a line. Semicolons are allowed, but they are not required. Similarly, in most situations, parentheses are not required around expressions. Python supports inline documentation (docstrings), which is used to generate API documentation. Python's documentation is available at runtime and online.

  • Special cases aren't special enough to break the rules: The philosophy behind this is that everything in Python is an object. All built-in types are implemented as objects. The data types that represent numbers have methods. Even functions are themselves objects with methods.

  • Although practicality beats purity: Python supports multiple programming styles to give users the choice to select the style that is most suitable for their problem. It supports OOP, procedural, functional, and many more types of programming.

  • Errors should never pass silently: Python uses exception handling so that errors need not be handled inside low-level APIs; instead, they can be handled at a higher level, in the program that uses these APIs. It provides standard exceptions with specific meanings, and users can define their own exceptions for custom error handling. To support debugging, Python provides tracebacks. By default, the error handling mechanism prints a complete traceback pointing to the error to stderr. The traceback includes the source filename, line number, and source code, if it is available.

  • Unless explicitly silenced: In some situations, it is appropriate to let an error pass silently. For these situations, we can use a try statement whose except clause deliberately ignores the exception. There is also an option to convert an exception into a string.

  • In the face of ambiguity, refuse the temptation to guess: Automatic type conversion is performed only when it is not surprising. For example, an operation between an integer operand and a float operand results in a float value.

  • There should be one-- and preferably only one --obvious way to do it: This principle calls for the elimination of redundancy, which makes the language easier to learn and remember.

  • Although that way may not be obvious at first unless you're Dutch: The way that we discussed in the previous point is applicable to the standard library. Of course, there will be redundancy in third-party modules. For example, we have support for multiple GUI APIs, such as GTK, wxPython, and KDE. Similarly, for web programming, we have Django, AppEngine, and Pyramid.

  • Now is better than never: This statement is meant to motivate users to adopt Python as their favorite tool now. For example, the ctypes module is meant to wrap existing C/C++ shared libraries for use in Python programs.

  • Although never is often better than *right* now: In this spirit, a Python Enhancement Proposal (PEP) imposed a temporary moratorium (suspension) on all changes to the syntax, semantics, and built-in components for a specified period, so that alternative implementations of Python could catch up.

  • If the implementation is hard to explain, it's a bad idea and If the implementation is easy to explain, it may be a good idea: In Python, all changes to the syntax, new library modules, and APIs go through a highly rigorous process of review and approval.
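
Several of the principles above can be seen in a few lines of code. The following is a minimal sketch, not a definitive illustration: it assumes Python 3 syntax (so print is a function rather than the print statement mentioned above), and names such as NegativeInputError and checked_sqrt are hypothetical, introduced only for this example.

```python
# A sketch in Python 3 syntax; all names below are illustrative only.

# Explicit is better than implicit: True/False are explicit literals, yet
# integers can still be used in a Boolean context (nonzero means true).
flag = True
truthy_values = [bool(0), bool(7), bool(-1)]

# The same for loop iterates over a tuple and over the characters of a
# string, with no index variable to manage.
collected = []
for item in (1, 2, 3):
    collected.append(item)
for char in "abc":
    collected.append(char)

# Everything is an object: numbers and functions have methods and attributes.
bits_needed = (255).bit_length()
half_is_whole = (0.5).is_integer()
function_name = len.__name__


# Errors should never pass silently: a user-defined exception raised in a
# low-level helper and handled by the caller.
class NegativeInputError(ValueError):
    """Hypothetical custom exception used only in this sketch."""


def checked_sqrt(x):
    if x < 0:
        raise NegativeInputError("cannot take the square root of %r" % x)
    return x ** 0.5


try:
    checked_sqrt(-4)
except NegativeInputError as exc:
    message = str(exc)  # an exception converted into a string

# Unless explicitly silenced: this except clause deliberately ignores the error.
try:
    checked_sqrt(-1)
except NegativeInputError:
    pass

# Refuse the temptation to guess: int + float gives a float, but int + str
# is ambiguous and raises TypeError instead of guessing a conversion.
mixed = 2 + 0.5
try:
    1 + "1"
    guessed = True
except TypeError:
    guessed = False

print(truthy_values, collected, bits_needed, function_name, message, mixed, guessed)
```

Note that silencing is always a visible decision in the source: the error is ignored only where an except clause says so.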

Why Python for scientific computing?

To be frank, if we were talking about the Python language alone, we would need to consider other options. Fortunately, we have support for NumPy, SciPy, IPython, and matplotlib, and this makes Python the best choice. We are going to discuss these libraries in subsequent chapters. The following are the features of Python and its associated libraries that make Python preferable to alternatives such as MATLAB, R, and other programming languages. In most cases, no single alternative possesses all of these features.

Compact and readable code

Python code is generally compact and inherently more readable in comparison to its alternatives for scientific computing. As discussed in the Python guiding principles, this is the impact of the design philosophy of Python.

Holistic language design

Overall, the design of the Python language is highly convenient for scientific computing because Python supports multiple programming styles, including procedural, object-oriented, functional, and logic programming. The user has a wide range of choices and they can select the most suitable one for their problem. This is not the case with most of the available alternatives.
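
As a rough illustration of this choice of styles, the same small computation (a sum of squares) can be written in three ways. This is a sketch under assumptions: the function and class names are hypothetical, and Python 3 syntax is used.

```python
from functools import reduce

data = [1, 2, 3, 4]


# Procedural style: an explicit loop and an accumulator variable.
def sum_of_squares_procedural(values):
    total = 0
    for v in values:
        total += v * v
    return total


# Object-oriented style: state and behaviour bundled in a class.
class SquareAccumulator:
    def __init__(self):
        self.total = 0

    def add(self, value):
        self.total += value * value
        return self


# Functional style: no mutation, built from map and reduce.
def sum_of_squares_functional(values):
    return reduce(lambda acc, sq: acc + sq, map(lambda v: v * v, values), 0)


acc = SquareAccumulator()
for v in data:
    acc.add(v)

results = (
    sum_of_squares_procedural(data),
    acc.total,
    sum_of_squares_functional(data),
)
print(results)
```

All three give the same result; which one reads best depends on the problem at hand.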

Free and open source

Python and the associated tools are freely available for use, and they are published as open source tools. This brings the added advantage that their source code is available for inspection. On the other hand, most competing tools are costly proprietary products, and their internal algorithms and concepts are not published for users.

Language interoperability

Python supports interoperability with most existing technologies. We can call or use functions, code, packages, and objects written in different languages, such as MATLAB, C, C++, R, Fortran, and others. There are a number of options available to support this interoperability, such as ctypes, Cython, and SWIG.
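
As one possible illustration of the ctypes route, the following sketch calls the C library function strlen from Python. It assumes a POSIX-like system (Linux or OS X) where the C runtime can be loaded this way; on other platforms the library would have to be located differently.

```python
import ctypes
import ctypes.util

# Locate the C runtime. On POSIX systems, CDLL(None) falls back to the symbols
# already loaded into the current process if no library name is found.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Python bytes are passed to C as a NUL-terminated char pointer.
length = libc.strlen(b"scientific")
print(length)
```

Declaring argtypes and restype is optional in ctypes, but it makes the expected C signature explicit and lets ctypes check the arguments.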

Portable and extensible

Python supports most platforms, so it is a portable programming language: a program written for one platform will produce almost the same output on any other platform, provided that the Python toolkits it uses are available there. The design principles behind Python have made it a highly extensible language, and that's why we have a large number of high-quality libraries available for a number of different tasks.

Hierarchical module system

Python supports a modular system to organize programs in the form of functions and classes in a namespace. The namespace system is very simple in order to keep learning and remembering the concepts easy. This also supports enhanced code reusability and maintenance.
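
A short sketch of how these namespaces look in practice, using only standard library modules (Python 3 syntax assumed; the file name is made up for illustration):

```python
import math                      # names stay inside the "math" namespace
import os.path as osp            # a submodule bound to a short alias
from collections import Counter  # a single name imported directly

root = math.sqrt(16.0)
extension = osp.splitext("results.csv")[1]
counts = Counter("banana")

# Each module is itself an object whose namespace can be inspected.
has_pi = "pi" in dir(math)

print(root, extension, counts["a"], has_pi)
```

The three import forms trade brevity against explicitness: the first keeps the origin of every name visible at the point of use, while the last is the most compact.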

Graphical user interface packages

The Python language offers a wide set of choices in graphics packages and tool sets. These toolkits and packages support graphic design, user interface designing, data visualization, and various other activities.

Data structures

Python supports a wide range of data structures, which are the most important component in the design and implementation of a program that performs scientific computations. Support for the dictionary is the most notable feature of the data structure functionality of the Python language.
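
The following sketch shows the built-in containers side by side, with the dictionary in the main role. The measurement values and unit names are made up purely for illustration, and Python 3 syntax is assumed.

```python
# Built-in containers; the data below is invented for illustration.
readings = [20.5, 21.0, 19.8, 21.0]  # list: ordered, mutable
point = (3.0, 4.0)                   # tuple: ordered, immutable
unique_readings = set(readings)      # set: unordered, no duplicates

# dict: maps keys to values, with lookup by key.
units = {"temperature": "degC", "pressure": "kPa"}
units["humidity"] = "%"                          # insert a new key
pressure_unit = units.get("pressure")            # look up an existing key
missing_unit = units.get("voltage", "unknown")   # default for an absent key

# A dictionary comprehension building a lookup table of squares.
squares = {n: n * n for n in range(4)}

print(len(unique_readings), pressure_unit, missing_unit, squares)
```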

Python's testing framework

Python's unit testing framework, named PyUnit and provided by the unittest module, supports complete unit testing functionality for integration with your Python program. It supports various important unit testing concepts, including test fixtures, test cases, test suites, and test runners.
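
The four concepts can be sketched with the standard unittest module as follows. The function under test, mean, is hypothetical and exists only for this example; Python 3 syntax is assumed.

```python
import unittest


def mean(values):
    """Hypothetical function under test: arithmetic mean of a sequence."""
    return sum(values) / len(values)


class MeanTestCase(unittest.TestCase):
    """Test case: a group of related checks on mean()."""

    def setUp(self):
        # Test fixture: data prepared before each test method runs.
        self.samples = [2.0, 4.0, 6.0]

    def test_mean_of_samples(self):
        self.assertAlmostEqual(mean(self.samples), 4.0)

    def test_empty_input_raises(self):
        with self.assertRaises(ZeroDivisionError):
            mean([])


# Test suite: a collection of test cases. Test runner: executes the suite
# and reports the outcome.
suite = unittest.TestLoader().loadTestsFromTestCase(MeanTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

Building the suite and runner explicitly, rather than calling unittest.main(), keeps the example usable inside an interactive session because it does not exit the interpreter.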

Available libraries

Owing to the batteries-included philosophy of Python, it supports a wide range of standard packages in its bundled library. As it is an extensible language, a number of well-tested special-purpose libraries are available for a wide range of users. Let's briefly discuss a few libraries used for scientific computations.

NumPy and SciPy are packages that support most of the mathematical and statistical operations required for scientific computation. The SymPy library provides functionality for symbolic computation, covering basic symbolic arithmetic, algebra, calculus, discrete mathematics, quantum physics, and more. PyTables is a package used to efficiently process large datasets organized as a hierarchical database. IPython facilitates interactive computing in Python; it is a command shell that supports interactive computing in multiple programming languages. matplotlib is a library that provides plotting functionality for Python and NumPy. It supports various types of graphs, such as line plots, histograms, scatter plots, and 3D plots. SQLAlchemy is an object-relational mapping library for Python. Using it, we can bring database capabilities to scientific computations with good performance and ease.

Finally, SageMath is a toolkit written on top of the packages we just discussed and a number of other open source libraries and toolkits. It is a piece of open source mathematical software.

The downsides of Python

After discussing many upsides of Python over the alternatives, if we look for downsides, we notice something important: the integrated development environments (IDEs) available for Python are not as powerful as those of some alternatives. As Python toolkits are arranged in the form of discrete packages and toolkits, some of them offer only a command-line interface. So, on this point, Python lags behind some alternatives on specific platforms, for example, MATLAB on Windows. However, this doesn't mean that Python is inconvenient; it remains comparable in ease of use.