Reading HTML documents
We can use the standard library html.parser module, but it's not as helpful as we'd like. It only provides low-level lexical scanning information; it doesn't provide a high-level data structure that describes the original web page.
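To make the "low-level lexical scanning" point concrete, here is a minimal sketch of what html.parser reports. The EventLogger class name and the sample markup are invented for illustration; only HTMLParser and its handle_starttag(), handle_endtag(), and handle_data() hooks come from the standard library.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Hypothetical helper: record each low-level event the parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only text between tags.
        if data.strip():
            self.events.append(("data", data.strip()))


# Invented sample markup, not taken from any real page.
sample = "<html><body><h1 class='title'>Results</h1><p>Leg 1</p></body></html>"

logger = EventLogger()
logger.feed(sample)
logger.close()

for event in logger.events:
    print(event)
```

The result is a flat stream of start, end, and data events. Any tree that relates the h1 to its enclosing body would have to be assembled by our own code, which is the gap a higher-level parser fills.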
Instead, we'll use the Beautiful Soup module to parse HTML pages into more useful data structures. This is available from the Python Package Index (PyPI). See https://pypi.python.org/pypi/beautifulsoup4.
This must be downloaded and installed. Often, this is as simple as doing the following:
python -m pip install beautifulsoup4
Using the python -m pip command ensures that we will use the pip command that goes with the currently active virtual environment.
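Once the package is installed, a quick check along the following lines shows the kind of higher-level structure Beautiful Soup builds. This is only a sketch: the sample markup, the table contents, and the variable names are made up for illustration, and it assumes the built-in "html.parser" backend so that no additional parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Invented sample document, used only to exercise the parser.
sample = """
<html><head><title>Race results</title></head>
<body>
<table>
<tr><th>Leg</th><th>Boat</th></tr>
<tr><td>1</td><td>Example A</td></tr>
<tr><td>2</td><td>Example B</td></tr>
</table>
</body></html>
"""

# Build a navigable tree using the standard library parser as the backend.
soup = BeautifulSoup(sample, "html.parser")

# Tags can be reached by name or searched for anywhere in the tree.
title = soup.title.get_text()
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    for tr in soup.find_all("tr")
]

print(title)
for row in rows:
    print(row)
```

Unlike the event stream from html.parser, the soup object can be queried by tag name, so extracting the rows of a table becomes a pair of find_all() calls rather than hand-written state tracking.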
We've gathered some...