With a bit more familiarity with where to source data, let’s put that knowledge into context: importing data as the first step of a data pipeline workflow. We’ll use a Jupyter notebook to prototype the methodology that we will eventually deploy as a Python script. The reasoning behind this is simple: Jupyter notebooks make visualization easy but are clunky to deploy, while Python scripts offer less convenient visualization (it can be done, just not as effortlessly as in Jupyter) but can easily be deployed across environments. In our case, we want to properly test and “sanity-check” the format of the imported source data. Later in the book, we’ll show how, when we transcribe our code into a Python script, we gain access to PyCharm’s powerful environment to easily test, log, and encrypt Python scripts.
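As a concrete illustration, a minimal sanity check in a notebook cell might look like the following sketch. Note that the filename and the specific checks here are hypothetical placeholders, not the book’s actual dataset or methodology:

```python
# A minimal sanity-check sketch for imported source data.
# "source_data.csv" is a hypothetical placeholder file.
import pandas as pd

# Load the source data into a DataFrame for inspection
df = pd.read_csv("source_data.csv")

# Eyeball the first few rows -- Jupyter renders this as a table
print(df.head())

# Confirm the row/column counts and inferred types match expectations
print(df.shape)
print(df.dtypes)

# Flag any columns with missing values before moving on
print(df.isna().sum())
```

Running each of these checks in its own notebook cell lets you inspect the rendered output interactively before committing the logic to a script.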
Within your PyCharm environment for Chapter 4, verify...