Setting up your coding environment
The following table summarizes the essential software used in this book.
Here are instructions for installing this software on your system.
Anaconda is the recommended way to install Python, Jupyter Notebooks, scikit-learn (sklearn), and the other data science libraries used in this book, since it installs them all together.
Here are the steps to install Anaconda on your computer as of 2020:
Click Download on the following screen, which does not yet start the download, but presents you with a variety of options (see step 3):
Select your installer. The 64-Bit Graphical Installer is recommended for Windows and Mac. Make sure that you select from the top two rows under Python 3.7 since Python 3.7 is used throughout this book:
After your download begins, continue with the prompts on your computer to complete the installation:
Warning for Mac users
If you run into the error "You cannot install Anaconda3 in this location", do not panic. Just click on the highlighted row, Install for me only, and the Continue button will become available.
Using Jupyter notebooks
Now that you have Anaconda installed, you may open a Jupyter notebook to use Python 3.7. Here are the steps to open a Jupyter notebook:
Click on Anaconda-Navigator on your computer.
Click Launch under Jupyter Notebook as shown in the following screenshot:
This should open a Jupyter notebook in a browser window. While Jupyter notebooks appear in web browsers for convenience, they are run on your personal computer, not online. Google Colab notebooks are an acceptable online alternative, but in this book, Jupyter notebooks are used exclusively.
Select Python 3 from the New tab present on the right side of your Jupyter notebook as shown in the following screenshot:
This should bring you to the following screen:
Congratulations! You are now ready to run Python code! Just type anything in the cell, such as print('hello xgboost!'), and press Shift + Enter to run the code.
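As a slightly more informative first cell, the following sketch reports which Python interpreter the notebook is actually running. The helper name python_version_summary is hypothetical and used only for illustration. A result of 3.7 matches the version used in this book, though newer versions should also work.

```python
import sys


def python_version_summary(version_info=sys.version_info):
    """Return the 'major.minor' version string for an interpreter."""
    return f"{version_info[0]}.{version_info[1]}"


# Report the interpreter behind this notebook's kernel.
print("Python", python_version_summary())
# The path shows which installation the kernel uses; for an Anaconda
# install it normally points inside the Anaconda directory.
print("Interpreter path:", sys.executable)
```

If the path does not point to your Anaconda installation, the notebook was probably launched from a different Python installation.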
Troubleshooting Jupyter notebooks
If you have trouble running or installing Jupyter notebooks, please visit Jupyter's official troubleshooting guide: https://jupyter-notebook.readthedocs.io/en/stable/troubleshooting.html.
At the time of writing, XGBoost is not yet included in Anaconda so it must be installed separately.
Here are the steps for installing XGBoost on your computer:
Go to https://anaconda.org/conda-forge/xgboost. Here is what you should see:
Copy the first line of code in the preceding screenshot, as shown here:
Open the Terminal on your computer.
If you do not know where your Terminal is located, search for Terminal on Mac and Windows Terminal on Windows.
Paste the following code into your Terminal, press Enter, and follow any prompts:
conda install -c conda-forge xgboost
Verify that the installation has worked by opening a new Jupyter notebook as outlined in the previous section. Then enter import xgboost and press Shift + Enter. You should see the following:
If you got no errors, congratulations! You now have all the necessary technical requirements to run code in this book.
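If you prefer a readable message to a raw traceback, the same check can be wrapped in a small helper. This is a minimal sketch: check_import is a hypothetical name, not part of XGBoost or Anaconda, and it only confirms that the module loads.

```python
import importlib


def check_import(module_name):
    """Try to import a module and return (ok, detail)."""
    try:
        module = importlib.import_module(module_name)
    except Exception as err:  # broad on purpose: a broken install may not raise ImportError
        return False, str(err)
    # Most scientific libraries expose __version__; fall back if one does not.
    return True, getattr(module, "__version__", "unknown version")


ok, detail = check_import("xgboost")
if ok:
    print("xgboost is installed:", detail)
else:
    print("xgboost could not be imported:", detail)
    print("Try running this in your Terminal: conda install -c conda-forge xgboost")
```

The same helper works for any other library in this book, for example check_import("sklearn").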
If you received errors trying to set up your coding environment, please go back through the previous steps, or consider reviewing the Anaconda error documentation presented here: https://docs.anaconda.com/anaconda/user-guide/troubleshooting/. Previous users of Anaconda should update Anaconda by entering conda update conda in the Terminal. If you have trouble installing XGBoost, see the official documentation at https://xgboost.readthedocs.io/en/latest/build.html.
Here is code that you may run in a Jupyter notebook to see what versions of the following software you are using:
import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)
import xgboost; print("XGBoost", xgboost.__version__)
Here are the versions used to generate code in this book:
Darwin-19.6.0-x86_64-i386-64bit
Python 3.7.7 (default, Mar 26 2020, 10:32:53)
[Clang 4.0.1 (tags/RELEASE_401/final)]
NumPy 1.19.1
SciPy 1.5.2
Scikit-Learn 0.23.2
XGBoost 1.2.0
It's okay if you have different versions than ours. Software is updated all the time, and you may obtain better results by using newer versions when released. If you are using older versions, however, it's recommended that you update using Anaconda by running conda update conda in the Terminal. You may also run conda update xgboost if you previously installed an older version of XGBoost from conda-forge as outlined in the previous section.
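To see at a glance which libraries are older than the versions listed above, a comparison along the following lines may help. It is only a sketch: BOOK_VERSIONS, version_tuple, and compare_to_book are hypothetical names introduced here, and the version parsing is deliberately simple (it reads the leading digits of each dot-separated part and ignores suffixes such as rc1 or dev0).

```python
import importlib
import re

# Library versions reported in this section for the book's environment.
BOOK_VERSIONS = {
    "numpy": "1.19.1",
    "scipy": "1.5.2",
    "sklearn": "0.23.2",
    "xgboost": "1.2.0",
}


def version_tuple(version):
    """Turn '1.19.1' into (1, 19, 1), padding to at least three parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:  # stop at non-numeric parts such as 'dev0'
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def compare_to_book(module_name, book_version):
    """Describe how the installed version relates to the book's version."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        return "not installed"
    installed = getattr(module, "__version__", None)
    if installed is None:
        return "version unknown"
    if version_tuple(installed) < version_tuple(book_version):
        return f"{installed} (older than the book's {book_version})"
    return f"{installed} (same as or newer than the book's {book_version})"


for name, book_version in BOOK_VERSIONS.items():
    print(name, "->", compare_to_book(name, book_version))
```

Any library reported as older is a candidate for the conda update commands described above.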
Accessing code files
If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Gradient-Boosting-with-XGBoost-and-Scikit-learn. If the code is updated, the changes will appear in the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!