Before we start, it's important to understand how the components work together. Let's begin with workspaces. The workspace is where data scientists and engineers share results through Databricks notebooks. Notebooks can interoperate with the Databricks filesystem to store Parquet or Delta Lake files. The workspace also stores files such as Python libraries and JAR files, and you can create folders to organize shared files; I typically create a packages folder for the Python and JAR files. Before we install the Python packages, let's first examine what a cluster is by going to the cluster section.
In your Databricks instance, go to the Clusters menu. You can create a new cluster or use one that has already been created. With clusters, you specify the amount of compute you need. Spark can process large datasets, and clusters can also be provisioned with GPUs for ML-optimized workloads. Some clusters run ML-specific runtimes with common machine learning tools preinstalled...
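Specifying the amount of compute can also be done programmatically rather than through the Clusters menu. The sketch below builds a request payload for the Databricks Clusters API; the concrete values (cluster name, runtime version, node type, worker count) are placeholder assumptions you would replace with choices from your own workspace.

```python
import json

# Placeholder values -- substitute your workspace's runtime version and
# node types (e.g. a GPU node type for ML-optimized workloads).
payload = {
    "cluster_name": "analytics-cluster",          # assumed name
    "spark_version": "13.3.x-scala2.12",          # assumed runtime version
    "node_type_id": "Standard_DS3_v2",            # assumed node type
    "num_workers": 2,                              # amount of compute requested
}

# The JSON body you would POST to the workspace's clusters/create endpoint
# (authenticated with a personal access token).
body = json.dumps(payload)
```

Sending `body` to the API (or clicking Create in the UI with the same settings) provisions the driver and worker nodes; scaling up is a matter of raising `num_workers` or picking a larger node type.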