Because machine learning needs a lot of computational power, in this chapter we will save resources (especially memory) by using a Spark environment that is not backed by YARN. This mode of operation is named standalone (Spark's own documentation calls running everything on a single machine local mode): it creates a Spark node without cluster functionality, so all the processing happens on the driver machine and is not shared. Don't worry; the code that we will see in this chapter will work in a cluster environment as well.
To operate this way, perform the following steps:
To turn it off, press Ctrl + C to exit the IPython Notebook, and then run vagrant halt to turn off the virtual machine.
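As a rough illustration of the single-machine setup described in this section, the sketch below builds a Spark context whose master URL has the form local[N] or local[*], which tells Spark to run inside one process on the driver machine instead of going through YARN. The helper names local_master and make_local_context and the application name are hypothetical, and requesting the mode from code like this is an assumption for illustration, not necessarily the procedure the chapter's steps follow.

```python
def local_master(n_threads=None):
    """Return a Spark master URL that keeps all work on the driver machine.

    None means "use as many worker threads as there are logical cores".
    """
    if n_threads is None:
        return "local[*]"
    if n_threads < 1:
        raise ValueError("n_threads must be a positive integer")
    return "local[{}]".format(n_threads)


def make_local_context(app_name="chapter-sketch", n_threads=None):
    """Create a SparkContext that does not use YARN (hypothetical helper)."""
    # pyspark is imported here so that local_master() can be used
    # even on a machine where Spark is not installed.
    from pyspark import SparkConf, SparkContext

    conf = (
        SparkConf()
        .setMaster(local_master(n_threads))  # no cluster manager involved
        .setAppName(app_name)                # assumed, illustrative name
    )
    return SparkContext(conf=conf)


# Example usage (requires a working Spark installation):
# sc = make_local_context(n_threads=2)
# print(sc.master)
# sc.stop()
```

Limiting the number of threads is one way to keep memory and CPU use low on a small virtual machine. Because only the master URL changes, the same code can later be pointed at a cluster.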