Java is a statically typed, compiled language. That makes it a good fit for developing complex data science applications, but it also makes interactive data exploration harder: after every change, we need to recompile the source code and re-run the analysis script to see the results. Any data the script reads must therefore be loaded again on each run, and if the dataset is large, every run takes longer to start.
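The cycle described above can be illustrated with a minimal sketch. The class name `BatchSummary`, its methods, and the hard-coded values are hypothetical and only stand in for a real analysis script that would read a dataset from disk.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example: names and data are illustrative only.
final class BatchSummary {

    // Stands in for an expensive load step, such as reading a large file.
    static List<Double> loadValues() {
        return Arrays.asList(3.0, 1.0, 4.0, 1.0, 5.0);
    }

    // Arithmetic mean of the values, or NaN for an empty list.
    static double mean(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    // Largest value, or NaN for an empty list.
    static double max(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).max().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // The load step is executed again on every run of the program.
        List<Double> values = loadValues();
        System.out.println("mean = " + mean(values));
        // To inspect the maximum instead, the line above has to be edited,
        // the class recompiled, and the data loaded once more.
    }
}
```

Asking a second question about the same data, such as the maximum, means changing `main`, compiling again with `javac`, and repeating the load step, even though the data itself has not changed.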
This makes it hard to interact with the data, so EDA is more difficult in Java than in other languages. In particular, a Read-Evaluate-Print Loop (REPL), that is, an interactive shell, is an important feature for doing EDA.
Unfortunately, Java 8 does not have a REPL, but there are several alternatives:
- Other interactive JVM languages
- Java 9 with jshell
- Completely alternative platforms such as Python or R
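To show what a REPL changes, here is a minimal sketch that drives the evaluation engine behind jshell through the `jdk.jshell` API that ships with JDK 9 and later. The class `ReplSketch` and the method `evalAll` are hypothetical names, and the `"local"` execution engine is chosen here only so that snippets run inside the current JVM. Interactive use would normally go through the `jshell` command-line tool instead.

```java
import java.util.List;
import jdk.jshell.JShell;
import jdk.jshell.SnippetEvent;

// Hypothetical example: class and method names are illustrative only.
final class ReplSketch {

    // Evaluates the snippets one by one in a single session and
    // returns the value reported for the last snippet that produced one.
    static String evalAll(String... snippets) {
        String last = null;
        // "local" runs snippets in this JVM instead of a separate process.
        try (JShell shell = JShell.builder().executionEngine("local").build()) {
            for (String snippet : snippets) {
                List<SnippetEvent> events = shell.eval(snippet);
                for (SnippetEvent event : events) {
                    if (event.value() != null) {
                        last = event.value();
                    }
                }
            }
        }
        return last;
    }

    public static void main(String[] args) {
        // The array is defined once; later snippets reuse it without reloading.
        System.out.println(evalAll(
                "int[] data = {3, 1, 4, 1, 5};",
                "int sum = java.util.Arrays.stream(data).sum();",
                "int max = java.util.Arrays.stream(data).max().getAsInt();"));
    }
}
```

The point of the sketch is that `data` stays alive in the session between evaluations, so each new question is answered without recompiling a program or loading the data again.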
In this chapter, we will look at the first two options--JVM...