Now that we are clear on how a connection to a Cassandra server is made, we can talk about a very special client. Everything we have seen so far has been leading to this point. We have seen what Spark can do, and we now know Cassandra and that we can use it as a storage layer to improve Spark performance.
We need a client to make this connection, but this client is special because it was designed specifically for Spark rather than for a particular programming language. It is called the Spark Cassandra connector.
The Spark Cassandra connector has its own GitHub repository. The latest stable version is in the master branch, but we can access a specific version through its own branch.
The Cassandra connector project home page is: https://github.com/datastax/spark-cassandra-connector .
At the time of writing, the latest stable connector version is 1.6.0.
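To give a rough idea of what the connector offers once it is on the classpath, here is a minimal Scala sketch that points Spark at a Cassandra node and counts the rows of a table. It is only an illustration under stated assumptions: the host 127.0.0.1, the keyspace test_ks, and the table kv are hypothetical names that do not come from this chapter, and the code assumes a running Cassandra node plus a connector build that matches your Spark and Scala versions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // adds cassandraTable to SparkContext

object ConnectorSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a Cassandra node is reachable at 127.0.0.1
    val conf = new SparkConf(true)
      .setAppName("connector-sketch")
      .setMaster("local[*]")
      .set("spark.cassandra.connection.host", "127.0.0.1")

    val sc = new SparkContext(conf)
    try {
      // Hypothetical keyspace and table names; replace them with your own
      val rows = sc.cassandraTable("test_ks", "kv")
      println(s"Row count: ${rows.count()}")
    } finally {
      sc.stop() // release Spark resources even if the read fails
    }
  }
}
```

The import of com.datastax.spark.connector._ is what makes the cassandraTable method available on the SparkContext; without the connector on the classpath, the program does not compile.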
The connector is basically a .jar file loaded when Spark starts. So, if you prefer to access...