In the previous recipe, we saw how to use EMR to access and query DynamoDB data. In this recipe, we will see how to join two DynamoDB tables to get a combined view.
To perform this recipe, you should have completed the previous recipe and should still have your EMR cluster running.
Here, we will use two tables: the Customer table and the Orders table. The Customer table contains detailed information about each customer, while the Orders table contains the details of each order, along with customerId, which links the two tables. We now want to run queries that need information from both tables. DynamoDB alone cannot do this because it has no join operation, so we use EMR.
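To make the role of customerId concrete, here is a minimal Python sketch of the kind of combined view a join produces. It is only an in-memory illustration, not the EMR/Hive query used in this recipe. The sample rows, the attribute names other than customerId (name, orderId, total), and the helper join_on_customer_id are all hypothetical.

```python
# Hypothetical sample rows; only customerId comes from the recipe text.
customers = [
    {"customerId": "C1", "name": "Alice"},
    {"customerId": "C2", "name": "Bob"},
]
orders = [
    {"orderId": "O1", "customerId": "C1", "total": 25.0},
    {"orderId": "O2", "customerId": "C1", "total": 40.0},
    {"orderId": "O3", "customerId": "C3", "total": 15.0},  # no matching customer
]


def join_on_customer_id(customers, orders):
    """Inner-join orders to customers on the shared customerId attribute."""
    # Index customers by key so each order lookup is a dictionary access.
    by_id = {c["customerId"]: c for c in customers}
    joined = []
    for order in orders:
        customer = by_id.get(order["customerId"])
        if customer is not None:
            # Merge customer and order attributes into one combined row.
            joined.append({**customer, **order})
    return joined


combined = join_on_customer_id(customers, orders)
for row in combined:
    print(row)
```

Orders whose customerId has no matching customer are dropped, and customers with no orders do not appear, which is the behavior of an inner join.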
To get started, we need to make sure that we have two tables created, as mentioned earlier. Now, we will connect to the EMR cluster, and we will create two Hive tables corresponding...