In the first recipe of this chapter, we saw how to use AWS Data Pipeline to export DynamoDB data to S3. Creating and executing a pipeline is quick and easy, but it gives us very little control over the individual steps, so in this recipe we will explain how to export DynamoDB data to S3 using EMR directly.
To perform this recipe, you should have completed the earlier recipe and have your EMR cluster still running.
Let's export data from DynamoDB to AWS S3:
To perform this recipe, we need to create two tables. In the earlier recipes, we already created productHiveTable, as shown in the following code (the dynamodb.column.mapping property pairs each Hive column with the DynamoDB attribute of the same name):

CREATE EXTERNAL TABLE productHiveTable (
    id string,
    type string,
    mnfr string,
    name string,
    price bigint,
    stock bigint)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
    "dynamodb.table.name" = "product",
    "dynamodb.column.mapping" = "id:id,type:type,mnfr:mnfr,name:name,price:price,stock:stock");
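The second table would be an S3-backed external table into which Hive copies the DynamoDB data. A minimal sketch of that step is shown below; the table name productS3Table, the delimiter, and the bucket path s3://my-bucket/product/ are assumptions for illustration, not values from the original recipe:

```sql
-- Sketch: a second external table stored in S3 (table and bucket
-- names here are assumptions, substitute your own bucket path).
CREATE EXTERNAL TABLE productS3Table (
    id string,
    type string,
    mnfr string,
    name string,
    price bigint,
    stock bigint)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://my-bucket/product/';

-- Copy all rows from the DynamoDB-backed table into the S3-backed table.
INSERT OVERWRITE TABLE productS3Table
SELECT * FROM productHiveTable;
```

Running the INSERT OVERWRITE statement launches a Hadoop job on the EMR cluster that scans the DynamoDB table and writes the results as delimited text files under the S3 location.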