Optimizing dataflow execution – the Data_Transfer transform
The Data_Transfer transform object is a pure optimization tool that helps you push down resource-consuming operations and transformations, such as JOIN and GROUP BY, to the database level.
Getting ready
Take the dataflow from the Loading data from a flat file recipe in Chapter 4, Dataflow – Extract, Transform, and Load. This dataflow loads the Friends_*.txt file into the STAGE.FRIENDS table.
Modify the Friends_30052015.txt file and remove all lines except the ones about Jane and Dave.
In the dataflow, add another source table, OLTP.PERSON, and join it to the source file object in the Query transform by the first-name field. Propagate the PERSONTYPE and LASTNAME columns from the source OLTP.PERSON table into the output Query transform schema, as shown here:
How to do it…
Our goal is to configure this new dataflow so that the insert of the joined dataset, which combines data coming from the file with data coming from the OLTP.PERSON table, is pushed down to a database...
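To make the idea of pushing this operation down more concrete, here is a minimal sketch of the general pattern. It is not Data Services code: it uses plain Python and SQLite as stand-ins. The file rows are first landed in a transfer table inside the database, and a single INSERT ... SELECT then lets the database perform both the join and the insert. The table names STAGE_FRIENDS, OLTP_PERSON, and FRIENDS_TRANSFER, the NAME and FIRSTNAME columns, the function names, and the sample last names and person types are all assumptions made up for this illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical stand-in tables."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for OLTP.PERSON; column names and rows are illustrative only.
    conn.execute(
        "CREATE TABLE OLTP_PERSON (FIRSTNAME TEXT, LASTNAME TEXT, PERSONTYPE TEXT)"
    )
    # Stand-in for the STAGE.FRIENDS target table.
    conn.execute(
        "CREATE TABLE STAGE_FRIENDS (NAME TEXT, PERSONTYPE TEXT, LASTNAME TEXT)"
    )
    conn.executemany(
        "INSERT INTO OLTP_PERSON VALUES (?, ?, ?)",
        [("Jane", "Doe", "A"), ("Dave", "Roe", "B"), ("Ann", "Poe", "C")],
    )
    conn.commit()
    return conn


def load_friends_pushed_down(conn, file_rows):
    """Stage the file rows in the database, then join and insert in one statement."""
    # Step 1: land the file rows in a transfer table inside the database.
    conn.execute("DROP TABLE IF EXISTS FRIENDS_TRANSFER")
    conn.execute("CREATE TABLE FRIENDS_TRANSFER (NAME TEXT)")
    conn.executemany(
        "INSERT INTO FRIENDS_TRANSFER (NAME) VALUES (?)",
        [(name,) for name in file_rows],
    )
    # Step 2: a single INSERT ... SELECT, so the join and the insert both run
    # inside the database and no joined rows pass through the client.
    conn.execute(
        """
        INSERT INTO STAGE_FRIENDS (NAME, PERSONTYPE, LASTNAME)
        SELECT t.NAME, p.PERSONTYPE, p.LASTNAME
        FROM FRIENDS_TRANSFER AS t
        JOIN OLTP_PERSON AS p ON p.FIRSTNAME = t.NAME
        """
    )
    conn.commit()
    # Report how many rows the target table now holds.
    return conn.execute("SELECT COUNT(*) FROM STAGE_FRIENDS").fetchone()[0]


if __name__ == "__main__":
    demo_conn = build_demo_db()
    # The two names kept in the modified input file.
    print(load_friends_pushed_down(demo_conn, ["Jane", "Dave"]))
```

The point of the sketch is the second step: once both inputs live in the same database, the join no longer has to be performed by the client that read the file.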