As we have seen, some systems store data in a denormalized form, and in the previous section we saw how to normalize it. In essence, we were turning columns into rows. With some data, however, we may wish to split a combined field not into rows, but into individual columns. For example, suppose a system stores its employee data with the following schema:
[employee_id] | [name]
The name field holds the last name and first name of the employee in the following format:
[last_name], [first_name]
An example file is shown as follows:
Note
The schema does not have three fields; the second field contains both the last name and the first name, separated by a comma.
Our objective in this example is to manipulate the data so that it maps to a three-field schema:
[employee_id] | [last_name] | [first_name]
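Outside of the Studio, the logic of this transformation can be sketched in a few lines of Python. This is only an illustration of the idea, not what the job itself generates. The function name `split_employee_row`, the sample rows, and the assumption that a pipe character separates the two input fields are all hypothetical.

```python
def split_employee_row(row, field_sep="|", name_sep=","):
    """Turn 'employee_id|last_name, first_name' into three separate fields.

    field_sep and name_sep are assumptions about the input file layout.
    """
    # Split the row into its two original fields.
    employee_id, name = row.split(field_sep, 1)
    # Split the combined name field on the first comma only.
    last_name, first_name = name.split(name_sep, 1)
    # Trim the whitespace left around each delimiter.
    return [employee_id.strip(), last_name.strip(), first_name.strip()]


# Hypothetical sample rows in the two-field layout.
rows = ["1|Smith, John", "2|Jones, Mary"]

for row in rows:
    print(" | ".join(split_employee_row(row)))
```

Splitting on the first comma only means that a first name containing a comma stays intact, although a last name containing a comma would still be split incorrectly.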
Follow the walk-through given:
Create a new job and name it ExtractDelimitedFields.
Create a File delimited metadata item for our input file...