So far, we have been getting data from plain files and databases. These are two of the most common data sources, but PDI offers many more kinds of sources, mainly grouped in, but not limited to, the Input folder. The following subsections present some useful sources that the previous sections didn't cover.
With PDI, you can read XML files or parse fields whose contents are in an XML structure. In both cases, you parse the XML with the Get data from XML input step. To specify the fields to read, you use XPath notation. When the XML is very big or complex, there is an alternative step: XML Input Stream (StAX).
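To give a feel for XPath notation, the following is a minimal sketch written in Python, outside PDI. It uses the standard library's xml.etree.ElementTree module, which supports only a subset of XPath. The sample document, the element names, and the read_countries function are all invented for illustration; in PDI you would supply expressions of this kind in the step's configuration rather than write code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
XML_TEXT = """
<world>
  <country name="Argentina">
    <capital>Buenos Aires</capital>
    <language isofficial="T">Spanish</language>
  </country>
  <country name="Japan">
    <capital>Tokyo</capital>
    <language isofficial="T">Japanese</language>
  </country>
</world>
"""


def read_countries(xml_text):
    """Return one row per <country> node, with fields picked by path expressions."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    # "./country" selects the nodes to loop over, one output row per node.
    for node in root.findall("./country"):
        rows.append({
            # Attribute of the current node (XPath: @name).
            "name": node.get("name"),
            # Text of a child element (XPath: capital).
            "capital": node.findtext("capital"),
            # Child element filtered by an attribute predicate.
            "language": node.findtext("language[@isofficial='T']"),
        })
    return rows


rows = read_countries(XML_TEXT)
for row in rows:
    print(row)
```

Each path is evaluated relative to the node being looped over, which is why the field expressions are short even though the elements sit several levels deep in the document.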
Similarly, you can parse JSON structures with the JSON Input step. To specify the fields in this case, you use JSONPath notation.
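As an illustration of what a JSONPath expression selects, here is a small Python sketch, again outside PDI. Python's standard library has no JSONPath support, so the simple_jsonpath helper below is a hypothetical, hand-written evaluator that understands only a tiny subset of the notation: the root $, .key, [index], and [*] over arrays. The sample data is invented, and the step itself may support more of the notation than this sketch does.

```python
import json
import re

# Hypothetical sample data; the keys are illustrative only.
JSON_TEXT = """
{"store": {"book": [
    {"title": "Book A", "price": 10.5},
    {"title": "Book B", "price": 8.0}
]}}
"""


def simple_jsonpath(data, path):
    """Evaluate a tiny JSONPath subset: $, .key, [index] and [*] over arrays."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    # Each token is a (key, index) pair; exactly one of the two is non-empty.
    tokens = re.findall(r"\.([A-Za-z_]\w*)|\[(\d+|\*)\]", path[1:])
    current = [data]
    for key, index in tokens:
        matched = []
        for item in current:
            if key:
                # .key: descend into an object member, if present.
                if isinstance(item, dict) and key in item:
                    matched.append(item[key])
            elif index == "*":
                # [*]: expand every element of an array.
                if isinstance(item, list):
                    matched.extend(item)
            elif isinstance(item, list) and int(index) < len(item):
                # [n]: pick a single array element by position.
                matched.append(item[int(index)])
        current = matched
    return current


data = json.loads(JSON_TEXT)
titles = simple_jsonpath(data, "$.store.book[*].title")
first_price = simple_jsonpath(data, "$.store.book[0].price")
print(titles)
print(first_price)
```

A path always yields a list of matches, so an expression with a wildcard naturally produces one value per array element, which maps well onto one row per match.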
Also, you can parse both XML and JSON structures with JavaScript or Java code, by using the Modified Java Script Value step or the User Defined Java Class step...