Processing XML documents with Oracle JDeveloper 11g

Overview of this book

XML is an open standard for creating markup languages and exchanging structured documents and data over the Internet. JDeveloper 11g presents an effective, quick, and easy-to-use means of processing XML documents. Inspired by the author's previous XML articles for the Oracle community, this expanded hands-on tutorial guides newcomers and intermediate users through JDeveloper 11g and XML document development. It offers up-to-date information on working with the latest version of JDeveloper, and brand new information on JAXB 2.0 support in JDeveloper 11g. Filled with illustrations, explanatory tables, and comprehensive instructions, this book walks the reader through the wide assortment of JDeveloper's capabilities. Oracle's JDeveloper 11g is an Integrated Development Environment that provides a visual and declarative approach to application development. Over the course of 14 chapters, readers will get hands-on with JDeveloper as the comprehensive and self-contained tutorials provide clear instruction on the key XML tasks that JDeveloper can accomplish. Filled with practical information and illustrated examples, this book shows the reader how to create, parse, and store XML documents quickly, as well as providing step-by-step instructions on how to construct an XML schema and use the schema to validate an XML document. Oracle's XML Developer Kit (XDK) offers a set of components, tools, and utilities for developing XML-based applications, and developers will find the detailed XDK coverage invaluable. Later chapters are given over to using XPath, transforming XML with XSLT, and using the JSTL XML Tag Library. Moving through the book, a chapter on the JAXB 2.0 API shows you how to bind, marshal and unmarshal XML documents, before we finally delve into comparing XML documents, and converting them into PDF and Excel formats. In all, this book will enable the reader to gain a good and wide-ranging understanding of what JDeveloper has to offer for XML processing.

Parsing an XML document with the DOM API

In this section we shall parse the XML document created in the previous section with a DOM parser. DOM parsing builds an in-memory tree of the XML document, which may be navigated with the DOM API. We shall iterate over the parsed document and output the element and attribute node values.

The DOM parsing API classes are in the oracle.xml.parser.v2 package and the DOM parser factory and parser classes are in the oracle.xml.jaxp package. First, import these packages into the DOMParserApp.java class in JDeveloper:

import oracle.xml.jaxp.*;
import oracle.xml.parser.v2.*;

Creating the factory

Create a JXDocumentBuilderFactory object with the static method newInstance(). The factory object is used to obtain a parser that may be used to create a DOM document tree from an XML document:

JXDocumentBuilderFactory factory = (JXDocumentBuilderFactory) JXDocumentBuilderFactory.newInstance();

Set the ERROR_STREAM and SHOW_WARNINGS attributes on the factory object with the setAttribute() method. The ERROR_STREAM attribute specifies where parsing errors are written; its value is an OutputStream object or a PrintWriter object. The SHOW_WARNINGS attribute specifies whether warnings are reported as well; its value is Boolean.TRUE or Boolean.FALSE. Parsing errors (if any) are written to the specified stream, which in this example is a file. If an ErrorHandler is also set, ERROR_STREAM is not used:

factory.setAttribute(JXDocumentBuilderFactory.ERROR_STREAM,
new FileOutputStream(new File("c:/output/errorStream.txt")));
factory.setAttribute(JXDocumentBuilderFactory.SHOW_WARNINGS,
Boolean.TRUE);
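
For readers without the XDK on the classpath, the same idea can be approximated with the standard JAXP classes bundled with the JDK. The following is only a sketch under that assumption: it swaps the ERROR_STREAM attribute for an ErrorHandler that writes to a PrintWriter, and the names ErrorLogSketch, LoggingHandler, and parseAndCollectErrors are hypothetical. The message layout only loosely imitates the XDK format.

```java
import java.io.ByteArrayInputStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ErrorLogSketch {

    /** Writes every parser warning and error to a writer (standard JAXP, not the XDK ERROR_STREAM). */
    static final class LoggingHandler implements ErrorHandler {
        private final PrintWriter out;

        LoggingHandler(PrintWriter out) {
            this.out = out;
        }

        private void log(String level, SAXParseException e) {
            out.printf("<Line %d, Column %d>: (%s) %s%n",
                    e.getLineNumber(), e.getColumnNumber(), level, e.getMessage());
            out.flush();
        }

        @Override public void warning(SAXParseException e) { log("Warning", e); }

        @Override public void error(SAXParseException e) { log("Error", e); }

        @Override public void fatalError(SAXParseException e) throws SAXException {
            log("Fatal Error", e);
            throw e; // a fatal error ends the parse
        }
    }

    /** Parses the XML text and returns whatever the handler logged (empty for a clean parse). */
    static String parseAndCollectErrors(String xml) {
        StringWriter log = new StringWriter();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new LoggingHandler(new PrintWriter(log)));
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (SAXException e) {
            // Already written to the log by the handler.
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log.toString();
    }

    public static void main(String[] args) {
        // The end tag of <journal> is missing, so one fatal error is logged.
        System.out.print(parseAndCollectErrors("<catalog><journal></catalog>"));
    }
}
```

Passing a PrintWriter over a FileOutputStream instead of a StringWriter would send the messages to a file, as in the XDK example.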

Creating a DOM document object

Create a JXDocumentBuilder object from the factory object by first creating a DocumentBuilder object with the newDocumentBuilder() method and then casting the DocumentBuilder object to JXDocumentBuilder. JXDocumentBuilder is the implementation class in Oracle XDK 11g for the abstract class DocumentBuilder:

JXDocumentBuilder documentBuilder = (JXDocumentBuilder) factory.newDocumentBuilder();

The JXDocumentBuilder object is used to create a DOM document object from an XML document. A Document object may be obtained using the JXDocumentBuilder object with one of the parse() methods in the JXDocumentBuilder class. The input to the parser may be specified as InputSource, InputStream, File object, or a String URI. Create an InputStream for the example XML document and parse the document with the parse(InputStream) method:

InputStream input = new FileInputStream(new File("catalog.xml"));
XMLDocument xmlDocument = (XMLDocument) (documentBuilder.parse(input));

The parse() methods of the JXDocumentBuilder object return a Document object, which may be cast to an XMLDocument object, as the XMLDocument class implements the Document interface.
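
The choice of input type can be illustrated with the standard JAXP DocumentBuilder, which offers the same family of parse() overloads. This is a sketch that assumes the JDK's built-in parser rather than JXDocumentBuilder, so no cast to XMLDocument is made; ParseInputsSketch and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ParseInputsSketch {

    private static DocumentBuilder newBuilder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    /** Parses from an InputStream, as the example in the text does with a FileInputStream. */
    static Document parseStream(String xml) {
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            return newBuilder().parse(in);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses from an InputSource wrapping a character stream. */
    static Document parseSource(String xml) {
        try {
            return newBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<catalog><journal/></catalog>";
        // Both calls yield an equivalent Document with the same root element.
        System.out.println(parseStream(xml).getDocumentElement().getTagName());
        System.out.println(parseSource(xml).getDocumentElement().getTagName());
    }
}
```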

Outputting the XML document components' values

Output the encoding in the XML document using the getEncoding method, and output the version of the XML document using the getVersion method:

System.out.println("Encoding: " + xmlDocument.getEncoding());
System.out.println("Version: " + xmlDocument.getVersion());
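
In the standard DOM Level 3 interface, the corresponding accessors on org.w3c.dom.Document are getXmlEncoding() and getXmlVersion(). The sketch below assumes the JDK's built-in parser instead of the XDK XMLDocument class; XmlDeclarationSketch and describe are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlDeclarationSketch {

    /** Reports the encoding and version recorded from the XML declaration. */
    static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // getXmlEncoding() is null when the declaration does not name an encoding.
            return "Encoding: " + doc.getXmlEncoding() + ", Version: " + doc.getXmlVersion();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("<?xml version=\"1.0\" encoding=\"UTF-8\"?><catalog/>"));
    }
}
```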

The XMLDocument class has various getter methods to retrieve elements in a document. Some of these methods are listed in the following table:

Method Name	Description
`getDocumentElement()`	Returns the root element.
`getElementById(String)`	Returns element for a specified ID. An element that has an ID attribute may be retrieved using this method. An attribute named "id" is not necessarily an ID attribute. An ID attribute is defined in an XML Schema with the xs:ID type and in a DTD with ID attribute type.
`getElementsByTagName(String)`	Returns a NodeList of elements for a specified tag name. The elements are returned in the order defined in the DOM tree. All the elements of the specified tag name are returned, not just the top-level elements. If the tag name is specified as "*", all the elements in the document are returned.
`getElementsByTagNameNS(String namespaceURI, String localName)`	Returns a NodeList of elements for a specified namespace URI and local name.
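
The point about ID attributes can be checked with a small experiment. The sketch below assumes the JDK's built-in JAXP parser and a made-up two-line document; the class and method names are hypothetical. The same `id` attribute is only found by getElementById() when an internal DTD subset declares it with the ID type.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ElementLookupSketch {

    static final String BODY = "<catalog><journal id='j1'><article/></journal></catalog>";
    // The internal DTD subset declares journal/@id as an attribute of type ID.
    static final String WITH_DTD = "<!DOCTYPE catalog [<!ATTLIST journal id ID #IMPLIED>]>" + BODY;

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the tag name of the element with the given ID, or null if none is registered. */
    static String tagNameForId(String xml, String id) {
        Element found = parse(xml).getElementById(id);
        return found == null ? null : found.getTagName();
    }

    /** Counts every element in the document using the "*" wildcard. */
    static int elementCount(String xml) {
        return parse(xml).getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) {
        System.out.println(tagNameForId(WITH_DTD, "j1")); // the declared ID is found
        System.out.println(tagNameForId(BODY, "j1"));     // no ID type declared: null
        System.out.println(elementCount(BODY));           // catalog, journal, article
    }
}
```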

As an example, retrieve title elements in the namespace http://xdk.com/catalog/journal using the getElementsByTagNameNS method:

NodeList namespaceNodeList = xmlDocument.getElementsByTagNameNS("http://xdk.com/catalog/journal","title");

Iterate over the NodeList to output element namespace, element namespace prefix, element tag name, and element text. The getNamespaceURI() method returns the namespace URI of an element. The getPrefix() method returns the prefix of an element in a namespace. The getTagName() method returns the element tag name. Element text is obtained by first obtaining the text node within the element node using the getFirstChild() method and subsequently the value of the text node:

for (int i = 0; i < namespaceNodeList.getLength(); i++) {
XMLElement namespaceElement = (XMLElement) namespaceNodeList.item(i);
System.out.println("Namespace URI: " +
namespaceElement.getNamespaceURI());
System.out.println("Namespace Prefix: " +
namespaceElement.getPrefix());
System.out.println("Element Name: " +
namespaceElement.getTagName());
System.out.println("Element text: " +
namespaceElement.getFirstChild().getNodeValue());
}
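
A comparable lookup can be sketched with the standard JAXP classes. One assumption to note: unlike the XDK example above, the JDK's DocumentBuilderFactory is not namespace aware unless setNamespaceAware(true) is called, and without it getElementsByTagNameNS() finds nothing. The class name, the helper method, and the shortened sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceLookupSketch {

    static final String NS = "http://xdk.com/catalog/journal";
    // One title in the journal namespace and one title in no namespace.
    static final String XML =
            "<catalog>"
            + "<journal:journal xmlns:journal='" + NS + "'>"
            + "<journal:title>Declarative Data Filtering</journal:title>"
            + "</journal:journal>"
            + "<journal><title>Share 2.0</title></journal>"
            + "</catalog>";

    /** Describes each title element in the given namespace as "uri | prefix | tag | text". */
    static List<String> describeTitles(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace-based lookups
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList titles = doc.getElementsByTagNameNS(NS, "title");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < titles.getLength(); i++) {
                Element title = (Element) titles.item(i);
                lines.add(title.getNamespaceURI() + " | " + title.getPrefix()
                        + " | " + title.getTagName() + " | " + title.getTextContent());
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Only the namespaced title matches; the plain <title> is skipped.
        describeTitles(XML).forEach(System.out::println);
    }
}
```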

Obtain the root element in the XML document with the getDocumentElement() method. The method returns an Element object, which needs to be cast to XMLElement only if methods defined solely in the XMLElement class are to be used. We cast it here because XMLElement is Oracle XDK 11g's implementation class for the Element interface, and we are discussing Oracle XDK 11g:

XMLElement rootElement = (XMLElement)
(xmlDocument.getDocumentElement());
System.out.println("Root Element is: " + rootElement.getTagName());

Next, we shall iterate over all the subnodes of the root element. Obtain a NodeList of subnodes of the root element with the getChildNodes() method. Create a method iterateNodeList() to iterate over the subnodes of an Element. Iterate over the NodeList and recursively obtain the subelements of the elements in the NodeList. The method hasChildNodes() tests to see if a node has subnodes. Ignorable whitespace is also considered a node, but we are mainly interested in the subelements in a node. The NodeList interface method getLength() returns the length of a node list, and method item(int) returns the Node at a specified index. As class XMLNode is Oracle XDK 11g's implementation class for the Node interface, cast the Node object to XMLNode:

if (rootElement.hasChildNodes()) {
NodeList nodeList = rootElement.getChildNodes();
iterateNodeList(rootElement, nodeList);
}

If a node is of type element, the tag name of the element may be retrieved. Node type is obtained with the getNodeType() method, which returns a short value. The Node interface provides static fields for different types of nodes. The different types of nodes in an XML document are listed in the following table:

Node Type	Description
ELEMENT_NODE	Element node.
ATTRIBUTE_NODE	Attribute node.
TEXT_NODE	Text node, for example the text in an element such as `<elementA>Element A Text</elementA>`.
CDATA_SECTION_NODE	CDATA section node. We discussed a CDATA section in an earlier table.
ENTITY_REFERENCE_NODE	Entity reference node. An entity reference refers to the content of a named entity.
ENTITY_NODE	Entity node. An entity is defined in a DOCTYPE declaration or an external DTD, and represents an abbreviation for data that is to be used repeatedly.
PROCESSING_INSTRUCTION_NODE	Processing Instruction node. We discussed a processing instruction in an earlier section.
COMMENT_NODE	Comment node. We discussed a comment node in an earlier section.
DOCUMENT_NODE	Document node. The document node represents the complete DOM document tree.
DOCUMENT_TYPE_NODE	Doctype node represents the DOCTYPE declaration.
DOCUMENT_FRAGMENT_NODE	DocumentFragment node. A document fragment is a segment of a document.
NOTATION_NODE	Notation node. A notation is defined in a DOCTYPE declaration or an external DTD. Notations represent the format of unparsed entities (non-XML data that a parser does not parse), format of elements with a notation attribute, and the application to which a processing instruction is sent. An example of a notation is as follows: `<!NOTATION gif PUBLIC "gif viewer">`
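
To see several of these node types in practice, the following sketch lists the types of the root element's child nodes. It assumes the JDK's built-in parser with default settings and a made-up sample document; NodeTypeSketch, typeName, and childTypes are hypothetical names. Note how the whitespace between tags shows up as text nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeTypeSketch {

    /** Maps the short returned by getNodeType() to the constant name for a few common types. */
    static String typeName(short type) {
        switch (type) {
            case Node.ELEMENT_NODE: return "ELEMENT_NODE";
            case Node.TEXT_NODE: return "TEXT_NODE";
            case Node.CDATA_SECTION_NODE: return "CDATA_SECTION_NODE";
            case Node.COMMENT_NODE: return "COMMENT_NODE";
            case Node.PROCESSING_INSTRUCTION_NODE: return "PROCESSING_INSTRUCTION_NODE";
            default: return "OTHER(" + type + ")";
        }
    }

    /** Returns the node type names of the root element's direct children, in document order. */
    static List<String> childTypes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                names.add(typeName(children.item(i).getNodeType()));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Line breaks and indentation between the tags become whitespace-only text nodes.
        System.out.println(childTypes("<catalog>\n  <!-- note -->\n  <journal/>\n</catalog>"));
    }
}
```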

For an element node, cast the node to XMLElement and output the element tag name:

if (node.getNodeType() == XMLNode.ELEMENT_NODE) {
XMLElement element = (XMLElement) node;
System.out.println("Element Tag Name:" +
element.getTagName());
}

The attributes in an element node are retrieved with the getAttributes() method, which returns a NamedNodeMap of attributes. The getLength() method of NamedNodeMap returns the number of attributes in the map. The method item(int) returns the attribute node at the specified index. As class XMLAttr implements the Attr interface, cast the returned node to XMLAttr. Iterate over the NamedNodeMap to output the attribute name and value. The hasAttributes() method tests if an element node has attributes:

if (element.hasAttributes()) {
NamedNodeMap attributes = element.getAttributes();
for (int i = 0; i < attributes.getLength(); i++) {
XMLAttr attribute = (XMLAttr)attributes.item(i);
System.out.println(" Attribute: " + attribute.getName() +
" with value " +attribute.getValue());
}
}
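
The same loop can be written against the standard org.w3c.dom interfaces. This sketch assumes the JDK's built-in parser and uses the Attr interface rather than XMLAttr; AttributeListSketch and attributesOf are hypothetical names. Because a NamedNodeMap makes no promise about attribute order, the result is collected into a sorted map.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

class AttributeListSketch {

    /** Returns the root element's attributes as a name-to-value map, sorted by name. */
    static Map<String, String> attributesOf(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            Map<String, String> result = new TreeMap<>();
            if (root.hasAttributes()) {
                NamedNodeMap attributes = root.getAttributes();
                for (int i = 0; i < attributes.getLength(); i++) {
                    Attr attribute = (Attr) attributes.item(i);
                    result.put(attribute.getName(), attribute.getValue());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(attributesOf(
                "<journal title='Oracle Magazine' publisher='Oracle Publishing'/>"));
    }
}
```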

Running the Java application

The complete DOMParserApp.java application is listed below, with notes on the different sections of the Java class:

1. First, we add the package and import statements.

package xmlparser;
import java.io.*;
import oracle.xml.jaxp.*;
import oracle.xml.parser.v2.*;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.*;
import org.xml.sax.SAXException;

2. Next, we add Java class DOMParserApp.
```
public class DOMParserApp {
```
3. Then, we add the parseXMLDocument method to parse an XML document.
```
public void parseXMLDocument() {
try {
```

4. Now, we create the XMLDocument object by parsing the XML document catalog.xml.

JXDocumentBuilderFactory factory = (JXDocumentBuilderFactory) JXDocumentBuilderFactory.newInstance();
factory.setAttribute(JXDocumentBuilderFactory.ERROR_STREAM,
new FileOutputStream(new File("c:/output/errorStream.txt")));
factory.setAttribute(JXDocumentBuilderFactory.SHOW_WARNINGS,
Boolean.TRUE);
JXDocumentBuilder documentBuilder = (JXDocumentBuilder) factory.newDocumentBuilder();
InputStream input = new FileInputStream(new File("catalog.xml"));
XMLDocument xmlDocument =
(XMLDocument)(documentBuilder.parse(input));

5. Here, we output the document character encoding, the XML version, and namespace node values from the parsed XML document.

System.out.println("Encoding: " + xmlDocument.getEncoding());
System.out.println("Version: " + xmlDocument.getVersion());
NodeList namespaceNodeList = xmlDocument.getElementsByTagNameNS("http://xdk.com/catalog/journal", "title");
for (int i = 0; i < namespaceNodeList.getLength(); i++) {
XMLElement namespaceElement =
(XMLElement)namespaceNodeList.item(i);
System.out.println("Namespace URI: " + namespaceElement.getNamespaceURI());
System.out.println("Namespace Prefix: " + namespaceElement.getPrefix());
System.out.println("Element Name: " + namespaceElement.getTagName());
System.out.println("Element text: " + namespaceElement.getFirstChild().getNodeValue());
}

6. Next, we obtain the subnodes of the root element and invoke the iterateNodeList method to iterate over the subnodes.

XMLElement rootElement =
(XMLElement)(xmlDocument.getDocumentElement());
System.out.println("Root Element is: " + rootElement.getTagName());
if (rootElement.hasChildNodes()) {
NodeList nodeList = rootElement.getChildNodes();
iterateNodeList(rootElement, nodeList);
}
}
catch (ParserConfigurationException e) {
System.err.println(e.getMessage());
} catch (FileNotFoundException e) {
System.err.println(e.getMessage());
} catch (IOException e) {
System.err.println(e.getMessage());
} catch (SAXException e) {
System.err.println(e.getMessage());
}
}

7. The iterateNodeList method has an Element parameter, which represents the element with subnodes. The second parameter is of the type NodeList, which is the NodeList of subnodes of the Element represented by the first parameter.
```
public void iterateNodeList(Element elem, NodeList nodeList) {
if (nodeList.getLength() > 1) {
System.out.println("Element " + elem.getTagName() +
" has sub-elements\n");
}
```

8. Iterate over the NodeList.

for (int i = 0; i < nodeList.getLength(); i++) {
XMLNode node = (XMLNode)nodeList.item(i);

9. If a node is of type Element, output the Element tag name and element text.

if (node.getNodeType() == XMLNode.ELEMENT_NODE) {
XMLElement element = (XMLElement)node;
System.out.println("Sub-element of " + elem.getNodeName());
System.out.println("Element Tag Name:" + element.getTagName());
System.out.println("Element text: " + element.getFirstChild().getNodeValue());

10. If an Element has attributes, output the attributes.

if (element.hasAttributes()) {
System.out.println("Element has attributes\n");
NamedNodeMap attributes = element.getAttributes();
for (int j = 0; j < attributes.getLength(); j++) {
XMLAttr attribute = (XMLAttr)attributes.item(j);
System.out.println("Attribute: " +attribute.getName() +
" with value "+ attribute.getValue());
}
}

11. If an Element has subnodes, obtain the NodeList of subnodes and iterate over the NodeList by invoking the iterateNodeList method again.
```
if (element.hasChildNodes()) {
iterateNodeList(element, element.getChildNodes());
}
}
}
}
```
12. Finally, we add the main method. In the main method, we create an instance of the DOMParserApp class and invoke the parseXMLDocument method.
```
public static void main(String[] argv) {
DOMParserApp domParser = new DOMParserApp();
domParser.parseXMLDocument();
}
}
```
13. To run the DOMParserApp.java in JDeveloper, right-click on the DOMParserApp.java node in Application Navigator and select Run.

14. The element and attribute values from the XML document are output.

The complete output from the DOM parsing application is as follows:

Encoding: UTF-8
Version: 1.0
Namespace URI: http://xdk.com/catalog/journal
Namespace Prefix: journal
Element Name: journal:title
Element text: Declarative Data Filtering
Root Element is: catalog
Element catalog has sub-elements
Sub-element of catalog
Element Tag Name:journal:journal
Element text:
Element has attributes
Attribute: journal:title with value Oracle Magazine
Attribute: journal:publisher with value Oracle Publishing
Attribute: journal:edition with value March-April 2008
Attribute: xmlns:journal with value http://xdk.com/catalog/journal
Element journal:journal has sub-elements
Sub-element of journal:journal
Element Tag Name:journal:article
Element text:
Element has attributes
Attribute: journal:section with value Oracle Developer
Element journal:article has sub-elements
Sub-element of journal:article
Element Tag Name:journal:title
Element text: Declarative Data Filtering
Sub-element of journal:article
Element Tag Name:journal:author
Element text: Steve Muench
Sub-element of catalog
Element Tag Name:journal
Element text:
Element has attributes
Attribute: title with value Oracle Magazine
Attribute: publisher with value Oracle Publishing
Attribute: edition with value September-October 2008
Element journal has sub-elements
Sub-element of journal
Element Tag Name:article
Element text:
Element has attributes
Attribute: section with value FEATURES
Element article has sub-elements
Sub-element of article
Element Tag Name:title
Element text: Share 2.0
Sub-element of article
Element Tag Name:author
Element text: Alan Joch
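
The recursive traversal performed by iterateNodeList can also be sketched with the standard org.w3c.dom interfaces. This version assumes the JDK's built-in parser and a made-up sample document, and the names TreeWalkSketch, walk, and outline are hypothetical. It differs from the listing in one respect: it checks getFirstChild() for null, so an empty element does not cause a NullPointerException.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TreeWalkSketch {

    /** Appends one indented line per sub-element of elem, then recurses into that sub-element. */
    static void walk(Element elem, int depth, List<String> out) {
        NodeList children = elem.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text, comments, and other non-element nodes
            }
            Element child = (Element) node;
            Node first = child.getFirstChild(); // null for an empty element
            String text = (first != null && first.getNodeType() == Node.TEXT_NODE)
                    ? first.getNodeValue().trim() : "";
            StringBuilder line = new StringBuilder();
            for (int d = 0; d < depth; d++) {
                line.append("  ");
            }
            line.append(child.getTagName());
            if (!text.isEmpty()) {
                line.append(": ").append(text);
            }
            out.add(line.toString());
            walk(child, depth + 1, out);
        }
    }

    /** Parses the XML text and returns an indented outline of the elements below the root. */
    static List<String> outline(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            List<String> out = new ArrayList<>();
            walk(root, 0, out);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        outline("<catalog><journal><article><title>Share 2.0</title>"
                + "<author>Alan Joch</author><empty/></article></journal></catalog>")
                .forEach(System.out::println);
    }
}
```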

To demonstrate error handling with the ERROR_STREAM attribute, introduce an error into the example XML document. For example, remove a </journal> tag. Run the DOMParserApp.java application in JDeveloper. An error message is written to the file specified in the ERROR_STREAM attribute:

<Line 15, Column 10>: XML-20121: (Fatal Error) End tag
does not match start tag 'journal'.
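
The line and column in such a message can also be read programmatically. The sketch below assumes the JDK's built-in JAXP parser rather than the XDK, so the message wording and error code differ from the XDK output shown above; ParseErrorLocationSketch and firstError are hypothetical names. It catches the SAXParseException thrown for a missing end tag and formats its location.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class ParseErrorLocationSketch {

    /** Returns the location and message of the first fatal parse error, or "no error". */
    static String firstError(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return "no error";
        } catch (SAXParseException e) {
            // The exception carries the position at which the parser gave up.
            return "<Line " + e.getLineNumber() + ", Column " + e.getColumnNumber() + ">: "
                    + e.getMessage();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The </journal> end tag is missing, so the parse fails on the third line.
        System.out.println(firstError("<catalog>\n<journal>\n</catalog>"));
    }
}
```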
