Showing posts with label XML parsing. Show all posts

Wednesday, August 10, 2016

An introduction to working with JAXB

I am in the process of migrating a few modules that are dependent on Apache XMLBeans to JAXB. It has been an exciting and challenging few days. I thought of jotting down a few important things I came across for anyone who might find it useful in the future.

First of all, let us look at setting up the Maven plugin for JAXB code generation. At the time of writing this post, I came across two Maven plugins.
I ended up using the first one (the org.codehaus.mojo jaxb2-maven-plugin configured below), as I found its configuration quite straightforward.

Your maven project structure will be as follows;
Project Folder->src->main->xsd
This will hold all the XSD files from which you would want to generate the JAXB objects.

Project Folder->src->main->xjb
This will hold your “bindings.xml” file, the data binding file used for any customization required when running the JAXB generation task (xjc).

The plugin configuration for maven will be as follows;
 
<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>jaxb2-maven-plugin</artifactId>
     <version>2.2</version>
    <executions>
     <execution>
      <id>xjc</id>
      <goals>
       <goal>xjc</goal>
      </goals>
     </execution>
    </executions>
    <configuration>
     <target>2.1</target>
     
     <sources>
      <source>src/main/xsd</source>
     </sources>
     
    </configuration>
  </plugin>


  • One thing we were quite used to with XMLBeans was the “isSet” type of methods generated for all optional elements, which check whether the element is set. By default JAXB does not generate these methods, and you end up using a not-null check on each element. Thankfully, the binding configuration allows for this with the following;

<jxb:bindings 
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
    xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
    jxb:extensionBindingPrefixes="xjc"
    version="2.1">
<jxb:globalBindings generateIsSetMethod="true">
</jxb:globalBindings>
</jxb:bindings>
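
To show what this buys you in calling code, here is a minimal sketch. `EndpointConfig` is a hypothetical, hand-written stand-in for a class xjc might generate for a type with one optional element. The JAXB annotations a real generated class carries are left out so the snippet compiles on its own, and `IsSetDemo` is only an illustrative caller.

```java
// Hypothetical stand-in for an xjc-generated class with one optional element.
// Real generated code also carries JAXB annotations, omitted here on purpose.
class EndpointConfig {
    private String description; // optional element, null when absent

    public String getDescription() { return description; }
    public void setDescription(String value) { this.description = value; }

    // Shape of the method added when generateIsSetMethod="true"
    public boolean isSetDescription() { return this.description != null; }
}

class IsSetDemo {
    // Returns the description if it is set, otherwise a fallback text.
    static String describe(EndpointConfig config) {
        return config.isSetDescription() ? config.getDescription() : "(no description)";
    }
}
```

The caller reads much like the XMLBeans code it replaces, instead of spreading null checks around.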


  • In my case, JAXB did not generate Java enumerations for enumerations declared on anonymous simple types in the XSD files, such as the one in the sample below. The sad part is that I could not find a way to apply this generation at a global level and could only handle it per XSD, whereas XMLBeans did this automatically. In order to generate Java enumerations, the following should be done;
Sample XSD:

<xs:complexType name="EndpointType">
  <xs:attribute name="protocol">
   <xs:simpleType>
    <xs:restriction base="xs:string">
     <xs:enumeration value="HTTP"/>
     <xs:enumeration value="HTTPS"/>
     <xs:enumeration value="PAYLOAD"/>
    </xs:restriction>
   </xs:simpleType>
  </xs:attribute>
 </xs:complexType>


JAXB binding:
 
<jxb:bindings 
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
    xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
    jxb:extensionBindingPrefixes="xjc"
    version="2.1">
<jxb:bindings schemaLocation="../xsd/testconfig.xsd">
       
  <jxb:bindings node="//xs:complexType[@name='EndpointType']/xs:attribute[@name='protocol']/xs:simpleType">
               <jxb:typesafeEnumClass name="Protocol" />
        </jxb:bindings>
 
   </jxb:bindings>
</jxb:bindings>

schemaLocation – This is the relative path to the XSD I want to refer to. Since my “bindings.xml” resided in the “xjb” directory, I had to go one step up and go into the XSD directory to get the required XSD file.

node – Here you need to provide the XPath expression that selects the simple type on which the enumeration is defined. If you cross-check this with the XSD provided, you will see how the XPath expression retrieves the given element.

Note: If your XPath expression returns multiple nodes, you can still handle this by adding the attribute multiple="true" to the <jxb:bindings> element.
E.g.: <jxb:bindings node="//xs:complexType[@name='EndpointType']/xs:attribute[@name='protocol']/xs:simpleType" multiple="true">


typesafeEnumClass – On this element you can provide the Java enumeration name to be generated.
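
For orientation, the enum produced by this binding has roughly the following shape. This is a hand-written sketch rather than actual xjc output: the JAXB annotations are omitted so that it compiles without any JAXB dependency, and `ProtocolDemo` is a hypothetical caller added for illustration.

```java
// Approximate shape of the enum generated for the "protocol" attribute.
// JAXB annotations are intentionally omitted in this sketch.
enum Protocol {
    HTTP, HTTPS, PAYLOAD;

    // The lexical value as it appears in the XML document
    public String value() { return name(); }

    // Maps an XML value back to the constant; unknown values are rejected
    public static Protocol fromValue(String v) { return valueOf(v); }
}

class ProtocolDemo {
    // Example use: a type-safe check instead of comparing raw strings
    static boolean isSecure(String xmlValue) {
        return Protocol.fromValue(xmlValue) == Protocol.HTTPS;
    }
}
```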

  • XMLBeans by default converts all XSD date and date time elements to a Java Calendar object. With JAXB however, by default the XMLGregorianCalendar is used. Yet again the global bindings came to the rescue and this was handled with the below configuration which converted all XSD date elements to a Java Calendar object.


<jxb:bindings 
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
    xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
    jxb:extensionBindingPrefixes="xjc"
    version="2.1">

<jxb:globalBindings>

   <jxb:javaType name="java.util.Calendar" xmlType="xs:dateTime"
            parseMethod="javax.xml.bind.DatatypeConverter.parseDateTime"
            printMethod="javax.xml.bind.DatatypeConverter.printDateTime"/>

        <jxb:javaType name="java.util.Calendar" xmlType="xs:date"
            parseMethod="javax.xml.bind.DatatypeConverter.parseDate"
            printMethod="javax.xml.bind.DatatypeConverter.printDate"/>

        <jxb:javaType name="java.util.Calendar" xmlType="xs:time"
            parseMethod="javax.xml.bind.DatatypeConverter.parseTime"
            printMethod="javax.xml.bind.DatatypeConverter.printTime"/>
    </jxb:globalBindings>

</jxb:bindings>
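
Without this binding, calling code has to convert each XMLGregorianCalendar by hand. The sketch below shows that conversion using only the JDK's javax.xml.datatype API. The class and method names are my own and are there purely for illustration.

```java
import java.util.Calendar;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

class CalendarConversionDemo {
    // Parses an xs:dateTime lexical value and converts it to java.util.Calendar,
    // which is the manual step the global binding above makes unnecessary.
    static Calendar toCalendar(String lexicalDateTime) {
        try {
            XMLGregorianCalendar xmlCal =
                    DatatypeFactory.newInstance().newXMLGregorianCalendar(lexicalDateTime);
            return xmlCal.toGregorianCalendar();
        } catch (DatatypeConfigurationException e) {
            // No DatatypeFactory implementation available on this JVM
            throw new IllegalStateException(e);
        }
    }
}
```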


  • If there is a need to make your JAXB objects serializable, this can be achieved with the following global binding configuration;

<jxb:bindings 
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
    xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
    jxb:extensionBindingPrefixes="xjc"
    version="2.1">

 <jxb:globalBindings >
 <xjc:serializable />
  
  </jxb:globalBindings>
 
 
</jxb:bindings>




The element that does the trick is the “<xjc:serializable/>” element.
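
Once the generated classes implement java.io.Serializable, they can go through standard Java serialization. The sketch below does a round trip with `ServerInfo`, a hypothetical hand-written stand-in for a generated class. The serialVersionUID value is also just an example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a generated class that implements Serializable
class ServerInfo implements Serializable {
    private static final long serialVersionUID = 1L; // example value only
    private String name;

    public String getName() { return name; }
    public void setName(String value) { this.name = value; }
}

class SerializationDemo {
    // Serializes the object to a byte array and reads it back as a new instance.
    static ServerInfo roundTrip(ServerInfo original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ServerInfo) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```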


  • With JDK 1.8, I faced an issue whereby if one of your XSDs imported another schema that had to be retrieved via HTTP, the access was blocked. An excerpt of the error thrown was “because 'http' access is not allowed due to restriction set by the accessExternalDTD property”. The workaround in this case was to use the following Maven plugin to set the system properties required to lift this restriction. More information on this issue can be found here.

<plugin>
    <!-- We use this plugin to ensure that our usage of the
    maven-jaxb2-plugin is JDK 8 compatible in absence of a fix
    for https://java.net/jira/browse/MAVEN_JAXB2_PLUGIN-80. -->
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>properties-maven-plugin</artifactId>
   <version>1.0.0</version>
    <executions>
        <execution>
            <id>set-additional-system-properties</id>
            <goals>
                <goal>set-system-properties</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <properties>
            <property>
                <name>javax.xml.accessExternalSchema</name>
                <value>file,http</value>
            </property>
            <property>
                <name>javax.xml.accessExternalDTD</name>
                <value>file,http</value>
            </property>
        </properties>
    </configuration>
</plugin>
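
The plugin simply sets JVM system properties during the build. If you run schema processing from your own code rather than through Maven, the same two properties can be set programmatically before any parser or schema factory is created. This is a sketch under that assumption, and the class name is hypothetical.

```java
class ExternalSchemaAccess {
    // Allows schemas and DTDs to be fetched via the file and http protocols.
    // Note that this applies to the whole JVM, not just one parser.
    static void allowFileAndHttp() {
        System.setProperty("javax.xml.accessExternalSchema", "file,http");
        System.setProperty("javax.xml.accessExternalDTD", "file,http");
    }
}
```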


That is about it. I will keep updating this post as I go on. As always, your feedback is much appreciated.

Thank you for reading, and have a good day everyone.

Friday, October 29, 2010

XML Parsing with AXIOM

Recently a friend of mine asked what the best way to parse XML is. I have to say I am not an expert on this subject, but from what I have read I instantly remembered AXIOM, which is built on a pull parser. This means that when you request a particular element within your XML document, it builds only as much of the document as is needed to hand you that element, whereas parsers such as DOM build the whole XML document in memory before handing over the relevant element. Hence AXIOM leaves a smaller memory footprint.
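
To make the pull-parsing idea concrete, here is a small sketch using the JDK's own StAX API (javax.xml.stream), the kind of parser AXIOM sits on top of. It is not AXIOM code. The class name is hypothetical, and the element name `server-name` is taken from the sample XML used later in this post.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class PullParsingDemo {
    // Pulls events one at a time and keeps only the text of <server-name> elements;
    // no in-memory tree of the whole document is ever built.
    static List<String> serverNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "server-name".equals(reader.getLocalName())) {
                    names.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

The caller decides when to ask for the next event, which is what distinguishes pull parsing from push parsers such as SAX.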

The problem was that I could not find a clear and concise tutorial explaining how to deal with XML using AXIOM. After playing around with it, I figured out how to manipulate XML with AXIOM, which I should say is so much better than the cumbersome code you have to write when working with DOM or JDOM.

So here is a simple example of how to manipulate XML with AXIOM;

First off, here is the XML I will be parsing

<?xml version="1.0" encoding="utf-8" ?>
<my_servers>
 <server>
  <server-name>PROD</server-name>
  <server-ip>xx.xx.xx.xx</server-ip>
  <server-port>80</server-port>
  <server-desc>Running A/S</server-desc>
 </server>

 <server>
  <server-name>PROD2</server-name>
  <server-ip>xx1.xx1.xx1.xx1</server-ip>
  <server-port>80</server-port>
  <server-desc>Running A/S</server-desc>
 </server>
</my_servers>

Next I wrote a factory method to hand out StAXOMBuilder instances depending on the XML file you pass in. I did this to avoid creating a new StAXOMBuilder instance every time.


import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import javax.xml.stream.XMLStreamException;

import org.apache.axiom.om.impl.builder.StAXOMBuilder;

public class AxiomStaxBuilderFactory {

    private static Map<String, StAXOMBuilder> staxBuilderMap = new ConcurrentHashMap<String, StAXOMBuilder>();

    /**
     * The factory method stores the {@link StAXOMBuilder} instance created for each XML file<br>
     * passed in so that we do not need to create unnecessary objects every time.<br>
     * An instance of {@linkplain ConcurrentHashMap} is used so as to make the<br>
     * instances thread safe.
     * 
     * @param xmlFilePath the path of the XML file
     * @return an instance of the {@link StAXOMBuilder} from the cache or newly created
     */
    public static StAXOMBuilder getAxiomBuilderForFile(String xmlFilePath) {
        StAXOMBuilder staxBuilder = null;
        if (staxBuilderMap.containsKey(xmlFilePath)) {
            staxBuilder = staxBuilderMap.get(xmlFilePath);
        } else {
            try {
                staxBuilder = new StAXOMBuilder(new FileInputStream(xmlFilePath));
                staxBuilderMap.put(xmlFilePath, staxBuilder);
            } catch (FileNotFoundException e) {
                throw new AxiomBuilderException(e);
            } catch (XMLStreamException e) {
                throw new AxiomBuilderException(e);
            }
        }

        return staxBuilder;

    }
}
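
One caveat about the factory above: containsKey followed by put is two separate steps, so two threads asking for the same file at the same moment could both create a builder. On Java 8 or later, ConcurrentHashMap.computeIfAbsent performs the lookup and creation atomically per key. The sketch below shows the pattern with a hypothetical generic `PerKeyCache` class instead of the AXIOM types, so it compiles without AXIOM on the classpath.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical generic cache illustrating the pattern; not part of AXIOM
class PerKeyCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    PerKeyCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Creates the value at most once per key, even under concurrent access
    V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }
}
```

In the factory, the loader would be the code that opens the file and wraps any checked exception in AxiomBuilderException. Note that this only makes the cache lookup safe; it says nothing about whether the cached object itself may be shared between threads.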



I have used a ConcurrentHashMap so that this will work well in a multi-threaded application. If you are not bothered with that, you might as well use a normal HashMap for better performance, although in this case the difference would be negligible. I have also used a custom exception, as I did not want the user to have to handle checked exceptions, so I wrapped the exceptions thrown in my custom runtime exception. Following is that code. Nothing major, just a normal extension of the RuntimeException class;

/**
 * This exception class wraps all exceptions thrown from the Axiom API
 * as the user does not need to be bound by such checked exceptions.
 * @author dinuka
 *
 */
public class AxiomBuilderException extends RuntimeException {

    /**
     * 
     */
    private static final long serialVersionUID = -7853903625725204661L;

    public AxiomBuilderException(Throwable ex) {
        super(ex);
    }

    public AxiomBuilderException(String msg) {
        super(msg);
    }
}



Next I have written a utility class to deal with the XML parsing. Of course this is not strictly needed, but I added it so that client calls are much cleaner and do not have to deal with XML-related code, which is abstracted away by the utility class. Note: the current method only reads elements that are direct children of the root element.



import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import javax.xml.namespace.QName;

import org.apache.axiom.om.OMElement;
import org.apache.axiom.om.impl.builder.StAXOMBuilder;

/**
 * The utility class provides abstraction to users so that <br>
 * the user can just pass in the xml file name and the node he/she<br>
 * wants to access and get the values without having to bother with<br>
 * boilerplate xml handling info.
 * 
 * @author dinuka
 */
public class AxiomUtil {

    /**
     * This method is used if you have for example a node with multiple children<br>
     * Note that this method assumes the node in query is within the root element
     * 
     * @param xmlFilePath the path of the xml file
     * @param nodeName the node name from which you want to retrieve values
     * @return the list containing key value pairs containing the values of the sub elements within<br>
     *         the nodeName passed in.
     */
    public static List<Map<String, String>> getNodeWithChildrenValues(String xmlFilePath, String nodeName) {
        List<Map<String, String>> valueList = new ArrayList<Map<String, String>>();

        StAXOMBuilder staxBuilder = AxiomStaxBuilderFactory.getAxiomBuilderForFile(xmlFilePath);

        OMElement documentElement = staxBuilder.getDocumentElement();
        Iterator nodeElement = documentElement.getChildrenWithName(new QName(nodeName));

        while (nodeElement.hasNext()) {
            OMElement om = (OMElement) nodeElement.next();

            Iterator it = om.getChildElements();
            Map<String, String> valueMap = new HashMap<String, String>();
            while (it.hasNext()) {
                OMElement el = (OMElement) it.next();

                valueMap.put(el.getLocalName(), el.getText());

            }

            valueList.add(valueMap);
        }
        return valueList;
    }

}



And finally, here is a sample class to test out the XML parsing example presented here.



import java.util.List;
import java.util.Map;

/**
 * Test class depicting the use of Axiom parsing XML
 * 
 * @author dinuka
 */
public class testServerConfigXML {

    public static void main(String argv[]) {

        List<Map<String, String>> values = AxiomUtil.getNodeWithChildrenValues("/home/dinuka/serverInfo.xml", "server");

        for (Map<String, String> mapVals : values) {
            for (String keys : mapVals.keySet()) {
                System.out.println(keys + "=" + mapVals.get(keys));
            }
        }

    }

}


That's it. If you have any queries or see any points for improvement, please do leave a comment, which would be highly appreciated. I hope this helps anyone out there looking for a similar basic tutorial on AXIOM XML parsing.


Cheers