JAXB, XML Data Binding

17 Mar, 2011

As a ubiquitous exchange format, XML is well supported in Java. But most implementations hide how they bind an XML structure to an object graph, which leaves us helpless when an application hands us XML as a plain old string. Because the low-level APIs (DOM, XPath), which focus on document structure, are tedious to use, the major JAX-RS implementations (Jersey, CXF) have chosen the same high-level API, focused on data: JAXB. Let’s do the same.


Describe an exchange format

When an XML producer provides no mechanism to describe its node structure, its consumers have to find the best way to work with the output on their own. One option is to build an object graph equivalent to the XML output. Let’s see this with a delicious dessert example.
[xml]
<recipe name="Pear compote" type="desert">
<cooking duration="15">
<step optional="true">Leave aside a vanilla pod</step>
<step>Peel and core pears</step>
<step>Cut pears into slices</step>
<step>Pour those slices in a casserole with water</step>
</cooking>
<menu>17-02-2011</menu>
<menu>17-03-2011</menu>
</recipe>
[/xml]
JAXB treats every node as an element with attributes. An element is a complex type made of an element sequence (always first) and an attribute list. An element can have a simple text value if it has no sub-node. An attribute has only a text value. Every XML node is represented by an element and, with XJC, by a class of the same name generated from the schema (XJC generation is explained in the appendix).
Here is the XSD description of the previous XML:
[xml]
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="recipe">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="menu" type="xsd:string" maxOccurs="unbounded" />
<xsd:element name="cooking" type="cooking" />
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" />
<xsd:attribute name="type" type="xsd:string" />
</xsd:complexType>
</xsd:element>
<xsd:complexType name="cooking">
<xsd:sequence>
<xsd:element name="step" type="step" maxOccurs="unbounded" />
</xsd:sequence>
<xsd:attribute name="duration" type="xsd:int" />
</xsd:complexType>
<xsd:complexType name="step" mixed="true">
<xsd:attribute name="optional" type="xsd:boolean" />
</xsd:complexType>
</xsd:schema>
[/xml]
These schema nodes are prefixed with "xsd" because their definitions come from the XML namespace (xmlns) declared on the first line. Namespaces keep declarations that share a name apart, much as a Java package does.
Values are typed following the W3C recommendation. A list of simple values (the menu node) needs no new class. A class is needed, however, when a node carries both a text value and attributes (the step node). By default, an element has either one or the other, but not both.
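To make the role of the prefix concrete, here is a minimal sketch that uses only the DOM parser shipped with the JDK. The class and method names (`NamespaceDemo`, `describeRoot`) are hypothetical and are not part of the original samples. It shows that the prefix is a local alias and that the namespace URI is what actually identifies the node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, introduced only for this illustration.
class NamespaceDemo {

    /** Parses a document and returns "prefix|localName|namespaceURI" for its root element. */
    static String describeRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is off by default and is needed to resolve prefixes.
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getPrefix() + "|" + root.getLocalName() + "|" + root.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String ns = "http://www.w3.org/2001/XMLSchema";
        // Two different prefixes bound to the same namespace URI.
        System.out.println(describeRoot("<xsd:schema xmlns:xsd=\"" + ns + "\"/>"));
        System.out.println(describeRoot("<xs:schema xmlns:xs=\"" + ns + "\"/>"));
    }
}
```

Both calls report the same local name and namespace URI; only the prefix differs, which is why a schema may just as well use "xs" instead of "xsd".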
Here is the generated code from the schema:
[java]
@XmlAccessorType(XmlAccessType.FIELD)
@XmlRootElement(name = "recipe")
public class Recipe {
protected List<String> menu;
protected Cooking cooking;
@XmlAttribute
protected String name;
@XmlAttribute
protected String type;
}
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "cooking")
public class Cooking {
protected List<Step> step;
@XmlAttribute
protected Integer duration;
}
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "step")
public class Step {
@XmlValue
protected String content;
@XmlAttribute
protected Boolean optional;
}
[/java]
Only one class is annotated with @XmlRootElement. It is the only one (in this schema) that can be the first node of the output. Once the object graph is generated, three lines of code are enough to parse an XML document:
[java]
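import static org.junit.Assert.assertEquals;

import java.net.URL;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;
import org.junit.Test;
import com.google.common.io.Resources; // assumption: Resources is the Guava helper

public class JaxbTest {
@Test
public void should_parse_recipe() throws JAXBException {
URL xmlUrl = Resources.getResource("recipe.xml");
Recipe recipe = parse(xmlUrl, Recipe.class);
assertEquals(Integer.valueOf(15), recipe.getCooking().getDuration());
}
// Builds a context for the target class and unmarshals the document found at the URL.
private <T> T parse(URL url, Class<T> clazz) throws JAXBException {
Unmarshaller unmarsh = JAXBContext.newInstance(clazz).createUnmarshaller();
return clazz.cast(unmarsh.unmarshal(url));
}
}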
[/java]
Compared to the low-level APIs, no manual conversion is needed: declaring types in the schema is enough. In case of a conversion error (a string that cannot be converted to the declared type), an appropriate IllegalArgumentException is thrown. When a node is missing, the corresponding field is null.
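For comparison, here is a rough sketch of the same read done with the JDK's XPath API. The class `LowLevelRecipeReader` and its methods are hypothetical names chosen for this illustration. Every value comes back as text, so each conversion has to be written by hand.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, shown only for comparison with the JAXB version.
class LowLevelRecipeReader {

    /** Evaluates an XPath expression against an XML string and returns the raw text. */
    static String text(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    /** The text-to-int conversion is the caller's job; bad input throws NumberFormatException. */
    static int readDuration(String xml) {
        return Integer.parseInt(text(xml, "/recipe/cooking/@duration"));
    }

    /** A missing attribute yields an empty string, which parses as false. */
    static boolean isFirstStepOptional(String xml) {
        return Boolean.parseBoolean(text(xml, "/recipe/cooking/step[1]/@optional"));
    }

    public static void main(String[] args) {
        String xml = "<recipe name=\"Pear compote\">"
                + "<cooking duration=\"15\">"
                + "<step optional=\"true\">Leave aside a vanilla pod</step>"
                + "<step>Peel and core pears</step>"
                + "</cooking></recipe>";
        System.out.println(readDuration(xml));
        System.out.println(isFirstStepOptional(xml));
    }
}
```

Each new field means another path expression and another conversion, whereas the JAXB version only needs the type to be declared once in the schema.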

Refine the exchange format

Once the data binding is successful, more control is needed:

  1. restrict a value to a range of possibilities;
  2. manipulate Java dates instead of the XMLGregorianCalendar that JAXB uses by default;
  3. use a different class name to represent an element;
  4. use object inheritance between elements;
  5. annotate existing classes manually.

Restrict a value to a range of possibilities

Restricting the "type" attribute to "starter", "main" or "desert" can be done this way:
[xml]
<xsd:element name="recipe">
<xsd:complexType>
<!-- element sequence and other attributes omitted -->
<xsd:attribute name="type" type="course" />
</xsd:complexType>
</xsd:element>
<xsd:simpleType name="course">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="starter"></xsd:enumeration>
<xsd:enumeration value="main"></xsd:enumeration>
<xsd:enumeration value="desert"></xsd:enumeration>
</xsd:restriction>
</xsd:simpleType>
[/xml]
Which gives us, when generated:
[java]
@XmlAccessorType(XmlAccessType.FIELD)
@XmlRootElement(name = "recipe")
public class Recipe {
@XmlAttribute
protected Course type;
}
@XmlEnum
public enum Course {
@XmlEnumValue("starter")
STARTER("starter"),
@XmlEnumValue("main")
MAIN("main"),
@XmlEnumValue("desert")
DESERT("desert");
private final String value;
Course(String v) {
value = v;
}
public String value() {
return value;
}
}
[/java]
When binding happens, if the value does not match any of the listed possibilities, a null value is used by default. By convention, JAXB uses a null value when a problem occurs and throws an exception for typing problems.
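If a silent null is not acceptable, the document can be checked against the schema before binding. The following is a minimal sketch that relies only on the JDK's javax.xml.validation package. The class name `CourseValidationDemo` and the reduced inline schema are assumptions made for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo class; the schema is a reduced version of the one in the article.
class CourseValidationDemo {

    private static final String XSD =
          "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xsd:element name=\"recipe\">"
        + "<xsd:complexType>"
        + "<xsd:attribute name=\"type\" type=\"course\"/>"
        + "</xsd:complexType>"
        + "</xsd:element>"
        + "<xsd:simpleType name=\"course\">"
        + "<xsd:restriction base=\"xsd:string\">"
        + "<xsd:enumeration value=\"starter\"/>"
        + "<xsd:enumeration value=\"main\"/>"
        + "<xsd:enumeration value=\"desert\"/>"
        + "</xsd:restriction>"
        + "</xsd:simpleType>"
        + "</xsd:schema>";

    /** Returns true when the XML document is valid against the inline schema. */
    static boolean isValid(String xml) {
        Validator validator;
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            validator = schema.newValidator();
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself could not be compiled", e);
        }
        try {
            // With no error handler set, the first validation error is thrown as a SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document violates the schema
        } catch (IOException e) {
            throw new IllegalStateException("Could not read the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<recipe type=\"main\"/>"));
        System.out.println(isValid("<recipe type=\"snack\"/>"));
    }
}
```

The same Schema object can also be handed to a JAXB Unmarshaller through its setSchema method, so that validation happens while unmarshalling.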

Use java dates

To use a plain java.util.Date instead of the XMLGregorianCalendar that JAXB uses by default, a converter is required. Note that a new namespace is declared on the root element; it allows us to redefine the Java type bound to the standard date type.
[xml]
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
jxb:version="2.0">
<xsd:annotation><xsd:appinfo>
<jxb:globalBindings>
<jxb:javaType name="java.util.Date" xmlType="xsd:dateTime"
parseMethod="com.xebia.jaxb.JaxbDateConverter.parseDate" />
</jxb:globalBindings>
</xsd:appinfo></xsd:annotation>
<xsd:element name="menu" type="xsd:dateTime" />
</xsd:schema>
[/xml]
The converter must adhere to JAXB conventions: if the value does not fit, no exception is thrown and a null value is returned instead.
[java]
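import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class JaxbDateConverter {
// Referenced by parseMethod in the binding declaration, hence public and static.
public static Date parseDate(String s) {
DateFormat formatter = new SimpleDateFormat("dd-MM-yyyy");
try {
return formatter.parse(s);
} catch (ParseException e) {
// JAXB convention: an unparseable value yields null rather than an exception.
return null;
}
}
}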
[/java]
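One detail worth knowing: SimpleDateFormat is lenient by default, so a value such as 32-13-2011 is rolled over to a real date instead of being rejected. The sketch below is a hypothetical variant (`StrictDateConverter` is not part of the original samples) that turns leniency off and adds a print method, which could be wired through the printMethod attribute of jxb:javaType if the dates also have to be written back in the same format.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical variant of the converter; declare it public in its own file
// before referencing it from a binding declaration.
class StrictDateConverter {

    private static final String PATTERN = "dd-MM-yyyy";

    /** Returns the parsed date, or null when the text is not a real dd-MM-yyyy date. */
    public static Date parseDate(String s) {
        // A new formatter per call, because SimpleDateFormat is not thread-safe.
        SimpleDateFormat formatter = new SimpleDateFormat(PATTERN);
        formatter.setLenient(false); // reject impossible dates such as 32-13-2011
        try {
            return formatter.parse(s);
        } catch (ParseException e) {
            return null; // same convention as the original converter
        }
    }

    /** Formats a date back to dd-MM-yyyy; null stays null. */
    public static String printDate(Date d) {
        return d == null ? null : new SimpleDateFormat(PATTERN).format(d);
    }

    public static void main(String[] args) {
        System.out.println(printDate(parseDate("17-02-2011")));
        System.out.println(parseDate("32-13-2011"));
    }
}
```

Whether strict parsing is desirable depends on how much the producer's dates can be trusted; the lenient default silently accepts more input.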
When generating classes, JAXB creates an adapter class bound to the static methods of the converter.
[java]
@XmlAccessorType(XmlAccessType.FIELD)
@XmlRootElement(name = "recipe")
public class Recipe {
@XmlElement(type = String.class)
@XmlJavaTypeAdapter(Adapter1.class)
@XmlSchemaType(name = "dateTime")
protected List<Date> menu;
// other fields omitted
}
[/java]

Use a different class name to represent an element

To give a class a name that differs from the element it represents, modifying the schema in the following manner is enough (be careful to declare both namespaces):
[xml]
<xsd:element name="menu" type="xsd:dateTime">
<xsd:annotation><xsd:appinfo>
<jxb:class name="meal" />
</xsd:appinfo></xsd:annotation>
</xsd:element>
[/xml]

Use object inheritance

Using object inheritance is easy (the generated classes naturally inherit from one another):
[xml]
<xsd:complexType name="menuxl">
<xsd:complexContent>
<xsd:extension base="menu">
<xsd:attribute name="cook" type="xsd:string" />
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
[/xml]

Annotate existing classes manually

So far, we have only used classes generated from the schema. But we can also annotate a class by hand to bind it to the XML. A schema can even be generated from the annotated source code.
Manual annotations give more control over the objects and their types, and make it possible to reuse existing classes that were written for other purposes (a field that should not be bound can be excluded with @XmlTransient).
Once again, we have to adhere to JAXB conventions. In its generated code, JAXB makes list getters return an empty list instead of null. It is advisable to do the same.
[java]
public List<String> getMenu() {
if (menu == null) {
menu = new ArrayList<String>();
}
return menu;
}
[/java]
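The benefit of this convention is easiest to see from the caller's side. Here is a minimal, self-contained sketch; the class name `RecipeMenus` is hypothetical and the JAXB annotations are left out so that it compiles with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a bound class; annotations omitted on purpose.
class RecipeMenus {

    private List<String> menu; // stays null until the getter is first called

    /** Returns the live list, creating it on first access, so callers never see null. */
    public List<String> getMenu() {
        if (menu == null) {
            menu = new ArrayList<>();
        }
        return menu;
    }

    public static void main(String[] args) {
        RecipeMenus recipe = new RecipeMenus();
        // No null check needed, even though nothing was ever assigned.
        System.out.println(recipe.getMenu().isEmpty());
        // No setter needed either: callers modify the live list directly.
        recipe.getMenu().add("17-02-2011");
        System.out.println(recipe.getMenu());
    }
}
```

Because the getter hands out the live list, code that builds a recipe and code that reads an unmarshalled one can treat the field the same way.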

Epilogue: exchange format on the server

Until now, we have assumed that the producer gave its consumers no description of how its nodes are organised, and we worked around this by describing the exchange format ourselves. When possible, it is better to remove that constraint by giving consumers a schema along with the data.
To do that, it is wise to opt for a contract-first development style. To limit coupling between the schema and its implementation, it is advisable to generate the classes on the server side rather than annotating them by hand. Staying within the types and structures that an XSD schema can express guarantees that the output is compatible with any kind of client (non-Java ones in particular). The Spring reference documentation champions this idea.

Appendix: tools

To run the samples, you will need the JAXB Maven dependency and the associated generation plugin. To ease testing, we also provide a simple plugin configuration (generation can be launched with mvn jaxb2:xjc):
[xml]
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>jaxb2-maven-plugin</artifactId>
<configuration>
<outputDirectory>${basedir}/src/main/java</outputDirectory>
<schemaDirectory>${basedir}/src/main/resources/xsd</schemaDirectory>
<packageName>com.xebia.jaxb.generated</packageName>
<schemaFiles>schema.xsd</schemaFiles>
</configuration>
</plugin>
[/xml]
To dig deeper into data binding with JAXB, extensive documentation is available here.
