C++/Tree Mapping User Manual

Copyright © 2005-2014 CODE SYNTHESIS TOOLS CC

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, version 1.2; with no Invariant Sections, no Front-Cover Texts and no Back-Cover Texts.

This document is available in the following formats: XHTML, PDF, and PostScript.

Table of Contents

Preface
About This Document
More Information
1Introduction
2C++/Tree Mapping
2.1Preliminary Information
2.1.1C++ Standard
2.1.2Identifiers
2.1.3Character Type and Encoding
2.1.4XML Schema Namespace
2.1.5Anonymous Types
2.2Error Handling
2.2.1xml_schema::duplicate_id
2.3Mapping for import and include
2.3.1Import
2.3.2Inclusion with Target Namespace
2.3.3Inclusion without Target Namespace
2.4Mapping for Namespaces
2.5Mapping for Built-in Data Types
2.5.1Inheritance from Built-in Data Types
2.5.2Mapping for anyType
2.5.3Mapping for anySimpleType
2.5.4Mapping for QName
2.5.5Mapping for IDREF
2.5.6Mapping for base64Binary and hexBinary
2.5.7Time Zone Representation
2.5.8Mapping for date
2.5.9Mapping for dateTime
2.5.10Mapping for duration
2.5.11Mapping for gDay
2.5.12Mapping for gMonth
2.5.13Mapping for gMonthDay
2.5.14Mapping for gYear
2.5.15Mapping for gYearMonth
2.5.16Mapping for time
2.6Mapping for Simple Types
2.6.1Mapping for Derivation by Restriction
2.6.2Mapping for Enumerations
2.6.3Mapping for Derivation by List
2.6.4Mapping for Derivation by Union
2.7Mapping for Complex Types
2.7.1Mapping for Derivation by Extension
2.7.2Mapping for Derivation by Restriction
2.8Mapping for Local Elements and Attributes
2.8.1Mapping for Members with the One Cardinality Class
2.8.2Mapping for Members with the Optional Cardinality Class
2.8.3Mapping for Members with the Sequence Cardinality Class
2.8.4Element Order
2.9Mapping for Global Elements
2.9.1Element Types
2.9.2Element Map
2.10Mapping for Global Attributes
2.11Mapping for xsi:type and Substitution Groups
2.12Mapping for any and anyAttribute
2.12.1Mapping for any with the One Cardinality Class
2.12.2Mapping for any with the Optional Cardinality Class
2.12.3Mapping for any with the Sequence Cardinality Class
2.12.4Element Wildcard Order
2.12.5Mapping for anyAttribute
2.13Mapping for Mixed Content Models
3Parsing
3.1Initializing the Xerces-C++ Runtime
3.2Flags and Properties
3.3Error Handling
3.3.1xml_schema::parsing
3.3.2xml_schema::expected_element
3.3.3xml_schema::unexpected_element
3.3.4xml_schema::expected_attribute
3.3.5xml_schema::unexpected_enumerator
3.3.6xml_schema::expected_text_content
3.3.7xml_schema::no_type_info
3.3.8xml_schema::not_derived
3.3.9xml_schema::not_prefix_mapping
3.4Reading from a Local File or URI
3.5Reading from std::istream
3.6Reading from xercesc::InputSource
3.7Reading from DOM
4Serialization
4.1Initializing the Xerces-C++ Runtime
4.2Namespace Infomap and Character Encoding
4.3Flags
4.4Error Handling
4.4.1xml_schema::serialization
4.4.2xml_schema::unexpected_element
4.4.3xml_schema::no_type_info
4.5Serializing to std::ostream
4.6Serializing to xercesc::XMLFormatTarget
4.7Serializing to DOM
5Additional Functionality
5.1DOM Association
5.2Binary Serialization
Appendix A — Default and Fixed Values

Preface

About This Document

This document describes the mapping of W3C XML Schema to the C++ programming language as implemented by CodeSynthesis XSD - an XML Schema to C++ data binding compiler. The mapping represents information stored in XML instance documents as a statically-typed, tree-like in-memory data structure and is called C++/Tree.

Revision 4.0.0
This revision of the manual describes the C++/Tree mapping as implemented by CodeSynthesis XSD version 4.0.0.

This document is available in the following formats: XHTML, PDF, and PostScript.

More Information

Beyond this manual, you may also find the following sources of information useful:

1 Introduction

C++/Tree is a W3C XML Schema to C++ mapping that represents the data stored in XML as a statically-typed, vocabulary-specific object model. Based on a formal description of an XML vocabulary (schema), the C++/Tree mapping produces a tree-like data structure suitable for in-memory processing as well as XML parsing and serialization code.

A typical application that processes XML documents usually performs the following three steps: it first reads (parses) an XML instance document to an object model, it then performs some useful computations on that model which may involve modification of the model, and finally it may write (serialize) the modified object model back to XML.

The C++/Tree mapping consists of C++ types that represent the given vocabulary (Chapter 2, "C++/Tree Mapping"), a set of parsing functions that convert XML documents to a tree-like in-memory data structure (Chapter 3, "Parsing"), and a set of serialization functions that convert the object model back to XML (Chapter 4, "Serialization"). Furthermore, the mapping provides a number of additional features, such as DOM association and binary serialization, that can be useful in some applications (Chapter 5, "Additional Functionality").

2 C++/Tree Mapping

2.1 Preliminary Information

2.1.1 C++ Standard

The C++/Tree mapping provides support for ISO/IEC C++ 1998/2003 (C++98) and ISO/IEC C++ 2011 (C++11). To select the C++ standard for the generated code we use the --std XSD compiler command line option. While the majority of the examples in this manual use C++98, support for the new functionality and library components introduced in C++11 are discussed throughout the document.

2.1.2 Identifiers

XML Schema names may happen to be reserved C++ keywords or contain characters that are illegal in C++ identifiers. To avoid C++ compilation problems, such names are changed (escaped) when mapped to C++. If an XML Schema name is a C++ keyword, the "_" suffix is added to it. All character of an XML Schema name that are not allowed in C++ identifiers are replaced with "_".

For example, XML Schema name try will be mapped to C++ identifier try_. Similarly, XML Schema name strange.na-me will be mapped to C++ identifier strange_na_me.

Furthermore, conflicts between type names and function names in the same scope are resolved using name escaping. Such conflicts include both a global element (which is mapped to a set of parsing and/or serialization functions or element types, see Section 2.9, "Mapping for Global Elements") and a global type sharing the same name as well as a local element or attribute inside a type having the same name as the type itself.

For example, if we had a global type catalog and a global element with the same name then the type would be mapped to a C++ class with name catalog while the parsing functions corresponding to the global element would have their names escaped as catalog_.

By default the mapping uses the so-called K&R (Kernighan and Ritchie) identifier naming convention which is also used throughout this manual. In this convention both type and function names are in lower case and words are separated by underscores. If your application code or schemas use a different notation, you may want to change the naming convention used by the mapping for consistency. The compiler supports a set of widely-used naming conventions that you can select with the --type-naming and --function-naming options. You can also further refine one of the predefined conventions or create a completely custom naming scheme by using the --*-regex options. For more detailed information on these options refer to the NAMING CONVENTION section in the XSD Compiler Command Line Manual.

2.1.3 Character Type and Encoding

The code that implements the mapping, depending on the --char-type option, is generated using either char or wchar_t as the character type. In this document code samples use symbol C to refer to the character type you have selected when translating your schemas, for example std::basic_string<C>.

Another aspect of the mapping that depends on the character type is character encoding. For the char character type the default encoding is UTF-8. Other supported encodings are ISO-8859-1, Xerces-C++ Local Code Page (LPC), as well as custom encodings and can be selected with the --char-encoding command line option.

For the wchar_t character type the encoding is automatically selected between UTF-16 and UTF-32/UCS-4 depending on the size of the wchar_t type. On some platforms (for example, Windows with Visual C++ and AIX with IBM XL C++) wchar_t is 2 bytes long. For these platforms the encoding is UTF-16. On other platforms wchar_t is 4 bytes long and UTF-32/UCS-4 is used.

2.1.4 XML Schema Namespace

The mapping relies on some predefined types, classes, and functions that are logically defined in the XML Schema namespace reserved for the XML Schema language (http://www.w3.org/2001/XMLSchema). By default, this namespace is mapped to C++ namespace xml_schema. It is automatically accessible from a C++ compilation unit that includes a header file generated from an XML Schema definition.

Note that, if desired, the default mapping of this namespace can be changed as described in Section 2.4, "Mapping for Namespaces".

2.1.5 Anonymous Types

For the purpose of code generation, anonymous types defined in XML Schema are automatically assigned names that are derived from enclosing attributes and elements. Otherwise, such types follows standard mapping rules for simple and complex type definitions (see Section 2.6, "Mapping for Simple Types" and Section 2.7, "Mapping for Complex Types"). For example, in the following schema fragment:

<element name="object">
  <complexType>
    ...
  </complexType>
</element>
  

The anonymous type defined inside element object will be given name object. The compiler has a number of options that control the process of anonymous type naming. For more information refer to the XSD Compiler Command Line Manual.

2.2 Error Handling

The mapping uses the C++ exception handling mechanism as a primary way of reporting error conditions. All exceptions that are specified in this mapping derive from xml_schema::exception which itself is derived from std::exception:

struct exception: virtual std::exception
{
  friend
  std::basic_ostream<C>&
  operator<< (std::basic_ostream<C>& os, const exception& e)
  {
    e.print (os);
    return os;
  }

protected:
  virtual void
  print (std::basic_ostream<C>&) const = 0;
};
  

The exception hierarchy supports "virtual" operator<< which allows you to obtain diagnostics corresponding to the thrown exception using the base exception interface. For example:

try
{
  ...
}
catch (const xml_schema::exception& e)
{
  cerr << e << endl;
}
  

The following sub-sections describe exceptions thrown by the types that constitute the object model. Section 3.3, "Error Handling" of Chapter 3, "Parsing" describes exceptions and error handling mechanisms specific to the parsing functions. Section 4.4, "Error Handling" of Chapter 4, "Serialization" describes exceptions and error handling mechanisms specific to the serialization functions.

2.2.1 xml_schema::duplicate_id

struct duplicate_id: virtual exception
{
  duplicate_id (const std::basic_string<C>& id);

  const std::basic_string<C>&
  id () const;

  virtual const char*
  what () const throw ();
};
  

The xml_schema::duplicate_id is thrown when a conflicting instance of xml_schema::id (see Section 2.5, "Mapping for Built-in Data Types") is added to a tree. The offending ID value can be obtained using the id function.

2.3 Mapping for import and include

2.3.1 Import

The XML Schema import element is mapped to the C++ Preprocessor #include directive. The value of the schemaLocation attribute is used to derive the name of the header file that appears in the #include directive. For instance:

<import namespace="http://www.codesynthesis.com/test"
        schemaLocation="test.xsd"/>
  

is mapped to:

#include "test.hxx"
  

Note that you will need to compile imported schemas separately in order to produce corresponding header files.

2.3.2 Inclusion with Target Namespace

The XML Schema include element which refers to a schema with a target namespace or appears in a schema without a target namespace follows the same mapping rules as the import element, see Section 2.3.1, "Import".

2.3.3 Inclusion without Target Namespace

For the XML Schema include element which refers to a schema without a target namespace and appears in a schema with a target namespace (such inclusion sometimes called "chameleon inclusion"), declarations and definitions from the included schema are generated in-line in the namespace of the including schema as if they were declared and defined there verbatim. For example, consider the following two schemas:

<-- common.xsd -->
<schema>
  <complexType name="type">
  ...
  </complexType>
</schema>

<-- test.xsd -->
<schema targetNamespace="http://www.codesynthesis.com/test">
  <include schemaLocation="common.xsd"/>
</schema>
  

The fragment of interest from the generated header file for text.xsd would look like this:

// test.hxx
namespace test
{
  class type
  {
    ...
  };
}
  

2.4 Mapping for Namespaces

An XML Schema namespace is mapped to one or more nested C++ namespaces. XML Schema namespaces are identified by URIs. By default, a namespace URI is mapped to a sequence of C++ namespace names by removing the protocol and host parts and splitting the rest into a sequence of names with '/' as the name separator. For instance:

<schema targetNamespace="http://www.codesynthesis.com/system/test">
  ...
</schema>
  

is mapped to:

namespace system
{
  namespace test
  {
    ...
  }
}
  

The default mapping of namespace URIs to C++ namespace names can be altered using the --namespace-map and --namespace-regex options. See the XSD Compiler Command Line Manual for more information.

2.5 Mapping for Built-in Data Types

The mapping of XML Schema built-in data types to C++ types is summarized in the table below.

XML Schema type Alias in the xml_schema namespace C++ type
anyType and anySimpleType types
anyType type Section 2.5.2, "Mapping for anyType"
anySimpleType simple_type Section 2.5.3, "Mapping for anySimpleType"
fixed-length integral types
byte byte signed char
unsignedByte unsigned_byte unsigned char
short short_ short
unsignedShort unsigned_short unsigned short
int int_ int
unsignedInt unsigned_int unsigned int
long long_ long long
unsignedLong unsigned_long unsigned long long
arbitrary-length integral types
integer integer long long
nonPositiveInteger non_positive_integer long long
nonNegativeInteger non_negative_integer unsigned long long
positiveInteger positive_integer unsigned long long
negativeInteger negative_integer long long
boolean types
boolean boolean bool
fixed-precision floating-point types
float float_ float
double double_ double
arbitrary-precision floating-point types
decimal decimal double
string types
string string type derived from std::basic_string
normalizedString normalized_string type derived from string
token token type derived from normalized_string
Name name type derived from token
NMTOKEN nmtoken type derived from token
NMTOKENS nmtokens type derived from sequence<nmtoken>
NCName ncname type derived from name
language language type derived from token
qualified name
QName qname Section 2.5.4, "Mapping for QName"
ID/IDREF types
ID id type derived from ncname
IDREF idref Section 2.5.5, "Mapping for IDREF"
IDREFS idrefs type derived from sequence<idref>
URI types
anyURI uri type derived from std::basic_string
binary types
base64Binary base64_binary Section 2.5.6, "Mapping for base64Binary and hexBinary"
hexBinary hex_binary
date/time types
date date Section 2.5.8, "Mapping for date"
dateTime date_time Section 2.5.9, "Mapping for dateTime"
duration duration Section 2.5.10, "Mapping for duration"
gDay gday Section 2.5.11, "Mapping for gDay"
gMonth gmonth Section 2.5.12, "Mapping for gMonth"
gMonthDay gmonth_day Section 2.5.13, "Mapping for gMonthDay"
gYear gyear Section 2.5.14, "Mapping for gYear"
gYearMonth gyear_month Section 2.5.15, "Mapping for gYearMonth"
time time Section 2.5.16, "Mapping for time"
entity types
ENTITY entity type derived from name
ENTITIES entities type derived from sequence<entity>

All XML Schema built-in types are mapped to C++ classes that are derived from the xml_schema::simple_type class except where the mapping is to a fundamental C++ type.

The sequence class template is defined in an implementation-specific namespace. It conforms to the sequence interface as defined by the ISO/ANSI Standard for C++ (ISO/IEC 14882:1998, Section 23.1.1, "Sequences"). Practically, this means that you can treat such a sequence as if it was std::vector. One notable extension to the standard interface that is available only for sequences of non-fundamental C++ types is the addition of the overloaded push_back and insert member functions which instead of the constant reference to the element type accept automatic pointer (std::auto_ptr or std::unique_ptr, depending on the C++ standard selected) to the element type. These functions assume ownership of the pointed to object and reset the passed automatic pointer.

2.5.1 Inheritance from Built-in Data Types

In cases where the mapping calls for an inheritance from a built-in type which is mapped to a fundamental C++ type, a proxy type is used instead of the fundamental C++ type (C++ does not allow inheritance from fundamental types). For instance:

<simpleType name="my_int">
  <restriction base="int"/>
</simpleType>
  

is mapped to:

class my_int: public fundamental_base<int>
{
  ...
};
  

The fundamental_base class template provides a close emulation (though not exact) of a fundamental C++ type. It is defined in an implementation-specific namespace and has the following interface:

template <typename X>
class fundamental_base: public simple_type
{
public:
  fundamental_base ();
  fundamental_base (X)
  fundamental_base (const fundamental_base&)

public:
  fundamental_base&
  operator= (const X&);

public:
  operator const X & () const;
  operator X& ();

  template <typename Y>
  operator Y () const;

  template <typename Y>
  operator Y ();
};
  

2.5.2 Mapping for anyType

The XML Schema anyType built-in data type is mapped to the xml_schema::type C++ class:

class type
{
public:
  virtual
  ~type ();

  type ();
  type (const type&);

  type&
  operator= (const type&);

  virtual type*
  _clone () const;

  // anyType DOM content.
  //
public:
  typedef element_optional dom_content_optional;

  const dom_content_optional&
  dom_content () const;

  dom_content_optional&
  dom_content ();

  void
  dom_content (const xercesc::DOMElement&);

  void
  dom_content (xercesc::DOMElement*);

  void
  dom_content (const dom_content_optional&);

  const xercesc::DOMDocument&
  dom_content_document () const;

  xercesc::DOMDocument&
  dom_content_document ();

  bool
  null_content () const;

  // DOM association.
  //
public:
  const xercesc::DOMNode*
  _node () const;

  xercesc::DOMNode*
  _node ();
};
  

When xml_schema::type is used to create an instance (as opposed to being a base of a derived type), it represents the XML Schema anyType type. anyType allows any attributes and any content in any order. In the C++/Tree mapping this content can be represented as a DOM fragment, similar to XML Schema wildcards (Section 2.12, "Mapping for any and anyAttribute").

To enable automatic extraction of anyType content during parsing, the --generate-any-type option must be specified. Because the DOM API is used to access such content, the Xerces-C++ runtime should be initialized by the application prior to parsing and should remain initialized for the lifetime of objects with the DOM content. For more information on the Xerces-C++ runtime initialization see Section 3.1, "Initializing the Xerces-C++ Runtime".

The DOM content is stored as the optional DOM element container and the DOM content accessors and modifiers presented above are identical to those generated for an optional element wildcard. Refer to Section 2.12.2, "Mapping for any with the Optional Cardinality Class" for details on their semantics.

The dom_content_document() function returns the DOM document used to store the raw XML content corresponding to the anyType instance. It is equivalent to the dom_document() function generated for types with wildcards.

The null_content() accessor is an optimization function that allows us to check for the lack of content without actually creating its empty representation, that is, empty DOM document for anyType or empty string for anySimpleType (see the following section for details on anySimpleType).

For more information on DOM association refer to Section 5.1, "DOM Association".

2.5.3 Mapping for anySimpleType

The XML Schema anySimpleType built-in data type is mapped to the xml_schema::simple_type C++ class:

class simple_type: public type
{
public:
  simple_type ();
  simple_type (const C*);
  simple_type (const std::basic_string<C>&);

  simple_type (const simple_type&);

  simple_type&
  operator= (const simple_type&);

  virtual simple_type*
  _clone () const;

  // anySimpleType text content.
  //
public:
  const std::basic_string<C>&
  text_content () const;

  std::basic_string<C>&
  text_content ();

  void
  text_content (const std::basic_string<C>&);
};
  

When xml_schema::simple_type is used to create an instance (as opposed to being a base of a derived type), it represents the XML Schema anySimpleType type. anySimpleType allows any simple content. In the C++/Tree mapping this content can be represented as a string and accessed or modified with the text_content() functions shown above.

2.5.4 Mapping for QName

The XML Schema QName built-in data type is mapped to the xml_schema::qname C++ class:

class qname: public simple_type
{
public:
  qname (const ncname&);
  qname (const uri&, const ncname&);
  qname (const qname&);

public:
  qname&
  operator= (const qname&);

public:
  virtual qname*
  _clone () const;

public:
  bool
  qualified () const;

  const uri&
  namespace_ () const;

  const ncname&
  name () const;
};
  

The qualified accessor function can be used to determine if the name is qualified.

2.5.5 Mapping for IDREF

The XML Schema IDREF built-in data type is mapped to the xml_schema::idref C++ class. This class implements the smart pointer C++ idiom:

class idref: public ncname
{
public:
  idref (const C* s);
  idref (const C* s, std::size_t n);
  idref (std::size_t n, C c);
  idref (const std::basic_string<C>&);
  idref (const std::basic_string<C>&,
         std::size_t pos,
         std::size_t n = npos);

public:
  idref (const idref&);

public:
  virtual idref*
  _clone () const;

public:
  idref&
  operator= (C c);

  idref&
  operator= (const C* s);

  idref&
  operator= (const std::basic_string<C>&)

  idref&
  operator= (const idref&);

public:
  const type*
  operator-> () const;

  type*
  operator-> ();

  const type&
  operator* () const;

  type&
  operator* ();

  const type*
  get () const;

  type*
  get ();

  // Conversion to bool.
  //
public:
  typedef void (idref::*bool_convertible)();
  operator bool_convertible () const;
};
  

The object, idref instance refers to, is the immediate container of the matching id instance. For example, with the following instance document and schema:

<!-- test.xml -->
<root>
  <object id="obj-1" text="hello"/>
  <reference>obj-1</reference>
</root>

<!-- test.xsd -->
<schema>
  <complexType name="object_type">
    <attribute name="id" type="ID"/>
    <attribute name="text" type="string"/>
  </complexType>

  <complexType name="root_type">
    <sequence>
      <element name="object" type="object_type"/>
      <element name="reference" type="IDREF"/>
    </sequence>
  </complexType>

  <element name="root" type="root_type"/>
</schema>
  

The ref instance in the code below will refer to an object of type object_type:

root_type& root = ...;
xml_schema::idref& ref (root.reference ());
object_type& obj (dynamic_cast<object_type&> (*ref));
cout << obj.text () << endl;
  

The smart pointer interface of the idref class always returns a pointer or reference to xml_schema::type. This means that you will need to manually cast such pointer or reference to its real (dynamic) type before you can use it (unless all you need is the base interface provided by xml_schema::type). As a special extension to the XML Schema language, the mapping supports static typing of idref references by employing the refType extension attribute. The following example illustrates this mechanism:

<!-- test.xsd -->
<schema
  xmlns:xse="http://www.codesynthesis.com/xmlns/xml-schema-extension">

  ...

      <element name="reference" type="IDREF" xse:refType="object_type"/>

  ...

</schema>
  

With this modification we do not need to do manual casting anymore:

root_type& root = ...;
root_type::reference_type& ref (root.reference ());
object_type& obj (*ref);
cout << ref->text () << endl;
  

2.5.6 Mapping for base64Binary and hexBinary

The XML Schema base64Binary and hexBinary built-in data types are mapped to the xml_schema::base64_binary and xml_schema::hex_binary C++ classes, respectively. The base64_binary and hex_binary classes support a simple buffer abstraction by inheriting from the xml_schema::buffer class:

class bounds: public virtual exception
{
public:
  virtual const char*
  what () const throw ();
};

class buffer
{
public:
  typedef std::size_t size_t;

public:
  buffer (size_t size = 0);
  buffer (size_t size, size_t capacity);
  buffer (const void* data, size_t size);
  buffer (const void* data, size_t size, size_t capacity);
  buffer (void* data,
          size_t size,
          size_t capacity,
          bool assume_ownership);

public:
  buffer (const buffer&);

  buffer&
  operator= (const buffer&);

  void
  swap (buffer&);

public:
  size_t
  capacity () const;

  bool
  capacity (size_t);

public:
  size_t
  size () const;

  bool
  size (size_t);

public:
  const char*
  data () const;

  char*
  data ();

  const char*
  begin () const;

  char*
  begin ();

  const char*
  end () const;

  char*
  end ();
};
  

The last overloaded constructor reuses an existing data buffer instead of making a copy. If the assume_ownership argument is true, the instance assumes ownership of the memory block pointed to by the data argument and will eventually release it by calling operator delete. The capacity and size modifier functions return true if the underlying buffer has moved.

The bounds exception is thrown if the constructor arguments violate the (size <= capacity) constraint.

The base64_binary and hex_binary classes support the buffer interface and perform automatic decoding/encoding from/to the Base64 and Hex formats, respectively:

class base64_binary: public simple_type, public buffer
{
public:
  base64_binary (size_t size = 0);
  base64_binary (size_t size, size_t capacity);
  base64_binary (const void* data, size_t size);
  base64_binary (const void* data, size_t size, size_t capacity);
  base64_binary (void* data,
                 size_t size,
                 size_t capacity,
                 bool assume_ownership);

public:
  base64_binary (const base64_binary&);

  base64_binary&
  operator= (const base64_binary&);

  virtual base64_binary*
  _clone () const;

public:
  std::basic_string<C>
  encode () const;
};
  
class hex_binary: public simple_type, public buffer
{
public:
  hex_binary (size_t size = 0);
  hex_binary (size_t size, size_t capacity);
  hex_binary (const void* data, size_t size);
  hex_binary (const void* data, size_t size, size_t capacity);
  hex_binary (void* data,
              size_t size,
              size_t capacity,
              bool assume_ownership);

public:
  hex_binary (const hex_binary&);

  hex_binary&
  operator= (const hex_binary&);

  virtual hex_binary*
  _clone () const;

public:
  std::basic_string<C>
  encode () const;
};
  

2.5.7 Time Zone Representation

The date, dateTime, gDay, gMonth, gMonthDay, gYear, gYearMonth, and time XML Schema built-in types all include an optional time zone component. The following xml_schema::time_zone base class is used to represent this information:

class time_zone
{
public:
  time_zone ();
  time_zone (short hours, short minutes);

  bool
  zone_present () const;

  void
  zone_reset ();

  short
  zone_hours () const;

  void
  zone_hours (short);

  short
  zone_minutes () const;

  void
  zone_minutes (short);
};

bool
operator== (const time_zone&, const time_zone&);

bool
operator!= (const time_zone&, const time_zone&);
  

The zone_present() accessor function returns true if the time zone is specified. The zone_reset() modifier function resets the time zone object to the not specified state. If the time zone offset is negative then both hours and minutes components are represented as negative integers.

2.5.8 Mapping for date

The XML Schema date built-in data type is mapped to the xml_schema::date C++ class which represents a year, a day, and a month with an optional time zone. Its interface is presented below. For more information on the base xml_schema::time_zone class refer to Section 2.5.7, "Time Zone Representation".

class date: public simple_type, public time_zone
{
public:
  date (int year, unsigned short month, unsigned short day);
  date (int year, unsigned short month, unsigned short day,
        short zone_hours, short zone_minutes);

public:
  date (const date&);

  date&
  operator= (const date&);

  virtual date*
  _clone () const;

public:
  int
  year () const;

  void
  year (int);

  unsigned short
  month () const;

  void
  month (unsigned short);

  unsigned short
  day () const;

  void
  day (unsigned short);
};

bool
operator== (const date&, const date&);

bool
operator!= (const date&, const date&);
  

2.5.9 Mapping for dateTime

The XML Schema dateTime built-in data type is mapped to the xml_schema::date_time C++ class which represents a year, a month, a day, hours, minutes, and seconds with an optional time zone. Its interface is presented below. For more information on the base xml_schema::time_zone class refer to Section 2.5.7, "Time Zone Representation".

class date_time: public simple_type, public time_zone
{
public:
  date_time (int year, unsigned short month, unsigned short day,
             unsigned short hours, unsigned short minutes,
             double seconds);

  date_time (int year, unsigned short month, unsigned short day,
             unsigned short hours, unsigned short minutes,
             double seconds, short zone_hours, short zone_minutes);
public:
  date_time (const date_time&);

  date_time&
  operator= (const date_time&);

  virtual date_time*
  _clone () const;

public:
  int
  year () const;

  void
  year (int);

  unsigned short
  month () const;

  void
  month (unsigned short);

  unsigned short
  day () const;

  void
  day (unsigned short);

  unsigned short
  hours () const;

  void
  hours (unsigned short);

  unsigned short
  minutes () const;

  void
  minutes (unsigned short);

  double
  seconds () const;

  void
  seconds (double);
};

bool
operator== (const date_time&, const date_time&);

bool
operator!= (const date_time&, const date_time&);
  

2.5.10 Mapping for duration

The XML Schema duration built-in data type is mapped to the xml_schema::duration C++ class which represents a potentially negative duration in the form of years, months, days, hours, minutes, and seconds. Its interface is presented below.

class duration: public simple_type
{
public:
  duration (bool negative,
            unsigned int years, unsigned int months, unsigned int days,
            unsigned int hours, unsigned int minutes, double seconds);
public:
  duration (const duration&);

  duration&
  operator= (const duration&);

  virtual duration*
  _clone () const;

public:
  bool
  negative () const;

  void
  negative (bool);

  unsigned int
  years () const;

  void
  years (unsigned int);

  unsigned int
  months () const;

  void
  months (unsigned int);

  unsigned int
  days () const;

  void
  days (unsigned int);

  unsigned int
  hours () const;

  void
  hours (unsigned int);

  unsigned int
  minutes () const;

  void
  minutes (unsigned int);

  double
  seconds () const;

  void
  seconds (double);
};

bool
operator== (const duration&, const duration&);

bool
operator!= (const duration&, const duration&);
  

2.5.11 Mapping for gDay

The XML Schema gDay built-in data type is mapped to the xml_schema::gday C++ class which represents a day of the month with an optional time zone. Its interface is presented below. For more information on the base xml_schema::time_zone class refer to Section 2.5.7, "Time Zone Representation".

class gday: public simple_type, public time_zone
{
public:
  explicit
  gday (unsigned short day);
  gday (unsigned short day, short zone_hours, short zone_minutes);

public:
  gday (const gday&);

  gday&
  operator= (const gday&);

  virtual gday*
  _clone () const;

public:
  unsigned short
  day () const;

  void
  day (unsigned short);
};

bool
operator== (const gday&, const gday&);

bool
operator!= (const gday&, const gday&);
  

2.5.12 Mapping for gMonth

The XML Schema gMonth built-in data type is mapped to the xml_schema::gmonth C++ class which represents a month of the year with an optional time zone. Its interface is presented below. For more information on the base xml_schema::time_zone class refer to Section 2.5.7, "Time Zone Representation".

class gmonth: public simple_type, public time_zone
{
public:
  explicit
  gmonth (unsigned short month);
  gmonth (unsigned short month,
          short zone_hours, short zone_minutes);

public:
  gmonth (const gmonth&);

  gmonth&
  operator= (const gmonth&);

  virtual gmonth*
  _clone () const;

public:
  unsigned short
  month () const;

  void
  month (unsigned short);
};

bool
operator== (const gmonth&, const gmonth&);

bool
operator!= (const gmonth&, const gmonth&);
  

2.5.13 Mapping for gMonthDay

The XML Schema gMonthDay built-in data type is mapped to the xml_schema::gmonth_day C++ class which represents a day and a month of the year with an optional time zone. Its interface is presented below. For more information on the base xml_schema::time_zone class refer to Section 2.5.7, "Time Zone Representation".

class gmonth_day: public simple_type, public time_zone
{
public:
  gmonth_day (unsigned short month, unsigned short day);
  gmonth_day (unsigned short month, unsigned short day,
              short zone_hours, short zone_minutes);

public:
  gmonth_day (const gmonth_day&);

  gmonth_day&
  operator= (const gmonth_day&);

  virtual gmonth_day*
  _clone () const;

public:
  unsigned short
  month () const;

  void
  month (unsigned short);

  unsigned short
  day () const;

  void
  day (unsigned short);
};

bool
operator== (const gmonth_day&, const gmonth_day&);

bool
operator!= (const gmonth_day&, const gmonth_day&);
  

2.5.14 Mapping for gYear

The XML Schema gYear built-in data type is mapped to the xml_schema::gyear C++ class which represents a year with an optional time zone. Its interface is presented below. For more information on the base xml_schema::time_zone class refer to Section 2.5.7, "Time Zone Representation".

class gyear: public simple_type, public time_zone
{
public:
  explicit
  gyear (int year);
  gyear (int year, short zone_hours, short zone_minutes);

public:
  gyear (const gyear&);

  gyear&
  operator= (const gyear&);

  virtual gyear*
  _clone () const;

public:
  int
  year () const;

  void
  year (int);
};

bool
operator== (const gyear&, const gyear&);

bool
operator!= (const gyear&, const gyear&);
  

2.5.15 Mapping for gYearMonth

The XML Schema gYearMonth built-in data type is mapped to the xml_schema::gyear_month C++ class which represents a year and a month with an optional time zone. Its interface is presented below. For more information on the base xml_schema::time_zone class refer to Section 2.5.7, "Time Zone Representation".

class gyear_month: public simple_type, public time_zone
{
public:
  gyear_month (int year, unsigned short month);
  gyear_month (int year, unsigned short month,
               short zone_hours, short zone_minutes);
public:
  gyear_month (const gyear_month&);

  gyear_month&
  operator= (const gyear_month&);

  virtual gyear_month*
  _clone () const;

public:
  int
  year () const;

  void
  year (int);

  unsigned short
  month () const;

  void
  month (unsigned short);
};

bool
operator== (const gyear_month&, const gyear_month&);

bool
operator!= (const gyear_month&, const gyear_month&);
  

2.5.16 Mapping for time

The XML Schema time built-in data type is mapped to the xml_schema::time C++ class which represents hours, minutes, and seconds with an optional time zone. Its interface is presented below. For more information on the base xml_schema::time_zone class refer to Section 2.5.7, "Time Zone Representation".

class time: public simple_type, public time_zone
{
public:
  time (unsigned short hours, unsigned short minutes, double seconds);
  time (unsigned short hours, unsigned short minutes, double seconds,
        short zone_hours, short zone_minutes);

public:
  time (const time&);

  time&
  operator= (const time&);

  virtual time*
  _clone () const;

public:
  unsigned short
  hours () const;

  void
  hours (unsigned short);

  unsigned short
  minutes () const;

  void
  minutes (unsigned short);

  double
  seconds () const;

  void
  seconds (double);
};

bool
operator== (const time&, const time&);

bool
operator!= (const time&, const time&);
  

2.6 Mapping for Simple Types

An XML Schema simple type is mapped to a C++ class with the same name as the simple type. The class defines a public copy constructor, a public copy assignment operator, and a public virtual _clone function. The _clone function is declared const, does not take any arguments, and returns a pointer to a complete copy of the instance allocated in the free store. The _clone function shall be used to make copies when static type and dynamic type of the instance may differ (see Section 2.11, "Mapping for xsi:type and Substitution Groups"). For instance:

<simpleType name="object">
  ...
</simpleType>
  

is mapped to:

class object: ...
{
public:
  object (const object&);

public:
  object&
  operator= (const object&);

public:
  virtual object*
  _clone () const;

  ...

};
  

The base class specification and the rest of the class definition depend on the type of derivation used to define the simple type.

2.6.1 Mapping for Derivation by Restriction

XML Schema derivation by restriction is mapped to C++ public inheritance. The base type of the restriction becomes the base type for the resulting C++ class. In addition to the members described in Section 2.6, "Mapping for Simple Types", the resulting C++ class defines a public constructor with the base type as its single argument. For instance:

<simpleType name="object">
  <restriction base="base">
    ...
  </restriction>
</simpleType>
  

is mapped to:

class object: public base
{
public:
  object (const base&);
  object (const object&);

public:
  object&
  operator= (const object&);

public:
  virtual object*
  _clone () const;
};
  

2.6.2 Mapping for Enumerations

XML Schema restriction by enumeration is mapped to a C++ class with semantics similar to C++ enum. Each XML Schema enumeration element is mapped to a C++ enumerator with the name derived from the value attribute and defined in the class scope. In addition to the members described in Section 2.6, "Mapping for Simple Types", the resulting C++ class defines a public constructor that can be called with one of the enumerators as its single argument, a public constructor that can be called with enumeration's base value as its single argument, a public assignment operator that can be used to assign the value of one of the enumerators, and a public implicit conversion operator to the underlying C++ enum type.

Furthermore, for string-based enumeration types, the resulting C++ class defines a public constructor with a single argument of type const C* and a public constructor with a single argument of type const std::basic_string<C>&. For instance:

<simpleType name="color">
  <restriction base="string">
    <enumeration value="red"/>
    <enumeration value="green"/>
    <enumeration value="blue"/>
  </restriction>
</simpleType>
  

is mapped to:

class color: public xml_schema::string
{
public:
  enum value
  {
    red,
    green,
    blue
  };

public:
  color (value);
  color (const C*);
  color (const std::basic_string<C>&);
  color (const xml_schema::string&);
  color (const color&);

public:
  color&
  operator= (value);

  color&
  operator= (const color&);

public:
  virtual color*
  _clone () const;

public:
  operator value () const;
};
  

2.6.3 Mapping for Derivation by List

XML Schema derivation by list is mapped to C++ public inheritance from xml_schema::simple_type (Section 2.5.3, "Mapping for anySimpleType") and a suitable sequence type. The list item type becomes the element type of the sequence. In addition to the members described in Section 2.6, "Mapping for Simple Types", the resulting C++ class defines a public default constructor, a public constructor with the first argument of type size_type and the second argument of list item type that creates a list object with the specified number of copies of the specified element value, and a public constructor with the two arguments of an input iterator type that creates a list object from an iterator range. For instance:

<simpleType name="int_list">
  <list itemType="int"/>
</simpleType>
  

is mapped to:

class int_list: public simple_type,
                public sequence<int>
{
public:
  int_list ();
  int_list (size_type n, int x);

  template <typename I>
  int_list (const I& begin, const I& end);
  int_list (const int_list&);

public:
  int_list&
  operator= (const int_list&);

public:
  virtual int_list*
  _clone () const;
};
  

The sequence class template is defined in an implementation-specific namespace. It conforms to the sequence interface as defined by the ISO/ANSI Standard for C++ (ISO/IEC 14882:1998, Section 23.1.1, "Sequences"). Practically, this means that you can treat such a sequence as if it was std::vector. One notable extension to the standard interface that is available only for sequences of non-fundamental C++ types is the addition of the overloaded push_back and insert member functions which instead of the constant reference to the element type accept automatic pointer (std::auto_ptr or std::unique_ptr, depending on the C++ standard selected) to the element type. These functions assume ownership of the pointed to object and reset the passed automatic pointer.

2.6.4 Mapping for Derivation by Union

XML Schema derivation by union is mapped to C++ public inheritance from xml_schema::simple_type (Section 2.5.3, "Mapping for anySimpleType") and std::basic_string<C>. In addition to the members described in Section 2.6, "Mapping for Simple Types", the resulting C++ class defines a public constructor with a single argument of type const C* and a public constructor with a single argument of type const std::basic_string<C>&. For instance:

<simpleType name="int_string_union">
  <xsd:union memberTypes="xsd:int xsd:string"/>
</simpleType>
  

is mapped to:

class int_string_union: public simple_type,
                        public std::basic_string<C>
{
public:
  int_string_union (const C*);
  int_string_union (const std::basic_string<C>&);
  int_string_union (const int_string_union&);

public:
  int_string_union&
  operator= (const int_string_union&);

public:
  virtual int_string_union*
  _clone () const;
};
  

2.7 Mapping for Complex Types

An XML Schema complex type is mapped to a C++ class with the same name as the complex type. The class defines a public copy constructor, a public copy assignment operator, and a public virtual _clone function. The _clone function is declared const, does not take any arguments, and returns a pointer to a complete copy of the instance allocated in the free store. The _clone function shall be used to make copies when static type and dynamic type of the instance may differ (see Section 2.11, "Mapping for xsi:type and Substitution Groups").

Additionally, the resulting C++ class defines two public constructors that take an initializer for each member of the complex type and all its base types that belongs to the One cardinality class (see Section 2.8, "Mapping for Local Elements and Attributes"). In the first constructor, the arguments are passed as constant references and the newly created instance is initialized with copies of the passed objects. In the second constructor, arguments that are complex types (that is, they themselves contain elements or attributes) are passed as either std::auto_ptr (C++98) or std::unique_ptr (C++11), depending on the C++ standard selected. In this case the newly created instance is directly initialized with and assumes ownership of the pointed to objects and the std::[auto|unique]_ptr arguments are reset to 0. For instance:

<complexType name="complex">
  <sequence>
    <element name="a" type="int"/>
    <element name="b" type="string"/>
  </sequence>
</complexType>

<complexType name="object">
  <sequence>
    <element name="s-one" type="boolean"/>
    <element name="c-one" type="complex"/>
    <element name="optional" type="int" minOccurs="0"/>
    <element name="sequence" type="string" maxOccurs="unbounded"/>
  </sequence>
</complexType>
  

is mapped to:

class complex: public xml_schema::type
{
public:
  object (const int& a, const xml_schema::string& b);
  object (const complex&);

public:
  object&
  operator= (const complex&);

public:
  virtual complex*
  _clone () const;

  ...

};

class object: public xml_schema::type
{
public:
  object (const bool& s_one, const complex& c_one);
  object (const bool& s_one, std::[auto|unique]_ptr<complex> c_one);
  object (const object&);

public:
  object&
  operator= (const object&);

public:
  virtual object*
  _clone () const;

  ...

};
  

Notice that the generated complex class does not have the second (std::[auto|unique]_ptr) version of the constructor since all its required members are of simple types.

If an XML Schema complex type has an ultimate base which is an XML Schema simple type then the resulting C++ class also defines a public constructor that takes an initializer for the base type as well as for each member of the complex type and all its base types that belongs to the One cardinality class. For instance:

<complexType name="object">
  <simpleContent>
    <extension base="date">
      <attribute name="lang" type="language" use="required"/>
    </extension>
  </simpleContent>
</complexType>
  

is mapped to:

class object: public xml_schema::string
{
public:
  object (const xml_schema::language& lang);

  object (const xml_schema::date& base,
          const xml_schema::language& lang);

  ...

};
  

Furthermore, for string-based XML Schema complex types, the resulting C++ class also defines two public constructors with the first arguments of type const C* and std::basic_string<C>&, respectively, followed by arguments for each member of the complex type and all its base types that belongs to the One cardinality class. For enumeration-based complex types the resulting C++ class also defines a public constructor with the first arguments of the underlying enum type followed by arguments for each member of the complex type and all its base types that belongs to the One cardinality class. For instance:

<simpleType name="color">
  <restriction base="string">
    <enumeration value="red"/>
    <enumeration value="green"/>
    <enumeration value="blue"/>
  </restriction>
</simpleType>

<complexType name="object">
  <simpleContent>
    <extension base="color">
      <attribute name="lang" type="language" use="required"/>
    </extension>
  </simpleContent>
</complexType>
  

is mapped to:

class color: public xml_schema::string
{
public:
  enum value
  {
    red,
    green,
    blue
  };

public:
  color (value);
  color (const C*);
  color (const std::basic_string<C>&);

  ...

};

class object: color
{
public:
  object (const color& base,
          const xml_schema::language& lang);

  object (const color::value& base,
          const xml_schema::language& lang);

  object (const C* base,
          const xml_schema::language& lang);

  object (const std::basic_string<C>& base,
          const xml_schema::language& lang);

  ...

};
  

Additional constructors can be requested with the --generate-default-ctor and --generate-from-base-ctor options. See the XSD Compiler Command Line Manual for details.

If an XML Schema complex type is not explicitly derived from any type, the resulting C++ class is derived from xml_schema::type. In cases where an XML Schema complex type is defined using derivation by extension or restriction, the resulting C++ base class specification depends on the type of derivation and is described in the subsequent sections.

The mapping for elements and attributes that are defined in a complex type is described in Section 2.8, "Mapping for Local Elements and Attributes".

2.7.1 Mapping for Derivation by Extension

XML Schema derivation by extension is mapped to C++ public inheritance. The base type of the extension becomes the base type for the resulting C++ class.

2.7.2 Mapping for Derivation by Restriction

XML Schema derivation by restriction is mapped to C++ public inheritance. The base type of the restriction becomes the base type for the resulting C++ class. XML Schema elements and attributes defined within restriction do not result in any definitions in the resulting C++ class. Instead, corresponding (unrestricted) definitions are inherited from the base class. In the future versions of this mapping, such elements and attributes may result in redefinitions of accessors and modifiers to reflect their restricted semantics.

2.8 Mapping for Local Elements and Attributes

XML Schema element and attribute definitions are called local if they appear within a complex type definition, an element group definition, or an attribute group definitions.

Local XML Schema element and attribute definitions have the same C++ mapping. Therefore, in this section, local elements and attributes are collectively called members.

While there are many different member cardinality combinations (determined by the use attribute for attributes and the minOccurs and maxOccurs attributes for elements), the mapping divides all possible cardinality combinations into three cardinality classes:

one
attributes: use == "required"
attributes: use == "optional" and has default or fixed value
elements: minOccurs == "1" and maxOccurs == "1"
optional
attributes: use == "optional" and doesn't have default or fixed value
elements: minOccurs == "0" and maxOccurs == "1"
sequence
elements: maxOccurs > "1"

An optional attribute with a default or fixed value acquires this value if the attribute hasn't been specified in an instance document (see Appendix A, "Default and Fixed Values"). This mapping places such optional attributes to the One cardinality class.

A member is mapped to a set of public type definitions (typedefs) and a set of public accessor and modifier functions. Type definitions have names derived from the member's name. The accessor and modifier functions have the same name as the member. For example:

<complexType name="object">
  <sequence>
    <element name="member" type="string"/>
  </sequence>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  typedef xml_schema::string member_type;

  const member_type&
  member () const;

  ...

};
  

In addition, if a member has a default or fixed value, a static accessor function is generated that returns this value. For example:

<complexType name="object">
  <attribute name="data" type="string" default="test"/>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  typedef xml_schema::string data_type;

  const data_type&
  data () const;

  static const data_type&
  data_default_value ();

  ...

};
  

Names and semantics of type definitions for the member as well as signatures of the accessor and modifier functions depend on the member's cardinality class and are described in the following sub-sections.

2.8.1 Mapping for Members with the One Cardinality Class

For the One cardinality class, the type definitions consist of an alias for the member's type with the name created by appending the _type suffix to the member's name.

The accessor functions come in constant and non-constant versions. The constant accessor function returns a constant reference to the member and can be used for read-only access. The non-constant version returns an unrestricted reference to the member and can be used for read-write access.

The first modifier function expects an argument of type reference to constant of the member's type. It makes a deep copy of its argument. Except for member's types that are mapped to fundamental C++ types, the second modifier function is provided that expects an argument of type automatic pointer (std::auto_ptr or std::unique_ptr, depending on the C++ standard selected) to the member's type. It assumes ownership of the pointed to object and resets the passed automatic pointer. For instance:

<complexType name="object">
  <sequence>
    <element name="member" type="string"/>
  </sequence>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  // Type definitions.
  //
  typedef xml_schema::string member_type;

  // Accessors.
  //
  const member_type&
  member () const;

  member_type&
  member ();

  // Modifiers.
  //
  void
  member (const member_type&);

  void
  member (std::[auto|unique]_ptr<member_type>);
  ...

};
  

In addition, if requested by specifying the --generate-detach option and only for members of non-fundamental C++ types, the mapping provides a detach function that returns an automatic pointer to the member's type, for example:

class object: public xml_schema::type
{
public:
  ...

  std::[auto|unique]_ptr<member_type>
  detach_member ();
  ...

};
  

This function detaches the value from the tree leaving the member value uninitialized. Accessing such an uninitialized value prior to re-initializing it results in undefined behavior.

The following code shows how one could use this mapping:

void
f (object& o)
{
  using xml_schema::string;

  string s (o.member ());                // get
  object::member_type& sr (o.member ()); // get

  o.member ("hello");           // set, deep copy
  o.member () = "hello";        // set, deep copy

  // C++98 version.
  //
  std::auto_ptr<string> p (new string ("hello"));
  o.member (p);                 // set, assumes ownership
  p = o.detach_member ();       // detach, member is uninitialized
  o.member (p);                 // re-attach

  // C++11 version.
  //
  std::unique_ptr<string> p (new string ("hello"));
  o.member (std::move (p));     // set, assumes ownership
  p = o.detach_member ();       // detach, member is uninitialized
  o.member (std::move (p));     // re-attach
}
  

2.8.2 Mapping for Members with the Optional Cardinality Class

For the Optional cardinality class, the type definitions consist of an alias for the member's type with the name created by appending the _type suffix to the member's name and an alias for the container type with the name created by appending the _optional suffix to the member's name.

Unlike accessor functions for the One cardinality class, accessor functions for the Optional cardinality class return references to corresponding containers rather than directly to members. The accessor functions come in constant and non-constant versions. The constant accessor function returns a constant reference to the container and can be used for read-only access. The non-constant version returns an unrestricted reference to the container and can be used for read-write access.

The modifier functions are overloaded for the member's type and the container type. The first modifier function expects an argument of type reference to constant of the member's type. It makes a deep copy of its argument. Except for member's types that are mapped to fundamental C++ types, the second modifier function is provided that expects an argument of type automatic pointer (std::auto_ptr or std::unique_ptr, depending on the C++ standard selected) to the member's type. It assumes ownership of the pointed to object and resets the passed automatic pointer. The last modifier function expects an argument of type reference to constant of the container type. It makes a deep copy of its argument. For instance:

<complexType name="object">
  <sequence>
    <element name="member" type="string" minOccurs="0"/>
  </sequence>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  // Type definitions.
  //
  typedef xml_schema::string member_type;
  typedef optional<member_type> member_optional;

  // Accessors.
  //
  const member_optional&
  member () const;

  member_optional&
  member ();

  // Modifiers.
  //
  void
  member (const member_type&);

  void
  member (std::[auto|unique]_ptr<member_type>);

  void
  member (const member_optional&);

  ...

};
  

The optional class template is defined in an implementation-specific namespace and has the following interface. The [auto|unique]_ptr-based constructor and modifier function are only available if the template argument is not a fundamental C++ type.

template <typename X>
class optional
{
public:
  optional ();

  // Makes a deep copy.
  //
  explicit
  optional (const X&);

  // Assumes ownership.
  //
  explicit
  optional (std::[auto|unique]_ptr<X>);

  optional (const optional&);

public:
  optional&
  operator= (const X&);

  optional&
  operator= (const optional&);

  // Pointer-like interface.
  //
public:
  const X*
  operator-> () const;

  X*
  operator-> ();

  const X&
  operator* () const;

  X&
  operator* ();

  typedef void (optional::*bool_convertible) ();
  operator bool_convertible () const;

  // Get/set interface.
  //
public:
  bool
  present () const;

  const X&
  get () const;

  X&
  get ();

  // Makes a deep copy.
  //
  void
  set (const X&);

  // Assumes ownership.
  //
  void
  set (std::[auto|unique]_ptr<X>);

  // Detach and return the contained value.
  //
  std::[auto|unique]_ptr<X>
  detach ();

  void
  reset ();
};

template <typename X>
bool
operator== (const optional<X>&, const optional<X>&);

template <typename X>
bool
operator!= (const optional<X>&, const optional<X>&);

template <typename X>
bool
operator< (const optional<X>&, const optional<X>&);

template <typename X>
bool
operator> (const optional<X>&, const optional<X>&);

template <typename X>
bool
operator<= (const optional<X>&, const optional<X>&);

template <typename X>
bool
operator>= (const optional<X>&, const optional<X>&);
  

The following code shows how one could use this mapping:

void
f (object& o)
{
  using xml_schema::string;

  if (o.member ().present ())       // test
  {
    string& s (o.member ().get ()); // get
    o.member ("hello");             // set, deep copy
    o.member ().set ("hello");      // set, deep copy
    o.member ().reset ();           // reset
  }

  // Same as above but using pointer notation:
  //
  if (o.member ())                  // test
  {
    string& s (*o.member ());       // get
    o.member ("hello");             // set, deep copy
    *o.member () = "hello";         // set, deep copy
    o.member ().reset ();           // reset
  }

  // C++98 version.
  //
  std::auto_ptr<string> p (new string ("hello"));
  o.member (p);                     // set, assumes ownership

  p = new string ("hello");
  o.member ().set (p);              // set, assumes ownership

  p = o.member ().detach ();        // detach, member is reset
  o.member ().set (p);              // re-attach

  // C++11 version.
  //
  std::unique_ptr<string> p (new string ("hello"));
  o.member (std::move (p));         // set, assumes ownership

  p.reset (new string ("hello"));
  o.member ().set (std::move (p));  // set, assumes ownership

  p = o.member ().detach ();        // detach, member is reset
  o.member ().set (std::move (p));  // re-attach
}
  

2.8.3 Mapping for Members with the Sequence Cardinality Class

For the Sequence cardinality class, the type definitions consist of an alias for the member's type with the name created by appending the _type suffix to the member's name, an alias of the container type with the name created by appending the _sequence suffix to the member's name, an alias of the iterator type with the name created by appending the _iterator suffix to the member's name, and an alias of the constant iterator type with the name created by appending the _const_iterator suffix to the member's name.

The accessor functions come in constant and non-constant versions. The constant accessor function returns a constant reference to the container and can be used for read-only access. The non-constant version returns an unrestricted reference to the container and can be used for read-write access.

The modifier function expects an argument of type reference to constant of the container type. The modifier function makes a deep copy of its argument. For instance:

<complexType name="object">
  <sequence>
    <element name="member" type="string" minOccurs="unbounded"/>
  </sequence>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  // Type definitions.
  //
  typedef xml_schema::string member_type;
  typedef sequence<member_type> member_sequence;
  typedef member_sequence::iterator member_iterator;
  typedef member_sequence::const_iterator member_const_iterator;

  // Accessors.
  //
  const member_sequence&
  member () const;

  member_sequence&
  member ();

  // Modifier.
  //
  void
  member (const member_sequence&);

  ...

};
  

The sequence class template is defined in an implementation-specific namespace. It conforms to the sequence interface as defined by the ISO/ANSI Standard for C++ (ISO/IEC 14882:1998, Section 23.1.1, "Sequences"). Practically, this means that you can treat such a sequence as if it was std::vector. Two notable extensions to the standard interface that are available only for sequences of non-fundamental C++ types are the addition of the overloaded push_back and insert as well as the detach_back and detach member functions. The additional push_back and insert functions accept an automatic pointer (std::auto_ptr or std::unique_ptr, depending on the C++ standard selected) to the element type instead of the constant reference. They assume ownership of the pointed to object and reset the passed automatic pointer. The detach_back and detach functions detach the element value from the sequence container and, by default, remove the element from the sequence. These additional functions have the following signatures:

template <typename X>
class sequence
{
public:
  ...

  void
  push_back (std::[auto|unique]_ptr<X>)

  iterator
  insert (iterator position, std::[auto|unique]_ptr<X>)

  std::[auto|unique]_ptr<X>
  detach_back (bool pop = true);

  iterator
  detach (iterator position,
          std::[auto|unique]_ptr<X>& result,
          bool erase = true)

  ...
}
  

The following code shows how one could use this mapping:

void
f (object& o)
{
  using xml_schema::string;

  object::member_sequence& s (o.member ());

  // Iteration.
  //
  for (object::member_iterator i (s.begin ()); i != s.end (); ++i)
  {
    string& value (*i);
  }

  // Modification.
  //
  s.push_back ("hello");  // deep copy

  // C++98 version.
  //
  std::auto_ptr<string> p (new string ("hello"));
  s.push_back (p);        // assumes ownership
  p = s.detach_back ();   // detach and pop
  s.push_back (p);        // re-append

  // C++11 version.
  //
  std::unique_ptr<string> p (new string ("hello"));
  s.push_back (std::move (p)); // assumes ownership
  p = s.detach_back ();        // detach and pop
  s.push_back (std::move (p)); // re-append

  // Setting a new container.
  //
  object::member_sequence n;
  n.push_back ("one");
  n.push_back ("two");
  o.member (n);           // deep copy
}
  

2.8.4 Element Order

C++/Tree is a "flattening" mapping in a sense that many levels of nested compositors (choice and sequence), all potentially with their own cardinalities, are in the end mapped to a flat set of elements with one of the three cardinality classes discussed in the previous sections. While this results in a simple and easy to use API for most types, in certain cases, the order of elements in the actual XML documents is not preserved once parsed into the object model. And sometimes such order has application-specific significance. As an example, consider a schema that defines a batch of bank transactions:

<complexType name="withdraw">
  <sequence>
    <element name="account" type="unsignedInt"/>
    <element name="amount" type="unsignedInt"/>
  </sequence>
</complexType>

<complexType name="deposit">
  <sequence>
    <element name="account" type="unsignedInt"/>
    <element name="amount" type="unsignedInt"/>
  </sequence>
</complexType>

<complexType name="batch">
  <choice minOccurs="0" maxOccurs="unbounded">
    <element name="withdraw" type="withdraw"/>
    <element name="deposit" type="deposit"/>
  </choice>
</complexType>
  

The batch can contain any number of transactions in any order but the order of transactions in each actual batch is significant. For instance, consider what could happen if we reorder the transactions and apply all the withdrawals before deposits.

For the batch schema type defined above the default C++/Tree mapping will produce a C++ class that contains a pair of sequence containers, one for each of the two elements. While this will capture the content (transactions), the order of this content as it appears in XML will be lost. Also, if we try to serialize the batch we just loaded back to XML, all the withdrawal transactions will appear before deposits.

To overcome this limitation of a flattening mapping, C++/Tree allows us to mark certain XML Schema types, for which content order is important, as ordered.

There are several command line options that control which schema types are treated as ordered. To make an individual type ordered, we use the --ordered-type option, for example:

--ordered-type batch
  

To automatically treat all the types that are derived from an ordered type also ordered, we use the --ordered-type-derived option. This is primarily useful if you would like to iterate over the complete hierarchy's content using the content order sequence (discussed below).

Ordered types are also useful for handling mixed content. To automatically mark all the types with mixed content as ordered we use the --ordered-type-mixed option. For more information on handling mixed content see Section 2.13, "Mapping for Mixed Content Models".

Finally, we can mark all the types in the schema we are compiling with the --ordered-type-all option. You should only resort to this option if all the types in your schema truly suffer from the loss of content order since, as we will discuss shortly, ordered types require extra effort to access and, especially, modify. See the XSD Compiler Command Line Manual for more information on these options.

Once a type is marked ordered, C++/Tree alters its mapping in several ways. Firstly, for each local element, element wildcard (Section 2.12.4, "Element Wildcard Order"), and mixed content text (Section 2.13, "Mapping for Mixed Content Models") in this type, a content id constant is generated. Secondly, an addition sequence is added to the class that captures the content order. Here is how the mapping of our batch class changes once we make it ordered:

class batch: public xml_schema::type
{
public:
  // withdraw
  //
  typedef withdraw withdraw_type;
  typedef sequence<withdraw_type> withdraw_sequence;
  typedef withdraw_sequence::iterator withdraw_iterator;
  typedef withdraw_sequence::const_iterator withdraw_const_iterator;

  static const std::size_t withdraw_id = 1;

  const withdraw_sequence&
  withdraw () const;

  withdraw_sequence&
  withdraw ();

  void
  withdraw (const withdraw_sequence&);

  // deposit
  //
  typedef deposit deposit_type;
  typedef sequence<deposit_type> deposit_sequence;
  typedef deposit_sequence::iterator deposit_iterator;
  typedef deposit_sequence::const_iterator deposit_const_iterator;

  static const std::size_t deposit_id = 2;

  const deposit_sequence&
  deposit () const;

  deposit_sequence&
  deposit ();

  void
  deposit (const deposit_sequence&);

  // content_order
  //
  typedef xml_schema::content_order content_order_type;
  typedef std::vector<content_order_type> content_order_sequence;
  typedef content_order_sequence::iterator content_order_iterator;
  typedef content_order_sequence::const_iterator content_order_const_iterator;

  const content_order_sequence&
  content_order () const;

  content_order_sequence&
  content_order ();

  void
  content_order (const content_order_sequence&);

  ...
};
  

Notice the withdraw_id and deposit_id content ids as well as the extra content_order sequence that does not correspond to any element in the schema definition. The other changes to the mapping for ordered types has to do with XML parsing and serialization code. During parsing the content order is captured in the content_order sequence while during serialization this sequence is used to determine the order in which content is serialized. The content_order sequence is also copied during copy construction and assigned during copy assignment. It is also taken into account during comparison.

The entry type of the content_order sequence is the xml_schema::content_order type that has the following interface:

namespace xml_schema
{
  struct content_order
  {
    content_order (std::size_t id, std::size_t index = 0);

    std::size_t id;
    std::size_t index;
  };

  bool
  operator== (const content_order&, const content_order&);

  bool
  operator!= (const content_order&, const content_order&);

  bool
  operator< (const content_order&, const content_order&);
}
  

The content_order sequence describes the order of content (elements, including wildcards, as well as mixed content text). Each entry in this sequence consists of the content id (for example, withdraw_id or deposit_id in our case) as well as, for elements of the sequence cardinality class, an index into the corresponding sequence container (the index is unused for the one and optional cardinality classes). For example, in our case, if the content id is withdraw_id, then the index will point into the withdraw element sequence.

With all this information we can now examine how to iterate over transaction in the batch in content order:

batch& b = ...

for (batch::content_order_const_iterator i (b.content_order ().begin ());
     i != b.content_order ().end ();
     ++i)
{
  switch (i->id)
  {
  case batch::withdraw_id:
    {
      const withdraw& t (b.withdraw ()[i->index]);
      cerr << t.account () << " withdraw " << t.amount () << endl;
      break;
    }
  case batch::deposit_id:
    {
      const deposit& t (b.deposit ()[i->index]);
      cerr << t.account () << " deposit " << t.amount () << endl;
      break;
    }
  default:
    {
      assert (false); // Unknown content id.
    }
  }
}
  

If we serialized our batch back to XML, we would also see that the order of transactions in the output is exactly the same as in the input rather than all the withdrawals first followed by all the deposits.

The most complex aspect of working with ordered types is modifications. Now we not only need to change the content, but also remember to update the order information corresponding to this change. As a first example, we add a deposit transaction to the batch:

using xml_schema::content_order;

batch::deposit_sequence& d (b.deposit ());
batch::withdraw_sequence& w (b.withdraw ());
batch::content_order_sequence& co (b.content_order ());

d.push_back (deposit (123456789, 100000));
co.push_back (content_order (batch::deposit_id, d.size () - 1));
  

In the above example we first added the content (deposit transaction) and then updated the content order information by adding an entry with deposit_id content id and the index of the just added deposit transaction.

Removing the last transaction can be easy if we know which transaction (deposit or withdrawal) is last:

d.pop_back ();
co.pop_back ();
  

If, however, we do not know which transaction is last, then things get a bit more complicated:

switch (co.back ().id)
{
case batch::withdraw_id:
  {
    d.pop_back ();
    break;
  }
case batch::deposit_id:
  {
    w.pop_back ();
    break;
  }
}

co.pop_back ();
  

The following example shows how to add a transaction at the beginning of the batch:

w.push_back (withdraw (123456789, 100000));
co.insert (co.begin (),
           content_order (batch::withdraw_id, w.size () - 1));
  

Note also that when we merely modify the content of one of the elements in place, we do not need to update its order since it doesn't change. For example, here is how we can change the amount in the first withdrawal:

w[0].amount (10000);
  

For the complete working code shown in this section refer to the order/element example in the examples/cxx/tree/ directory in the XSD distribution.

If both the base and derived types are ordered, then the content order sequence is only added to the base and the content ids are unique within the whole hierarchy. In this case the content order sequence for the derived type contains ordering information for both base and derived content.

In some applications we may need to perform more complex content processing. For example, in our case, we may need to remove all the withdrawal transactions. The default container, std::vector, is not particularly suitable for such operations. What may be required by some applications is a multi-index container that not only allows us to iterate in content order similar to std::vector but also search by the content id as well as the content id and index pair.

While C++/Tree does not provide this functionality by default, it allows us to specify a custom container type for content order with the --order-container command line option. The only requirement from the generated code side for such a container is to provide the vector-like push_back(), size(), and const iteration interfaces.

As an example, here is how we can use the Boost Multi-Index container for content order. First we create the content-order-container.hxx header with the following definition (in C++11, use the alias template instead):

#ifndef CONTENT_ORDER_CONTAINER
#define CONTENT_ORDER_CONTAINER

#include <cstddef> // std::size_t

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/identity.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/random_access_index.hpp>

struct by_id {};
struct by_id_index {};

template <typename T>
struct content_order_container:
  boost::multi_index::multi_index_container<
    T,
    boost::multi_index::indexed_by<
      boost::multi_index::random_access<>,
      boost::multi_index::ordered_unique<
        boost::multi_index::tag<by_id_index>,
        boost::multi_index::identity<T>
      >,
      boost::multi_index::ordered_non_unique<
        boost::multi_index::tag<by_id>,
        boost::multi_index::member<T, std::size_t, &T::id>
      >
    >
  >
{};

#endif
  

Next we add the following two XSD compiler options to include this header into every generated header file and to use the custom container type (see the XSD compiler command line manual for more information on shell quoting for the first option):

--hxx-prologue '#include "content-order-container.hxx"'
--order-container content_order_container
  

With these changes we can now use the multi-index functionality, for example, to search for a specific content id:

typedef batch::content_order_sequence::index<by_id>::type id_set;
typedef id_set::iterator id_iterator;

const id_set& ids (b.content_order ().get<by_id> ());

std::pair<id_iterator, id_iterator> r (
  ids.equal_range (std::size_t (batch::deposit_id));

for (id_iterator i (r.first); i != r.second; ++i)
{
  const deposit& t (b.deposit ()[i->index]);
  cerr << t.account () << " deposit " << t.amount () << endl;
}
  

2.9 Mapping for Global Elements

An XML Schema element definition is called global if it appears directly under the schema element. A global element is a valid root of an instance document. By default, a global element is mapped to a set of overloaded parsing and, optionally, serialization functions with the same name as the element. It is also possible to generate types for root elements instead of parsing and serialization functions. This is primarily useful to distinguish object models with the same root type but with different root elements. See Section 2.9.1, "Element Types" for details. It is also possible to request the generation of an element map which allows uniform parsing and serialization of multiple root elements. See Section 2.9.2, "Element Map" for details.

The parsing functions read XML instance documents and return corresponding object models as an automatic pointer (std::auto_ptr or std::unique_ptr, depending on the C++ standard selected). Their signatures have the following pattern (type denotes element's type and name denotes element's name):

std::[auto|unique]_ptr<type>
name (....);
  

The process of parsing, including the exact signatures of the parsing functions, is the subject of Chapter 3, "Parsing".

The serialization functions write object models back to XML instance documents. Their signatures have the following pattern:

void
name (<stream type>&, const type&, ....);
  

The process of serialization, including the exact signatures of the serialization functions, is the subject of Chapter 4, "Serialization".

2.9.1 Element Types

The generation of element types is requested with the --generate-element-map option. With this option each global element is mapped to a C++ class with the same name as the element. Such a class is derived from xml_schema::element_type and contains the same set of type definitions, constructors, and member function as would a type containing a single element with the One cardinality class named "value". In addition, the element type also contains a set of member functions for accessing the element name and namespace as well as its value in a uniform manner. For example:

<complexType name="type">
  <sequence>
    ...
  </sequence>
</complexType>

<element name="root" type="type"/>
  

is mapped to:

class type
{
  ...
};

class root: public xml_schema::element_type
{
public:
  // Element value.
  //
  typedef type value_type;

  const value_type&
  value () const;

  value_type&
  value ();

  void
  value (const value_type&);

  void
  value (std::[auto|unique]_ptr<value_type>);

  // Constructors.
  //
  root (const value_type&);

  root (std::[auto|unique]_ptr<value_type>);

  root (const xercesc::DOMElement&, xml_schema::flags = 0);

  root (const root&, xml_schema::flags = 0);

  virtual root*
  _clone (xml_schema::flags = 0) const;

  // Element name and namespace.
  //
  static const std::string&
  name ();

  static const std::string&
  namespace_ ();

  virtual const std::string&
  _name () const;

  virtual const std::string&
  _namespace () const;

  // Element value as xml_schema::type.
  //
  virtual const xml_schema::type*
  _value () const;

  virtual xml_schema::type*
  _value ();
};

void
operator<< (xercesc::DOMElement&, const root&);
  

The xml_schema::element_type class is a common base type for all element types and is defined as follows:

namespace xml_schema
{
  class element_type
  {
  public:
    virtual
    ~element_type ();

    virtual element_type*
    _clone (flags f = 0) const = 0;

    virtual const std::basic_string<C>&
    _name () const = 0;

    virtual const std::basic_string<C>&
    _namespace () const = 0;

    virtual xml_schema::type*
    _value () = 0;

    virtual const xml_schema::type*
    _value () const = 0;
  };
}
  

The _value() member function returns a pointer to the element value or 0 if the element is of a fundamental C++ type and therefore is not derived from xml_schema::type.

Unlike parsing and serialization functions, element types are only capable of parsing and serializing from/to a DOMElement object. This means that the application will need to perform its own XML-to-DOM parsing and DOM-to-XML serialization. The following section describes a mechanism provided by the mapping to uniformly parse and serialize multiple root elements.

2.9.2 Element Map

When element types are generated for root elements it is also possible to request the generation of an element map with the --generate-element-map option. The element map allows uniform parsing and serialization of multiple root elements via the common xml_schema::element_type base type. The xml_schema::element_map class is defined as follows:

namespace xml_schema
{
  class element_map
  {
  public:
    static std::[auto|unique]_ptr<xml_schema::element_type>
    parse (const xercesc::DOMElement&, flags = 0);

    static void
    serialize (xercesc::DOMElement&, const element_type&);
  };
}
  

The parse() function creates the corresponding element type object based on the element name and namespace and returns it as an automatic pointer (std::auto_ptr or std::unique_ptr, depending on the C++ standard selected) to xml_schema::element_type. The serialize() function serializes the passed element object to DOMElement. Note that in case of serialize(), the DOMElement object should have the correct name and namespace. If no element type is available for an element, both functions throw the xml_schema::no_element_info exception:

struct no_element_info: virtual exception
{
  no_element_info (const std::basic_string<C>& element_name,
                   const std::basic_string<C>& element_namespace);

  const std::basic_string<C>&
  element_name () const;

  const std::basic_string<C>&
  element_namespace () const;

  virtual const char*
  what () const throw ();
};
  

The application can discover the actual type of the element object returned by parse() either using dynamic_cast or by comparing element names and namespaces. The following code fragments illustrate how the element map can be used:

// Parsing.
//
DOMElement& e = ... // Parse XML to DOM.

auto_ptr<xml_schema::element_type> r (
  xml_schema::element_map::parse (e));

if (root1 r1 = dynamic_cast<root1*> (r.get ()))
{
  ...
}
else if (r->_name == root2::name () &&
         r->_namespace () == root2::namespace_ ())
{
  root2& r2 (static_cast<root2&> (*r));

  ...
}
  
// Serialization.
//
xml_schema::element_type& r = ...

string name (r._name ());
string ns (r._namespace ());

DOMDocument& doc = ... // Create a new DOMDocument with name and ns.
DOMElement& e (*doc->getDocumentElement ());

xml_schema::element_map::serialize (e, r);

// Serialize DOMDocument to XML.
  

2.10 Mapping for Global Attributes

An XML Schema attribute definition is called global if it appears directly under the schema element. A global attribute does not have any mapping.

2.11 Mapping for xsi:type and Substitution Groups

The mapping provides optional support for the XML Schema polymorphism features (xsi:type and substitution groups) which can be requested with the --generate-polymorphic option. When used, the dynamic type of a member may be different from its static type. Consider the following schema definition and instance document:

<!-- test.xsd -->
<schema>
  <complexType name="base">
    <attribute name="text" type="string"/>
  </complexType>

  <complexType name="derived">
    <complexContent>
      <extension base="base">
        <attribute name="extra-text" type="string"/>
      </extension>
    </complexContent>
  </complexType>

  <complexType name="root_type">
    <sequence>
      <element name="item" type="base" maxOccurs="unbounded"/>
    </sequence>
  </complexType>

  <element name="root" type="root_type"/>
</schema>

<!-- test.xml -->
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <item text="hello"/>
  <item text="hello" extra-text="world" xsi:type="derived"/>
</root>
  

In the resulting object model, the container for the root::item member will have two elements: the first element's type will be base while the second element's (dynamic) type will be derived. This can be discovered using the dynamic_cast operator as shown in the following example:

void
f (root& r)
{
  for (root::item_const_iterator i (r.item ().begin ());
       i != r.item ().end ()
       ++i)
  {
    if (derived* d = dynamic_cast<derived*> (&(*i)))
    {
      // derived
    }
    else
    {
      // base
    }
  }
}
  

The _clone virtual function should be used instead of copy constructors to make copies of members that might use polymorphism:

void
f (root& r)
{
  for (root::item_const_iterator i (r.item ().begin ());
       i != r.item ().end ()
       ++i)
  {
    std::auto_ptr<base> c (i->_clone ());
  }
}
  

The mapping can often automatically determine which types are polymorphic based on the substitution group declarations. However, if your XML vocabulary is not using substitution groups or if substitution groups are defined in a separate schema, then you will need to use the --polymorphic-type option to specify which types are polymorphic. When using this option you only need to specify the root of a polymorphic type hierarchy and the mapping will assume that all the derived types are also polymorphic. Also note that you need to specify this option when compiling every schema file that references the polymorphic type. Consider the following two schemas as an example:

<!-- base.xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="base">
    <xs:sequence>
      <xs:element name="b" type="xs:int"/>
    </xs:sequence>
  </xs:complexType>

  <!-- substitution group root -->
  <xs:element name="base" type="base"/>

</xs:schema>
  
<!-- derived.xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <include schemaLocation="base.xsd"/>

  <xs:complexType name="derived">
    <xs:complexContent>
      <xs:extension base="base">
        <xs:sequence>
          <xs:element name="d" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="derived" type="derived" substitutionGroup="base"/>

</xs:schema>
  

In this example we need to specify "--polymorphic-type base" when compiling both schemas because the substitution group is declared in a schema other than the one defining type base.

You can also indicate that all types should be treated as polymorphic with the --polymorphic-type-all. However, this may result in slower generated code with a greater footprint.

2.12 Mapping for any and anyAttribute

For the XML Schema any and anyAttribute wildcards an optional mapping can be requested with the --generate-wildcard option. The mapping represents the content matched by wildcards as DOM fragments. Because the DOM API is used to access such content, the Xerces-C++ runtime should be initialized by the application prior to parsing and should remain initialized for the lifetime of objects with the wildcard content. For more information on the Xerces-C++ runtime initialization see Section 3.1, "Initializing the Xerces-C++ Runtime".

The mapping for any is similar to the mapping for local elements (see Section 2.8, "Mapping for Local Elements and Attributes") except that the type used in the wildcard mapping is xercesc::DOMElement. As with local elements, the mapping divides all possible cardinality combinations into three cardinality classes: one, optional, and sequence.

The mapping for anyAttribute represents the attributes matched by this wildcard as a set of xercesc::DOMAttr objects with a key being the attribute's name and namespace.

Similar to local elements and attributes, the any and anyAttribute wildcards are mapped to a set of public type definitions (typedefs) and a set of public accessor and modifier functions. Type definitions have names derived from "any" for the any wildcard and "any_attribute" for the anyAttribute wildcard. The accessor and modifier functions are named "any" for the any wildcard and "any_attribute" for the anyAttribute wildcard. Subsequent wildcards in the same type have escaped names such as "any1" or "any_attribute1".

Because Xerces-C++ DOM nodes always belong to a DOMDocument, each type with a wildcard has an associated DOMDocument object. The reference to this object can be obtained using the accessor function called dom_document. The access to the document object from the application code may be necessary to create or modify the wildcard content. For example:

<complexType name="object">
  <sequence>
    <any namespace="##other"/>
  </sequence>
  <anyAttribute namespace="##other"/>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  // any
  //
  const xercesc::DOMElement&
  any () const;

  void
  any (const xercesc::DOMElement&);

  ...

  // any_attribute
  //
  typedef attribute_set any_attribute_set;
  typedef any_attribute_set::iterator any_attribute_iterator;
  typedef any_attribute_set::const_iterator any_attribute_const_iterator;

  const any_attribute_set&
  any_attribute () const;

  any_attribute_set&
  any_attribute ();

  ...

  // DOMDocument object for wildcard content.
  //
  const xercesc::DOMDocument&
  dom_document () const;

  xercesc::DOMDocument&
  dom_document ();

  ...
};
  

Names and semantics of type definitions for the wildcards as well as signatures of the accessor and modifier functions depend on the wildcard type as well as the cardinality class for the any wildcard. They are described in the following sub-sections.

2.12.1 Mapping for any with the One Cardinality Class

For any with the One cardinality class, there are no type definitions. The accessor functions come in constant and non-constant versions. The constant accessor function returns a constant reference to xercesc::DOMElement and can be used for read-only access. The non-constant version returns an unrestricted reference to xercesc::DOMElement and can be used for read-write access.

The first modifier function expects an argument of type reference to constant xercesc::DOMElement and makes a deep copy of its argument. The second modifier function expects an argument of type pointer to xercesc::DOMElement. This modifier function assumes ownership of its argument and expects the element object to be created using the DOM document associated with this instance. For example:

<complexType name="object">
  <sequence>
    <any namespace="##other"/>
  </sequence>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  // Accessors.
  //
  const xercesc::DOMElement&
  any () const;

  xercesc::DOMElement&
  any ();

  // Modifiers.
  //
  void
  any (const xercesc::DOMElement&);

  void
  any (xercesc::DOMElement*);

  ...

};
  

The following code shows how one could use this mapping:

void
f (object& o, const xercesc::DOMElement& e)
{
  using namespace xercesc;

  DOMElement& e1 (o.any ());             // get
  o.any (e)                              // set, deep copy
  DOMDocument& doc (o.dom_document ());
  o.any (doc.createElement (...));       // set, assumes ownership
}
  

2.12.2 Mapping for any with the Optional Cardinality Class

For any with the Optional cardinality class, the type definitions consist of an alias for the container type with name any_optional (or any1_optional, etc., for subsequent wildcards in the type definition).

Unlike accessor functions for the One cardinality class, accessor functions for the Optional cardinality class return references to corresponding containers rather than directly to DOMElement. The accessor functions come in constant and non-constant versions. The constant accessor function returns a constant reference to the container and can be used for read-only access. The non-constant version returns an unrestricted reference to the container and can be used for read-write access.

The modifier functions are overloaded for xercesc::DOMElement and the container type. The first modifier function expects an argument of type reference to constant xercesc::DOMElement and makes a deep copy of its argument. The second modifier function expects an argument of type pointer to xercesc::DOMElement. This modifier function assumes ownership of its argument and expects the element object to be created using the DOM document associated with this instance. The third modifier function expects an argument of type reference to constant of the container type and makes a deep copy of its argument. For instance:

<complexType name="object">
  <sequence>
    <any namespace="##other" minOccurs="0"/>
  </sequence>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  // Type definitions.
  //
  typedef element_optional any_optional;

  // Accessors.
  //
  const any_optional&
  any () const;

  any_optional&
  any ();

  // Modifiers.
  //
  void
  any (const xercesc::DOMElement&);

  void
  any (xercesc::DOMElement*);

  void
  any (const any_optional&);

  ...

};
  

The element_optional container is a specialization of the optional class template described in Section 2.8.2, "Mapping for Members with the Optional Cardinality Class". Its interface is presented below:

class element_optional
{
public:
  explicit
  element_optional (xercesc::DOMDocument&);

  // Makes a deep copy.
  //
  element_optional (const xercesc::DOMElement&, xercesc::DOMDocument&);

  // Assumes ownership.
  //
  element_optional (xercesc::DOMElement*, xercesc::DOMDocument&);

  element_optional (const element_optional&, xercesc::DOMDocument&);

public:
  element_optional&
  operator= (const xercesc::DOMElement&);

  element_optional&
  operator= (const element_optional&);

  // Pointer-like interface.
  //
public:
  const xercesc::DOMElement*
  operator-> () const;

  xercesc::DOMElement*
  operator-> ();

  const xercesc::DOMElement&
  operator* () const;

  xercesc::DOMElement&
  operator* ();

  typedef void (element_optional::*bool_convertible) ();
  operator bool_convertible () const;

  // Get/set interface.
  //
public:
  bool
  present () const;

  const xercesc::DOMElement&
  get () const;

  xercesc::DOMElement&
  get ();

  // Makes a deep copy.
  //
  void
  set (const xercesc::DOMElement&);

  // Assumes ownership.
  //
  void
  set (xercesc::DOMElement*);

  void
  reset ();
};

bool
operator== (const element_optional&, const element_optional&);

bool
operator!= (const element_optional&, const element_optional&);
  

The following code shows how one could use this mapping:

void
f (object& o, const xercesc::DOMElement& e)
{
  using namespace xercesc;

  DOMDocument& doc (o.dom_document ());

  if (o.any ().present ())                  // test
  {
    DOMElement& e1 (o.any ().get ());       // get
    o.any ().set (e);                       // set, deep copy
    o.any ().set (doc.createElement (...)); // set, assumes ownership
    o.any ().reset ();                      // reset
  }

  // Same as above but using pointer notation:
  //
  if (o.member ())                          // test
  {
    DOMElement& e1 (*o.any ());             // get
    o.any (e);                              // set, deep copy
    o.any (doc.createElement (...));        // set, assumes ownership
    o.any ().reset ();                      // reset
  }
}
  

2.12.3 Mapping for any with the Sequence Cardinality Class

For any with the Sequence cardinality class, the type definitions consist of an alias of the container type with name any_sequence (or any1_sequence, etc., for subsequent wildcards in the type definition), an alias of the iterator type with name any_iterator (or any1_iterator, etc., for subsequent wildcards in the type definition), and an alias of the constant iterator type with name any_const_iterator (or any1_const_iterator, etc., for subsequent wildcards in the type definition).

The accessor functions come in constant and non-constant versions. The constant accessor function returns a constant reference to the container and can be used for read-only access. The non-constant version returns an unrestricted reference to the container and can be used for read-write access.

The modifier function expects an argument of type reference to constant of the container type. The modifier function makes a deep copy of its argument. For instance:

<complexType name="object">
  <sequence>
    <any namespace="##other" minOccurs="unbounded"/>
  </sequence>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  // Type definitions.
  //
  typedef element_sequence any_sequence;
  typedef any_sequence::iterator any_iterator;
  typedef any_sequence::const_iterator any_const_iterator;

  // Accessors.
  //
  const any_sequence&
  any () const;

  any_sequence&
  any ();

  // Modifier.
  //
  void
  any (const any_sequence&);

  ...

};
  

The element_sequence container is a specialization of the sequence class template described in Section 2.8.3, "Mapping for Members with the Sequence Cardinality Class". Its interface is similar to the sequence interface as defined by the ISO/ANSI Standard for C++ (ISO/IEC 14882:1998, Section 23.1.1, "Sequences") and is presented below:

class element_sequence
{
public:
  typedef xercesc::DOMElement        value_type;
  typedef xercesc::DOMElement*       pointer;
  typedef const xercesc::DOMElement* const_pointer;
  typedef xercesc::DOMElement&       reference;
  typedef const xercesc::DOMElement& const_reference;

  typedef <implementation-defined>   iterator;
  typedef <implementation-defined>   const_iterator;
  typedef <implementation-defined>   reverse_iterator;
  typedef <implementation-defined>   const_reverse_iterator;

  typedef <implementation-defined>   size_type;
  typedef <implementation-defined>   difference_type;
  typedef <implementation-defined>   allocator_type;

public:
  explicit
  element_sequence (xercesc::DOMDocument&);

  // DOMElement cannot be default-constructed.
  //
  // explicit
  // element_sequence (size_type n);

  element_sequence (size_type n,
                    const xercesc::DOMElement&,
                    xercesc::DOMDocument&);

  template <typename I>
  element_sequence (const I& begin,
                    const I& end,
                    xercesc::DOMDocument&);

  element_sequence (const element_sequence&, xercesc::DOMDocument&);

  element_sequence&
  operator= (const element_sequence&);

public:
  void
  assign (size_type n, const xercesc::DOMElement&);

  template <typename I>
  void
  assign (const I& begin, const I& end);

public:
  // This version of resize can only be used to shrink the
  // sequence because DOMElement cannot be default-constructed.
  //
  void
  resize (size_type);

  void
  resize (size_type, const xercesc::DOMElement&);

public:
  size_type
  size () const;

  size_type
  max_size () const;

  size_type
  capacity () const;

  bool
  empty () const;

  void
  reserve (size_type);

  void
  clear ();

public:
  const_iterator
  begin () const;

  const_iterator
  end () const;

  iterator
  begin ();

  iterator
  end ();

  const_reverse_iterator
  rbegin () const;

  const_reverse_iterator
  rend () const

    reverse_iterator
  rbegin ();

  reverse_iterator
  rend ();

public:
  xercesc::DOMElement&
  operator[] (size_type);

  const xercesc::DOMElement&
  operator[] (size_type) const;

  xercesc::DOMElement&
  at (size_type);

  const xercesc::DOMElement&
  at (size_type) const;

  xercesc::DOMElement&
  front ();

  const xercesc::DOMElement&
  front () const;

  xercesc::DOMElement&
  back ();

  const xercesc::DOMElement&
  back () const;

public:
  // Makes a deep copy.
  //
  void
  push_back (const xercesc::DOMElement&);

  // Assumes ownership.
  //
  void
  push_back (xercesc::DOMElement*);

  void
  pop_back ();

  // Makes a deep copy.
  //
  iterator
  insert (iterator position, const xercesc::DOMElement&);

  // Assumes ownership.
  //
  iterator
  insert (iterator position, xercesc::DOMElement*);

  void
  insert (iterator position, size_type n, const xercesc::DOMElement&);

  template <typename I>
  void
  insert (iterator position, const I& begin, const I& end);

  iterator
  erase (iterator position);

  iterator
  erase (iterator begin, iterator end);

public:
  // Note that the DOMDocument object of the two sequences being
  // swapped should be the same.
  //
  void
  swap (sequence& x);
};

inline bool
operator== (const element_sequence&, const element_sequence&);

inline bool
operator!= (const element_sequence&, const element_sequence&);
  

The following code shows how one could use this mapping:

void
f (object& o, const xercesc::DOMElement& e)
{
  using namespace xercesc;

  object::any_sequence& s (o.any ());

  // Iteration.
  //
  for (object::any_iterator i (s.begin ()); i != s.end (); ++i)
  {
    DOMElement& e (*i);
  }

  // Modification.
  //
  s.push_back (e);                       // deep copy
  DOMDocument& doc (o.dom_document ());
  s.push_back (doc.createElement (...)); // assumes ownership
}
  

2.12.4 Element Wildcard Order

Similar to elements, element wildcards in ordered types (Section 2.8.4, "Element Order") are assigned content ids and are included in the content order sequence. Continuing with the bank transactions example started in Section 2.8.4, we can extend the batch by allowing custom transactions:

<complexType name="batch">
  <choice minOccurs="0" maxOccurs="unbounded">
    <element name="withdraw" type="withdraw"/>
    <element name="deposit" type="deposit"/>
    <any namespace="##other" processContents="lax"/>
  </choice>
</complexType>
  

This will lead to the following changes in the generated batch C++ class:

class batch: public xml_schema::type
{
public:
  ...

  // any
  //
  typedef element_sequence any_sequence;
  typedef any_sequence::iterator any_iterator;
  typedef any_sequence::const_iterator any_const_iterator;

  static const std::size_t any_id = 3UL;

  const any_sequence&
  any () const;

  any_sequence&
  any ();

  void
  any (const any_sequence&);

  ...
};
  

With this change we also need to update the iteration code to handle the new content id:

for (batch::content_order_const_iterator i (b.content_order ().begin ());
     i != b.content_order ().end ();
     ++i)
{
  switch (i->id)
  {
    ...

  case batch::any_id:
    {
      const DOMElement& e (b.any ()[i->index]);
      ...
      break;
    }

    ...
  }
}
  

For the complete working code that shows the use of wildcards in ordered types refer to the order/element example in the examples/cxx/tree/ directory in the XSD distribution.

2.12.5 Mapping for anyAttribute

For anyAttribute the type definitions consist of an alias of the container type with name any_attribute_set (or any1_attribute_set, etc., for subsequent wildcards in the type definition), an alias of the iterator type with name any_attribute_iterator (or any1_attribute_iterator, etc., for subsequent wildcards in the type definition), and an alias of the constant iterator type with name any_attribute_const_iterator (or any1_attribute_const_iterator, etc., for subsequent wildcards in the type definition).

The accessor functions come in constant and non-constant versions. The constant accessor function returns a constant reference to the container and can be used for read-only access. The non-constant version returns an unrestricted reference to the container and can be used for read-write access.

The modifier function expects an argument of type reference to constant of the container type. The modifier function makes a deep copy of its argument. For instance:

<complexType name="object">
  <sequence>
    ...
  </sequence>
  <anyAttribute namespace="##other"/>
</complexType>
  

is mapped to:

class object: public xml_schema::type
{
public:
  // Type definitions.
  //
  typedef attribute_set any_attribute_set;
  typedef any_attribute_set::iterator any_attribute_iterator;
  typedef any_attribute_set::const_iterator any_attribute_const_iterator;

  // Accessors.
  //
  const any_attribute_set&
  any_attribute () const;

  any_attribute_set&
  any_attribute ();

  // Modifier.
  //
  void
  any_attribute (const any_attribute_set&);

  ...

};
  

The attribute_set class is an associative container similar to the std::set class template as defined by the ISO/ANSI Standard for C++ (ISO/IEC 14882:1998, Section 23.3.3, "Class template set") with the key being the attribute's name and namespace. Unlike std::set, attribute_set allows searching using names and namespaces instead of xercesc::DOMAttr objects. It is defined in an implementation-specific namespace and its interface is presented below:

class attribute_set
{
public:
  typedef xercesc::DOMAttr         key_type;
  typedef xercesc::DOMAttr         value_type;
  typedef xercesc::DOMAttr*        pointer;
  typedef const xercesc::DOMAttr*  const_pointer;
  typedef xercesc::DOMAttr&        reference;
  typedef const xercesc::DOMAttr&  const_reference;

  typedef <implementation-defined> iterator;
  typedef <implementation-defined> const_iterator;
  typedef <implementation-defined> reverse_iterator;
  typedef <implementation-defined> const_reverse_iterator;

  typedef <implementation-defined> size_type;
  typedef <implementation-defined> difference_type;
  typedef <implementation-defined> allocator_type;

public:
  attribute_set (xercesc::DOMDocument&);

  template <typename I>
  attribute_set (const I& begin, const I& end, xercesc::DOMDocument&);

  attribute_set (const attribute_set&, xercesc::DOMDocument&);

  attribute_set&
  operator= (const attribute_set&);

public:
  const_iterator
  begin () const;

  const_iterator
  end () const;

  iterator
  begin ();

  iterator
  end ();

  const_reverse_iterator
  rbegin () const;

  const_reverse_iterator
  rend () const;

  reverse_iterator
  rbegin ();

  reverse_iterator
  rend ();

public:
  size_type
  size () const;

  size_type
  max_size () const;

  bool
  empty () const;

  void
  clear ();

public:
  // Makes a deep copy.
  //
  std::pair<iterator, bool>
  insert (const xercesc::DOMAttr&);

  // Assumes ownership.
  //
  std::pair<iterator, bool>
  insert (xercesc::DOMAttr*);

  // Makes a deep copy.
  //
  iterator
  insert (iterator position, const xercesc::DOMAttr&);

  // Assumes ownership.
  //
  iterator
  insert (iterator position, xercesc::DOMAttr*);

  template <typename I>
  void
  insert (const I& begin, const I& end);

public:
  void
  erase (iterator position);

  size_type
  erase (const std::basic_string<C>& name);

  size_type
  erase (const std::basic_string<C>& namespace_,
         const std::basic_string<C>& name);

  size_type
  erase (const XMLCh* name);

  size_type
  erase (const XMLCh* namespace_, const XMLCh* name);

  void
  erase (iterator begin, iterator end);

public:
  size_type
  count (const std::basic_string<C>& name) const;

  size_type
  count (const std::basic_string<C>& namespace_,
         const std::basic_string<C>& name) const;

  size_type
  count (const XMLCh* name) const;

  size_type
  count (const XMLCh* namespace_, const XMLCh* name) const;

  iterator
  find (const std::basic_string<C>& name);

  iterator
  find (const std::basic_string<C>& namespace_,
        const std::basic_string<C>& name);

  iterator
  find (const XMLCh* name);

  iterator
  find (const XMLCh* namespace_, const XMLCh* name);

  const_iterator
  find (const std::basic_string<C>& name) const;

  const_iterator
  find (const std::basic_string<C>& namespace_,
        const std::basic_string<C>& name) const;

  const_iterator
  find (const XMLCh* name) const;

  const_iterator
  find (const XMLCh* namespace_, const XMLCh* name) const;

public:
  // Note that the DOMDocument object of the two sets being
  // swapped should be the same.
  //
  void
  swap (attribute_set&);
};

bool
operator== (const attribute_set&, const attribute_set&);

bool
operator!= (const attribute_set&, const attribute_set&);
  

The following code shows how one could use this mapping:

void
f (object& o, const xercesc::DOMAttr& a)
{
  using namespace xercesc;

  object::any_attribute_set& s (o.any_attribute ());

  // Iteration.
  //
  for (object::any_attribute_iterator i (s.begin ()); i != s.end (); ++i)
  {
    DOMAttr& a (*i);
  }

  // Modification.
  //
  s.insert (a);                         // deep copy
  DOMDocument& doc (o.dom_document ());
  s.insert (doc.createAttribute (...)); // assumes ownership

  // Searching.
  //
  object::any_attribute_iterator i (s.find ("name"));
  i = s.find ("http://www.w3.org/XML/1998/namespace", "lang");
}
  

2.13 Mapping for Mixed Content Models

For XML Schema types with mixed content models C++/Tree provides mapping support only if the type is marked as ordered (Section 2.8.4, "Element Order"). Use the --ordered-type-mixed XSD compiler option to automatically mark all types with mixed content as ordered.

For an ordered type with mixed content, C++/Tree adds an extra text content sequence that is used to store the text fragments. This text content sequence is also assigned the content id and its entries are included in the content order sequence, just like elements. As a result, it is possible to capture the order between elements and text fragments.

As an example, consider the following schema that describes text with embedded links:

<complexType name="anchor">
  <simpleContent>
    <extension base="string">
      <attribute name="href" type="anyURI" use="required"/>
    </extension>
  </simpleContent>
</complexType>

<complexType name="text" mixed="true">
  <sequence>
    <element name="a" type="anchor" minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
</complexType>
  

The generated text C++ class will provide the following API (assuming it is marked as ordered):

class text: public xml_schema::type
{
public:
  // a
  //
  typedef anchor a_type;
  typedef sequence<a_type> a_sequence;
  typedef a_sequence::iterator a_iterator;
  typedef a_sequence::const_iterator a_const_iterator;

  static const std::size_t a_id = 1UL;

  const a_sequence&
  a () const;

  a_sequence&
  a ();

  void
  a (const a_sequence&);

  // text_content
  //
  typedef xml_schema::string text_content_type;
  typedef sequence<text_content_type> text_content_sequence;
  typedef text_content_sequence::iterator text_content_iterator;
  typedef text_content_sequence::const_iterator text_content_const_iterator;

  static const std::size_t text_content_id = 2UL;

  const text_content_sequence&
  text_content () const;

  text_content_sequence&
  text_content ();

  void
  text_content (const text_content_sequence&);

  // content_order
  //
  typedef xml_schema::content_order content_order_type;
  typedef std::vector<content_order_type> content_order_sequence;
  typedef content_order_sequence::iterator content_order_iterator;
  typedef content_order_sequence::const_iterator content_order_const_iterator;

  const content_order_sequence&
  content_order () const;

  content_order_sequence&
  content_order ();

  void
  content_order (const content_order_sequence&);

  ...
};
  

Given this interface we can iterate over both link elements and text in content order. The following code fragment converts our format to plain text with references.

const text& t = ...

for (text::content_order_const_iterator i (t.content_order ().begin ());
     i != t.content_order ().end ();
     ++i)
{
  switch (i->id)
  {
  case text::a_id:
    {
      const anchor& a (t.a ()[i->index]);
      cerr << a << "[" << a.href () << "]";
      break;
    }
  case text::text_content_id:
    {
      const xml_schema::string& s (t.text_content ()[i->index]);
      cerr << s;
      break;
    }
  default:
    {
      assert (false); // Unknown content id.
    }
  }
}
  

For the complete working code that shows the use of mixed content in ordered types refer to the order/mixed example in the examples/cxx/tree/ directory in the XSD distribution.

3 Parsing

This chapter covers various aspects of parsing XML instance documents in order to obtain corresponding tree-like object model.

Each global XML Schema element in the form:

<element name="name" type="type"/>
  

is mapped to 14 overloaded C++ functions in the form:

// Read from a URI or a local file.
//

std::[auto|unique]_ptr<type>
name (const std::basic_string<C>& uri,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());

std::[auto|unique]_ptr<type>
name (const std::basic_string<C>& uri,
      xml_schema::error_handler&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());

std::[auto|unique]_ptr<type>
name (const std::basic_string<C>& uri,
      xercesc::DOMErrorHandler&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());


// Read from std::istream.
//

std::[auto|unique]_ptr<type>
name (std::istream&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());

std::[auto|unique]_ptr<type>
name (std::istream&,
      xml_schema::error_handler&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());

std::[auto|unique]_ptr<type>
name (std::istream&,
      xercesc::DOMErrorHandler&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());


std::[auto|unique]_ptr<type>
name (std::istream&,
      const std::basic_string<C>& id,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());

std::[auto|unique]_ptr<type>
name (std::istream&,
      const std::basic_string<C>& id,
      xml_schema::error_handler&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());

std::[auto|unique]_ptr<type>
name (std::istream&,
      const std::basic_string<C>& id,
      xercesc::DOMErrorHandler&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());


// Read from InputSource.
//

std::[auto|unique]_ptr<type>
name (xercesc::InputSource&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());

std::[auto|unique]_ptr<type>
name (xercesc::InputSource&,
      xml_schema::error_handler&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());

std::[auto|unique]_ptr<type>
name (xercesc::InputSource&,
      xercesc::DOMErrorHandler&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());


// Read from DOM.
//

std::[auto|unique]_ptr<type>
name (const xercesc::DOMDocument&,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());

std::[auto|unique]_ptr<type>
name (xml_schema::dom::[auto|unique]_ptr<xercesc::DOMDocument>,
      xml_schema::flags = 0,
      const xml_schema::properties& = xml_schema::properties ());
  

You can choose between reading an XML instance from a local file, URI, std::istream, xercesc::InputSource, or a pre-parsed DOM instance in the form of xercesc::DOMDocument. All the parsing functions return a dynamically allocated object model as either std::auto_ptr or std::unique_ptr, depending on the C++ standard selected. Each of these parsing functions is discussed in more detail in the following sections.

3.1 Initializing the Xerces-C++ Runtime

Some parsing functions expect you to initialize the Xerces-C++ runtime while others initialize and terminate it as part of their work. The general rule is as follows: if a function has any arguments or return a value that is an instance of a Xerces-C++ type, then this function expects you to initialize the Xerces-C++ runtime. Otherwise, the function initializes and terminates the runtime for you. Note that it is legal to have nested calls to the Xerces-C++ initialize and terminate functions as long as the calls are balanced.

You can instruct parsing functions that initialize and terminate the runtime not to do so by passing the xml_schema::flags::dont_initialize flag (see Section 3.2, "Flags and Properties").

3.2 Flags and Properties

Parsing flags and properties are the last two arguments of every parsing function. They allow you to fine-tune the process of instance validation and parsing. Both arguments are optional.

The following flags are recognized by the parsing functions:

xml_schema::flags::keep_dom
Keep association between DOM nodes and the resulting object model nodes. For more information about DOM association refer to Section 5.1, "DOM Association".
xml_schema::flags::own_dom
Assume ownership of the DOM document passed. This flag only makes sense together with the keep_dom flag in the call to the parsing function with the xml_schema::dom::[auto|unique]_ptr<DOMDocument> argument.
xml_schema::flags::dont_validate
Do not validate instance documents against schemas.
xml_schema::flags::dont_initialize
Do not initialize the Xerces-C++ runtime.

You can pass several flags by combining them using the bit-wise OR operator. For example:

using xml_schema::flags;

std::auto_ptr<type> r (
  name ("test.xml", flags::keep_dom | flags::dont_validate));
  

By default, validation of instance documents is turned on even though parsers generated by XSD do not assume instance documents are valid. They include a number of checks that prevent construction of inconsistent object models. This, however, does not mean that an instance document that was successfully parsed by the XSD-generated parsers is valid per the corresponding schema. If an instance document is not "valid enough" for the generated parsers to construct consistent object model, one of the exceptions defined in xml_schema namespace is thrown (see Section 3.3, "Error Handling").

For more information on the Xerces-C++ runtime initialization refer to Section 3.1, "Initializing the Xerces-C++ Runtime".

The xml_schema::properties class allows you to programmatically specify schema locations to be used instead of those specified with the xsi::schemaLocation and xsi::noNamespaceSchemaLocation attributes in instance documents. The interface of the properties class is presented below:

class properties
{
public:
  void
  schema_location (const std::basic_string<C>& namespace_,
                   const std::basic_string<C>& location);
  void
  no_namespace_schema_location (const std::basic_string<C>& location);
};
  

Note that all locations are relative to an instance document unless they are URIs. For example, if you want to use a local file as your schema, then you will need to pass file:///absolute/path/to/your/schema as the location argument.

3.3 Error Handling

As discussed in Section 2.2, "Error Handling", the mapping uses the C++ exception handling mechanism as its primary way of reporting error conditions. However, to handle recoverable parsing and validation errors and warnings, a callback interface maybe preferred by the application.

To better understand error handling and reporting strategies employed by the parsing functions, it is useful to know that the transformation of an XML instance document to a statically-typed tree happens in two stages. The first stage, performed by Xerces-C++, consists of parsing an XML document into a DOM instance. For short, we will call this stage the XML-DOM stage. Validation, if not disabled, happens during this stage. The second stage, performed by the generated parsers, consist of parsing the DOM instance into the statically-typed tree. We will call this stage the DOM-Tree stage. Additional checks are performed during this stage in order to prevent construction of inconsistent tree which could otherwise happen when validation is disabled, for example.

All parsing functions except the one that operates on a DOM instance come in overloaded triples. The first function in such a triple reports error conditions exclusively by throwing exceptions. It accumulates all the parsing and validation errors of the XML-DOM stage and throws them in a single instance of the xml_schema::parsing exception (described below). The second and the third functions in the triple use callback interfaces to report parsing and validation errors and warnings. The two callback interfaces are xml_schema::error_handler and xercesc::DOMErrorHandler. For more information on the xercesc::DOMErrorHandler interface refer to the Xerces-C++ documentation. The xml_schema::error_handler interface is presented below:

class error_handler
{
public:
  struct severity
  {
    enum value
    {
      warning,
      error,
      fatal
    };
  };

  virtual bool
  handle (const std::basic_string<C>& id,
          unsigned long line,
          unsigned long column,
          severity,
          const std::basic_string<C>& message) = 0;

  virtual
  ~error_handler ();
};
  

The id argument of the error_handler::handle function identifies the resource being parsed (e.g., a file name or URI).

By returning true from the handle function you instruct the parser to recover and continue parsing. Returning false results in termination of the parsing process. An error with the fatal severity level results in termination of the parsing process no matter what is returned from the handle function. It is safe to throw an exception from the handle function.

The DOM-Tree stage reports error conditions exclusively by throwing exceptions. Individual exceptions thrown by the parsing functions are described in the following sub-sections.

3.3.1 xml_schema::parsing

struct severity
{
  enum value
  {
    warning,
    error
  };

  severity (value);
  operator value () const;
};

struct error
{
  error (severity,
         const std::basic_string<C>& id,
         unsigned long line,
         unsigned long column,
         const std::basic_string<C>& message);

  severity
  severity () const;

  const std::basic_string<C>&
  id () const;

  unsigned long
  line () const;

  unsigned long
  column () const;

  const std::basic_string<C>&
  message () const;
};

std::basic_ostream<C>&
operator<< (std::basic_ostream<C>&, const error&);

struct diagnostics: std::vector<error>
{
};

std::basic_ostream<C>&
operator<< (std::basic_ostream<C>&, const diagnostics&);

struct parsing: virtual exception
{
  parsing ();
  parsing (const diagnostics&);

  const diagnostics&
  diagnostics () const;

  virtual const char*
  what () const throw ();
};
  

The xml_schema::parsing exception is thrown if there were parsing or validation errors reported during the XML-DOM stage. If no callback interface was provided to the parsing function, the exception contains a list of errors and warnings accessible using the diagnostics function. The usual conditions when this exception is thrown include malformed XML instances and, if validation is turned on, invalid instance documents.

3.3.2 xml_schema::expected_element

struct expected_element: virtual exception
{
  expected_element (const std::basic_string<C>& name,
                    const std::basic_string<C>& namespace_);


  const std::basic_string<C>&
  name () const;

  const std::basic_string<C>&
  namespace_ () const;


  virtual const char*
  what () const throw ();
};
  

The xml_schema::expected_element exception is thrown when an expected element is not encountered by the DOM-Tree stage. The name and namespace of the expected element can be obtained using the name and namespace_ functions respectively.

3.3.3 xml_schema::unexpected_element

struct unexpected_element: virtual exception
{
  unexpected_element (const std::basic_string<C>& encountered_name,
                      const std::basic_string<C>& encountered_namespace,
                      const std::basic_string<C>& expected_name,
                      const std::basic_string<C>& expected_namespace)


  const std::basic_string<C>&
  encountered_name () const;

  const std::basic_string<C>&
  encountered_namespace () const;


  const std::basic_string<C>&
  expected_name () const;

  const std::basic_string<C>&
  expected_namespace () const;


  virtual const char*
  what () const throw ();
};
  

The xml_schema::unexpected_element exception is thrown when an unexpected element is encountered by the DOM-Tree stage. The name and namespace of the encountered element can be obtained using the encountered_name and encountered_namespace functions respectively. If an element was expected instead of the encountered one, its name and namespace can be obtained using the expected_name and expected_namespace functions respectively. Otherwise these functions return empty strings.

3.3.4 xml_schema::expected_attribute

struct expected_attribute: virtual exception
{
  expected_attribute (const std::basic_string<C>& name,
                      const std::basic_string<C>& namespace_);


  const std::basic_string<C>&
  name () const;

  const std::basic_string<C>&
  namespace_ () const;


  virtual const char*
  what () const throw ();
};
  

The xml_schema::expected_attribute exception is thrown when an expected attribute is not encountered by the DOM-Tree stage. The name and namespace of the expected attribute can be obtained using the name and namespace_ functions respectively.

3.3.5 xml_schema::unexpected_enumerator

struct unexpected_enumerator: virtual exception
{
  unexpected_enumerator (const std::basic_string<C>& enumerator);

  const std::basic_string<C>&
  enumerator () const;

  virtual const char*
  what () const throw ();
};
  

The xml_schema::unexpected_enumerator exception is thrown when an unexpected enumerator is encountered by the DOM-Tree stage. The enumerator can be obtained using the enumerator functions.

3.3.6 xml_schema::expected_text_content

struct expected_text_content: virtual exception
{
  virtual const char*
  what () const throw ();
};
  

The xml_schema::expected_text_content exception is thrown when a content other than text is encountered and the text content was expected by the DOM-Tree stage.

3.3.7 xml_schema::no_type_info

struct no_type_info: virtual exception
{
  no_type_info (const std::basic_string<C>& type_name,
                const std::basic_string<C>& type_namespace);

  const std::basic_string<C>&
  type_name () const;

  const std::basic_string<C>&
  type_namespace () const;

  virtual const char*
  what () const throw ();
};
  

The xml_schema::no_type_info exception is thrown when there is no type information associated with a type specified by the xsi:type attribute. This exception is thrown by the DOM-Tree stage. The name and namespace of the type in question can be obtained using the type_name and type_namespace functions respectively. Usually, catching this exception means that you haven't linked the code generated from the schema defining the type in question with your application or this schema has been compiled without the --generate-polymorphic option.

3.3.8 xml_schema::not_derived

struct not_derived: virtual exception
{
  not_derived (const std::basic_string<C>& base_type_name,
               const std::basic_string<C>& base_type_namespace,
               const std::basic_string<C>& derived_type_name,
               const std::basic_string<C>& derived_type_namespace);

  const std::basic_string<C>&
  base_type_name () const;

  const std::basic_string<C>&
  base_type_namespace () const;


  const std::basic_string<C>&
  derived_type_name () const;

  const std::basic_string<C>&
  derived_type_namespace () const;

  virtual const char*
  what () const throw ();
};
  

The xml_schema::not_derived exception is thrown when a type specified by the xsi:type attribute is not derived from the expected base type. This exception is thrown by the DOM-Tree stage. The name and namespace of the expected base type can be obtained using the base_type_name and base_type_namespace functions respectively. The name and namespace of the offending type can be obtained using the derived_type_name and derived_type_namespace functions respectively.

3.3.9 xml_schema::no_prefix_mapping

struct no_prefix_mapping: virtual exception
{
  no_prefix_mapping (const std::basic_string<C>& prefix);

  const std::basic_string<C>&
  prefix () const;

  virtual const char*
  what () const throw ();
};
  

The xml_schema::no_prefix_mapping exception is thrown during the DOM-Tree stage if a namespace prefix is encountered for which a prefix-namespace mapping hasn't been provided. The namespace prefix in question can be obtained using the prefix function.

3.4 Reading from a Local File or URI

Using a local file or URI is the simplest way to parse an XML instance. For example:

using std::auto_ptr;

auto_ptr<type> r1 (name ("test.xml"));
auto_ptr<type> r2 (name ("http://www.codesynthesis.com/test.xml"));
  

Or, in the C++11 mode:

using std::unique_ptr;

unique_ptr<type> r1 (name ("test.xml"));
unique_ptr<type> r2 (name ("http://www.codesynthesis.com/test.xml"));
  

3.5 Reading from std::istream

When using an std::istream instance, you may also pass an optional resource id. This id is used to identify the resource (for example in error messages) as well as to resolve relative paths. For instance:

using std::auto_ptr;

{
  std::ifstream ifs ("test.xml");
  auto_ptr<type> r (name (ifs, "test.xml"));
}

{
  std::string str ("..."); // Some XML fragment.
  std::istringstream iss (str);
  auto_ptr<type> r (name (iss));
}
  

3.6 Reading from xercesc::InputSource

Reading from a xercesc::InputSource instance is similar to the std::istream case except the resource id is maintained by the InputSource object. For instance:

xercesc::StdInInputSource is;
std::auto_ptr<type> r (name (is));
  

3.7 Reading from DOM

Reading from a xercesc::DOMDocument instance allows you to setup a custom XML-DOM stage. Things like DOM parser reuse, schema pre-parsing, and schema caching can be achieved with this approach. For more information on how to obtain DOM representation from an XML instance refer to the Xerces-C++ documentation. In addition, the C++/Tree Mapping FAQ shows how to parse an XML instance to a Xerces-C++ DOM document using the XSD runtime utilities.

The last parsing function is useful when you would like to perform your own XML-to-DOM parsing and associate the resulting DOM document with the object model nodes. The automatic DOMDocument pointer is reset and the resulting object model assumes ownership of the DOM document passed. For example:

// C++98 version.
//
xml_schema::dom::auto_ptr<xercesc::DOMDocument> doc = ...

std::auto_ptr<type> r (
  name (doc, xml_schema::flags::keep_dom | xml_schema::flags::own_dom));

// At this point doc is reset to 0.

// C++11 version.
//
xml_schema::dom::unique_ptr<xercesc::DOMDocument> doc = ...

std::unique_ptr<type> r (
  name (std::move (doc),
        xml_schema::flags::keep_dom | xml_schema::flags::own_dom));

// At this point doc is reset to 0.
  

4 Serialization

This chapter covers various aspects of serializing a tree-like object model to DOM or XML. In this regard, serialization is complimentary to the reverse process of parsing a DOM or XML instance into an object model which is discussed in Chapter 3, "Parsing". Note that the generation of the serialization code is optional and should be explicitly requested with the --generate-serialization option. See the XSD Compiler Command Line Manual for more information.

Each global XML Schema element in the form:

<xsd:element name="name" type="type"/>
  

is mapped to 8 overloaded C++ functions in the form:

// Serialize to std::ostream.
//
void
name (std::ostream&,
      const type&,
      const xml_schema::namespace_fomap& =
        xml_schema::namespace_infomap (),
      const std::basic_string<C>& encoding = "UTF-8",
      xml_schema::flags = 0);

void
name (std::ostream&,
      const type&,
      xml_schema::error_handler&,
      const xml_schema::namespace_infomap& =
        xml_schema::namespace_infomap (),
      const std::basic_string<C>& encoding = "UTF-8",
      xml_schema::flags = 0);

void
name (std::ostream&,
      const type&,
      xercesc::DOMErrorHandler&,
      const xml_schema::namespace_infomap& =
        xml_schema::namespace_infomap (),
      const std::basic_string<C>& encoding = "UTF-8",
      xml_schema::flags = 0);


// Serialize to XMLFormatTarget.
//
void
name (xercesc::XMLFormatTarget&,
      const type&,
      const xml_schema::namespace_infomap& =
        xml_schema::namespace_infomap (),
      const std::basic_string<C>& encoding = "UTF-8",
      xml_schema::flags = 0);

void
name (xercesc::XMLFormatTarget&,
      const type&,
      xml_schema::error_handler&,
      const xml_schema::namespace_infomap& =
        xml_schema::namespace_infomap (),
      const std::basic_string<C>& encoding = "UTF-8",
      xml_schema::flags = 0);

void
name (xercesc::XMLFormatTarget&,
      const type&,
      xercesc::DOMErrorHandler&,
      const xml_schema::namespace_infomap& =
        xml_schema::namespace_infomap (),
      const std::basic_string<C>& encoding = "UTF-8",
      xml_schema::flags = 0);


// Serialize to DOM.
//
xml_schema::dom::[auto|unique]_ptr<xercesc::DOMDocument>
name (const type&,
      const xml_schema::namespace_infomap&
        xml_schema::namespace_infomap (),
      xml_schema::flags = 0);

void
name (xercesc::DOMDocument&,
      const type&,
      xml_schema::flags = 0);
  

You can choose between writing XML to std::ostream or xercesc::XMLFormatTarget and creating a DOM instance in the form of xercesc::DOMDocument. Serialization to ostream or XMLFormatTarget requires a considerably less work while serialization to DOM provides for greater flexibility. Each of these serialization functions is discussed in more detail in the following sections.

4.1 Initializing the Xerces-C++ Runtime

Some serialization functions expect you to initialize the Xerces-C++ runtime while others initialize and terminate it as part of their work. The general rule is as follows: if a function has any arguments or return a value that is an instance of a Xerces-C++ type, then this function expects you to initialize the Xerces-C++ runtime. Otherwise, the function initializes and terminates the runtime for you. Note that it is legal to have nested calls to the Xerces-C++ initialize and terminate functions as long as the calls are balanced.

You can instruct serialization functions that initialize and terminate the runtime not to do so by passing the xml_schema::flags::dont_initialize flag (see Section 4.3, "Flags").

4.2 Namespace Infomap and Character Encoding

When a document being serialized uses XML namespaces, custom prefix-namespace associations can to be established. If custom prefix-namespace mapping is not provided then generic prefixes (p1, p2, etc) are automatically assigned to namespaces as needed. Also, if you would like the resulting instance document to contain the schemaLocation or noNamespaceSchemaLocation attributes, you will need to provide namespace-schema associations. The xml_schema::namespace_infomap class is used to capture this information:

struct namespace_info
{
  namespace_info ();
  namespace_info (const std::basic_string<C>& name,
                  const std::basic_string<C>& schema);

  std::basic_string<C> name;
  std::basic_string<C> schema;
};

// Map of namespace prefix to namespace_info.
//
struct namespace_infomap: public std::map<std::basic_string<C>,
                                          namespace_info>
{
};
  

Consider the following associations as an example:

xml_schema::namespace_infomap map;

map["t"].name = "http://www.codesynthesis.com/test";
map["t"].schema = "test.xsd";
  

This map, if passed to one of the serialization functions, could result in the following XML fragment:

<?xml version="1.0" ?>
<t:name xmlns:t="http://www.codesynthesis.com/test"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.codesynthesis.com/test test.xsd">
  

As you can see, the serialization function automatically added namespace mapping for the xsi prefix. You can change this by providing your own prefix:

xml_schema::namespace_infomap map;

map["xsn"].name = "http://www.w3.org/2001/XMLSchema-instance";

map["t"].name = "http://www.codesynthesis.com/test";
map["t"].schema = "test.xsd";
  

This could result in the following XML fragment:

<?xml version="1.0" ?>
<t:name xmlns:t="http://www.codesynthesis.com/test"
        xmlns:xsn="http://www.w3.org/2001/XMLSchema-instance"
        xsn:schemaLocation="http://www.codesynthesis.com/test test.xsd">
  

To specify the location of a schema without a namespace you can use an empty prefix as in the example below:

xml_schema::namespace_infomap map;

map[""].schema = "test.xsd";
  

This would result in the following XML fragment:

<?xml version="1.0" ?>
<name xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="test.xsd">
  

To make a particular namespace default you can use an empty prefix, for example:

xml_schema::namespace_infomap map;

map[""].name = "http://www.codesynthesis.com/test";
map[""].schema = "test.xsd";
  

This could result in the following XML fragment:

<?xml version="1.0" ?>
<name xmlns="http://www.codesynthesis.com/test"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.codesynthesis.com/test test.xsd">
  

Another bit of information that you can pass to the serialization functions is the character encoding method that you would like to use. Common values for this argument are "US-ASCII", "ISO8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UCS-4BE", and "UCS-4LE". The default encoding is "UTF-8". For more information on encoding methods see the "Character Encoding" article from Wikipedia.

4.3 Flags

Serialization flags are the last argument of every serialization function. They allow you to fine-tune the process of serialization. The flags argument is optional.

The following flags are recognized by the serialization functions:

xml_schema::flags::dont_initialize
Do not initialize the Xerces-C++ runtime.
xml_schema::flags::dont_pretty_print
Do not add extra spaces or new lines that make the resulting XML slightly bigger but easier to read.
xml_schema::flags::no_xml_declaration
Do not write XML declaration (<?xml ... ?>).

You can pass several flags by combining them using the bit-wise OR operator. For example:

std::auto_ptr<type> r = ...
std::ofstream ofs ("test.xml");
xml_schema::namespace_infomap map;
name (ofs,
      *r,
      map,
      "UTF-8",
      xml_schema::flags::no_xml_declaration |
      xml_schema::flags::dont_pretty_print);
  

For more information on the Xerces-C++ runtime initialization refer to Section 4.1, "Initializing the Xerces-C++ Runtime".

4.4 Error Handling

As with the parsing functions (see Section 3.3, "Error Handling"), to better understand error handling and reporting strategies employed by the serialization functions, it is useful to know that the transformation of a statically-typed tree to an XML instance document happens in two stages. The first stage, performed by the generated code, consist of building a DOM instance from the statically-typed tree . For short, we will call this stage the Tree-DOM stage. The second stage, performed by Xerces-C++, consists of serializing the DOM instance into the XML document. We will call this stage the DOM-XML stage.

All serialization functions except the two that serialize into a DOM instance come in overloaded triples. The first function in such a triple reports error conditions exclusively by throwing exceptions. It accumulates all the serialization errors of the DOM-XML stage and throws them in a single instance of the xml_schema::serialization exception (described below). The second and the third functions in the triple use callback interfaces to report serialization errors and warnings. The two callback interfaces are xml_schema::error_handler and xercesc::DOMErrorHandler. The xml_schema::error_handler interface is described in Section 3.3, "Error Handling". For more information on the xercesc::DOMErrorHandler interface refer to the Xerces-C++ documentation.

The Tree-DOM stage reports error conditions exclusively by throwing exceptions. Individual exceptions thrown by the serialization functions are described in the following sub-sections.

4.4.1 xml_schema::serialization

struct serialization: virtual exception
{
  serialization ();
  serialization (const diagnostics&);

  const diagnostics&
  diagnostics () const;

  virtual const char*
  what () const throw ();
};
  

The xml_schema::diagnostics class is described in Section 3.3.1, "xml_schema::parsing". The xml_schema::serialization exception is thrown if there were serialization errors reported during the DOM-XML stage. If no callback interface was provided to the serialization function, the exception contains a list of errors and warnings accessible using the diagnostics function.

4.4.2 xml_schema::unexpected_element

The xml_schema::unexpected_element exception is described in Section 3.3.3, "xml_schema::unexpected_element". It is thrown by the serialization functions during the Tree-DOM stage if the root element name of the provided DOM instance does not match with the name of the element this serialization function is for.

4.4.3 xml_schema::no_type_info

The xml_schema::no_type_info exception is described in Section 3.3.7, "xml_schema::no_type_info". It is thrown by the serialization functions during the Tree-DOM stage when there is no type information associated with a dynamic type of an element. Usually, catching this exception means that you haven't linked the code generated from the schema defining the type in question with your application or this schema has been compiled without the --generate-polymorphic option.

4.5 Serializing to std::ostream

In order to serialize to std::ostream you will need an object model, an output stream and, optionally, a namespace infomap. For instance:

// Obtain the object model.
//
std::auto_ptr<type> r = ...

// Prepare namespace mapping and schema location information.
//
xml_schema::namespace_infomap map;

map["t"].name = "http://www.codesynthesis.com/test";
map["t"].schema = "test.xsd";

// Write it out.
//
name (std::cout, *r, map);
  

Note that the output stream is treated as a binary stream. This becomes important when you use a character encoding that is wider than 8-bit char, for instance UTF-16 or UCS-4. For example, things will most likely break if you try to serialize to std::ostringstream with UTF-16 or UCS-4 as an encoding. This is due to the special value, '\0', that will most likely occur as part of such serialization and it won't have the special meaning assumed by std::ostringstream.

4.6 Serializing to xercesc::XMLFormatTarget

Serializing to an xercesc::XMLFormatTarget instance is similar the std::ostream case. For instance:

using std::auto_ptr;

// Obtain the object model.
//
auto_ptr<type> r = ...

// Prepare namespace mapping and schema location information.
//
xml_schema::namespace_infomap map;

map["t"].name = "http://www.codesynthesis.com/test";
map["t"].schema = "test.xsd";

using namespace xercesc;

XMLPlatformUtils::Initialize ();

{
  // Choose a target.
  //
  auto_ptr<XMLFormatTarget> ft;

  if (argc != 2)
  {
    ft = auto_ptr<XMLFormatTarget> (new StdOutFormatTarget ());
  }
  else
  {
    ft = auto_ptr<XMLFormatTarget> (
      new LocalFileFormatTarget (argv[1]));
  }

  // Write it out.
  //
  name (*ft, *r, map);
}

XMLPlatformUtils::Terminate ();
  

Note that we had to initialize the Xerces-C++ runtime before we could call this serialization function.

4.7 Serializing to DOM

The mapping provides two overloaded functions that implement serialization to a DOM instance. The first creates a DOM instance for you and the second serializes to an existing DOM instance. While serializing to a new DOM instance is similar to serializing to std::ostream or xercesc::XMLFormatTarget, serializing to an existing DOM instance requires quite a bit of work from your side. You will need to set all the custom namespace mapping attributes as well as the schemaLocation and/or noNamespaceSchemaLocation attributes. The following listing should give you an idea about what needs to be done:

// Obtain the object model.
//
std::auto_ptr<type> r = ...

using namespace xercesc;

XMLPlatformUtils::Initialize ();

{
  // Create a DOM instance. Set custom namespace mapping and schema
  // location attributes.
  //
  DOMDocument& doc = ...

  // Serialize to DOM.
  //
  name (doc, *r);

  // Serialize the DOM document to XML.
  //
  ...
}

XMLPlatformUtils::Terminate ();
  

For more information on how to create and serialize a DOM instance refer to the Xerces-C++ documentation. In addition, the C++/Tree Mapping FAQ shows how to implement these operations using the XSD runtime utilities.

5 Additional Functionality

The C++/Tree mapping provides a number of optional features that can be useful in certain situations. They are described in the following sections.

5.1 DOM Association

Normally, after parsing is complete, the DOM document which was used to extract the data is discarded. However, the parsing functions can be instructed to preserve the DOM document and create an association between the DOM nodes and object model nodes. When there is an association between the DOM and object model nodes, you can obtain the corresponding DOM element or attribute node from an object model node as well as perform the reverse transition: obtain the corresponding object model from a DOM element or attribute node.

Maintaining DOM association is normally useful when the application needs access to XML constructs that are not preserved in the object model, for example, XML comments. Another useful aspect of DOM association is the ability of the application to navigate the document tree using the generic DOM interface (for example, with the help of an XPath processor) and then move back to the statically-typed object model. Note also that while you can change the underlying DOM document, these changes are not reflected in the object model and will be ignored during serialization. If you need to not only access but also modify some aspects of XML that are not preserved in the object model, then type customization with custom parsing constructors and serialization operators should be used instead.

To request DOM association you will need to pass the xml_schema::flags::keep_dom flag to one of the parsing functions (see Section 3.2, "Flags and Properties" for more information). In this case the DOM document is retained and will be released when the object model is deleted. Note that since DOM nodes "out-live" the parsing function call, you need to initialize the Xerces-C++ runtime before calling one of the parsing functions with the keep_dom flag and terminate it after the object model is destroyed (see Section 3.1, "Initializing the Xerces-C++ Runtime").

If the keep_dom flag is passed as the second argument to the copy constructor and the copy being made is of a complete tree, then the DOM association is also maintained in the copy by cloning the underlying DOM document and reestablishing the associations. For example:

using namespace xercesc;

XMLPlatformUtils::Initialize ();

{
  // Parse XML to object model.
  //
  std::auto_ptr<type> r (root (
    "root.xml",
     xml_schema::flags::keep_dom |
     xml_schema::flags::dont_initialize));

   // Copy without DOM association.
   //
   type copy1 (*r);

   // Copy with DOM association.
   //
   type copy2 (*r, xml_schema::flags::keep_dom);
}

XMLPlatformUtils::Terminate ();
  

To obtain the corresponding DOM node from an object model node you will need to call the _node accessor function which returns a pointer to DOMNode. You can then query this DOM node's type and cast it to either DOMAttr* or DOMElement*. To obtain the corresponding object model node from a DOM node, the DOM user data API is used. The xml_schema::dom::tree_node_key variable contains the key for object model nodes. The following schema and code fragment show how to navigate from DOM to object model nodes and in the opposite direction:

<complexType name="object">
  <sequence>
    <element name="a" type="string"/>
  </sequence>
</complexType>

<element name="root" type="object"/>
  
using namespace xercesc;

XMLPlatformUtils::Initialize ();

{
  // Parse XML to object model.
  //
  std::auto_ptr<type> r (root (
    "root.xml",
     xml_schema::flags::keep_dom |
     xml_schema::flags::dont_initialize));

  DOMNode* n = root->_node ();
  assert (n->getNodeType () == DOMNode::ELEMENT_NODE);
  DOMElement* re = static_cast<DOMElement*> (n);

  // Get the 'a' element. Note that it is not necessarily the
  // first child node of 'root' since there could be whitespace
  // nodes before it.
  //
  DOMElement* ae;

  for (n = re->getFirstChild (); n != 0; n = n->getNextSibling ())
  {
    if (n->getNodeType () == DOMNode::ELEMENT_NODE)
    {
      ae = static_cast<DOMElement*> (n);
      break;
    }
  }

  // Get from the 'a' DOM element to xml_schema::string object model
  // node.
  //
  xml_schema::type& t (
    *reinterpret_cast<xml_schema::type*> (
       ae->getUserData (xml_schema::dom::tree_node_key)));

  xml_schema::string& a (dynamic_cast<xml_schema::string&> (t));
}

XMLPlatformUtils::Terminate ();
  

The 'mixed' example which can be found in the XSD distribution shows how to handle the mixed content using DOM association.

5.2 Binary Serialization

Besides reading from and writing to XML, the C++/Tree mapping also allows you to save the object model to and load it from a number of predefined as well as custom data representation formats. The predefined binary formats are CDR (Common Data Representation) and XDR (eXternal Data Representation). A custom format can easily be supported by providing insertion and extraction operators for basic types.

Binary serialization saves only the data without any meta information or markup. As a result, saving to and loading from a binary representation can be an order of magnitude faster than parsing and serializing the same data in XML. Furthermore, the resulting representation is normally several times smaller than the equivalent XML representation. These properties make binary serialization ideal for internal data exchange and storage. A typical application that uses this facility stores the data and communicates within the system using a binary format and reads/writes the data in XML when communicating with the outside world.

In order to request the generation of insertion operators and extraction constructors for a specific predefined or custom data representation stream, you will need to use the --generate-insertion and --generate-extraction compiler options. See the XSD Compiler Command Line Manual for more information.

Once the insertion operators and extraction constructors are generated, you can use the xml_schema::istream and xml_schema::ostream wrapper stream templates to save the object model to and load it from a specific format. The following code fragment shows how to do this using ACE (Adaptive Communication Environment) CDR streams as an example:

<complexType name="object">
  <sequence>
    <element name="a" type="string"/>
    <element name="b" type="int"/>
  </sequence>
</complexType>

<element name="root" type="object"/>
  
// Parse XML to object model.
//
std::auto_ptr<type> r (root ("root.xml"));

// Save to a CDR stream.
//
ACE_OutputCDR ace_ocdr;
xml_schema::ostream<ACE_OutputCDR> ocdr (ace_ocdr);

ocdr << *r;

// Load from a CDR stream.
//
ACE_InputCDR ace_icdr (buf, size);
xml_schema::istream<ACE_InputCDR> icdr (ace_icdr);

std::auto_ptr<object> copy (new object (icdr));

// Serialize to XML.
//
root (std::cout, *copy);
  

The XSD distribution contains a number of examples that show how to save the object model to and load it from CDR, XDR, and a custom format.

Appendix A — Default and Fixed Values

The following table summarizes the effect of default and fixed values (specified with the default and fixed attributes, respectively) on attribute and element values. The default and fixed attributes are mutually exclusive. It is also worthwhile to note that the fixed value semantics is a superset of the default value semantics.

default fixed
element not present optional required optional required
not present invalid instance not present invalid instance
empty default value is used fixed value is used
value value is used value is used provided it's the same as fixed
attribute not present optional required optional required
default value is used invalid schema fixed value is used invalid instance
empty empty value is used empty value is used provided it's the same as fixed
value value is used value is used provided it's the same as fixed