11 Package XML APIs for C

This C implementation of the XML processor (or parser) follows the W3C XML specification (rev REC-xml-19980210) and implements the required behavior of an XML processor in terms of how it must read XML data and the information it must provide to the application.

This chapter contains the following section:


XML Interface

Table 11-1 summarizes the methods available through the XML interface.

Table 11-1 Summary of XML Methods

Function             Summary
XmlAccess()          Set access method callbacks for URL.
XmlCreate()          Create an XML Developer's Toolkit xmlctx.
XmlCreateDTD()       Create DTD.
XmlCreateDocument()  Create Document (node).
XmlDestroy()         Destroy an xmlctx.
XmlDiff()            Compares two XML documents.
XmlFreeDocument()    Free a document (releases all resources).
XmlGetEncoding()     Returns data encoding in use by XML context.
XmlHasFeature()      Determine if DOM feature is implemented.
XmlIsSimple()        Returns single-byte (simple) character set flag.
XmlIsUnicode()       Returns Unicode (UTF-16) character set flag.
XmlLoadDom()         Load (parse) an XML document and produce a DOM.
XmlLoadSax()         Load (parse) an XML document and produce SAX events.
XmlLoadSaxVA()       Load (parse) an XML document and produce SAX events [varargs].
XmlSaveDom()         Saves (serializes, formats) an XML document.
XmlVersion()         Returns version string for XDK.



XmlAccess()

Sets the open/read/close callbacks used to load data for a specific URL access method. Overrides the built-in data loading functions for HTTP, FTP, and so on, or provides functions to handle new types, such as UNKNOWN.

Syntax

xmlerr XmlAccess(
   xmlctx *xctx, 
   xmlurlacc access, 
   void *userctx,
   XML_ACCESS_OPEN_F(
      (*openf),
      ctx,
      uri,
      parts,
      length,
      uh),
   XML_ACCESS_READ_F(
      (*readf),
      ctx,
      uh,
      data,
      nraw,
      eoi),
   XML_ACCESS_CLOSE_F(
      (*closef), 
      ctx,
      uh));
Parameter  In/Out  Description
xctx       IN      XML context
access     IN      URL access method
userctx    IN      user-defined context passed to callbacks
openf      IN      open-access callback function
readf      IN      read-access callback function
closef     IN      close-access callback function

Returns

(xmlerr) numeric error code, XMLERR_OK [0] on success
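
The division of labor between the open, read, and close callbacks can be pictured with a small stand-alone sketch. It does not use the XDK callback macros, whose exact argument types are not shown in this chapter. The names mem_source, access_callbacks, mem_open, mem_read, mem_close, and load_all are hypothetical, and the argument lists only loosely mirror the ctx, data, nraw, and eoi parameters in the syntax above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user context: an in-memory document served in small chunks. */
typedef struct {
    const char *data;     /* bytes to serve */
    size_t      len;      /* total number of bytes */
    size_t      pos;      /* read cursor */
    int         is_open;  /* set by open, cleared by close */
} mem_source;

/* Hypothetical callback set; the real callbacks are declared with the
   XML_ACCESS_*_F macros and take different arguments. */
typedef struct {
    int (*openf)(void *ctx);
    int (*readf)(void *ctx, char *data, size_t cap, size_t *nraw, int *eoi);
    int (*closef)(void *ctx);
} access_callbacks;

static int mem_open(void *ctx)
{
    mem_source *src = (mem_source *)ctx;
    src->pos = 0;
    src->is_open = 1;
    return 0;                       /* 0 means success in this sketch */
}

/* Copy up to cap bytes into data, report the count in *nraw, and set
   *eoi (end of input) once everything has been delivered. */
static int mem_read(void *ctx, char *data, size_t cap, size_t *nraw, int *eoi)
{
    mem_source *src = (mem_source *)ctx;
    size_t left, n;

    if (!src->is_open) return 1;
    left = src->len - src->pos;
    n = left < cap ? left : cap;
    memcpy(data, src->data + src->pos, n);
    src->pos += n;
    *nraw = n;
    *eoi = (src->pos == src->len);
    return 0;
}

static int mem_close(void *ctx)
{
    ((mem_source *)ctx)->is_open = 0;
    return 0;
}

/* Stand-in for the loader: open, read until end of input, close.
   Returns the number of bytes gathered, or 0 on failure. */
static size_t load_all(const access_callbacks *cb, void *userctx,
                       char *out, size_t outcap)
{
    char   chunk[4];
    size_t total = 0, n = 0;
    int    eoi = 0;

    assert(cb != NULL && out != NULL);
    if (cb->openf(userctx) != 0) return 0;
    while (!eoi) {
        if (cb->readf(userctx, chunk, sizeof chunk, &n, &eoi) != 0 ||
            total + n > outcap) {
            cb->closef(userctx);
            return 0;
        }
        memcpy(out + total, chunk, n);
        total += n;
    }
    cb->closef(userctx);
    return total;
}
```

The user context travels unchanged through all three callbacks, which is the role the userctx parameter plays in XmlAccess().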


XmlCreate()

Create an XML Developer's Toolkit xmlctx.

Syntax

xmlctx *XmlCreate(
   xmlerr *err, 
   oratext *name,
   list);
Parameter  In/Out  Description
err        OUT     returned error code
name       IN      name of context, for debugging
list       IN      NULL-terminated list of variable arguments. Properties common to all xmlctx's, both XDK and XMLType, are:
  • data_encoding is the data encoding in which XML data will be presented through DOM and SAX. Default is UTF-8, or UTF-E on EBCDIC platforms. Single-byte encodings are substantially faster than multibyte encodings; Unicode (UTF-16) uses more memory but performs better than multibyte encodings.

  • default_input_encoding is the default input encoding. If the encoding of an input document cannot be automatically determined through other methods, this encoding is assumed.

  • error_language is the language (and optional encoding) in which error messages are created. Default is American with UTF-8 encoding. To specify only the language, give the name of the language ("American"). To also specify the encoding, add the period and the Oracle name of the encoding ("American.WE8ISO8859P1").

  • error_handler is the error handler function pointer; see XML_ERRMSG_F. By default, errors output the formatted message to stderr. If an error handler is provided, the message is passed to it and is not printed.

  • error_context is user-defined context for error handler, a context pointer to be passed to the error handler function. It is user-defined; it is just specified here and passed along when an error occurs.

  • input_encoding is the name of a forced input encoding for input documents. Use it to override a document's XMLDecl and always interpret the document in the given encoding. It should not be necessary in normal use, as existing BOMs and XMLDecls should be correct.

  • memory_alloc is a low-level memory allocation function, if not using malloc. If used, the matching free function must also be given. See XML_ALLOC_F.

  • memory_free is a low-level memory freeing function, if not using free. Matches the memory_alloc function.

  • memory_context is a user-defined memory context passed to the alloc and free functions. Its definition and use is entirely up to the user; it is just set here and passed to the callbacks.

The XDK has additional properties:

  • input_buffer_size is the basic I/O buffer size. Default is 256K; the range is 4K to 4MB. Depending on the encoding, 1, 2 or 3 of these buffers may be needed. Note that size is in characters, not bytes. If the buffer holds Unicode data, it will be twice as large.

  • memory_block_size is the size of chunk the high-level memory package will request from the low-level allocator; it is the basic unit of memory allocation. Default is 64K; the range is 16K to 256K.

These optional parameters should be used in the following manner:

xmlctx *XmlCreate(
   xmlerr *err, 
   oratext *name,
   ("data_encoding", dataEncoding),
   ("default_input_encoding", defaultInputEncoding),
   ("error_language", errorLanguage),
   ("error_handler", errorHandler),
   ("error_context", errorContext),
   ("input_encoding", inputEncoding),
   ("memory_alloc", memAlloc),
   ("memory_free", memFree),
   ("memory_context", memContext),
   ("input_buffer_size", inputBufSize),
   ("memory_block_size", memBlockSize),
   NULL);
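
How a NULL-terminated list of (name, value) pairs can be consumed is sketched here in plain C with <stdarg.h>. This is not the XDK implementation: demo_props and demo_collect are hypothetical, and every value is assumed to be a string, whereas the real properties also carry function pointers and sizes.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for two of the string-valued properties named above. */
typedef struct {
    const char *data_encoding;
    const char *error_language;
    int         unknown;   /* number of unrecognized property names */
} demo_props;

/* Walk (name, value) pairs until a NULL name is reached and return the
   number of pairs seen. Every value is assumed to be a string here; a
   real implementation would pick the va_arg type per property. */
static int demo_collect(demo_props *out, ...)
{
    va_list ap;
    const char *name;
    int pairs = 0;

    assert(out != NULL);
    out->data_encoding = NULL;
    out->error_language = NULL;
    out->unknown = 0;

    va_start(ap, out);
    while ((name = va_arg(ap, const char *)) != NULL) {
        const char *value = va_arg(ap, const char *);
        if (strcmp(name, "data_encoding") == 0)
            out->data_encoding = value;
        else if (strcmp(name, "error_language") == 0)
            out->error_language = value;
        else
            out->unknown++;
        pairs++;
    }
    va_end(ap);
    return pairs;
}
```

A call such as demo_collect(&p, "data_encoding", "UTF-8", (const char *)NULL) shows one detail that applies to any C varargs list: the terminating NULL should be cast to a pointer type, because a bare NULL may be passed as an integer.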

Returns

(xmlctx *) created xmlctx [or NULL on error with err set]


XmlCreateDTD()

Create DTD.

Syntax

xmldocnode* XmlCreateDTD(
   xmlctx *xctx,
   oratext *qname,
   oratext *pubid,
   oratext *sysid,
   xmlerr *err);
Parameter  In/Out  Description
xctx       IN      XML context
qname      IN      qualified name
pubid      IN      external subset public identifier
sysid      IN      external subset system identifier
err        OUT     returned error code

Returns

(xmldtdnode *) new DTD node


XmlCreateDocument()

Creates the initial top-level DOCUMENT node and its supporting infrastructure. If a qualified name is provided, an element with that name is created and set as the document's root element.

Syntax

xmldocnode* XmlCreateDocument(
   xmlctx *xctx,
   oratext *uri,
   oratext *qname, 
   xmldtdnode *dtd,
   xmlerr *err);
Parameter  In/Out  Description
xctx       IN      XML context
uri        IN      namespace URI of root element to create, or NULL
qname      IN      qualified name of root element, or NULL if none
dtd        IN      associated DTD node
err        OUT     returned error code

Returns

(xmldocnode *) new Document object.


XmlDestroy()

Destroys an XML context.

Syntax

void XmlDestroy(
   xmlctx *xctx);
Parameter  In/Out  Description
xctx       IN      XML context

See Also:

XmlCreate()

XmlDiff()

Compares two XML documents, each specified as a DOM tree, file, URI, orastream, and so on, and returns the document node of the resulting diff. If input documents are not supplied as DOM trees, DOM trees will be created for them.

If the inputs are DOMs, that memory will not be freed when the call completes.

The data (DOM) encoding of both documents must be the same as the data encoding in the XML context. The DOM for the diff will be created in the data encoding specified by the XML context.

Syntax

xmldocnode *XmlDiff(
   xmlctx *xctx, 
   xmlerr *err,
   ub4  flags,
   xmldfsrct firstSourceType,
   void *firstSource,
   void *firstSourceExtra,
   xmldfsrct secondSourceType,
   void *secondSource,
   void *secondSourceExtra,
   uword hashLevel);
Parameter          In/Out  Description
xctx               IN      XML context
err                OUT     numeric error code, XMLERR_OK [0] on success
flags              IN      Comparison options. By default, global algorithm and snapshot model are used.
  • XMLDF_FL_DEFAULTS(=0) chooses defaults

  • XMLDF_FL_ALGORITHM_GLOBAL is the global algorithm; it will generate the minimal diff using INSERT, APPEND, DELETE and UPDATE, and needs more memory and time than XMLDF_FL_ALGORITHM_LOCAL

  • XMLDF_FL_ALGORITHM_LOCAL is the local algorithm; it may not generate the minimal diff, but it is faster and uses less space than XMLDF_FL_ALGORITHM_GLOBAL

  • XMLDF_FL_DISABLE_UPDATE disables update operations with global algorithms

  • XMLDF_FL_OUTPUT_SNAPSHOT uses the snapshot model

firstSourceType    IN      Source type for the first document. If 0, assumed to be a DOM document node.
firstSource        IN      Pointer to the first document source
firstSourceExtra   IN      An additional pointer to the first document source; used for the buffer length pointer.
secondSourceType   IN      Source type for the second document. If 0, assumed to be a DOM document node.
secondSource       IN      Pointer to the second document source
secondSourceExtra  IN      An additional pointer to the second document source; used for the buffer length pointer.
hashLevel          IN      1-based depth (counting from the root), where hashing should be used for subtrees. Values less than or equal to 1 indicate no hashing. This value must be specified programmatically.

The hash value for every element node is associated with the entire subtree rooted at that node. During the computation of the diff, there is no further drilling down into the tree beyond hash level depth.

  • If hashing is used with XMLDF_FL_ALGORITHM_GLOBAL, it will speed up diff computation significantly, but may reduce the quality of the diff.

  • With XMLDF_FL_ALGORITHM_LOCAL, it improves the quality of the diff.
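
The effect of a hash level can be illustrated with a toy comparison that is not the XmlDiff algorithm. In this sketch, node, subtree_hash, and first_diff_depth are hypothetical, children are compared strictly by position, and the hash is a simple FNV-1a style fold chosen only for illustration. At the hash level the comparison looks at one hash per subtree and does not descend further, so a change deep in the tree is reported at the hash level depth.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_KIDS 4

/* Hypothetical element node: a name plus up to MAX_KIDS children. */
typedef struct node {
    const char        *name;
    const struct node *kids[MAX_KIDS];
    size_t             nkids;
} node;

/* Hash of the whole subtree rooted at n: the node name folded together
   with the subtree hash of every child, in order. */
static unsigned long subtree_hash(const node *n)
{
    unsigned long h = 2166136261UL;
    const char *p;
    size_t i;

    for (p = n->name; *p != '\0'; p++) {
        h ^= (unsigned char)*p;
        h *= 16777619UL;
    }
    for (i = 0; i < n->nkids; i++) {
        h ^= subtree_hash(n->kids[i]);
        h *= 16777619UL;
    }
    return h;
}

/* Returns the 1-based depth (root = 1) at which the first difference is
   reported, or 0 if no difference is found. When hash_level > 1, subtrees
   at that depth are compared by hash only and are never descended into. */
static unsigned first_diff_depth(const node *a, const node *b,
                                 unsigned depth, unsigned hash_level)
{
    size_t i;

    assert(a != NULL && b != NULL);
    if (hash_level > 1 && depth >= hash_level)
        return subtree_hash(a) == subtree_hash(b) ? 0 : depth;
    if (strcmp(a->name, b->name) != 0 || a->nkids != b->nkids)
        return depth;
    for (i = 0; i < a->nkids; i++) {
        unsigned d = first_diff_depth(a->kids[i], b->kids[i],
                                      depth + 1, hash_level);
        if (d != 0)
            return d;
    }
    return 0;
}
```

For two trees that differ only in a leaf at depth 4, the sketch reports depth 4 without hashing and depth 2 with a hash level of 2: less work, but a coarser answer.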



XmlFreeDocument()

Destroys a document created by XmlCreateDocument or through one of the Load functions. Releases all resources associated with the document, which is then invalid.

Syntax

void XmlFreeDocument(
   xmlctx *xctx,
   xmldocnode *doc);
Parameter  In/Out  Description
xctx       IN      XML context
doc        IN      document to free


XmlGetEncoding()

Returns data encoding in use by XML context. Ordinarily, the data encoding is chosen by the user, so this function is not needed. However, if the data encoding is not specified, and allowed to default, this function can be used to return the name of that default encoding.

Syntax

oratext *XmlGetEncoding(
   xmlctx *xctx);
Parameter  In/Out  Description
xctx       IN      XML context

Returns

(oratext *) name of data encoding


XmlHasFeature()

Determine if a DOM feature is implemented. Returns TRUE if the feature is implemented in the specified version, FALSE otherwise.

In level 1, the legal values for package are 'HTML' and 'XML' (case-insensitive), and the version is the string "1.0". If the version is not specified, supporting any version of the feature will cause the method to return TRUE.

  • DOM 1.0 features are "XML" and "HTML".

  • DOM 2.0 features are "Core", "XML", "HTML", "Views", "StyleSheets", "CSS", "CSS2", "Events", "UIEvents", "MouseEvents", "MutationEvents", "HTMLEvents", "Range", "Traversal".

Syntax

boolean XmlHasFeature(
   xmlctx *xctx,
   oratext *feature,
   oratext *version);
Parameter  In/Out  Description
xctx       IN      XML context
feature    IN      package name of the feature to test
version    IN      version number of the package name to test

Returns

(boolean) feature is implemented?
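
The matching rules described above (case-insensitive feature name, and an unspecified version matching any version) can be sketched as a table lookup. This is an illustration only: demo_feature, demo_features, equal_nocase, and demo_has_feature are hypothetical, a NULL pointer is assumed to stand for "version not specified", and the table contents do not claim to reflect what XmlHasFeature() actually reports.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature table, using the DOM 1.0 feature names. */
typedef struct {
    const char *feature;
    const char *version;
} demo_feature;

static const demo_feature demo_features[] = {
    { "XML",  "1.0" },
    { "HTML", "1.0" }
};

/* Case-insensitive string equality (ASCII letters only). */
static int equal_nocase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Returns 1 if the feature is listed with the given version, 0 otherwise.
   A NULL version matches any version of the feature. */
static int demo_has_feature(const char *feature, const char *version)
{
    size_t i;

    assert(feature != NULL);
    for (i = 0; i < sizeof demo_features / sizeof demo_features[0]; i++) {
        if (!equal_nocase(feature, demo_features[i].feature))
            continue;
        if (version == NULL || strcmp(version, demo_features[i].version) == 0)
            return 1;
    }
    return 0;
}
```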


XmlIsSimple()

Returns a flag saying whether the context's data encoding is "simple", that is, a single-byte encoding such as ASCII or EBCDIC in which each character occupies one byte.

Syntax

boolean XmlIsSimple(
   xmlctx *xctx);
Parameter  In/Out  Description
xctx       IN      XML context

Returns

(boolean) TRUE if data encoding is "simple", FALSE otherwise


XmlIsUnicode()

Returns a flag saying whether the context's data encoding is Unicode (UTF-16), with two bytes for each character.

Syntax

boolean XmlIsUnicode(
   xmlctx *xctx);
Parameter  In/Out  Description
xctx       IN      XML context

Returns

(boolean) TRUE if data encoding is Unicode, FALSE otherwise


XmlLoadDom()

Loads (parses) an XML document from an input source and creates a DOM. The root document node is returned on success, or NULL on failure (with err set).

The function takes two fixed arguments, the xmlctx and an error return code, then zero or more (property, value) pairs, then NULL.

SOURCE Input source is set by one of the following mutually exclusive properties (choose one):

  • ("uri", document URI) [compiler encoding]

  • ("file", document filesystem path) [compiler encoding]

  • ("buffer", address of buffer, "buffer_length", # bytes in buffer)

  • ("stream", address of stream object, "stream_context", pointer to stream object's context)

  • ("stdio", FILE* stream)

PROPERTIES Additional properties:

  • ("dtd", DTD node) DTD for document

  • ("base_uri", document base URI) for documents loaded from sources other than a URI, sets the effective base URI. the document's base URI is needed in order to resolve relative URIs in includes, imports, and so on.

  • ("input_encoding", encoding name) forced input encoding [name]

  • ("default_input_encoding", encoding_name) default input encoding to assume if document is not self-describing (no BOM, protocol header, XMLDecl, and so on)

  • ("schema_location", string) schemaLocation of schema for this document. used to figure optimal layout when loading documents into a database

  • ("validate", boolean) when TRUE, turns on DTD validation; by default, only well-formedness is checked. note that schema validation is handled separately.

  • ("discard_whitespace", boolean) when TRUE, formatting whitespace between elements (newlines and indentation) in input documents is discarded. by default, ALL input characters are preserved.

  • ("dtd_only", boolean) when TRUE, parses an external DTD, not a complete XML document.

  • ("stop_on_warning", boolean) when TRUE, warnings are treated the same as errors and cause parsing, validation, and so on, to stop immediately. by default, warnings are issued but processing continues.

  • ("warn_duplicate_entity", boolean) when TRUE, entities which are declared more than once will cause warnings to be issued. the default is to accept the first declaration and silently ignore the rest.

  • ("no_expand_char_ref", boolean) when TRUE, causes character references to be left unexpanded in the DOM data. ordinarily, character references are replaced by the character they represent. however, when a document is saved, those character references do not reappear. the way to ensure they remain through load and save is to not expand them.

  • ("no_check_chars", boolean) when TRUE, omits the test of XML [2] Char production: all input characters will be accepted as valid

Syntax

xmldocnode *XmlLoadDom(
   xmlctx *xctx, 
   xmlerr *err, 
   list);
Parameter  In/Out  Description
xctx       IN      XML context
err        OUT     returned error code
list       IN      NULL-terminated list of variable arguments

Returns

(xmldocnode *) document node on success [NULL on failure with err set]

See Also:

XmlSaveDom()

XmlLoadSax()

Loads (parses) an XML document from an input source and generates a set of SAX events (as user callbacks). Input sources and the basic set of properties are the same as for XmlLoadDom.

Syntax

xmlerr XmlLoadSax(
   xmlctx *xctx,
   xmlsaxcb *saxcb,
   void *saxctx, 
   list);
Parameter  In/Out  Description
xctx       IN      XML context
saxcb      IN      SAX callback structure
saxctx     IN      context passed to SAX callbacks
list       IN      NULL-terminated list of variable arguments

Returns

(xmlerr) numeric error code, XMLERR_OK [0] on success


XmlLoadSaxVA()

Loads (parses) an XML document from an input source and generates a set of SAX events (as user callbacks). Input sources and the basic set of properties are the same as for XmlLoadDom. The property list is passed as a va_list instead of as variable arguments.

Syntax

xmlerr XmlLoadSaxVA(
   xmlctx *xctx, 
   xmlsaxcb *saxcb, 
   void *saxctx, 
   va_list va);
Parameter  In/Out  Description
xctx       IN      XML context
saxcb      IN      SAX callback structure
saxctx     IN      context passed to SAX callbacks
va         IN      NULL-terminated list of variable arguments

Returns

(xmlerr) numeric error code, XMLERR_OK [0] on success
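
A va_list entry point is what lets a caller's own variadic function forward its arguments. The sketch below shows that general C arrangement with hypothetical functions (demo_count_va, demo_count); it is an assumption about how such a pair is typically used, not a description of how XmlLoadSax() and XmlLoadSaxVA() are implemented, and all property values are assumed to be strings.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* va_list form: counts (name, value) pairs up to the NULL terminator.
   The tag argument exists only to anchor the variable argument list. */
static int demo_count_va(const char *tag, va_list va)
{
    int pairs = 0;

    assert(tag != NULL);
    while (va_arg(va, const char *) != NULL) {
        (void)va_arg(va, const char *);   /* skip the value of this pair */
        pairs++;
    }
    return pairs;
}

/* Variadic form: captures its arguments in a va_list and delegates.
   A user-written wrapper can forward to the va_list form the same way. */
static int demo_count(const char *tag, ...)
{
    va_list va;
    int pairs;

    va_start(va, tag);
    pairs = demo_count_va(tag, va);
    va_end(va);
    return pairs;
}
```

A variadic function cannot pass its "..." arguments directly to another variadic function, which is why the va_list variant is the one to call from inside a wrapper.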


XmlSaveDom()

Serializes document or subtree to the given destination and returns the number of bytes written; if no destination is provided, just returns formatted size but does not output.

If an output encoding is specified, the document will be re-encoded on output; otherwise, it will be in its existing encoding.

The top level is indented step*level spaces, the next level step*(level+1) spaces, and so on.

When saving to a buffer, if the buffer overflows, 0 is returned and err is set to XMLERR_SAVE_OVERFLOW.
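
The sizing, indentation, and overflow rules above can be pictured with a toy serializer. It is a sketch under stated assumptions, not XDK code: demo_save and the DEMO_ error codes are hypothetical, and each array entry stands for one nesting level of output, so entry i is indented step*(level+i) spaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_OK = 0, DEMO_SAVE_OVERFLOW = 1 };   /* hypothetical error codes */

/* Emits one line per nesting depth: names[i] is indented step*(level+i)
   spaces and followed by a newline. With buf == NULL only the formatted
   size is computed. Returns the number of bytes produced, or 0 with *err
   set to DEMO_SAVE_OVERFLOW if buf is too small. */
static size_t demo_save(const char **names, size_t count,
                        unsigned step, unsigned level,
                        char *buf, size_t buflen, int *err)
{
    size_t total = 0, i;

    assert(names != NULL && err != NULL);
    *err = DEMO_OK;
    for (i = 0; i < count; i++) {
        size_t indent  = (size_t)step * (level + i);
        size_t namelen = strlen(names[i]);
        size_t need    = indent + namelen + 1;   /* +1 for the newline */

        if (buf != NULL) {
            if (total + need > buflen) {
                *err = DEMO_SAVE_OVERFLOW;
                return 0;
            }
            memset(buf + total, ' ', indent);
            memcpy(buf + total + indent, names[i], namelen);
            buf[total + indent + namelen] = '\n';
        }
        total += need;
    }
    return total;
}
```

Calling the function first with a NULL buffer to learn the size, then again with a buffer of at least that size, mirrors the "no destination" behavior described above and avoids the overflow case.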

DESTINATION Output destination is set by one of the following mutually exclusive properties (choose one):

  • ("uri", document URI) POST, PUT? [compiler encoding]

  • ("file", document filesystem path) [compiler encoding]

  • ("buffer", address of buffer, "buffer_length", # bytes in buffer)

  • ("stream", address of stream object, "stream_context", pointer to stream object's context)

PROPERTIES Additional properties:

  • ("output_encoding", encoding name) name of final encoding for document. unless specified, saved document will be in same encoding as xmlctx.

  • ("indent_step", unsigned) spaces to indent each level of output. default is 4, 0 means no indentation.

  • ("indent_level", unsigned) initial indentation level. default is 0, which means no indentation, flush left.

  • ("xmldecl", boolean) include an XMLDecl in the output document. ordinarily an XMLDecl is output for a complete document (root node is DOC).

  • ("bom", boolean) include a BOM in the output document. usually the BOM is only needed for certain encodings (UTF-16), and optional for others (UTF-8). causes optional BOMs to be output.

  • ("prune", boolean) prunes the output like the unix 'find' command; does not descend to children, just prints the one node given.

Syntax

ubig_ora XmlSaveDom(
   xmlctx *xctx,
   xmlerr *err,
   xmlnode *root,
   list);
Parameter  In/Out  Description
xctx       IN      XML context
err        OUT     error code on failure
root       IN      root node or subtree to save
list       IN      NULL-terminated list of variable arguments

Returns

(ubig_ora) number of bytes written to destination

See Also:

XmlLoadDom()

XmlVersion()

Returns the version string for the XDK.

Syntax

oratext *XmlVersion();

Returns

(oratext *) version string