This C implementation of the XML processor (or parser) follows the W3C XML specification (rev REC-xml-19980210) and implements the required behavior of an XML processor in terms of how it must read XML data and the information it must provide to the application.
This chapter contains the following section:
Table 11-1 summarizes the methods available through the XML interface.
Table 11-1 Summary of XML Methods
Function | Summary |
---|---|
XmlAccess() | Set access method callbacks for URL. |
XmlCreate() | Create an XML Developer's Toolkit xmlctx. |
XmlCreateDTD() | Create DTD. |
XmlCreateDocument() | Create Document (node). |
XmlDestroy() | Destroy an xmlctx. |
XmlDiff() | Compares two XML documents. |
XmlFreeDocument() | Free a document (releases all resources). |
XmlGetEncoding() | Returns data encoding in use by XML context. |
XmlHasFeature() | Determine if DOM feature is implemented. |
XmlIsSimple() | Returns single-byte (simple) characterset flag. |
XmlIsUnicode() | Returns Unicode (UTF-16) characterset flag. |
XmlLoadDom() | Load (parse) an XML document and produce a DOM. |
XmlLoadSax() | Load (parse) an XML document and produce SAX events. |
XmlLoadSaxVA() | Load (parse) an XML document and produce SAX events [va_list]. |
XmlSaveDom() | Saves (serializes, formats) an XML document. |
XmlVersion() | Returns version string for XDK. |
Sets the open/read/close callbacks used to load data for a specific URL access method. Overrides the built-in data loading functions for HTTP, FTP, and so on, or provides functions to handle new types, such as UNKNOWN.
xmlerr XmlAccess( xmlctx *xctx, xmlurlacc access, void *userctx, XML_ACCESS_OPEN_F( (*openf), ctx, uri, parts, length, uh), XML_ACCESS_READ_F( (*readf), ctx, uh, data, nraw, eoi), XML_ACCESS_CLOSE_F( (*closef), ctx, uh));
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
access | IN | URL access method |
userctx | IN | user-defined context passed to callbacks |
openf | IN | open-access callback function |
readf | IN | read-access callback function |
closef | IN | close-access callback function |
(xmlerr)
numeric error code, XMLERR_OK [0] on success
Creates an XML Developer's Toolkit xmlctx.
xmlctx *XmlCreate(
xmlerr *err,
oratext *name,
list);
Parameter | In/Out | Description |
---|---|---|
err | OUT | returned error code |
name | IN | name of context, for debugging |
list | IN | NULL-terminated list of variable arguments. Some properties are common to all xmlctx's, both XDK and XMLType; the XDK has additional properties. These optional parameters should be used in the following manner: xmlctx *XmlCreate( xmlerr *err, oratext *name, ("data_encoding", dataEncoding), ("default_data_encoding", defaultDataEncoding), ("error_language", errorLanguage), ("error_handler", errorHandler), ("error_context", errorContext), ("input_encoding", inputEncoding), ("memory_alloc", memAlloc), ("memory_free", memFree), ("memory_context", memContext), ("input_buffer_size", inputBufSize), ("memory_block_size", memBlockSize) ); |
(xmlctx *)
created xmlctx [or NULL on error with err set]
Create DTD.
xmldtdnode* XmlCreateDTD( xmlctx *xctx, oratext *qname, oratext *pubid, oratext *sysid, xmlerr *err);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
qname | IN | qualified name |
pubid | IN | external subset public identifier |
sysid | IN | external subset system identifier |
err | OUT | returned error code |
(xmldtdnode *)
new DTD node
Creates the initial top-level DOCUMENT node and its supporting infrastructure. If a qualified name is provided, an element with that name is created and set as the document's root element.
xmldocnode* XmlCreateDocument( xmlctx *xctx, oratext *uri, oratext *qname, xmldtdnode *dtd, xmlerr *err);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
uri | IN | namespace URI of root element to create, or NULL |
qname | IN | qualified name of root element, or NULL if none |
dtd | IN | associated DTD node |
err | OUT | returned error code |
(xmldocnode *)
new Document object
Destroys an XML context.
void XmlDestroy( xmlctx *xctx);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
See Also:
XmlCreate()
Compares two XML documents, specified either as DOM trees, files, URIs, orastreams, and so on, and returns the document node of the resulting diff. If the input documents are not supplied as DOM trees, DOM trees will be created for them.
If the inputs are DOMs, that memory will not be freed when the call completes.
The data (DOM) encoding of both documents must be the same as the data encoding in the XML context. The DOM for the diff will be created in the data encoding specified by the XML context.
xmldocnode *XmlDiff( xmlctx *xctx, xmlerr *err, ub4 flags, xmldfsrct firstSourceType, void *firstSource, void *firstSourceExtra, xmldfsrct secondSourceType, void *secondSource, void *secondSourceExtra, uword hashLevel);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
err | OUT | numeric error code, XMLERR_OK [0] on success |
flags | IN | Comparison options. By default, global algorithm and snapshot model are used. |
firstSourceType | IN | Source type for the first document. If 0, assumed to be a DOM document node. |
firstSource | IN | Pointer to the first document source |
firstSourceExtra | IN | An additional pointer to the first document source; used for the buffer length pointer. |
secondSourceType | IN | Source type for the second document. If 0, assumed to be a DOM document node. |
secondSource | IN | Pointer to the second document source |
secondSourceExtra | IN | An additional pointer to the second document source; used for the buffer length pointer. |
hashLevel | IN | 1-based depth (counting from the root), where hashing should be used for subtrees. Values less than or equal to 1 indicate no hashing. This value must be specified programmatically. The hash value for every element node is associated with the entire subtree rooted at that node. During the computation of the diff, there is no further drilling down into the tree beyond hash level depth. |
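The subtree hashing described for hashLevel can be pictured with a small sketch. The node struct, the function names, and the use of FNV-1a are all assumptions made for illustration; the text does not say which hash function the XDK uses or how it represents nodes internally.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical minimal tree node; not an XDK type. */
typedef struct node {
    const char        *name;
    const struct node *first_child;
    const struct node *next_sibling;
} node;

/* Fold `len` bytes into a running 32-bit FNV-1a hash. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    while (len--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

/* One hash value per node that summarizes the whole subtree rooted there:
 * the node's own name plus, in order, the hashes of its children. */
uint32_t subtree_hash(const node *n)
{
    uint32_t h = 2166136261u;              /* FNV-1a offset basis */
    const node *c;

    h = fnv1a(h, n->name, strlen(n->name) + 1);   /* NUL acts as separator */
    for (c = n->first_child; c != NULL; c = c->next_sibling) {
        uint32_t child = subtree_hash(c);
        h = fnv1a(h, &child, sizeof child);
    }
    return h;
}
```

In this model, two subtrees with equal hashes would be treated as identical without visiting their descendants, which is the sense in which the comparison does not drill down past the hash level.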
Destroys a document created by XmlCreateDocument or through one of the Load functions. Releases all resources associated with the document, which is then invalid.
void XmlFreeDocument( xmlctx *xctx, xmldocnode *doc);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
doc | IN | document to free |
Returns the data encoding in use by the XML context. Ordinarily, the data encoding is chosen by the user, so this function is not needed. However, if the data encoding is not specified and is allowed to default, this function can be used to return the name of that default encoding.
oratext *XmlGetEncoding( xmlctx *xctx);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
(oratext *)
name of data encoding
Determines if a DOM feature is implemented. Returns TRUE if the feature is implemented in the specified version, FALSE otherwise.
In level 1, the legal values for package are 'HTML' and 'XML' (case-insensitive), and the version is the string "1.0". If the version is not specified, supporting any version of the feature will cause the method to return TRUE.
DOM 1.0 features are "XML" and "HTML".
DOM 2.0 features are "Core", "XML", "HTML", "Views", "StyleSheets", "CSS", "CSS2", "Events", "UIEvents", "MouseEvents", "MutationEvents", "HTMLEvents", "Range", "Traversal".
boolean XmlHasFeature( xmlctx *xctx, oratext *feature, oratext *version);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
feature | IN | package name of the feature to test |
version | IN | version number of the package name to test |
(boolean)
feature is implemented?
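The level 1 matching rule above (case-insensitive package name, version "1.0", unspecified version matches any) can be modeled in a few lines. This is only a sketch of the rule as described; level1_has_feature and same_name are hypothetical names, and the code says nothing about which features a given XDK build actually reports.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* ASCII case-insensitive string equality. */
int same_name(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Model of the DOM level 1 rule: legal packages are "XML" and "HTML",
 * the only version is "1.0", and a NULL version matches any version. */
int level1_has_feature(const char *feature, const char *version)
{
    if (feature == NULL)
        return 0;
    if (!same_name(feature, "XML") && !same_name(feature, "HTML"))
        return 0;
    return version == NULL || strcmp(version, "1.0") == 0;
}
```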
Returns a flag saying whether the context's data encoding is "simple", that is, one byte for each character, like ASCII or EBCDIC.
boolean XmlIsSimple( xmlctx *xctx);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
(boolean)
TRUE if data encoding is "simple", FALSE otherwise
Returns a flag saying whether the context's data encoding is Unicode (UTF-16), with two bytes for each character.
boolean XmlIsUnicode( xmlctx *xctx);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
(boolean)
TRUE if data encoding is Unicode, FALSE otherwise
Loads (parses) an XML document from an input source and creates a DOM. The root document node is returned on success, or NULL on failure (with err set).
The function takes two fixed arguments, the xmlctx and an error return code, then zero or more (property, value) pairs, then NULL.
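The calling convention of fixed arguments followed by (property, value) pairs and a terminating NULL is the standard C varargs idiom. The sketch below shows how such a list can be walked with stdarg; find_property is a hypothetical helper, not an XDK function, and it assumes every value is a C string, whereas the real properties carry values of different types.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scans a NULL-terminated list of (name, value) pairs
 * and returns the value paired with `wanted`, or NULL if it is absent.
 * Assumption: all values are strings. */
const char *find_property(const char *wanted, ...)
{
    va_list ap;
    const char *name;
    const char *found = NULL;

    va_start(ap, wanted);
    while ((name = va_arg(ap, const char *)) != NULL) {
        /* Each name is always followed by exactly one value. */
        const char *value = va_arg(ap, const char *);
        if (strcmp(name, wanted) == 0) {
            found = value;
            break;
        }
    }
    va_end(ap);
    return found;
}
```

A call such as find_property("file", "file", "doc.xml", "validate", "TRUE", (const char *)NULL) would return "doc.xml". The terminator is cast explicitly because a bare NULL passed through varargs is not guaranteed to have pointer type on every platform.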
SOURCE
Input source is set by one of the following mutually exclusive properties (choose one):
("uri", document URI) [compiler encoding]
("file", document filesystem path) [compiler encoding]
("buffer", address of buffer, "buffer_length", # bytes in buffer)
("stream", address of stream object, "stream_context", pointer to stream object's context)
("stdio", FILE* stream)
PROPERTIES
Additional properties:
("dtd", DTD node) DTD for document.
("base_uri", document base URI) For documents loaded from sources other than a URI, sets the effective base URI. The document's base URI is needed in order to resolve relative URIs in includes, imports, and so on.
("input_encoding", encoding name) Forced input encoding [name].
("default_input_encoding", encoding name) Default input encoding to assume if the document is not self-describing (no BOM, protocol header, XMLDecl, and so on).
("schema_location", string) schemaLocation of the schema for this document. Used to determine the optimal layout when loading documents into a database.
("validate", boolean) When TRUE, turns on DTD validation; by default, only well-formedness is checked. Note that schema validation is handled separately.
("discard_whitespace", boolean) When TRUE, formatting whitespace between elements (newlines and indentation) in input documents is discarded. By default, all input characters are preserved.
("dtd_only", boolean) When TRUE, parses an external DTD, not a complete XML document.
("stop_on_warning", boolean) When TRUE, warnings are treated the same as errors and cause parsing, validation, and so on, to stop immediately. By default, warnings are issued but processing continues.
("warn_duplicate_entity", boolean) When TRUE, entities which are declared more than once cause warnings to be issued. The default is to accept the first declaration and silently ignore the rest.
("no_expand_char_ref", boolean) When TRUE, causes character references to be left unexpanded in the DOM data. Ordinarily, character references are replaced by the characters they represent; however, when the document is saved, those character references do not reappear. The way to ensure they remain through load and save is to not expand them.
("no_check_chars", boolean) When TRUE, omits the test of the XML [2] Char production: all input characters are accepted as valid.
xmldocnode *XmlLoadDom(
xmlctx *xctx,
xmlerr *err,
list);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
err | OUT | returned error code |
list | IN | NULL-terminated list of variable arguments |
(xmldocnode *)
document node on success [NULL on failure with err set]
See Also:
XmlSaveDom()
Loads (parses) an XML document from an input source and generates a set of SAX events (as user callbacks). Input sources and the basic set of properties are the same as for XmlLoadDom.
xmlerr XmlLoadSax(
xmlctx *xctx,
xmlsaxcb *saxcb,
void *saxctx,
list);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
saxcb | IN | SAX callback structure |
saxctx | IN | context passed to SAX callbacks |
list | IN | NULL-terminated list of variable arguments |
(xmlerr)
numeric error code, XMLERR_OK [0] on success
Loads (parses) an XML document from an input source and generates a set of SAX events (as user callbacks). Input sources and the basic set of properties are the same as for XmlLoadDom. The property list is passed as a va_list.
xmlerr XmlLoadSaxVA( xmlctx *xctx, xmlsaxcb *saxcb, void *saxctx, va_list va);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
saxcb | IN | SAX callback structure |
saxctx | IN | context passed to SAX callbacks |
va | IN | NULL-terminated list of variable arguments |
(xmlerr)
numeric error code, XMLERR_OK [0] on success
Serializes document or subtree to the given destination and returns the number of bytes written; if no destination is provided, just returns formatted size but does not output.
If an output encoding is specified, the document will be re-encoded on output; otherwise, it will be in its existing encoding.
The top level is indented step*level spaces, the next level step*(level+1) spaces, and so on.
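The indentation rule amounts to a single multiplication. The sketch below restates it in C; indent_width and indent_line are hypothetical helpers written for illustration, with depth counting levels below the node being saved (0 for the top level).

```c
#include <stdio.h>

/* Leading spaces for a node `depth` levels below the saved root:
 * step * level at the top, step * (level + 1) one level down, and so on. */
unsigned indent_width(unsigned step, unsigned level, unsigned depth)
{
    return step * (level + depth);
}

/* Writes `text` into `out`, preceded by the computed indentation.
 * Returns what snprintf returns: the length the full line would have. */
int indent_line(char *out, size_t cap, unsigned step, unsigned level,
                unsigned depth, const char *text)
{
    /* "%*s" with an empty string pads with exactly `width` spaces. */
    return snprintf(out, cap, "%*s%s",
                    (int)indent_width(step, level, depth), "", text);
}
```

With an indent step of 4 and an initial level of 0, a child of the top-level node is indented 4 spaces and a grandchild 8.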
When saving to a buffer, if the buffer overflows, 0 is returned and err is set to XMLERR_SAVE_OVERFLOW.
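The return convention described above (size only when there is no destination, 0 plus an error code on overflow) suggests a two-pass calling pattern: ask for the size, allocate, then save. The following stand-in models only that convention. save_bytes and the SAVE_ constants are hypothetical and are not XDK names, and the sketch counts bytes without a terminating NUL, which is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in error codes for the sketch; not XDK values. */
enum { SAVE_OK = 0, SAVE_OVERFLOW = 1 };

/* Mimics the described convention for an already-serialized `text`:
 *  - buf == NULL: nothing is written, the required size is returned;
 *  - buffer too small: returns 0 and sets *err to SAVE_OVERFLOW;
 *  - otherwise: copies the bytes and returns how many were written. */
size_t save_bytes(const char *text, char *buf, size_t buflen, int *err)
{
    size_t need = strlen(text);

    *err = SAVE_OK;
    if (buf == NULL)
        return need;
    if (need > buflen) {
        *err = SAVE_OVERFLOW;
        return 0;
    }
    memcpy(buf, text, need);
    return need;
}
```

Because 0 is also a legitimate size for empty output, a caller following this pattern should check the error code rather than the return value alone.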
DESTINATION
Output destination is set by one of the following mutually exclusive properties (choose one):
("uri", document URI) POST, PUT? [compiler encoding]
("file", document filesystem path) [compiler encoding]
("buffer", address of buffer, "buffer_length", # bytes in buffer)
("stream", address of stream object, "stream_context", pointer to stream object's context)
PROPERTIES
Additional properties:
("output_encoding", encoding name) Name of final encoding for document. Unless specified, the saved document will be in the same encoding as the xmlctx.
("indent_step", unsigned) Spaces to indent each level of output. Default is 4; 0 means no indentation.
("indent_level", unsigned) Initial indentation level. Default is 0, which means no indentation, flush left.
("xmldecl", boolean) Include an XMLDecl in the output document. Ordinarily an XMLDecl is output for a complete document (root node is DOC).
("bom", boolean) Include a BOM in the output document. Usually the BOM is only needed for certain encodings (UTF-16), and optional for others (UTF-8). Causes optional BOMs to be output.
("prune", boolean) Prunes the output like the Unix 'find' command; does not descend to children, just prints the one node given.
ubig_ora XmlSaveDom(
xmlctx *xctx,
xmlerr *err,
xmlnode *root,
list);
Parameter | In/Out | Description |
---|---|---|
xctx | IN | XML context |
err | OUT | error code on failure |
root | IN | root node or subtree to save |
list | IN | NULL-terminated list of variable arguments |
(ubig_ora)
number of bytes written to destination
See Also:
XmlLoadDom()
Returns the version string for the XDK.
oratext *XmlVersion();
(oratext *)
version string