doxml Manual Version 0.5: Concepts

$Id: concepts.html,v 1.11 1999/07/24 22:04:04 francis Exp $

Documents

An XML document is represented in doxml by a doxml_document. It is constructed by the doxml parser: pass a context to doxml_parse(), and it will return a freshly created doxml_document representing the parsed document. For more details, see the syntax details.

Atoms

The central abstraction in doxml is the atom. An atom is a struct that represents either an XML element or a run of plain text (that is, text with no elements embedded in it). If the atom is an XML element, it may contain child atoms, attributes, or both.
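To make the element-or-text distinction concrete, here is a minimal sketch of how such a node could be laid out in C. It is not doxml's actual atom definition: every name in it (sketch_atom, sketch_child_count, sketch_text_length, and the field names) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout: this is NOT doxml's real atom struct. */
enum sketch_atom_kind { SKETCH_ELEMENT, SKETCH_TEXT };

struct sketch_atom {
    enum sketch_atom_kind kind;
    const char *name;                 /* tag name; NULL for text atoms */
    const char *text;                 /* character data; NULL for elements */
    struct sketch_atom *first_child;  /* first child; elements only */
    struct sketch_atom *next_sibling; /* next atom under the same parent */
};

/* Count the direct children of an atom (always 0 for a text atom). */
static size_t sketch_child_count(const struct sketch_atom *atom)
{
    size_t n = 0;
    const struct sketch_atom *c;

    assert(atom != NULL);
    if (atom->kind != SKETCH_ELEMENT)
        return 0;
    for (c = atom->first_child; c != NULL; c = c->next_sibling)
        n++;
    return n;
}

/* Sum the lengths of all text atoms at or below the given atom. */
static size_t sketch_text_length(const struct sketch_atom *atom)
{
    size_t total = 0;
    const struct sketch_atom *c;

    assert(atom != NULL);
    if (atom->kind == SKETCH_TEXT)
        return atom->text != NULL ? strlen(atom->text) : 0;
    for (c = atom->first_child; c != NULL; c = c->next_sibling)
        total += sketch_text_length(c);
    return total;
}
```

Under this layout, a fragment such as <p>Hello <b>world</b></p> would become a p element with two children: a text atom and a b element that itself holds one text atom.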

Atoms are constructed by the doxml parser: pass a context to doxml_parse(), and it will return a freshly created atom representing the parsed document.

Attributes

An XML element may have one or more attributes. For example, the element

<a href="foo.html">
has one attribute, whose name is href and whose value is "foo.html".
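As an illustration of the name/value pairing, the sketch below stores attributes as a singly linked list and looks one up by name. The type and function names (sketch_attribute, sketch_attribute_value) are hypothetical and are not part of the doxml API; the only XML fact it relies on is that attribute names are case-sensitive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute record; doxml's own representation may differ. */
struct sketch_attribute {
    const char *name;               /* e.g. "href" */
    const char *value;              /* e.g. "foo.html", without the quotes */
    struct sketch_attribute *next;  /* next attribute on the same element */
};

/*
 * Return the value of the attribute called `name`, or NULL if the list
 * holds no such attribute. The comparison is case-sensitive, as XML
 * names are.
 */
static const char *sketch_attribute_value(const struct sketch_attribute *attrs,
                                          const char *name)
{
    assert(name != NULL);
    for (; attrs != NULL; attrs = attrs->next) {
        if (strcmp(attrs->name, name) == 0)
            return attrs->value;
    }
    return NULL;
}
```

For the element above, a lookup of "href" would yield "foo.html", while a lookup of "HREF" would yield NULL.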

For more details, see the syntax details.

Namespaces

Namespaces are a way of defining sets of tag names independently of each other, so that they can be used in the same document without name collisions.

For more details, see the syntax details.

Processing Instructions

A processing instruction (PI) is a piece of out-of-band information embedded in the XML document; it provides extra hints to the application. Why this is necessary is not actually clear to me; I suspect it was something that some members of the working group wanted, and the rest could not come up with a strong reason against. For more details, see the syntax details.

Contexts

A context represents an input/parsing session; it can be thought of as an input stream amalgamated with working data for the parser. The input stream, however, may be implemented via various techniques; the current library provides contexts that read from files, FILEs, and strings. A context is obtained with one of the doxml_open_*() functions and closed via doxml_close().
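The idea of an input stream combined with parser working data can be sketched as a struct holding a read callback, an opaque source pointer, and some bookkeeping. This is only an illustration of the concept, not doxml's implementation: the names sketch_context, sketch_open_string, sketch_open_stdio, sketch_getc, and sketch_close are all hypothetical, and the line counter stands in for whatever working data the real parser keeps.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical context type: NOT the real doxml context. */
struct sketch_context {
    int (*next_char)(struct sketch_context *ctx); /* returns EOF at end */
    void *source;        /* a FILE * or a const char *, set by the opener */
    size_t pos;          /* read offset; used by the string source only */
    unsigned long line;  /* working data: current line number */
};

/* Read callback for an in-memory, NUL-terminated string. */
static int sketch_string_next(struct sketch_context *ctx)
{
    const char *s = ctx->source;

    if (s[ctx->pos] == '\0')
        return EOF;
    return (unsigned char)s[ctx->pos++];
}

/* Read callback for a stdio stream. */
static int sketch_stdio_next(struct sketch_context *ctx)
{
    return getc((FILE *)ctx->source);
}

/* Shared constructor used by both openers; returns NULL on failure. */
static struct sketch_context *sketch_new(int (*next)(struct sketch_context *),
                                         void *source)
{
    struct sketch_context *ctx = malloc(sizeof *ctx);

    if (ctx == NULL)
        return NULL;
    ctx->next_char = next;
    ctx->source = source;
    ctx->pos = 0;
    ctx->line = 1;
    return ctx;
}

static struct sketch_context *sketch_open_string(const char *s)
{
    assert(s != NULL);
    return sketch_new(sketch_string_next, (void *)s);
}

static struct sketch_context *sketch_open_stdio(FILE *fp)
{
    assert(fp != NULL);
    return sketch_new(sketch_stdio_next, fp);
}

/* Fetch one character and keep the line counter up to date. */
static int sketch_getc(struct sketch_context *ctx)
{
    int c = ctx->next_char(ctx);

    if (c == '\n')
        ctx->line++;
    return c;
}

/* Release the context; the caller still owns the string or FILE. */
static void sketch_close(struct sketch_context *ctx)
{
    free(ctx);
}
```

With this shape, a parser only ever calls sketch_getc() and never needs to know which kind of source sits behind the context, which is one plausible reason for bundling the stream and the working data together.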