$Id: atom.html,v 1.12 1999/07/24 22:04:04 francis Exp $
An atom is represented in the API
by the datatype doxml_atom. Its definition is as follows:
typedef struct {
    const char* name;
    const char* space;
    const char* prefix;
} doxml_name;
typedef enum {
    doxml_atom_text,
    doxml_atom_element,
    doxml_atom_pi,
    doxml_atom_external_entity
} doxml_atom_type;
struct doxml_atom_tag {
    /* The atom represents a sequence of text, an XML element, a
     * processing instruction, or an external entity. type says
     * which member of data is valid.
     */
    doxml_atom_type type;
    union {
        struct {
            doxml_name name;
            struct {
                doxml_attribute* first;
                doxml_attribute* last;
            } attrs;
            struct {
                doxml_atom* first;
                doxml_atom* last;
            } atoms;
        } element;
        struct {
            const char* text;
        } text;
        struct {
            doxml_name name;
            const char* value;
        } pi;
        struct {
            const doxml_markupdecl* decl;
        } external_entity;
    } data;
    /* parent points to the atom's enclosing element. If parent is
     * NULL, then the atom is the document's root element.
     */
    doxml_atom* parent;
    doxml_atom* next;
    doxml_atom* prev;
};
So, if a points to a doxml_atom, then we
can tell what kind of atom a represents (and, hence,
which field of the union is valid)
by examining a->type.
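The dispatch on a->type can be sketched as follows. This is only an illustration: the three typedefs at the top are stand-ins so that the fragment compiles on its own (the real declarations of doxml_atom, doxml_attribute and doxml_markupdecl are not shown on this page), and atom_label() is a hypothetical helper, not part of the API.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations (assumptions); real code would use the
 * library's own header instead. */
typedef struct doxml_atom_tag doxml_atom;
typedef struct doxml_attribute_tag doxml_attribute;   /* left opaque */
typedef struct doxml_markupdecl_tag doxml_markupdecl; /* left opaque */

typedef struct {
    const char* name;
    const char* space;
    const char* prefix;
} doxml_name;

typedef enum {
    doxml_atom_text,
    doxml_atom_element,
    doxml_atom_pi,
    doxml_atom_external_entity
} doxml_atom_type;

struct doxml_atom_tag {
    doxml_atom_type type;
    union {
        struct {
            doxml_name name;
            struct { doxml_attribute* first; doxml_attribute* last; } attrs;
            struct { doxml_atom* first; doxml_atom* last; } atoms;
        } element;
        struct { const char* text; } text;
        struct { doxml_name name; const char* value; } pi;
        struct { const doxml_markupdecl* decl; } external_entity;
    } data;
    doxml_atom* parent;
    doxml_atom* next;
    doxml_atom* prev;
};

/* Hypothetical helper: returns a short label for an atom, reading
 * only the union member that a->type declares valid. */
const char* atom_label(const doxml_atom* a)
{
    assert(a != NULL);
    switch (a->type) {
    case doxml_atom_text:
        return a->data.text.text;            /* the text itself */
    case doxml_atom_element:
        return a->data.element.name.name;    /* element name */
    case doxml_atom_pi:
        return a->data.pi.name.name;         /* PI target */
    case doxml_atom_external_entity:
        return "(external entity)";          /* decl is opaque here */
    }
    return "(unknown)";
}
```

Reading any member other than the one selected by type (for example data.pi on a text atom) would interpret unrelated bytes, which is why every access goes through the switch.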
In addition, the atom has pointers to other atoms to provide the
tree structure of the document: parent points to the atom
that encloses this atom (if any); next and
prev point to the atom's next and previous siblings.
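A possible way to walk this structure is sketched below. It assumes that an element's child list ends with a NULL next pointer and that an empty element has a NULL first pointer; neither detail is stated on this page. The struct is a cut-down stand-in that keeps only the fields the sketch touches, and count_atoms() and atom_depth() are hypothetical helpers, not part of the API.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the doxml declarations (assumption): same
 * field names as the full definition, unused members omitted. */
typedef struct doxml_atom_tag doxml_atom;

typedef enum {
    doxml_atom_text,
    doxml_atom_element,
    doxml_atom_pi,
    doxml_atom_external_entity
} doxml_atom_type;

struct doxml_atom_tag {
    doxml_atom_type type;
    union {
        struct {
            struct { doxml_atom* first; doxml_atom* last; } atoms;
        } element;
        struct { const char* text; } text;
    } data;
    doxml_atom* parent;
    doxml_atom* next;
    doxml_atom* prev;
};

/* Hypothetical helper: counts a itself plus every atom below it.
 * Only elements have children, so only they are descended into. */
size_t count_atoms(const doxml_atom* a)
{
    size_t n = 1;
    assert(a != NULL);
    if (a->type == doxml_atom_element) {
        const doxml_atom* child;
        /* Assumes the sibling chain is NULL-terminated. */
        for (child = a->data.element.atoms.first; child != NULL;
             child = child->next)
            n += count_atoms(child);
    }
    return n;
}

/* Hypothetical helper: number of enclosing elements, found by
 * following parent until it is NULL. */
size_t atom_depth(const doxml_atom* a)
{
    size_t depth = 0;
    assert(a != NULL);
    while (a->parent != NULL) {
        a = a->parent;
        depth++;
    }
    return depth;
}
```

The recursion depth of count_atoms() equals the nesting depth of the document, so a very deeply nested input may call for an explicit stack instead.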
doxml_atoms are constructed by the function
doxml_parse(), which parses an
XML document and returns a doxml_document,
which in turn contains atoms.
doxml_atoms are
destroyed by the function
doxml_delete_atom().
Note: it is not guaranteed that the
The doxml_name struct encapsulates a name with an
optional namespace and namespace prefix. (The prefix is retained
because namespaces in the DTD are matched only by prefix, since
there's no opportunity to get a namespace definition that early.)
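The two ways of matching names can be sketched as follows. This is an illustration under stated assumptions: an absent namespace or prefix is taken to be a NULL pointer (the page does not say how absence is represented), and same_str(), name_eq_by_space() and name_eq_by_prefix() are hypothetical helpers, not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char* name;
    const char* space;
    const char* prefix;
} doxml_name;

/* NULL-safe string equality: two NULLs compare equal, a NULL and a
 * non-NULL string do not. */
static int same_str(const char* x, const char* y)
{
    if (x == NULL || y == NULL)
        return x == y;
    return strcmp(x, y) == 0;
}

/* Hypothetical: match as in the document body, by name and
 * namespace. The prefix is ignored. */
int name_eq_by_space(const doxml_name* a, const doxml_name* b)
{
    assert(a != NULL && b != NULL);
    return same_str(a->name, b->name) && same_str(a->space, b->space);
}

/* Hypothetical: match as in the DTD, by name and prefix, since no
 * namespace definition is available at that point. */
int name_eq_by_prefix(const doxml_name* a, const doxml_name* b)
{
    assert(a != NULL && b != NULL);
    return same_str(a->name, b->name) && same_str(a->prefix, b->prefix);
}
```

With these helpers, two names that share a namespace but use different prefixes are equal by namespace and unequal by prefix, while a DTD name with no resolved namespace can still be matched against a document name through its prefix.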