$Id: atom.html,v 1.12 1999/07/24 22:04:04 francis Exp $
An atom is represented in the API by the datatype doxml_atom. Its definition is as follows:
    typedef struct {
        const char* name;
        const char* space;
        const char* prefix;
    } doxml_name;

    typedef enum {
        doxml_atom_text,
        doxml_atom_element,
        doxml_atom_pi,
        doxml_atom_external_entity
    } doxml_atom_type;

    struct doxml_atom_tag {
        /* The atom represents either a sequence of text or an XML
         * element or processing instruction. */
        doxml_atom_type type;
        union {
            struct {
                doxml_name name;
                struct {
                    doxml_attribute* first;
                    doxml_attribute* last;
                } attrs;
                struct {
                    doxml_atom* first;
                    doxml_atom* last;
                } atoms;
            } element;
            struct {
                const char* text;
            } text;
            struct {
                doxml_name name;
                const char* value;
            } pi;
            struct {
                const doxml_markupdecl* decl;
            } external_entity;
        } data;

        /* parent points to the atom's enclosing element. If parent is
         * NULL, then the atom is the document's root element. */
        doxml_atom* parent;
        doxml_atom* next;
        doxml_atom* prev;
    };
So, if a points to a doxml_atom, then we can tell what kind of atom a represents (and, hence, which field of the union is valid) by examining a->type.
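As an illustration of dispatching on a->type, here is a minimal sketch. The function atom_label() is a hypothetical helper, not part of the API. The forward typedefs and the tag names doxml_attribute_tag and doxml_markupdecl_tag are stand-ins assumed here so that the fragment compiles on its own; real code would rely on the library's header instead.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins: opaque declarations so the sketch is self-contained.
 * The tag names for attributes and markup declarations are guesses. */
typedef struct doxml_attribute_tag doxml_attribute;
typedef struct doxml_markupdecl_tag doxml_markupdecl;
typedef struct doxml_atom_tag doxml_atom;

typedef struct {
    const char* name;
    const char* space;
    const char* prefix;
} doxml_name;

typedef enum {
    doxml_atom_text,
    doxml_atom_element,
    doxml_atom_pi,
    doxml_atom_external_entity
} doxml_atom_type;

struct doxml_atom_tag {
    doxml_atom_type type;
    union {
        struct {
            doxml_name name;
            struct { doxml_attribute* first; doxml_attribute* last; } attrs;
            struct { doxml_atom* first; doxml_atom* last; } atoms;
        } element;
        struct { const char* text; } text;
        struct { doxml_name name; const char* value; } pi;
        struct { const doxml_markupdecl* decl; } external_entity;
    } data;
    doxml_atom* parent;
    doxml_atom* next;
    doxml_atom* prev;
};

/* Hypothetical helper: returns a short label for an atom by reading
 * only the union member that a->type says is valid. */
const char* atom_label(const doxml_atom* a)
{
    assert(a != NULL);
    switch (a->type) {
    case doxml_atom_text:
        return a->data.text.text;          /* the text itself */
    case doxml_atom_element:
        return a->data.element.name.name;  /* element's local name */
    case doxml_atom_pi:
        return a->data.pi.name.name;       /* processing instruction name */
    case doxml_atom_external_entity:
        return "(external entity)";        /* decl is opaque in this sketch */
    }
    return "(unknown)";
}
```

The switch touches exactly one union member per case, so no field is read unless the type tag says it is the valid one.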
In addition, the atom has pointers to other atoms to provide the tree structure of the document: parent points to the atom that encloses this atom (if any); next and prev point to the atom's next and previous siblings.
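The tree pointers can be walked as in the following sketch. Both atom_depth() and atom_count() are hypothetical helpers, not API functions, and the forward typedefs are stand-ins with assumed tag names so the fragment compiles alone. The sketch also assumes that the last sibling's next pointer is NULL and that an element with no children has atoms.first set to NULL; the text above does not state either.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins so the sketch is self-contained. */
typedef struct doxml_attribute_tag doxml_attribute;
typedef struct doxml_markupdecl_tag doxml_markupdecl;
typedef struct doxml_atom_tag doxml_atom;

typedef struct {
    const char* name;
    const char* space;
    const char* prefix;
} doxml_name;

typedef enum {
    doxml_atom_text,
    doxml_atom_element,
    doxml_atom_pi,
    doxml_atom_external_entity
} doxml_atom_type;

struct doxml_atom_tag {
    doxml_atom_type type;
    union {
        struct {
            doxml_name name;
            struct { doxml_attribute* first; doxml_attribute* last; } attrs;
            struct { doxml_atom* first; doxml_atom* last; } atoms;
        } element;
        struct { const char* text; } text;
        struct { doxml_name name; const char* value; } pi;
        struct { const doxml_markupdecl* decl; } external_entity;
    } data;
    doxml_atom* parent;
    doxml_atom* next;
    doxml_atom* prev;
};

/* Hypothetical helper: number of enclosing elements, found by
 * following parent until it is NULL (the root has depth 0). */
size_t atom_depth(const doxml_atom* a)
{
    size_t depth = 0;
    assert(a != NULL);
    while (a->parent != NULL) {
        depth++;
        a = a->parent;
    }
    return depth;
}

/* Hypothetical helper: counts the atom itself plus all descendants.
 * Only elements have children; siblings are visited through next.
 * Assumes the sibling list is NULL-terminated. */
size_t atom_count(const doxml_atom* a)
{
    size_t n = 1;
    assert(a != NULL);
    if (a->type == doxml_atom_element) {
        const doxml_atom* child;
        for (child = a->data.element.atoms.first; child != NULL; child = child->next)
            n += atom_count(child);
    }
    return n;
}
```

Recursion keeps the sketch short; for very deep documents an iterative walk using parent and next would avoid growing the call stack.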
doxml_atoms are constructed by the function doxml_parse(), which parses an XML document and returns a doxml_document, which in turn contains atoms. doxml_atoms are destroyed by the function doxml_delete_atom().
Note: it is not guaranteed that the
The doxml_name struct encapsulates a name with an optional namespace and namespace prefix. (The prefix is retained because namespaces in the DTD are matched only by prefix, since there's no opportunity to get a namespace definition that early.)
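To show how the three fields of doxml_name fit together, here is a small sketch that rebuilds the qualified name as it would appear in the document. format_qname() is a hypothetical helper, not part of the API. It assumes that a missing prefix is stored as NULL or as an empty string, which the description above does not specify, and it uses snprintf(), which requires a C99 library.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    const char* name;    /* local name */
    const char* space;   /* namespace, if any */
    const char* prefix;  /* namespace prefix, if any */
} doxml_name;

/* Hypothetical helper: writes "prefix:name", or just "name" when no
 * prefix is present, into buf. Returns what snprintf() returns: the
 * length the full string needs, excluding the terminating NUL.
 * Assumes an absent prefix is NULL or "". */
int format_qname(const doxml_name* n, char* buf, size_t size)
{
    assert(n != NULL && n->name != NULL);
    if (n->prefix != NULL && n->prefix[0] != '\0')
        return snprintf(buf, size, "%s:%s", n->prefix, n->name);
    return snprintf(buf, size, "%s", n->name);
}
```

The namespace field is deliberately unused here: the qualified name in the document text depends only on the prefix and the local name.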