(in-package :web-user)

CL-XML: How To: Access Document Components

20040209
james anderson (c)2004,




functional accessors

(defVar *dm*)

here is how to use cl-xml to get at components in documents. to begin, one needs a document:

(setq *dm* (parse-document #4P"xml:documentation;howto;howto.xml"))

once the document is parsed, one can use standard common-lisp operators together with the document component access functions to extract components. for example, to print every attribute named "name" on the element children of the document's root element, one combines standard operators to iterate over the child list, select the element children while ignoring the character data, and find the attribute named "name".

(dolist (child (children (root *dm*)))
  (typecase child
    (elem-node
     (print (find '||::name (attributes child) :key #'name)))))

to print the attribute value rather than the node itself, one applies the value operator to the attribute node.

(dolist (element (children (root *dm*)))
  (typecase element
    (elem-node
     (print (value (find '||::name (attributes element) :key #'name))))))
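
the traversal pattern above relies only on standard operators, so it can be sketched without cl-xml. the following is a minimal, self-contained illustration under stated assumptions: demo-element and demo-attribute are hypothetical stand-in structures, not cl-xml classes, and child-attribute-values is a name introduced here. it collects the values instead of printing them, and it guards against elements which lack the attribute, since find returns nil when nothing matches.

```lisp
;; hypothetical stand-ins for document nodes; these are NOT cl-xml classes.
(defstruct demo-attribute name value)
(defstruct demo-element name attributes children)

(defun child-attribute-values (parent attribute-name)
  "collect the values of ATTRIBUTE-NAME from the element children of PARENT,
skipping character data (strings) and elements which lack the attribute."
  (let ((collected '()))
    (dolist (child (demo-element-children parent) (nreverse collected))
      (typecase child
        (demo-element
         (let ((attribute (find attribute-name (demo-element-attributes child)
                                :key #'demo-attribute-name)))
           ;; find returns nil when no attribute matches, so test before reading the value
           (when attribute
             (push (demo-attribute-value attribute) collected))))))))

;; a small stand-in document: two sections with a name, whitespace strings, one bare element
(defparameter *demo-root*
  (make-demo-element
   :name 'inventory
   :children (list " "
                   (make-demo-element
                    :name 'section
                    :attributes (list (make-demo-attribute :name 'name :value "health")))
                   " "
                   (make-demo-element
                    :name 'section
                    :attributes (list (make-demo-attribute :name 'name :value "food")))
                   (make-demo-element :name 'note))))

(child-attribute-values *demo-root* 'name)
```

with the stand-in document, the strings and the element without a name attribute are skipped, and the call returns the two section names in document order.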

cl-xml defines several kinds of functions which facilitate this kind of access to document components. the simplest are utility functions which combine traversal, type restrictions, and matching. these include the operators used in the examples which follow: ./, ./*, .//*, and ./@-string.

for example, since the root element of the howto document is named {}inventory,

(name (./ *dm* '||::inventory))
== ||::|inventory|
the top-level children can be enumerated by selecting with a wild-card name, as in
(mapcar #'name (./* (./ *dm* '||::inventory) '||::*))
== (||::section ||::section),

and all of the elements in the document are collected by

(mapcar #'name (.//* *dm* '||::*))
== (||::inventory ||::section ||::item ||::name ||::price ||::description ||::item ||::name ||::price ||::description ||::section ||::item ||::name ||::price ||::description ||::item ||::name ||::price ||::description)
if one needed a list of all of the name or upc attribute values in the document, one could combine a selection filter with the navigation operation:
(remove "" (mapcar #'(lambda (e) (./@-string e '||::name)) (.//* *dm* '||::*)) :test #'equal)
== ("health" "food")
(remove "" (mapcar #'(lambda (e) (./@-string e '||::upc)) (.//* *dm* '||::*)) :test #'equal)
== ("123456789" "445322344" "485672034" "132957764")
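
the filter step in the two expressions above is plain common-lisp and can be tried on its own. the sketch below assumes, as the remove "" form suggests but the text does not state, that ./@-string yields an empty string for an element without the attribute. *attribute-strings* is a hypothetical stand-in for the mapped results, and both function names are introduced here for illustration.

```lisp
;; stand-in for the strings produced by mapping ./@-string over all elements;
;; the empty strings represent elements without the attribute (an assumption).
(defparameter *attribute-strings* (list "" "health" "" "" "food" ""))

(defun non-empty-strings (strings)
  "drop empty strings. :test #'equal compares string contents;
the default test, eql, compares object identity and is not reliable for strings."
  (remove "" strings :test #'equal))

(defun non-empty-strings-by-length (strings)
  "an equivalent filter which does not depend on a comparison test."
  (remove-if #'(lambda (s) (zerop (length s))) strings))

(non-empty-strings *attribute-strings*)
(non-empty-strings-by-length *attribute-strings*)
```

both calls leave the list of non-empty strings in their original order. the :test #'equal argument in the original expressions matters for the same reason it does here.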

for more complex selection and combination operations, one can use binding macros, like destructure-element, to designate components by name, position, or relation and bind them to variables. this approach can be used to implement simple filters, for example, to transform the original howto document into a simple account of the items in each section:

(pprint
 (destructure-element ((root-gi ((||::title title))) &rest root-children)
     (root *dm*)
   (LIST* (LIST* root-gi (LIST (LIST* '||::title (value title))))
          (mapcar #'(lambda (section)
                      (destructure-element ((section-gi ((||::name name))) &rest section-children)
                          section
                        (LIST* (LIST* section-gi (LIST (LIST* '||::name (value name))))
                               (mapcar #'(lambda (item)
                                           (destructure-element ((gi ((||::upc upc)))) item
                                             (LIST* gi (LIST (value upc)))))
                                       (remove-if #'stringp section-children)))))
                  (remove-if #'stringp root-children)))))
== ((||::inventory (||::title . "OmniCorp Store #45x10^3"))
          ((||::section (||::name . "health")) (||::item "123456789")
           (||::item "445322344"))
          ((||::section (||::name . "food")) (||::item "485672034")
           (||::item "132957764")))
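
the shape of this result follows from how list* is used in the transformation: a string as the last argument produces a dotted pair, while a list as the last argument produces a proper list. the sketch below reproduces the shapes with ordinary symbols and made-up literal values. it does not call cl-xml, and the variable names are introduced here for illustration only.

```lisp
;; list* with two arguments behaves like cons:
;; a string tail gives a dotted pair, as in the title and name entries
(defparameter *title-pair* (list* 'title "some title"))

;; a list tail gives a proper list, as in the item entries
(defparameter *item-entry* (list* 'item (list "123456789")))

;; a section entry: the head is (section (name . "health")), the tail is the list of items
(defparameter *section-entry*
  (list* (list* 'section (list (list* 'name "health")))
         (list *item-entry*
               (list* 'item (list "445322344")))))

(pprint *section-entry*)
```

the section entry has the same structure as the section entries in the output above: a head which pairs the element name with its attribute, followed by one two-element list per item.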
:eof