One of the main selling points of a database engine is the ability to make declarative queries (such as SQL) that specify what should be done while leaving the engine to choose the best way of fulfilling the request. Today we will look at XPath, a declarative syntax for navigating XML documents. The key observation is that XML, with its tree layout, has exactly one path from the root to each other element in the document. Intuitively, the structure is very similar to a file system path or web URL, and XPath adopts a similar notation. Conceptually, a path identifies a projection of the document (returning only certain types of elements). On top of that, XPath adds support for selection (filtering out unwanted elements) and basic types of aggregation (count, sum, etc.). The result is a language that, while nowhere near as expressive as RA/SQL, is still quite handy for distilling useful bits of information from large XML files.

XPath is a fairly intuitive language, requiring only a few basic concepts to understand well. The base unit of an XPath expression is a path step (a "location step" in the standard). Each step identifies an axis to move along (see slides that follow), an element name to move to, and an optional predicate to filter out unwanted paths. XPath is nested: predicates can themselves contain XPath expressions (which can in turn contain predicates). We will go over each of these concepts in detail.

At each step of an XPath query, we can imagine a pointer in the tree that specifies the current location from which the next step will be taken. This location is known as the context node. There are a number of different directions we might move in, called axes. The default axis is to move from the context node to a child named in the path step. Two other well-known axes come from the file system world: the parent axis allows moving toward the root element (usually written using the shorthand notation "..") and the self axis makes the next step stay in the same place (shorthand is ".").
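The three axes just described can be tried out with Python's standard xml.etree.ElementTree module, which understands a limited subset of XPath. This is only a sketch: the sample document and its titles are made up for illustration, and paths are written relative to the root element because ElementTree does not accept absolute paths on an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names loosely follow the lecture's book list.
DOC = """
<book-list>
  <genre name="fiction">
    <book><title>Dune</title></book>
    <book><title>Emma</title></book>
  </genre>
</book-list>
"""

root = ET.fromstring(DOC)  # the context node is the book-list element

# Child axis (the default): each step moves to a named child.
titles = root.findall("genre/book/title")
print([t.text for t in titles])

# Self axis: "." leaves the context node where it is.
genres = root.findall("./genre")
print([g.get("name") for g in genres])

# Parent axis: ".." steps back toward the root (from each book to its genre).
parents = root.findall("genre/book/..")
print([p.tag for p in parents])
```

Note that the last query returns the genre element once, even though two books share it as a parent.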

Consider the XML tree specified here. Individual elements are labeled with their names, while triangles denote sub-trees of unknown size. The context node (a book) is shown in the center. It can be referred to in a step using the self axis. The only node on the parent axis is the genre node above it; the ancestor axis contains both genre and book-list (every element along the path from the root to the context node's own parent). The ancestor-or-self axis is exactly what it sounds like, and contains both the context node and its proper ancestors. Moving downward, the child, descendant, and descendant-or-self axes capture children, nodes below the context, and nodes at or below self. Sideways movement is also possible: the following axis includes every node whose opening and closing tags are both after the context node's closing tag (descendants of the context node are not included). The following-sibling axis refers to siblings of the context node that follow it (in other words, elements having the same parent that are found along the following axis). The preceding axis is similar to following, but includes all elements whose opening and closing tags both appear in the document before the context node's opening tag (ancestors are not included).
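ElementTree does not implement most of these axes, so the sketch below emulates a few of them by hand to make the definitions concrete. The helper functions and the sample document are hypothetical, written only for this illustration; a full XPath 1.0 engine provides the axes natively. For simplicity, results are listed in document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped like the tree on the slide.
DOC = """
<book-list>
  <genre name="fiction">
    <book id="b1"><title>A</title></book>
    <book id="b2"><title>B</title><author>X</author></book>
    <book id="b3"><title>C</title></book>
  </genre>
  <genre name="science">
    <book id="b4"><title>D</title></book>
  </genre>
</book-list>
"""
root = ET.fromstring(DOC)
parent = {child: p for p in root.iter() for child in p}  # child -> parent map
order = list(root.iter())                                # elements in document order

def ancestors(n):
    """ancestor axis: parent, grandparent, ... up to the root."""
    out = []
    while n in parent:
        n = parent[n]
        out.append(n)
    return out

def descendants(n):
    """descendant axis: everything below n (n itself excluded)."""
    return list(n.iter())[1:]

def following(n):
    """following axis: elements after n in document order, minus n's descendants."""
    below = set(n.iter())
    return [m for m in order[order.index(n) + 1:] if m not in below]

def preceding(n):
    """preceding axis: elements before n in document order, minus n's ancestors."""
    above = set(ancestors(n))
    return [m for m in order[:order.index(n)] if m not in above]

def following_siblings(n):
    """following-sibling axis: later children of n's parent."""
    sibs = list(parent[n]) if n in parent else []
    return sibs[sibs.index(n) + 1:]

ctx = root.find(".//book[@id='b2']")  # the context node
print([e.tag for e in ancestors(ctx)])
print([e.tag for e in descendants(ctx)])
print([e.tag for e in following(ctx)])
print([e.tag for e in preceding(ctx)])
print([e.get("id") for e in following_siblings(ctx)])
```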

There are also a few special axes that are used to access non-element information: attribute gives access to attributes of the context node, and namespace gives access to a node's namespace (namespaces are a feature of XML that allows for logical grouping of related elements; such tag names consist of a namespace:element pair). Note that a step along one of these special axes is normally the last step in a path expression (other than self steps), because attribute and namespace nodes have no children to descend into. XPath also defines several node tests that can follow the axis in a step: given the example XML snippet above, ::* selects only elements, ::text() selects only the textual contents of a node, and ::node() selects everything (text and elements). XPath accesses elements in strict document order (that's why the preceding and following axes are meaningful), and so each node has a position relative to its siblings, available by calling the position() function. Positioning is one-based.
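A rough illustration of attributes, text content, and one-based positions, again using ElementTree's XPath subset. The ids and titles are invented for the example. ElementTree exposes attributes and text through the .get() method and the .text property rather than through attribute:: or text() steps, so those parts are stand-ins for the real syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """
<book-list>
  <genre name="fiction">
    <book id="b1"><title>Dune</title></book>
    <book id="b2"><title>Emma</title></book>
  </genre>
  <genre name="science">
    <book id="b3"><title>Cosmos</title></book>
  </genre>
</book-list>
"""
root = ET.fromstring(DOC)

# Attribute access: stands in for a final attribute step such as .../book/@id.
ids = [b.get("id") for b in root.findall("genre/book")]
print(ids)

# Attributes can also appear inside a predicate.
emma = root.find(".//book[@id='b2']/title")
print(emma.text)  # .text plays the role of text()

# "*" matches any element child, whatever its name.
children = root.findall("genre/*")
print(len(children))

# Positions are one-based: book[1] is the first book under each genre.
firsts = [b.get("id") for b in root.findall("genre/book[1]")]
print(firsts)
```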

Several shorthand notations are available in XPath; some of the more commonly used ones are shown here.

One big difference from a file system: path names are not necessarily unique, and an XPath expression always returns a set of nodes (possibly empty or containing only one element). As far as the language is concerned, there could be several book elements, each with multiple title elements as children, and all would be returned by the example XPath queries shown here. A DTD might reasonably forbid books from having multiple titles, but that's an orthogonal matter.
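A small sketch of the "always a set" behaviour, using ElementTree and a made-up document in which one book carries two titles. ElementTree returns a Python list rather than a true set, but the point carries over: zero, one, or many nodes can match the same path.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the first book has two title children.
DOC = """
<book-list>
  <book><title>Dune</title><title>Dune (Deluxe Edition)</title></book>
  <book><title>Emma</title></book>
</book-list>
"""
root = ET.fromstring(DOC)

# Every title along any matching path is returned, not just one per book.
titles = root.findall("book/title")
print([t.text for t in titles])

# A path that matches nothing yields an empty result rather than an error.
editors = root.findall("book/editor")
print(editors)
```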

A given path expression returns a set of nodes (all nodes along any path that matches the one given); we can filter that set using predicates, which are given as boolean expressions inside square brackets. Here you can see several examples of the kinds of predicate expressions that can be used. It's VERY IMPORTANT to note that XPath applies existential quantification when a predicate treats a node set as a scalar value: empty sets are false and non-empty sets are true. The query /book-list/book[editor] thus returns all books having at least one editor. When comparing a node set to a scalar value, the engine uses the comparison to filter the set, then treats the predicate as false if the resulting set is empty. For example, consider the query /book-list/book[price < 50]. As long as at least one price smaller than 50 exists, the resulting node set is non-empty and the predicate is satisfied; if no price exists, or if all prices are at least 50, then the predicate's node set is empty and the predicate does not pass. Put another way, /book-list/book[editor] is shorthand for /book-list/book[count(editor) > 0] and /book-list/book[price < 50] is shorthand for /book-list/book[count(price[. < 50]) > 0]. To ask for a book whose prices are *all* under 50 (universal quantification) you have to invert the question and ask for books having no prices of 50 or more: /book-list/book[not(price >= 50)]
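The quantification rules can be mimicked in plain Python with any(). ElementTree handles the simple existence test [editor] itself, but it does not support numeric comparisons in predicates, so the two price queries are emulated by hand. The document, the ids, and the prices() helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document covering the interesting cases.
DOC = """
<book-list>
  <book id="cheap"><price>20</price><editor>Ann</editor></book>
  <book id="mixed"><price>40</price><price>80</price></book>
  <book id="dear"><price>90</price></book>
  <book id="unpriced"><editor>Bob</editor></book>
</book-list>
"""
root = ET.fromstring(DOC)
books = root.findall("book")

def prices(book):
    """All price children of a book, as numbers (possibly an empty list)."""
    return [float(p.text) for p in book.findall("price")]

# book[editor]: true when the editor node set is non-empty.
with_editor = [b.get("id") for b in root.findall("book[editor]")]

# book[price < 50]: true when at least one price is under 50.
some_under_50 = [b.get("id") for b in books if any(p < 50 for p in prices(b))]

# book[not(price >= 50)]: true when no price is 50 or more.
none_50_or_more = [b.get("id") for b in books
                   if not any(p >= 50 for p in prices(b))]

print(with_editor)
print(some_under_50)
print(none_50_or_more)
```

One consequence worth noticing: the book with no price at all fails the existential query but passes the inverted one, since it has no price of 50 or more.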

When chaining predicates, realize that position() refers to the position of the node relative to the siblings that are still in the node set. The query /book[3][price < 50]/title says "out of all books, return the third one if its price is less than 50", while the query /book[price < 50][3] says "out of the books with price less than 50, return the third one".
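The effect of predicate order can be sketched by treating each predicate as a filter applied to the list left over from the previous one. The nth() and cheap() helpers, the ids, and the prices are invented for this illustration and are not part of any XPath library.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: five books whose third entry is not cheap.
DOC = """
<book-list>
  <book id="b1"><price>40</price></book>
  <book id="b2"><price>30</price></book>
  <book id="b3"><price>60</price></book>
  <book id="b4"><price>20</price></book>
  <book id="b5"><price>10</price></book>
</book-list>
"""
root = ET.fromstring(DOC)
books = root.findall("book")

def nth(nodes, n):
    """Positional predicate [n]: one-based, returns a (possibly empty) list."""
    return nodes[n - 1:n]

def cheap(nodes):
    """Predicate [price < 50] with existential semantics."""
    return [b for b in nodes
            if any(float(p.text) < 50 for p in b.findall("price"))]

# book[3][price < 50]: take the third book, keep it only if it is cheap.
third_then_cheap = [b.get("id") for b in cheap(nth(books, 3))]

# book[price < 50][3]: keep the cheap books, then take the third survivor.
cheap_then_third = [b.get("id") for b in nth(cheap(books), 3)]

print(third_then_cheap)
print(cheap_then_third)
```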

//book/author[1] says "return the first author of each book", while (//book/author)[1] says "return the first author from the list of all book authors".
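The difference can be checked with ElementTree on a made-up two-book document. ElementTree accepts the first form directly; it does not support a parenthesised path followed by a predicate, so the second form is approximated by slicing the full result list.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two books with two authors each.
DOC = """
<book-list>
  <book><author>Ann</author><author>Bob</author></book>
  <book><author>Cy</author><author>Di</author></book>
</book-list>
"""
root = ET.fromstring(DOC)

# //book/author[1]: the position is counted within each book.
first_of_each = [a.text for a in root.findall(".//book/author[1]")]

# (//book/author)[1]: the position is counted over the whole result list.
first_overall = [a.text for a in root.findall(".//book/author")[:1]]

print(first_of_each)
print(first_overall)
```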

The unnecessary restriction on unions makes very little logical sense, and is arguably a symptom of XPath being designed by a committee (top-down) rather than derived from a sound underlying theory (bottom-up). SQL has plenty of its own flaws, but the features it does have are usually complete and consistent.

See http://www.w3.org/tr/xpath/#corelib for the full list of functions (that page contains all other official details about XPath as well).