HTML5 selectors API – It’s like a Swiss Army Knife for the DOM
In the infancy of JavaScript, there was little if any concept of an HTML document object model (DOM). Even though JavaScript was invented to enable web developers to manipulate parts of a web page, in the original implementation, in Netscape 2.0, developers could only access the form elements, links, and images in a page. That was useful for form validation, and was first widely used for image rollover techniques (think :hover, before CSS), but it was far from the general purpose tool for creating modern web applications that we now know (and love/hate).
Newer iterations of the DOM gave developers access to far more than that original limited range of elements, as well as the ability to insert, modify and delete elements in an HTML document. But cross-browser implementations very often differed, and full support for the W3C’s DOM standards has arguably been treated as far more optional than CSS or HTML support.
One of the many reasons for the success of JavaScript libraries like jQuery and Prototype, on top of their easing the pain of cross-browser development, was how they made working with the DOM far less painful than it had previously been, and indeed than it was with the standard DOM. Being able to use arbitrary CSS selector notation to get matching elements from a document made the standard DOM methods seem antiquated, or at the very least, far too much like hard work.
Luckily, the standards and browser developers took notice. The W3C developed the Selectors API, a way of easily accessing elements in the DOM using standard CSS selector concepts, and browser developers have baked these into all modern browsers, way back to IE8.
In this short (by my standards) article, we’ll look at the Selectors API, how you use it, browser support, and some little things you might like to keep in mind while using it. Rest assured, it’s now widely supported, so in many cases, you can safely use it, potentially with a fallback for older browsers (IE7 and older specifically) via libraries like jQuery (or more lightweight selector engines like Sizzle, which provides this functionality for jQuery, and other libraries).
The Selectors API
The Selectors API, which many would consider to be part of HTML5, is in fact a separate, small specification from the W3C. It provides only two new methods, querySelector and querySelectorAll, for the Document, Element, and DocumentFragment objects (typically, you’ll use these methods on the document or element objects). But do these methods make life easier for developers?
Before the Selectors API, to access an object in the DOM we could use these methods:
getElementById (from DOM Level 2 Core), available on the document object
getElementsByClassName, standardized in HTML5 after long non-standard browser support, available on documents and elements
getElementsByTagName (from DOM Level 2 Core), available on the document and element objects
And there are some legacy ways of accessing elements on a page, which date from the earliest days of JavaScript:
links is a property of the document object which contains all anchor (a) and area elements with an href attribute
anchors is a property of the document object which contains all anchor (a) elements with a name attribute
forms is a property of the document object which contains all form elements
We can also “traverse” the DOM, using:
childNodes, a property of the document and node objects
nextSibling, a property of a node, which contains the node directly following it in the same parent (this may be a text node rather than an element)
parentElement, a property of a node, which contains its parent element
and related DOM traversal properties and methods.
But, what developers really often want to be able to do (as the success of jQuery and other libraries has shown) is simply say “give me all the elements which match this selector”, or “give me the first element which matches this selector”. And that’s precisely what the simple, powerful Selectors API does. It doesn’t completely do away with the need for DOM traversal, and legacy methods and properties, but it goes a long, long way.
querySelector
querySelector is a method of the document or any element, which returns the first descendant element that would be selected by its one argument, a CSS selector string. We can use it in place of document.getElementById('content') like so: document.querySelector('#content') (like me, you’ll probably find yourself forgetting to add the # from time to time in querySelector, something which doesn’t throw an error, so can be frustrating to track down).
And we can do things like find the first header element in an HTML5 document, with querySelector('header'). So far so good. But where querySelector really shines is that we can use any selector the browser supports (attribute, structural, dynamic, UI, and even selector groups) with it. In most cases, this makes traversing the DOM and locating a specific element far simpler, and often quicker too, as we won’t be looping in JavaScript and accessing all kinds of DOM properties. Instead, the query takes place inside the browser’s native DOM engine, which is typically far faster.
querySelectorAll
Often, when working with the DOM, we want to manipulate several elements at once. For example, we might want to unobtrusively attach an event listener to all the links with a given class value. Here, querySelectorAll is your friend. Just like querySelector, it takes a single string as an argument, which is a CSS selector. Instead of returning a single element, it returns a NodeList (an array-like collection, though not a true JavaScript array) of matching elements. We can then iterate through this list, and manipulate these objects.
For example, we could use it to replace document.links like so:
document.querySelectorAll('area[href], a[href]')
This finds all area elements with the href attribute set, as well as all a elements with this attribute set (notice how we’ve used a selector group, which is quite acceptable with the Selectors API).
Matching elements are returned in the order they appear in the DOM parse tree.
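To make the event listener case mentioned earlier a little more concrete, here is a small sketch. The helper name addListenerToAll and its parameters are made up for this example, and it assumes a browser that provides Array.from and addEventListener (so not the oldest versions of IE).

```javascript
// Hypothetical helper; `root` is assumed to be a Document or an Element.
function addListenerToAll(root, selector, eventType, handler) {
  // querySelectorAll returns a NodeList; Array.from copies it into a true
  // array, so forEach is available even where NodeList itself lacks it.
  const elements = Array.from(root.querySelectorAll(selector));
  elements.forEach(function (element) {
    element.addEventListener(eventType, handler);
  });
  return elements.length; // how many elements received the listener
}
```

In a page, something like addListenerToAll(document, 'a.external', 'click', handler) would attach handler to every link with a class of external (a class name invented for this example).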
Document or Element?
I mentioned that both the document and element objects implement these two methods, so what’s the difference? Well, as you might have guessed, these methods find elements that are descendants of the object you query on. So, if you use the method on a paragraph element, it will only find the descendant elements of that paragraph which match the selector. Other elements in the document which might match it won’t be returned. But, if you use the methods on the document, then any matching element in the document can be found.
Gotchas
If you’ve really got your hands dirty with the DOM, you’ll know that when DOM methods return a NodeList, it is live—that is, the members of the list change, depending on the state of the document.
Let’s say we get all the elements with a class of “nav” using document.getElementsByClassName('nav'), and it returns 5 elements, which we keep in a variable.
Now, if we add a new element with class nav, or remove one of the existing elements with a class of nav, the NodeList in our variable will be updated to reflect these changes (that’s why it is called a live NodeList).
But querySelectorAll is different. While it returns a NodeList, that list is static (and querySelector returns just a single element, or null). So, if we similarly get all elements with a class of nav using document.querySelectorAll('.nav'), then regardless of what we subsequently do to the DOM, the length and contents of the NodeList won’t change. This means it’s always best to query the DOM just before you need the elements, rather than holding on to a list of elements if your DOM is going to change.
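The difference is easy to see side by side. The sketch below takes a live collection and a static list for the same class, adds one more matching element, and reports the lengths before and after. The function name compareLiveAndStatic is invented for this example, and doc is assumed to be a document with a body.

```javascript
// Hypothetical helper; `doc` is assumed to be a Document with a body.
function compareLiveAndStatic(doc) {
  const live = doc.getElementsByClassName('nav'); // live collection
  const snapshot = doc.querySelectorAll('.nav');  // static NodeList
  const before = { live: live.length, snapshot: snapshot.length };

  // Add one more element with the class we queried for.
  const extra = doc.createElement('div');
  extra.className = 'nav';
  doc.body.appendChild(extra);

  // The live collection has grown; the static list has not.
  const after = { live: live.length, snapshot: snapshot.length };

  doc.body.removeChild(extra); // leave the document as we found it
  return { before: before, after: after };
}
```

Run against a real page, the live count in after should be one higher than in before, while the snapshot count stays the same.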
There’s also a performance consideration. Tests of various browsers indicate that querySelectorAll is slower than getElementsByTagName (though not, it would appear, in Opera). But it’s also possible that, once available, manipulating the static NodeList may be faster than manipulating a live NodeList. And this issue will likely only have an impact in extreme cases. I’d certainly not recommend prematurely optimising by using getElementsByTagName, getElementsByClassName, getElementById and so on in place of querySelectorAll, but it is worth noting you might be able to squeeze a little more performance out by doing so if you really need to.
And it is worth noting too that querySelector and querySelectorAll don’t work with every kind of selector. While pseudo-class selectors generally work with these methods (:visited is a special case, as browsers may treat every link as unvisited for privacy reasons), pseudo-element selectors, like :first-letter, :first-line, :before and :after, although permissible as arguments, will return null in the case of querySelector, and an empty NodeList for querySelectorAll.
A little gotcha this aging developer has found: I’m so used to getElementById and getElementsByClassName that I find myself forgetting the # or . required in the selector string in querySelector and querySelectorAll. As I mentioned a moment ago, it can be frustrating, as this won’t throw an error, but simply return null or an empty NodeList.
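One way to make that kind of slip louder is a tiny wrapper that complains when nothing matches. This is only a sketch; requireElement is a made-up name, not part of the Selectors API, and root is assumed to be a document or an element.

```javascript
// Hypothetical helper, not part of the Selectors API.
// `root` is assumed to be a Document or an Element.
function requireElement(root, selector) {
  const element = root.querySelector(selector);
  if (element === null) {
    // A forgotten "#" or "." ends up here instead of failing silently.
    throw new Error('No element matches selector "' + selector + '"');
  }
  return element;
}
```

With this in place, requireElement(document, 'content') fails straight away on a page with no content element, rather than handing back a null that causes trouble later.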
Support
All modern browsers, including IE8 and up, support both querySelector and querySelectorAll. It is however worth noting that the results returned are dependent on what selectors the browser supports. IE8 supports CSS2.1 selectors, though not CSS3 selectors. IE9 supports many CSS3 selectors, but not a number of the UI related pseudo-classes, such as :required and :invalid. For the details, see the documentation of IE CSS support for versions 5 through 9, and the Selectors API specification from the W3C.
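If you need to check support at runtime, a feature test along these lines is one option. It is a sketch under a couple of assumptions: the function names are invented for this example, doc is assumed to be a document, and it relies on browsers throwing an error when handed a selector they cannot parse. It sticks to older syntax so that the check itself can run in old browsers.

```javascript
// Hypothetical helpers; `doc` is assumed to be a Document.

// True when both Selectors API methods are present on the document.
function supportsSelectorsApi(doc) {
  return !!doc && 'querySelector' in doc && 'querySelectorAll' in doc;
}

// True when the browser can parse the given selector at all.
function supportsSelector(doc, selector) {
  if (!supportsSelectorsApi(doc)) {
    return false;
  }
  try {
    doc.querySelector(selector); // throws if the selector cannot be parsed
    return true;
  } catch (error) {
    return false;
  }
}
```

Where supportsSelectorsApi(document) is false, you could fall back to a library such as jQuery or Sizzle, and supportsSelector(document, ':required') gives a quick way to check an individual selector before relying on it.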