Year round learning for product, design and engineering professionals

Getting Sourcey – native HTML5 Audio and video

Hard perhaps to believe, but the world wide web began without an image element. That’s right, there was no way to include images as part of the content of a web page before Mosaic implemented them (here’s Marc Andreessen proposing the img element at the beginning of 1993). The img element ushered in the age of included content, so web browsers could now display GIF, JPEG, and more recently PNG format images in a standard way, but that’s where matters stayed for more than a decade. While the non-standard embed element (ironically, now standardized in HTML5) and the later standardized object element enabled developers to include audio, video, and interactive content, browsers did not implement native handling of these kinds of media – the actual rendering was left to plugins, like Flash. (While the number of different commonly used plugins today is quite low, in the late 1990s there was an explosion of different plugin technologies.)

While HTML5 for the first time specifies a way of incorporating audio and video content via the mundanely named audio and video elements, the revolution in web content these bring is that browsers no longer require plugins to play the content; rather, they do so “natively”. This has the advantage that users don’t need to download and install plugins, or keep them up to date. But there’s more to it than that. With plugin content, browsers effectively hand over a part of the browser window and say to the plugin – “ok, this is all yours, do your worst”. That makes applying style to a video player (rounded corners, or a drop shadow, for instance) challenging, and makes effects like layering HTML based content over the top of plugin content a bit of a lottery.

HTML5 audio and video has been, in browser years, quickly adopted by all major browsers (we’ll cover off browser support in just a moment). But let’s first take a look at how we incorporate audio and video content into HTML5 documents, how to provide fallbacks for browsers which don’t support them, and the inevitable gotchas. And once again, I’ll introduce a couple of tools we’ve developed to make life easier for you when using native multimedia content in your web sites and applications.

There are a lot of great resources out there, which go into much more detail on some aspects we only really touch on here, particularly on the complex issue of formats and codecs, as well as the challenges of making your content accessible. There’s a list of further reading at the end of the article.

We’ll begin by looking at audio, and then follow up with video, which we use in almost exactly the same way as audio.

HTML5 Audio

HTML5 introduces the audio element, for embedding audio files in a page. Browsers which support HTML5 audio provide a simple audio player for playing, scrubbing forward and back through content, as well as setting the volume. It’s also possible to hide these controls, and either autoplay files, or build your own player interface using HTML and the audio element’s JavaScript API (which is beyond the scope of this article, but not really that difficult at all if you are competent in JavaScript).

Setting up the element

The first thing we need to do to include audio in a page is to add an audio element to the HTML. Unlike img, audio is a non-empty element – that is, it can contain other HTML, so it has an opening and a closing tag.

<audio></audio>

Of course, this isn’t going to play anything, just as an img element with no src attribute won’t display an image. So, you’re probably guessing we add the audio file reference using a src attribute. And you’re more or less right. We can do that, but there is another, better way to add a reference to the sound files we want to play. The audio element can have only one src attribute, but because different browsers support different audio formats, at least for now and for the foreseeable future, we’ll need to provide browsers with multiple sound files so that we cover all browser bases. We’ll turn to how we do this in a moment, but let’s first look at several other attributes the audio element can take.

  • controls – a boolean attribute which, when present, tells a browser to display the default controls for its audio player. If the attribute is not present, then no controls will be displayed.
  • loop – another boolean attribute; if this is present, the audio file will begin again each time it finishes playing, repeating indefinitely.
  • autoplay – if this attribute is present, then once the sound file can begin playing, it will. autoplay should be used carefully and sparingly!
  • autobuffer – this is now replaced with the preload attribute, which we’ll look at in a moment, but as older browsers (for example Safari 4) support autobuffer instead of preload, you may wish to add this attribute as well if you want audio to be buffered by the browser as soon as a page is loaded. Browsers may choose to ignore this, or not support it at all (Opera, for example, doesn’t support preloading of audio).

Before we continue, note that these are boolean attributes, which you might not be as familiar with as you are with regular attributes. Boolean attributes (another example is checked) are either true or false, depending on whether they are present or absent on an element. So, they look like this

<audio controls loop></audio>

That is, they don’t take values. If this offends your XHTML sensibilities, they can also be written like so

<audio controls="controls" loop="loop"></audio>

That is, with their name as their value as well. Why they aren’t written as controls="true" is beyond me, but there you have it.
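To make the two spellings concrete, here is a small JavaScript sketch that writes out the same set of boolean attributes both ways. The function renderAudioTag and its arguments are hypothetical, invented for this illustration; they are not part of any browser API.

```javascript
// Hypothetical helper: renders an <audio> tag from a map of boolean attributes.
// A boolean attribute is "true" when present and "false" when absent,
// so attributes set to false are simply left out of the markup.
function renderAudioTag(booleans, xhtmlStyle = false) {
  const attrs = Object.keys(booleans)
    .filter((name) => booleans[name])
    .map((name) => (xhtmlStyle ? `${name}="${name}"` : name));
  const attrText = attrs.length ? " " + attrs.join(" ") : "";
  return `<audio${attrText}></audio>`;
}

const wanted = { controls: true, loop: true, autoplay: false };

console.log(renderAudioTag(wanted));
// <audio controls loop></audio>
console.log(renderAudioTag(wanted, true));
// <audio controls="controls" loop="loop"></audio>
```

Notice that autoplay never appears in the output: there is no way to write a "false" boolean attribute other than leaving it off.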

In addition to these boolean attributes, and the src attribute (which we’ve recommended ignoring), audio can also take the preload attribute, which has one of three values. The preload attribute doesn’t force the browser to preload an audio file, but rather suggests to the browser how it can provide the best experience to the user (balancing the impact of downloading a potentially large file against the benefit of a faster response when they choose to play it). In the words of the specification, these values are

  • none – Hints to the user agent that either the author does not expect the user to need the media resource, or that the server wants to minimise unnecessary traffic.
  • metadata – Hints to the user agent that the author does not expect the user to need the media resource, but that fetching the resource metadata (dimensions, first frame, track list, duration, etc) is reasonable.
  • auto – Hints to the user agent that the user agent can put the user’s needs first without risk to the server, up to and including optimistically downloading the entire resource.
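If you set the hint from script rather than in markup, it might look something like the sketch below. It assumes the three preload values are none, metadata and auto, and uses the element’s preload property; setPreloadHint is a hypothetical helper of my own, and a plain object stands in for the audio element so the example runs outside a browser.

```javascript
// The three hint values defined for the preload attribute.
const PRELOAD_HINTS = ["none", "metadata", "auto"];

// Hypothetical helper: sets the hint on a media element (or any object
// standing in for one) and rejects values outside the three above.
function setPreloadHint(media, hint) {
  if (!PRELOAD_HINTS.includes(hint)) {
    throw new RangeError(`Unknown preload hint: ${hint}`);
  }
  media.preload = hint; // a hint only: the browser is free to ignore it
  return media;
}

// In a page you would pass document.querySelector("audio") instead;
// a plain object is used here purely for illustration.
const fakeAudio = {};
setPreloadHint(fakeAudio, "metadata");
console.log(fakeAudio.preload); // metadata
```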

Getting Sourcey

Having suggested we don’t use the src attribute to link to the audio file to be played, how then can we do this? Well, we saw audio elements can contain other HTML elements, and in this case, we’re going to use the source element (also new in HTML5), to provide links to one or more audio files in various formats. The browser can then download the format it supports, ignoring others to save bandwidth.

Like img, the source element uses the src attribute to link to a file. So, if we want to link to an mp3 file, we do so like this

<source src="">

and, if we put that all together, that’s all we need to play this file in browsers which support mp3 (at present, IE9, Safari, Chrome, Android and iOS).

<audio controls="controls" loop="loop">
	<source src="">
</audio>

While several browsers will be able to play this content, neither Firefox nor Opera supports the mp3 format natively. Both, however, support the Ogg Vorbis format, so in addition we’ll link to an ogg version of this same audio file, for the benefit of users of these browsers.

<audio controls="controls" loop="loop">
	<source src=""> <!-- the mp3 version -->
	<source src=""> <!-- the ogg version -->
</audio>

Where a browser happens to support both ogg and mp3, it will use the first source it finds which it supports, and so in this case, play the mp3 version.
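The “first supported source wins” rule can be sketched in a few lines of JavaScript. This is only a model of the behaviour, not how any browser actually implements it. It assumes each source carries a MIME type, and it leans on the media element’s canPlayType method, which returns an empty string when a type can’t be played and "maybe" or "probably" otherwise. The fakeBrowser object and the file names are placeholders made up for the demo.

```javascript
// Sketch of "first supported source wins". `media` is anything with a
// canPlayType(type) method: in a browser, document.createElement("audio").
function firstPlayableSource(media, sources) {
  for (const source of sources) {
    if (media.canPlayType(source.type) !== "") {
      return source; // stop at the first match, ignore the rest
    }
  }
  return null; // nothing playable among the sources offered
}

// Stand-in for a browser that can play both MP3 and Ogg Vorbis.
const fakeBrowser = {
  canPlayType: (type) =>
    ["audio/mpeg", "audio/ogg"].includes(type) ? "maybe" : "",
};

const sources = [
  { src: "sound.mp3", type: "audio/mpeg" },
  { src: "sound.ogg", type: "audio/ogg" },
];

console.log(firstPlayableSource(fakeBrowser, sources).src); // sound.mp3
```

Swap the order of the two entries and the same browser would pick the ogg file instead, which is why source order matters when you care which version gets downloaded.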

So now we have an audio element, with two sources, which provide audio that can be played in all modern browsers. But what about older browsers? Can we also include support for them? Let’s find out.

As an aside, beneath the seemingly simple issue of audio and video formats, there’s more than a little complexity. We’ll look at this whole issue separately in the gotchas section, but I’ll just note here that, using the type attribute, we can give the browser further information about the file, including its MIME type and the specific codecs needed to decode it. This can further help a browser in deciding whether to download a file – audio and video files are typically large, so browsers really should only download files they can very likely play. For now, just note we have the type attribute, which may be useful, for example, to distinguish between audio files of the same format encoded at different bitrates.

Providing fallbacks

Because we’ve long been able to embed audio and video in a web page using Flash (or Silverlight, and other audio playing plugin technologies), we can use this to provide a fallback player for audio content. There are two approaches to this (for simplicity, we’ll only consider the case of Flash, but a similar approach can be used with Silverlight).

  1. We use the object element to link to a .swf version of our audio file, which Flash can play
  2. We use object to embed a Flash audio player which can play mp3 files

The first approach is simplest, but Flash won’t display controls for the sound file – the user will have to context menu click to play, rewind, etc.

The second approach potentially involves hosting an audio player at your site, or using a free hosted player such as the Google Reader Audio Player, but has the advantage of reducing the number of audio files we need to encode, and of providing a better user experience.

In both cases, we use the object element, as the only browser we really need to worry about is Internet Explorer versions 8 and older, which support object. Unless you’re looking to also cover an old version of a browser that doesn’t support object (which to be honest is unlikely), there’s no need to also include a nested embed element inside the object.

Here’s how we go about using the object element to add the Google Reader player, and our sound file.

<object type="application/x-shockwave-flash" data="" width="400" height="27">
        <param name="flashvars" value="audioUrl=">
        <param name="src" value="">
        <param name="quality" value="best">
</object>

There’s no great need to go into detail here, because all you really need to know is that the value of the flashvars parameter includes the URL of the sound file. Simply replace the URL there, leaving the audioUrl= part as is, and you’ll be fine.

Note that this fallback won’t work if the browser supports audio, but none of the audio formats you’ve linked to, or if the files go missing. The browser in these instances will show the native audio player controls (if you’ve used the controls attribute), but no sound will be played.

Lastly, what if we have a browser that supports neither audio, nor the object element? We can include fallback text (and HTML) within the object element (or if we choose not to include a Flash based fallback, inside the audio element, after any source elements). We could for example link to a downloadable version of the file. Here’s what this would look like if we haven’t included a Flash based fallback:

<audio controls="controls" loop="loop">
	<source src="">
	<source src="">
	<p>Sadly your browser can't play this audio file</p>
	<p>The good news is you can still <a href="">download an MP3 version</a></p>
</audio>


Gotchas

We’ve covered most of the things that can trip you up with audio (with the exception of the whole issue of formats and codecs, which we’ll cover shortly) but keep these in mind as well:

  • Fallbacks only come into play if the audio element is not supported, not if the file formats aren’t or if the files are missing.
  • Ogg files must be served as audio/ogg or application/ogg. If the server is set up to serve these as another MIME type, Firefox will ignore them.
  • Fallback text is not for the purposes of accessibility, but as a last resort when audio is not supported (we’ll look at accessibility in a moment).
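The first of these points trips people up often enough that it’s worth spelling out as logic. The sketch below is a toy model only: the browser objects, their supportsAudioElement and formats fields, and the function whatTheUserGets are all hypothetical, invented to illustrate the rule rather than to detect anything real.

```javascript
// Toy model of the rule above: nested fallback content is used only when
// the browser does not understand the audio element at all.
function whatTheUserGets(browser, offeredFormats) {
  if (!browser.supportsAudioElement) {
    return "fallback content";
  }
  const playable = offeredFormats.some((format) =>
    browser.formats.includes(format)
  );
  return playable ? "native player with sound" : "native player, no sound";
}

// Hypothetical browsers, described only by the two facts that matter here.
const oldBrowser = { supportsAudioElement: false, formats: [] };
const oggOnlyBrowser = { supportsAudioElement: true, formats: ["ogg"] };

console.log(whatTheUserGets(oldBrowser, ["mp3"])); // fallback content
console.log(whatTheUserGets(oggOnlyBrowser, ["mp3", "ogg"])); // native player with sound
console.log(whatTheUserGets(oggOnlyBrowser, ["mp3"])); // native player, no sound
```

The last case is the one to watch for: the browser understands audio, so it never looks at your Flash or text fallback, yet it has nothing it can play.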

But, on the whole, HTML5 audio is not particularly complicated, and in my opinion far less of a headache than using plugin based audio with the object and embed elements. As I mentioned earlier, I’ve built a web based tool that helps you create audio elements, including fallbacks, and hopefully helps you understand exactly what’s going on. Let me know what you think.

HTML5 Video

Let’s now turn to incorporating video in HTML5 documents, and the good news is, it’s more or less identical to audio. Where audio and video are the same, we won’t repeat what we’ve covered earlier.

Adding the element

In place of the audio element, for video we have, not surprisingly, the video element. As with audio, we recommend you don’t use the src attribute to link to the video file, but again the source element, so we can link to different formats (as with audio, different browsers support different formats, so to cover all modern browsers we need at least two different video files).

The video element can take all the attributes of the audio element – controls, loop, autoplay and preload. But in addition, it can also take the poster attribute. The value of poster is a URL that specifies an image file to be used as a placeholder when the video is not playing. Here’s a video element, with controls, and a poster

<video controls="controls" poster=""></video>

and while this won’t play any video, it will look impressive, something like this

[Image: video player with poster and controls]

Adding sources

Now we need to specify the video files to be played. As with audio, we use the source element, with the src attribute pointing to the video file to be played. To cover modern browsers we need at least two formats: to simplify considerably, H.264 and either Ogg/Theora or WebM/VP8.

<video controls="controls" poster="">
	<source src="">
	<source src="">
</video>

This now gives us video which will play in almost any modern browser, including IE9. Again, we’re going to add fallbacks much as we did for audio, to round out our coverage.

Providing fallbacks

The simplest way to add Flash based fallback video content is to create a .swf version of the file, and then embed this in your HTML document using the object element. If Flash is installed this will play in the user’s browser, and while no controls will be shown, if you add a menu parameter with the value of true, then the user can context menu click and play, rewind and otherwise control the video.

<object type="application/x-shockwave-flash" data="" width="320" height="400">
    <param name="movie" value="">
    <param name="menu" value="true">
</object>

All we need to concern ourselves with here is that we put the url of our video file as the value of the parameter with the name “movie”, as well as the data attribute of the object element.

There are also various Flash based video players we might include, similarly to the way we included the Google Reader MP3 player. This will likely require serving the Flash application, in addition to the video files.

Again we’ve used only the object element, as we’re really only looking to ensure IE 8 and older are served the video, and as these support object there’s no need to also include the embed element.

Lastly, as with audio, we can provide fallback text or HTML to display when video is not supported. Here, we’ll provide a download link when neither video nor Flash are supported.

<video controls="controls" poster="">
	<source src="">
	<source src="">
	<object type="application/x-shockwave-flash" data="" width="320" height="400">
		<param name="movie" value="">
		<param name="menu" value="true">
		<p>Sadly your browser can't play this video file</p>
		<p>The good news is you can still <a href="">download an MP4 version</a></p>
	</object>
</video>


Gotchas

HTML5 video gotchas are similar to those with audio. Fallbacks are only for when video is not supported at all. Ensure video files are served with the right MIME type. But on the whole, both audio and video are quite usable today.
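If you want to sanity check what your server sends, a sketch like the following may help. The table of MIME types is my own assumption based on the commonly registered types (only audio/ogg and application/ogg are named in this article), and both function names are hypothetical. You would feed it the Content-Type header your server actually returns for each file.

```javascript
// Assumed mapping from file extension to acceptable MIME types.
const MEDIA_MIME_TYPES = new Map([
  ["mp3", ["audio/mpeg"]],
  ["ogg", ["audio/ogg", "application/ogg"]],
  ["aac", ["audio/aac"]],
  ["mp4", ["video/mp4"]],
  ["ogv", ["video/ogg"]],
  ["webm", ["video/webm"]],
]);

// Returns the acceptable MIME types for a file name, or an empty list
// when the extension is not one we know about.
function expectedMimeTypes(fileName) {
  const extension = fileName.split(".").pop().toLowerCase();
  return MEDIA_MIME_TYPES.get(extension) || [];
}

// Compares a Content-Type header value against the expected types,
// ignoring any parameters after ";" (such as codecs).
function servedCorrectly(fileName, contentTypeHeader) {
  const served = contentTypeHeader.split(";")[0].trim().toLowerCase();
  return expectedMimeTypes(fileName).includes(served);
}

console.log(servedCorrectly("movie.ogv", "video/ogg")); // true
console.log(servedCorrectly("movie.ogv", "text/plain")); // false
```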

As I’ve mentioned, the single biggest challenge is ensuring we serve the right video files, to ensure all browsers can play the video. Which means delving a little into the more complex issue of formats and encodings.

Video and Audio formats

Probably I should skip this part, and just point you in the direction of Mark Pilgrim’s fantastic coverage at Dive Into HTML5. But, I feel I should at least cover this issue briefly, so here goes.

When we think of video formats, like H.264, there are in fact two, and usually three, separate pieces of technology involved.

  • There’s the container file – for example MPEG4, WebM or Ogg
  • There’s the video data, encoded with one of many possible codecs (decoding/encoding formats) such as H.264, VP8 or Theora
  • There’s audio data encoded with one of many possible audio codecs such as mp3, AAC or Vorbis

So, when we talk about the video format a browser supports, it’s important to understand what this actually means. It means the combination of container, video and audio encoding.

With this in mind, the current state of browser support for video formats is

  • Firefox 3.6+, Chrome, Android 2.1+ and Opera support Ogg/Theora (.ogv).
  • IE9, Safari 4+, Chrome, Android and iOS support MPEG4/h.264 (.mp4), although Chrome 14 will drop support for the format.
  • Firefox 4+, Chrome, Opera, and Android (2.3+) support WebM/VP8 (.webm), as does IE9 if the required codecs are installed on the system (they are a separate download).

So, to cover all modern browsers, we need at a minimum MPEG4/h.264 and either Ogg/Theora or WebM/VP8.

For audio formats, the state of support is

  • Firefox, Chrome, Android and Opera support Ogg/Vorbis (.ogg).
  • IE9, Safari, Chrome, Android and iOS support MP3 (.mp3).
  • Safari, Chrome, IE and iOS support AAC (.aac).

To cover all modern browsers, we need Ogg/Vorbis and either MP3 or AAC.
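The support lists above are easy to turn into a lookup, which is roughly the kind of guess by file extension that a helper tool can make. The sketch below simply copies the lists from this article into a table; the names SUPPORT_BY_EXTENSION and browsersFor are hypothetical, and the data will of course date quickly.

```javascript
// Browser support per file extension, copied from the lists above.
// IE9 with separately installed WebM codecs is left out for simplicity.
const SUPPORT_BY_EXTENSION = new Map([
  ["ogv", ["Firefox 3.6+", "Chrome", "Android 2.1+", "Opera"]],
  ["mp4", ["IE9", "Safari 4+", "Chrome", "Android", "iOS"]],
  ["webm", ["Firefox 4+", "Chrome", "Opera", "Android 2.3+"]],
  ["ogg", ["Firefox", "Chrome", "Android", "Opera"]],
  ["mp3", ["IE9", "Safari", "Chrome", "Android", "iOS"]],
  ["aac", ["Safari", "Chrome", "IE", "iOS"]],
]);

// Guesses, from the extension alone, which browsers can play a file.
// An extension says nothing certain about the codecs inside, so treat
// the answer as a hint rather than a guarantee.
function browsersFor(fileName) {
  const extension = fileName.split(".").pop().toLowerCase();
  return SUPPORT_BY_EXTENSION.get(extension) || [];
}

console.log(browsersFor("song.mp3").join(", "));
// IE9, Safari, Chrome, Android, iOS
console.log(browsersFor("clip.webm").join(", "));
// Firefox 4+, Chrome, Opera, Android 2.3+
```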

The type attribute

We mentioned earlier that the source element can take a type attribute in addition to the src attribute. While not required, this can help a browser decide whether it can actually play the content located at the end of the src URL. You might think that the file extension would be enough, but not necessarily so. For example, iPhone with iOS 4 supports H.264 video up to 720p (the so called H.264 “Main Profile”), so can’t play a 1080p H.264 encoded video (that is, a “High Profile” H.264 video). If we wanted to provide various H.264 profiles, for various devices, all these files will have the extension mp4. But we can provide codec information in the type attribute to differentiate between the various versions.

The type attribute’s value provides two pieces of information, separated by a semicolon. First is the MIME type of the file, then (optionally) the codecs used (separated by a comma). So, for example we could provide various H.264 profiles of our video like so:

<source src='video.mp4' type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'> <!-- baseline profile -->
<source src='video.mp4' type='video/mp4; codecs="avc1.4D401E, mp4a.40.2"'> <!-- main profile -->
<source src='video.mp4' type='video/mp4; codecs="avc1.64001E, mp4a.40.2"'> <!-- high profile -->

(To tell the truth, this is taken directly from the HTML5 specification, and is beyond my level of understanding of the ins and outs of codecs.) But depending on your circumstances, you may find it useful to include MIME type and codecs information using the type attribute. If you do, I’m sure you’ll know the codecs information to use, and now you know where to put it.
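Because the type value mixes a semicolon, commas and nested quotes, it’s easy to get the punctuation wrong by hand. Here’s a small sketch that assembles it for you. The helpers buildTypeAttribute and sourceTag are hypothetical names of my own, and the codec strings are just the main profile example from above.

```javascript
// Builds the value of the type attribute: the MIME type, then optionally
// a codecs parameter whose entries are separated by commas.
function buildTypeAttribute(mimeType, codecs = []) {
  if (codecs.length === 0) {
    return mimeType;
  }
  return `${mimeType}; codecs="${codecs.join(", ")}"`;
}

// Wraps the value in a source tag. Single quotes go around the attribute
// because the codecs list itself needs double quotes.
function sourceTag(src, mimeType, codecs) {
  return `<source src='${src}' type='${buildTypeAttribute(mimeType, codecs)}'>`;
}

console.log(sourceTag("video.mp4", "video/mp4", ["avc1.4D401E", "mp4a.40.2"]));
// <source src='video.mp4' type='video/mp4; codecs="avc1.4D401E, mp4a.40.2"'>
console.log(sourceTag("sound.ogg", "audio/ogg"));
// <source src='sound.ogg' type='audio/ogg'>
```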


Accessibility

The web is about inclusivity and access to as wide an audience as possible, which includes those with hearing and visual disabilities. Creating accessible audio and video content is far beyond the scope of this article, but I do have some links for further reading below. It is important to note, though, as spelt out in the HTML5 specification:

In particular, [fallback] content is not intended to address accessibility concerns. To make video content accessible to the blind, deaf, and those with other physical or cognitive disabilities, authors are expected to provide alternative media streams and/or to embed accessibility aids (such as caption or subtitle tracks, audio description tracks, or sign-language overlays) into their media streams.

The tools

I’m not sure whether it’s stupidity, or laziness, but with many aspects of web development, if I’ve not used a particular technology for a while, I quickly find myself reaching for a reference. And sadder still, it’s often the book I wrote myself! But in all seriousness, as aspects of CSS and HTML (not to mention JavaScript and the DOM) become increasingly complex, our chances of remembering every aspect of our craft, along with the subtleties of browser support, grow ever smaller (even Einstein was reputed to have said that he didn’t remember the speed of light as he could always look it up). Well, mine certainly do. So, I write little web based tools to replace my dwindling brain cells.

I’ve recently developed two new tools, one each for audio and video, that help you easily do pretty much all we’ve covered here – create the elements, add controls, a poster, and other attributes, sources, and fallback audio, video and HTML. It will even tell you which browsers can play a particular file (well, it will make a guess based on the files’ extensions).

You can find the audio tool here, and the video tool here.

There’s also a similar excellent tool for creating HTML5 video which I wasn’t aware of when creating mine, the Video for Everybody Generator. It’s highly recommended.

Further reading

Here are a few of the many articles and other resources on HTML5 video and audio I’ve come across in my travels. They cover video and audio encoding, accessibility, some Flash players you might consider for your fallback content, browser support, and more.

The very best place to start is the video chapter in Mark Pilgrim’s Dive into HTML5.
Video for Everybody, the originator of the Flash fallback technique for HTML5 Video

Opera Developer Connection has a series of articles on HTML5 media, including an Introduction to HTML5 Video, and Everything you need to know about HTML5 audio and video.

The HTML5 Doctor has a detailed overview of HTML5 Audio, and video.

HTML5 Rocks has several articles on native HTML5 media, including a quick guide to audio and the basics of HTML5 video.

Encoding your audio and video

Here are some great places to get tips and techniques on encoding video and audio for HTML5, and services which you can use to create the right formats of your media files

Flash players for video and audio fallback

There are some open source Flash video players you can host yourself to play your video files.

  • FlowPlayer
  • JW Player
  • Media.js supports both audio and video with fallbacks to Flash
  • VideoJS is a highly regarded JavaScript HTML5 player, with fallback support as well. You can skin it with CSS.
  • JPlayer open source jQuery HTML5 Audio / Video Library (as used in Pandora)

In addition to the commonly used Google Reader Flash MP3 Player, other Flash MP3 players you might consider include


Browser support

As always When Can I Use? is the place to keep up to date with browser support.

Addressing challenges playing audio and video in the Android browser.

A look at the challenges associated with video on the iPad.

A further look at HTML5 media challenges in iOS.
