A look behind H.264, the world's most popular video codec, including encoding parameters and royalty concerns
This is an installment in our ongoing series of "What Is...?" articles, designed to offer definitions, history, and context around significant terms and issues in the online video industry.
Executive Summary
H.264 is the most widely used codec on the planet, with significant penetration in optical disc, broadcast, and streaming video markets. However, many uses of H.264 are subject to royalties, something that should be considered prior to its adoption. Other factors to consider include comparative quality against other available technologies, like Google’s WebM, as well as the general availability of decoding capabilities on target platforms and devices. This article discusses H.264 and competing technologies from these perspectives.
The H.264 Spec
H.264 is a video compression technology, or codec, that was jointly developed by the International Telecommunication Union (as H.264) and the International Organization for Standardization/International Electrotechnical Commission Moving Picture Experts Group (as MPEG-4 Part 10, Advanced Video Coding, or AVC). Thus, the terms H.264 and AVC mean the same thing and are interchangeable.
As a video codec, H.264 can be incorporated into multiple container formats. It is frequently produced in the MPEG-4 container format, which uses the .MP4 extension, as well as QuickTime (.MOV), Flash (.F4V), 3GP for mobile phones (.3GP), and the MPEG transport stream (.ts). Most of the time, but not all the time, H.264 video is encoded with audio compressed with the AAC (Advanced Audio Coding) codec, which is an ISO/IEC standard (MPEG-4 Part 3).
The Nuts and Bolts
As a standards-based codec, H.264 has been implemented by multiple vendors, and each implementation delivers different levels of quality and configurability. The most widely used H.264 codecs today include the Apple codec, which is used in Apple Compressor and QuickTime Pro; the MainConcept codec, which has been licensed by Adobe, Microsoft, Rhozet, Sorenson Media and Telestream for their encoding products; and the x264 codec, a free software library used in most shareware encoders and by many UGC vendors to create custom high-volume H.264 encoding systems.
Of the three, the Apple codec produces the lowest quality by a significant margin. Otherwise, x264 has a slight quality edge over MainConcept, though the difference may not be noticeable at the encoding parameters used by most streaming producers.
H.264 Profiles and Levels
At a high level, there is some uniformity in the compression parameters available for each codec. For example, all H.264 codecs use different Profiles to encode the video. To explain, there are multiple encoding techniques and algorithms available in H.264 to compress the file. The basic tradeoff for most of them is enhanced quality but a more complex bitstream that’s harder to decode.
Profiles specify which of those techniques and algorithms can be used to create a bitstream, and constitute a convenient meeting point for device manufacturers and video producers. For example, the Baseline Profile, which is the simplest profile and is predominantly used by low-power devices, doesn’t allow the use of B-frames or CABAC entropy encoding. Quality is still good, but not as high as streams produced using the Main or High Profiles that incorporate both techniques.
However, because those techniques aren’t used, the decoding requirements of a Baseline-encoded stream are modest, so low-power devices like the original video-capable iPod can play the video. The iPod documentation specified that these devices could only play H.264 video encoded to the Baseline Profile, so producers creating content for iPod playback encoded using the Baseline Profile.
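The relationship between a profile and the tools it permits can be sketched as a small lookup. This is a minimal illustration that covers only the two tools named above (B-frames and CABAC); real profiles differ in many more ways, and the table and function names here are hypothetical, not part of any encoder's API.

```python
# Encoding tools each profile permits, limited to the two tools the article names.
# This table is a deliberate simplification, not a full description of the profiles.
PROFILE_TOOLS = {
    "Baseline": {"b_frames": False, "cabac": False},
    "Main": {"b_frames": True, "cabac": True},
    "High": {"b_frames": True, "cabac": True},
}


def profile_allows(profile: str, uses_b_frames: bool, uses_cabac: bool) -> bool:
    """Return True if a stream using these tools is permitted by the profile."""
    tools = PROFILE_TOOLS[profile]
    if uses_b_frames and not tools["b_frames"]:
        return False
    if uses_cabac and not tools["cabac"]:
        return False
    return True
```

Under these assumptions, a stream that uses B-frames is rejected for "Baseline" but accepted for "Main" and "High", which mirrors why a Baseline-only device cannot play Main or High Profile video.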
H.264 Levels specify the maximum data rate and video resolution that a device can play back. For example, Apple’s iPad 2 specifications indicate that the device can play video encoded using the Main Profile, Level 3.1. This means a maximum video resolution of 1280x720 at 30 frames per second and a maximum data rate of 14 Mbps. Wikipedia has a chart that details the specifics of each H.264 Level.
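A producer could check an encode against a device's Level before publishing. The sketch below uses only the Main Profile, Level 3.1 figures quoted above and treats them as independent ceilings. That is a simplification: the standard defines Levels through macroblock throughput and buffer limits, so resolution and frame rate can trade off against each other. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelLimit:
    """Simplified ceilings for one Level (assumption: each limit is independent)."""
    max_width: int
    max_height: int
    max_fps: float
    max_mbps: float


# Figures quoted in the article for Main Profile, Level 3.1.
MAIN_LEVEL_3_1 = LevelLimit(max_width=1280, max_height=720, max_fps=30, max_mbps=14)


def fits_level(width: int, height: int, fps: float, mbps: float,
               limit: LevelLimit) -> bool:
    """Return True if the encode stays at or under every ceiling in the limit."""
    return (
        width <= limit.max_width
        and height <= limit.max_height
        and fps <= limit.max_fps
        and mbps <= limit.max_mbps
    )
```

For example, a 1280x720, 30 fps encode at 5 Mbps passes this check against MAIN_LEVEL_3_1, while a 1920x1080 encode or a 15 Mbps encode does not.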
In general, Profiles and Levels are most important when producing for devices, because video encoded using the wrong Profile, or exceeding the parameters specified in a Level, won’t play on those devices. In contrast, when producing for computers, the players enabling H.264 playback, whether QuickTime, Flash, Silverlight or HTML5, can play back video produced at the most advanced Profile supported by most streaming encoding tools (the High Profile), at resolutions of 1080p and beyond.
When producing for general computer playback, it’s more important to consider the practical limitations regarding delivery to the target viewer, and the playback capabilities of the target computer. Though the Flash Player can technically play 1080p video encoded at 15 Mbps on a low-power netbook, few connections could deliver that data rate in real time, the costs of delivery would be prohibitive, and the frame rate produced by that low-power CPU would probably not be pleasing to viewers.
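The delivery problem is easy to quantify. The following sketch converts a constant data rate into data volume per hour and compares it with a viewer's connection speed. It assumes decimal units (1 Mbps = 1,000,000 bits per second, 1 GB = 1,000,000,000 bytes) and ignores protocol overhead and buffering; the function names are illustrative only.

```python
def gigabytes_per_hour(mbps: float) -> float:
    """Data delivered in one hour at a constant rate, in decimal gigabytes."""
    bits = mbps * 1_000_000 * 3600  # bits sent in 3,600 seconds
    return bits / 8 / 1_000_000_000  # bits -> bytes -> gigabytes


def can_stream_in_real_time(video_mbps: float, connection_mbps: float) -> bool:
    """True if the connection can sustain the video data rate (no headroom assumed)."""
    return connection_mbps >= video_mbps
```

At 15 Mbps, one hour of video works out to 6.75 GB per viewer, and a 5 Mbps connection cannot sustain the stream in real time.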
Other H.264 Encoding Parameters
Beyond Profiles and Levels, there’s a great disparity in the H.264 encoding parameters available to video producers encoding into H.264 format. For example, Figure 1 is the H.264 compression interface from the Adobe Media Encoder. As you can see, you can choose Profile and Level, but no other H.264 parameters.
Figure 1. Adobe Media Encoder's simple H.264 encoding interface.
At the other end of the complexity spectrum is Figure 2, which shows two of the four screens of encoding options available with the x264Encoder, an x264-based QuickTime encoder. As you can see, beyond its very high quality, one of the reasons that x264 is popular among serious compressionists and high-volume UGC sites is that it exposes a wide variety of H.264 encoding parameters, enabling extensive optimization.
Figure 2. Some of the encoding options available using the x264Encoder.
Overall, though H.264 is a standard, there is little uniformity in the output quality of the various codecs, or in the controls used to encode files into the H.264 format for streaming distribution.
H.264 Royalty Status
A number of companies claim patent rights for intellectual property contributed to the development of H.264, and all that do are members of a patent pool organized by MPEG LA. Under the patent pool, different royalties apply to the different classes of products, as shown in Figure 3.
Figure 3. The H.264 royalty structure from the MPEG LA Summary of AVC/H.264 License Terms FAQ.
On the left are products sold or otherwise distributed with AVC encoders or decoders installed, while on the right are various content categories that include video encoded into H.264 format. Briefly, on the left side of the diagram, royalties start after the first 100,000 units are sold each year, and cap at $3.5 million per year in 2005-06, $4.25 million per year in 2007-08, $5 million per year in 2009-10, and $6.5 million per year in 2011-15.
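The cap schedule for encoder and decoder products can be expressed as a lookup. In this sketch the caps and the 100,000-unit threshold come from the figures above, but the per-unit rate is not stated in this article, so it is a caller-supplied, hypothetical parameter, and a single flat rate is an assumption rather than the actual license structure.

```python
# Annual royalty caps in US dollars for encoder/decoder products, as listed above.
ANNUAL_CAPS = [
    (2005, 2006, 3_500_000),
    (2007, 2008, 4_250_000),
    (2009, 2010, 5_000_000),
    (2011, 2015, 6_500_000),
]
FREE_UNITS_PER_YEAR = 100_000  # royalties start after this many units each year


def annual_cap(year: int) -> int:
    """Return the annual cap that applies in the given year."""
    for first, last, cap in ANNUAL_CAPS:
        if first <= year <= last:
            return cap
    raise ValueError(f"no cap listed for {year}")


def product_royalty(year: int, units: int, per_unit_rate: float) -> float:
    """Estimate one year's royalty; per_unit_rate is a hypothetical flat rate."""
    billable_units = max(units - FREE_UNITS_PER_YEAR, 0)
    return min(billable_units * per_unit_rate, annual_cap(year))
```

A vendor shipping 100,000 units or fewer in a year owes nothing under this model, and a very large vendor's bill stops growing once it reaches that year's cap.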
For content categories on the right, there are royalties for subscription services that scale with the number of subscribers, but these only start after exceeding 100,000 subscribers. There are also charges for Title-by-Title content sold to viewers (pay-per-view), but only for content longer than 12 minutes in duration.
Continuing clockwise on the right, there is no royalty for H.264-encoded video delivered for free over the Internet, though fees apply for Free Television encoded in H.264 for Broadcast Markets that exceed 99,999 television households. Interestingly, the H.264 patent group’s policy on free Internet video has varied significantly over time. Initially, there was no royalty until at least January 1, 2011, a policy that dissuaded many high-volume potential users from adopting H.264. Then, in February 2010, MPEG LA announced that royalties would be delayed until December 31, 2015. Finally, in August 2010, MPEG LA extended the royalty-free license “in perpetuity,” with some pundits claiming that this was in response to Google’s open-sourcing the VP8 codec acquired from On2. Whatever the reason, those distributing free Internet video encoded via H.264 will never have to pay a license fee to MPEG LA.
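The content-side thresholds described above can be summarized as a simple decision function. This sketch only answers whether a royalty applies at all, using the thresholds given in this article; it does not compute amounts, and the category labels and parameter names are hypothetical.

```python
def content_royalty_applies(category: str, *, subscribers: int = 0,
                            minutes: float = 0.0, households: int = 0) -> bool:
    """Return True if the article's thresholds indicate a royalty is due."""
    if category == "free_internet":
        return False  # royalty-free in perpetuity
    if category == "subscription":
        return subscribers > 100_000  # starts after exceeding 100,000 subscribers
    if category == "title_by_title":
        return minutes > 12  # only content longer than 12 minutes
    if category == "free_television":
        return households > 99_999  # broadcast markets exceeding 99,999 households
    raise ValueError(f"unknown category: {category}")
```

For instance, a ten-minute pay-per-view clip and a subscription service with 80,000 subscribers both fall below the thresholds, while a 90-minute pay-per-view title does not.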
H.264 Support and Adoption
H.264 is one of the three codecs available for encoding content for Blu-ray discs, is prevalent in video-conferencing products, and is widely used in television broadcasting, including satellite and cable broadcasting. In the streaming market, H.264 was first adopted by Apple with QuickTime 7 in 2005, and H.264 playback in an iPod also debuted that year. In 2007, Adobe incorporated H.264 support into Flash, with Microsoft announcing H.264 support in Silverlight in 2008.
H.264 is currently supported by all new Android devices, in Windows Phone 7, in most new BlackBerry Smartphones, and in the HP webOS.
In terms of browser support, H.264 playback is currently incorporated into Apple’s Safari browser and Microsoft Internet Explorer version 9 via the HTML5 video tag, as well as Google Chrome versions through (at least) 11.0.696.16 beta. However, on January 11, 2011, Google announced that it would remove H.264 support from Google Chrome in “the next couple months.” Neither Mozilla nor Opera has incorporated H.264 playback into its browser, citing the costs and restrictions of using a “patent-encumbered” format.
However, Microsoft has released multiple plug-ins that enable H.264 playback within Firefox and Chrome using the HTML5 video tag, though according to Microsoft, these H.264 plug-ins only run on Windows 7.
With version 10.1, the Flash Player enabled GPU-acceleration of H.264 playback within Flash on both the Windows and Macintosh platforms, and version 5 of the Safari browser also accelerates H.264 playback on both platforms. With version 3, Silverlight also enabled GPU-acceleration of H.264 video on both platforms, and Internet Explorer 9 also includes GPU-acceleration.
On the content side, both YouTube and Vimeo quite loudly started supporting H.264 via the HTML5 video tag in early 2010. By late 2010, multiple networks, including CBS, CNN, PBS, TNT, ABC and the BBC, were encoding at least some of the videos they distributed over the Internet using H.264. Virtually all videos produced for iTunes are also encoded in H.264 format.
H.264 Comparative Quality
In terms of video quality, H.264 is generally considered to produce higher quality than both On2’s VP6 and Microsoft’s VC-1. Regarding WebM, there is some disagreement. For example, in its respected annual codec comparison, Moscow State University found that when encoding movies, VP8 (the video codec in the WebM format) showed “20-30% lower quality at average.” Another early evaluation of WebM, by x264 developer Jason Garrett-Glaser, concluded that “VP8, as a spec, should be a bit better than H.264 Baseline Profile and VC-1. It’s not even close to competitive with H.264 Main or High Profile.”
In contrast, Streaming Media concluded that “the quality difference between VP8 and H.264 will be meaningless at most relevant data rates.”
Why H.264 Is Important To You
Codec selection is one of the most fundamental decisions facing streaming media producers. In order to make the right decision, producers should know the characteristics and costs of any technology that they choose. Today, H.264 is the only codec that can reach 98% of the installed base of computers (via Flash), is the predominant codec used in iTunes and can play on all major brands of mobile devices. It’s the highest quality codec available, and though there may be royalties for some for-fee uses, there will never be a royalty for distributing free video over the Internet.