An overview of the basics of encoding and transcoding, including an attempt to settle on some hitherto controversial definitions
This article is part of Streaming Media's "What Is" series.
Executive Summary
Streaming media production starts with the infinite real world as captured by the lenses of our camcorders, and ends with the tightly compressed files necessary for streaming delivery. Along the way, the video is digitized, encoded, re-encoded, and frequently transcoded, with possible stops along the way for transrating, transsizing, and transmuxing.
These are terms we use every day, but few, if any, have precise definitions. Until now. Read on for our attempt to bring clarity to the lexicon of streaming media workflows and processes as we follow the life of a video from capture to consumption (on multiple platforms, of course).
Video Shot in Camcorder
We start with the video shot by the camcorder, in this case a Panasonic AG HMC150 (Figure 1). We shoot in 1080 30p (1920x1080 resolution, progressive, 30 frames per second) and the video is stored in AVCHD format at 24Mbps.
Figure 1. Panasonic AG HMC150 camcorder
As part of this process, the video is digitized (converted from analog to digital) and encoded (compressed into a format suited to storage or transmission).
Video Copied to a Hard Drive
We then transfer the video from camcorder to hard drive. When working with analog video, this transfer was called video capture, which--like the process that occurred in the Panasonic camcorder--involved both digitization and (in most instances) compression. However, when working with a digital camcorder, the video is already stored in a compressed digital format. Accordingly, the process of transferring the video from camcorder to hard drive can be a simple file copy, or a transcoding, depending upon which editor you use.
For example, Adobe Premiere Pro works with AVCHD files natively, that is, in the original format, without any conversion whatsoever. The preferred workflow for Premiere Pro would be to copy the files from camcorder to computer outside of the application, perhaps with Windows Explorer on Windows or Finder on the Mac, and then import the files into Premiere Pro. This involves no conversion of any kind, and no generational loss.
Figure 2. Ingesting AVCHD into Final Cut Pro 7 involves transcoding from AVCHD to ProRes.
In contrast, and as shown in Figure 2, Final Cut Pro 7 converts all AVCHD files into ProRes format as part of the ingest process. You get to choose the flavor of ProRes, which determines the data rate, but in all cases the data rate will be well above that of the AVCHD footage, from nearly double (45Mbps for ProRes 422 Proxy) to over 13 times higher (330Mbps for ProRes 4444), which means minimal quality degradation.
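The data rate comparison is simple arithmetic, and it is worth running once to see what it means for disk space. Here is a minimal sketch that uses only the figures quoted in this article (24Mbps AVCHD, 45Mbps ProRes 422 Proxy, 330Mbps ProRes 4444). The function names are mine, and the storage estimate assumes video only at a constant rate, so treat the results as ballpark numbers.

```python
# Data rates quoted in this article, in megabits per second.
AVCHD_MBPS = 24
PRORES_MBPS = {
    "ProRes 422 Proxy": 45,
    "ProRes 4444": 330,
}


def ratio_to_source(target_mbps, source_mbps=AVCHD_MBPS):
    """Return how many times larger the target data rate is than the source."""
    return target_mbps / source_mbps


def gigabytes_per_hour(mbps):
    """Convert a constant data rate in Mbps to decimal gigabytes per hour."""
    seconds_per_hour = 3600
    bits_per_byte = 8
    megabytes_per_gigabyte = 1000
    return mbps * seconds_per_hour / bits_per_byte / megabytes_per_gigabyte


for name, mbps in PRORES_MBPS.items():
    print(
        f"{name}: {ratio_to_source(mbps):.2f}x the AVCHD rate, "
        f"about {gigabytes_per_hour(mbps):.1f} GB per hour"
    )
```

The practical takeaway is that ingesting to ProRes costs disk space rather than quality, and the choice of ProRes flavor decides how much.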
This conversion would be a transcode, which is a conversion from one format to another of like or similar quality, typically performed to make the content compatible with another process or application. All conversions from one format into a different lossy format technically involve a generational loss, though if you used one of the higher bitrate ProRes formats, the practical impact of that generational loss would be irrelevant.
Creating the Mezzanine File
Things get hairy definition-wise when you output a mezzanine file from the editor that you can then input into your encoding program to produce files for distribution (Figure 3). With Premiere Pro, you’d probably output in H.264 format, perhaps at the same or even a higher data rate than the original file. Still, if you adjusted global parameters like brightness or color, this re-encode would involve some generational loss, however small.
Figure 3. Creating the mezzanine file in Adobe Media Encoder. Re-encoding, encoding, transcoding, or transrating?
You could call this re-encode encoding, particularly if the mezzanine file was used for archiving. You could also call this transcoding, since the mezzanine file would be of like or similar quality to the original. You could further argue for transrating, where you’re converting a file to a different data rate using the same codec, though this isn’t the classic application of transrating, which is discussed below. Perhaps, however, it’s simplest and most accurate to call this re-encoding, or the process of saving a file back to its existing format after performing some kind of editing process.
Back to creating our mezzanine file. In Final Cut Pro, if you output to ProRes as your intermediate format, no transcoding takes place, though again, if there’s a global change like brightness or color, some technical generational loss will occur, however small. However, if you output to a much lower data rate H.264 format, that’s almost certainly an encode, since you’re converting to a more lossy codec for the purposes of storage or transmission. On the other hand, the intermediate file will be visually indistinguishable from the ProRes, which argues for the transcode designation. I’d vote for transcoding, but it’s a gray area.
Encoding for Distribution
You input the mezzanine file into an encoder like Sorenson Squeeze to encode the video into MPEG-2 format for burning to DVD, a generic high-bitrate 720p MP4 file for multiple uses, and a WebM file (I know, no one but YouTube actually uses WebM, but go with me on this one). This is clearly encoding, since you’re converting the file for transmission.
Figure 4. Encoding our mezzanine file to the final distribution formats.
You burn the MPEG-2 file to a DVD, upload the WebM file to your HTML5-compliant website, and send the MP4 file to an online video platform (OVP) running Wowza Media Server 3 sometime in late 2011. This is where things get really interesting.
Transrating and Transmuxing
At NAB 2011, Wowza announced Wowza Media Server 3, which debuts a new plug-in architecture for add-ins developed for the Media Server. The first two plug-ins from Wowza provide network DVR functionality and a transcoder engine, the latter of which is relevant to this discussion.
Figure 5. Wowza Media Server 3 can transrate, transcode, and transmux.
In operation, I would send the 720p MP4 file to the server, where the transcoder would convert the H.264 streams into multiple lower resolution/data rate streams to use for adaptive streaming. For example, the transcoder might convert the 720p stream into four files, configured at 848x480 at 1000Kbps, 640x360 at 700Kbps, 480x270 at 500Kbps, and 320x180 at 200Kbps.
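To make the four renditions concrete, here is a rough sketch that writes them down as data and builds one command line per rendition. It uses the open source ffmpeg tool with the libx264 encoder purely as a stand-in; this is not how the Wowza transcoder is configured. The file names are hypothetical, and the script only prints the commands rather than running them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    width: int
    height: int
    video_kbps: int


# The four renditions from the example above.
LADDER = [
    Rendition(848, 480, 1000),
    Rendition(640, 360, 700),
    Rendition(480, 270, 500),
    Rendition(320, 180, 200),
]


def ffmpeg_args(source, rendition):
    """Build an ffmpeg command that re-encodes source as one H.264 rendition."""
    # Hypothetical output naming scheme, for illustration only.
    output = f"out_{rendition.height}p_{rendition.video_kbps}k.mp4"
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libx264",                                     # same codec as the source
        "-b:v", f"{rendition.video_kbps}k",                    # new data rate (transrating)
        "-vf", f"scale={rendition.width}:{rendition.height}",  # new resolution (transsizing)
        "-c:a", "copy",                                        # leave the audio as-is
        output,
    ]


for rendition in LADDER:
    print(" ".join(ffmpeg_args("mezzanine_720p.mp4", rendition)))
```

Each command changes both the data rate and the frame size while staying in H.264, which is why more than one label can apply to the same conversion.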
Technically, this is called transrating, because the transcoder is converting the H.264 streams to a lower bitrate using the same codec, though you could also argue for transsizing because the output resolution is changing as well. Still, as the umbrella term, transcoding is also accurate and is much more widely used. If the transcoder then converted those streams to WebM format (I know, I know) for adaptive delivery (I know, I know), it would clearly be transcoding.
I should say that the Wowza transcoder will only transcode live streams in its initial release, though on-demand conversion is coming. Never let reality get in the way of a good example.
What will be available in the first release is transmuxing, where the server will change the container format, but not the underlying file. For example, to distribute files via HTTP Live Streaming, the packets for our four files must be stored as MPEG-2 transport streams (.ts files), not MP4 files. The H.264 encoding within the wrapper is fine; it’s just the container format that has to be changed.
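For a sense of how little work transmuxing involves, here is a sketch of the same operation done by hand with ffmpeg, again as an illustrative stand-in and not the mechanism any of the servers mentioned here use. The file names are hypothetical, and the result is a single .ts file; real HTTP Live Streaming delivery also splits the stream into short segments and adds a playlist, which this sketch leaves out.

```python
def transmux_to_ts_args(source_mp4, output_ts):
    """Build an ffmpeg command that rewraps an MP4 as an MPEG-2 transport stream."""
    return [
        "ffmpeg", "-i", source_mp4,
        "-c", "copy",                  # copy audio and video as-is, no re-encoding
        "-bsf:v", "h264_mp4toannexb",  # repackage H.264 headers in the form TS expects
        "-f", "mpegts",                # write an MPEG-2 transport stream container
        output_ts,
    ]


print(" ".join(transmux_to_ts_args("out_480p_1000k.mp4", "out_480p_1000k.ts")))
```

Note that no encoder appears anywhere in the command. The compressed H.264 data is carried over untouched, so there is no generational loss and the job runs far faster than an encode.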
Wowza Media Server 2 can dynamically repackage the MP4 files into MPEG-2 transport streams today, as can Microsoft’s IIS Media Services and Akamai’s HD Network via its “in the network” repackaging. At NAB 2011, Adobe announced plans to do the same with a future version of the Flash Media Server.
Summary
When identifying a digital to digital file conversion, you have to examine both the intent of the conversion, and what actually happens during the conversion. This leads to the following definitions.
Encode (or Compress)
To convert for storage or transmission, particularly when the new file uses a lower data rate than the original, or a more lossy codec.
Re-encode
The process of saving a file back to its existing format after performing some kind of editing process.
Transcode
To convert to a different format of similar or like quality to gain compatibility with another program or application.
Transrate
To convert to a different data rate using the same format.
Transsize
To convert to a different resolution using the same format.
Transmux
To convert to a different container format without changing the file contents.
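For the programmers in the audience, the definitions above can be sketched as a small decision function. This is my own rough reading of them, not a standard. The MediaFile fields, the edited and similar_quality flags, and the 5000Kbps rate for the 720p file are all assumptions made for illustration (the article gives no data rate for that file). The function returns a list because, as we saw, more than one term can fit the same conversion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaFile:
    container: str  # wrapper, e.g. "mp4" or "ts"
    codec: str      # compression format, e.g. "h264" or "prores"
    width: int
    height: int
    kbps: int       # video data rate


def classify(src, dst, edited=False, similar_quality=False):
    """Return the terms defined above that describe converting src to dst."""
    same_codec = src.codec == dst.codec
    same_size = (src.width, src.height) == (dst.width, dst.height)
    same_rate = src.kbps == dst.kbps

    if same_codec and same_size and same_rate and not edited:
        # Contents untouched: either a new wrapper or a plain file copy.
        return ["transmux"] if src.container != dst.container else ["copy"]

    if same_codec:
        labels = ["re-encode"] if edited else []
        if not same_rate:
            labels.append("transrate")
        if not same_size:
            labels.append("transsize")
        return labels

    # Different codec: the intent (similar quality or not) decides the label.
    return ["transcode"] if similar_quality else ["encode"]


mp4_720p = MediaFile("mp4", "h264", 1280, 720, 5000)  # 5000 is a hypothetical rate
rendition = MediaFile("mp4", "h264", 848, 480, 1000)
hls_copy = MediaFile("ts", "h264", 848, 480, 1000)

print(classify(mp4_720p, rendition))  # ['transrate', 'transsize']
print(classify(rendition, hls_copy))  # ['transmux']
```

The similar_quality flag is where the gray areas live: the same ProRes to H.264 conversion comes out as a transcode or an encode depending on how you answer it.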
If you feel strongly about any of these definitions, you’re taking yourself (and this document) way too seriously.