PixelTools’ MPEGRepairHD and Expert-Caption products can insert closed captioning (line 21) into existing MPEG streams without transcoding or re-encoding the video. The closed captioning source can be the actual closed captioning text strings along with their associated time code, formatted captioning codes (as in a .scc file) or custom formatted user data. The tools can also convert the old analog captioning found in one of the top lines of an MPEG stream to proper digital captioning. The captions can be inserted into elementary video, program streams and transport streams.
Most modern broadcast and cable standards specify that the closed captioning data be placed in user_data packets in elementary MPEG video streams. Adding captioning to the elementary MPEG video increases the size of the elementary video stream, which requires re-multiplexing the audio with the new video. PixelTools products accomplish the closed caption insertion in one or two easily automatable passes. The first pass splits the muxed stream into its elementary video and audio components (if needed) and concurrently inserts your requested captioning into the elementary video user_data. If your output needs to be a multiplexed stream, the second pass multiplexes the newly updated elementary video stream with your extracted audio stream into a Program or Transport multiplexed stream. All of these steps can be automated using DOS shell commands, providing you with a single button start to adding captioning to all MPEG files in a folder.
The source of your captioning can be in one of many formats. One of the most popular is the industry standard .SCC. SCC files can be created from many popular captioning programs. SCC files contain the CEA-608 encoded user text and control characters along with the time codes at which they will be inserted into the stream. Note therefore that the SCC time codes must correspond with the time code in your MPEG stream’s GOP headers.
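Because the SCC time codes are matched against the time codes in the GOP headers, it can help to compare both on a common frame axis. The following is a minimal Python sketch, not part of the PixelTools products, that converts a non-drop-frame HH:MM:SS:FF time code into a frame count. The function name and the nominal rate of 30 frames per second are assumptions made for illustration. Drop-frame time codes (written with a semicolon) need a different calculation and are rejected here.

```python
def timecode_to_frames(timecode: str, fps: int = 30) -> int:
    """Convert a non-drop-frame HH:MM:SS:FF time code to a frame count.

    Hypothetical helper for illustration; fps=30 is an assumed nominal rate.
    """
    if ";" in timecode:
        raise ValueError("drop-frame time codes are not handled in this sketch")
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if frames >= fps:
        raise ValueError("frame field must be smaller than the frame rate")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames
```

Converting the first time code of the SCC file and the first GOP time code of the MPEG stream this way gives a quick check that the two start at the same point.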
The tools can accept a simple text file format consisting of time codes followed by the text that you wish to insert as captioning at that time. The tools can also accept a text file containing time codes and the sequence of hex characters that you wish to be inserted after the user_data header. Using this hex data mode, you can add captioning to support any closed captioning format.
The PixelTools’ captioning capability can convert the CEA-608 two characters per frame into the more advanced CEA-708 format used in HDTV. The PixelTools captioning can add a single caption track into the MPEG stream using multiple different formats. One of the most popular format settings adds the two characters per frame using the SCTE-20 format, the NTSC section of the CEA-708 data, as well as the advanced DTV windowing format of CEA-708.
Download the closedcaptioninserttrial.zip file to obtain a demo version of MPEGRepair that is configured to illustrate some of the closed captioning insertion and display capability.
Unzip the file into a directory and run the CCInsertDemo.bat file to launch the demo. The demo will automatically insert closed captioning from a .SCC file into a short MPEG file. The demo will then decode the updated MPEG file so you can see the closed captioning displayed over the video.
Closed Captioning Samples
There are a large number of methods for adding closed captioning to MPEG video files. PixelTools currently supports seven different closed caption insertion specifications.
You can download sample files containing captioning encoded with all of the formats that we support.
CEA 708 DTV: User moveable and re-sizable captioning for HDTV (cea708dtv.mpg)
CEA 708 NTSC: Backward compatible CEA-608 captioning in CEA 708 (cea708ntsc.mpg)
CEA 708 DTV & NTSC: Both CEA 608 and HDTV captioning (cea708.mpg)
SCTE 20: CEA-608 captioning in a SCTE-20 wrapper (scte20.mpg)
ATSC/CEA 608: CEA-608 captioning in an ATSC wrapper (atsc.mpg)
CEA 608 (also known as Divicom standard): CEA-608 captioning in user data (cea.mpg)
DVD: Line 21 CEA 608 captioning that can be decoded by the set top. These captions are in addition to any DVD subtitles that are decoded by the DVD player itself (dvd.mpg)
The first step is to select or browse to your source video (that is lacking captioning) in the MPEG Decode | File to Decode or Analyze section of MPEGRepairHD or Expert-Caption. Depressing the down arrow and selecting the File option brings up the source file browser to aid you in locating your source file.
Next, depress the Decode | Configure | Fix Stream option to configure your closed captioning insertion process. Check the Add Closed Captions or Commands from File: switch and enter or browse to your .SCC or .TXT file. Select your desired Closed Captioning Format and Closed Captioning Style from the lists.
Browse or enter the name of the elementary video file that will contain the captioning in the Save Fixed File edit box.
If you are starting and ending with a multiplexed MPEG file, you also need to set up the decoder to extract and save the elementary audio file during the first pass.
Depress the Decode Configure | Extract Streams tab, check the Save Elementary Audio box, and browse or type in the destination for your temporary audio file.
Depress the OK button to close the configure dialog. Start the process by depressing the Control Decode | Run. Depending on the bit-rate of your video content and the speed of your processor, you may be able to update an hour of video in about 10 minutes. If you open the Decode Statistics window by depressing the St button on the top of the interface, closed caption insertion messages should appear as captioning is being inserted into the MPEG file. If no captioning messages appear, the tool may not have found your selected captioning file or the time code in the captioning file may not correspond with the time code in your MPEG file.
The output from this pass should be an elementary video stream that contains the captions and optionally the elementary audio file that was extracted from the source MPEG file.
If your end result is to be a multiplexed stream, you need to multiplex the elementary video (now with captioning) with your recently extracted audio file.
In MPEGRepairHD, browse or enter your elementary video in the Encode Input edit box and browse or enter your output multiplexed file in the Encode Output edit box. Choose the Optimize Encode | Configure | Audio/Mux property sheet and select the No Encode; Mux Only option. Browse or enter the name of your elementary audio file in the Mux audio from compressed file edit box. Depress the Configure | Mux button to configure the multiplexing process.
In Expert-Caption, select the Decode Function Enable | Configure | Audio/Mux property sheet. Browse or enter the source and destination file names and depress the Configure | Mux button to set up the multiplexing operation.
Run the multiplex operation by depressing the Control Encode | Run button in MPEGRepairHD or by depressing Control | Run in the Expert-Caption Audio/Mux property sheet. Depending on the file bit-rate and your processor, a fast computer should multiplex an hour of video in about 5 minutes.
Closed Captioning Source
MPEGRepairHD and Expert-Caption utilize a variety of transcribed caption formats:
Using Closed Captioning .SCC file as Input
The closed captioning file can be in the popular closed caption file format (.scc), in which each line consists of a time code followed on the same line by a sequence of four-digit hex codes representing the actual closed captioning data that will be added to the video user_data.
The following is an example of an EDL with formatted closed captioning commands:
00:00:00:14 9426 9426 94ad 94ad 9470 9470 9137 9137
00:00:13:06 942c 942c
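To make the layout of these lines concrete, here is a minimal Python sketch that splits one caption line into its time code and its byte pairs and checks the odd parity that CEA-608 bytes carry in their top bit. The function names are hypothetical and this is not how the PixelTools products read the file. The sketch assumes well-formed lines like the two shown above.

```python
def parse_scc_line(line: str):
    """Split one caption line into (time code, list of (byte1, byte2) pairs)."""
    timecode, *words = line.split()
    pairs = []
    for word in words:
        value = int(word, 16)          # each word is four hex digits, i.e. two bytes
        pairs.append((value >> 8, value & 0xFF))
    return timecode, pairs


def has_odd_parity(byte: int) -> bool:
    """Return True when the byte contains an odd number of set bits."""
    return bin(byte).count("1") % 2 == 1


timecode, pairs = parse_scc_line("00:00:00:14 9426 9426 94ad 94ad 9470 9470 9137 9137")
all_parity_ok = all(has_odd_parity(b) for pair in pairs for b in pair)
```

A line whose bytes fail the parity check is a hint that the file was edited by hand or damaged in transfer.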
Using Line 21 Hex data as Input
Any pre-formatted closed captioning data (using any closed captioning standard) can be added via the HEXUSERDATA EDL command. Just add the frame number or time code of the frame user data, followed by the HEXUSERDATA command and the actual user data, up to 72 characters long. An example of the format of the hex user data is presented at the top of the sample Expert.edl provided with the product.
The following is an example of the EDL raw user data file commands:
FRAME 1038
HEXUSERDATA 04 ff 29 d0 55 9f 28 44
FRAME 1039
HEXUSERDATA 05 ff 39 d4 22 9f 28 44
FRAME 1102
HEXUSERDATA 05 ff 39 d4 22 9f 28 44
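The following Python sketch shows one way to read commands laid out like the example above into a mapping from frame number to user data bytes, for instance to sanity-check a file before running the tool. It is an illustration only: the function name is hypothetical, only the FRAME form is handled (not time codes), and any other command is silently skipped.

```python
def parse_hex_user_data(text: str) -> dict:
    """Map each FRAME number to the bytes of the HEXUSERDATA line that follows it."""
    result = {}
    frame = None
    for raw_line in text.splitlines():
        fields = raw_line.split()
        if not fields:
            continue
        if fields[0] == "FRAME":
            frame = int(fields[1])
        elif fields[0] == "HEXUSERDATA":
            if frame is None:
                raise ValueError("HEXUSERDATA found before any FRAME command")
            # bytes.fromhex accepts space-separated two-digit hex values
            result[frame] = bytes.fromhex(" ".join(fields[1:]))
    return result


sample = """FRAME 1038
HEXUSERDATA 04 ff 29 d0 55 9f 28 44
FRAME 1039
HEXUSERDATA 05 ff 39 d4 22 9f 28 44
"""
parsed = parse_hex_user_data(sample)
```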
Using raw captioning text as Input
A text file containing the actual words to be displayed at the associated time code can be used. This file can have any name.
The following is an example of the closed captioning file commands:
FRAME 1038
CLOSEDCAPTION
The little brown cow jumped over
FRAME 1146
CLOSEDCAPTION
the moon!
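If your captions already exist as a list of frame numbers and text, a file in this layout can be generated by a short script. The Python sketch below is a hypothetical helper, not a PixelTools utility, and it assumes the three-line layout shown in the example above (FRAME, CLOSEDCAPTION, then the text on its own line).

```python
def build_caption_commands(captions) -> str:
    """Build FRAME / CLOSEDCAPTION command text from (frame, text) pairs."""
    lines = []
    for frame, text in sorted(captions):   # emit commands in frame order
        lines.append(f"FRAME {frame}")
        lines.append("CLOSEDCAPTION")
        lines.append(text)
    return "\n".join(lines) + "\n"


commands = build_caption_commands([
    (1146, "the moon!"),
    (1038, "The little brown cow jumped over"),
])
```

The resulting string can be written to a file of any name and selected in the Fix Stream dialog.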
Using Analog Line 21 data as Input
If your source video includes the analog captioning in one of the visible lines, this data can be extracted and re-added in one of the modern digital captioning formats. Select the Decode Configure | Fix Stream | XCode Analog Captioning from line: option and enter the line containing the analog captioning.
Closed Captioning Format
The Closed Captioning Format selection controls the actual formatting of the text or SCC EDL and DDL commands into the MPEG user data and controls the specific user data locations. The Format selection control is located in the Encode section of MPEGRepairHD in the Optimize Encode | Configure | Line 21 property page. Note that the insertion of closed captioning data into an existing stream using the Decode Fix process utilizes the Closed Captioning Format set in the Encode configuration.
The following Closed Captioning format options are available:
CEA 708 DTV
This option causes up to eighteen closed captioning characters to be inserted into picture header user data fields in display order. When converting text to CEA-708 data, DTV Window 0 will be initialized and enabled. Your closed captioning text will be written to this window at the specified frame or time code.
Analyzing the actual user data using the MPEGRepairHD Decode Statistics window will show Picture user data that will start with the Hex values: 47 41 39 34
CEA 708 NTSC
This option causes up to two closed captioning characters to be inserted into picture header user data fields in display order. These bytes are added to the NTSC channel in the field per your Closed Captioning Style selection. This channel was designated by the CEA to facilitate legacy closed caption processing in CEA-708 systems.
Analyzing the actual user data using the MPEGRepairHD Decode Statistics window will show Picture user data that will start with the Hex values: 47 41 39 34
CEA 708 DTV & NTSC
This option causes up to eighteen closed captioning characters to be inserted into picture header user data fields in display order and up to two closed captioning characters to be added to the NTSC channel in your selected field. When converting text to CEA-708 data, DTV Window 0 will be initialized and enabled. Your closed captioning text will be written to this window at the specified time. This option causes the same closed captioning text to be written to both the CEA-708 window and the NTSC channels.
Analyzing the actual user data using the MPEGRepairHD Decode Statistics window will show Picture user data that will start with the Hex values: 47 41 39 34
CEA 608 (default)
This option causes up to two closed captioning characters to be inserted into picture header user data fields in transmission order. The characters will be entered into the Field 1 or Field 2 closed captioning field per your Closed Captioning Style selection.
Analyzing the actual user data using the MPEGRepairHD Decode Statistics window will show Picture user data that will start with the Hex values: 02 09
ATSC / CEA 608
This option causes up to two closed captioning characters to be inserted into picture header user data fields in display order. The data is formatted per CEA-608 and is wrapped in a SCTE-21 and ATSC header. The characters will be entered into the Field 1 or Field 2 closed captioning fields per your Closed Captioning Style selection.
Analyzing the actual user data using the MPEGRepairHD Decode Statistics window will show Picture user data that will start with the Hex values: 47 41 39 34
DVD
This option causes up to 30 closed captioning characters to be inserted into GOP header user data fields. The first set of user data will be inserted at the first GOP at or after the Frame number or time code set in the EDL or DDL file. The remaining user data (over the 30 character per GOP limit) will be inserted in subsequent GOP headers. The characters will be entered into the Field 1 or Field 2 closed captioning field per your Closed Captioning Style selection.
Analyzing the actual user data using the MPEGRepairHD Decode Statistics window will show user data at EVERY GOP header that will start with the Hex values: 43 43 01 f8 9e. The GOP headers will contain the closed captioning data or will contain NULL data.
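The way a long caption spills over into later GOP headers can be pictured with a few lines of Python. This is a simplified sketch with a hypothetical function name: it splits plain text at the 30 character limit described above and ignores the control codes that a real caption would also need to carry.

```python
def split_caption_for_gops(text: str, limit: int = 30) -> list:
    """Split caption text into chunks of at most `limit` characters, one per GOP."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [text[start:start + limit] for start in range(0, len(text), limit)]


chunks = split_caption_for_gops("The little brown cow jumped over the moon!")
chunk_lengths = [len(chunk) for chunk in chunks]
```

Here the 42 character caption needs two GOP headers, so the last words appear one GOP later than the first.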
SCTE 20
This option causes up to two closed captioning characters to be inserted into picture header user data fields in display order. The data formatted per CEA-608 is inserted per the ANSI/SCTE 20 standard. The two captioning characters are not byte aligned and so will not be immediately recognizable when extracted in the MPEGRepairHD decode statistics. The characters will be entered into the Field 1 or Field 2 closed captioning fields per your Closed Captioning Style selection.
Analyzing the actual user data using the MPEGRepairHD Decode Statistics window will show 7 bytes of user_data in the Picture Headers that will start with the Hex values: 03 81.
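The leading hex values listed for each format above can be used to make a first guess at which format a block of extracted user data uses. The Python sketch below is an illustration with hypothetical names, built only from the values quoted in this section. Note that 47 41 39 34 is the ASCII text "GA94" and is shared by four of the formats, so the prefix alone cannot tell those apart. The DVD entry matches on the first four of the listed bytes only, which is an assumption of this sketch.

```python
# (prefix bytes, format description), taken from the values quoted above
USER_DATA_PREFIXES = [
    (bytes.fromhex("47413934"),
     "CEA 708 DTV, CEA 708 NTSC, CEA 708 DTV & NTSC, or ATSC / CEA 608"),
    (bytes.fromhex("434301f8"), "DVD"),
    (bytes.fromhex("0381"), "SCTE 20"),
    (bytes.fromhex("0209"), "CEA 608"),
]


def guess_caption_format(user_data: bytes) -> str:
    """Return the format whose documented prefix matches, or 'unknown'."""
    for prefix, name in USER_DATA_PREFIXES:
        if user_data.startswith(prefix):
            return name
    return "unknown"
```

For example, guess_caption_format(bytes.fromhex("0381ff")) returns "SCTE 20".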
The ddl file format is similar to that of the edl file format presented above. In addition to the text strings supported with the CLOSEDCAPTION command, any formatted user data can be inserted via the HEXUSERDATA command. Also, the .scc closed caption format is supported. (Be sure to rename the .scc file as Expert.ddl).
Displaying closed captioning via MPEGRepairHD GUI
The MPEGRepair and Expert-Caption decoders can extract and display closed captioning information in several formats.
Extracting Digital Closed Captioning Data
Select your file containing the digital closed
captioning in the File to Decode or Analyze Edit box. In the
Decode Configure | Extract Streams dialog, check the Save Digital Data
option and enter or browse the name of the file to hold your extracted
data. Running the decoder will cause all user_data to be saved in your
selected output file.
Extracting Analog Closed Captioning Data
Select your file containing the analog closed
captioning in the File to Decode or Analyze Edit box. In the
Decode Configure | Extract Streams dialog, check the Save Analog Data
option and enter or browse the name of the file to hold your extracted
data.
Running the decoder will cause the two CC bytes
to be extracted from the analog modulated white line at the top of
each frame. The parity
bit will be stripped off of the data before it is saved. The data
in your selected analog CC file will be ASCII text which will include
the CC words displayed on the screen. The data will also include
the CC control characters which will be intermixed with the readable
text.
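Because the control characters are intermixed with the readable text, a small script can help separate the two when reviewing an extracted file. The Python sketch below is only an illustration under stated assumptions: it assumes the saved data is a plain sequence of two-byte pairs with parity already removed, treats a pair whose first byte is in the range 0x10 to 0x1F as a CEA-608 control code, and approximates the caption character set as ASCII. The function name is hypothetical.

```python
def separate_text_and_controls(data: bytes):
    """Split parity-stripped caption byte pairs into readable text and control pairs."""
    text_chars = []
    controls = []
    # Walk the data two bytes at a time; a trailing unpaired byte is ignored.
    for index in range(0, len(data) - 1, 2):
        first, second = data[index], data[index + 1]
        if 0x10 <= first <= 0x1F:
            controls.append((first, second))      # two-byte control code
        else:
            for value in (first, second):
                if 0x20 <= value <= 0x7E:         # printable range only
                    text_chars.append(chr(value))
    return "".join(text_chars), controls


text, controls = separate_text_and_controls(bytes([0x14, 0x2C, 0x48, 0x69]))
```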
Extraction in the Statistics Log
All data displayed in the Decode Statistics Log
will also be stored in the Decode.log file. You can select between
the raw user_data and the translated ASCII text and control codes using
the Decode | Statistics | Video Statistics menu options.
Adding closed captioning via ExpertWorkshop SDK
During Decoding
To enable the addition of closed captioning data while decoding a file, create a decoding ddl text file as described above using the CLOSEDCAPTION or HEXUSERDATA commands, or use one of the supported closed captioning file formats. Store the name of the resultant MPEG file that will contain the closed captioning data in the DecodeFixName member of the ExpDecFix_str. The ExpDecFix_str is a member of the VidDecodeParams_struc, which is sent to the decoder as a member of the ExpDecConfigure API call. During calls to ExpDecodeNextFrame, the source stream is written to the resultant MPEG stream along with the appropriate user_data containing the closed captioning text per the ddl file.
Closed Captioning Insertion Format
The Closed Captioning Format is set using the ExpEncodeConfigure call where the
ExpEncParameters_str | Line21Params_struc CCType contains the actual format.
CCType of:
CEA608 = 0
ATSC_CEA608 = 1
DVD_CC = 2
CEA708_DTV = 3
CEA708_NTSC = 4
CEA708_DTV_NTSC = 5
SCTE20 = 6
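When driving the tools from a script, it can be convenient to mirror these values in a named enumeration rather than passing bare numbers. The following Python sketch simply restates the table above. It is not the SDK's own declaration, which is not reproduced here, and the class is only an illustration.

```python
from enum import IntEnum


class CCType(IntEnum):
    """Closed captioning format values as listed above (illustrative mirror)."""
    CEA608 = 0
    ATSC_CEA608 = 1
    DVD_CC = 2
    CEA708_DTV = 3
    CEA708_NTSC = 4
    CEA708_DTV_NTSC = 5
    SCTE20 = 6


selected = CCType.CEA708_DTV_NTSC
numeric_value = int(selected)   # the number that would be stored in CCType
```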
Specification
The CGMS and APS flags are added to the MPEG video user data as Extended Data in the line 21 emulation of digital data, as specified in the following: the Advanced Television Systems Committee Inc (ATSC) Digital Television Standard A/53; the ISO/IEC 13818-2 MPEG video standard; the Society of Cable and Telecommunications Engineers (SCTE) ANSI/SCTE 20 2004 Methods for Carriage of Closed Captions and Non real-time Sampled Video and ANSI/SCTE 21 2001 Standard for Carriage of NTSC VBI Data in Cable Digital Transport Streams specifications; the Consumer Electronics Association CEA-708-B Digital Television (DTV) Closed Captioning specification; and the Consumer Electronics Association CEA-608-C Line 21 Data Service specification.
Supplemental Information
Line 21 data can have many uses and is governed by multiple overlapping
specifications in support of government and industry regulations. A
little history may aid in the understanding of the features.
Analog
In the 1970’s, extra bandwidth was exploited in the television
broadcast signal that occurred in a normally dead period while the electron
gun was being repositioned to start the painting of each new field. A
US federal mandate required that most broadcasts include closed captioning.
It was determined that two ASCII characters could reliably be included
in the vertical retrace interval (line 21) right before the first visible
horizontal line. Simple decode circuitry was mandated to
be included in all TVs that would provide the extraction and storage
of the line 21 data and allow the TV user to add the closed caption characters
as an overlay to the video in the next fields. Other industry users
fought for access to the line 21 data. The Copy Generation Management
System (CGMS) provided a three state flag that would cause new industry
and federally mandated TV decoders to prevent copying of selected video
content. The flags are “Copy Freely”, “Copy Once”,
and “Copy Never”. The old Macrovision technology of
interfering with the TV broadcast synchronization signals to prevent
video copying was also embedded in line 21 data with Analog Protection
System (APS) flags that instruct the video decoder to interfere with
selected synchronization signals. Video content ratings, program
type information, and program schedule data are also broadcast via
line 21.
Digital
The advent of MPEG compressed video brought with it more possibilities
for a higher bandwidth closed captioning channel and other supplementary
data channels. One of the digital constraints, though, was to make
the new system backwards compatible with older TV technologies, i.e., to
allow for just two bytes of data per field. This evolution has resulted in
a large array of specifications that support the new and old capabilities.
Line 21 data can be added to user_data of each frame of elementary
MPEG video. Line 21 data can also be added as a special stream
within a MPEG transport stream.
Digital Television offers a variety of display sizes and aspect ratios. To accommodate these different dimensions, the CEA-708 specification includes enhanced display features as compared to the older CEA-608 specification. CEA-608 decoders were required to place the captioning overlay in a defined section of the screen and with a specified font size. CEA-708 decoders draw the captioning text in a user defined window anywhere on the display. The user can interactively move and re-size this captioning window. The captioning commands define anchor points that will not move when this window is resized. As such, CEA-708 requires quite a bit more control overhead to utilize the advanced features. This makes CEA-608 streams not easily converted into full featured CEA-708 streams. CEA-708 does include four optional bytes per frame which are designated as NTSC captioning bytes for backward compatibility with CEA-608 decoders. It is possible to add both the DTV CEA-708 captioning in addition to the NTSC bytes in the same CEA-708 headers.
Specifications
The Consumer Electronics Association publishes CEA-608-C Line 21 Data
Service that details data formats for closed captioning services and
extended data (including CGMS and APS). The CEA publishes CEA-708-B
Digital Television Closed Captioning that details usage of a 9600 bps
closed channel (ten times the bandwidth of the original channel). The
Advanced Television Systems Committee Inc (ATSC) publishes the Digital
Television Standard A/53 specification that details a format for adding
line 21 data to user_data of an MPEG-2 stream. The International
Organization for Standardization (ISO) publishes the ISO/IEC 13818-2
MPEG video standard that details the inclusion of user_data in compressed
video streams. The Society of Cable and Telecommunications Engineers
(SCTE) publishes the ANSI/SCTE 20 2004 Methods for Carriage of Closed
Captions and Non real-time Sampled Video and ANSI/SCTE 21 2001
Standard for Carriage of NTSC VBI Data in Cable Digital Transport Streams
specification that defines additional VBI services.
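The "ten times" figure can be checked with simple arithmetic. The Python sketch below assumes a nominal NTSC rate of 60 fields per second (the actual rate is about 59.94) and two line 21 bytes per field, as described in the history above.

```python
BYTES_PER_FIELD = 2        # two line 21 bytes per field
BITS_PER_BYTE = 8
FIELDS_PER_SECOND = 60     # nominal NTSC field rate (assumption; about 59.94 in practice)
DTV_CHANNEL_BPS = 9600     # CEA-708 caption channel rate quoted above

line21_bps = BYTES_PER_FIELD * BITS_PER_BYTE * FIELDS_PER_SECOND
bandwidth_ratio = DTV_CHANNEL_BPS / line21_bps
```

With these nominal figures the original channel works out to 960 bps, so the 9600 bps channel is ten times larger.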
PixelTools line 21 support
At the lowest level, MPEGRepairHD is capable of adding frame accurate
user_data during MPEG encoding or to an already existing video elementary
stream. This will support all of the above data services provided
that the data is packaged in the correct format. MPEGRepairHD can add
up to 128 eight bit data bytes to designated frames during encoding
using an Encoding Decision List (EDL) file or to an existing MPEG stream
using a Decoding Decision List (DDL) file. These text files specify
the frame number and corresponding data bytes which will be inserted
in the user_data at the selected picture header. The data bytes
must be formatted appropriately per the desired usage. The encoding
EDL and the decoding DDL files utilize the identical format.
At a higher level, MPEGRepairHD can add CEA-708 or CEA-608 closed captions during encoding or to an existing stream as listed
in a simple text file. The EDL command file reads closed caption text
strings, formats your closed caption text and inserts it into user_data
fields starting at your requested frame number. The MPEGRepairHD
Decode | Configure | Fix Stream dialog also allows selection of a similar
DDL closed caption text file that will be formatted and added to an
existing MPEG stream. The Encode and Decode closed caption text commands
are of identical format.