mpeg2enc - MPEG-1/2 encoder
For most purposes this elementary stream output will need to be multiplexed with one or more audio streams into a program/systems stream using the mplex(1) tool.
Note that although this manual page aims to explain how mpeg2enc can be used effectively, it is not intended as an introduction to MPEG-1/2 video, which is a fairly complex topic in its own right. The MPEG video format is a somewhat baroque standard with a great many options, not all of which are necessarily easy to explain or even particularly useful in the context of a software encoder.
Much useful practical information for novices can be found in the
mjpeg-HOWTO document that should have been installed with the mjpegtools(1)
package. Further information and useful supporting software can be found
on the mjpegtools web-site:
Set the MPEG profile to use. The MPEG standards support a vast number of options. In practice, different applications of the MPEG format place additional constraints on the form of the MPEG stream. Setting this flag selects the kind of stream to produce.
- Generic MPEG1.
A basic MPEG-1 profile that lets most parameters
be adjusted for particular applications using the other flags.
Typical applications would be to produce a variable bit-rate MPEG-1
stream with big buffers and a high data-rate for software playback
on a computer.
- Standard VCD.
An MPEG-1 profile conforming exactly to the VCD 2.0 specification. Flag settings that would result in a non-standard stream structure are simply ignored.
- User VCD.
As for the Standard VCD profile, but bit-rate and video buffer size can be set to non-standard values. Frame size may also be non-standard. Bit-rate and buffer sizes default to those for standard VCD.
- Generic MPEG2.
A basic MPEG-2 profile that lets most parameters be adjusted for particular applications using the other flags. Typical applications would be to produce an MPEG-2 stream with big buffers and long GOPs for software playback on a computer.
- Standard SVCD.
An MPEG-2 profile conforming exactly to the SVCD 2.0 specification. Flag settings that would result in a non-standard stream structure are simply ignored.
- Non-standard SVCD.
As for profile 4 but bit-rate, video buffer size, GOP sizes and structure can be set to non-standard values. Frame size may also be non-standard. Bit-rate and buffer sizes default to those for standard SVCD.
- VCD Stills sequence.
Encodes the special style of MPEG stream used for still images on VCDs. To use this profile you must set the target size you wish to compress the images to using the -T flag. Reasonable values are around 35KB for standard resolution stills (352 wide) and 120KB for high resolution stills (704 wide).
- SVCD Stills sequence.
Encodes the special style of MPEG stream used for still images on SVCDs. Both standard (480 wide) and high resolution (704 wide) images are supported. As with VCD stills you select how big each compressed still should be using the -T flag.
- DVD MPEG-2 for 'dvdauthor'
This version adds special dummy navigation packets into the output stream that the dvdauthor tool fills in to make a proper .VOB for authoring. Bit-rate defaults to 7500kbps, buffer sizes to the maximum permitted by the DVD specification.
- DVD MPEG-2.
Just a very basic implementation. Useful with the DXR2 board and similar hardware that can decode MPEG-2 only if it is presented in a DVD-like form. Bit-rate defaults to 7500kbps, buffer sizes to the maximum permitted by the DVD specification.
Set verbosity level to num. 0 = warnings and errors only, 1 = information as well, 2 = really verbose.
Specify which quantization matrices to use instead of the defaults (which can be specified by using "-K default"). Using "-K hi-res" is identical to using the -H option. The value kvcd uses the Kvcd.Net matrices from http://www.kvcd.net/; the value tmpgenc invokes the TMPGEnc matrices from http://www.tmpgenc.net/e_main.html. On average (this depends on the source material), the tmpgenc tables reduce the average bitrate by about 10% and the kvcd tables reduce bitrate by about 16% (compared to the default tables).
Specify when a special 'unit coefficient elimination' algorithm should be applied to the encoded picture blocks. Basically, this procedure forces blocks of a type that don't carry much information but are expensive to encode to be simply skipped. The larger the number the more potentially visible this skipping is likely to be but the more compression is boosted. A negative value means that all coefficients are zeroed, positive means only texture but not base intensity coefficients are zeroed. Values of around 10 or -10 seem to work well with high quality source material. For noisier material it might be worth trying 20 or -20.
Specify how many bi-directionally (B type) difference-encoded frames should be encoded between reference (I or P) frames. The default is 0 except for VCD encoding where it is 2 B frames as required by the standard. Experts differ on how much using B frames improves compression. In practice unless you have really clean material they tend to be fairly useless and sometimes even harmful. Encoding is significantly faster and uses less memory if no B frames are encoded and compression is rarely more than marginally worse.
N.B. If the profile you have selected sets particular values for these parameters, it will override these adjustment flags. In particular, there is almost nothing that can be adjusted for the standard VCD and SVCD profiles.
The bit-rate of the output video stream in kbit/sec. The default is exactly the bit-rate required for VCD streams. Selecting one of the non-generic formats sets a default bit-rate that makes sense for the selected format. If variable bit-rate mode has been selected (see the -q option) this is the maximum bit-rate of the stream.
The maximum video buffer usage required to decode the stream in KBytes. The default is 46KB, the (tiny) size specified for VCD. The size to use for SVCD is the (more reasonable) 230KB. If you are encoding for a half-decent software decoder it makes sense to push this up to 500KB or more.
Set the target size for (S)VCD still images in KB.
This flag forces the encoder to generate a "sequence header" at the start of every group-of-pictures. This is needed by some player hardware to support fast forward/rewind/random access functions but is a waste of bits otherwise.
The SVCD MPEG-2 profile demands special "Scan Offset" values, which are (in effect) pointers to the places on the final SVCD disk where the video 0.5 seconds and around 5-10 seconds behind and ahead in the stream is located. The intended use of this information is to support "Fast forward/Rewind" functions. Unfortunately, at the time mpeg2enc encodes the video it doesn't know where the video is finally going to end up. So special dummy "Scan Offset" values are written, which are intended to be filled in during the creation of the SVCD image. Currently the GNU vcdimager tool handles this task. However, in some circumstances the dummy offsets can cause problems. This flag stops mpeg2enc generating them.
In the official SVCD standards the field in the MPEG-2 header information that passes on the encoder's "recommended" horizontal resolution to decode the stream to is supposed to take the values 540 (for 4:3 sequences) or 720 (for 16:9 sequences). In practice many players don't work unless the value is 480. This flag forces mpeg2enc to follow the official standard. It is worth trying if 16:9 sequences play at 4:3 aspect ratio.
This flag deactivates all constraints for the maximum video samplerate or video resolution. Its purpose is to allow the encoding of unusual resolutions of MPEG video (e.g. 2200 x 576, 160 degrees FOV VR-theatre MPEG movies), but it should be used with care: it can possibly circumvent a number of other security checks, and untested settings can cause mpeg2enc to crash in this mode.
This flag overrides the field-order specified in the interlacing tag of the input stream header. (If you need this option, it indicates a problem in the capturing/encoding process where the temporal order of the two fields in each frame has been mislabeled. The effect of this is weird "juddering" when playing back the stream on a TV. Check the mjpeg-howto for more information about interlacing problems.)
Force the input stream to be treated as NTSC|PAL|SECAM regardless of what the stream header might suggest. Basically this just sets the defaults for a bunch of other options.
Set the frame-rate of the output-stream. By default, this value is
inferred from the input header. Currently only the standard
MPEG rates are supported. Eventually more-or-less arbitrary rates
will be possible.
0 - illegal
1 - 24000.0/1001.0 (NTSC 3:2 pulldown converted FILM)
2 - 24.0 (NATIVE FILM)
3 - 25.0 (PAL/SECAM VIDEO / converted FILM)
4 - 30000.0/1001.0 (NTSC VIDEO)
5 - 30.0
6 - 50.0 (PAL FIELD RATE)
7 - 60000.0/1001.0 (NTSC FIELD RATE)
8 - 60.0
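The code table above can be sketched as a small lookup. This is an illustrative helper only, not part of mpeg2enc; the function name `frame_rate_code` and the matching tolerance are assumptions made for the example.

```python
from fractions import Fraction

# MPEG frame-rate codes as listed above; code 0 is illegal and therefore omitted.
FRAME_RATES = {
    1: Fraction(24000, 1001),
    2: Fraction(24),
    3: Fraction(25),
    4: Fraction(30000, 1001),
    5: Fraction(30),
    6: Fraction(50),
    7: Fraction(60000, 1001),
    8: Fraction(60),
}


def frame_rate_code(rate, tolerance=Fraction(1, 1000)):
    """Return the code whose exact rate lies within `tolerance` of `rate`.

    Returns None when no standard MPEG rate matches. The tolerance is an
    assumption chosen so that rounded values such as 29.97 still match.
    """
    rate = Fraction(rate).limit_denominator(100000)
    for code, exact in FRAME_RATES.items():
        if abs(exact - rate) <= tolerance:
            return code
    return None


for fps in (25, 29.97, 23.976, 15):
    print(fps, "->", frame_rate_code(fps))
```

Rates are held as exact fractions so that the NTSC values (30000/1001 and so on) are not confused with their rounded decimal forms.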
Set the playback aspect ratio code of the encoded video. By default, this
value is inferred from the input header.
1 - 1:1 display
2 - 4:3 display
3 - 16:9 display
4 - 2.21:1 display
Setting -p only makes sense for 24 frame/sec movie source material. It sets flags in the output stream that tell the decoder to play the movie as NTSC 60 field/sec video using "3:2 pulldown". This is vastly more efficient than encoding as 60 field/sec video. The classic application is to transcode a PAL-encoded movie (24 fps played too fast at 25 fps!) into NTSC (see the -f flag).
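The arithmetic behind 3:2 pulldown can be checked with a short sketch. It assumes the film rate is the 24000/1001 value from the frame-rate table (code 1); the helper name `pulldown_field_counts` is hypothetical and exists only for this illustration.

```python
from fractions import Fraction


def pulldown_field_counts(num_frames):
    """Fields displayed per film frame under 3:2 pulldown: 3, 2, 3, 2, ..."""
    return [3 if i % 2 == 0 else 2 for i in range(num_frames)]


film_rate = Fraction(24000, 1001)            # frames/sec, code 1 in the table
counts = pulldown_field_counts(4)            # one full pulldown cycle
fields_per_frame = Fraction(sum(counts), len(counts))  # 10 fields / 4 frames
field_rate = film_rate * fields_per_frame    # resulting display field rate

print(counts)       # [3, 2, 3, 2]
print(field_rate)   # 60000/1001
```

Four film frames are spread over ten fields, so the encoder only has to code the original film frames while the decoder produces the NTSC field rate.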
MPEG encoding is a task that can be split over a small number of CPUs quite efficiently. Mpeg2enc can be internally set to split major processing tasks between a number of concurrent threads. This flag adjusts the multi-threading to the optimum to utilise the specified number of CPUs.
It should be noted that even with 1 CPU present some multi-threading is performed: frame input takes place in parallel with encoding. This allows good performance to be achieved when a separate machine is being used for pre-processing (decoding from MJPEG, scaling, denoising etc) with the final result piped to mpeg2enc (e.g. using rsh or ssh).
Setting -M 0 disables all multithreading. This is sometimes useful for debugging or to achieve maximum CPU efficiency on a shared machine. Setting -M 3 on a dual-CPU machine will produce slightly faster results than -M 2 at the price of slightly less CPU efficiency. This is useful if nothing else needs to be done on the encoding machine. In practice there is little point setting -M greater than 4 even if the CPUs are available, due to the fairly coarse-grained parallelism used.
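The guidance above can be condensed into a small helper for scripts that launch the encoder. This is only a sketch: the function name `suggest_threads` is hypothetical, and extending the "one extra thread" advice from the dual-CPU case to any machine with two or more CPUs is an assumption, not something the text states.

```python
def suggest_threads(cpus, dedicated=False):
    """Suggest a value for -M from the number of available CPUs.

    dedicated=True means nothing else needs to run on the machine, in which
    case one extra thread is requested (assumption: this generalises the
    dual-CPU advice above). The result is capped at 4, beyond which the
    text says there is little benefit.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    value = cpus + 1 if (dedicated and cpus >= 2) else cpus
    return min(value, 4)


print(suggest_threads(2))                  # 2
print(suggest_threads(2, dedicated=True))  # 3
print(suggest_threads(8))                  # 4
```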
Minimum quantization of the output stream. Quantization controls the precision with which image information is encoded. The lower the number the higher the quality but the greater the required data-rate. For VCD resolution anything below 8 or so produces pretty good quality video. For SVCD the equivalent level is around 10. If this option is set a variable bit-rate stream is produced. This is more efficient overall but variable bit-rate MPEG-1 cannot be played by many hardware decoders and/or DVD/(S)VCD drives. If you intend to use a software decoder you'd be insane not to use variable bit-rate.
If this option is set without a maximum bit-rate being specified then quantization is fixed at the specified value.
Set the sequence picture structure and block encoding type for MPEG-2 streams. By default, this value is inferred from the interlacing tag of the input stream. Setting 0 encodes frame-by-frame with support for interlaced video turned off, and specifies that progressive chroma subsampling has been used. Setting 1 encodes frame-by-frame with interlace-adapted motion compensation and block encoding, and specifies that interlaced chroma subsampling has been used. Setting 2 encodes interlaced material field-by-field, which will produce more accurate results for highly textured interlaced material with lots of motion, at the expense of generally less efficiency.
These flags set the minimum and maximum group-of-picture (GOP) size for the output MPEG stream. By default both are set to 12 to suit VCD encoding. If the two values are not identical mpeg2enc applies a simple scene-change detection procedure to try to select GOP sizes that ensure big changes of image coincide with the fully-encoded I-frame starting a new GOP. This can help prevent transient "blockiness". For VCD/SVCD/DVD the largest GOP size is 15 for PAL or 18 for NTSC. Reasonable minimum GOP sizes are 6 or 9. A larger GOP size can help reduce the bit-rate required for a given quality. However, this really only applies to high-quality source material with little noise (e.g. digital video). For broadcast material there is little point setting GOP size much beyond 21 or 24. Even with good source material diminishing returns set in quite rapidly.
Note: Mpeg2enc is currently hard-wired to produce 2 B frames between each I/P frame unless the GOP size forces fewer. This is reasonable for medium to high bit-rates (>= 1Mbps) but probably sub-optimal for low-bit-rate encoding.
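The interaction between the minimum and maximum GOP sizes and scene changes can be illustrated with a simplified sketch. It is not mpeg2enc's actual procedure: the scene-change positions are supplied by the caller rather than detected, there is no look-ahead, and the name `split_into_gops` is hypothetical.

```python
def split_into_gops(num_frames, scene_changes, min_gop=6, max_gop=15):
    """Return the frame indices at which a new GOP (and so an I-frame) starts.

    A new GOP is started at a scene change once the current GOP has reached
    min_gop frames, and is forced once it has reached max_gop frames.
    """
    changes = set(scene_changes)
    starts = [0]
    for frame in range(1, num_frames):
        length = frame - starts[-1]          # frames in the current GOP so far
        if length >= max_gop or (length >= min_gop and frame in changes):
            starts.append(frame)
    return starts


# Scene changes at frames 10 and 22 pull the GOP boundaries onto those frames.
print(split_into_gops(40, [10, 22]))                       # [0, 10, 22, 37]
# With identical minimum and maximum sizes every GOP has the same length.
print(split_into_gops(40, [10, 22], min_gop=12, max_gop=12))  # [0, 12, 24, 36]
```

A scene change that arrives before the minimum size is reached is simply ignored in this sketch, which is one reason a smaller minimum gives the selection more freedom.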
Setting this flag causes the encoder to generate only "closed" GOPs (Groups of Pictures) that can be decoded without reference to their predecessor. This is useful for streams that are supposed to be used in multi-angle DVDs and applications where more easily editable MPEG is required.
This flag forces the GOP size selection to choose sizes that ensure 2 B frames appear between adjacent I/P frames. Several common MPEG-1 decoders can't handle streams where fewer than 2 B frames appear between I/P frames.
This flag sets the amount by which quantization is reduced for blocks containing unusually large amounts of sharp image detail. Setting this value to 0.0 produces the most efficient use of bits but may cause visible artifacting around detailed sections. A larger value may help suppress artifacts. However, on noisy source material this may cause a "swimming" effect on textured backgrounds as the noise causes blocks to be boosted at random. The default is 0.0 (off).
Luma variance below which quantization boost (-Q) is activated.
This flag sets the motion estimation search radius. For most purposes the default (16) should be just fine. For high-resolution MPEG-2 and active scenes it may be worth bumping it up. However, this will make encoding significantly slower. There is little point reducing the radius. Speed gains are not huge and the impact on quality can be marked.
These options control how radical the encoder is in throwing away apparently poor candidate estimates during motion estimation. A setting of 1 means very few blocks are discarded early which makes for slow encoding but quality as good as it gets. A setting of 4 makes for fast encoding but can impact quality. The -4 flag controls discarding during the initial 4*4 sub-sampled search stage, the -2 flag controls discarding during the secondary 2*2 sub-sampled stage.
Setting this flag adjusts the way texture detail is quantized to reduce the precision with which high-frequency information is encoded. This is very useful for mildly noisy sources. If you have really noisy material the filtering tools available in mjpegtools are a much better bet. The specified number must be in the range 0.0 to 2.0; 2.0 gives the maximum quantization boost. A useful number to use would be 1.5 or 1.0.
Setting this flag makes the encoder encode as much high-frequency information as possible. This is a good setting for maximising quality at VCD resolution with good quality low-noise source material. It can also help with "swimmy" material if you can spare the bit-rate!
Specifies the precision of the DC component. The default is 9. Most commercial DVDs use 10. Using 9 instead saves a few bits. Using 10 might help prevent larger areas of nearly the same color from looking blocky. A value of 11 is only valid at the next MPEG-2 profile/level, so it is not currently a meaningful value to use.
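Why a higher DC precision can reduce blockiness in flat areas is easier to see from the quantizer step it implies. The sketch below relies on background from the MPEG-2 video specification rather than on this page: the intra DC multiplier is 8, 4, 2 or 1 for precisions of 8, 9, 10 or 11 bits. The function names are hypothetical, and the round-to-nearest rule is an assumption about how an encoder might quantize.

```python
def dc_step(precision_bits):
    """Quantizer step applied to the intra DC coefficient."""
    if precision_bits not in (8, 9, 10, 11):
        raise ValueError("DC precision must be 8, 9, 10 or 11")
    return 1 << (11 - precision_bits)


def reconstruct_dc(dc, precision_bits):
    """Quantize a DC value to the nearest step and reconstruct it."""
    step = dc_step(precision_bits)
    return ((dc + step // 2) // step) * step


for bits in (9, 10):
    # Largest error this rounding introduces into a block's DC value.
    print(bits, "bits: step", dc_step(bits), "max error", dc_step(bits) // 2)
```

Going from 9 to 10 bits halves the step, so the brightness jump between neighbouring blocks in a smooth gradient can be correspondingly smaller.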
This flag allows the target size of individual sequences in the final multiplexed stream to be set in MBytes. If set, mpeg2enc keeps track of how large the eventual stream is getting and inserts a sequence split (actually: sequence end / sequence start) into the output stream each time it reaches the specified limit. The multiplexer mplex(1) can recognise these splits and start a new multiplexed output file each time it encounters one. In this way it is easy to automatically ensure each component sequence file can be burnt onto a CD-R and still be played as a stand-alone MPEG sequence. For the SVCD and VCD profiles the default target sequence length is 700 MBytes. For other profiles the default is that sequence length is unlimited.
Since mpeg2enc can't read minds it cannot know in advance what other material will be multiplexed with the output video stream. Thus, to get its calculations of where to insert split points right, it needs to be told the combined data-rate of the other material that is eventually to be multiplexed with the video. This flag allows this rate to be specified in kbit/sec.
A good rule of thumb is to use the total rate of all the other streams plus 1% of the total rate including video.
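The rule of thumb can be worked through as follows. The sketch reads "total rate including video" as the video rate plus all other stream rates. The example figures (1150 kbit/sec video, 224 kbit/sec audio), the function names, and the unit conventions (1 MByte = 2**20 bytes, 1 kbit = 1000 bits) are all assumptions made for illustration.

```python
def non_video_rate(video_kbps, other_kbps, overhead=0.01):
    """Rate of all other streams plus 1% of the total rate including video."""
    others = sum(other_kbps)
    return others + overhead * (video_kbps + others)


def seconds_per_sequence(limit_mbytes, video_kbps, non_video_kbps):
    """Rough playing time before a sequence split at the given size limit."""
    limit_bits = limit_mbytes * 1024 * 1024 * 8          # assumes MByte = 2**20 bytes
    total_bps = (video_kbps + non_video_kbps) * 1000     # assumes k = 1000
    return limit_bits / total_bps


rate = non_video_rate(1150, [224])
print(round(rate, 2))                                    # 237.74
print(round(seconds_per_sequence(700, 1150, rate) / 60, 1), "minutes")
```

With these example figures a 700 MByte limit corresponds to roughly 70 minutes of material per sequence.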
There really should be some kind of dynamic noise-reduction algorithm somewhere in the mjpegtools tool chain.
There needs to be a facility for writing dummy user-data fields so that the multiplexer/imager can insert forward/backward pointers when muxing/imaging an SVCD.
For more info, see our website at