Time is expressed as integer multiples of arbitrary units of time called a
time_base. Different contexts have different time bases: a Stream has a Stream.time_base, and a CodecContext has a CodecContext.time_base:
>>> import av
>>> fh = av.open(path)
>>> video = fh.streams.video[0]
>>> video.time_base
Fraction(1, 25)
>>> video.codec_context.time_base
Fraction(1, 50)
Attributes that represent time on those objects will be in that object’s time_base:
>>> video.duration
168
>>> float(video.duration * video.time_base)
6.72
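The conversion above is plain rational arithmetic, so it can be tried without PyAV. A minimal sketch using only the standard library; the helper name to_seconds is hypothetical, and the values simply mirror the session above:

```python
from fractions import Fraction


def to_seconds(ticks, time_base):
    """Convert an integer tick count to seconds (hypothetical helper)."""
    # Multiply exactly first; convert to float only at the very end.
    return float(ticks * time_base)


stream_time_base = Fraction(1, 25)  # value of video.time_base in the session above
duration_ticks = 168                # value of video.duration in the session above

print(to_seconds(duration_ticks, stream_time_base))  # 6.72
```

Keeping the value as a Fraction until the last step avoids accumulating floating-point error when many timestamps are converted.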
Packet has Packet.pts and Packet.dts, and Frame has Frame.pts and Frame.dts (“presentation” and “decode” time stamps, respectively). Both have a time_base attribute, but it defaults to the time base of the object that handles them. For packets, that is the stream. For frames, it is the stream when decoding and the codec context when encoding (which is strange, but it is what it is).
In many cases a stream has a time base of 1 / frame_rate, and then its frames have incrementing integers for times (0, 1, 2, etc.). Those frames take place at pts * time_base, that is, at 0 / frame_rate, 1 / frame_rate, 2 / frame_rate, etc.
>>> p, f = get_nth_packet_and_frame(fh, skip=1)
>>> p.time_base
Fraction(1, 25)
>>> p.dts
1
>>> f.time_base
Fraction(1, 25)
>>> f.pts
1
Frame.time is a float in seconds:
>>> f.time
0.04
Time in FFmpeg is not 100% clear to us (see Authority of Documentation). At times the FFmpeg documentation and canonical-seeming posts in the forums appear contradictory. We’ve experimented with it, and what follows is the picture that we are operating under.
When there is no time_base (such as on AVFormatContext), there is an implicit time_base of 1/AV_TIME_BASE.
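FFmpeg defines AV_TIME_BASE as 1,000,000, so values in the implicit 1/AV_TIME_BASE time base are effectively microseconds. A minimal sketch of the conversion, with the constant hard-coded rather than read from any library; the helper name and the sample duration are made up for illustration:

```python
from fractions import Fraction

AV_TIME_BASE = 1_000_000                    # FFmpeg's constant: internal units per second
AV_TIME_BASE_Q = Fraction(1, AV_TIME_BASE)  # the implicit time base as a fraction


def implicit_to_seconds(value):
    """Convert a value in 1/AV_TIME_BASE units to seconds (hypothetical helper)."""
    return float(value * AV_TIME_BASE_Q)


# Illustrative container-level duration, in AV_TIME_BASE units.
print(implicit_to_seconds(6_720_000))  # 6.72
```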
For encoding, you (the PyAV developer / FFmpeg “user”) must set AVCodecContext.time_base, ideally to the inverse of the frame rate (or so the library docs say to do if your frame rate is fixed; we’re not sure what to do if it is not fixed), and you may set AVStream.time_base as a hint to the muxer. After you open all the codecs and call avformat_write_header, the stream time base may change, and you must respect it. We don’t know if the codec time base may change, so we will make the safer assumption that it may and respect it as well.
You then prepare AVFrame.pts in AVCodecContext.time_base. The encoded AVPacket.pts is simply copied from the frame by the library, and so is still in the codec’s time base. You must rescale it to AVStream.time_base before muxing (as all stream operations assume the packet time is in stream time base).
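The rescaling step can be sketched in pure Python. In C this is the job of FFmpeg's av_rescale_q; the helper below is a hypothetical stand-in, and its rounding mode (nearest, halfway cases away from zero) is our assumption about that function's default behaviour. The 1/12800 stream time base is only an illustrative value of the kind a muxer might choose:

```python
from fractions import Fraction
from math import floor


def rescale(ts, src_time_base, dst_time_base):
    """Re-express `ts` from one time base in another (hypothetical helper)."""
    exact = ts * src_time_base / dst_time_base  # exact rational result
    # Round to nearest, with halfway cases going away from zero.
    magnitude = floor(abs(exact) + Fraction(1, 2))
    return magnitude if exact >= 0 else -magnitude


codec_time_base = Fraction(1, 25)      # frame pts are prepared in this time base
stream_time_base = Fraction(1, 12800)  # assumed value after the header is written

for frame_pts in (0, 1, 2):
    packet_pts = rescale(frame_pts, codec_time_base, stream_time_base)
    print(frame_pts, "->", packet_pts)
```

The same instant is preserved by the conversion: 1 tick of 1/25 s and 512 ticks of 1/12800 s are both 0.04 s.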
For fixed-fps content your frames’ pts would be the frame or sample index (for video and audio, respectively). PyAV should attempt to do this.
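A minimal sketch of that numbering, assuming a video time base of 1 / frame_rate and an audio time base of 1 / sample_rate. The helper names, the 25 fps and 48000 Hz rates, and the 1024-sample audio frame size are all illustrative choices, not values taken from PyAV:

```python
from fractions import Fraction


def video_pts(frame_count):
    """pts for fixed-fps video in a 1 / frame_rate time base: the frame index."""
    return list(range(frame_count))


def audio_pts(frame_count, samples_per_frame):
    """pts for audio in a 1 / sample_rate time base: index of each frame's first sample."""
    return [i * samples_per_frame for i in range(frame_count)]


frame_rate = 25       # illustrative
sample_rate = 48000   # illustrative

for pts in video_pts(3):
    print("video", pts, float(pts * Fraction(1, frame_rate)))
for pts in audio_pts(3, 1024):
    print("audio", pts, float(pts * Fraction(1, sample_rate)))
```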