1. Container/file: that is, the multimedia source file
II. FFmpeg Basics: Several Important Data Structures in FFmpeg
- typedef struct AVCodecContext {
- ......
- /**
- * some codecs need / can use extradata like Huffman tables.
- * mjpeg: Huffman tables
- * rv10: additional flags
- * mpeg4: global headers (they can be in the bitstream or here)
- * The allocated memory should be FF_INPUT_BUFFER_PADDING_SIZE bytes larger
- * than extradata_size to avoid problems if it is read with the bitstream reader.
- * The bytewise contents of extradata must not depend on the architecture or CPU endianness.
- * - encoding: Set/allocated/freed by libavcodec.
- * - decoding: Set/allocated/freed by user.
- */
- uint8_t *extradata;
- int extradata_size;
- /**
- * This is the fundamental unit of time (in seconds) in terms
- * of which frame timestamps are represented. For fixed-fps content,
- * timebase should be 1/framerate and timestamp increments should be
- * identically 1.
- * - encoding: MUST be set by user.
- * - decoding: Set by libavcodec.
- */
- AVRational time_base;
- /* video only */
- /**
- * picture width / height.
- * - encoding: MUST be set by user.
- * - decoding: Set by libavcodec.
- * Note: For compatibility it is possible to set this instead of
- * coded_width/height before decoding.
- */
- int width, height;
- ......
- /* audio only */
- int sample_rate; ///< samples per second
- int channels; ///< number of audio channels
- /**
- * audio sample format
- * - encoding: Set by user.
- * - decoding: Set by libavcodec.
- */
- enum SampleFormat sample_fmt; ///< sample format
- /* The following data should not be initialized. */
- /**
- * Samples per packet, initialized when calling 'init'.
- */
- int frame_size;
- int frame_number; ///< audio or video frame number
- ......
- char codec_name[32];
- enum AVMediaType codec_type; /* see AVMEDIA_TYPE_xxx */
- enum CodecID codec_id; /* see CODEC_ID_xxx */
- /**
- * fourcc (LSB first, so "ABCD" -> ('D'<<24) + ('C'<<16) + ('B'<<8) + 'A').
- * This is used to work around some encoder bugs.
- * A demuxer should set this to what is stored in the field used to identify the codec.
- * If there are multiple such fields in a container then the demuxer should choose the one
- * which maximizes the information about the used codec.
- * If the codec tag field in a container is larger than 32 bits then the demuxer should
- * remap the longer ID to 32 bits with a table or other structure. Alternatively a new
- * extra_codec_tag + size could be added but for this a clear advantage must be demonstrated
- * first.
- * - encoding: Set by user, if not then the default based on codec_id will be used.
- * - decoding: Set by user, will be converted to uppercase by libavcodec during init.
- */
- unsigned int codec_tag;
- ......
- /**
- * Size of the frame reordering buffer in the decoder.
- * For MPEG-2 it is 1 IPB or 0 low delay IP.
- * - encoding: Set by libavcodec.
- * - decoding: Set by libavcodec.
- */
- int has_b_frames;
- /**
- * number of bytes per packet if constant and known or 0
- * Used by some WAV based audio codecs.
- */
- int block_align;
- ......
- /**
- * bits per sample/pixel from the demuxer (needed for huffyuv).
- * - encoding: Set by libavcodec.
- * - decoding: Set by user.
- */
- int bits_per_coded_sample;
- ......
- } AVCodecContext;
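- extradata/extradata_size: ... the search then continues in the media stream that has already been demultiplexed. If no extra information is found at all, this buffer pointer is NULL.
- time_base: the codec's time base. For fixed-frame-rate video it is the reciprocal of the frame rate (or field rate), not the frame rate itself.
- width/height: the width and height of the video.
- sample_rate/channels: the audio sampling rate and the number of channels.
- sample_fmt: the raw sample format of the audio.
- codec_name/codec_type/codec_id/codec_tag: information identifying the codec.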
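The byte order described in the codec_tag comment (LSB first, so "ABCD" becomes ('D'<<24) + ('C'<<16) + ('B'<<8) + 'A') can be illustrated with a small sketch. The helper names make_tag and tag_to_string are hypothetical and written only for this example; FFmpeg ships its own MKTAG macro for the same purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): pack four characters into
 * a 32-bit tag with the first character in the least significant byte. */
uint32_t make_tag(char a, char b, char c, char d)
{
    return (uint32_t)(uint8_t)a |
           ((uint32_t)(uint8_t)b << 8) |
           ((uint32_t)(uint8_t)c << 16) |
           ((uint32_t)(uint8_t)d << 24);
}

/* Hypothetical helper: unpack a tag into a NUL-terminated 4-character string. */
void tag_to_string(uint32_t tag, char out[5])
{
    assert(out != 0);
    for (int i = 0; i < 4; i++)
        out[i] = (char)((tag >> (8 * i)) & 0xFF);  /* byte i is character i */
    out[4] = '\0';
}
```

With this layout, make_tag('A', 'B', 'C', 'D') yields 0x44434241, and reading the four bytes from least to most significant gives back "ABCD".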
2. AVStream: this structure describes one media stream
- typedef struct AVStream {
- int index; /**< stream index in AVFormatContext */
- int id; /**< format-specific stream ID */
- AVCodecContext *codec; /**< codec context */
- /**
- * Real base framerate of the stream.
- * This is the lowest framerate with which all timestamps can be
- * represented accurately (it is the least common multiple of all
- * framerates in the stream). Note, this value is just a guess!
- * For example, if the time base is 1/90000 and all frames have either
- * approximately 3600 or 1800 timer ticks, then r_frame_rate will be 50/1.
- */
- AVRational r_frame_rate;
- ......
- /**
- * This is the fundamental unit of time (in seconds) in terms
- * of which frame timestamps are represented. For fixed-fps content,
- * time base should be 1/framerate and timestamp increments should be 1.
- */
- AVRational time_base;
- ......
- /**
- * Decoding: pts of the first frame of the stream, in stream time base.
- * Only set this if you are absolutely 100% sure that the value you set
- * it to really is the pts of the first frame.
- * This may be undefined (AV_NOPTS_VALUE).
- * @note The ASF header does NOT contain a correct start_time the ASF
- * demuxer must NOT set this.
- */
- int64_t start_time;
- /**
- * Decoding: duration of the stream, in stream time base.
- * If a source file does not specify a duration, but does specify
- * a bitrate, this value will be estimated from bitrate and file size.
- */
- int64_t duration;
- char language[4]; /** ISO 639-2/B 3-letter language code (empty string if undefined) */
- #endif
- /* av_read_frame() support */
- enum AVStreamParseType need_parsing;
- struct AVCodecParserContext *parser;
- ......
- /* av_seek_frame() support */
- AVIndexEntry *index_entries; /**< Only used if the format does not
- support seeking natively. */
- int nb_index_entries;
- unsigned int index_entries_allocated_size;
- int64_t nb_frames; ///< number of frames in this stream if known or 0
- ......
- /**
- * Average framerate
- */
- AVRational avg_frame_rate;
- ......
- } AVStream;
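- index/id: index is the stream's index. It is generated automatically, and the stream can be looked up in the AVFormatContext::streams table by index. id is the stream's identifier and depends on the container format; for MPEG TS, for example, id is the PID.
- time_base: the stream's time base, a rational number (AVRational). The pts and dts of the media data in this stream use this time base as their granularity. av_rescale/av_rescale_q are normally used to convert between different time bases.
- start_time: the stream's start time, in units of the stream time base; usually the pts of the first frame in the stream.
- duration: the total duration of the stream, in units of the stream time base.
- need_parsing: controls the parsing process for this stream.
- nb_frames: the number of frames in the stream.
- r_frame_rate/framerate/avg_frame_rate: frame-rate related fields.
- codec: points to the AVCodecContext structure for this stream; created when avformat_open_input is called.
- parser: points to the AVCodecParserContext structure for this stream; created when avformat_find_stream_info is called.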
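The time base conversion mentioned for time_base amounts to computing ts * from / to on rational numbers. The following is a minimal sketch of that arithmetic, not FFmpeg's implementation: the Rational type and rescale_ts function are hypothetical stand-ins for AVRational and av_rescale_q, and unlike the real function this version has no protection against 64-bit overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational: the fraction num/den. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* Convert timestamp ts from time base `from` to time base `to`,
 * i.e. compute ts * from / to, rounding to the nearest integer
 * (halves away from zero). Simplified: may overflow for very large ts. */
int64_t rescale_ts(int64_t ts, Rational from, Rational to)
{
    assert(from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    int64_t num = (int64_t)from.num * to.den;
    int64_t den = (int64_t)from.den * to.num;
    int64_t prod = ts * num;
    if (prod >= 0)
        return (prod + den / 2) / den;
    return -((-prod + den / 2) / den);
}
```

For example, 90000 ticks in a 1/90000 time base rescaled to a 1/1000 time base gives 1000 (one second expressed in milliseconds), and one tick of a 1/25 time base corresponds to 3600 ticks of 1/90000.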
- typedef struct AVFormatContext {
- const AVClass *av_class; /**< Set by avformat_alloc_context. */
- /* Can only be iformat or oformat, not both at the same time. */
- struct AVInputFormat *iformat;
- struct AVOutputFormat *oformat;
- void *priv_data;
- ByteIOContext *pb;
- unsigned int nb_streams;
- AVStream *streams[MAX_STREAMS];
- char filename[1024]; /**< input or output filename */
- /* stream info */
- int64_t timestamp;
- char title[512];
- char author[512];
- char copyright[512];
- char comment[512];
- char album[512];
- int year; /**< ID3 year, 0 if none */
- int track; /**< track number, 0 if none */
- char genre[32]; /**< ID3 genre */
- #endif
- int ctx_flags; /**< Format-specific flags, see AVFMTCTX_xx */
- /* private data for pts handling (do not modify directly). */
- /** This buffer is only needed when packets were already buffered but
- not decoded, for example to get the codec parameters in MPEG
- streams. */
- struct AVPacketList *packet_buffer;
- /** Decoding: position of the first frame of the component, in
- AV_TIME_BASE fractional seconds. NEVER set this value directly:
- It is deduced from the AVStream values. */
- int64_t start_time;
- /** Decoding: duration of the stream, in AV_TIME_BASE fractional
- seconds. Only set this value if you know none of the individual stream
- durations and also don't set any of them. This is deduced from the
- AVStream values if not set. */
- int64_t duration;
- /** decoding: total file size, 0 if unknown */
- int64_t file_size;
- /** Decoding: total stream bitrate in bit/s, 0 if not
- available. Never set it directly if the file_size and the
- duration are known as FFmpeg can compute it automatically. */
- int bit_rate;
- /* av_read_frame() support */
- AVStream *cur_st;
- const uint8_t *cur_ptr_deprecated;
- int cur_len_deprecated;
- AVPacket cur_pkt_deprecated;
- #endif
- /* av_seek_frame() support */
- int64_t data_offset; /** offset of the first packet */
- int index_built;
- int mux_rate;
- unsigned int packet_size;
- int preload;
- int max_delay;
- /** number of times to loop output in formats that support it */
- int loop_output;
- int flags;
- #define AVFMT_FLAG_GENPTS 0x0001 ///< Generate missing pts even if it requires parsing future frames.
- #define AVFMT_FLAG_IGNIDX 0x0002 ///< Ignore index.
- #define AVFMT_FLAG_NONBLOCK 0x0004 ///< Do not block when reading packets from input.
- #define AVFMT_FLAG_IGNDTS 0x0008 ///< Ignore DTS on frames that contain both DTS & PTS
- #define AVFMT_FLAG_NOFILLIN 0x0010 ///< Do not infer any values from other values, just return what is stored in the container
- #define AVFMT_FLAG_NOPARSE 0x0020 ///< Do not use AVParsers, you also must set AVFMT_FLAG_NOFILLIN as the fillin code works on frames and no parsing -> no frames. Also seeking to frames can not work if parsing to find frame boundaries has been disabled
- #define AVFMT_FLAG_RTP_HINT 0x0040 ///< Add RTP hinting to the output file
- int loop_input;
- /** decoding: size of data to probe; encoding: unused. */
- unsigned int probesize;
- /**
- * Maximum time (in AV_TIME_BASE units) during which the input should
- * be analyzed in avformat_find_stream_info().
- */
- int max_analyze_duration;
- const uint8_t *key;
- int keylen;
- unsigned int nb_programs;
- AVProgram **programs;
- /**
- * Forced video codec_id.
- * Demuxing: Set by user.
- */
- enum CodecID video_codec_id;
- /**
- * Forced audio codec_id.
- * Demuxing: Set by user.
- */
- enum CodecID audio_codec_id;
- /**
- * Forced subtitle codec_id.
- * Demuxing: Set by user.
- */
- enum CodecID subtitle_codec_id;
- /**
- * Maximum amount of memory in bytes to use for the index of each stream.
- * If the index exceeds this size, entries will be discarded as
- * needed to maintain a smaller size. This can lead to slower or less
- * accurate seeking (depends on demuxer).
- * Demuxers for which a full in-memory index is mandatory will ignore
- * this.
- * muxing : unused
- * demuxing: set by user
- */
- unsigned int max_index_size;
- /**
- * Maximum amount of memory in bytes to use for buffering frames
- * obtained from realtime capture devices.
- */
- unsigned int max_picture_buffer;
- unsigned int nb_chapters;
- AVChapter **chapters;
- /**
- * Flags to enable debugging.
- */
- int debug;
- #define FF_FDEBUG_TS 0x0001
- /**
- * Raw packets from the demuxer, prior to parsing and decoding.
- * This buffer is used for buffering packets until the codec can
- * be identified, as parsing cannot be done without knowing the
- * codec.
- */
- struct AVPacketList *raw_packet_buffer;
- struct AVPacketList *raw_packet_buffer_end;
- struct AVPacketList *packet_buffer_end;
- AVMetadata *metadata;
- /**
- * Remaining size available for raw_packet_buffer, in bytes.
- */
- #define RAW_PACKET_BUFFER_SIZE 2500000
- int raw_packet_buffer_remaining_size;
- /**
- * Start time of the stream in real world time, in microseconds
- * since the unix epoch (00:00 1st January 1970). That is, pts=0
- * in the stream was captured at this real world time.
- * - encoding: Set by user.
- * - decoding: Unused.
- */
- int64_t start_time_realtime;
- } AVFormatContext;
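- nb_streams and the array of AVStream pointers in streams describe all of the embedded media streams;
- iformat and oformat point to the corresponding demuxer and muxer;
- pb points to a ByteIOContext structure that controls the underlying data reads and writes.
- start_time and duration are the start time and length of the multimedia file as deduced from the individual AVStream entries in the streams array, in microseconds.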
- probesize
- mux_rate
- packet_size
- flags
- max_analyze_duration
- key
- max_index_size
- max_picture_buffer
- max_delay
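Because start_time and duration at the AVFormatContext level are expressed in microseconds, a player usually converts them before display. The sketch below shows one way to do that; format_duration and TIME_BASE_US are hypothetical names chosen for this example, with TIME_BASE_US assumed to equal AV_TIME_BASE (1000000).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TIME_BASE_US 1000000  /* assumption: same value as AV_TIME_BASE */

/* Hypothetical helper: format a non-negative duration given in
 * microseconds as HH:MM:SS.mmm. Returns what snprintf returns. */
int format_duration(int64_t duration_us, char *buf, size_t size)
{
    assert(duration_us >= 0 && buf != NULL && size > 0);
    int64_t total_s = duration_us / TIME_BASE_US;           /* whole seconds */
    int ms = (int)((duration_us % TIME_BASE_US) / 1000);    /* leftover milliseconds */
    int s = (int)(total_s % 60);
    int m = (int)((total_s / 60) % 60);
    long long h = (long long)(total_s / 3600);
    return snprintf(buf, size, "%02lld:%02d:%02d.%03d", h, m, s, ms);
}
```

A duration of 3723456000 microseconds is formatted as 01:02:03.456.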
- typedef struct AVPacket {
- /**
- * Presentation timestamp in AVStream->time_base units; the time at which
- * the decompressed packet will be presented to the user.
- * Can be AV_NOPTS_VALUE if it is not stored in the file.
- * pts MUST be larger or equal to dts as presentation cannot happen before
- * decompression, unless one wants to view hex dumps. Some formats misuse
- * the terms dts and pts/cts to mean something different. Such timestamps
- * must be converted to true pts/dts before they are stored in AVPacket.
- */
- int64_t pts;
- /**
- * Decompression timestamp in AVStream->time_base units; the time at which
- * the packet is decompressed.
- * Can be AV_NOPTS_VALUE if it is not stored in the file.
- */
- int64_t dts;
- uint8_t *data;
- int size;
- int stream_index;
- int flags;
- /**
- * Duration of this packet in AVStream->time_base units, 0 if unknown.
- * Equals next_pts - this_pts in presentation order.
- */
- int duration;
- void (*destruct)(struct AVPacket *);
- void *priv;
- int64_t pos; ///< byte position in stream, -1 if unknown
- /**
- * Time difference in AVStream->time_base units from the pts of this
- * packet to the point at which the output from the decoder has converged
- * independent from the availability of previous frames. That is, the
- * frames are virtually identical no matter if decoding started from
- * the very first frame or from this keyframe.
- * Is AV_NOPTS_VALUE if unknown.
- * This field is not the display duration of the current packet.
- *
- * The purpose of this field is to allow seeking in streams that have no
- * keyframes in the conventional sense. It corresponds to the
- * recovery point SEI in H.264 and match_time_delta in NUT. It is also
- * essential for some types of subtitle streams to ensure that all
- * subtitles are correctly displayed after seeking.
- */
- int64_t convergence_duration;
- } AVPacket;
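- dts is the decoding time stamp and pts is the presentation time stamp; both are expressed in the time base of the media stream the packet belongs to.
- stream_index gives the index of the media stream the packet belongs to;
- data is the pointer to the data buffer and size is its length;
- duration is the duration of the data, also in units of the owning stream's time base;
- pos is the byte offset of this data within the media stream;
- destruct is a function pointer used to free the data buffer;
- flags is a flag field; when its lowest bit is set to 1, the data is a key frame.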
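The flag test and the time stamp units described above can be sketched as follows. The Packet struct is a hypothetical, cut-down stand-in for AVPacket, and PKT_FLAG_KEY is defined locally with the value 0x0001 on the assumption that the key-frame bit is the lowest bit, as the list states.

```c
#include <assert.h>
#include <stdint.h>

#define PKT_FLAG_KEY 0x0001  /* assumption: key-frame bit is the lowest bit */

/* Hypothetical, reduced packet holding only the fields used here. */
typedef struct Packet {
    int64_t pts;
    int64_t dts;
    int stream_index;
    int flags;
} Packet;

/* Return 1 if the packet is flagged as a key frame, 0 otherwise. */
int is_keyframe(const Packet *pkt)
{
    assert(pkt != 0);
    return (pkt->flags & PKT_FLAG_KEY) != 0;
}

/* Convert a time stamp to seconds, given the owning stream's
 * time base as the fraction tb_num/tb_den. */
double ts_to_seconds(int64_t ts, int tb_num, int tb_den)
{
    assert(tb_den != 0);
    return (double)ts * tb_num / tb_den;
}
```

A packet with pts 45000 in a stream whose time base is 1/90000 is presented at 0.5 seconds.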
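III. Timing Information / Multimedia Synchronization
... List, the moov box of MP4. A relatively more complex scheme embeds the timing information inside the media stream itself, as in MPEG TS and Real video; this scheme can handle variable-rate media and also effectively avoids time drift during synchronization.
... presentation time stamp. For audio the two time stamps are identical, but for some video coding formats the use of bidirectional prediction causes the DTS and the PTS to differ.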
- Picture type: I P P P P P P ... I P P
- DTS: 0 1 2 3 4 5 6... 100 101 102
- PTS: 0 1 2 3 4 5 6... 100 101 102
- Picture type: I P B B P B B ... I P B
- DTS: 0 1 2 3 4 5 6 ... 100 101 102
- PTS: 0 3 1 2 6 4 5 ... 100 103 101
- Decoder input: I P B B P B B
- (DTS) 0 1 2 3 4 5 6
- (PTS) 0 3 1 2 6 4 5
- Decoder output: X I B B P B B P
- (PTS) X 0 1 2 3 4 5 6
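The reordering shown in the tables above, from decoding order (DTS) to presentation order (PTS), can be reproduced with a small sketch. The Picture type and sort_by_pts function are hypothetical and exist only for this illustration; a real decoder reorders with a bounded buffer (see has_b_frames) rather than by sorting a whole array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one coded picture. */
typedef struct Picture {
    char type;    /* 'I', 'P' or 'B' */
    int64_t dts;  /* decoding time stamp */
    int64_t pts;  /* presentation time stamp */
} Picture;

/* Sort pictures in place by ascending PTS (stable insertion sort),
 * turning decoding order into presentation order. */
void sort_by_pts(Picture *pics, size_t n)
{
    assert(pics != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        Picture key = pics[i];
        size_t j = i;
        while (j > 0 && pics[j - 1].pts > key.pts) {
            pics[j] = pics[j - 1];  /* shift later pictures right */
            j--;
        }
        pics[j] = key;
    }
}
```

Feeding in the sequence I P B B P B B with PTS 0 3 1 2 6 4 5 produces the presentation order I B B P B B P, matching the decoder output row above.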
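By calling avformat_find_stream_info, a multimedia application can obtain the media file's timing information from the AVFormatContext object: mainly the total duration and the start time, plus the bit rate and file size, which are related to the timing information. The timing values are expressed in AV_TIME_BASE units, that is, microseconds.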
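The relationship between file size, duration and bit rate mentioned here (and in the bit_rate comment of AVFormatContext) is simple arithmetic. The sketch below illustrates it under stated assumptions: estimate_bit_rate and TIME_BASE_US are hypothetical names, TIME_BASE_US is assumed to equal AV_TIME_BASE, and this is not necessarily how FFmpeg itself performs the estimate.

```c
#include <assert.h>
#include <stdint.h>

#define TIME_BASE_US 1000000  /* assumption: same value as AV_TIME_BASE */

/* Hypothetical helper: estimate the average bit rate in bit/s from the
 * file size in bytes and the duration in microseconds. Returns 0 when
 * either value is unknown. Simplified: the multiplication can overflow
 * for files larger than roughly 1 TB. */
int64_t estimate_bit_rate(int64_t file_size, int64_t duration_us)
{
    assert(file_size >= 0);
    if (file_size == 0 || duration_us <= 0)
        return 0;
    return file_size * 8 * TIME_BASE_US / duration_us;
}
```

A 1000000-byte file lasting 8 seconds (8000000 microseconds) averages 1000000 bit/s.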