Original link:

https://stackoverflow.com/questions/32028437/what-are-bitstream-filters-in-ffmpeg

Let me explain by example. FFmpeg video decoders typically work by converting one video frame per call to avcodec_decode_video2. So the input is expected to be "one image" worth of bitstream data. Let's consider for a second the problem of going from a file (an array of bytes on disk) to images.

For "raw" (Annex B) H.264 (.h264/.bin/.264 files), the individual NAL unit data (SPS/PPS header bitstreams or CABAC-encoded frame data) is concatenated into a sequence of NAL units. Each one is preceded by a start code (00 00 01) and begins with a header byte XX whose low 5 bits are the NAL unit type. (To prevent the NAL data itself from containing 00 00 01, the payload is escaped with emulation prevention bytes.) So an H.264 frame parser can simply cut the file at start code markers: it searches for successive packets that run from one 00 00 01 (inclusive) up to the next occurrence of 00 00 01 (exclusive). It then parses the NAL unit type and slice header to find which frame each packet belongs to, and returns the set of NAL units making up one frame as input to the H.264 decoder.
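The start-code splitting described above can be sketched in plain C without any FFmpeg dependency. This is only an illustration, not FFmpeg's parser: the names `find_start_code`, `split_annexb` and `NalSpan` are hypothetical, the spans exclude the start code itself, and grouping NAL units into frames (which needs slice header parsing) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: offset of the next 00 00 01 at or after pos, or len if none. */
static size_t find_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    for (; pos + 3 <= len; pos++)
        if (buf[pos] == 0 && buf[pos + 1] == 0 && buf[pos + 2] == 1)
            return pos;
    return len;
}

/* Hypothetical type: one NAL unit located inside the input buffer. */
typedef struct NalSpan {
    size_t  offset; /* first byte after the start code (the NAL header byte) */
    size_t  size;   /* NAL size in bytes, start code excluded */
    uint8_t type;   /* nal_unit_type: low 5 bits of the header byte */
} NalSpan;

/* Cut an Annex B buffer at start codes. Stores at most max spans, returns the count. */
static size_t split_annexb(const uint8_t *buf, size_t len, NalSpan *out, size_t max)
{
    size_t n = 0;
    size_t sc;

    assert(buf != NULL && out != NULL);
    sc = find_start_code(buf, len, 0);
    while (sc < len && n < max) {
        size_t begin = sc + 3;
        size_t next  = find_start_code(buf, len, begin);
        size_t end   = next;

        /* A NAL unit never ends in 0x00, so trailing zeros are padding or the
         * leading zero of a 4-byte start code (00 00 00 01). */
        while (end > begin && buf[end - 1] == 0)
            end--;
        if (end > begin) {
            out[n].offset = begin;
            out[n].size   = end - begin;
            out[n].type   = buf[begin] & 0x1F;
            n++;
        }
        sc = next;
    }
    return n;
}
```

In an H.264 stream, type 7 is an SPS, 8 a PPS and 5 an IDR slice, so a file that starts with a keyframe would typically yield 7, 8, 5 as its first spans.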

H.264 data in .mp4 files is different, though. The 00 00 01 start code is redundant if the muxing format already carries length markers, as mp4 does. So the start code is dropped, and each NAL unit is stored with a length prefix instead. The SPS/PPS are also put in the file header instead of being prepended before the first frame, and these lack their 00 00 01 prefixes too. So, if I were to feed this into the H.264 decoder, which expects the prefixes on all NAL units, it wouldn't work. The h264_mp4toannexb bitstream filter fixes this by identifying the SPS/PPS in the extracted parts of the file header (FFmpeg calls this "extradata"), prepending a start code to these and to each NAL unit from the individual frame packets, and concatenating them back together before they are passed to the H.264 decoder.
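The core of that conversion can be sketched as follows. This is a simplified illustration rather than the code of h264_mp4toannexb: the function name `length_prefixed_to_annexb` is hypothetical, the length prefix is assumed to be 4 bytes big-endian (mp4 files may declare a different prefix size), a 4-byte start code is written for every NAL unit, and the step that prepends SPS/PPS from extradata is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rewrite one mp4-style sample, laid out as
 * [4-byte length][NAL][4-byte length][NAL]..., into Annex B form.
 * Returns the number of bytes written to out, or 0 if the input is
 * malformed or out is too small. */
static size_t length_prefixed_to_annexb(const uint8_t *in, size_t in_len,
                                        uint8_t *out, size_t out_cap)
{
    static const uint8_t start_code[4] = {0, 0, 0, 1};
    size_t rd = 0, wr = 0;

    assert(in != NULL && out != NULL);
    while (rd < in_len) {
        size_t nal_len;

        if (in_len - rd < 4)
            return 0;                          /* truncated length field */
        nal_len = ((size_t)in[rd] << 24) | ((size_t)in[rd + 1] << 16) |
                  ((size_t)in[rd + 2] << 8) | (size_t)in[rd + 3];
        rd += 4;
        if (nal_len == 0 || nal_len > in_len - rd)
            return 0;                          /* length runs past the sample */
        if (out_cap - wr < 4 || out_cap - wr - 4 < nal_len)
            return 0;                          /* output buffer too small */

        memcpy(out + wr, start_code, 4);       /* replace the length with a start code */
        wr += 4;
        memcpy(out + wr, in + rd, nal_len);    /* NAL payload is copied unchanged */
        wr += nal_len;
        rd += nal_len;
    }
    return wr;
}
```

With a 4-byte prefix and a 4-byte start code the output has the same size as the input, which is why this step can in principle be done in place; the sketch copies into a separate buffer to keep the bounds checks easy to follow.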

You might now feel that there's a very fine distinction between a "parser" and a "bitstream filter". This is true. I think the official definition is that a parser takes a sequence of input data and splits it into frames without discarding or adding any data. The only thing a parser does is change packet boundaries. A bitstream filter, on the other hand, is allowed to actually modify the data. I'm not sure this definition is entirely true (see e.g. VP9 below), but it's the conceptual reason mp4toannexb is a BSF, not a parser (because it adds 00 00 01 prefixes).

Other cases where such "bitstream tweaks" help keep decoders simple and uniform, while still letting us support all the file variants that exist in the wild:

  • mpeg4 (divx) B-frame unpacking. To store B-frame sequences like IBP, which are coded as IPB, in AVI and still get the timestamps right, people came up with B-frame packing, where I-B-P / I-P-B is packed into frames as I-(PB)-(), i.e. the second packet has two frames and the third packet is empty. This means the timestamp associated with the P and B frame at the decoding phase is correct. It also means you have two frames' worth of input data in one packet, which violates FFmpeg's one-frame-in-one-frame-out concept, so we wrote a BSF to split the packet back in two before inputting it into the decoder. It also deletes the marker that says the packet contains two frames, hence a BSF and not a parser. In practice, this solves otherwise hard problems with frame multithreading. VP9 does the same thing (called superframes), but splits frames in the parser, so the parser/BSF split isn't always theoretically perfect; maybe VP9's should be called a BSF.
  • hevc mp4 to annexb conversion (same story as above, but for hevc)
  • aac adts to asc conversion (this is basically the same as h264/hevc annexb vs. mp4, but for aac audio)
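
When extracting H.264 from certain container formats (for example MP4/FLV/MKV), the SPS and PPS must be written first; otherwise the extracted data has no SPS/PPS and cannot be played. The SPS and PPS of an H.264 stream are stored in the extradata of the AVCodecContext structure.
This requires processing with an FFmpeg bitstream filter such as "h264_mp4toannexb".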
 
The old API has been deprecated, for example:
AVBitStreamFilterContext *av_bitstream_filter_init(const char *name);
int av_bitstream_filter_filter(AVBitStreamFilterContext *bsfc,
                               AVCodecContext *avctx, const char *args,
                               uint8_t **poutbuf, int *poutbuf_size,
                               const uint8_t *buf, int buf_size, int keyframe);

Newer versions need the following API to do the same job:

// Get filter
const AVBitStreamFilter *av_bsf_next(void **opaque); // deprecated since FFmpeg 4.0 in favor of av_bsf_iterate()
const AVBitStreamFilter *av_bsf_get_by_name(const char *name);
// Init filter
int av_bsf_alloc(const AVBitStreamFilter *filter, AVBSFContext **ctx);
int avcodec_parameters_copy(AVCodecParameters *dst, const AVCodecParameters *src); // fills ctx->par_in from the input stream
int av_bsf_init(AVBSFContext *ctx);
// Use filter
int av_bsf_send_packet(AVBSFContext *ctx, AVPacket *pkt);
int av_bsf_receive_packet(AVBSFContext *ctx, AVPacket *pkt);
// Free
void av_bsf_free(AVBSFContext **ctx);
 
 
