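Let's start with the MP4 container format. MP4 data is stored in a series of boxes. Boxes use network byte order, that is, big-endian. Each box consists of a header and a body: the header gives the size and type of the box, and the body holds content whose layout depends on that type.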

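The box size field has three cases:

The first 4 bytes of a box are the box size. The size covers the whole box, header included, which is what lets us locate each box in the file.

If box size is 1, the real size is stored in a 64-bit largesize field that follows the box type (typically used for a large mdat box).

If box size is 0, the box is the last one in the file and extends to the end of the file.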

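The box size is followed by a 32-bit box type, usually four characters such as ftyp or moov, so a plain box header is 8 bytes (16 bytes when largesize is present). The most important box types are: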
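The three size cases can be sketched as a small header parser. This is only an illustration under stated assumptions: the whole file is assumed to be in memory, and `BoxHeader`, `readBE`, and `parseBoxHeader` are hypothetical names that do not exist in MPEG4Extractor.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical type for illustration; not part of MPEG4Extractor.
struct BoxHeader {
    uint64_t size;        // total box size in bytes, header included
    uint32_t type;        // four-character code packed big-endian, e.g. 'ftyp'
    uint32_t headerSize;  // 8, or 16 when a 64-bit largesize is present
};

// Read an n-byte big-endian unsigned integer (n <= 8).
inline uint64_t readBE(const uint8_t *p, size_t n) {
    uint64_t v = 0;
    for (size_t i = 0; i < n; ++i) v = (v << 8) | p[i];
    return v;
}

// Parse the header of the box that starts at `offset` in a buffer of `len`
// bytes. Returns false if the header is truncated or the size is inconsistent.
inline bool parseBoxHeader(const uint8_t *data, size_t len, size_t offset, BoxHeader *out) {
    if (offset > len || len - offset < 8) return false;
    const size_t remaining = len - offset;
    out->size = readBE(data + offset, 4);
    out->type = static_cast<uint32_t>(readBE(data + offset + 4, 4));
    out->headerSize = 8;
    if (out->size == 1) {
        // Case 2: a 64-bit largesize follows the type field.
        if (remaining < 16) return false;
        out->size = readBE(data + offset + 8, 8);
        out->headerSize = 16;
    } else if (out->size == 0) {
        // Case 3: the box runs to the end of the file.
        out->size = remaining;
    }
    // A box can be neither smaller than its own header nor larger than the data left.
    return out->size >= out->headerSize && out->size <= remaining;
}
```

Adding `size` to the current offset gives the start of the next sibling box, and adding `headerSize` gives the start of the payload.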

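ftyp box: file type. There is only one, it should be placed at the very start of the file, and it cannot be contained in another box. It identifies the brands (specifications) the MP4 file conforms to.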

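moov box: a container box, meaning that its content is other boxes. It holds the metadata for the media in the file, which is obtained by parsing its child boxes. There is only one moov box and it cannot be contained in another box. It normally contains one mvhd child box and several trak child boxes. It is the most important box when parsing an MP4 file: it describes the codec of the audio and video data, the sample tables, chunk sizes, storage locations (offsets, i.e. where each audio or video frame is located inside the mdat box), DTS, PTS, and so on.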
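Because a container box holds nothing but other boxes, parsing it comes down to recursion over the byte range of its payload. The following is a minimal sketch of that idea, not the extractor's implementation: `isContainer` and `walkBoxes` are hypothetical names, the container list is deliberately short, and largesize and size 0 are not handled, to keep it brief.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: a short, incomplete list of container box types.
inline bool isContainer(const std::string &type) {
    static const char *const kContainers[] = {"moov", "trak", "mdia", "minf", "stbl", "edts"};
    for (const char *c : kContainers) {
        if (type == c) return true;
    }
    return false;
}

// Walk the boxes in [begin, end) and append each type to `out`, indented by
// two spaces per nesting level. Returns false on a malformed size.
inline bool walkBoxes(const uint8_t *data, size_t begin, size_t end, int depth,
                      std::vector<std::string> *out) {
    if (depth > 100) return false;  // guard against pathological nesting
    size_t offset = begin;
    while (offset < end) {
        if (end - offset < 8) return false;
        const uint8_t *p = data + offset;
        const uint64_t size = (uint64_t(p[0]) << 24) | (uint64_t(p[1]) << 16) |
                              (uint64_t(p[2]) << 8) | uint64_t(p[3]);
        if (size < 8 || size > end - offset) return false;
        const std::string type(reinterpret_cast<const char *>(p + 4), 4);
        out->push_back(std::string(static_cast<size_t>(depth) * 2, ' ') + type);
        // The children of a container start right after its 8-byte header.
        if (isContainer(type) &&
            !walkBoxes(data, offset + 8, offset + static_cast<size_t>(size), depth + 1, out)) {
            return false;
        }
        offset += static_cast<size_t>(size);
    }
    return true;
}
```

Calling `walkBoxes(data, 0, fileSize, 0, &out)` on a buffer that holds the whole file yields an indented outline of the box tree.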

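mvhd box: movie header box. It describes file-wide information that is independent of any particular audio or video stream. duration is the media duration, expressed in units of timescale (the number of time units per second).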

trak box: track box, a container box that holds the references to and descriptions of the track's media data. A trak box must contain one tkhd and one mdia child box.

tkhd box: track header box. It describes the track; for a video track it includes the width and height.

elst box: edit list box. It records the start time of the stream, which can be used when computing PTS and DTS.

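mdia box: track media structure. It describes the main information about the media samples of this audio or video track, so it is very important. It is also a container box, holding mdhd, hdlr, minf, and other boxes.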

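mdhd box: stores the timescale and duration of the current track. These are not the same as the timescale and duration in the mvhd box: the values here are used to compute the media duration of this track, and the real duration in seconds is duration divided by timescale.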
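As a small worked example of that division, the sketch below converts an mdhd duration in ticks to microseconds, with a range check. The helper name `mdhdDurationToUs` is hypothetical and only illustrates the arithmetic.

```cpp
#include <cstdint>

// Convert a duration expressed in mdhd ticks to microseconds.
// Returns false if timescale is zero or the result does not fit in int64_t.
inline bool mdhdDurationToUs(uint64_t durationTicks, uint32_t timescale, int64_t *outUs) {
    if (timescale == 0) return false;
    const long double us =
        (static_cast<long double>(durationTicks) * 1000000.0L) / timescale;
    // 9223372036854775808 is 2^63, the first value that no longer fits in int64_t.
    if (us >= 9223372036854775808.0L) return false;
    *outUs = static_cast<int64_t>(us);
    return true;
}
```

For instance, a duration of 90000 ticks with a timescale of 90000 is exactly one second, i.e. 1000000 microseconds.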

hdlr box: stores the stream type of the current track (video or audio), although MPEG4Extractor does not seem to rely on this information to tell audio from video.

stbl box: sample table box. Its child boxes store the codec type and related information, the location of every video frame in the file, PTS, and so on.

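stsd box: its child boxes store the codec type of the current track. For AVC, the avcC box nested under the avc1 child box stores the SPS, PPS, and related information.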

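stts box: decoding time to sample box. It stores (sample_count, sample_delta) pairs. sample_delta can be read as the duration of a sample; dividing it by the timescale from mdhd gives the duration in seconds, so the frame rate is 1 / (sample_delta / timescale), i.e. timescale / sample_delta.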
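When the stts box has several entries with different deltas, the same formula can be applied to the totals to get an average frame rate. A minimal sketch, where `SttsEntry` and `averageFrameRate` are hypothetical names used only for illustration:

```cpp
#include <cstdint>
#include <vector>

// One (sample_count, sample_delta) pair from the stts box.
struct SttsEntry {
    uint32_t sampleCount;
    uint32_t sampleDelta;
};

// Average frame rate = total samples / total duration in seconds,
// where duration in seconds = total ticks / timescale.
inline double averageFrameRate(const std::vector<SttsEntry> &stts, uint32_t timescale) {
    uint64_t samples = 0;
    uint64_t ticks = 0;
    for (const SttsEntry &e : stts) {
        samples += e.sampleCount;
        ticks += static_cast<uint64_t>(e.sampleCount) * e.sampleDelta;
    }
    if (ticks == 0) return 0.0;  // no timing information available
    return static_cast<double>(samples) * timescale / static_cast<double>(ticks);
}
```

With a single entry of 300 samples, a sample_delta of 1000, and a timescale of 30000, this reduces to timescale / sample_delta = 30 frames per second.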

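stss box: sync sample box. It lists the sample numbers of the key frames, together with an entry count that gives how many there are. A seek has to start decoding from a key frame.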
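A seek therefore needs the closest key frame at or before the target sample. The sketch below shows one way to look it up with a binary search. `syncSampleAtOrBefore` is a hypothetical helper, and treating an empty list as "every sample is a sync sample" is an assumption that models a file without an stss box.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// syncSamples: 1-based sample numbers from stss, in ascending order.
// Returns the sync sample at or before `target`. If `target` lies before the
// first sync sample, the first sync sample is returned instead.
inline uint32_t syncSampleAtOrBefore(const std::vector<uint32_t> &syncSamples, uint32_t target) {
    // Assumption: an empty list stands for "no stss box", so any sample will do.
    if (syncSamples.empty()) return target;
    // upper_bound finds the first entry greater than target; the one before it
    // is the last entry that is <= target.
    auto it = std::upper_bound(syncSamples.begin(), syncSamples.end(), target);
    if (it == syncSamples.begin()) return syncSamples.front();
    return *(it - 1);
}
```

Decoding then starts at the returned sample and runs forward until the target sample is reached.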

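ctts box: composition time to sample box. It gives the difference between PTS and DTS. If the box is absent there are no B-frames and PTS equals DTS. DTS is computed as sample_delta * sample_cnt - start_time (more generally, the sum of the sample_delta values of all preceding samples, minus start_time). With B-frames, PTS = DTS + composition_offset.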
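The two formulas can be sketched together as follows. This is an illustration only: `SttsEntry`, `CttsEntry`, `SampleTime`, and `computeTimestamps` are hypothetical names, all values are in media timescale ticks, and subtracting the same `startTicks` from both DTS and PTS is an assumption made for simplicity rather than a description of what MPEG4Extractor does.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct SttsEntry { uint32_t sampleCount; uint32_t sampleDelta; };   // from stts
struct CttsEntry { uint32_t sampleCount; int32_t sampleOffset; };   // from ctts
struct SampleTime { int64_t dts; int64_t pts; };                    // in ticks

// Expand the run-length encoded stts and ctts tables into per-sample times.
inline std::vector<SampleTime> computeTimestamps(const std::vector<SttsEntry> &stts,
                                                 const std::vector<CttsEntry> &ctts,
                                                 int64_t startTicks) {
    std::vector<SampleTime> out;
    int64_t dts = 0;
    for (const SttsEntry &e : stts) {
        for (uint32_t i = 0; i < e.sampleCount; ++i) {
            // DTS is the sum of the deltas of all preceding samples, minus the start time.
            out.push_back({dts - startTicks, dts - startTicks});
            dts += e.sampleDelta;
        }
    }
    // Without ctts entries the loop below does nothing and PTS stays equal to DTS.
    size_t n = 0;
    for (const CttsEntry &e : ctts) {
        for (uint32_t i = 0; i < e.sampleCount && n < out.size(); ++i, ++n) {
            out[n].pts += e.sampleOffset;  // PTS = DTS + composition_offset
        }
    }
    return out;
}
```

The samples come out in decode order; sorting them by `pts` gives the presentation order.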

stsc box: sample to chunk box. Media samples are packed into chunks; neither chunks nor samples have a fixed size, and this box describes which samples belong to which chunk.

stsz box: sample size box. It records the size of each sample.

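stco box: chunk offset box. It gives the offset of each chunk relative to the start of the file. The offset of an individual sample has to be computed from it using the information in stsc, together with the sample sizes in stsz.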
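Putting stsc, stsz, and stco together, the file offset of a sample is the offset of its chunk plus the sizes of the samples that precede it inside that chunk. The following is a minimal sketch of that lookup; `StscEntry` and `sampleFileOffset` are hypothetical names, the tables are assumed to be already decoded into vectors, and the sample-description index stored in stsc is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One stsc run: starting at chunk `firstChunk` (1-based), every chunk holds
// `samplesPerChunk` samples until the next run begins.
struct StscEntry {
    uint32_t firstChunk;
    uint32_t samplesPerChunk;
};

// sampleSizes: stsz, index 0 is the first sample.
// chunkOffsets: stco/co64, index 0 is the first chunk.
// sampleIndex is 0-based. Returns false if the sample cannot be located.
inline bool sampleFileOffset(const std::vector<StscEntry> &stsc,
                             const std::vector<uint32_t> &sampleSizes,
                             const std::vector<uint64_t> &chunkOffsets,
                             uint32_t sampleIndex, uint64_t *outOffset) {
    if (stsc.empty() || sampleIndex >= sampleSizes.size()) return false;
    size_t run = 0;
    uint64_t firstSample = 0;  // index of the first sample in the current chunk
    for (size_t chunk = 0; chunk < chunkOffsets.size(); ++chunk) {
        // Move to the next stsc run once its first chunk has been reached.
        while (run + 1 < stsc.size() && stsc[run + 1].firstChunk <= chunk + 1) ++run;
        const uint32_t perChunk = stsc[run].samplesPerChunk;
        if (sampleIndex < firstSample + perChunk) {
            uint64_t offset = chunkOffsets[chunk];
            // Skip over the samples that precede the target inside this chunk.
            for (uint64_t s = firstSample; s < sampleIndex; ++s) offset += sampleSizes[s];
            *outOffset = offset;
            return true;
        }
        firstSample += perChunk;
    }
    return false;
}
```

A real parser would cache the running totals instead of rescanning from the first chunk for every sample.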

Reference: mp4封装格式各box类型讲解及IBP帧计算 - 知乎 (zhihu.com)

Reference: 视频解码研究之PTS(2)Mp4格式,AVI格式和MKV格式_面海烹鲜的博客-CSDN博客_avi pts

Online MP4 parsing: Online Mp4 Parser

Next, let's look at how MPEG4Extractor parses the file.

status_t MPEG4Extractor::parseChunk(off64_t *offset, int depth) {
ALOGV("entering parseChunk %lld/%d", (long long)*offset, depth); if (*offset < 0) {
ALOGE("b/23540914");
return ERROR_MALFORMED;
}
if (depth > 100) {
ALOGE("b/27456299");
return ERROR_MALFORMED;
}
// Read the first 8 bytes: 4 bytes of box size followed by 4 bytes of box type.
uint32_t hdr[2];
if (mDataSource->readAt(*offset, hdr, 8) < 8) {
return ERROR_IO;
}
uint64_t chunk_size = ntohl(hdr[0]);
int32_t chunk_type = ntohl(hdr[1]);
off64_t data_offset = *offset + 8;
// A size of 1 means a 64-bit largesize follows the type field (typical for a
// large mdat box); such a box is at least 16 bytes long.
if (chunk_size == 1) {
if (mDataSource->readAt(*offset + 8, &chunk_size, 8) < 8) {
return ERROR_IO;
}
chunk_size = ntoh64(chunk_size);
data_offset += 8; if (chunk_size < 16) {
// The smallest valid chunk is 16 bytes long in this case.
return ERROR_MALFORMED;
}
} else if (chunk_size == 0) { // a size of 0 means this is the last box in the file
if (depth == 0) {
// atom extends to end of file
off64_t sourceSize;
if (mDataSource->getSize(&sourceSize) == OK) {
chunk_size = (sourceSize - *offset); // the size of the last box is derived from the file size
} else {
// XXX could we just pick a "sufficiently large" value here?
ALOGE("atom size is 0, and data source has no size");
return ERROR_MALFORMED;
}
} else {
// not allowed for non-toplevel atoms, skip it
*offset += 4;
return OK;
}
} else if (chunk_size < 8) {
// The smallest valid chunk is 8 bytes long.
ALOGE("invalid chunk size: %" PRIu64, chunk_size);
return ERROR_MALFORMED;
}
char chunk[5];
// Convert the type into a four-character ASCII string.
MakeFourCCString(chunk_type, chunk);
ALOGV("chunk: %s @ %lld, %d", chunk, (long long)*offset, depth); if (kUseHexDump) {
static const char kWhitespace[] = " ";
const char *indent = &kWhitespace[sizeof(kWhitespace) - 1 - 2 * depth];
printf("%sfound chunk '%s' of size %" PRIu64 "\n", indent, chunk, chunk_size); char buffer[256];
size_t n = chunk_size;
if (n > sizeof(buffer)) {
n = sizeof(buffer);
}
if (mDataSource->readAt(*offset, buffer, n)
< (ssize_t)n) {
return ERROR_IO;
} hexdump(buffer, n);
} PathAdder autoAdder(&mPath, chunk_type); // (data_offset - *offset) is either 8 or 16
// Compute the length of the box payload: data_offset is where the payload starts, *offset is where the box starts.
off64_t chunk_data_size = chunk_size - (data_offset - *offset);
if (chunk_data_size < 0) {
ALOGE("b/23540914");
return ERROR_MALFORMED;
}
// Sanity-check the size: any box other than mdat whose payload exceeds kMaxAtomSize is treated as malformed.
if (chunk_type != FOURCC("mdat") && chunk_data_size > kMaxAtomSize) {
char errMsg[100];
sprintf(errMsg, "%s atom has size %" PRId64, chunk, chunk_data_size);
ALOGE("%s (b/28615448)", errMsg);
android_errorWriteWithInfoLog(0x534e4554, "28615448", -1, errMsg, strlen(errMsg));
return ERROR_MALFORMED;
}
// Boxes under the metadata path; not examined in this walkthrough.
if (chunk_type != FOURCC("cprt")
&& chunk_type != FOURCC("covr")
&& mPath.size() == 5 && underMetaDataPath(mPath)) {
off64_t stop_offset = *offset + chunk_size;
*offset = data_offset;
while (*offset < stop_offset) {
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
return err;
}
} if (*offset != stop_offset) {
return ERROR_MALFORMED;
} return OK;
} switch(chunk_type) {
case FOURCC("moov"):
case FOURCC("trak"):
case FOURCC("mdia"):
case FOURCC("minf"):
case FOURCC("dinf"):
case FOURCC("stbl"):
case FOURCC("mvex"):
case FOURCC("moof"):
case FOURCC("traf"):
case FOURCC("mfra"):
case FOURCC("udta"):
case FOURCC("ilst"):
case FOURCC("sinf"):
case FOURCC("schi"):
case FOURCC("edts"):
case FOURCC("wave"):
{
// A moov box at a non-zero depth is nested inside another container box, which is an error.
if (chunk_type == FOURCC("moov") && depth != 0) {
ALOGE("moov: depth %d", depth);
return ERROR_MALFORMED;
}
// A moov box seen when mInitCheck is already OK means one was parsed earlier, which is also an error.
if (chunk_type == FOURCC("moov") && mInitCheck == OK) {
ALOGE("duplicate moov");
return ERROR_MALFORMED;
} if (chunk_type == FOURCC("moof") && !mMoofFound) {
// store the offset of the first segment
mMoofFound = true;
mMoofOffset = *offset;
} if (chunk_type == FOURCC("stbl")) {
ALOGV("sampleTable chunk is %" PRIu64 " bytes long.", chunk_size); if (mDataSource->flags()
& (DataSourceBase::kWantsPrefetching
| DataSourceBase::kIsCachingDataSource)) {
CachedRangedDataSource *cachedSource =
new CachedRangedDataSource(mDataSource); if (cachedSource->setCachedRange(
*offset, chunk_size,
true /* assume ownership on success */) == OK) {
mDataSource = cachedSource;
} else {
delete cachedSource;
}
} if (mLastTrack == NULL) {
return ERROR_MALFORMED;
}
// Once stbl is reached, create a SampleTable for the track; what it is used for is covered later.
mLastTrack->sampleTable = new SampleTable(mDataSource);
} bool isTrack = false;
if (chunk_type == FOURCC("trak")) {
if (depth != 1) {
ALOGE("trak: depth %d", depth);
return ERROR_MALFORMED;
}
isTrack = true;
// On a trak box, append a new node to the Track linked list.
ALOGV("adding new track");
Track *track = new Track;
if (mLastTrack) {
mLastTrack->next = track;
} else {
mFirstTrack = track;
}
mLastTrack = track; track->meta = AMediaFormat_new();
// Give the track a default MIME type.
AMediaFormat_setString(track->meta,
AMEDIAFORMAT_KEY_MIME, "application/octet-stream");
}
// All of the box types above are container boxes, so their child boxes are parsed recursively here.
off64_t stop_offset = *offset + chunk_size;
*offset = data_offset; // child boxes start right after the box header (8 bytes, or 16 with largesize)
while (*offset < stop_offset) { // pass udata terminate
if (mIsQT && stop_offset - *offset == 4 && chunk_type == FOURCC("udta")) {
// handle the case that udta terminates with terminate code x00000000
// note that 0 terminator is optional and we just handle this case.
uint32_t terminate_code = 1;
mDataSource->readAt(*offset, &terminate_code, 4);
if (0 == terminate_code) {
*offset += 4;
ALOGD("Terminal code for udta");
continue;
} else {
ALOGW("invalid udta Terminal code");
}
}
// Recurse into the child box.
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
if (isTrack) {
mLastTrack->skipTrack = true;
break;
}
return err;
}
} if (*offset != stop_offset) {
return ERROR_MALFORMED;
}
// After the recursion finishes, a trak box needs its parsed contents consolidated into the Track.
if (isTrack) {
int32_t trackId;
// There must be exactly one track header per track.
// If the track has no track ID, mark it to be skipped.
if (!AMediaFormat_getInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_TRACK_ID, &trackId)) {
mLastTrack->skipTrack = true;
} status_t err = verifyTrack(mLastTrack);
if (err != OK) {
mLastTrack->skipTrack = true;
}
// skipTrack == true means the track is invalid, so it is removed from the linked list.
if (mLastTrack->skipTrack) {
ALOGV("skipping this track...");
Track *cur = mFirstTrack; if (cur == mLastTrack) {
delete cur;
mFirstTrack = mLastTrack = NULL;
} else {
while (cur && cur->next != mLastTrack) {
cur = cur->next;
}
if (cur) {
cur->next = NULL;
}
delete mLastTrack;
mLastTrack = cur;
} return OK;
} // place things we built elsewhere into their final locations // put aggregated tx3g data into the metadata
if (mLastTrack->mTx3gFilled > 0) {
ALOGV("Putting %zu bytes of tx3g data into meta data",
mLastTrack->mTx3gFilled);
AMediaFormat_setBuffer(mLastTrack->meta,
AMEDIAFORMAT_KEY_TEXT_FORMAT_DATA,
mLastTrack->mTx3gBuffer, mLastTrack->mTx3gFilled);
// drop it now to reduce our footprint
free(mLastTrack->mTx3gBuffer);
mLastTrack->mTx3gBuffer = NULL;
mLastTrack->mTx3gFilled = 0;
mLastTrack->mTx3gSize = 0;
} const char *mime;
AMediaFormat_getString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME, &mime);
// Check whether the MIME type is Dolby Vision video; the rest of this branch is not covered for now.
if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_DOLBY_VISION)) {
void *data;
size_t size; if (AMediaFormat_getBuffer(mLastTrack->meta, AMEDIAFORMAT_KEY_CSD_2, &data, &size)) {
const uint8_t *ptr = (const uint8_t *)data;
const uint8_t profile = ptr[2] >> 1;
const uint8_t bl_compatibility_id = (ptr[4]) >> 4;
bool create_two_tracks = false; if (bl_compatibility_id && bl_compatibility_id != 15) {
create_two_tracks = true;
} if (4 == profile || 7 == profile ||
(profile >= 8 && profile < 11 && create_two_tracks)) {
// we need a backward compatible track
ALOGV("Adding new backward compatible track");
Track *track_b = new Track; track_b->timescale = mLastTrack->timescale;
track_b->sampleTable = mLastTrack->sampleTable;
track_b->includes_expensive_metadata = mLastTrack->includes_expensive_metadata;
track_b->skipTrack = mLastTrack->skipTrack;
track_b->elst_needs_processing = mLastTrack->elst_needs_processing;
track_b->elst_media_time = mLastTrack->elst_media_time;
track_b->elst_segment_duration = mLastTrack->elst_segment_duration;
track_b->elst_shift_start_ticks = mLastTrack->elst_shift_start_ticks;
track_b->elst_initial_empty_edit_ticks = mLastTrack->elst_initial_empty_edit_ticks;
track_b->subsample_encryption = mLastTrack->subsample_encryption; track_b->mTx3gBuffer = mLastTrack->mTx3gBuffer;
track_b->mTx3gSize = mLastTrack->mTx3gSize;
track_b->mTx3gFilled = mLastTrack->mTx3gFilled; track_b->meta = AMediaFormat_new();
AMediaFormat_copy(track_b->meta, mLastTrack->meta); mLastTrack->next = track_b;
track_b->next = NULL; auto id = track_b->meta->mFormat->findEntryByName(AMEDIAFORMAT_KEY_CSD_2);
track_b->meta->mFormat->removeEntryAt(id); if (4 == profile || 7 == profile || 8 == profile ) {
AMediaFormat_setString(track_b->meta,
AMEDIAFORMAT_KEY_MIME, MEDIA_MIMETYPE_VIDEO_HEVC);
} else if (9 == profile) {
AMediaFormat_setString(track_b->meta,
AMEDIAFORMAT_KEY_MIME, MEDIA_MIMETYPE_VIDEO_AVC);
} else if (10 == profile) {
AMediaFormat_setString(track_b->meta,
AMEDIAFORMAT_KEY_MIME, MEDIA_MIMETYPE_VIDEO_AV1);
} // Should never get to else part
mLastTrack = track_b;
}
}
}
} else if (chunk_type == FOURCC("moov")) {
// Once the moov box has been scanned recursively, set mInitCheck to OK.
mInitCheck = OK;
return UNKNOWN_ERROR; // Return a dummy error.
}
break;
}
// Not examined for now; this appears to be used for playing encrypted video.
case FOURCC("schm"):
{ *offset += chunk_size;
if (!mLastTrack) {
return ERROR_MALFORMED;
} uint32_t scheme_type;
if (mDataSource->readAt(data_offset + 4, &scheme_type, 4) < 4) {
return ERROR_IO;
}
scheme_type = ntohl(scheme_type);
int32_t mode = kCryptoModeUnencrypted;
switch(scheme_type) {
case FOURCC("cbc1"):
{
mode = kCryptoModeAesCbc;
break;
}
case FOURCC("cbcs"):
{
mode = kCryptoModeAesCbc;
mLastTrack->subsample_encryption = true;
break;
}
case FOURCC("cenc"):
{
mode = kCryptoModeAesCtr;
break;
}
case FOURCC("cens"):
{
mode = kCryptoModeAesCtr;
mLastTrack->subsample_encryption = true;
break;
}
}
if (mode != kCryptoModeUnencrypted) {
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_CRYPTO_MODE, mode);
}
break;
}
// The elst box stores the start time of the stream.
case FOURCC("elst"):
{
*offset += chunk_size; if (!mLastTrack) {
return ERROR_MALFORMED;
}
// Read the version.
// See 14496-12 8.6.6
uint8_t version;
if (mDataSource->readAt(data_offset, &version, 1) < 1) {
return ERROR_IO;
}
// Read the number of entries in the box.
uint32_t entry_count;
if (!mDataSource->getUInt32(data_offset + 4, &entry_count)) {
return ERROR_IO;
} if (entry_count > 2) {
/* We support a single entry for gapless playback or negating offset for
* reordering B frames, two entries (empty edit) for start offset at the moment.
*/
ALOGW("ignoring edit list with %d entries", entry_count);
} else {
off64_t entriesoffset = data_offset + 8;
uint64_t segment_duration;
int64_t media_time;
bool empty_edit_present = false;
for (int i = 0; i < entry_count; ++i) {
switch (version) {
// Only version 0 is covered here.
case 0: {
uint32_t sd;
int32_t mt;
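// Read segment_duration, the duration of this edit segment (in the movie header timescale).
// Read media_time, the start time of the stream, used when computing DTS and PTS.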
if (!mDataSource->getUInt32(entriesoffset, &sd) ||
!mDataSource->getUInt32(entriesoffset + 4, (uint32_t*)&mt)) {
return ERROR_IO;
}
segment_duration = sd;
media_time = mt;
// 4(segment duration) + 4(media time) + 4(media rate)
entriesoffset += 12;
break;
}
case 1: {
if (!mDataSource->getUInt64(entriesoffset, &segment_duration) ||
!mDataSource->getUInt64(entriesoffset + 8, (uint64_t*)&media_time)) {
return ERROR_IO;
}
// 8(segment duration) + 8(media time) + 4(media rate)
entriesoffset += 20;
break;
}
default:
return ERROR_IO;
break;
}
// Empty edit entry would have to be first entry.
if (media_time == -1 && i == 0) {
empty_edit_present = true;
ALOGV("initial empty edit ticks: %" PRIu64, segment_duration);
/* In movie header timescale, and needs to be converted to media timescale
* after we get that from a track's 'mdhd' atom,
* which at times come after 'elst'.
*/
mLastTrack->elst_initial_empty_edit_ticks = segment_duration;
} else if (media_time >= 0 && i == 0) {
ALOGV("first edit list entry - from gapless playback files");
// Save the elst information into the Track.
mLastTrack->elst_media_time = media_time;
mLastTrack->elst_segment_duration = segment_duration;
ALOGV("segment_duration: %" PRIu64 " media_time: %" PRId64,
segment_duration, media_time);
// media_time is in media timescale as are STTS/CTTS entries.
mLastTrack->elst_shift_start_ticks = media_time;
} else if (empty_edit_present && i == 1) {
// Process second entry only when the first entry was an empty edit entry.
ALOGV("second edit list entry");
mLastTrack->elst_shift_start_ticks = media_time;
} else {
ALOGW("for now, unsupported entry in edit list %" PRIu32, entry_count);
}
}
// save these for later, because the elst atom might precede
// the atoms that actually gives us the duration and sample rate
// needed to calculate the padding and delay values
mLastTrack->elst_needs_processing = true;
}
break;
}
// If there is a frma box
case FOURCC("frma"):
{
*offset += chunk_size; uint32_t original_fourcc;
if (mDataSource->readAt(data_offset, &original_fourcc, 4) < 4) {
return ERROR_IO;
}
original_fourcc = ntohl(original_fourcc);
ALOGV("read original format: %d", original_fourcc); if (mLastTrack == NULL) {
return ERROR_MALFORMED;
}
// Set the track's MIME type.
AMediaFormat_setString(mLastTrack->meta,
AMEDIAFORMAT_KEY_MIME, FourCC2MIME(original_fourcc));
uint32_t num_channels = 0;
uint32_t sample_rate = 0;
if (AdjustChannelsAndRate(original_fourcc, &num_channels, &sample_rate)) {
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_CHANNEL_COUNT, num_channels);
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_SAMPLE_RATE, sample_rate);
} if (!mIsQT && original_fourcc == FOURCC("alac")) {
off64_t tmpOffset = *offset;
status_t err = parseALACSampleEntry(&tmpOffset);
if (err != OK) {
ALOGE("parseALACSampleEntry err:%d Line:%d", err, __LINE__);
return err;
}
*offset = tmpOffset + 8;
} break;
} // ...... // Parse the track header.
case FOURCC("tkhd"):
{
*offset += chunk_size; status_t err;
// Mainly parses the track ID and, for a video track, the width and height, and stores them in the metadata.
if ((err = parseTrackHeader(data_offset, chunk_data_size)) != OK) {
return err;
} break;
} // ...... // Parse mdhd.
case FOURCC("mdhd"):
{
*offset += chunk_size; if (chunk_data_size < 4 || mLastTrack == NULL) {
return ERROR_MALFORMED;
} uint8_t version;
if (mDataSource->readAt(
data_offset, &version, sizeof(version))
< (ssize_t)sizeof(version)) {
return ERROR_IO;
} off64_t timescale_offset; if (version == 1) {
timescale_offset = data_offset + 4 + 16;
} else if (version == 0) {
timescale_offset = data_offset + 4 + 8;
} else {
return ERROR_IO;
}
// Read the timescale.
uint32_t timescale;
if (mDataSource->readAt(
timescale_offset, &timescale, sizeof(timescale))
< (ssize_t)sizeof(timescale)) {
return ERROR_IO;
} if (!timescale) {
ALOGE("timescale should not be ZERO.");
return ERROR_MALFORMED;
}
// Store the timescale in the track.
mLastTrack->timescale = ntohl(timescale); // 14496-12 says all ones means indeterminate, but some files seem to use
// 0 instead. We treat both the same.
int64_t duration = 0;
if (version == 1) {
if (mDataSource->readAt(
timescale_offset + 4, &duration, sizeof(duration))
< (ssize_t)sizeof(duration)) {
return ERROR_IO;
}
if (duration != -1) {
duration = ntoh64(duration);
}
} else {
// Only version 0 is covered here.
uint32_t duration32;
// Read the duration of the current track.
if (mDataSource->readAt(
timescale_offset + 4, &duration32, sizeof(duration32))
< (ssize_t)sizeof(duration32)) {
return ERROR_IO;
}
if (duration32 != 0xffffffff) {
duration = ntohl(duration32);
}
}
if (duration != 0 && mLastTrack->timescale != 0) {
// The real duration is the duration read here divided by the timescale.
long double durationUs = ((long double)duration * 1000000) / mLastTrack->timescale;
if (durationUs < 0 || durationUs > INT64_MAX) {
ALOGE("cannot represent %lld * 1000000 / %lld in 64 bits",
(long long) duration, (long long) mLastTrack->timescale);
return ERROR_MALFORMED;
}
// The duration stored in the metadata is in microseconds.
AMediaFormat_setInt64(mLastTrack->meta, AMEDIAFORMAT_KEY_DURATION, durationUs);
} uint8_t lang[2];
off64_t lang_offset;
if (version == 1) {
lang_offset = timescale_offset + 4 + 8;
} else if (version == 0) {
lang_offset = timescale_offset + 4 + 4;
} else {
return ERROR_IO;
} if (mDataSource->readAt(lang_offset, &lang, sizeof(lang))
< (ssize_t)sizeof(lang)) {
return ERROR_IO;
} // To get the ISO-639-2/T three character language code
// 1 bit pad followed by 3 5-bits characters. Each character
// is packed as the difference between its ASCII value and 0x60.
char lang_code[4];
lang_code[0] = ((lang[0] >> 2) & 0x1f) + 0x60;
lang_code[1] = ((lang[0] & 0x3) << 3 | (lang[1] >> 5)) + 0x60;
lang_code[2] = (lang[1] & 0x1f) + 0x60;
lang_code[3] = '\0';
// Set the language key in the metadata.
AMediaFormat_setString(mLastTrack->meta, AMEDIAFORMAT_KEY_LANGUAGE, lang_code); break;
}
// A very important box: the MIME type is derived from its child boxes.
case FOURCC("stsd"):
{
uint8_t buffer[8];
if (chunk_data_size < (off64_t)sizeof(buffer)) {
return ERROR_MALFORMED;
} if (mDataSource->readAt(
data_offset, buffer, 8) < 8) {
return ERROR_IO;
} if (U32_AT(buffer) != 0) {
// Should be version 0, flags 0.
return ERROR_MALFORMED;
} uint32_t entry_count = U32_AT(&buffer[4]); if (entry_count > 1) {
// For 3GPP timed text, there could be multiple tx3g boxes contain
// multiple text display formats. These formats will be used to
// display the timed text.
// For encrypted files, there may also be more than one entry.
const char *mime; if (mLastTrack == NULL)
return ERROR_MALFORMED; CHECK(AMediaFormat_getString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME, &mime));
if (strcasecmp(mime, MEDIA_MIMETYPE_TEXT_3GPP) &&
strcasecmp(mime, "application/octet-stream")) {
// For now we only support a single type of media per track.
mLastTrack->skipTrack = true;
*offset += chunk_size;
break;
}
}
off64_t stop_offset = *offset + chunk_size;
*offset = data_offset + 8;
for (uint32_t i = 0; i < entry_count; ++i) {
// Recursively parse the child boxes, from which the MIME type is derived.
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
return err;
}
} if (*offset != stop_offset) {
return ERROR_MALFORMED;
}
break;
}
// If the child box of stsd has one of the following types, this is an audio track.
case FOURCC("mp4a"):
case FOURCC("enca"):
case FOURCC("samr"):
case FOURCC("sawb"):
case FOURCC("Opus"):
case FOURCC("twos"):
case FOURCC("sowt"):
case FOURCC("alac"):
case FOURCC("fLaC"):
case FOURCC(".mp3"):
case 0x6D730055: // "ms U" mp3 audio
{
if (mIsQT && depth >= 1 && mPath[depth - 1] == FOURCC("wave")) { if (chunk_type == FOURCC("alac")) {
off64_t offsetTmp = *offset;
status_t err = parseALACSampleEntry(&offsetTmp);
if (err != OK) {
ALOGE("parseALACSampleEntry err:%d Line:%d", err, __LINE__);
return err;
}
} // Ignore all atoms embedded in QT wave atom
ALOGV("Ignore all atoms embedded in QT wave atom");
*offset += chunk_size;
break;
} uint8_t buffer[8 + 20];
if (chunk_data_size < (ssize_t)sizeof(buffer)) {
// Basic AudioSampleEntry size.
return ERROR_MALFORMED;
} if (mDataSource->readAt(
data_offset, buffer, sizeof(buffer)) < (ssize_t)sizeof(buffer)) {
return ERROR_IO;
} uint16_t data_ref_index __unused = U16_AT(&buffer[6]);
uint16_t version = U16_AT(&buffer[8]);
uint32_t num_channels = U16_AT(&buffer[16]); uint16_t sample_size = U16_AT(&buffer[18]);
uint32_t sample_rate = U32_AT(&buffer[24]) >> 16; if (mLastTrack == NULL)
return ERROR_MALFORMED; off64_t stop_offset = *offset + chunk_size;
*offset = data_offset + sizeof(buffer); if (mIsQT) {
if (version == 1) {
if (mDataSource->readAt(*offset, buffer, 16) < 16) {
return ERROR_IO;
}
#if 0
U32_AT(buffer); // samples per packet
U32_AT(&buffer[4]); // bytes per packet
U32_AT(&buffer[8]); // bytes per frame
U32_AT(&buffer[12]); // bytes per sample
#endif
*offset += 16;
} else if (version == 2) {
uint8_t v2buffer[36];
if (mDataSource->readAt(*offset, v2buffer, 36) < 36) {
return ERROR_IO;
}
#if 0
U32_AT(v2buffer); // size of struct only
sample_rate = (uint32_t)U64_AT(&v2buffer[4]); // audio sample rate
num_channels = U32_AT(&v2buffer[12]); // num audio channels
U32_AT(&v2buffer[16]); // always 0x7f000000
sample_size = (uint16_t)U32_AT(&v2buffer[20]); // const bits per channel
U32_AT(&v2buffer[24]); // format specifc flags
U32_AT(&v2buffer[28]); // const bytes per audio packet
U32_AT(&v2buffer[32]); // const LPCM frames per audio packet
#endif
*offset += 36;
}
} if (chunk_type != FOURCC("enca")) {
// if the chunk type is enca, we'll get the type from the frma box later
AMediaFormat_setString(mLastTrack->meta,
AMEDIAFORMAT_KEY_MIME, FourCC2MIME(chunk_type));
AdjustChannelsAndRate(chunk_type, &num_channels, &sample_rate); if (!strcasecmp(MEDIA_MIMETYPE_AUDIO_RAW, FourCC2MIME(chunk_type))) {
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_BITS_PER_SAMPLE, sample_size);
if (chunk_type == FOURCC("twos")) {
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_PCM_BIG_ENDIAN, 1);
}
}
}
// Store the channel count and sample rate that were read into the metadata.
ALOGV("*** coding='%s' %d channels, size %d, rate %d\n",
chunk, num_channels, sample_size, sample_rate);
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_CHANNEL_COUNT, num_channels);
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_SAMPLE_RATE, sample_rate);
// ......
if (!mIsQT && chunk_type == FOURCC("alac")) {
data_offset += sizeof(buffer); status_t err = parseALACSampleEntry(&data_offset);
if (err != OK) {
ALOGE("parseALACSampleEntry err:%d Line:%d", err, __LINE__);
return err;
}
*offset = data_offset;
CHECK_EQ(*offset, stop_offset);
} if (chunk_type == FOURCC("fLaC")) { // From https://github.com/xiph/flac/blob/master/doc/isoflac.txt
// 4 for mime, 4 for blockType and BlockLen, 34 for metadata
uint8_t flacInfo[4 + 4 + 34];
// skipping dFla, version
data_offset += sizeof(buffer) + 12;
size_t flacOffset = 4;
// Add flaC header mime type to CSD
strncpy((char *)flacInfo, "fLaC", 4);
if (mDataSource->readAt(
data_offset, flacInfo + flacOffset, sizeof(flacInfo) - flacOffset) <
(ssize_t)sizeof(flacInfo) - flacOffset) {
return ERROR_IO;
}
data_offset += sizeof(flacInfo) - flacOffset; AMediaFormat_setBuffer(mLastTrack->meta, AMEDIAFORMAT_KEY_CSD_0, flacInfo,
sizeof(flacInfo));
*offset = data_offset;
CHECK_EQ(*offset, stop_offset);
} while (*offset < stop_offset) {
// Keep recursing into the child boxes.
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
return err;
}
} if (*offset != stop_offset) {
return ERROR_MALFORMED;
}
break;
}
// If the box type is one of the following, the current track is a video track.
case FOURCC("mp4v"):
case FOURCC("encv"):
case FOURCC("s263"):
case FOURCC("H263"):
case FOURCC("h263"):
case FOURCC("avc1"):
case FOURCC("hvc1"):
case FOURCC("hev1"):
case FOURCC("dvav"):
case FOURCC("dva1"):
case FOURCC("dvhe"):
case FOURCC("dvh1"):
case FOURCC("dav1"):
case FOURCC("av01"):
{
uint8_t buffer[78];
if (chunk_data_size < (ssize_t)sizeof(buffer)) {
// Basic VideoSampleEntry size.
return ERROR_MALFORMED;
} if (mDataSource->readAt(
data_offset, buffer, sizeof(buffer)) < (ssize_t)sizeof(buffer)) {
return ERROR_IO;
} uint16_t data_ref_index __unused = U16_AT(&buffer[6]);
uint16_t width = U16_AT(&buffer[6 + 18]);
uint16_t height = U16_AT(&buffer[6 + 20]); // The video sample is not standard-compliant if it has invalid dimension.
// Use some default width and height value, and
// let the decoder figure out the actual width and height (and thus
// be prepared for INFO_FOMRAT_CHANGED event).
if (width == 0) width = 352;
if (height == 0) height = 288;
// printf("*** coding='%s' width=%d height=%d\n",
//        chunk, width, height);
if (mLastTrack == NULL)
    return ERROR_MALFORMED;
if (chunk_type != FOURCC("encv")) {
// if the chunk type is encv, we'll get the type from the frma box later
AMediaFormat_setString(mLastTrack->meta,
AMEDIAFORMAT_KEY_MIME, FourCC2MIME(chunk_type));
}
// The video width and height are parsed here as well and stored in the metadata.
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_WIDTH, width);
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_HEIGHT, height); off64_t stop_offset = *offset + chunk_size;
*offset = data_offset + sizeof(buffer);
while (*offset < stop_offset) {
// Keep parsing the child boxes.
status_t err = parseChunk(offset, depth + 1);
if (err != OK) {
return err;
}
} if (*offset != stop_offset) {
return ERROR_MALFORMED;
}
break;
}
// Parse stco/co64, which stores the offset of each chunk in the file (the chunk data itself lives in mdat).
case FOURCC("stco"):
case FOURCC("co64"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL)) {
return ERROR_MALFORMED;
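}
// Set the chunk offset parameters. The SampleTable was created with mDataSource when stbl
// was reached (possibly caching the whole stbl range), so only the location and size of
// the table are passed here.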
status_t err =
mLastTrack->sampleTable->setChunkOffsetParams(
chunk_type, data_offset, chunk_data_size); *offset += chunk_size; if (err != OK) {
return err;
} break;
} case FOURCC("stsc"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL))
return ERROR_MALFORMED;
// Record the data region of stsc.
status_t err =
mLastTrack->sampleTable->setSampleToChunkParams(
data_offset, chunk_data_size); *offset += chunk_size; if (err != OK) {
return err;
} break;
} case FOURCC("stsz"):
case FOURCC("stz2"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL)) {
return ERROR_MALFORMED;
}
// Record the data region of stsz.
status_t err =
mLastTrack->sampleTable->setSampleSizeParams(
chunk_type, data_offset, chunk_data_size); *offset += chunk_size; if (err != OK) {
return err;
} adjustRawDefaultFrameSize(); size_t max_size;
err = mLastTrack->sampleTable->getMaxSampleSize(&max_size); if (err != OK) {
return err;
} if (max_size != 0) {
// Assume that a given buffer only contains at most 10 chunks,
// each chunk originally prefixed with a 2 byte length will
// have a 4 byte header (0x00 0x00 0x00 0x01) after conversion,
// and thus will grow by 2 bytes per chunk.
if (max_size > SIZE_MAX - 10 * 2) {
ALOGE("max sample size too big: %zu", max_size);
return ERROR_MALFORMED;
}
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_MAX_INPUT_SIZE, max_size + 10 * 2);
} else {
// No size was specified. Pick a conservatively large size.
uint32_t width, height;
if (!AMediaFormat_getInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_WIDTH, (int32_t*)&width) ||
!AMediaFormat_getInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_HEIGHT,(int32_t*) &height)) {
ALOGE("No width or height, assuming worst case 1080p");
width = 1920;
height = 1080;
} else {
// A resolution was specified, check that it's not too big. The values below
// were chosen so that the calculations below don't cause overflows, they're
// not indicating that resolutions up to 32kx32k are actually supported.
if (width > 32768 || height > 32768) {
ALOGE("can't support %u x %u video", width, height);
return ERROR_MALFORMED;
}
} const char *mime;
CHECK(AMediaFormat_getString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME, &mime));
if (!strncmp(mime, "audio/", 6)) {
// for audio, use 128KB
max_size = 1024 * 128;
} else if (!strcmp(mime, MEDIA_MIMETYPE_VIDEO_AVC)
|| !strcmp(mime, MEDIA_MIMETYPE_VIDEO_HEVC)
|| !strcmp(mime, MEDIA_MIMETYPE_VIDEO_DOLBY_VISION)) {
// AVC & HEVC requires compression ratio of at least 2, and uses
// macroblocks
max_size = ((width + 15) / 16) * ((height + 15) / 16) * 192;
} else {
// For all other formats there is no minimum compression
// ratio. Use compression ratio of 1.
max_size = width * height * 3 / 2;
}
// HACK: allow 10% overhead
// TODO: read sample size from traf atom for fragmented MPEG4.
max_size += max_size / 10;
// Set the maximum input buffer size.
AMediaFormat_setInt32(mLastTrack->meta, AMEDIAFORMAT_KEY_MAX_INPUT_SIZE, max_size);
} // NOTE: setting another piece of metadata invalidates any pointers (such as the
// mimetype) previously obtained, so don't cache them.
const char *mime;
CHECK(AMediaFormat_getString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME, &mime));
// Calculate average frame rate.
if (!strncasecmp("video/", mime, 6)) {
size_t nSamples = mLastTrack->sampleTable->countSamples();
if (nSamples == 0) {
int32_t trackId;
if (AMediaFormat_getInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_TRACK_ID, &trackId)) {
for (size_t i = 0; i < mTrex.size(); i++) {
Trex *t = &mTrex.editItemAt(i);
if (t->track_ID == (uint32_t) trackId) {
if (t->default_sample_duration > 0) {
int32_t frameRate =
mLastTrack->timescale / t->default_sample_duration;
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_FRAME_RATE, frameRate);
}
break;
}
}
}
} else {
int64_t durationUs;
if (AMediaFormat_getInt64(mLastTrack->meta,
AMEDIAFORMAT_KEY_DURATION, &durationUs)) {
if (durationUs > 0) {
int32_t frameRate = (nSamples * 1000000LL +
(durationUs >> 1)) / durationUs;
// Set the frame rate in the metadata.
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_FRAME_RATE, frameRate);
}
}
ALOGV("setting frame count %zu", nSamples);
// Set the frame count in the metadata.
AMediaFormat_setInt32(mLastTrack->meta,
AMEDIAFORMAT_KEY_FRAME_COUNT, nSamples);
}
} break;
} case FOURCC("stts"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL))
return ERROR_MALFORMED; *offset += chunk_size; if (depth >= 1 && mPath[depth - 1] != FOURCC("stbl")) {
char chunk[5];
MakeFourCCString(mPath[depth - 1], chunk);
ALOGW("stts's parent box (%s) is not stbl, skip it.", chunk);
break;
} status_t err =
mLastTrack->sampleTable->setTimeToSampleParams(
data_offset, chunk_data_size); if (err != OK) {
return err;
} break;
} case FOURCC("ctts"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL))
return ERROR_MALFORMED; *offset += chunk_size; status_t err =
mLastTrack->sampleTable->setCompositionTimeToSampleParams(
data_offset, chunk_data_size); if (err != OK) {
return err;
} break;
} case FOURCC("stss"):
{
if ((mLastTrack == NULL) || (mLastTrack->sampleTable == NULL))
return ERROR_MALFORMED; *offset += chunk_size; status_t err =
mLastTrack->sampleTable->setSyncSampleParams(
data_offset, chunk_data_size); if (err != OK) {
return err;
} break;
} // ...... // If the child box of avc1 is avcC, the SPS and PPS can be parsed from it.
case FOURCC("avcC"):
{
*offset += chunk_size; auto buffer = heapbuffer<uint8_t>(chunk_data_size); if (buffer.get() == NULL) {
ALOGE("b/28471206");
return NO_MEMORY;
} if (mDataSource->readAt(
data_offset, buffer.get(), chunk_data_size) < chunk_data_size) {
return ERROR_IO;
} if (mLastTrack == NULL)
return ERROR_MALFORMED;
// Use the buffer that was read as the CSD buffer.
AMediaFormat_setBuffer(mLastTrack->meta,
AMEDIAFORMAT_KEY_CSD_AVC, buffer.get(), chunk_data_size); break;
}
case FOURCC("hvcC"):
{
auto buffer = heapbuffer<uint8_t>(chunk_data_size);

if (buffer.get() == NULL) {
ALOGE("b/28471206");
return NO_MEMORY;
}

if (mDataSource->readAt(
data_offset, buffer.get(), chunk_data_size) < chunk_data_size) {
return ERROR_IO;
}

if (mLastTrack == NULL)
return ERROR_MALFORMED;

// Likewise for HEVC: store the hvcC payload (VPS/SPS/PPS) in the meta as the CSD buffer
AMediaFormat_setBuffer(mLastTrack->meta,
AMEDIAFORMAT_KEY_CSD_HEVC, buffer.get(), chunk_data_size);

*offset += chunk_size;
break;
}
case FOURCC("av1C"):
{
auto buffer = heapbuffer<uint8_t>(chunk_data_size);

if (buffer.get() == NULL) {
ALOGE("b/28471206");
return NO_MEMORY;
}

if (mDataSource->readAt(
data_offset, buffer.get(), chunk_data_size) < chunk_data_size) {
return ERROR_IO;
}

if (mLastTrack == NULL)
return ERROR_MALFORMED;

AMediaFormat_setBuffer(mLastTrack->meta,
AMEDIAFORMAT_KEY_CSD_0, buffer.get(), chunk_data_size);

*offset += chunk_size;
break;
}
// Dolby Vision configuration boxes
case FOURCC("dvcC"):
case FOURCC("dvvC"): {

CHECK_EQ(chunk_data_size, 24);

auto buffer = heapbuffer<uint8_t>(chunk_data_size);

if (buffer.get() == NULL) {
ALOGE("b/28471206");
return NO_MEMORY;
}

if (mDataSource->readAt(data_offset, buffer.get(), chunk_data_size) < chunk_data_size) {
return ERROR_IO;
}

if (mLastTrack == NULL)
return ERROR_MALFORMED;

AMediaFormat_setBuffer(mLastTrack->meta, AMEDIAFORMAT_KEY_CSD_2,
buffer.get(), chunk_data_size);
AMediaFormat_setString(mLastTrack->meta, AMEDIAFORMAT_KEY_MIME,
MEDIA_MIMETYPE_VIDEO_DOLBY_VISION);

*offset += chunk_size;
break;
}

// ......

// mvhd carries the file-level metadata (timescale, duration, creation time)
case FOURCC("mvhd"):
{
*offset += chunk_size;

if (depth != 1) {
ALOGE("mvhd: depth %d", depth);
return ERROR_MALFORMED;
}
if (chunk_data_size < 32) {
return ERROR_MALFORMED;
}

uint8_t header[32];
if (mDataSource->readAt(
data_offset, header, sizeof(header))
< (ssize_t)sizeof(header)) {
return ERROR_IO;
}

uint64_t creationTime;
uint64_t duration = 0;
if (header[0] == 1) {
creationTime = U64_AT(&header[4]);
mHeaderTimescale = U32_AT(&header[20]);
duration = U64_AT(&header[24]);
if (duration == 0xffffffffffffffff) {
duration = 0;
}
} else if (header[0] != 0) {
return ERROR_MALFORMED;
} else {
creationTime = U32_AT(&header[4]);
mHeaderTimescale = U32_AT(&header[12]);
uint32_t d32 = U32_AT(&header[16]);
if (d32 == 0xffffffff) {
d32 = 0;
}
duration = d32;
}
if (duration != 0 && mHeaderTimescale != 0 && duration < UINT64_MAX / 1000000) {
AMediaFormat_setInt64(mFileMetaData,
AMEDIAFORMAT_KEY_DURATION, duration * 1000000 / mHeaderTimescale);
}

String8 s;
if (convertTimeToDate(creationTime, &s)) {
AMediaFormat_setString(mFileMetaData, AMEDIAFORMAT_KEY_DATE, s.string());
}

break;
}

// Record that mdat was found and skip over the whole box
case FOURCC("mdat"):
{
mMdatFound = true;

*offset += chunk_size;
break;
}

// handler_type in hdlr is not used as the MIME type here (text/sbtl below is the exception),
// although it should be usable to tell audio from video
case FOURCC("hdlr"):
{
*offset += chunk_size;

if (underQTMetaPath(mPath, 3)) {
break;
}

uint32_t buffer;
if (mDataSource->readAt(
data_offset + 8, &buffer, 4) < 4) {
return ERROR_IO;
}

uint32_t type = ntohl(buffer);
// For the 3GPP file format, the handler-type within the 'hdlr' box
// shall be 'text'. We also want to support 'sbtl' handler type
// for a practical reason as various MPEG4 containers use it.
if (type == FOURCC("text") || type == FOURCC("sbtl")) {
if (mLastTrack != NULL) {
AMediaFormat_setString(mLastTrack->meta,
AMEDIAFORMAT_KEY_MIME, MEDIA_MIMETYPE_TEXT_3GPP);
}
}

break;
}

// ......

// tx3g is the 3GPP timed text (subtitle) sample entry; its bytes are accumulated in mTx3gBuffer
case FOURCC("tx3g"):
{
if (mLastTrack == NULL)
return ERROR_MALFORMED;

// complain about ridiculous chunks
if (chunk_size > kMaxAtomSize) {
return ERROR_MALFORMED;
}

// complain about empty atoms
if (chunk_data_size <= 0) {
ALOGE("b/124330204");
android_errorWriteLog(0x534e4554, "124330204");
return ERROR_MALFORMED;
}

// should fill buffer based on "data_offset" and "chunk_data_size"
// instead of *offset and chunk_size;
// but we've been feeding the extra data to consumers for multiple releases and
// if those apps are compensating for it, we'd break them with such a change

if (mLastTrack->mTx3gBuffer == NULL) {
mLastTrack->mTx3gSize = 0;
mLastTrack->mTx3gFilled = 0;
}
if (mLastTrack->mTx3gSize - mLastTrack->mTx3gFilled < chunk_size) {
size_t growth = kTx3gGrowth;
if (growth < chunk_size) {
growth = chunk_size;
}
// although this disallows 2 tx3g atoms of nearly kMaxAtomSize...
if ((uint64_t) mLastTrack->mTx3gSize + growth > kMaxAtomSize) {
ALOGE("b/124330204 - too much space");
android_errorWriteLog(0x534e4554, "124330204");
return ERROR_MALFORMED;
}
uint8_t *updated = (uint8_t *)realloc(mLastTrack->mTx3gBuffer,
mLastTrack->mTx3gSize + growth);
if (updated == NULL) {
return ERROR_MALFORMED;
}
mLastTrack->mTx3gBuffer = updated;
mLastTrack->mTx3gSize += growth;
}

if ((size_t)(mDataSource->readAt(*offset,
mLastTrack->mTx3gBuffer + mLastTrack->mTx3gFilled,
chunk_size))
< chunk_size) {

// advance read pointer so we don't end up reading this again
*offset += chunk_size;
return ERROR_IO;
}

mLastTrack->mTx3gFilled += chunk_size;
*offset += chunk_size;
break;
}

case FOURCC("ac-3"):
{
*offset += chunk_size;
// bypass ac-3 if parse fail
if (parseAC3SpecificBox(data_offset) != OK) {
if (mLastTrack != NULL) {
ALOGW("Fail to parse ac-3");
mLastTrack->skipTrack = true;
}
}
return OK;
}

case FOURCC("ec-3"):
{
*offset += chunk_size;
// bypass ec-3 if parse fail
if (parseEAC3SpecificBox(data_offset) != OK) {
if (mLastTrack != NULL) {
ALOGW("Fail to parse ec-3");
mLastTrack->skipTrack = true;
}
}
return OK;
}

case FOURCC("ac-4"):
{
*offset += chunk_size;
// bypass ac-4 if parse fail
if (parseAC4SpecificBox(data_offset) != OK) {
if (mLastTrack != NULL) {
ALOGW("Fail to parse ac-4");
mLastTrack->skipTrack = true;
}
}
return OK;
}

case FOURCC("ftyp"):
{
if (chunk_data_size < 8 || depth != 0) {
return ERROR_MALFORMED;
}

off64_t stop_offset = *offset + chunk_size;
uint32_t numCompatibleBrands = (chunk_data_size - 8) / 4;
std::set<uint32_t> brandSet;
for (size_t i = 0; i < numCompatibleBrands + 2; ++i) {
if (i == 1) {
// Skip this index, it refers to the minorVersion,
// not a brand.
continue;
}

uint32_t brand;
if (mDataSource->readAt(data_offset + 4 * i, &brand, 4) < 4) {
return ERROR_MALFORMED;
}

brand = ntohl(brand);
brandSet.insert(brand);
}

// The QuickTime brand is four characters: 'q', 't' and two spaces
if (brandSet.count(FOURCC("qt  ")) > 0) {
mIsQT = true;
} else {
if (brandSet.count(FOURCC("mif1")) > 0
&& brandSet.count(FOURCC("heic")) > 0) {
ALOGV("identified HEIF image");

mIsHeif = true;
brandSet.erase(FOURCC("mif1"));
brandSet.erase(FOURCC("heic"));
}

if (!brandSet.empty()) {
// This means that the file should have moov box.
// It could be any iso files (mp4, heifs, etc.)
mHasMoovBox = true;
if (mIsHeif) {
ALOGV("identified HEIF image with other tracks");
}
}
}

*offset = stop_offset;

break;
}

default:
{
// check if we're parsing 'ilst' for meta keys
// if so, treat type as a number (key-id).
if (underQTMetaPath(mPath, 3)) {
status_t err = parseQTMetaVal(chunk_type, data_offset, chunk_data_size);
if (err != OK) {
return err;
}
}

*offset += chunk_size;
break;
}
}

return OK;
}
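
The mvhd branch above is easier to follow as a standalone function. The following is a minimal sketch, not the extractor's code: `readBE` and `mvhdDurationUs` are hypothetical names, and `body` is assumed to point at the box payload (the version byte), exactly like `header` in the listing. It uses the same field offsets and the same "all ones means unknown" rule for version 0 and version 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical helper (not from MPEG4Extractor): read an n-byte big-endian integer.
static uint64_t readBE(const uint8_t *p, size_t n) {
    uint64_t v = 0;
    for (size_t i = 0; i < n; ++i) {
        v = (v << 8) | p[i];
    }
    return v;
}

// Returns the movie duration in microseconds, or std::nullopt when the header is
// too short, has an unknown version, or carries no usable duration.
std::optional<int64_t> mvhdDurationUs(const uint8_t *body, size_t size) {
    if (size < 32) {
        return std::nullopt;
    }
    uint64_t timescale = 0;
    uint64_t duration = 0;
    if (body[0] == 1) {
        // version 1: 64-bit creation/modification times and duration
        timescale = readBE(body + 20, 4);
        duration = readBE(body + 24, 8);
        if (duration == UINT64_MAX) duration = 0;  // all ones: duration unknown
    } else if (body[0] == 0) {
        // version 0: 32-bit creation/modification times and duration
        timescale = readBE(body + 12, 4);
        duration = readBE(body + 16, 4);
        if (duration == UINT32_MAX) duration = 0;  // all ones: duration unknown
    } else {
        return std::nullopt;
    }
    // Same guard as the listing: avoid overflow in duration * 1000000.
    if (duration == 0 || timescale == 0 || duration >= UINT64_MAX / 1000000) {
        return std::nullopt;
    }
    return static_cast<int64_t>(duration * 1000000 / timescale);
}
```

For a version 0 header with timescale 1000 and duration 5000 this yields 5,000,000 µs, i.e. 5 seconds.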

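SampleTable holds a DataSource. When the stts, ctts, stss, and similar boxes are parsed, parseChunk does not decode their contents itself; it passes each box's data offset and data size to the matching SampleTable setter (setTimeToSampleParams, setCompositionTimeToSampleParams, setSyncSampleParams), which SampleTable then uses to locate the table entries through its DataSource.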
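
To make concrete what the stts table is later used for, here is a small sketch of how a decode time can be derived from (sample_count, sample_delta) pairs. This is an illustration only and not SampleTable's implementation; `SttsEntry` and `decodeTimeTicks` are hypothetical names, and the entries are assumed to be already read into memory.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// One stts entry: `sampleCount` consecutive samples that each last `sampleDelta` ticks.
struct SttsEntry {
    uint32_t sampleCount;
    uint32_t sampleDelta;
};

// Decode time, in media-timescale ticks, of the sample at 0-based `index`.
// Returns std::nullopt if the index is past the last sample described by the table.
std::optional<uint64_t> decodeTimeTicks(const std::vector<SttsEntry> &stts, uint32_t index) {
    uint64_t time = 0;   // decode time of the first sample of the current entry
    uint64_t first = 0;  // index of the first sample of the current entry
    for (const SttsEntry &e : stts) {
        if (index < first + e.sampleCount) {
            return time + (index - first) * e.sampleDelta;
        }
        time += static_cast<uint64_t>(e.sampleCount) * e.sampleDelta;
        first += e.sampleCount;
    }
    return std::nullopt;
}
```

Dividing the result by the track timescale from mdhd gives seconds; multiplying by 1000000 first gives microseconds, which is the unit the extractor reports.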

MPEG4Extractor::getTrack

MediaTrackHelper *MPEG4Extractor::getTrack(size_t index) {
status_t err;
if ((err = readMetaData()) != OK) {
return NULL;
}
// Walk the track list to the entry at position index
Track *track = mFirstTrack;
while (index > 0) {
if (track == NULL) {
return NULL;
}

track = track->next;
--index;
}

if (track == NULL) {
return NULL;
}

// Look up the Trex entry that matches this track's ID
Trex *trex = NULL;
int32_t trackId;
if (AMediaFormat_getInt32(track->meta, AMEDIAFORMAT_KEY_TRACK_ID, &trackId)) {
for (size_t i = 0; i < mTrex.size(); i++) {
Trex *t = &mTrex.editItemAt(i);
if (t->track_ID == (uint32_t) trackId) {
trex = t;
break;
}
}
} else {
ALOGE("b/21657957");
return NULL;
}

ALOGV("getTrack called, pssh: %zu", mPssh.size());

// The track must have a MIME type
const char *mime;
if (!AMediaFormat_getString(track->meta, AMEDIAFORMAT_KEY_MIME, &mime)) {
return NULL;
}

sp<ItemTable> itemTable;

// For AVC, validate the CSD buffer
if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_AVC)) {
void *data;
size_t size;
if (!AMediaFormat_getBuffer(track->meta, AMEDIAFORMAT_KEY_CSD_AVC, &data, &size)) {
return NULL;
}

const uint8_t *ptr = (const uint8_t *)data;

// Read the CSD buffer and check configurationVersion
if (size < 7 || ptr[0] != 1) { // configurationVersion == 1
return NULL;
}
} else if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_HEVC)
|| !strcasecmp(mime, MEDIA_MIMETYPE_IMAGE_ANDROID_HEIC)) {
void *data;
size_t size;
if (!AMediaFormat_getBuffer(track->meta, AMEDIAFORMAT_KEY_CSD_HEVC, &data, &size)) {
return NULL;
}

const uint8_t *ptr = (const uint8_t *)data;

if (size < 22 || ptr[0] != 1) { // configurationVersion == 1
return NULL;
}
if (!strcasecmp(mime, MEDIA_MIMETYPE_IMAGE_ANDROID_HEIC)) {
itemTable = mItemTable;
}
} else if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_DOLBY_VISION)) {
void *data;
size_t size;
if (!AMediaFormat_getBuffer(track->meta, AMEDIAFORMAT_KEY_CSD_2, &data, &size)) {
return NULL;
}

const uint8_t *ptr = (const uint8_t *)data;

// dv_major.dv_minor Should be 1.0 or 2.1
if (size != 24 || ((ptr[0] != 1 || ptr[1] != 0) && (ptr[0] != 2 || ptr[1] != 1))) {
return NULL;
}
} else if (!strcasecmp(mime, MEDIA_MIMETYPE_VIDEO_AV1)) {
void *data;
size_t size;
if (!AMediaFormat_getBuffer(track->meta, AMEDIAFORMAT_KEY_CSD_0, &data, &size)) {
return NULL;
}
const uint8_t *ptr = (const uint8_t *)data;

if (size < 5 || ptr[0] != 0x81) { // configurationVersion == 1
return NULL;
}
}

ALOGV("track->elst_shift_start_ticks :%" PRIu64, track->elst_shift_start_ticks);

uint64_t elst_initial_empty_edit_ticks = 0;
if (mHeaderTimescale != 0) {
// Convert empty_edit_ticks from movie timescale to media timescale.
uint64_t elst_initial_empty_edit_ticks_mul = 0, elst_initial_empty_edit_ticks_add = 0;
if (__builtin_mul_overflow(track->elst_initial_empty_edit_ticks, track->timescale,
&elst_initial_empty_edit_ticks_mul) ||
__builtin_add_overflow(elst_initial_empty_edit_ticks_mul, (mHeaderTimescale / 2),
&elst_initial_empty_edit_ticks_add)) {
ALOGE("track->elst_initial_empty_edit_ticks overflow");
return nullptr;
}
elst_initial_empty_edit_ticks = elst_initial_empty_edit_ticks_add / mHeaderTimescale;
}
ALOGV("elst_initial_empty_edit_ticks in MediaTimeScale :%" PRIu64,
elst_initial_empty_edit_ticks);

// Create the MPEG4Source for this track and return it
MPEG4Source* source =
new MPEG4Source(track->meta, mDataSource, track->timescale, track->sampleTable,
mSidxEntries, trex, mMoofOffset, itemTable,
track->elst_shift_start_ticks, elst_initial_empty_edit_ticks);
if (source->init() != OK) {
delete source;
return NULL;
}
return source;
}
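
The edit-list conversion in getTrack (movie timescale to media timescale, rounded to nearest, with overflow checks) can be isolated as follows. This is a sketch under assumptions: `movieToMediaTicks` is a hypothetical name, and `__builtin_mul_overflow` / `__builtin_add_overflow` are GCC/Clang builtins, as in the listing. Like getTrack, it leaves the result at 0 when the movie timescale is 0.

```cpp
#include <cstdint>
#include <optional>

// Convert `movieTicks` (expressed in the mvhd timescale) into ticks of the track's
// media timescale, rounding to the nearest tick.
// Returns std::nullopt when the intermediate arithmetic would overflow 64 bits.
std::optional<uint64_t> movieToMediaTicks(uint64_t movieTicks,
                                          uint32_t mediaTimescale,
                                          uint32_t movieTimescale) {
    if (movieTimescale == 0) {
        return 0;  // mirrors getTrack: no conversion is attempted
    }
    uint64_t product = 0;
    uint64_t rounded = 0;
    if (__builtin_mul_overflow(movieTicks, mediaTimescale, &product) ||
        __builtin_add_overflow(product, movieTimescale / 2, &rounded)) {
        return std::nullopt;
    }
    // Adding half the divisor before dividing rounds to nearest instead of truncating.
    return rounded / movieTimescale;
}
```

For example, an empty edit of 1000 ticks at a movie timescale of 1000 (one second) becomes 48000 ticks on a 48 kHz audio track.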

MPEG4Source::read

media_status_t MPEG4Source::read(
MediaBufferHelper **out, const ReadOptions *options) {
Mutex::Autolock autoLock(mLock);

CHECK(mStarted);

if (options != nullptr && options->getNonBlocking() && !mBufferGroup->has_buffers()) {
*out = nullptr;
return AMEDIA_ERROR_WOULD_BLOCK;
}

if (mFirstMoofOffset > 0) {
return fragmentedRead(out, options);
}

*out = NULL;

int64_t targetSampleTimeUs = -1;

int64_t seekTimeUs;
ReadOptions::SeekMode mode;

// Handle a seek request, if there is one
if (options && options->getSeekTo(&seekTimeUs, &mode)) {
ALOGV("seekTimeUs:%" PRId64, seekTimeUs);
if (mIsHeif) {
CHECK(mSampleTable == NULL);
CHECK(mItemTable != NULL);
int32_t imageIndex;
if (!AMediaFormat_getInt32(mFormat, AMEDIAFORMAT_KEY_TRACK_ID, &imageIndex)) {
return AMEDIA_ERROR_MALFORMED;
}

status_t err;
if (seekTimeUs >= 0) {
err = mItemTable->findImageItem(imageIndex, &mCurrentSampleIndex);
} else {
err = mItemTable->findThumbnailItem(imageIndex, &mCurrentSampleIndex);
}
if (err != OK) {
return AMEDIA_ERROR_UNKNOWN;
}
} else {
// Map the seek mode to SampleTable find flags
uint32_t findFlags = 0;
switch (mode) {
case ReadOptions::SEEK_PREVIOUS_SYNC:
findFlags = SampleTable::kFlagBefore;
break;
case ReadOptions::SEEK_NEXT_SYNC:
findFlags = SampleTable::kFlagAfter;
break;
case ReadOptions::SEEK_CLOSEST_SYNC:
case ReadOptions::SEEK_CLOSEST:
findFlags = SampleTable::kFlagClosest;
break;
case ReadOptions::SEEK_FRAME_INDEX:
findFlags = SampleTable::kFlagFrameIndex;
break;
default:
CHECK(!"Should not be here.");
break;
}
if( mode != ReadOptions::SEEK_FRAME_INDEX) {
int64_t elstInitialEmptyEditUs = 0, elstShiftStartUs = 0;
if (mElstInitialEmptyEditTicks > 0) {
elstInitialEmptyEditUs = ((long double)mElstInitialEmptyEditTicks * 1000000) /
mTimescale;
/* Sample's composition time from ctts/stts entries are non-negative(>=0).
* Hence, lower bound on seekTimeUs is 0.
*/
seekTimeUs = std::max(seekTimeUs - elstInitialEmptyEditUs, (int64_t)0);
}
if (mElstShiftStartTicks > 0) {
elstShiftStartUs = ((long double)mElstShiftStartTicks * 1000000) / mTimescale;
seekTimeUs += elstShiftStartUs;
}
ALOGV("shifted seekTimeUs:%" PRId64 ", elstInitialEmptyEditUs:%" PRIu64
", elstShiftStartUs:%" PRIu64, seekTimeUs, elstInitialEmptyEditUs,
elstShiftStartUs);
}

uint32_t sampleIndex;
// Call SampleTable::findSampleAtTime to find the sample index for the seek time and mode
status_t err = mSampleTable->findSampleAtTime(
seekTimeUs, 1000000, mTimescale,
&sampleIndex, findFlags);

if (mode == ReadOptions::SEEK_CLOSEST
|| mode == ReadOptions::SEEK_FRAME_INDEX) {
// We found the closest sample already, now we want the sync
// sample preceding it (or the sample itself of course), even
// if the subsequent sync sample is closer.
findFlags = SampleTable::kFlagBefore;
}

uint32_t syncSampleIndex = sampleIndex;
// assume every non-USAC audio sample is a sync sample. This works around
// seek issues with files that were incorrectly written with an
// empty or single-sample stss block for the audio track
if (err == OK && (!mIsAudio || mIsUsac)) {
err = mSampleTable->findSyncSampleNear(
sampleIndex, &syncSampleIndex, findFlags);
}

// Get the composition time of the sample found by time (offset and size are not requested here)
uint64_t sampleTime;
if (err == OK) {
err = mSampleTable->getMetaDataForSample(
sampleIndex, NULL, NULL, &sampleTime);
}

if (err != OK) {
if (err == ERROR_OUT_OF_RANGE) {
// An attempt to seek past the end of the stream would
// normally cause this ERROR_OUT_OF_RANGE error. Propagating
// this all the way to the MediaPlayer would cause abnormal
// termination. Legacy behaviour appears to be to behave as if
// we had seeked to the end of stream, ending normally.
return AMEDIA_ERROR_END_OF_STREAM;
}
ALOGV("end of stream");
return AMEDIA_ERROR_UNKNOWN;
}

if (mode == ReadOptions::SEEK_CLOSEST
|| mode == ReadOptions::SEEK_FRAME_INDEX) {
if (mElstInitialEmptyEditTicks > 0) {
sampleTime += mElstInitialEmptyEditTicks;
}
if (mElstShiftStartTicks > 0){
if (sampleTime > mElstShiftStartTicks) {
sampleTime -= mElstShiftStartTicks;
} else {
sampleTime = 0;
}
}
targetSampleTimeUs = (sampleTime * 1000000ll) / mTimescale;
}
// Record the sync sample as the current read position
mCurrentSampleIndex = syncSampleIndex;
}

if (mBuffer != NULL) {
mBuffer->release();
mBuffer = NULL;
}

// fall through
}

off64_t offset = 0;
size_t size = 0;
int64_t cts;
uint64_t stts;
bool isSyncSample;
bool newBuffer = false;
if (mBuffer == NULL) {
newBuffer = true;

status_t err;
if (!mIsHeif) {
// Look up the offset, size, composition time, sync flag and duration of the current sample
err = mSampleTable->getMetaDataForSample(mCurrentSampleIndex, &offset, &size,
(uint64_t*)&cts, &isSyncSample, &stts);
if(err == OK) {
if (mElstInitialEmptyEditTicks > 0) {
cts += mElstInitialEmptyEditTicks;
}
// Apply the edit-list start shift to the composition time
if (mElstShiftStartTicks > 0) {
// cts can be negative. for example, initial audio samples for gapless playback.
cts -= (int64_t)mElstShiftStartTicks;
}
}
} else {
err = mItemTable->getImageOffsetAndSize(
options && options->getSeekTo(&seekTimeUs, &mode) ?
&mCurrentSampleIndex : NULL, &offset, &size);

cts = stts = 0;
isSyncSample = 0;
ALOGV("image offset %lld, size %zu", (long long)offset, size);
}

if (err != OK) {
if (err == ERROR_END_OF_STREAM) {
return AMEDIA_ERROR_END_OF_STREAM;
}
return AMEDIA_ERROR_UNKNOWN;
}

// Acquire an output buffer from the buffer group
err = mBufferGroup->acquire_buffer(&mBuffer);

if (err != OK) {
CHECK(mBuffer == NULL);
return AMEDIA_ERROR_UNKNOWN;
}
if (size > mBuffer->size()) {
ALOGE("buffer too small: %zu > %zu", size, mBuffer->size());
mBuffer->release();
mBuffer = NULL;
return AMEDIA_ERROR_UNKNOWN; // ERROR_BUFFER_TOO_SMALL
}
}

// ......

// AVC/HEVC sample data: convert it and return it to the caller
else {
// Whole NAL units are returned but each fragment is prefixed by
// the start code (0x00 00 00 01).
ssize_t num_bytes_read = 0;
bool mSrcBufferFitsDataToRead = size <= mSrcBufferSize;
if (mSrcBufferFitsDataToRead) {
// Read the whole sample into mSrcBuffer
num_bytes_read = mDataSource->readAt(offset, mSrcBuffer, size);
} else {
// We are trying to read a sample larger than the expected max sample size.
// Fall through and let the failure be handled by the following if.
android_errorWriteLog(0x534e4554, "188893559");
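}

if (num_bytes_read < (ssize_t)size) {
mBuffer->release();
mBuffer = NULL;

return mSrcBufferFitsDataToRead ? AMEDIA_ERROR_IO : AMEDIA_ERROR_MALFORMED;
}

uint8_t *dstData = (uint8_t *)mBuffer->data();
size_t srcOffset = 0;
size_t dstOffset = 0;

// A sample may hold several NAL units, each preceded by a length field of
// mNALLengthSize bytes. Walk them, validate each length, and copy each NAL unit
// out with a start code in place of the length field.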
while (srcOffset < size) {
bool isMalFormed = !isInRange((size_t)0u, size, srcOffset, mNALLengthSize);
size_t nalLength = 0;
if (!isMalFormed) {
nalLength = parseNALSize(&mSrcBuffer[srcOffset]);
srcOffset += mNALLengthSize;
isMalFormed = !isInRange((size_t)0u, size, srcOffset, nalLength);
}

if (isMalFormed) {
//if nallength abnormal,ignore it.
ALOGW("abnormal nallength, ignore this NAL");
srcOffset = size;
break;
}

if (nalLength == 0) {
continue;
}

if (dstOffset > SIZE_MAX - 4 ||
dstOffset + 4 > SIZE_MAX - nalLength ||
dstOffset + 4 + nalLength > mBuffer->size()) {
ALOGE("b/27208621 : %zu %zu", dstOffset, mBuffer->size());
android_errorWriteLog(0x534e4554, "27208621");
mBuffer->release();
mBuffer = NULL;
return AMEDIA_ERROR_MALFORMED;
}

// Prepend the start code (00 00 00 01) to each AVC/HEVC NAL unit
dstData[dstOffset++] = 0;
dstData[dstOffset++] = 0;
dstData[dstOffset++] = 0;
dstData[dstOffset++] = 1;
memcpy(&dstData[dstOffset], &mSrcBuffer[srcOffset], nalLength);
srcOffset += nalLength;
dstOffset += nalLength;
}
CHECK_EQ(srcOffset, size);
CHECK(mBuffer != NULL);
mBuffer->set_range(0, dstOffset);

// Set the presentation time and duration of the frame just read
AMediaFormat *meta = mBuffer->meta_data();
AMediaFormat_clear(meta);
AMediaFormat_setInt64(
meta, AMEDIAFORMAT_KEY_TIME_US, ((long double)cts * 1000000) / mTimescale);
AMediaFormat_setInt64(
meta, AMEDIAFORMAT_KEY_DURATION, ((long double)stts * 1000000) / mTimescale);

if (targetSampleTimeUs >= 0) {
AMediaFormat_setInt64(
meta, AMEDIAFORMAT_KEY_TARGET_TIME, targetSampleTimeUs);
}

if (mIsAVC) {
uint32_t layerId = FindAVCLayerId(
(const uint8_t *)mBuffer->data(), mBuffer->range_length());
AMediaFormat_setInt32(meta, AMEDIAFORMAT_KEY_TEMPORAL_LAYER_ID, layerId);
} else if (mIsHEVC) {
int32_t layerId = parseHEVCLayerId(
(const uint8_t *)mBuffer->data(), mBuffer->range_length());
if (layerId >= 0) {
AMediaFormat_setInt32(meta, AMEDIAFORMAT_KEY_TEMPORAL_LAYER_ID, layerId);
}
}

if (isSyncSample) {
AMediaFormat_setInt32(meta, AMEDIAFORMAT_KEY_IS_SYNC_FRAME, 1);
}

// Advance to the next sample
++mCurrentSampleIndex;
// Hand the buffer to the caller
*out = mBuffer;
mBuffer = NULL;

return AMEDIA_OK;
}
}
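
The NAL loop at the end of read() is the usual conversion from length-prefixed NAL units (as stored in mp4) to start-code-prefixed NAL units (as most decoders expect). A self-contained sketch of the same idea follows. `toAnnexB` is a hypothetical name, and unlike the listing, which logs a warning and stops at the first bad length, this sketch rejects the whole sample to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Rewrite a sample made of [length][NAL][length][NAL]... into
// [00 00 00 01][NAL][00 00 00 01][NAL]...
// `nalLengthSize` is the size of each length field in bytes (from avcC/hvcC).
// Returns std::nullopt if a length field is truncated or points past the sample end.
std::optional<std::vector<uint8_t>> toAnnexB(const std::vector<uint8_t> &sample,
                                             size_t nalLengthSize) {
    if (nalLengthSize < 1 || nalLengthSize > 4) {
        return std::nullopt;
    }
    static const uint8_t kStartCode[4] = {0, 0, 0, 1};
    std::vector<uint8_t> out;
    size_t pos = 0;
    while (pos < sample.size()) {
        if (sample.size() - pos < nalLengthSize) {
            return std::nullopt;  // truncated length field
        }
        size_t nalLength = 0;
        for (size_t i = 0; i < nalLengthSize; ++i) {
            nalLength = (nalLength << 8) | sample[pos + i];  // big-endian
        }
        pos += nalLengthSize;
        if (nalLength > sample.size() - pos) {
            return std::nullopt;  // NAL unit would run past the end of the sample
        }
        if (nalLength == 0) {
            continue;  // empty NAL unit: nothing to emit
        }
        out.insert(out.end(), kStartCode, kStartCode + 4);
        out.insert(out.end(), sample.data() + pos, sample.data() + pos + nalLength);
        pos += nalLength;
    }
    return out;
}
```

With 4-byte length fields the output is exactly as long as the input, since each length field is replaced by a 4-byte start code; with shorter length fields the output grows, which is why read() checks `dstOffset + 4 + nalLength` against the buffer size.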
