The FFmpeg Decoding Process

shaobin0604@163.com 2011-10-10

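1. Starting from the basics
First, a few concepts that will make the later analysis easier to follow.
Container: in audio/video, a container generally means a specific file format that
    describes the audio, video, subtitle and other data it holds.
Stream: a slightly subtle word that shows up in many places (TCP, SVR4, and so on).
    For audio/video, you can think of it as a run of pure audio data or pure video
    data.
Frame: hard to pin down precisely. It is one data unit within a Stream; to really
    understand it you may need some background in audio/video coding theory.
Packet: the raw (still encoded) data of a Stream.
Codec: coder + decoder.
All of these concepts are reflected directly in FFmpeg, as the analysis below will show.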

2. The basic decoding flow
Being lazy, I will simply borrow the overview from <An ffmpeg and SDL Tutorial>:

10 OPEN video_stream FROM video.avi
20 READ packet FROM video_stream INTO frame
30 IF frame NOT COMPLETE GOTO 20
40 DO SOMETHING WITH frame
50 GOTO 20

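That is the whole decoding process. At first glance it hardly looks like much :).
But everything has a shallow side and a deep side, and going from shallow to deep and
then back to shallow is probably the interesting journey. Our story starts here.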
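As a rough C rendering of the loop above, here is a minimal sketch that uses no FFmpeg at all. `ToySource`, `ToyDecoder`, `read_packet`, `feed_decoder` and `count_frames` are hypothetical stand-ins invented for illustration, and a "frame" is simply assumed to be complete once `FRAME_SIZE` bytes have accumulated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_SIZE 8   /* assumed size of one complete toy frame, in bytes */

/* Hypothetical stand-in for an opened video stream: a byte buffer
 * that is handed out in fixed-size packets. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t pos;
    size_t packet_size;
} ToySource;

/* Hypothetical stand-in for the decoder state: bytes gathered so far. */
typedef struct {
    unsigned char frame[FRAME_SIZE];
    size_t filled;
} ToyDecoder;

/* Step 20: fetch the next packet; returns its length, or 0 at end of stream. */
size_t read_packet(ToySource *src, const unsigned char **pkt)
{
    size_t left = src->size - src->pos;
    size_t n = left < src->packet_size ? left : src->packet_size;
    *pkt = src->data + src->pos;
    src->pos += n;
    return n;
}

/* Feed one packet to the decoder; returns 1 once a whole frame is ready.
 * Assumes packets never straddle a frame boundary; surplus bytes are dropped. */
int feed_decoder(ToyDecoder *dec, const unsigned char *pkt, size_t n)
{
    if (dec->filled == FRAME_SIZE)
        dec->filled = 0;                     /* previous frame was consumed */
    if (n > FRAME_SIZE - dec->filled)
        n = FRAME_SIZE - dec->filled;
    memcpy(dec->frame + dec->filled, pkt, n);
    dec->filled += n;
    return dec->filled == FRAME_SIZE;
}

/* Steps 10 to 50 as a C loop; returns the number of complete frames seen. */
int count_frames(const unsigned char *data, size_t size, size_t packet_size)
{
    ToySource src = { data, size, 0, packet_size };   /* 10 OPEN            */
    ToyDecoder dec = { {0}, 0 };
    const unsigned char *pkt;
    size_t n;
    int frames = 0;

    assert(packet_size > 0);
    while ((n = read_packet(&src, &pkt)) > 0) {       /* 20 READ packet     */
        if (!feed_decoder(&dec, pkt, n))              /* 30 IF NOT COMPLETE */
            continue;                                 /*    GOTO 20         */
        frames++;                                     /* 40 DO SOMETHING    */
    }                                                 /* 50 GOTO 20         */
    return frames;
}
```

Calling `count_frames(buf, 24, 4)` on a 24-byte buffer walks six packets and reports three frames; a trailing partial frame is never reported, which is the role the "frame NOT COMPLETE" test plays in the pseudocode.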

3. Example code
<An ffmpeg and SDL Tutorial 1> gives a bare-bones decoder. Let's look closely at the
story behind it. For ease of discussion, here is the code first:
#include <ffmpeg/avcodec.h>
#include <ffmpeg/avformat.h>

#include <stdio.h>

void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame) {
  FILE *pFile;
  char szFilename[32];
  int  y;
 
  // Open file
  sprintf(szFilename, "frame%d.ppm", iFrame);
  pFile=fopen(szFilename, "wb");
  if(pFile==NULL)
    return;
 
  // Write header
  fprintf(pFile, "P6\n%d %d\n255\n", width, height);
 
  // Write pixel data
  for(y=0; y<height; y++)
    fwrite(pFrame->data[0]+y*pFrame->linesize[0], 1, width*3, pFile);
 
  // Close file
  fclose(pFile);
}

int main(int argc, char *argv[]) {
  AVFormatContext *pFormatCtx;
  int             i, videoStream;
  AVCodecContext  *pCodecCtx;
  AVCodec         *pCodec;
  AVFrame         *pFrame;
  AVFrame         *pFrameRGB;
  AVPacket        packet;
  int             frameFinished;
  int             numBytes;
  uint8_t         *buffer;
 
  if(argc < 2) {
    printf("Please provide a movie file\n");
    return -1;
  }
  // Register all formats and codecs
  // [1]
  av_register_all();
 
  // Open video file
  // [2]
  if(av_open_input_file(&pFormatCtx, argv[1], NULL, 0, NULL)!=0)
    return -1; // Couldn't open file
 
  // Retrieve stream information
  // [3]
  if(av_find_stream_info(pFormatCtx)<0)
    return -1; // Couldn't find stream information
 
  // Dump information about file onto standard error
  dump_format(pFormatCtx, 0, argv[1], 0);
 
  // Find the first video stream
  videoStream=-1;
  for(i=0; i<pFormatCtx->nb_streams; i++)
    if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_VIDEO) {
      videoStream=i;
      break;
    }
  if(videoStream==-1)
    return -1; // Didn't find a video stream
 
  // Get a pointer to the codec context for the video stream
  pCodecCtx=pFormatCtx->streams[videoStream]->codec;
 
  // Find the decoder for the video stream
  pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
  if(pCodec==NULL) {
    fprintf(stderr, "Unsupported codec!\n");
    return -1; // Codec not found
  }
  // Open codec
  if(avcodec_open(pCodecCtx, pCodec)<0)
    return -1; // Could not open codec
 
  // Allocate video frame
  pFrame=avcodec_alloc_frame();
  if(pFrame==NULL)
    return -1;
 
  // Allocate an AVFrame structure
  pFrameRGB=avcodec_alloc_frame();
  if(pFrameRGB==NULL)
    return -1;
   
  // Determine required buffer size and allocate buffer
  numBytes=avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
                  pCodecCtx->height);
  buffer=(uint8_t *)av_malloc(numBytes*sizeof(uint8_t));
 
  // Assign appropriate parts of buffer to image planes in pFrameRGB
  // Note that pFrameRGB is an AVFrame, but AVFrame is a superset
  // of AVPicture
  avpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24,
         pCodecCtx->width, pCodecCtx->height);
 
  // Read frames and save first five frames to disk
  // [4]
  i=0;
  while(av_read_frame(pFormatCtx, &packet)>=0) {
    // Is this a packet from the video stream?
    if(packet.stream_index==videoStream) {
      // Decode video frame
      avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
               packet.data, packet.size);
     
      // Did we get a video frame?
      if(frameFinished) {
        // Convert the image from its native format to RGB
        img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24,
                    (AVPicture*)pFrame, pCodecCtx->pix_fmt,
                    pCodecCtx->width,
                    pCodecCtx->height);

        // Save the frame to disk
        if(++i<=5)
          SaveFrame(pFrameRGB, pCodecCtx->width, pCodecCtx->height,
                    i);
      }
    }
   
    // Free the packet that was allocated by av_read_frame
    av_free_packet(&packet);
  }
 
  // Free the RGB image
  av_free(buffer);
  av_free(pFrameRGB);
 
  // Free the YUV frame
  av_free(pFrame);
 
  // Close the codec
  avcodec_close(pCodecCtx);
 
  // Close the video file
  av_close_input_file(pFormatCtx);
 
  return 0;
}

The code is clearly commented and needs little further explanation. If formats such
as YUV420, RGB and PPM are unfamiliar, please look them up on Google, or see the
related articles at http://barrypopy./
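For readers who have not met PPM before, the layout that SaveFrame writes can be sketched on its own. The example below is a minimal illustration, not part of the tutorial: `encode_ppm` is a hypothetical helper that builds the same "P6" header and row-by-row pixel data in memory, and its `stride` parameter plays the role of `pFrame->linesize[0]`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Encode a packed RGB24 image as a binary PPM (P6) into dst.
 * `stride` is the distance in bytes between the starts of two rows; it may
 * be larger than width*3 when rows are padded, which is why each row is
 * located with y*stride rather than y*width*3.
 * Returns the number of bytes written, or 0 if dst is too small. */
size_t encode_ppm(uint8_t *dst, size_t cap,
                  const uint8_t *rgb, int width, int height, int stride)
{
    char header[32];
    int hlen;
    size_t row_bytes, total;
    int y;

    assert(width > 0 && height > 0 && stride >= width * 3);

    /* Header: magic number, dimensions, maximum channel value. */
    hlen = snprintf(header, sizeof header, "P6\n%d %d\n255\n", width, height);
    if (hlen < 0 || (size_t)hlen >= sizeof header)
        return 0;

    row_bytes = (size_t)width * 3;
    total = (size_t)hlen + row_bytes * (size_t)height;
    if (total > cap)
        return 0;

    memcpy(dst, header, (size_t)hlen);

    /* Pixel data: copy only the visible bytes of each row, skipping padding. */
    for (y = 0; y < height; y++)
        memcpy(dst + hlen + (size_t)y * row_bytes,
               rgb + (size_t)y * (size_t)stride, row_bytes);

    return total;
}
```

Writing the returned bytes to a file opened in "wb" mode gives a file that ordinary image viewers can open, which is all SaveFrame does for each of the first five frames.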

This code is a good demo of how frame grabbing can be implemented, but we should look
at what the magician is doing backstage rather than simply enjoy the show.

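4. The story behind the scenes
The real difficulty lies in [1], [2], [3] and [4] above. Everything else is conversion
between data structures, and is not hard to understand if you read the code carefully.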

[1]: Not much to say. If it is unclear, see the article on the FFmpeg framework that I reposted.

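[2]: First, the AVFormatContext *pFormatCtx structure. Read literally, AVFormatContext
is the Context in which an AVFormat (that is, the Container format described above)
lives, so it is naturally the master structure holding the Container's information.
As you will see later, essentially all other information can be reached starting from it.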
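The idea that everything hangs off one master structure can be pictured with a stripped-down model. The types below (`ToyFormatContext`, `ToyStream`, `ToyCodecParams`) are hypothetical and carry only a handful of fields; they are not FFmpeg's real definitions, and merely mirror the access path `pFormatCtx->streams[i]->codec->codec_type` used in the example program.

```c
#include <assert.h>
#include <stddef.h>

enum ToyMediaType { TOY_TYPE_VIDEO, TOY_TYPE_AUDIO, TOY_TYPE_SUBTITLE };

/* Per-stream codec parameters (plays the role of AVCodecContext). */
typedef struct {
    enum ToyMediaType codec_type;
    int codec_id;
    int width, height;          /* meaningful for video streams only */
} ToyCodecParams;

/* One elementary stream inside the container (plays the role of AVStream). */
typedef struct {
    int index;
    ToyCodecParams *codec;
} ToyStream;

/* The container-level context (plays the role of AVFormatContext). */
typedef struct {
    const char *filename;
    unsigned int nb_streams;
    ToyStream **streams;
} ToyFormatContext;

/* Return the index of the first stream of the given type, or -1 if none.
 * This is the same walk the example program does to find videoStream. */
int find_first_stream(const ToyFormatContext *ctx, enum ToyMediaType type)
{
    unsigned int i;

    assert(ctx != NULL);
    for (i = 0; i < ctx->nb_streams; i++)
        if (ctx->streams[i]->codec->codec_type == type)
            return (int)i;
    return -1;
}
```

Starting from a single `ToyFormatContext` pointer, the stream list, each stream's codec parameters and the video dimensions are all reachable, which is the property the paragraph above attributes to AVFormatContext.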
   
Let's see what av_open_input_file() actually does:
[libavformat/utils.c]
int av_open_input_file(AVFormatContext **ic_ptr, const char *filename,
                       AVInputFormat *fmt,
                       int buf_size,
                       AVFormatParameters *ap)
{
    ......
    if (!fmt) {
        /* guess format if no file can be opened */
        fmt = av_probe_input_format(pd, 0);
    }

   ......
    err = av_open_input_stream(ic_ptr, pb, filename, fmt, ap);
   ......
}

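Seen this way, it does only two things:
1). Probe the container file format
2). Read the Stream information from the container file

Both steps amount to calling the demuxer for that particular file type in order to
separate out the Streams.

The detailed flow is as follows: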

av_open_input_file
    |
    +---->av_probe_input_format walks all registered demuxers, starting from
    |     first_iformat, and calls each one's probe function (read_probe)
    |
    +---->av_open_input_stream calls the chosen demuxer's read_header function
          (ic->iformat->read_header) to obtain the stream information
If you now go back to the FFmpeg framework article I reposted, are things a little clearer? :)
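The probe-then-read_header flow can be modeled in a few dozen lines. This is a toy sketch, not FFmpeg's code: every `Toy`/`toy_` name is invented, the scoring is reduced to "100 or 0", and the stream counts returned by the two `read_header` callbacks are made up. Only the 4-byte signatures are real: RIFF-based files such as AVI begin with "RIFF" and Ogg pages begin with "OggS".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a demuxer descriptor. */
typedef struct ToyInputFormat {
    const char *name;
    /* Returns a confidence score; 0 means "not my format". */
    int (*read_probe)(const unsigned char *buf, size_t size);
    /* Parses the header; returns the number of streams found, or -1. */
    int (*read_header)(const unsigned char *buf, size_t size);
    struct ToyInputFormat *next;
} ToyInputFormat;

/* Head of the list of registered demuxers (the role of first_iformat). */
static ToyInputFormat *toy_first_iformat = NULL;

void toy_register_input_format(ToyInputFormat *fmt)
{
    assert(fmt != NULL);
    fmt->next = toy_first_iformat;
    toy_first_iformat = fmt;
}

/* Walk every registered demuxer and keep the one with the highest score. */
ToyInputFormat *toy_probe_input_format(const unsigned char *buf, size_t size)
{
    ToyInputFormat *fmt, *best = NULL;
    int best_score = 0;

    for (fmt = toy_first_iformat; fmt != NULL; fmt = fmt->next) {
        int score = fmt->read_probe ? fmt->read_probe(buf, size) : 0;
        if (score > best_score) {
            best_score = score;
            best = fmt;
        }
    }
    return best;
}

/* Probe unless the caller forces a format, then let that demuxer read the header. */
int toy_open_input(const unsigned char *buf, size_t size, ToyInputFormat *fmt)
{
    if (!fmt)
        fmt = toy_probe_input_format(buf, size);
    if (!fmt)
        return -1;                       /* no demuxer recognised the data */
    return fmt->read_header(buf, size);
}

/* Two toy demuxers, each recognised by a 4-byte signature at offset 0. */
static int riff_probe(const unsigned char *buf, size_t size)
{
    return (size >= 4 && memcmp(buf, "RIFF", 4) == 0) ? 100 : 0;
}
static int riff_header(const unsigned char *buf, size_t size)
{
    (void)buf; (void)size;
    return 2;                            /* made-up: one video + one audio stream */
}
static int ogg_probe(const unsigned char *buf, size_t size)
{
    return (size >= 4 && memcmp(buf, "OggS", 4) == 0) ? 100 : 0;
}
static int ogg_header(const unsigned char *buf, size_t size)
{
    (void)buf; (void)size;
    return 1;                            /* made-up: a single audio stream */
}

ToyInputFormat toy_riff = { "riff", riff_probe, riff_header, NULL };
ToyInputFormat toy_ogg  = { "ogg",  ogg_probe,  ogg_header,  NULL };
```

After both formats are registered, `toy_open_input(data, size, NULL)` probes first, just as av_open_input_file does when `fmt` is NULL, while passing an explicit format skips the probe entirely.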

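[3]: Fills in the Stream information on the AVFormatContext, reading some packets from the file where the header alone is not enough. Not much more to say.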

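[4]: First, a little background on ffmpeg. In theory a Packet may contain only part of
a frame's data, but for ease of implementation ffmpeg arranges that, for video, each
Packet contains a whole frame (the av_read_frame documentation says exactly one), and
audio is handled in a similar way. This is an implementation decision, not something
the protocols require.
So what the code above really does is:
    read a Packet from the file;
    decode the corresponding frame from the Packet;
    if (the frame is complete)
        do_something();

Let's see how the Packet is obtained, and how a frame is decoded from it.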

av_read_frame
    |
    +---->av_read_frame_internal
        |
        +---->av_parser_parse calls the selected parser's s->parser->parser_parse function to rebuild frames from raw packets
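What "rebuild frames from raw packets" means can be shown with a toy parser. The sketch below is an assumption-laden model, not FFmpeg's parser: frames are simply byte runs terminated by a newline, and `ToyParser`, `toy_parser_parse` and `toy_count_frames` are hypothetical names. The calling convention imitates the general shape of a parse call: it returns how many input bytes were consumed and reports a frame only when one has been completed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_FRAME 64
#define TOY_FRAME_END '\n'      /* assumed frame delimiter for this toy format */

typedef struct {
    unsigned char buf[TOY_MAX_FRAME];
    size_t len;
    int ready;                  /* the frame in buf was already handed out */
} ToyParser;

/* Consume bytes from `in` until a frame is complete or the input runs out.
 * Returns the number of input bytes consumed. *out_size is the frame length
 * (delimiter excluded) when a frame was completed by this call, else 0. */
size_t toy_parser_parse(ToyParser *p,
                        const unsigned char **out, size_t *out_size,
                        const unsigned char *in, size_t in_size)
{
    size_t used = 0;

    *out = NULL;
    *out_size = 0;
    if (p->ready) {             /* drop the frame returned by the last call */
        p->len = 0;
        p->ready = 0;
    }
    while (used < in_size) {
        unsigned char c = in[used++];
        if (c == TOY_FRAME_END) {
            *out = p->buf;
            *out_size = p->len;
            p->ready = 1;
            break;
        }
        assert(p->len < TOY_MAX_FRAME);
        p->buf[p->len++] = c;   /* partial frame: keep it for the next call */
    }
    return used;
}

/* Feed a sequence of raw packets (C strings here) and count complete frames. */
int toy_count_frames(const char **packets, size_t n_packets)
{
    ToyParser p = { {0}, 0, 0 };
    int frames = 0;
    size_t i;

    for (i = 0; i < n_packets; i++) {
        const unsigned char *in = (const unsigned char *)packets[i];
        size_t left = strlen(packets[i]);

        while (left > 0) {
            const unsigned char *frame;
            size_t frame_size;
            size_t used = toy_parser_parse(&p, &frame, &frame_size, in, left);

            in += used;
            left -= used;
            if (frame_size > 0)
                frames++;
        }
    }
    return frames;
}
```

The parser keeps partial data between calls, so a frame split across two raw packets still comes out whole; that carried-over state is what lets av_read_frame hand complete frames to the caller.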

avcodec_decode_video
    |
    +---->avctx->codec->decode calls the decode function of the specified Codec
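The `avctx->codec->decode` indirection is ordinary dispatch through a function pointer. The following sketch is hypothetical throughout: `ToyCodec`, `ToyDecoderContext`, `toy_decode_video` and the run-length "codec" are invented for illustration and are far simpler than anything in libavcodec. It only shows how a generic entry point can forward to whichever decoder the context was opened with.

```c
#include <assert.h>
#include <stddef.h>

struct ToyDecoderContext;

/* Hypothetical miniature of a codec descriptor: a name plus a decode callback. */
typedef struct {
    const char *name;
    /* Decodes buf into out, sets *out_len and *got_frame,
     * and returns the number of bytes consumed, or -1 on error. */
    int (*decode)(struct ToyDecoderContext *ctx,
                  unsigned char *out, size_t out_cap, size_t *out_len,
                  int *got_frame, const unsigned char *buf, size_t size);
} ToyCodec;

typedef struct ToyDecoderContext {
    const ToyCodec *codec;      /* chosen when the codec is opened */
    int frame_number;           /* frames decoded so far */
} ToyDecoderContext;

/* Generic entry point: knows nothing about the format, it only forwards. */
int toy_decode_video(ToyDecoderContext *ctx,
                     unsigned char *out, size_t out_cap, size_t *out_len,
                     int *got_frame, const unsigned char *buf, size_t size)
{
    int ret;

    assert(ctx && ctx->codec && ctx->codec->decode);
    *got_frame = 0;
    *out_len = 0;
    ret = ctx->codec->decode(ctx, out, out_cap, out_len, got_frame, buf, size);
    if (ret >= 0 && *got_frame)
        ctx->frame_number++;
    return ret;
}

/* Made-up codec: the input is (count, value) byte pairs expanded into runs. */
static int rle_decode(ToyDecoderContext *ctx,
                      unsigned char *out, size_t out_cap, size_t *out_len,
                      int *got_frame, const unsigned char *buf, size_t size)
{
    size_t i, n = 0;

    (void)ctx;
    if (size % 2 != 0)
        return -1;                       /* truncated pair */
    for (i = 0; i < size; i += 2) {
        unsigned char count = buf[i];
        if (count > out_cap - n)
            return -1;                   /* output buffer too small */
        while (count--)
            out[n++] = buf[i + 1];
    }
    *out_len = n;
    *got_frame = n > 0;
    return (int)size;
}

const ToyCodec toy_rle_codec = { "toy_rle", rle_decode };
```

A caller only ever touches `toy_decode_video`; swapping in a different `ToyCodec` changes the behaviour without changing the call site, which is the design the diagram above describes for avcodec_decode_video.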
   
So, as the flow above shows, the work really splits into two parts:

first demultiplexing (demux), then decoding (decode).

The functions used for each are:
av_open_input_file()            ---->demux

av_read_frame()            |
                           |    ---->decode
avcodec_decode_video()     |

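5. What comes next
Together with the reposted article on the FFmpeg framework, this should be enough to trace the decoding flow from end to end. The remaining questions concern specific container formats and specific codecs, which we will continue with later.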


References:
[1]. <An ffmpeg and SDL Tutorial>
     http://www./ffmpeg/tutorial01.html

[2]. <Reading the FFmpeg Framework Code> (FFMpeg框架代碼閱讀)
     http://blog.csdn.net/wstarx/archive/2007/04/20/1572393.aspx