| #FFmpeg: An Introduction to Nvidia Support
 ##NVDEC/CUVID (official description below)
 Official link: http://trac./wiki/HWAccelIntro
 CUDA (NVENC/NVDEC)
 NVENC and NVDEC are NVIDIA’s hardware-accelerated encoding and decoding APIs. They used to be called CUVID. They can be used for encoding and decoding on Windows and Linux. FFmpeg refers to NVENC/NVDEC interconnect as CUDA.
  
 NVDEC offers decoders for H.264, HEVC, MJPEG, MPEG-1/2/4, VP8/VP9, VC-1. Codec support varies by hardware (see the GPU compatibility table).
 Note that FFmpeg offers both NVDEC and CUVID hwaccels. They differ in how frames are decoded and forwarded in memory.  
 The full set of codecs is available only on Pascal hardware, which adds VP9 and 10-bit support. The note about missing ffnvcodec from NVENC applies to NVDEC as well.
 Sample decode using CUDA:
ffmpeg -hwaccel cuda -i input output
 Sample decode using CUVID:
ffmpeg -c:v h264_cuvid -i input output
 FFplay only supports the older -vcodec option (not -c:v) and only CUVID.
ffplay -vcodec hevc_cuvid file.mp4
 Full hardware transcode with NVDEC and NVENC: 
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input -c:v h264_nvenc -preset slow output
 If ffmpeg was compiled with support for libnpp, it can be used to insert a GPU based scaler into the chain:
ffmpeg -hwaccel_device 0 -hwaccel cuda -i input -vf scale_npp=-1:720 -c:v h264_nvenc -preset slow output.mkv
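 The -hwaccel_device option can be used to specify the GPU to be used by the hwaccel in ffmpeg.
 To summarize the passage above: there are two ways to invoke H.264 decoding. The first goes through the hardware accelerator with -hwaccel cuda; the second selects the decoder directly with -c:v h264_cuvid. Both decode on the GPU and both ultimately call the API declared in ffnvcodec; only the way they are invoked differs. In other words, cuvid and nvdec sit on top of the same ffnvcodec decoding API, and there is no essential difference between them.
 The difference lies in how they are invoked:
 In FFmpeg, cuvid is an external decoder (comparable to an external library such as libx264). You can obtain it directly with avcodec_find_decoder_by_name("h264_cuvid"), much as an external encoder such as libx265 is looked up by name with avcodec_find_encoder_by_name, and the decoder you get back uses the ffnvcodec API internally. nvdec is an accelerator attached to a decoder. You first open a decoder, for example h264 (note that this is FFmpeg's own native decoder), and then assign a hardware device such as cuda to its AVCodecContext. During decoding, if a hardware accelerator has been specified, the decoder enters the CUDA decoding path, that is, the ffnvcodec API; if not, it runs FFmpeg's own CPU software decoding logic.
 In short, cuvid and nvenc are the third-party Nvidia codec wrappers (you may previously have assumed that nvdec and nvenc were that pair), whereas nvdec is a decoding accelerator: FFmpeg ships its own H.264 decoder, written against the H.264 standard, with hooks for a hardware accelerator such as cuda embedded in it. If you select the cuda hardware, decoding jumps to the hardware path.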
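The two call paths described above can be sketched with a small stand-alone model. Everything in it (PathHWAccel, PathContext, path_cuvid_decode, path_native_decode) is a hypothetical stand-in invented for illustration, not FFmpeg's real types or functions; it only shows that the wrapper decoder always lands in the GPU backend, while the native decoder lands there only when an accelerator is attached.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hardware accelerator; not FFmpeg's AVHWAccel. */
typedef struct PathHWAccel {
    const char *name;
} PathHWAccel;

/* Hypothetical stand-in for a codec context; not FFmpeg's AVCodecContext. */
typedef struct PathContext {
    const PathHWAccel *hwaccel;   /* NULL when no accelerator was requested */
} PathContext;

/* Stands in for the shared GPU decoding backend. */
static const char *path_gpu_backend(void) { return "gpu"; }

/* Path 1: wrapper decoder in the style of h264_cuvid. It always uses the GPU. */
const char *path_cuvid_decode(const PathContext *ctx)
{
    (void)ctx;   /* a wrapper decoder does not look at the hwaccel field */
    return path_gpu_backend();
}

/* Path 2: native decoder in the style of h264. GPU only if a hwaccel is set. */
const char *path_native_decode(const PathContext *ctx)
{
    if (ctx->hwaccel)
        return path_gpu_backend();
    return "cpu";   /* software fallback */
}

/* A sample accelerator instance for the native decoder. */
const PathHWAccel path_nvdec = { "h264_nvdec" };
```

Calling path_native_decode with a context whose hwaccel is NULL takes the software branch, which mirrors running ffmpeg without -hwaccel cuda.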
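 The rest of this article goes into the details. FFmpeg's third-party library support currently includes the following Nvidia-related options. Note that the trailing [autodetect] means the feature is detected automatically unless it is explicitly disabled:
 The following libraries provide various hardware acceleration features: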
  --disable-cuvid          disable Nvidia CUVID support [autodetect]
  --disable-ffnvcodec      disable dynamically linked Nvidia code [autodetect]
  --disable-nvdec          disable Nvidia video decoding acceleration (via hwaccel) [autodetect]
  --disable-nvenc          disable Nvidia video encoding code [autodetect]
 ##How are these four related, and how do they differ?
 Below is the hardware-acceleration autodetect list from configure; it contains the four NVIDIA modules just mentioned.
HWACCEL_AUTODETECT_LIBRARY_LIST="
    ...
    cuda
    cuvid
    ...
    ffnvcodec
    nvdec
    nvenc
    ...
"
AUTODETECT_LIBS="
    $EXTERNAL_AUTODETECT_LIBRARY_LIST
    $HWACCEL_AUTODETECT_LIBRARY_LIST
    $THREADS_LIST
"
 Below is the autodetection flow. It simply checks whether the headers and libraries exist and whether a trivial program (a simple main function) compiles against them.
# Check the ffnvcodec switch and autodetect whether its headers and pkg-config package are usable
# ffnvcodec provides the headers for Nvidia's encoding and decoding APIs
if ! disabled ffnvcodec; then
    ffnv_hdr_list="ffnvcodec/nvEncodeAPI.h ffnvcodec/dynlink_cuda.h ffnvcodec/dynlink_cuviddec.h ffnvcodec/dynlink_nvcuvid.h"
    check_pkg_config ffnvcodec "ffnvcodec >= 9.1.23.1" "$ffnv_hdr_list" "" || \
      check_pkg_config ffnvcodec "ffnvcodec >= 9.0.18.3 ffnvcodec < 9.1" "$ffnv_hdr_list" "" || \
      check_pkg_config ffnvcodec "ffnvcodec >= 8.2.15.10 ffnvcodec < 8.3" "$ffnv_hdr_list" "" || \
      check_pkg_config ffnvcodec "ffnvcodec >= 8.1.24.11 ffnvcodec < 8.2" "$ffnv_hdr_list" ""
fi
# Check that a test program using the encoder header ffnvcodec/nvEncodeAPI.h compiles; if not, disable nvenc
enabled nvenc &&
    test_cc -I$source_path <<EOF || disable nvenc
#include <ffnvcodec/nvEncodeAPI.h>
NV_ENCODE_API_FUNCTION_LIST flist;
void f(void) { struct { const GUID guid; } s[] = { { NV_ENC_PRESET_HQ_GUID } }; }
int main(void) { return 0; }
EOF
# Same idea as above: check that ffnvcodec/dynlink_cuda.h and ffnvcodec/dynlink_cuviddec.h are present and declare the CUVIDAV1PICPARAMS type
if enabled_any nvdec cuvid; then
    check_type 'ffnvcodec/dynlink_cuda.h ffnvcodec/dynlink_cuviddec.h' 'CUVIDAV1PICPARAMS'
fi
 In the decoder check above, the condition enabled_any nvdec cuvid shows that nvdec and cuvid use the same headers, so both ultimately depend on the same underlying library.
 Next, configure verifies whether the checks above passed:
enabled(){
    test "${1#!}" = "$1" && op="=" || op="!="
    eval test "x\$${1#!}" $op "xyes"
}
requested(){
    test "${1#!}" = "$1" && op="=" || op="!="
    eval test "x\$${1#!}_requested" $op "xyes"
}
# Check if requested libraries were found.
for lib in $AUTODETECT_LIBS; do
    requested $lib && ! enabled $lib && die "ERROR: $lib requested but not found";
done
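For readers less used to shell parameter expansion, the logic of enabled() and of the final loop can be modeled in C. This is only a sketch: DemoLib, demo_enabled and demo_first_missing are hypothetical names, and treating the requested flag as "the user asked for this library explicitly" is my assumption about how configure sets the _requested variables.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one configure variable and its "_requested" twin. */
typedef struct DemoLib {
    const char *name;
    int enabled;     /* 1 when the shell variable equals "yes" */
    int requested;   /* 1 when "<name>_requested" equals "yes" (assumed meaning) */
} DemoLib;

static const DemoLib *demo_find(const DemoLib *libs, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(libs[i].name, name) == 0)
            return &libs[i];
    return NULL;
}

/* Mirrors enabled(): "name" tests for yes, "!name" tests for anything but yes. */
int demo_enabled(const DemoLib *libs, size_t n, const char *spec)
{
    int negate = (spec[0] == '!');                 /* same role as ${1#!} */
    const DemoLib *lib = demo_find(libs, n, negate ? spec + 1 : spec);
    int is_yes = lib != NULL && lib->enabled;      /* unset counts as not "yes" */
    return negate ? !is_yes : is_yes;
}

/* Mirrors the final loop: name of the first library that was requested
 * but is not enabled, or NULL when every requested library was found. */
const char *demo_first_missing(const DemoLib *libs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (libs[i].requested && !libs[i].enabled)
            return libs[i].name;
    return NULL;
}
```

A library that is merely autodetected and not found returns NULL from demo_first_missing, which corresponds to configure carrying on silently; only an explicitly requested library triggers the error.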
 ##FFmpeg source code analysis
 Below is the decoder template from cuviddec.c:
// Nvidia CUVID decoder
#include "libavutil/hwcontext.h"
#include "compat/cuda/dynlink_loader.h"
#include "avcodec.h"
#include "decode.h"
#include "hwconfig.h"
#include "nvdec.h"
#include "internal.h"
static av_cold int cuvid_decode_init(AVCodecContext *avctx);
// This macro is a template that defines one CUVID decoder per codec
#define DEFINE_CUVID_CODEC(x, X, bsf_name) \
    static const AVClass x##_cuvid_class = { \
        .class_name = #x "_cuvid", \
        .item_name = av_default_item_name, \
        .option = options, \
        .version = LIBAVUTIL_VERSION_INT, \
    }; \
    const AVCodec ff_##x##_cuvid_decoder = { \
        .name           = #x "_cuvid", \
        .long_name      = NULL_IF_CONFIG_SMALL("Nvidia CUVID " #X " decoder"), \
        .type           = AVMEDIA_TYPE_VIDEO, \
        .id             = AV_CODEC_ID_##X, \
        .priv_data_size = sizeof(CuvidContext), \
        .priv_class     = &x##_cuvid_class, \
        .init           = cuvid_decode_init, \
        .close          = cuvid_decode_end, \
        .receive_frame  = cuvid_output_frame, \
        .flush          = cuvid_flush, \
        .bsfs           = bsf_name, \
        .capabilities   = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_AVOID_PROBING | AV_CODEC_CAP_HARDWARE, \
        .caps_internal  = FF_CODEC_CAP_SETS_FRAME_PROPS, \
        .pix_fmts       = (const enum AVPixelFormat[]){ AV_PIX_FMT_CUDA, \
                                                        AV_PIX_FMT_NV12, \
                                                        AV_PIX_FMT_P010, \
                                                        AV_PIX_FMT_P016, \
                                                        AV_PIX_FMT_NONE }, \
        .hw_configs     = cuvid_hw_configs, \
        .wrapper_name   = "cuvid", \
    };
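The macro relies on two preprocessor operators: ## pastes the codec name into identifiers, and # turns it into a string literal that is concatenated with its neighbours. The reduced sketch below shows just that mechanism. MacroDemoCodec and DEFINE_DEMO_CUVID_CODEC are hypothetical, cut-down stand-ins rather than FFmpeg definitions.

```c
/* Hypothetical, much smaller stand-in for AVCodec. */
typedef struct MacroDemoCodec {
    const char *name;
    const char *long_name;
    const char *wrapper_name;
} MacroDemoCodec;

/* x is the lower-case codec name and X the upper-case one,
 * following the parameter convention of DEFINE_CUVID_CODEC. */
#define DEFINE_DEMO_CUVID_CODEC(x, X)                        \
    const MacroDemoCodec demo_##x##_cuvid_decoder = {        \
        .name         = #x "_cuvid",                         \
        .long_name    = "Nvidia CUVID " #X " decoder",       \
        .wrapper_name = "cuvid",                             \
    };

/* Each expansion defines one global: demo_h264_cuvid_decoder, demo_hevc_cuvid_decoder. */
DEFINE_DEMO_CUVID_CODEC(h264, H264)
DEFINE_DEMO_CUVID_CODEC(hevc, HEVC)
```

After expansion, demo_h264_cuvid_decoder.name is the string "h264_cuvid", which is the same way the real macro produces decoder names such as h264_cuvid and hevc_cuvid from a single template.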
 The callbacks above, such as cuvid_decode_init and cuvid_decode_end, use the ffnvcodec API internally.
 Now for the encoder. This is nvenc.h, the header of the Nvidia encoder:
#include <ffnvcodec/nvEncodeAPI.h>
#include "compat/cuda/dynlink_loader.h"
int ff_nvenc_encode_init(AVCodecContext *avctx);
int ff_nvenc_encode_close(AVCodecContext *avctx);
int ff_nvenc_receive_packet(AVCodecContext *avctx, AVPacket *pkt);
void ff_nvenc_encode_flush(AVCodecContext *avctx);
extern const enum AVPixelFormat ff_nvenc_pix_fmts[];
extern const AVCodecHWConfigInternal *const ff_nvenc_hw_configs[];
 This is nvenc_h264.c, the Nvidia H.264 encoder:
static const AVClass h264_nvenc_class = {
    .class_name = "h264_nvenc",
    .item_name = av_default_item_name,
    .option = options,
    .version = LIBAVUTIL_VERSION_INT,
};
const AVCodec ff_h264_nvenc_encoder = {
    .name           = "h264_nvenc",
    .long_name      = NULL_IF_CONFIG_SMALL("NVIDIA NVENC H.264 encoder"),
    .type           = AVMEDIA_TYPE_VIDEO,
    .id             = AV_CODEC_ID_H264,
    .init           = ff_nvenc_encode_init,
    .receive_packet = ff_nvenc_receive_packet,
    .close          = ff_nvenc_encode_close,
    .flush          = ff_nvenc_encode_flush,
    .priv_data_size = sizeof(NvencContext),
    .priv_class     = &h264_nvenc_class,
    .defaults       = defaults,
    .capabilities   = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_HARDWARE |
                      AV_CODEC_CAP_ENCODER_FLUSH | AV_CODEC_CAP_DR1,
    .caps_internal  = FF_CODEC_CAP_INIT_CLEANUP,
    .pix_fmts       = ff_nvenc_pix_fmts,
    .wrapper_name   = "nvenc",
    .hw_configs     = ff_nvenc_hw_configs,
};
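 The callbacks above, such as ff_nvenc_receive_packet and ff_nvenc_encode_close, use the ffnvcodec API internally. Nvidia supports other hardware-accelerated encoders besides this one.
 We call the two items above codecs because the structure that defines them is AVCodec (the array in this excerpt declares its entries as FFCodec), and both are registered in the codec array:
static const FFCodec * const codec_list[] = {
    &ff_h264_nvenc_encoder,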
    &ff_hevc_cuvid_decoder,
    &ff_libx264_encoder,
    &ff_amv_encoder,
   ...
    &ff_apng_decoder,
    &ff_arbc_decoder,
    &ff_argo_decoder,
    &ff_asv1_decoder,
    &ff_adpcm_ima_ws_decoder,
    &ff_adpcm_ms_decoder,
    &ff_adpcm_mtaf_decoder,
    &ff_adpcm_psx_decoder,
    &ff_adpcm_sbpro_2_decoder,
    &ff_bintext_decoder,
    &ff_xbin_decoder,
    &ff_idf_decoder,
    &ff_av1_decoder,
    NULL };
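Registering every codec in one NULL-terminated array is what makes lookup by name possible. The following sketch models such a lookup over a tiny list. ListDemoCodec, list_demo_codecs and list_demo_find_decoder are hypothetical names, and the loop is an assumption about the general technique, not a copy of how avcodec_find_decoder_by_name is implemented.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered codec entry. */
typedef struct ListDemoCodec {
    const char *name;
    int is_encoder;   /* 1 for encoders, 0 for decoders */
} ListDemoCodec;

static const ListDemoCodec list_demo_h264_nvenc = { "h264_nvenc", 1 };
static const ListDemoCodec list_demo_hevc_cuvid = { "hevc_cuvid", 0 };
static const ListDemoCodec list_demo_av1        = { "av1",        0 };

/* NULL-terminated, in the same style as codec_list[]. */
static const ListDemoCodec *const list_demo_codecs[] = {
    &list_demo_h264_nvenc,
    &list_demo_hevc_cuvid,
    &list_demo_av1,
    NULL
};

/* Return the first decoder whose name matches, or NULL if there is none. */
const ListDemoCodec *list_demo_find_decoder(const char *name)
{
    for (size_t i = 0; list_demo_codecs[i] != NULL; i++) {
        const ListDemoCodec *c = list_demo_codecs[i];
        if (!c->is_encoder && strcmp(c->name, name) == 0)
            return c;
    }
    return NULL;
}
```

Asking this model for "h264_nvenc" as a decoder returns NULL because that entry is an encoder, which is why encoders and decoders need separate lookup functions.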
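 The next item, by contrast, is an accelerator, and it is built from the AVHWAccel structure. This is nvdec.h, which wraps the NVIDIA decoding SDK:
// HW decode acceleration through NVDEC
typedef struct NVDECContext NVDECContext; // definition omitted
typedef struct NVDECFrame NVDECFrame;     // definition omitted
#include "compat/cuda/dynlink_loader.h"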
int ff_nvdec_decode_init(AVCodecContext *avctx);
int ff_nvdec_decode_uninit(AVCodecContext *avctx);
int ff_nvdec_start_frame(AVCodecContext *avctx, AVFrame *frame);
int ff_nvdec_start_frame_sep_ref(AVCodecContext *avctx, AVFrame *frame, int has_sep_ref);
int ff_nvdec_end_frame(AVCodecContext *avctx);
int ff_nvdec_simple_end_frame(AVCodecContext *avctx);
int ff_nvdec_simple_decode_slice(AVCodecContext *avctx, const uint8_t *buffer,
                                 uint32_t size);
int ff_nvdec_frame_params(AVCodecContext *avctx,
                          AVBufferRef *hw_frames_ctx,
                          int dpb_size,
                          int supports_444);
int ff_nvdec_get_ref_idx(AVFrame *frame);
typedef struct H264Context {
    const AVClass *class;
    AVCodecContext *avctx;
    ...   
} H264Context;
typedef struct AVCodecContext {
       /**
     * Hardware accelerator in use
     * - encoding: unused.
     * - decoding: Set by libavcodec
     */
    const struct AVHWAccel *hwaccel;
    ...
} AVCodecContext;
const AVHWAccel ff_h264_nvdec_hwaccel = {
    .name                 = "h264_nvdec",
    .type                 = AVMEDIA_TYPE_VIDEO,
    .id                   = AV_CODEC_ID_H264,
    .pix_fmt              = AV_PIX_FMT_CUDA,
    .start_frame          = nvdec_h264_start_frame,
    .end_frame            = ff_nvdec_end_frame,
    .decode_slice         = nvdec_h264_decode_slice,
    .frame_params         = nvdec_h264_frame_params,
    .init                 = ff_nvdec_decode_init,
    .uninit               = ff_nvdec_decode_uninit,
    .priv_data_size       = sizeof(NVDECContext),
};
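 The callbacks above, such as nvdec_h264_start_frame and nvdec_h264_frame_params, use the ffnvcodec API internally.
 The accelerated decoders supported for Nvidia also include: nvdec_av1, nvdec_h264, nvdec_hevc, nvdec_mjpeg, nvdec_mpeg4, nvdec_mpeg12, nvdec_vc1, nvdec_vp8, nvdec_vp9
 Now let's look at how FFmpeg's native decoder calls the accelerator. The codec context, struct AVCodecContext, contains the following member variables: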
     /**
     * Hardware accelerator in use
     * - encoding: unused.
     * - decoding: Set by libavcodec
     */
    const struct AVHWAccel *hwaccel;
    AVBufferRef *hw_device_ctx;
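 If you specify the cuda accelerator when opening one of FFmpeg's own decoders, decoding enters the hardware-accelerated path in the calls below. These callbacks are actually invoked in h264dec.c: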
 static int decode_nal_units(H264Context *h, const uint8_t *buf, int buf_size){
    ...
    if (h->nb_slice_ctx_queued == max_slice_ctx) {
                if (h->avctx->hwaccel) {
                    ret = avctx->hwaccel->decode_slice(avctx, nal->raw_data, nal->raw_size);
                    h->nb_slice_ctx_queued = 0;
                } ...
    }
            ...
}
static int decode_nal_units(H264Context *h, const uint8_t *buf, int buf_size){
  ...
case H264_NAL_SPS: {
            GetBitContext tmp_gb = nal->gb;
            if (avctx->hwaccel && avctx->hwaccel->decode_params) {
                ret = avctx->hwaccel->decode_params(avctx,
                                                    nal->type,
                                                    nal->raw_data,
                                                    nal->raw_size);
                if (ret < 0)
                    goto end;
            }
            ...
}
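The two excerpts share one pattern: the decoder first checks whether an accelerator is attached, and for optional hooks it also checks whether that particular callback is set. Here is a minimal model of the pattern. MiniHWAccel, MiniCodecContext and the mini_* functions are hypothetical and far simpler than the real AVHWAccel and H.264 decoder.

```c
#include <stddef.h>

/* Hypothetical miniature of AVHWAccel: optional callbacks may be NULL. */
typedef struct MiniHWAccel {
    int (*decode_params)(int nal_type);                          /* optional */
    int (*decode_slice)(const unsigned char *buf, size_t size);  /* required */
} MiniHWAccel;

/* Hypothetical miniature of AVCodecContext. */
typedef struct MiniCodecContext {
    const MiniHWAccel *hwaccel;   /* NULL means software decoding */
} MiniCodecContext;

enum { MINI_PATH_SOFTWARE = 0, MINI_PATH_HWACCEL = 1 };

/* Slice data goes to the accelerator whenever one is attached. */
int mini_decode_slice(const MiniCodecContext *avctx,
                      const unsigned char *buf, size_t size, int *path)
{
    if (avctx->hwaccel) {
        *path = MINI_PATH_HWACCEL;
        return avctx->hwaccel->decode_slice(buf, size);
    }
    *path = MINI_PATH_SOFTWARE;
    return 0;   /* the software slice decoder would run here */
}

/* Parameter sets are forwarded only if the accelerator implements the hook. */
int mini_decode_params(const MiniCodecContext *avctx, int nal_type, int *forwarded)
{
    *forwarded = 0;
    if (avctx->hwaccel && avctx->hwaccel->decode_params) {
        *forwarded = 1;
        return avctx->hwaccel->decode_params(nal_type);
    }
    return 0;
}

/* Stand-in GPU slice decoder: reports how many bytes it was handed. */
static int mini_gpu_decode_slice(const unsigned char *buf, size_t size)
{
    (void)buf;
    return (int)size;
}

/* Like the ff_h264_nvdec_hwaccel excerpt earlier, this sets no decode_params. */
const MiniHWAccel mini_nvdec = { NULL, mini_gpu_decode_slice };
```

With mini_nvdec attached, slices take the accelerator branch while parameter sets are not forwarded, because that accelerator leaves decode_params unset; the second NULL check is what makes this safe.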
 Reference: http://trac./wiki/HWAccelIntro |