This blog is written by engineers at Fixstars Corporation, who post freely on all kinds of topics.

September 26, 2017 Koji Ueno

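Last time we simply encoded video; this time let's tinker a bit more.

Converting to a fixed frame rate

The code from last time uses the pts of the input file's frames almost as-is, but if you want to add, drop, or reorder frames, you have to redefine the pts yourself. Let's fix the frame rate and redefine the pts, modifying the previous code slightly. After decoding the entire input file with decode_all, set time_base to the reciprocal of the frame rate you want.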

time_base = av_make_q(1001, 30000);

This time we want a frame rate of 30000/1001 (29.97 fps), so time_base becomes 1001/30000.

Next, find the place where the pts of the frame to be encoded is set:

int64_t pts = av_frame_get_best_effort_timestamp(frame);
frame->pts = av_rescale_q(pts, time_base, codec_context->time_base);

Change it as follows.

frame->pts = av_rescale_q(frame_count++, time_base, codec_context->time_base);

Define frame_count outside the while loop.

int64_t frame_count = 0;
while(frames.size() > 0) {
  ...
  frame->pts = av_rescale_q(frame_count++, time_base, codec_context->time_base);
  ...
}

frame_count is incremented by 1 for every frame.

With this, the pts is redefined and the output becomes fixed frame rate.
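To see what the rescale does to the numbers, here is a minimal stand-alone sketch that does not link against FFmpeg. `Rational`, `rescale` and `fixed_rate_pts` are hypothetical stand-ins written for this illustration (for `AVRational` and `av_rescale_q`); they assume small values and positive time bases, whereas the real function also guards against overflow. The 1/90000 target time base in the note below is only an example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for AVRational (numerator / denominator).
struct Rational {
  int num;
  int den;
};

// Simplified stand-in for av_rescale_q: converts `a` from units of `from`
// to units of `to`, rounding to nearest with ties away from zero.
// Unlike the real function it does not protect against 64-bit overflow.
int64_t rescale(int64_t a, Rational from, Rational to) {
  const int64_t b = static_cast<int64_t>(from.num) * to.den;
  const int64_t c = static_cast<int64_t>(to.num) * from.den;
  assert(b > 0 && c > 0);  // both time bases must be positive
  const int64_t magnitude = a < 0 ? -a : a;
  const int64_t rounded = (magnitude * b + c / 2) / c;
  return a < 0 ? -rounded : rounded;
}

// pts of the n-th frame at a fixed 30000/1001 fps, in units of `target`.
int64_t fixed_rate_pts(int64_t frame_count, Rational target) {
  const Rational frame_duration = {1001, 30000};
  return rescale(frame_count, frame_duration, target);
}
```

With a 1/90000 target, frame 0 lands on pts 0, frame 1 on 3003 and frame 2 on 6006, so consecutive frames are always exactly 3003 ticks apart no matter what timestamps the input file carried.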

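Receiving the output in your program instead of writing a file

The current program writes the encoded mp4 data to a file. Let's change it so that the program receives the data instead.

If you create the AVIOContext that goes into format_context->pb yourself, your program can receive the data. Instead of calling avio_open, allocate the context with a callback function, like this: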

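// called instead of writing to a file
int write_packet(void *opaque, uint8_t *buf, int buf_size) {
  // do something with the buf_size bytes at buf
  return buf_size; // bytes consumed; return a negative AVERROR code on failure
}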
...
int bufsize = 16 * 1024;
unsigned char* buffer = (unsigned char*)av_malloc(bufsize);
AVIOContext* io_context = avio_alloc_context(
    buffer, bufsize, 1, nullptr, nullptr, write_packet, nullptr);

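Pick a reasonable size for bufsize. To release this io_context, free the buffer and the io_context with av_freep as shown below instead of calling avio_close. Free io_context->buffer rather than the original buffer pointer, because libavformat may have replaced the buffer internally.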

av_freep(&io_context->buffer);
av_freep(&io_context);
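The snippet above passes nullptr as the opaque argument, which leaves the callback with no way to reach program state other than globals. As a minimal sketch, assuming you pass a pointer to your own object as the fourth argument of avio_alloc_context, the callback can collect the muxed bytes in memory. `MemorySink` and `write_to_memory` are hypothetical names for this illustration; the callback signature is the one used in this article, and the code below does not itself link against FFmpeg.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container for the bytes the muxer hands to us.
struct MemorySink {
  std::vector<uint8_t> data;
};

// Same signature as write_packet in the article. `opaque` is expected to be
// the MemorySink* that was given to avio_alloc_context as its opaque argument.
static int write_to_memory(void* opaque, uint8_t* buf, int buf_size) {
  assert(opaque != nullptr && buf_size >= 0);
  MemorySink* sink = static_cast<MemorySink*>(opaque);
  // Append this chunk; the muxer calls us once per flushed buffer.
  sink->data.insert(sink->data.end(), buf, buf + buf_size);
  return buf_size;  // report that every byte was consumed
}
```

In the article's code this would correspond to replacing the first nullptr in the avio_alloc_context call with the address of a `MemorySink` and passing `write_to_memory` as the write callback.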

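Remuxing video and audio without re-encoding

So far we have decoded and encoded. This time, let's swap only the container, without decoding or encoding anything. We'll convert mov to mkv.

Since nothing is decoded or encoded, it should be enough to take the packets returned by av_read_frame and pass them to the output with av_interleaved_write_frame. Let's write that code.

First, initialize as usual.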

av_register_all();

Open the input file.

// open input
const char* input_path = "hoge.mov";
AVFormatContext* input_context = nullptr;
if (avformat_open_input(&input_context, input_path, nullptr, nullptr) != 0) {
  printf("avformat_open_input failed\n");
}

Open the output file.

// open output
const char* output_path = "output.mkv";
AVIOContext* io_context = nullptr;
if (avio_open(&io_context, output_path, AVIO_FLAG_WRITE) < 0) {
  printf("avio_open failed\n");
}

Allocate the AVFormatContext for the output file and set io_context on it. We want mkv as the format, so specify "matroska".

AVFormatContext* output_context = nullptr;
if (avformat_alloc_output_context2(&output_context, nullptr, "matroska", nullptr) < 0) {
  printf("avformat_alloc_output_context2 failed\n");
}
output_context->pb = io_context;

Get the stream information of the input file.

// find streams
if (avformat_find_stream_info(input_context, nullptr) < 0) {
  printf("avformat_find_stream_info failed\n");
}

Up to here, everything is the same as what we have done before.

Next, enumerate the streams of the input file and create the corresponding output streams.

std::vector<int> stream_map(input_context->nb_streams, -1);
for (int i = 0; i < (int)input_context->nb_streams; ++i) {
  AVStream* in_stream = input_context->streams[i];
  AVCodec* codec = avcodec_find_decoder(in_stream->codecpar->codec_id);
  if (codec == nullptr) {
    printf("codec not found for stream %d\n", i);
    continue;
  }
  AVStream* out_stream = avformat_new_stream(output_context, codec);
  if (out_stream == nullptr) {
    printf("avformat_new_stream failed\n");
    continue; // leave stream_map[i] at -1 so this stream's packets are dropped
  }
  out_stream->sample_aspect_ratio = in_stream->sample_aspect_ratio;
  out_stream->time_base = in_stream->time_base;
  if (avcodec_parameters_copy(out_stream->codecpar, in_stream->codecpar) < 0) {
    printf("avcodec_parameters_copy failed\n");
  }
  out_stream->codecpar->codec_tag = 0;
  stream_map[i] = out_stream->index;
}

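For each input stream, we look up the codec, and if one exists (that is, if ffmpeg supports it) we create an output stream. We copy the necessary parameters from the input stream and record the correspondence between input and output streams in stream_map.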
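The mapping logic on its own can be sketched without FFmpeg. `build_stream_map` and the `supported` flags are hypothetical: in the article the flag comes from whether avcodec_find_decoder found a codec, and the output index comes from out_stream->index, which grows by one for every stream that is created.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the input-to-output stream index map.
// supported[i] says whether a codec was found for input stream i.
// Unsupported streams get -1; supported ones get consecutive output indices.
std::vector<int> build_stream_map(const std::vector<bool>& supported) {
  std::vector<int> stream_map(supported.size(), -1);
  int next_output_index = 0;
  for (std::size_t i = 0; i < supported.size(); ++i) {
    if (supported[i]) {
      stream_map[i] = next_output_index++;
    }
  }
  assert(stream_map.size() == supported.size());
  return stream_map;
}
```

For an input with a video stream, an unsupported data stream and an audio stream, in that order, the map is {0, -1, 1}: the audio stream moves from input index 2 to output index 1, which is why packet.stream_index has to be rewritten later.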

out_stream->codecpar->codec_tag = 0;

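This resets the container-specific codec tag that was copied from the input. A tag that is valid in mov is not necessarily valid in the output container, and leaving it in place makes some formats fail with an error, so set it to 0 and let the muxer choose an appropriate tag.

Now that the output format is set up, we can read and write the file.

First, call avformat_write_header.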

if (avformat_write_header(output_context, nullptr) < 0) {
  printf("avformat_write_header failed\n");
}

Then read and write the packets in a loop.

AVPacket packet = AVPacket();
while (av_read_frame(input_context, &packet) == 0) {
  int out_index = stream_map[packet.stream_index];
  if (out_index != -1) {
    AVRational in_time_base = input_context->streams[packet.stream_index]->time_base;
    AVRational out_time_base = output_context->streams[out_index]->time_base;
    packet.stream_index = out_index;
    av_packet_rescale_ts(&packet, in_time_base, out_time_base);
    if (av_interleaved_write_frame(output_context, &packet) != 0) {
      printf("av_interleaved_write_frame failed\n");
    }
  }
  else {
    av_packet_unref(&packet);
  }
}

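We look up the output stream number in the stream_map built earlier. Streams that have no output were set to -1, so their packets are discarded. For the rest, we set stream_index, convert the timestamps, and hand the packet to output_context.

Once the whole file has been processed, call av_write_trailer,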
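The timestamp conversion is easier to follow with concrete numbers. The sketch below mimics what av_packet_rescale_ts does to a packet, without linking against FFmpeg. `TimeBase`, `PacketTimes`, `kNoTimestamp`, `convert_ts` and `rescale_packet_times` are hypothetical stand-ins, and the time bases in the note (1/30000 for the input, 1/1000 for the output, i.e. milliseconds as Matroska commonly uses) are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct TimeBase {
  int num;
  int den;
};

// Hypothetical stand-in for AV_NOPTS_VALUE ("this timestamp is not set").
const int64_t kNoTimestamp = INT64_MIN;

// The timing fields of a packet that need converting.
struct PacketTimes {
  int64_t pts;
  int64_t dts;
  int64_t duration;
};

// Converts one value between time bases, rounding to nearest
// (ties away from zero). No overflow protection in this sketch.
int64_t convert_ts(int64_t value, TimeBase from, TimeBase to) {
  const int64_t b = static_cast<int64_t>(from.num) * to.den;
  const int64_t c = static_cast<int64_t>(to.num) * from.den;
  assert(b > 0 && c > 0);
  const int64_t magnitude = value < 0 ? -value : value;
  const int64_t rounded = (magnitude * b + c / 2) / c;
  return value < 0 ? -rounded : rounded;
}

// Converts pts, dts and duration, leaving unset fields untouched.
void rescale_packet_times(PacketTimes& t, TimeBase from, TimeBase to) {
  if (t.pts != kNoTimestamp) t.pts = convert_ts(t.pts, from, to);
  if (t.dts != kNoTimestamp) t.dts = convert_ts(t.dts, from, to);
  if (t.duration > 0) t.duration = convert_ts(t.duration, from, to);
}
```

Going from 1/30000 to 1/1000, a pts of 2002 becomes 67 and a dts of 1001 becomes 33. The coarser output time base cannot represent 33.37 ms exactly, so frame spacing in the remuxed file may alternate between 33 and 34 ticks.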

if (av_write_trailer(output_context) != 0) {
  printf("av_write_trailer failed\n");
}

then release everything and we're done.

avformat_close_input(&input_context);
avformat_free_context(output_context);
avio_close(io_context);

Here is the complete code explained above.

#define __STDC_CONSTANT_MACROS
#define __STDC_LIMIT_MACROS
#include <stdio.h>
#include <vector>
extern "C" {
#include <libavutil/imgutils.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
}
#pragma comment(lib, "avutil.lib")
#pragma comment(lib, "avcodec.lib")
#pragma comment(lib, "avformat.lib")

int main(int argc, char* argv[])
{
  av_register_all();

  // open input
  const char* input_path = "hoge.mov";
  AVFormatContext* input_context = nullptr;
  if (avformat_open_input(&input_context, input_path, nullptr, nullptr) != 0) {
    printf("avformat_open_input failed\n");
  }

  // open output
  const char* output_path = "output.mkv";
  AVIOContext* io_context = nullptr;
  if (avio_open(&io_context, output_path, AVIO_FLAG_WRITE) < 0) {
    printf("avio_open failed\n");
  }

  AVFormatContext* output_context = nullptr;
  if (avformat_alloc_output_context2(&output_context, nullptr, "matroska", nullptr) < 0) {
    printf("avformat_alloc_output_context2 failed\n");
  }

  output_context->pb = io_context;

  // find streams
  if (avformat_find_stream_info(input_context, nullptr) < 0) {
    printf("avformat_find_stream_info failed\n");
  }

  std::vector<int> stream_map(input_context->nb_streams, -1);
  for (int i = 0; i < (int)input_context->nb_streams; ++i) {
    AVStream* in_stream = input_context->streams[i];
    AVCodec* codec = avcodec_find_decoder(in_stream->codecpar->codec_id);
    if (codec == nullptr) {
      printf("codec not found for stream %d\n", i);
      continue;
    }
    AVStream* out_stream = avformat_new_stream(output_context, codec);
    if (out_stream == nullptr) {
      printf("avformat_new_stream failed\n");
      continue; // leave stream_map[i] at -1 so this stream's packets are dropped
    }
    out_stream->sample_aspect_ratio = in_stream->sample_aspect_ratio;
    out_stream->time_base = in_stream->time_base;
    if (avcodec_parameters_copy(out_stream->codecpar, in_stream->codecpar) < 0) {
      printf("avcodec_parameters_copy failed\n");
    }
    out_stream->codecpar->codec_tag = 0;
    stream_map[i] = out_stream->index;
  }

  if (avformat_write_header(output_context, nullptr) < 0) {
    printf("avformat_write_header failed\n");
  }

  AVPacket packet = AVPacket();

  while (av_read_frame(input_context, &packet) == 0) {
    int out_index = stream_map[packet.stream_index];
    if (out_index != -1) {
      AVRational in_time_base = input_context->streams[packet.stream_index]->time_base;
      AVRational out_time_base = output_context->streams[out_index]->time_base;
      packet.stream_index = out_index;
      av_packet_rescale_ts(&packet, in_time_base, out_time_base);
      if (av_interleaved_write_frame(output_context, &packet) != 0) {
        printf("av_interleaved_write_frame failed\n");
      }
    }
    else {
      av_packet_unref(&packet);
    }
  }

  if (av_write_trailer(output_context) != 0) {
    printf("av_write_trailer failed\n");
  }

  avformat_close_input(&input_context);
  avformat_free_context(output_context);
  avio_close(io_context);

  return 0;
}

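Passing frames to an external encoder via y4m

Sometimes you want to encode with a standalone encoder binary such as x264 or x265 rather than with an encoder built into FFmpeg. On the command line you can pass frames from ffmpeg to x264 as shown below. Let's do the same thing with the FFmpeg API.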

ffmpeg -i hoge.mov -f yuv4mpegpipe - | x264 --demuxer y4m --crf 22 -o output.264 -

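Basically this is almost the same as the encoding code from last time. You only need to change the format and codec settings, launch x264 from the program, and feed it the data. We'll modify the previous encoding code.

First, we want to receive the output in the program rather than in a file, so change the code to create the AVIOContext with avio_alloc_context, as described above.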

int bufsize = 16 * 1024;
unsigned char* buffer = (unsigned char*)av_malloc(bufsize);
AVIOContext* io_context = avio_alloc_context(
  buffer, bufsize, 1, nullptr, nullptr, write_packet, nullptr);

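The write_packet callback writes to the standard input of the x264 process we launched.

static int write_packet(void *opaque, uint8_t *buf, int buf_size) {
  // pass the data to the encoder
  DWORD bytesWritten = 0;
  if (WriteFile(writeHandle, buf, buf_size, &bytesWritten, nullptr) == 0) {
    printf("failed to write to stdin pipe\n");
    return AVERROR(EIO); // a negative value tells libavformat the write failed
  }
  return buf_size;
}

The code that creates writeHandle has nothing to do with ffmpeg, so I won't explain it in detail, but it is the handle of a pipe created with CreatePipe. The matching read handle is set as hStdInput in STARTUPINFO and passed to the CreateProcess call that launches the x264 exe, which lets us write to x264's standard input.

After that, just set the format to "yuv4mpegpipe" and the codec to "wrapped_avframe".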

AVFormatContext* format_context = nullptr;
if (avformat_alloc_output_context2(
  &format_context, nullptr, "yuv4mpegpipe", nullptr) < 0) {
  printf("avformat_alloc_output_context2 failed\n");
}

format_context->pb = io_context;

AVCodec* codec = avcodec_find_encoder_by_name("wrapped_avframe");
if (codec == nullptr) {
  printf("encoder not found ...\n");
}
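For orientation, the data x264 reads on its standard input in this setup is a one-line text header followed, for every frame, by a "FRAME" line and the raw planes. The sketch below builds a simplified header and computes the per-frame payload size for 8-bit 4:2:0 video. `y4m_header` and `y4m_frame_bytes` are hypothetical helpers written for this illustration; progressive scan and square pixels are assumed, and the header the yuv4mpegpipe muxer actually emits may carry more detail.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds a minimal YUV4MPEG2 stream header: width, height, frame rate,
// progressive scan (Ip), square pixels (A1:1), 4:2:0 chroma (C420).
std::string y4m_header(int width, int height, int fps_num, int fps_den) {
  assert(width > 0 && height > 0 && fps_num > 0 && fps_den > 0);
  return "YUV4MPEG2 W" + std::to_string(width) +
         " H" + std::to_string(height) +
         " F" + std::to_string(fps_num) + ":" + std::to_string(fps_den) +
         " Ip A1:1 C420\n";
}

// Bytes of raw pixel data that follow each "FRAME\n" marker for 8-bit 4:2:0:
// one full-resolution luma plane plus two half-resolution chroma planes.
std::size_t y4m_frame_bytes(int width, int height) {
  assert(width > 0 && height > 0);
  const std::size_t luma = static_cast<std::size_t>(width) * height;
  const std::size_t chroma_width = (static_cast<std::size_t>(width) + 1) / 2;
  const std::size_t chroma_height = (static_cast<std::size_t>(height) + 1) / 2;
  return luma + 2 * chroma_width * chroma_height;
}
```

For 1920x1080 at the 30000/1001 fps used in this article, the header would read "YUV4MPEG2 W1920 H1080 F30000:1001 Ip A1:1 C420" and each frame carries 3110400 bytes, so the pipe moves roughly 93 MB per second of video.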

The rest of the code stays as it is, except that the way io_context is released has to change.

av_freep(&io_context->buffer);
av_freep(&io_context);

Here is the code.

#define __STDC_CONSTANT_MACROS
#define __STDC_LIMIT_MACROS
#include <stdio.h>
#include <deque>
#include <Windows.h>
extern "C" {
#include <libavutil/imgutils.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
}
#pragma comment(lib, "avutil.lib")
#pragma comment(lib, "avcodec.lib")
#pragma comment(lib, "avformat.lib")

static void decode_all()
{
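  // decode_all() is the same as last time, so it is omitted here
  // (the frames and time_base globals used below also come from that code)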
}

HANDLE writeHandle;
PROCESS_INFORMATION pi;

static void launch_x264()
{
  HANDLE readHandle = nullptr;
  SECURITY_ATTRIBUTES sa = SECURITY_ATTRIBUTES();
  sa.nLength = sizeof(sa);
  sa.bInheritHandle = TRUE;
  sa.lpSecurityDescriptor = nullptr;
  if (CreatePipe(&readHandle, &writeHandle, &sa, 0) == 0) {
    printf("failed to create pipe\n");
  }

  STARTUPINFOA si = STARTUPINFOA();
  pi = PROCESS_INFORMATION();

  si.cb = sizeof(si);
  // hStdOutput and hStdError should really be set as well, but that is omitted here
  si.hStdInput = readHandle;
  si.dwFlags |= STARTF_USESTDHANDLES;

  // disable inheritance for the handle the child does not need
  if (SetHandleInformation(writeHandle, HANDLE_FLAG_INHERIT, 0) == 0)
  {
    printf("failed to set handle information\n");
  }

  // a string literal cannot be bound to char* in standard C++, so use a writable array
  char args[] = "x264 --demuxer y4m --crf 22 -o output.264 -";
  if (CreateProcessA(nullptr, args,
    nullptr, nullptr, TRUE, 0, nullptr, nullptr, &si, &pi) == 0) {
    printf("failed to launch process; check the path to the exe\n");
  }

  // the read end is for the child process, so close our copy
  if (readHandle != nullptr) {
    CloseHandle(readHandle);
    readHandle = nullptr;
  }
}

static void close_x264() {

  if (writeHandle != nullptr) {
    CloseHandle(writeHandle);
    writeHandle = nullptr;
  }

  DWORD exitCode;
  if (pi.hProcess != nullptr) {
    // wait for the child process to exit
    WaitForSingleObject(pi.hProcess, INFINITE);
    // get the exit code
    GetExitCodeProcess(pi.hProcess, &exitCode);

    CloseHandle(pi.hProcess);
    CloseHandle(pi.hThread);
    pi.hProcess = nullptr;
  }
}

static int write_packet(void *opaque, uint8_t *buf, int buf_size) {
  // pass the data to the encoder
  DWORD bytesWritten = 0;
  if (WriteFile(writeHandle, buf, buf_size, &bytesWritten, nullptr) == 0) {
    printf("failed to write to stdin pipe\n");
    return AVERROR(EIO); // a negative value tells libavformat the write failed
  }
  return buf_size;
}

int main(int argc, char* argv[])
{
  av_register_all();

  decode_all();

  time_base = av_make_q(1001, 30000);

  int bufsize = 16 * 1024;
  unsigned char* buffer = (unsigned char*)av_malloc(bufsize);
  AVIOContext* io_context = avio_alloc_context(
    buffer, bufsize, 1, nullptr, nullptr, write_packet, nullptr);

  AVFormatContext* format_context = nullptr;
  if (avformat_alloc_output_context2(
    &format_context, nullptr, "yuv4mpegpipe", nullptr) < 0) {
    printf("avformat_alloc_output_context2 failed\n");
  }

  format_context->pb = io_context;

  AVCodec* codec = avcodec_find_encoder_by_name("wrapped_avframe");
  if (codec == nullptr) {
    printf("encoder not found ...\n");
  }

  AVCodecContext* codec_context = avcodec_alloc_context3(codec);
  if (codec_context == nullptr) {
    printf("avcodec_alloc_context3 failed\n");
  }

  // set picture properties
  AVFrame* first_frame = frames[0];
  codec_context->pix_fmt = (AVPixelFormat)first_frame->format;
  codec_context->width = first_frame->width;
  codec_context->height = first_frame->height;
  codec_context->field_order = AV_FIELD_PROGRESSIVE;
  codec_context->color_range = first_frame->color_range;
  codec_context->color_primaries = first_frame->color_primaries;
  codec_context->color_trc = first_frame->color_trc;
  codec_context->colorspace = first_frame->colorspace;
  codec_context->chroma_sample_location = first_frame->chroma_location;
  codec_context->sample_aspect_ratio = first_frame->sample_aspect_ratio;

  // set timebase
  codec_context->time_base = time_base;

  // generate global header when the format require it
  if (format_context->oformat->flags & AVFMT_GLOBALHEADER) {
    codec_context->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
  }

  if (avcodec_open2(codec_context, codec_context->codec, nullptr) != 0) {
    printf("avcodec_open2 failed\n");
  }

  AVStream* stream = avformat_new_stream(format_context, codec);
  if (stream == nullptr) {
    printf("avformat_new_stream failed\n");
  }

  stream->sample_aspect_ratio = codec_context->sample_aspect_ratio;
  stream->time_base = codec_context->time_base;

  if (avcodec_parameters_from_context(stream->codecpar, codec_context) < 0) {
    printf("avcodec_parameters_from_context failed\n");
  }

  // start x264 first so that the pipe exists before the muxer writes anything
  launch_x264();

  if (avformat_write_header(format_context, nullptr) < 0) {
    printf("avformat_write_header failed\n");
  }

  int64_t frame_count = 0;
  while (frames.size() > 0) {
    AVFrame* frame = frames.front();
    frames.pop_front();
    // fixed frame rate: number the frames sequentially in units of time_base
    frame->pts = av_rescale_q(frame_count++, time_base, codec_context->time_base);
    frame->key_frame = 0;
    frame->pict_type = AV_PICTURE_TYPE_NONE;
    if (avcodec_send_frame(codec_context, frame) != 0) {
      printf("avcodec_send_frame failed\n");
    }
    av_frame_free(&frame);
    AVPacket packet = AVPacket();
    while (avcodec_receive_packet(codec_context, &packet) == 0) {
      packet.stream_index = 0;
      av_packet_rescale_ts(&packet, codec_context->time_base, stream->time_base);
      if (av_interleaved_write_frame(format_context, &packet) != 0) {
        printf("av_interleaved_write_frame failed\n");
      }
    }
  }

  // flush encoder
  if (avcodec_send_frame(codec_context, nullptr) != 0) {
    printf("avcodec_send_frame failed\n");
  }
  AVPacket packet = AVPacket();
  while (avcodec_receive_packet(codec_context, &packet) == 0) {
    packet.stream_index = 0;
    av_packet_rescale_ts(&packet, codec_context->time_base, stream->time_base);
    if (av_interleaved_write_frame(format_context, &packet) != 0) {
      printf("av_interleaved_write_frame failed\n");
    }
  }

  if (av_write_trailer(format_context) != 0) {
    printf("av_write_trailer failed\n");
  }

  close_x264();

  avcodec_free_context(&codec_context);
  avformat_free_context(format_context);
  av_freep(&io_context->buffer);
  av_freep(&io_context);

  return 0;
}
