Commit 98c7a56

add chapter 3
1 parent a28cd42 commit 98c7a56

4 files changed

Lines changed: 183 additions & 0 deletions

README.md

Lines changed: 183 additions & 0 deletions
@@ -698,3 +698,186 @@ But to make sure that I'm not lying to you. You can use the amazing site/tool [g
As you can see it has a single `mdat` atom/box, **this is the place where the video and audio frames are**. Now load the fragmented mp4 to see how it spreads the `mdat` boxes.

![fragmented mp4 boxes](/img/boxes_fragmente_mp4.png)

## Chapter 3 - transcoding

> #### TLDR; show me the [code](/3_transcoding.c) and execution.
> ```bash
> $ make run_transcoding
> ```
> We'll skip some details, but don't worry: the [source code is available on GitHub](/3_transcoding.c).

In this chapter, we're going to create a minimalist transcoder, written in C, that can convert videos coded in H264 to H265 using the **FFmpeg/libav** libraries, specifically [libavcodec](https://ffmpeg.org/libavcodec.html), libavformat, and libavutil.

![media transcoding flow](/img/transcoding_flow.png)
> _Just a quick recap:_ the **AVFormatContext** is the abstraction for the format of the media file, aka the container (e.g. MKV, MP4, WebM, TS); the **AVStream** represents each type of data for a given format (e.g. audio, video, subtitle, metadata); the **AVPacket** is a slice of compressed data obtained from the AVStream that can be decoded by an **AVCodec** (e.g. av1, h264, vp9, hevc), producing raw data called an **AVFrame**.

### Transmuxing

Let's start with a simple transmuxing operation and then build upon this code. The first step is to **load the input file**.

```c
// Allocate an AVFormatContext
AVFormatContext *avfc = avformat_alloc_context();
// Open the input and read its header (the function takes the address of the context pointer)
avformat_open_input(&avfc, in_filename, NULL, NULL);
// Read packets of the media file to get stream information
avformat_find_stream_info(avfc, NULL);
```

Now we're going to set up the decoder. The `AVFormatContext` gives us access to all the `AVStream` components. For each one of them, we get its `AVCodec`, create the respective `AVCodecContext`, and finally open the codec so we can proceed to the decoding process.

> The **AVCodecContext** holds data about media configuration such as bit rate, frame rate, sample rate, channels, height, and many others.

```c
for (unsigned int i = 0; i < avfc->nb_streams; i++)
{
  AVStream *avs = avfc->streams[i];
  // find the decoder that matches the codec used by this stream
  const AVCodec *avc = avcodec_find_decoder(avs->codecpar->codec_id);
  AVCodecContext *avcc = avcodec_alloc_context3(avc);
  // copy the stream parameters into the codec context, then open the codec
  avcodec_parameters_to_context(avcc, avs->codecpar);
  avcodec_open2(avcc, avc, NULL);
}
```

We need to prepare the output media file for transmuxing as well. We first **allocate memory** for the output `AVFormatContext`, then we create **each stream** in the output format. In order to pack the stream properly, we **copy the codec parameters** from the decoder's stream.

We **set the flag** `AV_CODEC_FLAG_GLOBAL_HEADER`, which tells the encoder to place global headers in the extradata instead of in every keyframe, and finally we open the output **file for writing** and persist the headers.

```c
avformat_alloc_output_context2(&encoder_avfc, NULL, NULL, out_filename);

AVStream *avs = avformat_new_stream(encoder_avfc, NULL);
avcodec_parameters_copy(avs->codecpar, decoder_avs->codecpar);

// only set the flag when the output format asks for global headers
if (encoder_avfc->oformat->flags & AVFMT_GLOBALHEADER)
  encoder_avfc->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;

AVDictionary *muxer_opts = NULL; // no extra muxer options
avio_open(&encoder_avfc->pb, out_filename, AVIO_FLAG_WRITE);
avformat_write_header(encoder_avfc, &muxer_opts);
```

We read the `AVPacket`s from the input, adjust their timestamps, and write each packet to the output file. Even though the function `av_interleaved_write_frame` says "write frame", we are storing the packet. We finish the transmuxing process by writing the stream trailer to the file.

```c
AVPacket *input_packet = av_packet_alloc();

while (av_read_frame(decoder_avfc, input_packet) >= 0)
{
  // convert the packet timing fields from the input time base to the output one
  av_packet_rescale_ts(input_packet, decoder_video_avs->time_base, encoder_video_avs->time_base);
  // stop copying if the packet cannot be written
  if (av_interleaved_write_frame(encoder_avfc, input_packet) < 0)
    break;
}

av_write_trailer(encoder_avfc);
```
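
To see what the timestamp adjustment does, here is a minimal sketch of the arithmetic in plain C. The `Rational` struct and the `rescale_ts` helper are hypothetical stand-ins written for this illustration (libav provides `AVRational` and its own rescaling functions, which handle rounding modes and overflow more carefully). We assume non-negative timestamps and round to the nearest tick.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical stand-in for AVRational: one tick lasts num/den seconds.
typedef struct { int num; int den; } Rational;

// Hypothetical helper: converts a timestamp from the src time base to the
// dst time base, rounding to the nearest tick.
// Assumes ts >= 0 and values small enough that the 64-bit products fit.
static int64_t rescale_ts(int64_t ts, Rational src, Rational dst) {
  assert(ts >= 0 && src.num > 0 && src.den > 0 && dst.num > 0 && dst.den > 0);
  int64_t numerator = ts * src.num * dst.den;
  int64_t denominator = (int64_t)src.den * dst.num;
  return (numerator + denominator / 2) / denominator;
}
```

For example, `rescale_ts(180000, (Rational){1, 90000}, (Rational){1, 1000})` returns `2000`: two seconds counted in 90 kHz ticks become two seconds counted in milliseconds.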

### Transcoding

The previous section showed a simple transmuxer program. Now we're going to add the capability to encode files; specifically, we're going to enable it to transcode videos from `h264` to `h265`.

After preparing the decoder, but before preparing the output media file, we're going to set up the encoder.

* Create the video `AVStream` in the encoder,
* Use the `AVCodec` called `libx265`,
* Create the `AVCodecContext` based on the created codec,
* Set up basic attributes for the transcoding session, and
* Open the codec and copy parameters from the context to the stream.

```c
AVRational input_framerate = av_guess_frame_rate(decoder_avfc, decoder_video_avs, NULL);
AVStream *encoder_video_avs = avformat_new_stream(encoder_avfc, NULL);

const char *codec_name = "libx265";
const char *codec_priv_key = "x265-params";
// we're going to use internal options of x265:
// they disable the scene change detection and fix the
// GOP at 60 frames.
const char *codec_priv_value = "keyint=60:min-keyint=60:scenecut=0";

const AVCodec *encoder_video_avc = avcodec_find_encoder_by_name(codec_name);
AVCodecContext *encoder_video_avcc = avcodec_alloc_context3(encoder_video_avc);
// encoder codec params
av_opt_set(encoder_video_avcc->priv_data, codec_priv_key, codec_priv_value, 0);
encoder_video_avcc->height = decoder_video_avcc->height;
encoder_video_avcc->width = decoder_video_avcc->width;
encoder_video_avcc->pix_fmt = encoder_video_avc->pix_fmts[0];
// rate control (the minimum rate must not exceed the maximum)
encoder_video_avcc->bit_rate = 2 * 1000 * 1000;
encoder_video_avcc->rc_buffer_size = 4 * 1000 * 1000;
encoder_video_avcc->rc_min_rate = 2 * 1000 * 1000;
encoder_video_avcc->rc_max_rate = 2.5 * 1000 * 1000;
// time base: the inverse of the frame rate, so one tick lasts one frame
encoder_video_avcc->time_base = av_inv_q(input_framerate);
encoder_video_avs->time_base = encoder_video_avcc->time_base;

avcodec_open2(encoder_video_avcc, encoder_video_avc, NULL);
avcodec_parameters_from_context(encoder_video_avs->codecpar, encoder_video_avcc);
```
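
The time base lines can look cryptic, so here is a small sketch of the idea in plain C, without libav. `Rational`, `invert`, and `pts_to_seconds` are hypothetical names used only for this illustration: inverting a frame rate gives a time base in which one tick lasts one frame, and multiplying a timestamp by the time base gives seconds.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical stand-in for AVRational.
typedef struct { int num; int den; } Rational;

// Swaps numerator and denominator, e.g. a frame rate of 60/1 becomes
// a time base of 1/60 (one tick per frame).
static Rational invert(Rational q) {
  Rational r = { q.den, q.num };
  return r;
}

// Converts a timestamp expressed in ticks of time_base into seconds.
static double pts_to_seconds(int64_t pts, Rational time_base) {
  assert(time_base.den != 0);
  return (double)pts * time_base.num / time_base.den;
}
```

With a frame rate of 30000/1001 (about 29.97 fps), `invert` gives a time base of 1001/30000, so a frame with timestamp 30000 is shown at exactly 1001 seconds.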

We need to expand our decoding loop for the video stream transcoding:

* Send the compressed `AVPacket` to the decoder,
* Receive the uncompressed `AVFrame`,
* Start to transcode this raw frame,
* Send the raw frame to the encoder,
* Receive the `AVPacket` compressed with our codec,
* Set up the timestamp, and
* Write it to the output file.

```c
AVFrame *input_frame = av_frame_alloc();
AVPacket *input_packet = av_packet_alloc();

// for brevity, every packet is assumed to belong to the video stream
while (av_read_frame(decoder_avfc, input_packet) >= 0)
{
  // feed the compressed packet to the decoder
  int response = avcodec_send_packet(decoder_video_avcc, input_packet);
  while (response >= 0) {
    // pull out every raw frame the decoder has ready
    response = avcodec_receive_frame(decoder_video_avcc, input_frame);
    if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {
      break;
    } else if (response < 0) {
      return response;
    }
    encode(encoder_avfc, decoder_video_avs, encoder_video_avs, encoder_video_avcc, input_frame, input_packet->stream_index);
    av_frame_unref(input_frame);
  }
  av_packet_unref(input_packet);
}
av_write_trailer(encoder_avfc);

// used function
int encode(AVFormatContext *avfc, AVStream *dec_video_avs, AVStream *enc_video_avs, AVCodecContext *video_avcc, AVFrame *input_frame, int index) {
  AVPacket *output_packet = av_packet_alloc();
  // feed the raw frame to the encoder
  int response = avcodec_send_frame(video_avcc, input_frame);

  while (response >= 0) {
    // pull out every compressed packet the encoder has ready
    response = avcodec_receive_packet(video_avcc, output_packet);
    if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {
      break;
    } else if (response < 0) {
      return -1;
    }

    output_packet->stream_index = index;
    output_packet->duration = enc_video_avs->time_base.den / enc_video_avs->time_base.num / dec_video_avs->avg_frame_rate.num * dec_video_avs->avg_frame_rate.den;

    av_packet_rescale_ts(output_packet, dec_video_avs->time_base, enc_video_avs->time_base);
    response = av_interleaved_write_frame(avfc, output_packet);
  }
  av_packet_unref(output_packet);
  av_packet_free(&output_packet);
  return 0;
}
```
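
The `duration` expression in `encode` is easier to follow with numbers. The sketch below repeats the same integer arithmetic in plain C, using a hypothetical `Rational` struct, a hypothetical `frame_duration_ticks` helper, and made-up time base values. It is only meant to show how many ticks of the output time base one frame occupies.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical stand-in for AVRational.
typedef struct { int num; int den; } Rational;

// Same left-to-right integer arithmetic as the duration line in encode():
// ticks per second of the output time base, divided by frames per second.
static int64_t frame_duration_ticks(Rational enc_time_base, Rational frame_rate) {
  assert(enc_time_base.num > 0 && frame_rate.num > 0);
  return (int64_t)enc_time_base.den / enc_time_base.num
         / frame_rate.num * frame_rate.den;
}
```

For a hypothetical output time base of 1/15360 at 60 fps, one frame lasts 256 ticks; for 1/90000 at 30000/1001 fps, it lasts 3003 ticks. Because every step is an integer division, a time base of 1/1000 at 60 fps gives 16 ticks rather than 16.67.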

We converted the media stream from the `h264` codec:
![h264 codec properties](/img/h264_properties.png)
To the `HEVC` codec:
![hevc codec properties](/img/hevc_properties.png)

img/h264_properties.png (130 KB)

img/hevc_properties.png (86.3 KB)

img/transcoding_flow.png (639 KB)
