videox

package
v0.0.0-...-b7e086b
Published: Apr 26, 2024 License: MIT Imports: 21 Imported by: 0

README

Annex-B performance hit

On a Raspberry Pi 5, our Annex-B encoder (the bit that adds the Emulation Prevention Byte) can encode 716 MB/s. Memcpy on this platform is 4690 MB/s. Decode is 916 MB/s.

You can use misc_test.cpp to measure the speed yourself (instructions at top of that file).

I don't have enough numbers right now to figure out the total system impact, but my gut doesn't like it. It seems plausible that one should be able to improve the speed of the encoder, but I don't know how. The alternative that I'm considering is to delay encoding to Annex-B for as long as possible - perhaps even doing it in the browser immediately before display.

If we're recording to disk, then it would be useful to avoid this penalty completely, but that precludes us from using regular video formats like mp4. On the other hand, we might want to avoid regular formats anyway.

Documentation

Constants

const EnableEmulationPreventBytesEscaping = false

Topic: $ANNEXB-CONFUSION

Here's the story: when we receive packets from Hikvision cameras via github.com/bluenviron/gortsplib, the packets are supposedly NALUFormatRBSP, aka raw data bits, with no start codes and no emulation prevention bytes. The codecs seem to want packets in SODB (aka Annex-B) encoding, so we dutifully encode the raw packets into Annex-B, with emulation prevention bytes added. However, when we activate this code path, we get sporadic errors from ffmpeg telling us that we've got bad frames. If we comment out the code that injects the emulation prevention bytes, these errors go away.

To be clear, we must inject the start codes. This is unambiguous. It's the emulation prevention bytes that cause errors. This confusion is the reason for this constant. At some point we'll hopefully learn more and make better sense of this. Right now the culprit could be any one of these:

  1. Hikvision cameras
  2. gortsplib
  3. The way I'm using the h264 codec in ffmpeg
  4. My SODB/Annex-B encoder
  5. My understanding

Variables

var ErrResourceTemporarilyUnavailable = errors.New("Resource temporarily unavailable") // common response from avcodec_receive_frame if a frame is not available
var NALUPrefix = []byte{0x00, 0x00, 0x01}

This is the prefix that we add whenever we need to encode into Annex-B. This must remain in sync with the behaviour inside EncodeAnnexB().

Functions

func AnnexBWorstSize

func AnnexBWorstSize(rawLen int) int

Return the worst-case size of an Annex-B encoded packet, including a 3-byte start code, given the size of the raw packet.
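The package's actual formula is not shown here. As a rough sketch of how such a bound can be derived, assume the worst input is all zeros, where one emulation prevention byte is inserted for every two raw bytes. The name worstCaseAnnexBSize is hypothetical and only stands in for AnnexBWorstSize.

```go
package main

import "fmt"

// worstCaseAnnexBSize is a hypothetical stand-in for AnnexBWorstSize.
// Assumption: a 3-byte start code, plus at most one emulation
// prevention byte for every two raw bytes (the all-zeros input).
func worstCaseAnnexBSize(rawLen int) int {
	const startCodeLen = 3
	return startCodeLen + rawLen + rawLen/2
}

func main() {
	for _, n := range []int{0, 1, 2, 3, 100} {
		fmt.Printf("raw=%d worst=%d\n", n, worstCaseAnnexBSize(n))
	}
}
```

A bound like this lets a caller allocate the destination buffer once, before encoding.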

func DecodeAnnexB

func DecodeAnnexB(encoded []byte) []byte

Decode an Annex-B encoded packet into a Raw Byte Sequence Payload (RBSP). We assume that you're handling the 3 or 4 byte NALU prefix outside of this function.
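For illustration, removing emulation prevention bytes can be sketched as below. This is not the package's implementation; stripEmulationPrevention is a hypothetical name, and the sketch assumes the start code has already been removed by the caller, as described above.

```go
package main

import "fmt"

// stripEmulationPrevention is a hypothetical sketch, not DecodeAnnexB itself.
// It drops every 0x03 byte that directly follows two zero bytes.
func stripEmulationPrevention(encoded []byte) []byte {
	out := make([]byte, 0, len(encoded))
	zeros := 0 // number of consecutive zero bytes seen so far
	for _, b := range encoded {
		if zeros >= 2 && b == 0x03 {
			zeros = 0 // skip the emulation prevention byte
			continue
		}
		out = append(out, b)
		if b == 0 {
			zeros++
		} else {
			zeros = 0
		}
	}
	return out
}

func main() {
	encoded := []byte{0x65, 0x00, 0x00, 0x03, 0x01, 0xAA}
	fmt.Printf("% x\n", stripEmulationPrevention(encoded))
}
```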

func DecodeSinglePacketToImage

func DecodeSinglePacketToImage(packet *VideoPacket) (*cimg.Image, error)

Creates a decoder and attempts to decode a single IDR packet. This was built for extracting a thumbnail during a long recording. Obviously this is quite expensive, because you're creating a decoder for just a single frame.

func EncodeAnnexB

func EncodeAnnexB(raw []byte, flags AnnexBEncodeFlags) []byte

Encode an RBSP (Raw Byte Sequence Payload) into Annex-B format, optionally adding a 3 byte start code (00.00.01) to the beginning of the encoded byte stream. This encoding adds the "emulation prevention byte" (0x03) where necessary, if the relevant flag is set.
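A minimal sketch of this kind of encoder follows. It is not the package's code: encodeSketch and the two flag constants are hypothetical names that mirror the bit values 1 and 2 of AnnexBEncodeFlags. The rule assumed here is the usual one: after two consecutive zero bytes, any byte that is 0x03 or less gets a 0x03 inserted in front of it.

```go
package main

import "fmt"

// Hypothetical flag names, mirroring the bit values 1 and 2.
const (
	flagStartCode = 1
	flagEmulation = 2
)

// encodeSketch is an illustrative encoder, not EncodeAnnexB itself.
func encodeSketch(raw []byte, flags int) []byte {
	out := make([]byte, 0, 3+len(raw)+len(raw)/2)
	if flags&flagStartCode != 0 {
		out = append(out, 0x00, 0x00, 0x01)
	}
	zeros := 0 // consecutive zero bytes already written from raw
	for _, b := range raw {
		if flags&flagEmulation != 0 && zeros >= 2 && b <= 0x03 {
			out = append(out, 0x03) // emulation prevention byte
			zeros = 0
		}
		out = append(out, b)
		if b == 0 {
			zeros++
		} else {
			zeros = 0
		}
	}
	return out
}

func main() {
	raw := []byte{0x65, 0x00, 0x00, 0x01}
	fmt.Printf("% x\n", encodeSketch(raw, flagStartCode|flagEmulation))
}
```

With no flags set, the sketch returns an unchanged copy of the input.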

func EncodeAnnexBInto

func EncodeAnnexBInto(raw []byte, flags AnnexBEncodeFlags, dst []byte) (encodedSize int, bufferSizeOK bool)

Encode an RBSP (Raw Byte Sequence Payload) into Annex-B format, optionally adding a 3 byte start code (00.00.01) to the beginning of the encoded byte stream. This encoding adds the "emulation prevention byte" (0x03) where necessary.
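The sketch below shows one way an encode-into-buffer function could report its size and whether the buffer was big enough. It is an assumption, not the package's behaviour: encodeInto is a hypothetical name, it ignores the flags parameter and always writes the start code and emulation prevention bytes, and it returns the required size even when dst is too small.

```go
package main

import "fmt"

// encodeInto is a hypothetical sketch, not EncodeAnnexBInto itself.
// It returns the number of bytes the encoding needs, and whether
// all of them fit into dst.
func encodeInto(raw, dst []byte) (int, bool) {
	n := 0
	put := func(b byte) {
		if n < len(dst) {
			dst[n] = b // only write while there is room
		}
		n++
	}
	put(0x00)
	put(0x00)
	put(0x01)
	zeros := 0
	for _, b := range raw {
		if zeros >= 2 && b <= 0x03 {
			put(0x03) // emulation prevention byte
			zeros = 0
		}
		put(b)
		if b == 0 {
			zeros++
		} else {
			zeros = 0
		}
	}
	return n, n <= len(dst)
}

func main() {
	dst := make([]byte, 16)
	n, ok := encodeInto([]byte{0x00, 0x00, 0x00}, dst)
	fmt.Println(n, ok)
	if ok {
		fmt.Printf("% x\n", dst[:n])
	}
}
```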

func ExtractFrame

func ExtractFrame(srcFilename string, atSecond float64, outputWidth int) ([]byte, error)

Extract a single frame from a video file and return the JPEG bytes. If outputWidth is zero, then we use the same width as the input video.

func ExtractVideoDuration

func ExtractVideoDuration(srcFilename string) (time.Duration, error)

Extract the duration of a video file

func IsVisualPacket

func IsVisualPacket(t h264.NALUType) bool

func ParseBinFilename

func ParseBinFilename(filename string) (packetNumber int, naluNumber int, timeNS int64)

This is just used for debugging and testing

func ParseSPS

func ParseSPS(nalu []byte) (width, height int, err error)

Parse a raw SPS NALU (not annex-b)

func RunAppCombinedOutput

func RunAppCombinedOutput(app_name string, args []string) ([]byte, error)

app_name is an executable, such as "ffmpeg" or "ffprobe". args must not include the executable name as the first parameter. Returns the output from exec.Cmd's "CombinedOutput" method.
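A helper with this contract can be sketched with the standard library alone. The name runCombined is hypothetical, and the sketch assumes nothing about the package beyond what is described above.

```go
package main

import (
	"fmt"
	"os/exec"
)

// runCombined is a hypothetical sketch of a helper like RunAppCombinedOutput.
// args must not include the executable name; exec.Command adds it.
func runCombined(appName string, args []string) ([]byte, error) {
	return exec.Command(appName, args...).CombinedOutput()
}

func main() {
	out, err := runCombined("ffprobe", []string{"-version"})
	if err != nil {
		fmt.Println("ffprobe failed or is not installed:", err)
		return
	}
	fmt.Printf("%s", out)
}
```

CombinedOutput interleaves stdout and stderr, which is convenient for ffmpeg and ffprobe because they write most of their diagnostics to stderr.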

func TranscodeMediumQualitySeekable

func TranscodeMediumQualitySeekable(srcFilename, dstFilename string) error

Transcode the high quality video stream into a slightly lower quality stream, with keyframes every 8 frames, and with noise reduction. This is for use on our training platform, where people need to be able to seek randomly inside a video.

func TranscodeSeekable

func TranscodeSeekable(srcFilename, dstFilename string) error

Transcode a video to make it easy for a low powered mobile browser to seek to random video positions

func WrapAvErr

func WrapAvErr(err C.int) error

Types

type AnnexBEncodeFlags

type AnnexBEncodeFlags int

Flags that control how EncodeAnnexB works

const (
	AnnexBEncodeFlagNone                        AnnexBEncodeFlags = 0 // This is nonsensical - it is simply a memcpy
	AnnexBEncodeFlagAddStartCode                AnnexBEncodeFlags = 1 // Add the 3 byte start code 00 00 01
	AnnexBEncodeFlagAddEmulationPreventionBytes AnnexBEncodeFlags = 2 // Add emulation prevention bytes (0x03) where necessary
)
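The values 1 and 2 look like independent bits, so presumably the flags can be combined with a bitwise OR. That is an assumption; the documentation does not say so explicitly. A sketch with hypothetical names:

```go
package main

import "fmt"

// encodeFlags is a hypothetical mirror of AnnexBEncodeFlags.
type encodeFlags int

const (
	flagNone                        encodeFlags = 0
	flagAddStartCode                encodeFlags = 1
	flagAddEmulationPreventionBytes encodeFlags = 2
)

// has reports whether the given bit is set in f.
func (f encodeFlags) has(bit encodeFlags) bool { return f&bit != 0 }

func main() {
	both := flagAddStartCode | flagAddEmulationPreventionBytes
	fmt.Println(both.has(flagAddStartCode), both.has(flagAddEmulationPreventionBytes))
	fmt.Println(flagNone.has(flagAddStartCode))
}
```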

type DecoderOptions

type DecoderOptions struct {
	Codec    string
	Filename string
}

If you're decoding a file, provide the filename. If you're decoding a stream, provide the codec

type EmulationState

type EmulationState int

EmulationState can be used to inform us whether a NALU has any emulation prevention bytes. This is a tiny optimization that we can use to avoid decoding from Annex-B into raw bytes.

const (
	EmulationStateUnknown                EmulationState = iota // We don't know what's inside
	EmulationStateContainsEmulationBytes                       // There is at least one emulation prevention byte
	EmulationStateNoEmulationBytes                             // There were no byte sequences that needed the 0x03 emulation prevention byte
)
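One way such a state could be computed is a single scan for the byte sequence 00 00 03, which in a correctly encoded NALU only appears where an emulation prevention byte was inserted. The names below are hypothetical and the package may track this differently.

```go
package main

import (
	"bytes"
	"fmt"
)

// emulationState is a hypothetical mirror of EmulationState.
type emulationState int

const (
	stateUnknown emulationState = iota
	stateContainsEmulationBytes
	stateNoEmulationBytes
)

// classify scans an encoded payload for the sequence 00 00 03.
func classify(encoded []byte) emulationState {
	if bytes.Contains(encoded, []byte{0x00, 0x00, 0x03}) {
		return stateContainsEmulationBytes
	}
	return stateNoEmulationBytes
}

func main() {
	fmt.Println(classify([]byte{0x65, 0x00, 0x00, 0x03, 0x01}) == stateContainsEmulationBytes)
	fmt.Println(classify([]byte{0x65, 0x11, 0x22}) == stateNoEmulationBytes)
}
```

When the result is "no emulation bytes", the encoded bytes after the start code are identical to the raw bytes, so the decode step can be skipped.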

type H264Decoder

type H264Decoder struct {
	// contains filtered or unexported fields
}

H264Decoder is a wrapper around ffmpeg's H264 decoder.

func NewH264Decoder

func NewH264Decoder(options DecoderOptions) (*H264Decoder, error)

NewH264Decoder allocates a new H264Decoder.

func NewH264FileDecoder

func NewH264FileDecoder(filename string) (*H264Decoder, error)

Create a new decoder that will decode the given file

func NewH264StreamDecoder

func NewH264StreamDecoder(codec string) (*H264Decoder, error)

Create a new decoder that you will feed with packets

func (*H264Decoder) Close

func (d *H264Decoder) Close()

Close closes the decoder.

func (*H264Decoder) Decode

func (d *H264Decoder) Decode(packet *VideoPacket) (*accel.YUVImage, error)

Decode the packet and return a copy of the YUV image. This is used when decoding a stream (not a file).

func (*H264Decoder) DecodeDeepRef

func (d *H264Decoder) DecodeDeepRef(packet *VideoPacket) (*accel.YUVImage, error)

WARNING: The image returned is only valid while the decoder is still alive, and it will be clobbered by the subsequent DecodeDeepRef/Decode(). The pixels in the returned image are not a garbage-collected Go slice. They point directly into the libavcodec decode buffer. That's why the function name has the "DeepRef" suffix.
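The hazard described in this warning can be shown with plain Go slices. fakeDecoder below is a hypothetical stand-in that owns one reusable buffer; it does not use libavcodec or the package's types.

```go
package main

import "fmt"

// fakeDecoder is a hypothetical decoder that owns a single output buffer.
type fakeDecoder struct{ buf []byte }

// decodeDeepRef returns a slice that aliases the decoder's internal buffer.
func (d *fakeDecoder) decodeDeepRef(v byte) []byte {
	for i := range d.buf {
		d.buf[i] = v
	}
	return d.buf
}

// decode returns an independent copy that survives later calls.
func (d *fakeDecoder) decode(v byte) []byte {
	return append([]byte(nil), d.decodeDeepRef(v)...)
}

func main() {
	d := &fakeDecoder{buf: make([]byte, 4)}
	ref := d.decodeDeepRef(1) // aliases d.buf
	cp := d.decode(2)         // private copy
	d.decodeDeepRef(3)        // clobbers ref, leaves cp alone
	fmt.Println(ref[0], cp[0])
}
```

The reference is cheaper because it avoids a copy, but it must be consumed before the next decode call.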

func (*H264Decoder) Height

func (d *H264Decoder) Height() int

func (*H264Decoder) NextFrame

func (d *H264Decoder) NextFrame() (*accel.YUVImage, error)

NextFrame reads the next frame from a file and returns a copy of the YUV image.

func (*H264Decoder) NextFrameDeepRef

func (d *H264Decoder) NextFrameDeepRef() (*accel.YUVImage, error)

NextFrameDeepRef will read the next frame from a file and return a deep reference into the libavcodec decoded image buffer. The next call to NextFrame/NextFrameDeepRef will invalidate that image.

func (*H264Decoder) Width

func (d *H264Decoder) Width() int

type MPGTSEncoder

type MPGTSEncoder struct {
	// contains filtered or unexported fields
}

MPGTSEncoder encodes H264 NALUs into MPEG-TS.

func NewMPEGTSEncoder

func NewMPEGTSEncoder(log log.Log, output io.Writer, sps []byte, pps []byte) (*MPGTSEncoder, error)

NewMPEGTSEncoder allocates an MPGTSEncoder.

func (*MPGTSEncoder) Close

func (e *MPGTSEncoder) Close() error

Close closes all the MPGTSEncoder resources.

func (*MPGTSEncoder) Encode

func (e *MPGTSEncoder) Encode(nalus []NALU, pts time.Duration) error

Encode encodes H264 NALUs into MPEG-TS.

type NALU

type NALU struct {
	// If zero, then no prefix, and RBSP format.
	// If 3 or 4, then the first N bytes of Payload are 00 00 01 or 00 00 00 01 respectively, and SODB format.
	// The only valid values for PrefixLen are: 0,3,4
	PrefixLen int
	Emulation EmulationState
	Payload   []byte
}

A NALU that is one of:

  1. Raw Byte Sequence Payload (RBSP)
  2. String of Data Bits (SODB) - aka Annex-B encoding

RBSP has no prefix, and no emulation prevention bytes. SODB has a 3 or 4 byte prefix, and emulation prevention bytes.

func WrapRawNALU

func WrapRawNALU(raw []byte) NALU

Wrap a raw buffer in a NALU object. Do not clone memory, or add prefix bytes.

func (*NALU) DeepClone

func (n *NALU) DeepClone() NALU

func (*NALU) DeepCloneToFormat

func (n *NALU) DeepCloneToFormat(format NALUFormat) NALU

func (*NALU) Format

func (n *NALU) Format() NALUFormat

Returns either RBSP or SODB

func (*NALU) RBSPPayload

func (n *NALU) RBSPPayload() []byte

Returns the raw payload in RBSP format (no prefix bytes, and no emulation prevention bytes)

func (*NALU) SODBPayload

func (n *NALU) SODBPayload() []byte

Returns the payload in SODB format (prefix/start code bytes and emulation prevention bytes)

func (*NALU) ShallowCloneToFormat

func (n *NALU) ShallowCloneToFormat(format NALUFormat) NALU

Return a clone of a NALU in the given encoding. The clone is shallow (i.e. references same memory) if possible.

func (*NALU) Type

func (n *NALU) Type() h264.NALUType

Return the NALU type
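In H.264 the NALU type is the low 5 bits of the first byte after any start code. A minimal sketch, with the hypothetical helper naluType taking the prefix length explicitly:

```go
package main

import "fmt"

// naluType is a hypothetical helper. It returns the 5-bit nal_unit_type
// from the header byte that follows a prefix of prefixLen bytes.
func naluType(payload []byte, prefixLen int) (byte, bool) {
	if prefixLen < 0 || prefixLen >= len(payload) {
		return 0, false // no header byte available
	}
	return payload[prefixLen] & 0x1F, true
}

func main() {
	sps := []byte{0x00, 0x00, 0x01, 0x67, 0x42}
	t, ok := naluType(sps, 3)
	fmt.Println(t, ok)
}
```

Header bytes 0x67 and 0x68 map to types 7 (SPS) and 8 (PPS), and 0x65 maps to type 5 (IDR slice).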

type NALUFormat

type NALUFormat int

Type of NALU (either RBSP or SODB)

const (
	NALUFormatUnknown NALUFormat = iota // A 'nil' value
	NALUFormatRBSP                      // Raw Byte Sequence Payload (No start code, no emulation prevention bytes)
	NALUFormatSODB                      // String of Data Bits (Annex-B encoding. Has start code and emulation prevention bytes)
)

type PacketBuffer

type PacketBuffer struct {
	Packets []*VideoPacket
}

A list of packets, with some helper functions

func LoadBinDir

func LoadBinDir(dir string) (*PacketBuffer, error)

Opposite of PacketBuffer.DumpBin. NOTE: We don't attempt to inject SPS and PPS into the PacketBuffer, but that would be trivial for H264: just look at the first byte of the payload (0x67 and 0x68 for SPS and PPS).

func (*PacketBuffer) DecodeHeader

func (r *PacketBuffer) DecodeHeader() (width, height int, err error)

Decode SPS and PPS to extract header information

func (*PacketBuffer) DumpBin

func (r *PacketBuffer) DumpBin(dir string) error

Dump each NALU to a .raw file

func (*PacketBuffer) ExtractThumbnail

func (r *PacketBuffer) ExtractThumbnail() (*cimg.Image, error)

Decode the center-most keyframe. This is O(1), assuming no errors or funny business like no keyframes.
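How the package picks the keyframe is not shown here. One plausible approach is to start at the middle packet and search outward until a keyframe is found, as in this sketch. centerMostKeyframe is a hypothetical name, and the input is simplified to one boolean per packet.

```go
package main

import "fmt"

// centerMostKeyframe is a hypothetical sketch. isKey[i] says whether
// packet i holds a keyframe. It returns the index of the keyframe
// closest to the middle, or -1 if there is none.
func centerMostKeyframe(isKey []bool) int {
	mid := len(isKey) / 2
	for d := 0; d <= mid; d++ {
		if i := mid - d; i >= 0 && i < len(isKey) && isKey[i] {
			return i
		}
		if i := mid + d; i < len(isKey) && isKey[i] {
			return i
		}
	}
	return -1
}

func main() {
	packets := []bool{true, false, false, false, true, false, false}
	fmt.Println(centerMostKeyframe(packets))
}
```

The search itself is linear in the number of packets in the worst case; only one frame is decoded once the keyframe is chosen.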

func (*PacketBuffer) FirstNALUOfType

func (r *PacketBuffer) FirstNALUOfType(ofType h264.NALUType) *NALU

Returns the first NALU of the given type, or nil if none found

func (*PacketBuffer) IndexOfFirstNALUOfType

func (r *PacketBuffer) IndexOfFirstNALUOfType(ofType h264.NALUType) (packetIdx int, indexInPacket int)

func (*PacketBuffer) ResetPTS

func (r *PacketBuffer) ResetPTS()

Adjust all PTS values so that the first frame starts at time 0
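The adjustment amounts to subtracting the first PTS from every entry. A sketch on a bare slice of durations, with hypothetical names resetPTS and samplePTS:

```go
package main

import (
	"fmt"
	"time"
)

// resetPTS is a hypothetical sketch. It shifts all timestamps so
// that the first one becomes zero.
func resetPTS(pts []time.Duration) {
	if len(pts) == 0 {
		return
	}
	first := pts[0]
	for i := range pts {
		pts[i] -= first
	}
}

// samplePTS returns illustrative timestamps 40ms apart, starting at 5s.
func samplePTS() []time.Duration {
	return []time.Duration{
		5 * time.Second,
		5*time.Second + 40*time.Millisecond,
		5*time.Second + 80*time.Millisecond,
	}
}

func main() {
	pts := samplePTS()
	resetPTS(pts)
	fmt.Println(pts)
}
```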

func (*PacketBuffer) SaveToMP4

func (r *PacketBuffer) SaveToMP4(filename string) error

func (*PacketBuffer) SaveToMPEGTS

func (r *PacketBuffer) SaveToMPEGTS(log log.Log, output io.Writer) error

Extract saved buffer into an MPEGTS stream

type VideoEncoder

type VideoEncoder struct {
	// contains filtered or unexported fields
}

func NewVideoEncoder

func NewVideoEncoder(format, filename string, width, height int) (*VideoEncoder, error)

NewVideoEncoder creates a new video encoder. You must Close() a video encoder when you are done using it, otherwise you will leak ffmpeg objects.

func (*VideoEncoder) Close

func (v *VideoEncoder) Close()

func (*VideoEncoder) WriteNALU

func (v *VideoEncoder) WriteNALU(dts, pts time.Duration, nalu NALU) error

func (*VideoEncoder) WritePacket

func (v *VideoEncoder) WritePacket(dts, pts time.Duration, packet *VideoPacket) error

func (*VideoEncoder) WriteTrailer

func (v *VideoEncoder) WriteTrailer() error

type VideoPacket

type VideoPacket struct {
	RecvID    int64     // Arbitrary monotonically increasing ID. Used to detect dropped packets, or other issues like that.
	RecvTime  time.Time // Wall time when the packet was received. This is obviously subject to network jitter etc, so not a substitute for PTS
	H264NALUs []NALU
	H264PTS   time.Duration
	WallPTS   time.Time // Reference wall time combined with the received PTS. We consider this the ground truth/reality of when the packet was recorded.
	IsBacklog bool      // a bit of a hack to inject this state here. maybe an integer counter would suffice? (eg nBacklogPackets)
}

VideoPacket is what we store in our ring buffer

func ClonePacket

func ClonePacket(nalusIn [][]byte, pts time.Duration, recvTime time.Time, wallPTS time.Time) *VideoPacket

Clone a packet of NALUs and return the cloned packet. NOTE: gortsplib re-uses buffers, which is why we copy the payloads. NOTE2: I think that after upgrading gortsplib in Jan 2024, it no longer re-uses buffers, so I should revisit the requirement of our deep clone here.
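The reason for copying can be sketched as follows: if the producer re-uses its buffers, a stored slice changes underneath you unless you own a copy. clonePayloads is a hypothetical helper, not the package's ClonePacket.

```go
package main

import "fmt"

// clonePayloads is a hypothetical sketch of a deep copy of NALU payloads.
func clonePayloads(in [][]byte) [][]byte {
	out := make([][]byte, len(in))
	for i, p := range in {
		out[i] = append([]byte(nil), p...) // new backing array per payload
	}
	return out
}

func main() {
	shared := [][]byte{{0x67, 0x42}, {0x68, 0xCE}}
	kept := clonePayloads(shared)
	shared[0][0] = 0xFF // simulate the producer re-using its buffer
	fmt.Printf("%#x %#x\n", shared[0][0], kept[0][0])
}
```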

func (*VideoPacket) Clone

func (p *VideoPacket) Clone() *VideoPacket

Deep clone of packet buffer

func (*VideoPacket) EncodeToAnnexBPacket

func (p *VideoPacket) EncodeToAnnexBPacket() []byte

Encode all NALUs in the packet into AnnexB format (i.e. with 00,00,01 prefix bytes)
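As a sketch of what such a packet looks like, the function below concatenates NALUs, each preceded by the 3-byte start code. It adds no emulation prevention bytes, which matches the case where EnableEmulationPreventBytesEscaping is false. joinWithStartCodes is a hypothetical name, and whether the package behaves exactly this way is an assumption.

```go
package main

import "fmt"

// startCode mirrors the 3-byte prefix 00 00 01.
var startCode = []byte{0x00, 0x00, 0x01}

// joinWithStartCodes is a hypothetical sketch. It writes each NALU
// after a start code into one contiguous buffer.
func joinWithStartCodes(nalus [][]byte) []byte {
	size := 0
	for _, n := range nalus {
		size += len(startCode) + len(n)
	}
	out := make([]byte, 0, size)
	for _, n := range nalus {
		out = append(out, startCode...)
		out = append(out, n...)
	}
	return out
}

func main() {
	packet := joinWithStartCodes([][]byte{{0x67, 0x01}, {0x68}})
	fmt.Printf("% x\n", packet)
}
```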

func (*VideoPacket) FirstNALUOfType

func (p *VideoPacket) FirstNALUOfType(t h264.NALUType) *NALU

Returns the first NALU of the given type, or nil if none exists

func (*VideoPacket) HasIDR

func (p *VideoPacket) HasIDR() bool

Returns true if this packet has a keyframe

func (*VideoPacket) HasType

func (p *VideoPacket) HasType(t h264.NALUType) bool

Return true if this packet has a NALU of type t inside

func (*VideoPacket) IsIFrame

func (p *VideoPacket) IsIFrame() bool

Return true if this packet has one NALU which is an intermediate frame

func (*VideoPacket) PayloadBytes

func (p *VideoPacket) PayloadBytes() int

Returns the number of bytes of NALU data. If the NALUs have annex-b prefixes, then these are included in the size.

func (*VideoPacket) Summary

func (p *VideoPacket) Summary() string
