[HN Gopher] How to stream media using WebRTC and FFmpeg, and why...
       ___________________________________________________________________
        
       How to stream media using WebRTC and FFmpeg, and why it's a bad
       idea
        
       Author : dimes
       Score  : 74 points
       Date   : 2021-01-30 17:17 UTC (5 hours ago)
        
 (HTM) web link (blog.maxwellgale.com)
 (TXT) w3m dump (blog.maxwellgale.com)
        
       | mandis wrote:
       | >And finally, we encounter a large issue without a good solution.
       | In encoded videos, a key frame is a frame in the video that
       | contains all the visual information needed to render itself
       | without any additional metadata. These are much larger than
       | normal frames, and contribute greatly to the bitrate. Ideally,
       | there would be as few keyframes as possible. However, when a
       | new user starts consuming a stream, they need at least one
       | keyframe to view the video. WebRTC solves this problem using the
       | RTP Control Protocol (RTCP). When a new user consumes a stream,
       | they send a Full Intra Request (FIR) to the producer. When a
       | producer receives this request, they insert a keyframe into the
       | stream. This keeps the bitrate low while ensuring all the users
       | can view the stream. FFmpeg does not support RTCP. This means
       | that the default FFmpeg settings will produce output that won't
       | be viewable if consumed mid-stream, at least until a key frame is
       | received. Therefore, the parameter -force_key_frames
       | expr:gte(t,n_forced*4) is needed, which produces a key frame
       | every 4 seconds.
       | 
       | in case someone was wondering why it was a bad idea
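       | 
       | For illustration, a minimal FFmpeg invocation using that
       | parameter might look like the sketch below (the input file,
       | RTP endpoint, and other flags are illustrative, not from the
       | post):
       | 
       |   # Force a keyframe every 4 seconds so that late joiners
       |   # can start decoding without an RTCP FIR round-trip.
       |   ffmpeg -re -i input.mp4 -an \
       |     -c:v libx264 -profile:v baseline \
       |     -force_key_frames "expr:gte(t,n_forced*4)" \
       |     -f rtp rtp://127.0.0.1:5004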
        
         | pantalaimon wrote:
         | I'm sure this could be supported if someone were to sit down
         | and implement it.
        
         | wwweston wrote:
         | Thanks for the easy summary.
         | 
         | One thing to consider: for some IRL performances, it's not
         | uncommon that if you arrive late, you're seated at the
         | discretion of an usher, who chooses the timing. I understand
         | digital experiences may carry different expectations, but I
         | could see building an experience around this, perhaps
         | starting with audio-only and maybe even a countdown to the
         | next keyframe event (every minute?) while a "please wait to
         | be seated" message is shown.
        
         | amelius wrote:
         | Ok, so only a problem in _live_ streams?
         | 
         | (And I suppose also when _seeking_ inside a stream)
        
           | SahAssar wrote:
           | It's a problem when playing from a non-start, non-keyframe
           | point (which in practice means any arbitrary point). I'm
           | guessing that's what you meant.
        
           | gregoriol wrote:
           | It seems to me that you can't seek within a WebRTC stream,
           | at least not as it is.
        
             | SahAssar wrote:
             | WebRTC is just a stream, but you can absolutely tell the
             | sending side to seek to a certain point.
             | 
             | If webrtc is your TV then the sending side is your VHS. You
             | can tell the VHS to rewind or forward, but telling your TV
             | to do the same is impossible. It just shows you what it
             | gets.
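             | 
             | A sketch of that control path in the browser, assuming a
             | WebRTC data channel carries the command (the channel name
             | and message shape are made up, not part of any spec):
             | 
             |   // Viewer side: ask the sending peer to seek. WebRTC
             |   // itself only shows whatever frames the sender
             |   // produces, so the sender must act on the message.
             |   const pc = new RTCPeerConnection();
             |   const control = pc.createDataChannel("control");
             | 
             |   function requestSeek(seconds: number): void {
             |     if (control.readyState !== "open") return;
             |     control.send(
             |       JSON.stringify({ type: "seek", position: seconds })
             |     );
             |   }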
        
               | gregoriol wrote:
               | I meant it's not part of WebRTC itself, but you can
               | indeed implement a lot of things around it.
        
               | [deleted]
        
         | tatersolid wrote:
         | H.264 and most other modern codecs support "intra refresh" to
         | avoid this problem, at the cost of a marginally higher bitrate
         | overall. Think of this as a "rolling keyframe slice" which
         | marches across the screen every few seconds.
         | 
         | http://www.chaneru.com/Roku/HLS/X264_Settings.htm#intra-refr...
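         | 
         | With FFmpeg's libx264, intra refresh can be switched on
         | through the encoder's private options, e.g. (a sketch; the
         | input and endpoint are hypothetical):
         | 
         |   # Replace periodic keyframes with a rolling column of
         |   # intra-coded blocks that sweeps across the frame.
         |   ffmpeg -re -i input.mp4 -an \
         |     -c:v libx264 -x264-params intra-refresh=1 \
         |     -f rtp rtp://127.0.0.1:5004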
        
           | dimes wrote:
           | I was not aware of this at the time of writing, but it solves
           | a large problem we've been having. Thank you so much for
           | pointing that out.
           | 
           | Edit: I've just tried using intra refresh, and it works
           | pretty well, but the key frame interval is still required.
        
           | Dylan16807 wrote:
           | I would say intra refresh solves a different problem. You
           | still have to wait for the intra refresh to cover the frame
           | before you can start watching properly. That takes just as
           | long as waiting for a keyframe, and needs slightly more
           | bytes.
           | 
           | The benefit of intra refresh is that you avoid having any
           | particularly large frames. If you're using a sub-second
           | buffer, then intra refresh makes your maximum frame size
           | _much_ smaller without sacrificing quality. It's a godsend
           | for getting latency down to tiny amounts. But if you have 1
           | or 2 seconds of buffer then it's no big deal if a keyframe is
           | an order of magnitude bigger than other frames, and intra
           | refresh is pointless.
           | 
           | Also, it's not really a codec thing; it's a clever encoder
           | trick that you can do with basically any codec.
        
             | dimes wrote:
             | Yes, this is a good point. Intra refresh does reduce
             | variability of the bitrate, but the bitrate is still higher
             | than it would need to be if RTCP were supported.
        
         | legohead wrote:
         | Still sounds better than a FIR. Consider a big streamer with
         | thousands of users: viewers are constantly arriving and
         | leaving, so the keyframe requests would be so constant that I
         | can see keyframes being generated much more often than every
         | 4 seconds (assuming I understand it all correctly).
        
           | dimes wrote:
           | Usually you'll have an SFU between the users and the streamer
           | that can limit the number of requests to one every X seconds.
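           | 
           | A sketch of that throttling logic (the window length and
           | the names are hypothetical):
           | 
           |   // SFU side: forward at most one FIR to the producer
           |   // per window, no matter how many viewers ask for one.
           |   const FIR_WINDOW_MS = 4000;
           |   let lastFir = 0;
           | 
           |   function onFirRequest(forwardFir: () => void): void {
           |     const now = Date.now();
           |     if (now - lastFir >= FIR_WINDOW_MS) {
           |       lastFir = now;
           |       forwardFir();
           |     }
           |   }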
        
       | kuter wrote:
       | For one-to-many live streaming you would probably want to use
       | HLS.
       | 
       | Twitch uses its own transcoding system. Here is an interesting
       | read from their engineering blog [0]
       | 
       | [0] https://blog.twitch.tv/en/2017/10/10/live-video-
       | transmuxing-...
        
         | Feoj wrote:
         | If you want to achieve something approaching the latency
         | advantages of WebRTC with HLS, it's well worth checking out
         | the low-latency HLS work by Apple and the wider video-dev
         | community.
         | 
         | https://developer.apple.com/documentation/http_live_streamin...
         | https://tools.ietf.org/html/draft-pantos-hls-rfc8216bis-08
        
       | doubleorseven wrote:
       | Isn't Opus the only codec WebRTC supports? If so, I think
       | that's another key parameter to note.
        
         | opencl wrote:
         | It's not the only codec, but it's the only high quality codec
         | mandated by the spec and supported by all the browsers.
         | 
         | G.711 is also mandated by the spec, but it's a low-quality
         | codec intended for speech with a fixed 8 kHz sampling rate.
         | There are
         | a few other codecs supported by Chrome and Safari but not
         | Firefox.
        
         | oplav wrote:
         | H.264 is also supported, though it's limited to the
         | Constrained Baseline profile [0]. That said, I have been able
         | to use an H.264 stream encoded with the Main profile that
         | still worked in Chrome, so the limit could just be a strong
         | recommendation.
         | 
         | https://developer.mozilla.org/en-US/docs/Web/Media/Formats/W...
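         | 
         | For reference, producing a stream within that profile from
         | FFmpeg might look like the following sketch (the input and
         | endpoint are hypothetical):
         | 
         |   # x264's "baseline" is already Constrained Baseline;
         |   # yuv420p is the safest pixel format for browsers.
         |   ffmpeg -re -i input.mp4 -an \
         |     -c:v libx264 -profile:v baseline -pix_fmt yuv420p \
         |     -f rtp rtp://127.0.0.1:5004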
        
       | rlyshw wrote:
       | What about the WebRTC part?
       | 
       | The post ends at RTP out from FFmpeg. Maybe I'm supposed to
       | know how to consume that with WebRTC, but in my investigation
       | it's not at all straightforward... the WebRTC consumer needs to
       | become aware of the stream through a whole complicated
       | signaling and negotiation process. How is that handled after
       | the FFmpeg RTP stream is produced?
        
         | jbaudanza wrote:
         | I use MediaSoup to bridge between FFmpeg and WebRTC. It works
         | pretty well, and I like that it's all Node-based.
        
       | Sean-Der wrote:
       | To get it into the browser, check out rtp-to-webrtc [0].
       | 
       | Another big piece missing here is congestion control. It isn't
       | just about keeping the bitrate low, but about figuring out how
       | much bandwidth you can use. Measuring RTT/loss to work out what
       | is available is a really interesting topic. You don't get that
       | in FFmpeg or GStreamer yet. The best intro to this is the BBR
       | IETF doc, IMO [1].
       | 
       | [0] https://github.com/pion/webrtc/tree/master/examples/rtp-
       | to-w...
       | 
       | [1] https://tools.ietf.org/html/draft-cardwell-iccrg-bbr-
       | congest...
        
       | _Gyan_ wrote:
       | > -bsv:v h264_metadata=level=3.1
       | 
         | This should be `-bsf:v`, and it's not required here anyway:
         | this command encodes, and the encoder has already been
         | informed of the level via `-level`.
        
         | dimes wrote:
         | Thanks for the feedback. I've removed it.
        
       ___________________________________________________________________
       (page generated 2021-01-30 23:00 UTC)