https://www.cnx-software.com/2021/02/28/lyra-audio-codec-enables-3-kbps-bitrate-for-high-quality-voice-calls/

Lyra audio codec enables high-quality voice calls at 3 kbps bitrate

Posted on February 28, 2021 by Jean-Luc Aufranc (CNXSoft)

We often write about new video codecs like AV1 or H.266, and we recently covered the AVIF picture format, which offers an improved quality/compression ratio against WebP and JPEG, but there's also work being done on audio codecs. Notably, we noted that Opus 1.2 offered decent speech quality with a bitrate as low as 12 kbps when it was released in 2017, and the release of Opus 1.3 in 2019 improved the codec further, with high-quality speech possible at just 9 kbps. But Google AI recently unveiled Lyra, a very low-bitrate codec for speech compression that achieves high speech quality with a bitrate as low as 3 kbps.
Lyra audio codec vs Opus vs Speex

Before we go into the details of the Lyra codec, note that Google compared a reference audio file encoded with Lyra at 3 kbps, Opus at 6 kbps (the minimum bitrate for Opus), and Speex at 3 kbps, and users reported that Lyra sounded the best and came close to the original. You can try it yourself.

Audio samples, clean speech: Original, Opus @ 6 kbps, Lyra @ 3 kbps, Speex @ 3 kbps
Audio samples, noisy environment: Original, Opus @ 6 kbps, Lyra @ 3 kbps, Speex @ 3 kbps

Speex at 3 kbps sounded pretty bad for all samples. I feel Opus at 6 kbps and Lyra at 3 kbps sound about the same with the clean speech samples, but Lyra reproduces the background music better in the noisy environment.

So how does Lyra work? Google AI explains that the basic architecture of the Lyra codec relies on features, or distinctive speech attributes, in the form of log mel spectrograms that represent speech energy in different frequency bands. These features are extracted from speech every 40 ms and then compressed for transmission. On the receiving end, a generative model uses those features to recreate the speech signal.

[Figure: How does Lyra work?]

Lyra works in a similar way to the Mixed Excitation Linear Predictive (MELP) speech coding standard developed by the United States Department of Defense (US DoD) for military applications and satellite communications, secure voice, and secure radio devices. Lyra also leverages natural-sounding generative models to maintain a low bitrate while achieving high quality, similar to that achieved by higher-bitrate codecs.

Google AI writes: "Using these models as a baseline, we've developed a new model capable of reconstructing speech using minimal amounts of data. Lyra harnesses the power of these new natural-sounding generative models to maintain the low bitrate of parametric codecs while achieving high quality, on par with state-of-the-art waveform codecs used in most streaming and communication platforms today."
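To put the feature-extraction description in numbers, here is a minimal sketch in Python. The frame budget uses only the figures quoted above (3 kbps, one feature frame every 40 ms). The mel band edges use the common 2595 * log10(1 + f/700) mel formula; the band count (8) and the 0 to 8000 Hz range are hypothetical values picked for illustration, not parameters published for Lyra.

```python
import math


def bits_per_frame(bitrate_bps: int, frame_ms: int) -> float:
    """Bits available for one feature frame at a constant bitrate."""
    return bitrate_bps * frame_ms / 1000


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to the mel scale (common HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands: int, f_min_hz: float, f_max_hz: float) -> list:
    """Return n_bands + 1 edge frequencies in Hz, equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 1)]


if __name__ == "__main__":
    # Figures from the article: 3 kbps, one feature frame every 40 ms.
    print("bits per frame:", bits_per_frame(3000, 40))
    print("frames per second:", 1000 / 40)

    # Hypothetical band layout: 8 bands between 0 Hz and 8000 Hz.
    for edge in mel_band_edges(8, 0.0, 8000.0):
        print(f"{edge:8.1f} Hz")
```

At 3 kbps, each 40 ms frame has a budget of 120 bits, which is why the features have to be compressed before transmission. The printed band edges also show that mel bands are narrow at low frequencies and wide at high frequencies.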
The drawback of waveform codecs is that they achieve this high quality by compressing and sending over the signal sample-by-sample, which requires a higher bitrate and, in most cases, isn't necessary to achieve natural-sounding speech.

One concern with generative models is their computational complexity. Lyra avoids this issue by using a cheaper recurrent generative model, a WaveRNN variation, that works at a lower rate but generates multiple signals in different frequency ranges in parallel, and later combines them into a single output signal at the desired sample rate. This trick enables Lyra to run not only on cloud servers, but also on-device on mid-range phones in real time (with a processing latency of 90 ms, which is in line with other traditional speech codecs). This generative model is then trained on thousands of hours of speech data and optimized, similarly to WaveNet, to accurately recreate the input audio.

Lyra will enable intelligible, high-quality voice calls even with poor-quality signals, low bandwidth, and/or congested network connections. It does not work only for English, as Google trained the model on thousands of hours of audio with speakers in over 70 languages using open-source audio libraries, and then verified the audio quality with expert and crowdsourced listeners.

The company also expects video calls to become possible on a 56 kbps dial-in modem connection thanks to the combination of the AV1 video codec with the Lyra audio codec. One of the first apps to use the Lyra audio codec will be the Google Duo video-calling app, where it will be used on very low bandwidth connections.

The company also plans to work on acceleration using GPUs and AI accelerators, and has started to investigate whether the technologies used for Lyra can also be leveraged to create a general-purpose audio codec for music and non-speech audio. More details can be found in the Google AI blog post.
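As a rough illustration of the 56 kbps video-call claim, the sketch below subtracts each speech bitrate mentioned in this article from the nominal link rate to see what would remain for video. It is back-of-the-envelope arithmetic only: it assumes the full 56 kbps is usable and ignores packet and protocol overhead, which a real connection would not. The function and variable names are made up for this example.

```python
LINK_KBPS = 56.0  # nominal dial-in modem rate quoted in the article

# Speech bitrates mentioned in the article, in kbps.
AUDIO_KBPS = {
    "Lyra": 3.0,
    "Opus (minimum bitrate)": 6.0,
    "Opus 1.3 (high-quality speech)": 9.0,
    "Opus 1.2 (decent speech)": 12.0,
}


def video_budget_kbps(link_kbps: float, audio_kbps: float) -> float:
    """Bitrate left for video once audio is accounted for (overhead ignored)."""
    if audio_kbps > link_kbps:
        raise ValueError("audio bitrate exceeds link capacity")
    return link_kbps - audio_kbps


def audio_share(link_kbps: float, audio_kbps: float) -> float:
    """Fraction of the link consumed by the audio stream."""
    return audio_kbps / link_kbps


if __name__ == "__main__":
    for name, kbps in AUDIO_KBPS.items():
        left = video_budget_kbps(LINK_KBPS, kbps)
        share = audio_share(LINK_KBPS, kbps)
        print(f"{name}: {left:.0f} kbps left for video, audio uses {share:.1%} of the link")
```

Under these simplifying assumptions, Lyra at 3 kbps leaves 53 kbps for video, against 44 kbps with speech at 12 kbps.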
Jean-Luc Aufranc (CNXSoft)
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.