https://github.com/livekit/agents
livekit/agents

Build real-time multimodal AI applications
docs.livekit.io/agents
License: Apache-2.0

Looking for the JS/TS library? Check out AgentsJS.

[NEW] OpenAI Realtime API support

We're partnering with OpenAI on a new MultimodalAgent API in the Agents framework. This class completely wraps OpenAI's Realtime API, abstracts away the raw wire protocol, and provides an ultra-low latency WebRTC transport between GPT-4o and your users' devices. This same stack powers Advanced Voice in the ChatGPT app.

* Try the Realtime API in our playground [code]
* Check out our guide to building your first app with this new API

What is Agents?

The Agents framework allows you to build AI-driven server programs that can see, hear, and speak in realtime.
Your agent connects with end user devices through a LiveKit session. During that session, your agent can process text, audio, images, or video streaming from a user's device, have an AI model generate any combination of those same modalities as output, and stream them back to the user.

Features

* Plugins for popular LLMs, transcription and text-to-speech services, and RAG databases
* High-level abstractions for building voice agents or assistants with automatic turn detection, interruption handling, function calling, and transcriptions
* Compatible with LiveKit's telephony stack, allowing your agent to make calls to or receive calls from phones
* Integrated load balancing system that manages pools of agents with edge-based dispatch, monitoring, and transparent failover
* Running your agents is identical across localhost, self-hosted, and LiveKit Cloud environments

Installation

To install the core Agents library:

    pip install livekit-agents

Plugins

The framework includes a variety of plugins that make it easy to process streaming input or generate output. For example, there are plugins for converting text to speech or running inference with popular LLMs.
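To make the plugin idea concrete, here is a minimal sketch in plain Python of how interchangeable speech-to-text, LLM, and text-to-speech stages could be chained for one conversational turn. It is not the Agents API: the `SpeechToText`, `LanguageModel`, and `TextToSpeech` interfaces, the fake stage classes, and `run_turn` are all hypothetical names invented for illustration, and the stand-in stages only pass text around instead of calling real services.

```python
import asyncio
from typing import Protocol


# Hypothetical stage interfaces, invented for this sketch.
# They are NOT the livekit-agents plugin API.
class SpeechToText(Protocol):
    async def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    async def complete(self, prompt: str) -> str: ...


class TextToSpeech(Protocol):
    async def synthesize(self, text: str) -> bytes: ...


class FakeSTT:
    """Stand-in STT stage: treats the audio bytes as UTF-8 text."""

    async def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class EchoLLM:
    """Stand-in LLM stage: echoes the transcript back in a fixed template."""

    async def complete(self, prompt: str) -> str:
        return f"You said: {prompt}"


class FakeTTS:
    """Stand-in TTS stage: returns the reply text encoded as bytes."""

    async def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


async def run_turn(
    audio: bytes, stt: SpeechToText, llm: LanguageModel, tts: TextToSpeech
) -> bytes:
    """Run one turn: audio in, transcript, model reply, audio out."""
    transcript = await stt.transcribe(audio)
    reply = await llm.complete(transcript)
    return await tts.synthesize(reply)


if __name__ == "__main__":
    out = asyncio.run(run_turn(b"hello", FakeSTT(), EchoLLM(), FakeTTS()))
    print(out)  # b'You said: hello'
```

Because `run_turn` depends only on the three small interfaces, any stage can be swapped for another implementation without touching the rest of the chain, which is the property a plugin system is meant to provide.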
Here's how you can install a plugin:

    pip install livekit-plugins-openai

The following plugins are available today:

Plugin                        Features
livekit-plugins-anthropic     LLM
livekit-plugins-azure         STT, TTS
livekit-plugins-deepgram      STT
livekit-plugins-cartesia      TTS
livekit-plugins-elevenlabs    TTS
livekit-plugins-playht        TTS
livekit-plugins-google        STT, TTS
livekit-plugins-nltk          Utilities for working with text
livekit-plugins-rag           Utilities for performing RAG
livekit-plugins-openai        LLM, STT, TTS, Assistants API, Realtime API
livekit-plugins-silero        VAD

Documentation and guides

Documentation on the framework and how to use it can be found at docs.livekit.io/agents.

Example agents

* A basic voice agent using a pipeline of STT, LLM, and TTS [demo | code]
* Voice agent using the new OpenAI Realtime API [demo | code]
* Super fast voice agent using Cerebras hosted Llama 3.1 [demo | code]
* Voice agent using Cartesia's Sonic model [demo]
* Agent that looks up the current weather via function call [code]
* Voice agent that performs a RAG-based lookup [code]
* Video agent that publishes a stream of RGB frames [code]
* Transcription agent that generates text captions from a user's speech [code]
* A chat agent you can text that responds with generated speech [code]
* Localhost multi-agent conference call [code]
* Moderation agent that uses Hive to detect spam/abusive video [code]

Contributing

The Agents framework is under active development in a rapidly evolving field. We welcome and appreciate contributions of any kind, be it feedback, bugfixes, features, new plugins and tools, or better documentation. You can file issues under this repo, open a PR, or chat with us in LiveKit's Slack community.
LiveKit Ecosystem

Realtime SDKs        Browser * iOS/macOS/visionOS * Android * Flutter * React Native * Rust * Node.js * Python * Unity * Unity (WebGL)
Server APIs          Node.js * Golang * Ruby * Java/Kotlin * Python * Rust * PHP (community)
UI Components        React * Android Compose * SwiftUI
Agents Frameworks    Python * Node.js * Playground
Services             LiveKit server * Egress * Ingress * SIP
Resources            Docs * Example apps * Cloud * Self-hosting * CLI