https://github.com/pipecat-ai/pipecat
# Pipecat

`pipecat` is a framework for building voice (and multimodal) conversational agents: things like personal coaches, meeting assistants, story-telling toys for kids, customer support bots, intake flows, and snarky social companions.

Take a look at some example apps.

## Getting started with voice agents

You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you're ready. You can also add a telephone number, image output, video input, use different LLMs, and more.
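Under the hood, the examples below all follow one pattern: frames (text, audio, control signals like `EndFrame`) flow through an ordered list of processors, each of which consumes, transforms, or emits frames. As a toy, dependency-free sketch of that idea (purely illustrative; `Processor`, `UppercaseTTS`, and this `Pipeline` are made-up names, not Pipecat's real classes):

```python
class Processor:
    """Toy stand-in for a frame processor: takes one frame, yields zero or more."""

    def process(self, frame):
        yield frame  # default behavior: pass the frame through unchanged


class UppercaseTTS(Processor):
    """Pretend TTS step: 'renders' text frames into fake audio frames."""

    def process(self, frame):
        if isinstance(frame, str):
            yield ("audio", frame.upper())
        else:
            yield frame  # non-text frames (e.g. control frames) pass through


class Pipeline:
    """Push every frame through each processor, in order."""

    def __init__(self, processors):
        self.processors = processors

    def run(self, frames):
        for stage in self.processors:
            frames = [out for f in frames for out in stage.process(f)]
        return frames


print(Pipeline([UppercaseTTS()]).run(["hello"]))  # [('audio', 'HELLO')]
```

A real Pipecat pipeline works asynchronously and carries audio and media frames, but the shape is the same: `Pipeline([tts, transport.output()])` is an ordered chain of processors that frames are queued into.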
 
```shell
# install the module
pip install pipecat-ai

# set up an .env file with API keys
cp dot-env.template .env
```

By default, in order to minimize dependencies, only the basic framework functionality is available. Some third-party AI services require additional dependencies that you can install with:

```shell
pip install "pipecat-ai[option,...]"
```

Your project may or may not need these, so they're made available as optional requirements. Here is a list:

* **AI services**: `anthropic`, `azure`, `fal`, `moondream`, `openai`, `playht`, `silero`, `whisper`
* **Transports**: `local`, `websocket`, `daily`

## Code examples

* **foundational** -- small snippets that build on each other, introducing one or two concepts at a time
* **example apps** -- complete applications that you can use as starting points for development

## A simple voice agent running locally

Here is a very basic Pipecat bot that greets a user when they join a real-time session. We'll use Daily for real-time media transport, and ElevenLabs for text-to-speech.

```python
# app.py

import asyncio

import aiohttp

from pipecat.frames.frames import EndFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport


async def main():
    async with aiohttp.ClientSession() as session:
        # Use Daily as a real-time media transport (WebRTC)
        transport = DailyTransport(
            room_url=...,
            token=...,
            bot_name="Bot Name",
            params=DailyParams(audio_out_enabled=True))

        # Use ElevenLabs for text-to-speech
        tts = ElevenLabsTTSService(
            aiohttp_session=session,
            api_key=...,
            voice_id=...,
        )

        # Simple pipeline that will process text to speech and output the result
        pipeline = Pipeline([tts, transport.output()])

        # Create a Pipecat processor that can run one or more pipeline tasks
        runner = PipelineRunner()

        # Assign the task callable to run the pipeline
        task = PipelineTask(pipeline)

        # Register an event handler to play audio when a
        # participant joins the transport WebRTC session
        @transport.event_handler("on_participant_joined")
        async def on_new_participant_joined(transport, participant):
            participant_name = participant["info"]["userName"] or ''
            # Queue a TextFrame that will get spoken by the TTS service (ElevenLabs)
            await task.queue_frames([TextFrame(f"Hello there, {participant_name}!"), EndFrame()])

        # Run the pipeline task
        await runner.run(task)


if __name__ == "__main__":
    asyncio.run(main())
```

Run it with:

```shell
python app.py
```

Daily provides a prebuilt WebRTC user interface. Whilst the app is running, you can visit `https://<yourdomain>.daily.co/<room-name>` and listen to the bot say hello!

## WebRTC for production use

WebSockets are fine for server-to-server communication or for initial development. But for production use, you'll need client-server audio to use a protocol designed for real-time media transport. (For an explanation of the difference between WebSockets and WebRTC, see this post.)

One way to get up and running quickly with WebRTC is to sign up for a Daily developer account. Daily gives you SDKs and global infrastructure for audio (and video) routing. Every account gets 10,000 audio/video/transcription minutes free each month.

Sign up here and create a room in the developer Dashboard.

## What is VAD?

Voice Activity Detection -- very important for knowing when a user has finished speaking to your bot. If you are not using push-to-talk, and want Pipecat to detect when the user has finished talking, VAD is an essential component of a natural-feeling conversation.

Pipecat makes use of WebRTC VAD by default when using a WebRTC transport layer. Optionally, you can use Silero VAD for improved accuracy at the cost of higher CPU usage:

```shell
pip install "pipecat-ai[silero]"
```

The first time you run your bot with Silero, startup may take a while whilst it downloads and caches the model in the background. You can check the progress of this in the console.
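Conceptually, a VAD labels short audio frames as speech or silence, then holds the "speech" state for a few frames so brief pauses don't end the user's turn. A minimal energy-based sketch of that idea (purely illustrative; Pipecat's actual VADs use the WebRTC and Silero models, and `rms_energy`/`detect_speech` here are made-up names):

```python
import math


def rms_energy(samples):
    """Root-mean-square energy of one audio frame (samples in [-1.0, 1.0])."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect_speech(frames, threshold=0.1, hangover=3):
    """Label each frame True (speech) or False (silence).

    A frame counts as speech if its energy is above `threshold`, or if
    speech ended fewer than `hangover` frames ago -- the hangover keeps
    short pauses from being mistaken for the end of a turn.
    """
    labels = []
    quiet_frames = hangover  # start in the "silence" state
    for frame in frames:
        if rms_energy(frame) >= threshold:
            quiet_frames = 0
        else:
            quiet_frames += 1
        labels.append(quiet_frames < hangover)
    return labels


# Two loud frames followed by four silent ones: the hangover keeps the
# first two silent frames labeled as speech.
frames = [[0.5] * 4] * 2 + [[0.0] * 4] * 4
print(detect_speech(frames))  # [True, True, True, True, False, False]
```

Real VAD models operate on the same loop (short frames in, a speech/silence decision out), but replace the energy threshold with a learned classifier that is far more robust to background noise.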
## Hacking on the framework itself

Note that you may need to set up a virtual environment before following the instructions below. For instance, you might need to run the following from the root of the repo:

```shell
python3 -m venv venv
source venv/bin/activate
```

From the root of this repo, run the following:

```shell
pip install -r dev-requirements.txt -r {env}-requirements.txt
python -m build
```

This builds the package. To use the package locally (e.g. to run sample files), run:

```shell
pip install --editable .
```

If you want to use this package from another directory, you can run:

```shell
pip install path_to_this_repo
```

## Running tests

From the root directory, run:

```shell
pytest --doctest-modules --ignore-glob="*to_be_updated*" src tests
```

## Setting up your editor

This project uses strict PEP 8 formatting.

### Emacs

You can use use-package to install the py-autopep8 package and configure autopep8 arguments:

```elisp
(use-package py-autopep8
  :ensure t
  :defer t
  :hook ((python-mode . py-autopep8-mode))
  :config
  (setq py-autopep8-options '("-a" "-a" "--max-line-length=100")))
```

autopep8 was installed in the venv environment described before, so you should be able to use pyvenv-auto to automatically load that environment inside Emacs:

```elisp
(use-package pyvenv-auto
  :ensure t
  :defer t
  :hook ((python-mode . pyvenv-auto-run)))
```

### Visual Studio Code

Install the autopep8 extension. Then edit the user settings (Ctrl-Shift-P, `Open User Settings (JSON)`) and set it as the default Python formatter, enable formatting on save, and configure autopep8 arguments:

```json
"[python]": {
    "editor.defaultFormatter": "ms-python.autopep8",
    "editor.formatOnSave": true
},
"autopep8.args": [
    "-a",
    "-a",
    "--max-line-length=100"
],
```

## Getting help

Join our Discord
Reach us on Twitter