# Transcription Stream Community Edition

Created by https://transcription.stream, with special thanks to MahmoudAshraf97 for his work on whisper-diarization, and to jmorganca for Ollama and its amazing simplicity of use.

## Overview

Transcription Stream is a turnkey, self-hosted diarization service that works completely offline. Out of the box it includes:

* drag-and-drop diarization and transcription via SSH
* a web interface for uploading, reviewing, and downloading files
* summarization with Ollama and Mistral
* Meilisearch for full-text search

The web interface and SSH drop zones make Transcription Stream simple to use and easy to fit into your workflows. Ollama provides a powerful toolset, limited only by your prompt skills, for performing complex operations on your transcriptions. Meilisearch adds ridiculously fast full-text search.
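Since the service is driven by SSH drop zones, a simple way to script uploads is to shell out to scp. Below is a minimal sketch assuming the defaults documented later in this README (port 22222, user transcriptionstream); the host `dockerip` is a placeholder and `build_upload_command` is an illustrative helper, not part of the project:

```python
import shlex
import subprocess

def build_upload_command(local_path, host="dockerip", folder="diarize",
                         port=22222, user="transcriptionstream"):
    """Build the scp command that drops an audio file into a drop zone
    ("diarize" or "transcribe")."""
    return ["scp", "-P", str(port), local_path, f"{user}@{host}:{folder}/"]

cmd = build_upload_command("meeting.wav")
print(shlex.join(cmd))   # scp -P 22222 meeting.wav transcriptionstream@dockerip:diarize/
# subprocess.run(cmd, check=True)  # uncomment to actually upload
```

Swap `folder="transcribe"` to request plain transcription instead of diarization.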
Use the web interface to upload, listen to, review, and download output files, or drop files via SSH into `transcribe` or `diarize`. Files are processed, with output placed into a named and dated folder. Have a quick look at the install and ts-web walkthrough videos for a better idea.

### ssh upload and transcribed

[Screenshot: uploading a file to be diarized into the diarize folder] [Screenshot: transcribed files in their folders]

### ts-web interface

[Example image]

### ts-gpu diarization example

[Watch the video on YouTube]

### mistral summary

The local Ollama Mistral summary prompt:

```python
prompt_text = f"""
Summarize the transcription below. Be sure to include pertinent information about the speakers, including name and anything else shared. Provide the summary output in the following style

Speakers: names or identifiers of speaking parties
Topics: topics included in the transcription
Ideas: any ideas that may have been mentioned
Dates: dates mentioned and what they correspond to
Locations: any locations mentioned
Action Items: any action items
Summary: overall summary of the transcription

The transcription is as follows

{transcription_text}
"""
```

### Prerequisite

* NVIDIA GPU

**Warning:** the resulting ts-gpu image is ~26GB and might take a hot second to create.

### Quickstart (no build)

Pulls all Docker images and starts the services:

```shell
./start-nobuild.sh
```

### Build and Run Instructions

If you'd like to build the images locally:

Automated install and run:

```shell
chmod +x install.sh; ./install.sh;
```

Run:

```shell
chmod +x run.sh; ./run.sh
```

## Additional Information

### Ports

* SSH: 22222
* HTTP: 5006
* Ollama: 11434
* Meilisearch: 7700

### SSH Server Access

* Port: 22222
* User: `transcriptionstream`
* Password: `nomoresaastax`
* Usage: place audio files in `transcribe` or `diarize`. Completed files are stored in `transcribed`.
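The summarization step boils down to one POST against Ollama's `/api/generate` endpoint (port 11434, as listed above). A minimal sketch with a prompt in the same style as `ts-gpu/ts-summarize.py`; the helper names and the `dockerip` host are illustrative, not part of the project:

```python
import json
import urllib.request

def build_generate_payload(transcription_text):
    """Build an Ollama /api/generate request body for a Mistral summary."""
    prompt_text = (
        "Summarize the transcription below. Be sure to include pertinent "
        "information about the speakers.\n\n"
        f"The transcription is as follows\n\n{transcription_text}\n"
    )
    # stream=False asks Ollama for a single JSON object instead of a stream
    return {"model": "mistral", "prompt": prompt_text, "stream": False}

def summarize(transcription_text, host="dockerip"):
    """POST the prompt to Ollama and return the generated summary text."""
    req = urllib.request.Request(
        f"http://{host}:11434/api/generate",
        data=json.dumps(build_generate_payload(transcription_text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Point `host` at whatever Ollama endpoint your `.env` is configured for.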
### Web Interface

* URL: http://dockerip:5006
* Features:
  + audio file upload/download
  + task completion alerts with interactive links
  + HTML5 web player with speed control and transcription highlighting
  + time-synced transcription scrubbing/highlighting/scrolling

### Ollama API

* URL: http://dockerip:11434
* Change the prompt used in `ts-gpu/ts-summarize.py`.

### Meilisearch API

* URL: http://dockerip:7700

**Warning:** this is example code for example purposes and should not be used in production environments without additional security measures.

## Customization and Troubleshooting

* Update variables in the `.env` file.
* Change the password for `transcriptionstream` in the ts-gpu Dockerfile.
* Update the Ollama API endpoint IP in `.env` if you want to use a different endpoint.
* Update the secret in `.env` for ts-web.
* Use `.env` to choose which models are included in the initial build.
* Change the prompt text in `ts-gpu/ts-summarize.py` to fit your needs. Update `ts-web/templates/transcription.html` if you want to call it something other than "summary".
* 12GB of VRAM may not be enough to run both whisper-diarization and Ollama Mistral. Whisper-diarization is fairly light on GPU memory out of the box, but Ollama's runner holds enough GPU memory open that diarization/transcription occasionally runs out of CUDA memory. Since I can't run both reliably on the same host, I've set the batch size for both whisper-diarization and whisperx to 16 (from their default of 8) and let an M-series Mac run the Ollama endpoint.

## To-do

* Fix an issue with ts-web that throws an error to the console when loading a transcription for which a summary.txt file does not exist. Lots of other annoyances with ts-web, but it's functional.
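Until ts-web grows a search interface, you can query Meilisearch directly over its REST API (port 7700, as listed above). A minimal sketch; the index name `transcriptions`, the helper names, and the `dockerip` host are assumptions, so check which index ts-gpu actually writes to:

```python
import json
import urllib.request

def build_search_request(query, host="dockerip", index="transcriptions"):
    """Build the URL and body for a Meilisearch search query."""
    url = f"http://{host}:7700/indexes/{index}/search"
    body = {"q": query, "limit": 5}
    return url, body

def search(query, api_key=None, **kwargs):
    """POST the search and return the matching documents ("hits")."""
    url, body = build_search_request(query, **kwargs)
    headers = {"Content-Type": "application/json"}
    if api_key:  # master/search key, if your instance has one set
        headers["Authorization"] = f"Bearer {api_key}"
    req = urllib.request.Request(url, data=json.dumps(body).encode(),
                                 headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["hits"]
```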
* Need to add a search/control interface to ts-web for Meilisearch.