# Transcription Stream Community Edition

Created by https://transcription.stream, with special thanks to MahmoudAshraf97 for his work on whisper-diarization, and to jmorganca for Ollama and its amazing simplicity of use.

## Overview

Transcription Stream is a turnkey, self-hosted diarization service that works completely offline. Out of the box it includes:

* drag-and-drop diarization and transcription via SSH
* a web interface for uploading, reviewing, and downloading files
* summarization with Ollama and Mistral
* Meilisearch for full-text search

The web interface and SSH drop zones make Transcription Stream simple to use and easy to fit into your workflows. Ollama provides a powerful toolset, limited only by your prompt skills, for performing complex operations on your transcriptions. Meilisearch adds ridiculously fast full-text search.
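Since the service is driven by SSH drop zones, a simple way to script uploads is to shell out to scp. Below is a minimal sketch assuming the defaults documented later in this README (port 22222, user transcriptionstream); the host `dockerip` is a placeholder and `build_upload_command` is an illustrative helper, not part of the project:

```python
import shlex
import subprocess

def build_upload_command(local_path, host="dockerip", folder="diarize",
                         port=22222, user="transcriptionstream"):
    """Build the scp command that drops an audio file into a drop zone
    ("diarize" or "transcribe")."""
    return ["scp", "-P", str(port), local_path, f"{user}@{host}:{folder}/"]

cmd = build_upload_command("meeting.wav")
print(shlex.join(cmd))   # scp -P 22222 meeting.wav transcriptionstream@dockerip:diarize/
# subprocess.run(cmd, check=True)  # uncomment to actually upload
```

Swap `folder="transcribe"` to request plain transcription instead of diarization.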
Use the web interface to upload, listen to, review, and download output files, or drop files via SSH into `transcribe` or `diarize`. Files are processed, with output placed into a named and dated folder. Have a quick look at the install and ts-web walkthrough videos for a better idea.

### ssh upload and transcribed

[Screenshot: uploading a file to be diarized into the diarize folder] [Screenshot: transcribed files in their folders]

### ts-web interface

[Example image]

### ts-gpu diarization example

[Watch the video on YouTube]

### mistral summary

The local Ollama Mistral summary prompt:

```python
prompt_text = f"""
Summarize the transcription below. Be sure to include pertinent information about the speakers, including name and anything else shared. Provide the summary output in the following style

Speakers: names or identifiers of speaking parties
Topics: topics included in the transcription
Ideas: any ideas that may have been mentioned
Dates: dates mentioned and what they correspond to
Locations: any locations mentioned
Action Items: any action items
Summary: overall summary of the transcription

The transcription is as follows

{transcription_text}
"""
```

### Prerequisite

* NVIDIA GPU

**Warning:** the resulting ts-gpu image is ~26GB and might take a hot second to create.

### Quickstart (no build)

Pulls all Docker images and starts the services:

```shell
./start-nobuild.sh
```

### Build and Run Instructions

If you'd like to build the images locally:

Automated install and run:

```shell
chmod +x install.sh; ./install.sh;
```

Run:

```shell
chmod +x run.sh; ./run.sh
```

## Additional Information

### Ports

* SSH: 22222
* HTTP: 5006
* Ollama: 11434
* Meilisearch: 7700

### SSH Server Access

* Port: 22222
* User: `transcriptionstream`
* Password: `nomoresaastax`
* Usage: place audio files in `transcribe` or `diarize`. Completed files are stored in `transcribed`.
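The summarization step boils down to one POST against Ollama's `/api/generate` endpoint (port 11434, as listed above). A minimal sketch with a prompt in the same style as `ts-gpu/ts-summarize.py`; the helper names and the `dockerip` host are illustrative, not part of the project:

```python
import json
import urllib.request

def build_generate_payload(transcription_text):
    """Build an Ollama /api/generate request body for a Mistral summary."""
    prompt_text = (
        "Summarize the transcription below. Be sure to include pertinent "
        "information about the speakers.\n\n"
        f"The transcription is as follows\n\n{transcription_text}\n"
    )
    # stream=False asks Ollama for a single JSON object instead of a stream
    return {"model": "mistral", "prompt": prompt_text, "stream": False}

def summarize(transcription_text, host="dockerip"):
    """POST the prompt to Ollama and return the generated summary text."""
    req = urllib.request.Request(
        f"http://{host}:11434/api/generate",
        data=json.dumps(build_generate_payload(transcription_text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Point `host` at whatever Ollama endpoint your `.env` is configured for.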
### Web Interface

* URL: http://dockerip:5006
* Features:
  + audio file upload/download
  + task completion alerts with interactive links
  + HTML5 web player with speed control and transcription highlighting
  + time-synced transcription scrubbing/highlighting/scrolling

### Ollama API

* URL: http://dockerip:11434
* Change the prompt used in `ts-gpu/ts-summarize.py`.

### Meilisearch API

* URL: http://dockerip:7700

**Warning:** this is example code for example purposes and should not be used in production environments without additional security measures.

## Customization and Troubleshooting

* Update variables in the `.env` file.
* Change the password for `transcriptionstream` in the ts-gpu Dockerfile.
* Update the Ollama API endpoint IP in `.env` if you want to use a different endpoint.
* Update the secret in `.env` for ts-web.
* Use `.env` to choose which models are included in the initial build.
* Change the prompt text in `ts-gpu/ts-summarize.py` to fit your needs. Update `ts-web/templates/transcription.html` if you want to call it something other than "summary".
* 12GB of VRAM may not be enough to run both whisper-diarization and Ollama Mistral. Whisper-diarization is fairly light on GPU memory out of the box, but Ollama's runner holds enough GPU memory open that diarization/transcription occasionally runs out of CUDA memory. Since I can't run both reliably on the same host, I've set the batch size for both whisper-diarization and whisperx to 16 (from their default of 8) and let an M-series Mac run the Ollama endpoint.

## To-do

* Fix an issue with ts-web that throws an error to the console when loading a transcription for which a summary.txt file does not exist. Lots of other annoyances with ts-web, but it's functional.
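Until ts-web grows a search interface, you can query Meilisearch directly over its REST API (port 7700, as listed above). A minimal sketch; the index name `transcriptions`, the helper names, and the `dockerip` host are assumptions, so check which index ts-gpu actually writes to:

```python
import json
import urllib.request

def build_search_request(query, host="dockerip", index="transcriptions"):
    """Build the URL and body for a Meilisearch search query."""
    url = f"http://{host}:7700/indexes/{index}/search"
    body = {"q": query, "limit": 5}
    return url, body

def search(query, api_key=None, **kwargs):
    """POST the search and return the matching documents ("hits")."""
    url, body = build_search_request(query, **kwargs)
    headers = {"Content-Type": "application/json"}
    if api_key:  # master/search key, if your instance has one set
        headers["Authorization"] = f"Bearer {api_key}"
    req = urllib.request.Request(url, data=json.dumps(body).encode(),
                                 headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["hits"]
```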
* Need to add a search/control interface to ts-web for Meilisearch.