https://github.com/lifeiteng/OmniSenseVoice

Omni SenseVoice: High-Speed Speech Recognition with Word Timestamps

Omni SenseVoice 

 

The Ultimate Speech Recognition Solution

 

Built on SenseVoice, Omni SenseVoice is optimized for fast inference and
precise word-level timestamps, giving you a faster way to handle audio
transcription.

Install

Run from the root of a cloned copy of the repository:

pip install .

Usage

 

omnisense transcribe [OPTIONS] AUDIO_PATH

Key Options:

  * --language: Language of the audio: auto (detect automatically),
    zh, en, yue, ja, or ko.
  * --textnorm: Whether to apply inverse text normalization (ITN):
    withitn for inverse-normalized output or woitn for raw output.
  * --device-id: Index of the GPU to run on (default: -1, which runs
    on the CPU).
  * --quantize: Use a quantized model for faster processing.
  * --help: Display detailed help information.
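For scripted use, the options above can be assembled into a command line from Python. The following is a minimal sketch, not part of the project: the helper names `build_transcribe_command` and `transcribe` are hypothetical, the default values for `language` and `textnorm` are chosen here for illustration only, `--quantize` is assumed to be a boolean switch, and the transcript is assumed to be written to standard output.

```python
import subprocess

# Values listed under "Key Options" in this README.
LANGUAGES = {"auto", "zh", "en", "yue", "ja", "ko"}
TEXTNORMS = {"withitn", "woitn"}


def build_transcribe_command(audio_path, language="auto", textnorm="withitn",
                             device_id=-1, quantize=False):
    """Return the argv list for `omnisense transcribe` (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if textnorm not in TEXTNORMS:
        raise ValueError(f"unsupported textnorm: {textnorm!r}")
    cmd = [
        "omnisense", "transcribe",
        "--language", language,
        "--textnorm", textnorm,
        "--device-id", str(device_id),
    ]
    if quantize:
        # Assumption: --quantize is a flag that takes no value.
        cmd.append("--quantize")
    cmd.append(str(audio_path))  # AUDIO_PATH comes after the options
    return cmd


def transcribe(audio_path, **options):
    """Run the CLI and return its standard output (assumed to hold the result)."""
    result = subprocess.run(
        build_transcribe_command(audio_path, **options),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, `transcribe("sample.wav", language="en", device_id=0)` would run the CLI on GPU 0, where `sample.wav` is a placeholder file name. Run `omnisense transcribe --help` to confirm the actual option syntax before relying on this.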

Benchmark

 

omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 \
    --textnorm woitn --language en \
    benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

Optimize        GPU            WER     RTF     Speed Up
baseline(onnx)  NVIDIA L4 GPU  4.47%   0.1200  1x
torch           NVIDIA L4 GPU  5.02%   0.0022  50x
  * With Omni SenseVoice, the torch backend runs about 50x faster than
    the ONNX baseline on this benchmark (RTF 0.0022 vs. 0.1200), while
    WER rises from 4.47% to 5.02%.

# LibriTTS
DIR=benchmark/data
lhotse download libritts -p dev-clean benchmark/data/LibriTTS
lhotse prepare libritts -p dev-clean benchmark/data/LibriTTS/LibriTTS benchmark/data/manifests/libritts

lhotse cut simple --force-eager -r benchmark/data/manifests/libritts/libritts_recordings_dev-clean.jsonl.gz \
    -s benchmark/data/manifests/libritts/libritts_supervisions_dev-clean.jsonl.gz \
    benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

omnisense benchmark -s --num-workers 4 --device-id 0 --batch-size 16 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

Contributing 

 

Step 1: Code Formatting

 

Set up pre-commit hooks:

pip install pre-commit==3.6.0
pre-commit install

Step 2: Pull Request

 

Submit your awesome improvements through a PR. 
