https://github.com/rentruewang/bocoel

rentruewang / bocoel Public

  * Fork 0
  * Star 125

Bayesian Optimization as a Coverage Tool for Evaluating LLMs. 10
times faster and accurate evaluation (benchmarking) with just a few
lines of modular code.

rentruewang.github.io/bocoel/

License

Apache-2.0 license
Folders and files

189 Commits

  * .github/workflows
  * assets
  * bocoel
  * docs
  * examples
  * tests
  * .gitignore
  * CHANGELOG.md
  * CODE_OF_CONDUCT.md
  * CONTRIBUTING.md
  * LICENSE.md
  * README.md
  * mkdocs.yml
  * pyproject.toml

  BoCoEL

 Bayesian Optimization as a Coverage Tool for Evaluating Large
Language Models

  Why BoCoEL?

Large language models are expensive and slow behemoths, and
evaluating them on gigantic modern datasets only makes it worse.

If only there were a way to select a meaningful (and small) subset
of the corpus and still obtain a highly accurate evaluation...

Wait, sounds like Bayesian optimization!

Bocoel works in the following steps:

 1. Encode each entry of the corpus into an embedding (far cheaper
    and faster than running the LLM, and reusable).
 2. Use Bayesian optimization to select queries to evaluate.
 3. Use the queries to retrieve from our corpus (with the encoded
    embeddings).
 4. Profit.
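
Step 3 above can be pictured as a nearest-neighbor lookup in embedding space. The following is a minimal sketch of that idea in plain NumPy, not bocoel's actual API: the function name `retrieve_nearest`, the toy 2D embeddings, and the query points are all assumptions made for illustration.

```python
import numpy as np


def retrieve_nearest(embeddings: np.ndarray, queries: np.ndarray) -> np.ndarray:
    """Return, for each query point, the index of the closest corpus embedding."""
    # Pairwise squared Euclidean distances, shape (n_queries, n_corpus).
    diffs = queries[:, None, :] - embeddings[None, :, :]
    dists = np.sum(diffs**2, axis=-1)
    return np.argmin(dists, axis=1)


# Toy stand-in for step 1: pretend four corpus entries were embedded in 2D.
embeddings = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# Toy stand-in for step 2: query points an optimizer might propose.
queries = np.array([[0.9, 0.1], [0.2, 0.8]])

# Step 3: map each query back to a real corpus entry to evaluate with the LLM.
selected = retrieve_nearest(embeddings, queries)
print(selected)
```

Only the entries in `selected` would then be passed to the expensive model. A real corpus would likely need an approximate nearest-neighbor index rather than this brute-force distance matrix.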

The evaluations generated are easily managed by the provided manager
utility.

  Features

  * Accurately evaluate large language models with just tens of
    samples from your selected corpus.
  * Uses the power of Bayesian optimization to select an optimal
    set of samples for the language model to evaluate.
  * Evaluates the corpus on the model in addition to evaluating the
    model on the corpus.
  * Support for GPT2, Pythia, LLAMA and more through integration
    with huggingface transformers and datasets.
  * Modular design.
  * Efficient representations of the corpus / dataset, such as
    N-sphere representation or whitening of the latent space, to
    augment evaluation quality.
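
To make the last feature more concrete, here is a small sketch of two common embedding transforms: PCA whitening and projection onto the unit sphere. This is one plausible reading of "whitening" and "N-sphere representation", not a description of how bocoel implements them; the helper names `whiten` and `to_unit_sphere` and the random test data are assumptions.

```python
import numpy as np


def whiten(embeddings: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """PCA whitening: zero mean and (approximately) identity covariance."""
    centered = embeddings - embeddings.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Rotate onto the principal axes, then rescale each axis to unit variance.
    return centered @ eigvecs / np.sqrt(eigvals + eps)


def to_unit_sphere(embeddings: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale every embedding to unit length so it lies on the unit sphere."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.maximum(norms, eps)


# Hypothetical correlated 3D "embeddings", used only to exercise the helpers.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.1]])
raw = rng.normal(size=(500, 3)) @ mixing

white = whiten(raw)
sphere = to_unit_sphere(raw)
```

Both transforms even out the geometry of the latent space, which tends to make distance-based search over it better behaved.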

  Give us a star!

Like what you see? Please consider giving this a star!

  Bayesian Optimization

[Image: Bayesian optimization illustration, with a purple area and black dots]

Simply put, Bayesian optimization aims to optimize either the
exploration objective (the purple area in the image) or the
exploitation objective (the height of the black dots). It uses Gaussian
processes as a backbone for inference, and uses an acquisition
function to decide where to sample next. See here for a more
in-depth introduction.
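
The interplay between the Gaussian process and the acquisition function can be sketched in a few lines of NumPy. This is a generic illustration under stated assumptions (zero-mean prior, RBF kernel, upper-confidence-bound acquisition, made-up 1D observations), not the method or API that bocoel uses internally; all function names here are hypothetical.

```python
import numpy as np


def rbf_kernel(a: np.ndarray, b: np.ndarray, length_scale: float = 0.2) -> np.ndarray:
    """Squared-exponential kernel between two sets of 1D points."""
    sq = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq / length_scale**2)


def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    solved = np.linalg.solve(k_tt, k_ts)  # K_tt^{-1} K_ts
    mean = solved.T @ y_train
    cov = k_ss - k_ts.T @ solved
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std


def upper_confidence_bound(mean, std, beta=2.0):
    """Acquisition: the mean rewards exploitation, beta * std rewards exploration."""
    return mean + beta * std


# Made-up observations (the "black dots").
x_train = np.array([0.1, 0.5, 0.9])
y_train = np.array([0.2, 0.8, 0.3])
x_test = np.linspace(0.0, 1.0, 101)

mean, std = gp_posterior(x_train, y_train, x_test)
scores = upper_confidence_bound(mean, std)
next_x = x_test[np.argmax(scores)]  # where to sample next
print(next_x)
```

Raising `beta` pushes the choice toward regions of high uncertainty, while `beta = 0` simply picks the highest predicted value.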

Since Bayesian optimization works well with expensive-to-evaluate
black-box models (read: LLMs), it is perfect for this particular
use case. Bocoel uses Bayesian optimization as a backbone for
exploring the embedding space given by our corpus, which allows it to
select a good subset that acts as a mini snapshot of the corpus.

  Performance Implications

LLMs are painfully slow, especially generative ones (which is what
"LLM" usually refers to), since sequence generation is sequential
by nature.

Despite bocoel's requirement to use an embedder to encode the entire
corpus, embedders are faster than LLMs by orders of magnitude and the
time is gained back by practically any savings in evaluating LLMs.
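
The trade-off described above is simple arithmetic, sketched below. The per-entry timings and corpus sizes are hypothetical placeholders chosen only to show the calculation; they are not measurements of bocoel or of any model.

```python
def full_evaluation_seconds(n_entries: int, llm_seconds_per_entry: float) -> float:
    """Cost of running the LLM on every entry of the corpus."""
    return n_entries * llm_seconds_per_entry


def subset_evaluation_seconds(
    n_entries: int,
    n_selected: int,
    embed_seconds_per_entry: float,
    llm_seconds_per_entry: float,
) -> float:
    """Cost of embedding the whole corpus, then running the LLM on a subset."""
    return n_entries * embed_seconds_per_entry + n_selected * llm_seconds_per_entry


# Hypothetical inputs: 10,000 entries, 50 selected samples,
# 0.01 s per embedding and 1 s per LLM evaluation.
full = full_evaluation_seconds(10_000, 1.0)
subset = subset_evaluation_seconds(10_000, 50, 0.01, 1.0)
print(f"full: {full:.0f} s, subset: {subset:.0f} s")
```

The embedding term grows with the corpus size, but as long as the embedder is much cheaper per entry than the LLM, the subset approach comes out ahead.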

  Installation

I don't want optional dependencies:

pip install bocoel

Give me the full experience (all optional dependencies):

pip install "bocoel[all]"
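
After either command, one way to confirm the install from Python is to query the package metadata with the standard library. The helper name `installed_version` is an assumption for this sketch; it relies only on `importlib.metadata` and does not import bocoel itself.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if bocoel is installed, otherwise None.
print(installed_version("bocoel"))
```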

  Usage

See the folder examples/getting_started for a simple usage example
that gets you started with just a few lines of code.

  Develop with BoCoEL

Usage examples are under the folder examples. API reference can be
found here.

  Contributing

Contributors wanted! Don't be shy. Feel free to file issues and PRs.
For PRs, please follow the guide on contributing and the code of
conduct. Openness and inclusiveness are taken very seriously.

  Roadmap: work in progress

  * Simpler usage. I should provide a high-level wrapper for the
    entire library so that evaluations can be run in one line.
  * Visualization module for the evaluation.
  * Integration of alternative methods (random, k-medoids...) with
    Gaussian processes.
  * Integration with more backends such as VLLM and OpenAI's API.
  * Support for Python 3.11+

  License and Citation

The code is available under the Apache-2.0 license.

If you find this project helpful in your research, please cite this
work as

@misc{bocoel2024,
    title = {BoCoEL: Bayesian Optimization as a Coverage Tool for Evaluating Large Language Models},
    url = {https://rentruewang.github.io/bocoel/research/},
    author = {Wang, RenChu and Chuang, Yung-Sung},
    month = {January},
    year = {2024}
}

Contributors 6

  * @rentruewang
  * @github-actions[bot]
  * @voidism
  * @doctryucsd
  * @PsyDak-Meng
  * @gauthameyunni

Languages

  * Python 100.0%
