https://lambdalabs.com/blog/unveiling-hermes-3-the-first-fine-tuned-llama-3.1-405b-model-is-on-lambdas-cloud

Introducing 1-Click Clusters(tm), on-demand GPU clusters in the cloud
for training large AI models. Learn more
lambda logo

  * Cloud Show submenu for Cloud
      + Cloud Sign-In
      + 1-Click Clusters
      + On-Demand Cloud
      + Reserved Cloud
  * Datacenter Show submenu for Datacenter
      + Lambda NVIDIA DGX Systems NVIDIA DGX SystemsNVIDIA's latest
        generation of infrastructure for enterprise AI.
      + Lambda Scalar server with up to 8x NVIDIA GPUs (PCIe) Scalar
        ServerPCIe server with up to 8x customizable NVIDIA Tensor
        Core GPUs and dual Xeon or AMD EPYC processors.
  * Desktops Show submenu for Desktops
      + Lambda GPU Workstation RTX 5000 Ada, RTX A6000, RTX 6000 Ada 
        Vector Pro GPU WorkstationLambda's GPU workstation designed
        for AI. Up to four fully customizable NVIDIA GPUs.
      + Lambda GPU Desktop 4090 GPUs Vector GPU DesktopLambda's GPU
        desktop for deep learning. Configured with two NVIDIA RTX
        4090s.
      + Lambda Vector One 4090 GPU Vector One GPU DesktopLambda's
        single GPU desktop. Configured with a single NVIDIA RTX 4090.
  * Company Show submenu for Company
      + About
      + Careers
      + Professional Services
      + Partners
  * Resources Show submenu for Resources
      + GPU Benchmarks
      + Blog
      + ML Times
      + Lambda Stack
      + Documentation
      + Forum
      + Research
      + Technical Support
  *  

Open main navigation Close main navigation

  * Cloud Show submenu for Cloud
      + Cloud Sign-In
      + 1-Click Clusters
      + On-Demand Cloud
      + Reserved Cloud
  * Datacenter Show submenu for Datacenter
      + NVIDIA DGX Systems
      + Scalar Server
  * Desktops Show submenu for Desktops
      + Vector Pro Workstation
      + Vector Desktop
      + Vector One Desktop
  * Company Show submenu for Company
      + About
      + Careers
      + Professional Services
      + Partners
  * Resources Show submenu for Resources
      + GPU Benchmarks
      + Blog
      + ML Times
      + Lambda Stack
      + Documentation
      + Forum
      + Research
  * Support
  * +1 (866) 711-2025

 
+1 (866) 711-2025

Unveiling Hermes 3: The First Fine-Tuned Llama 3.1 405B Model is on
Lambda's Cloud

[mites]
Mitesh Agrawal
August 15, 2024 4 min read
machine learning news announcements gpus infiniband gpu-cloud text
generation distributed training gpu clusters lambda cloud NVIDIA H100
LLMs 1-Click Cluster
[Hermes-3]

Try Hermes 3 for free with the New Lambda Chat Completions API and
Lambda Chat.

 

Introducing Hermes 3: A new era for Llama fine-tuning

We are thrilled to announce our partner Nous Research's launch of
Hermes 3 --the first full-parameter fine-tune of Meta's groundbreaking
Llama 3.1 405B model, trained on Lambda's 1-Click Cluster. Designed
for the open-source community, Hermes 3 is a neutrally-aligned
generalist model with exceptional reasoning capabilities, now
available for free through the new Lambda Chat Completions API and
Lambda Chat interface.

Powered by an 8-node Lambda 1-Click Cluster, Nous Research achieved
outstanding results in just a few short weeks. Hermes 3 meets or
exceeds Llama 3.1 Instruct on Open Source LLM benchmarks (see table
below). 

    "Lambda's 1-Click Clusters make the experience of renting and
    using a multi-node cluster as simple and easy as renting and
    using a single node," 

-Jeffrey Quesnelle, co-founder of Nous Research

 

Hermes 3: A uniquely unlocked, uncensored, and steerable model

Hermes 3 is the latest advancement in Nous Research's series of
models, which have been downloaded over 33 million times. This
instruct-tuned model is specifically designed to be flexible and
adept at following instructions. It excels in complex role-playing
and creative writing, offering users more immersive character
portrayals, deeper simulations, and unexpected fictional experiences.

Hermes 3 benchmarks

In addition to its creative capabilities, Hermes 3 is an invaluable
tool for professionals requiring advanced reasoning and
decision-making abilities. Its strategic planning and operational
decision-making features include function-calling, step-labeled
reasoning, and more.

 

Optimized for efficiency

Hermes 3 was meticulously trained using synthesized data and
supervised fine-tuning on Meta's Llama 3.1 405B base model. This was
followed by reinforcement learning from human feedback (RLHF) and
finally, quantization using Neural Magic's FP8 method.

This optimization effectively reduces the model's VRAM and disk
requirements by approximately 50%, allowing it to run on a single
node.

    "Since the start of my journey in AI I wanted to bring about the
    realization of an open source frontier level model that aligns to
    you, the user - not some corporation or higher authority before
    the user. Today, with Hermes 3 405B, we've achieved that goal, a
    model that is frontier level, but truly aligned to you. 

    Thanks to our hard work on data synthesis and post training
    research, we were able to make a dataset that is fully synthetic
    over almost a year in the making to train Hermes 3 - and will be
    releasing much more to come."

-Teknium, cofounder of Nous Research

For those seeking dedicated access and flexibility, Hermes 3 can run
on a single node (available on-demand on Lambda's Cloud), or quickly
scale to a multi-node 1-Click Cluster for further fine-tuning using
Lambda's scalable cluster infrastructure. 

 

Try Hermes 3 for free - for a limited time!

We're excited to offer the AI/ML community free access to Hermes 3
through Lambda's new Chat Completions API, fully compatible with the
OpenAI API. It provides endpoints for creating completions, chat
completions and listing models.

No complex setup is required--simply generate a Cloud API key from
Lambda's dashboard (sign-up) and start exploring with our
documentation's help. 

For a more interactive experience, we're also providing a simple chat
interface: try your prompts in Lambda Chat! 

More on the Deep Learning Blog:

[X]

Hugging Face x Lambda: Whisper Fine-Tuning Event

Lambda is thrilled to team up with Hugging Face, a community platform
that enables users to build,...

how-do-i-fine-tune-llama-2-on-lambda-gpu-cloud

Fine tuning Meta's LLaMA 2 on Lambda GPU Cloud

This blog post provides instructions on how to fine tune Llama 2
models on Lambda Cloud using a...

How to fine tune stable diffusion: how we made the text-to-pokemon
model at Lambda

How to fine tune stable diffusion: how we made the text-to-pokemon
model at Lambda

Stable Diffusion is great at many things, but not great at
everything, and getting results in a...

[lambda-log]

Resources

GPU Benchmarks Blog Lambda Stack Documentation Forum Research

Company

About Careers Professional Services Partners

Support

Technical Support Partner Portal

Contact

Contact Us P. 1 (866) 711-2025
---------------------------------------------------------------------
(c) 2024 All rights reserved.
Terms of Service     Privacy Policy