https://lambdalabs.com/blog/unveiling-hermes-3-the-first-fine-tuned-llama-3.1-405b-model-is-on-lambdas-cloud Introducing 1-Click Clusters(tm), on-demand GPU clusters in the cloud for training large AI models. Learn more lambda logo * Cloud Show submenu for Cloud + Cloud Sign-In + 1-Click Clusters + On-Demand Cloud + Reserved Cloud * Datacenter Show submenu for Datacenter + Lambda NVIDIA DGX Systems NVIDIA DGX SystemsNVIDIA's latest generation of infrastructure for enterprise AI. + Lambda Scalar server with up to 8x NVIDIA GPUs (PCIe) Scalar ServerPCIe server with up to 8x customizable NVIDIA Tensor Core GPUs and dual Xeon or AMD EPYC processors. * Desktops Show submenu for Desktops + Lambda GPU Workstation RTX 5000 Ada, RTX A6000, RTX 6000 Ada Vector Pro GPU WorkstationLambda's GPU workstation designed for AI. Up to four fully customizable NVIDIA GPUs. + Lambda GPU Desktop 4090 GPUs Vector GPU DesktopLambda's GPU desktop for deep learning. Configured with two NVIDIA RTX 4090s. + Lambda Vector One 4090 GPU Vector One GPU DesktopLambda's single GPU desktop. Configured with a single NVIDIA RTX 4090. * Company Show submenu for Company + About + Careers + Professional Services + Partners * Resources Show submenu for Resources + GPU Benchmarks + Blog + ML Times + Lambda Stack + Documentation + Forum + Research + Technical Support * Open main navigation Close main navigation * Cloud Show submenu for Cloud + Cloud Sign-In + 1-Click Clusters + On-Demand Cloud + Reserved Cloud * Datacenter Show submenu for Datacenter + NVIDIA DGX Systems + Scalar Server * Desktops Show submenu for Desktops + Vector Pro Workstation + Vector Desktop + Vector One Desktop * Company Show submenu for Company + About + Careers + Professional Services + Partners * Resources Show submenu for Resources + GPU Benchmarks + Blog + ML Times + Lambda Stack + Documentation + Forum + Research * Support * +1 (866) 711-2025 +1 (866) 711-2025 Unveiling Hermes 3: The First Fine-Tuned Llama 3.1 405B Model is on Lambda's Cloud [mites] Mitesh Agrawal August 15, 2024 4 min read machine learning news announcements gpus infiniband gpu-cloud text generation distributed training gpu clusters lambda cloud NVIDIA H100 LLMs 1-Click Cluster [Hermes-3] Try Hermes 3 for free with the New Lambda Chat Completions API and Lambda Chat. Introducing Hermes 3: A new era for Llama fine-tuning We are thrilled to announce our partner Nous Research's launch of Hermes 3 --the first full-parameter fine-tune of Meta's groundbreaking Llama 3.1 405B model, trained on Lambda's 1-Click Cluster. Designed for the open-source community, Hermes 3 is a neutrally-aligned generalist model with exceptional reasoning capabilities, now available for free through the new Lambda Chat Completions API and Lambda Chat interface. Powered by an 8-node Lambda 1-Click Cluster, Nous Research achieved outstanding results in just a few short weeks. Hermes 3 meets or exceeds Llama 3.1 Instruct on Open Source LLM benchmarks (see table below). "Lambda's 1-Click Clusters make the experience of renting and using a multi-node cluster as simple and easy as renting and using a single node," -Jeffrey Quesnelle, co-founder of Nous Research Hermes 3: A uniquely unlocked, uncensored, and steerable model Hermes 3 is the latest advancement in Nous Research's series of models, which have been downloaded over 33 million times. This instruct-tuned model is specifically designed to be flexible and adept at following instructions. It excels in complex role-playing and creative writing, offering users more immersive character portrayals, deeper simulations, and unexpected fictional experiences. Hermes 3 benchmarks In addition to its creative capabilities, Hermes 3 is an invaluable tool for professionals requiring advanced reasoning and decision-making abilities. Its strategic planning and operational decision-making features include function-calling, step-labeled reasoning, and more. Optimized for efficiency Hermes 3 was meticulously trained using synthesized data and supervised fine-tuning on Meta's Llama 3.1 405B base model. This was followed by reinforcement learning from human feedback (RLHF) and finally, quantization using Neural Magic's FP8 method. This optimization effectively reduces the model's VRAM and disk requirements by approximately 50%, allowing it to run on a single node. "Since the start of my journey in AI I wanted to bring about the realization of an open source frontier level model that aligns to you, the user - not some corporation or higher authority before the user. Today, with Hermes 3 405B, we've achieved that goal, a model that is frontier level, but truly aligned to you. Thanks to our hard work on data synthesis and post training research, we were able to make a dataset that is fully synthetic over almost a year in the making to train Hermes 3 - and will be releasing much more to come." -Teknium, cofounder of Nous Research For those seeking dedicated access and flexibility, Hermes 3 can run on a single node (available on-demand on Lambda's Cloud), or quickly scale to a multi-node 1-Click Cluster for further fine-tuning using Lambda's scalable cluster infrastructure. Try Hermes 3 for free - for a limited time! We're excited to offer the AI/ML community free access to Hermes 3 through Lambda's new Chat Completions API, fully compatible with the OpenAI API. It provides endpoints for creating completions, chat completions and listing models. No complex setup is required--simply generate a Cloud API key from Lambda's dashboard (sign-up) and start exploring with our documentation's help. For a more interactive experience, we're also providing a simple chat interface: try your prompts in Lambda Chat! More on the Deep Learning Blog: [X] Hugging Face x Lambda: Whisper Fine-Tuning Event Lambda is thrilled to team up with Hugging Face, a community platform that enables users to build,... how-do-i-fine-tune-llama-2-on-lambda-gpu-cloud Fine tuning Meta's LLaMA 2 on Lambda GPU Cloud This blog post provides instructions on how to fine tune Llama 2 models on Lambda Cloud using a... How to fine tune stable diffusion: how we made the text-to-pokemon model at Lambda How to fine tune stable diffusion: how we made the text-to-pokemon model at Lambda Stable Diffusion is great at many things, but not great at everything, and getting results in a... [lambda-log] Resources GPU Benchmarks Blog Lambda Stack Documentation Forum Research Company About Careers Professional Services Partners Support Technical Support Partner Portal Contact Contact Us P. 1 (866) 711-2025 --------------------------------------------------------------------- (c) 2024 All rights reserved. Terms of Service Privacy Policy