https://dvc.org/ by Iterative

Tools
* DataChain: Wrangle unstructured data in Python using AI helpers at scale
* DVC Studio: Track experiments and share insights from ML projects
* VS Code Extension: Local ML model development and experiment tracking

What's new
Scalable PDF Document Processing with DataChain and Unstructured.io. Extract and parse text from documents and create vector embeddings in a scalable and distributed way (and less than 70 lines of code).

DataChain Open-Source Release
A New Way to Manage your Unstructured Data

Data Version Control - and much more - for the GenAI era
Free and open source, forever. Manage and version images, audio, video, and text files in storage and organize your ML modeling process into a reproducible workflow.

* GenAI (DataChain): Explore and enrich annotated datasets with custom embeddings, auto-labeling, and bias removal at billion-file scale, without modifying your data.
* Data and model versioning (DVC, 13.6K on GitHub): Connect to versioned data sources and code with pipelines, track experiments, register models, all based on GitOps principles.

Get Started with DataChain
* Filter a billion samples in seconds: Datasets are getting larger, but the ability to iterate rapidly and efficiently is as important as ever.
* Create datasets from queries: Save the results of a query in a dataset that you can use to train your ML models.

DataChain and DVC: Better Together
Build the datasets you need without modifying your data sources. Create pipelines that connect your versioned datasets, code, and models together for effective experiment tracking the GitOps way.

Get Started with DVC
* Connect storage to repo: Keep large data and model files alongside code and share via your cloud storage.
* Configure steps as you go: Declare dependencies and outputs at each step to build reproducible end-to-end pipelines.
* Track experiments in Git: Track experiments in your repo, compare results and restore entire experiment states cross-team.

Download
* DVC: pip, conda, brew, macOS, Windows, Linux Deb, Linux RPM
* DVC for VS Code: VS Code Extension

Empowering thousands of users and customers from startups to Fortune 500 companies
Aicon, Billie, Cyclica, Degould, Huggingface, Inlab Digital, UBS, Mantis, Papercup, Pieces, Sicara, UKHO, XP Inc, Kibsi, Summer Sports, Motorway
By Iterative - an open platform to operationalize AI