[HN Gopher] Update on Llama adoption
___________________________________________________________________
Update on Llama adoption
Author : meetpateltech
Score : 142 points
Date : 2024-08-29 14:38 UTC (8 hours ago)
(HTM) web link (ai.meta.com)
(TXT) w3m dump (ai.meta.com)
| nikolayasdf123 wrote:
| LLaVA is pretty great
| simonw wrote:
| Do you know of any good paid API providers offering LLaVA? I
| want to experiment with it a bunch more without having to
| host it locally myself.
| nikolayasdf123 wrote:
| nope. I am self-hosting. Support is pretty good actually.
| llama.cpp supports it (v1.6 too, and via its OpenAI-compatible
| API server). ollama supports it. Open WebUI chat too.
|
| Using it now on a desktop (I am in China, so no OpenAI here)
| and in a cloud cluster for a project.
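|
| If you want to script against a local instance, here is a
| minimal sketch in Python (assuming the ollama server on its
| default port 11434 and that you've already run `ollama pull
| llava`; the image path is a placeholder):
|
|     import base64, json, urllib.request
|
|     # Read and base64-encode the image we want described.
|     with open("photo.jpg", "rb") as f:
|         img = base64.b64encode(f.read()).decode()
|
|     # Call ollama's /api/generate endpoint with the image attached.
|     req = urllib.request.Request(
|         "http://localhost:11434/api/generate",
|         data=json.dumps({
|             "model": "llava",
|             "prompt": "Describe this image in one sentence.",
|             "images": [img],
|             "stream": False,
|         }).encode(),
|         headers={"Content-Type": "application/json"},
|     )
|     print(json.loads(urllib.request.urlopen(req).read())["response"])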
| xyc wrote:
| Cloudflare has it https://developers.cloudflare.com/workers-
| ai/models/llava-1....
|
| Locally it's actually quite easy to set up. I've made an app,
| https://recurse.chat/, which supports LLaVA 1.6. It takes a
| zero-config approach so you can just start chatting and the
| app downloads the model for you.
| xyc wrote:
| Just realized I read your blog about Llava llamafile which
| got me interested in local AI and made the app :)
|
| What's your reservation about running it locally?
| bfirsh wrote:
| https://replicate.com/yorickvp/llava-13b :)
| nuz wrote:
| What are you using it for? Curious if there's any interesting
| purposes I haven't thought of
| josefresco wrote:
| It is! Just downloaded it the other day and while far from
| perfect it's pretty neat. I uploaded a Gene Wilder/Willy Wonka
| & the Chocolate Factory meme and it incorrectly told me that it
| was Johnny Depp. Close I guess! I run LLaVA and Llama (among
| other models) using https://ollama.com
|
| As a "web builder" I do think these tools will be very useful
| for accessibility (eventually), specifically generating
| descriptive alt tags for images.
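|
| A rough sketch of that alt-text idea (again assuming a local
| ollama server with the llava model pulled; the directory and
| helper function are just illustrative):
|
|     import base64, json, pathlib, urllib.request
|
|     def describe(path):
|         # Ask the local llava model for a short description.
|         img = base64.b64encode(path.read_bytes()).decode()
|         body = json.dumps({"model": "llava", "stream": False,
|                            "prompt": "Write a short alt text "
|                                      "for this image.",
|                            "images": [img]}).encode()
|         req = urllib.request.Request(
|             "http://localhost:11434/api/generate", data=body,
|             headers={"Content-Type": "application/json"})
|         return json.loads(
|             urllib.request.urlopen(req).read())["response"].strip()
|
|     # Emit an img tag with generated alt text for each image.
|     for p in sorted(pathlib.Path("images").glob("*.jpg")):
|         print(f'<img src="{p.name}" alt="{describe(p)}">')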
| dakial1 wrote:
| ...says the owner of it.
|
| Now seriously: with Llama being only "sort of" open source, it
| does not seem to be something one can fork and develop/evolve
| without Meta, right? If one day Meta comes and says "we are
| closing Llama and evolving it in a proprietary mode from now
| on", would this Llama indie scene continue to exist?
|
| If this is the case wouldn't this be considered a dumping
| strategy by Meta, to cut revenue streams from other platforms
| (Gemini/OpenAI/Anthropic) and contain their growth?
| RicoElectrico wrote:
| The models can be fine-tuned, which is good enough.
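|
| A minimal LoRA fine-tuning sketch, assuming the Hugging Face
| transformers and peft libraries and that you've accepted the
| gated-download license for the weights:
|
|     from transformers import AutoModelForCausalLM
|     from peft import LoraConfig, get_peft_model
|
|     # Load the released weights (gated: requires accepting
|     # Meta's license on Hugging Face first).
|     model = AutoModelForCausalLM.from_pretrained(
|         "meta-llama/Meta-Llama-3.1-8B")
|
|     # Attach small trainable LoRA adapters; the base weights
|     # stay frozen, so this is tractable on modest hardware.
|     config = LoraConfig(r=16, lora_alpha=32,
|                         target_modules=["q_proj", "v_proj"],
|                         task_type="CAUSAL_LM")
|     model = get_peft_model(model, config)
|     model.print_trainable_parameters()
|     # ...then train the adapters with a standard training loop.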
| mupuff1234 wrote:
| Not good enough to be considered open source.
| bunderbunder wrote:
| Realistically the problem here might just be that the
| concept of open source doesn't really fit machine learning
| models very well, and we should stop trying to force it.
|
| Sharing the end product, but not the tools and resources
| used to produce it, is how open source has always worked.
| If I develop software for a commercial operating system
| using a commercial toolchain, and distribute the source
| code under GPL, we would call that software open source.
| Others who get the code don't automatically get the ability
| to develop it themselves, but that's kind of beside the
| point. I don't have the rights to publicly redistribute
| those tools, anyway; the only part I can put under an open
| source license is the part for which I have copyright.
|
| Training data for an LLM like Llama works similarly when it
| comes to copyright law. They don't own copyright and/or
| redistribution rights for all of it, so they _can't_ make
| it open, even if they want to.
|
| If that seems unsatisfying, that's because it is.
| Unfortunately, though, I don't think the Free Software
| community is going to get very far by continuing to try to
| fight today's openness and digital sovereignty battles
| using tactics and doctrine that were developed in the 20th
| century.
| koolala wrote:
| It does fit it. Perfectly. It's incredible. Like an
| Internet of all Human Knowledge released before 1965.
| OpenAI could have done this. The battle to me is just
| people respecting ideas instead of saying they are
| impossible or unnecessary because what we have is good
| enough.
| islewis wrote:
| "good enough" is incredibly subjective here. Maybe good
| enough for you, but there are many things that are not
| possible with either the dataset or the weights being
| available.
| talldayo wrote:
| And some things are impossible even with both the dataset
| and weights. Say you wanted to train the same model as is
| released, using Meta's hypothetically released training
| data. You _also_ need to know the starting parameters, the
| specific hardware and its quirks during training, _the
| order the data is trained in_, as well as any other
| preprocessing techniques used to treat the text.
|
| Considering how ludicrously expensive it would be to even
| attempt a ground-up retrain (as well as how it might be
| impossible), weights are enough for 99% of people.
| koolala wrote:
| Good enough? Please, please type out what a truly open source
| AI model with open weights and open data would be like. I
| picture it like a Tower of Babel! Very far from "good
| enough"!
| asdfksa wrote:
| Nice marketing that is written for investors. Let us translate:
|
| > By making our Llama models openly available we've seen a
| vibrant and diverse AI ecosystem come to life [...]
|
| They all use the same model and the same transformer algorithm.
| The model has an EULA, you need to apply for downloading it, the
| training data set and the training software are closed.
|
| > Open source promotes a more competitive ecosystem that's good
| for consumers, good for companies (including Meta), and
| ultimately good for the world.
|
| So the "competitive" system means that everyone uses Llama and
| PyTorch.
|
| > In addition to Amazon Web Services (AWS) and Microsoft's Azure,
| we've partnered with Databricks, Dell, Google Cloud, Groq,
| NVIDIA, IBM watsonx, Scale AI, Snowflake, and others to better
| help developers unlock the full potential of our models.
|
| Sounds really open.
| nightski wrote:
| Far more open than the competition. I'll take it.
| koolala wrote:
| Don't let a gift be a curse.
| ru552 wrote:
| >They all use the same model and the same transformer
| algorithm. The model has an EULA, you need to apply for
| downloading it, the training data set and the training software
| are closed.
|
| Everything in that sentence is false except the training data
| part.
|
| >So the "competitive" system means that everyone uses Llama and
| PyTorch.
|
| This sentence shows you don't understand the LLM landscape and
| it's also false.
|
| >Sounds really open
|
| Correct. They partner with practically every vendor available
| for inference, which isn't even needed if you run their models
| locally.
|
| Meta has done a lot of wrong things over the years. How they
| are approaching LLMs is not one of them.
| freilanzer wrote:
| > Everything in that sentence is false except the training
| data part.
|
| You do need to apply on Huggingface to download the model.
|
| > This sentence shows you don't understand the LLM landscape
| and it's also false.
|
| PyTorch definitely is the most used ML framework.
| arbven wrote:
| Could you provide a link for downloading the complete and
| exact training software for the latest models?
|
| You need to provide an email address and click a license
| agreement. Then you get a download link that expires after a
| day. I do not have to do this with the Linux kernel. Perhaps
| you are downloading from within Meta and are not exposed to
| these issues?
| ekianjo wrote:
| erm, it's still way more open than "OpenAI" or Anthropic...
| bob1029 wrote:
| > The model has an EULA, you need to apply for downloading it
|
| I am confused - I grabbed Ollama and pulled down some of these
| models. I don't recall having to go through any legal
| agreements. I just type: ollama pull llama3.1
|
| Maybe I missed something and am actually 10 steps behind. Who
| knows anymore. This whole space is totally insane to me.
| arbven wrote:
| https://github.com/meta-llama/llama3
|
| "To download the model weights and tokenizer, please visit
| the Meta Llama website and accept our License.
|
| Once your request is approved, you will receive a signed URL
| over email. Then, run the download.sh script, passing the URL
| provided when prompted to start the download.
|
| Pre-requisites: Ensure you have wget and md5sum installed.
| Then run the script: ./download.sh.
|
| Remember that the links expire after 24 hours and a certain
| amount of downloads. You can always re-request a link if you
| start seeing errors such as 403: Forbidden."
| bob1029 wrote:
| Go try it.
|
| https://ollama.com
|
| You may come away surprised.
| SushiHippie wrote:
| You still agree to this EULA by using it:
|
| https://ollama.com/library/llama3.1/blobs/0ba8f0e314b4
|
| > By clicking "I Accept" below or by using or
| distributing any portion or element of the Llama
| Materials, you agree to be bound by this Agreement.
| bob1029 wrote:
| > You still agree to this EULA by using it
|
| I think my lawyer would have a few things to say about
| automatic legal agreements hidden somewhere in source
| control.
|
| Is the ollama project part of Meta? Is that what's going
| on here?
| regularfry wrote:
| You never see that agreement with `ollama run`. It's not
| even a shrink-wrap licence - there's no indication
| there's a restriction _at all_ between requesting the
| model and the API receiving requests for it. This
| situation is probably going to end up with the ollama
| folks getting a slap on the wrist and told to implement
| some shrink-wrap wording, but until then nobody can be
| bound by that licence because Meta can't demonstrate
| that anyone has seen the offer.
| greentea23 wrote:
| I think this is just the ollama site rehosting in violation
| of the license (unless there is some fine print I am
| missing). Huggingface makes you login and accept the
| agreement.
| rjdagost wrote:
| Meta has spent massive sums of money to train these models and
| they've released the models to the public. You can fine-tune
| the models. You can see the source code and the architecture of
| the model. The EULA is commercially-friendly.
|
| You are free to quibble over how truly "open source" these
| models are, but I am very thankful that Meta has released them.
| koolala wrote:
| Thank them then. Please don't use your gratitude to also wash
| out an entire cultural idea because billionaires make you
| grateful.
| kfhga wrote:
| Open source developers have spent far more time to develop
| the truly free stack that Meta uses to power its business in
| the first place.
|
| I am grateful to these developers. I am not grateful for a
| half open release and the redefinition of established terms.
| Which, judging by the downvoting in this thread, are now
| spread with fire and sword.
| phyrex wrote:
| A lot of these open source developers that made and
| improved this "truly free stack" are employed by meta and
| other big techs
| alsjas wrote:
| The stack was very usable in 2010. At that time, _some_
| gcc and kernel developers were employed by SuSE and
| RedHat. It was not common to be employed by a large
| corporation to work on open source.
|
| Projects like Python were completely usable then. But the
| corporations came, infiltrated existing projects and
| added often useless things. Python is not much better now
| than in 2010.
|
| So you have perhaps React and PyTorch. That is a tiny bit
| of the huge OSS stack. Does Meta pay for ncurses? for
| xterm? Of course not, it only supports flashy projects
| that are highly marketable and takes the rest for
| granted.
|
| So no, only a tiny fraction of the really important OSS
| devs are employed by FAANG.
| talldayo wrote:
| > Does Meta pay for ncurses? for xterm?
|
| Should they? Both of those are client-side software that
| aren't even really being monetized or profited from by
| Meta. You could maybe get mad at Meta's employees for not
| donating to the software they rely on, but in the case of
| ncurses and xterm they're both provided without cost.
| They're not even server-side software, much less a
| deliberate infrastructure decision.
|
| There's an oddly extremist sect of people that seem to
| entirely misunderstand what GNU and Free software is. It
| does not exist to stop people from charging money for
| software. It does not exist to prevent private interests
| or corporations from contributing to projects. It does
| not exist to solicit donations from its users. All of
| these are _options_ that some GNU or FOSS projects can
| choose to embody, not a static rule that they must all
| abide by. Since _Cathedral and the Bazaar_ was published,
| people have been scrutinizing different approaches to
| Free Software and contrasting their impacts. We don't
| have to champion one approach versus the other because
| they ultimately coexist and often end up stimulating FOSS
| development in the long run.
|
| > Python is not much better now than in 2010.
|
| C'mon, now. Next you're going to tell me about how great
| Perl is in 2024.
| adjhgG wrote:
| So, in this submission, Meta-adjacent opinions have called
| OSS supporters all sorts of names while being upvoted.
|
| At least Meta shows its true colors here. It must have
| hurt that the OSS position arrived in the Economist
| yesterday, so everyone is circling the wagons.
| talldayo wrote:
| Nobody here really has an agenda, least of all on HN
| where the majority of us hate Facebook like the living
| devil. Everyone remembers Cambridge Analytica and the
| ensuing drama, but we're also up-to-date on _all_ of
| FAANG's exploits. Meta is a supporter of Open Source,
| and arguably contributes multitudes more than Apple or
| Amazon does. This idea that strings-attached weights
| releases tank their reputation is stupid; Meta's
| contribution is self-evident, and only looks stupid when
| you hold them to nonsense standards that _no_ company
| could live up to. Really, which Fortune 500 companies are
| donating to xterm and ncurses anyways? Is there _anyone_?
|
| Again, there are arguments you can make that have weight
| but this isn't one of them. Every person with connection
| to wireless internet is running a firmware blob on their
| "open source" computer, it doesn't mean they're unable to
| bootstrap from source. Similarly, people that design Open
| Source infrastructure around Meta's binary weights aren't
| threatening their business at all. An "open" release of
| Llama wouldn't help those end-users, isn't even
| guaranteed to build Llama, and is too large to
| effectively fork or derive from. There's a good reason
| engineers aren't paying attention to the dramatic and
| insubstantial exposes that get written in finance rags.
| mkesper wrote:
| Llama isn't open source at all. Stop using that phrase for a
| product that even comes with an EULA.
| ekianjo wrote:
| open source is so ambiguous it's a useless expression at this
| stage. At least FOSS is less problematic.
| kzrdude wrote:
| https://opensource.org/osd
| benterix wrote:
| A few decades ago an organization was founded specifically to
| address statements such as this one. That's why some early
| Microsoft attempts at competing with open source had to be
| called "shared source", not "open source".
| CamperBob2 wrote:
| Unfortunately they didn't do a competent job at addressing
| it, which would have involved trademarking the phrase. As a
| result, "Open Source" means whatever you, I, Meta, or
| anyone else wants it to mean.
| riedel wrote:
| This is a hard stance if you talk just about the code (not the
| model weights). The Llama community licence is a bit weird and
| probably not an OSI-compliant licence, but close. Regarding
| weights it is different, but to me it is actually still
| difficult to understand how to apply copyright law here. Having
| said that, one might understand why certain stupid-looking
| clauses went into the code licence. As long as we do not
| understand copyright of model weights and do not have court
| rulings on the use of training data under different copyright
| regimes (US and EU), I would not care too much. We are still in
| the Wild West.
| tourmalinetaco wrote:
| "Close" is not good enough for using a term with a very
| specific meaning. OSI = open source, everything else is
| source-available (which its arguable that either even
| applies, because the source of the weights, the dataset, is
| not available).
|
| I agree that for Llama, things are weird and they want to
| cover their bases, and that its better than nothing, but the
| specific use of "open source" is a long-running corporate
| dilution of what open source really means and I am tired of
| it.
| segmondy wrote:
| Having access to a weight doesn't make it open. Else you can
| make the argument that Microsoft Word is open source because
| you have access to the binary.
| HPsquared wrote:
| Indeed, weights are literally binary data. Not human-
| readable!
| boroboro4 wrote:
| The access to modify (uptrain/finetune) these weights is
| the same between Meta & others, unlike with Word (where
| Microsoft has an advantage because they have code and can
| recompile it). I think this is the only thing which
| matters in practical terms.
| HPsquared wrote:
| Lots of binary executables and libraries can be
| customised too. That doesn't make them open-source.
| boroboro4 wrote:
| But that neither makes modification easy, nor is it how the
| owner of the code makes modifications themselves, and that's
| where the difference is.
| spunker540 wrote:
| Word is a good analogy here.
|
| The model is a static data file like a word doc.
|
| Meta open sourced the code to run inference on the model
| (ie the code for Microsoft word reading a doc file).
|
| They also open sourced the code to train/fine tune the
| model. (Ie the code for Microsoft word writing a doc file)
|
| Then they released a special doc (the llama 3 model), but
| didn't include the script they used to create that doc.
| owlbite wrote:
| I feel the arbitrary split between code and weights makes
| little sense when discussing if these models are "open
| source" in the copyleft meaning of the term. If the average
| user can't meaningfully use your product without agreeing to
| non-free terms then it's morally closed source.
|
| Anything else and you're just open-source-washing your
| proprietary technology.
| ensignavenger wrote:
| I tend to see weights as nothing more than data, data which
| may not even be copyrightable. But Meta keeps calling their
| data "open source" when they clearly do not release the
| model under an open source license, and that is terrible,
| awful and misleading.
| koolala wrote:
| Model weights are not the source. Why isn't that as obvious
| as a binary not being source code? A binary is compiled from
| source. You can open-license the data in a binary so it can
| be reverse-engineered / modded, but that doesn't make it open
| source.
| Der_Einzige wrote:
| Why should anyone care about following a license?
|
| Llama did not license its training data. It's almost impossible
| to prove a particular LLM was used to generate any particular
| text, and there's likely a bunch of illegal content within the
| dataset used to train the model (as is the case for most other
| LLMs)...
|
| So why should I care about following a license? They have no
| mechanism to enforce it. They have no mechanism to detect when
| it's being violated. They themselves have indicated hostility
| toward other licenses, so why not ignore theirs?
| OKRainbowKid wrote:
| Thousands of lawyers and billions to spend.
| Der_Einzige wrote:
| Again, how do you reliably prove that I used your model if
| I do a bunch of tricks to "hide" this?
|
| e.g. high temperature, exotic samplers (e.g. typicality
| sampling), using LoRA / soft prompts / representation
| engineering, using it as part of a chain on top of other
| AI, etc
|
| I don't care if every lawyer on earth is hired by Meta.
| Show me evidence that any particular LLM can be trivially
| fingerprinted based on its outputs. No, that "red token
| green token" paper on watermarking
| (https://arxiv.org/pdf/2301.10226) is not an example of
| this because of how trivial it is to defeat.
|
| Edit: I can't reply to the comment saying "Subpoena" but
| this commentator seems to think that using any LLM at all
| is grounds for a court to issue a subpoena requiring you to
| disclose which LLM you're using. If this actually happened,
| you'd see a massive chilling effect. Also, what stops
| someone from silently replacing the model with a non
| infringing one the moment someone starts asking questions?
|
| I'm pretty sure that most courts aren't capable of getting
| expert testimony which is good enough to deduce that I
| silently swapped out my blarg_3.1 model which was made
| using a 1/3 llama3 merge with something else with gloop_1.5
| which is no longer infringing.
|
| Like seriously, I again ask, given the idea of courts with
| warrants and subpoenas, why should I care about Meta's
| licensing?
|
| Edit2: If you're afraid of an employee leaking this info,
| _don 't tell your employees_. Good thing clever model
| merging leaves no traces if you delete metadata!
| JumpCrisscross wrote:
| > _how do you reliably prove that I used your model if I
| do a bunch of tricks to "hide" this?_
|
| Subpoenas.
| Der_Einzige wrote:
| If you live in a glass house, you won't start throwing
| stones.
|
| https://www.forbes.com/sites/alexandralevine/2023/12/20/s
| tab...
|
| https://www.airforcetimes.com/news/your-air-
| force/2024/03/26... (i.e. as source of where LLMs get
| their classified data from)
|
| If you scrape a large enough part of the internet, you're
| naturally going to get extremely illegal training data
| that you won't effectively filter out. I guarantee you
| that at least a tiny bit of highly classified information
| was not filtered out of most LLM training data (it wasn't
| found during the filter step), and it's quite remarkable
| that this and the above revelations have not led to
| anyone being subpoenaed or the like in regard to it.
|
| So no, I think that folks will be literally the
| _opposite_ of litigious on this issue. You want to play
| that game, Zuck? Let's see what happens when I hire my
| researchers to find the dirt on your models' dataset. We
| will see then who "settles out of court".
| ensignavenger wrote:
| An employee (or a hacker) leaking the fact, maybe even
| leaking details, is all it takes to get the bloodhounds
| called on you and now you are subject to discovery. Sure,
| you can lie, but if you are doing anything large enough
| to get attention, you are exposing yourself to possible
| liability.
| tobyjsullivan wrote:
| Trap streets
|
| https://en.wikipedia.org/wiki/Trap_street
| JumpCrisscross wrote:
| > _Llama isn't open source at all. Stop using that phrase for
| a product that even comes with an EULA_
|
| We don't have a commonly-accepted definition of what open
| source means for LLMs. Just negating Facebook's doesn't advance
| the discussion.
|
| The open-source community is fractured between those who want
| it to mean weights available (Facebook); weights and
| transformer available; weights with no use restrictions (I
| think this is you); and weights, transformer and training data
| with no restrictions (obviously not workable, not even the
| OSI's proposed definition goes that far [1]).
|
| In a world where the only LLMs are proprietary or Llama and the
| open-source community either remains fractured or chooses an
| unworkable ask, the latter will define how the term is used.
|
| [1] https://opensource.org/deepdive/drafts/open-source-ai-
| defini...
| lrrna wrote:
| We have that definition. The user needs the complete
| capability to reproduce what is distributed. Which means
| training data and the source used to train the model.
|
| If you distribute the output of bison (say, foo.c) and not
| the foo.y sources, you would get pushback.
|
| Then there is the EULA which makes it closed source right
| from the start.
| JumpCrisscross wrote:
| > _Which means training data and the source used to train
| the model_
|
| This makes an LLM (emphasis on large) that is open source
| per this definition legally impossible. In every
| jurisdiction of consequence.
|
| People like the term open source. It will get used. It is
| currently undefined. If the choice is an impractical
| definition and a bad one, we'll get stuck with the bad one.
| (See: hacker v cracker, crypto(currency) v crypto(graphy),
| _et cetera_.)
| lern_too_spel wrote:
| You shouldn't redefine a term if it doesn't apply. Just
| make a new term. This is how we ended up with the MiB vs.
| MB confusion where some systems incorrectly use the
| latter, making it effectively useless because the reader
| doesn't know if they really meant MB or actually meant
| MiB instead.
| JumpCrisscross wrote:
| > _shouldn't redefine a term if it doesn't apply. Just
| make a new term_
|
| Maybe. But this isn't how language works. Particularly
| not English.
|
| A corollary of No true Scotsman [1] is the person
| administering that purity test rarely gets to define a
| Scotsman.
|
| [1] https://en.wikipedia.org/wiki/No_true_Scotsman
| lern_too_spel wrote:
| That doesn't mean that we, the people who should know
| better, should contribute to making words lose meaning.
| lostmsu wrote:
| Why not?
| ghath wrote:
| This is not an organic use of an altered meaning. The
| term is imposed by huge corporations who force it on
| their developers and everyone who wants to get funding in
| the Llama ecosystem.
|
| It has nothing to do with natural language evolution.
| JumpCrisscross wrote:
| > _an organic use of an altered meaning. The term is
| imposed by huge corporations who force it_
|
| LLMs were entirely proprietary. In that context, Facebook
| put forward a foundational model one can run on their own
| hardware and called it open.
|
| At the time, nobody had defined what an open-source LLM
| was. People came out to say Llama wasn't open. But nobody
| rigorously proposed a practical definition. The OSI has
| started, but they're being honest about it being a draft.
| In the meantime, people are organically discussing open
| versus proprietary models in a variety of contexts, most
| of which aren't particularly concerned with the OSI's
| definitions.
| _proofs wrote:
| you are wildly conflating the difference between a
| naturally occurring evolution in a word's usage and
| meaning (ie: slang becomes canon becomes slang cycle),
| and intentionally misusing an existing, established
| meaning, and then pushing for that misuse (and
| misunderstanding) to become canon.
|
| one is pragmatic, ergonomic, and motivated by advancing
| relationships between communicating persons in a
| naturally occurring way because the common denominator is
| a quick race to mutual understanding.
|
| the other is manufactured, and not motivated by relation
| and advancing communication, but by how the shift in
| understanding benefits the one pushing for it, and often
| involves telling people how to think but is doing it
| through subversion.
| JumpCrisscross wrote:
| > _conflating the difference between a naturally
| occurring evolution in a word's usage and meaning (ie:
| slang becomes canon becomes slang cycle), and
| intentionally misusing an existing, established meaning_
|
| Sort of. I'm claiming the meaning of open source when
| applied to AI is unsettled. There are guiding principles
| that seem to imply Llama is _not_ open source. But merely
| pointing that out without offering a practical
| alternative definition almost guarantees that the
| intentionally-misused definition Facebook is promulgating
| becomes the accepted one.
| _proofs wrote:
| fair point, and i think i can appreciate more where you
| are coming from now.
|
| however i do not think the alternatives need to be
| proposed at this moment in time because right now, the
| discussion is about _holding people who intentionally
| reframe and misuse words_ accountable for their "double-
| speak" given the term's precedent.
|
| conventionally, and by historical collective
| understanding, it is not open source.
|
| i get you are attempting to highlight AI perhaps means
| this should be a definition reconsidered, but the irony
| here is the message itself is distorted due to the
| conflation, hence why consistency in language to me seems
| self-evident as a net good.
|
| there is most certainly a difference between naturally
| occurring (which our brains reeeally support in terms of
| language development and symbolic communication), and
| manufactured (and therefore pushing for a word, or more
| aptly, a perspective's adoption).
|
| i'd rather words manifest through a common need to reach
| mutual understanding as a means to relate to one another
| and this world, rather than having someone who stands to
| benefit from the change in definition, tell me what it
| means, and then expect me to just "agree", while they
| campaign around that and pretend it's the established
| definition (and not actually their own revised version).
|
| it'd be one thing if people who were throwing the term
| around so loosely would be transparent: "Hey, we know
| this isn't historically what everyone means by _OSS,
| but... that's OSS, your OSS, this is OSS, everything is
| OSS_"
|
| instead a lot of these narratives are standing on the
| shoulders of the original definition and context of what
| it means to be OSS, and therefore the pedigree,
| implications (and whatever else for PR spin/influence),
| and simultaneously diluting what it means in the process
| as the definition gets further and further obfuscated by
| those influencing the change, and its pedigree is relied
| on as a distraction away from what is being done, or
| actually said.
| lrrna wrote:
| It is not undefined. The wrong term is just repeated in
| marketing campaigns, by Meta developers and those
| building businesses on Llama until people believe it.
|
| They could use the more correct Open Weights (which is
| still a euphemism because of the EULA).
|
| But they do not, and they know perfectly well what they
| are doing. They are the ones responsible for these
| discussions, but they double down and blame the true OSS
| people.
| JumpCrisscross wrote:
| > _It is not undefined_
|
| Of course it is. Look at this thread. Look at the policy
| discussions around regulating AI. Hell, look at the OSI's
| _draft_ definition [1].
|
| Pretending something is rigorously defined the way you
| want it to be doesn't make it so.
|
| [1] https://opensource.org/deepdive/drafts/open-source-
| ai-defini...
| achrono wrote:
| That's fallacious. It's a problem of popular use (and
| sales incentives), not of definition.
|
| "Open-source" represents a cluster of concepts but at the
| core of it there is a specific definition in spirit at
| least -- you can see the source for yourself, and compile
| it for yourself.
|
| If the source is not available, why would you want to
| call it open-source? Just call it something else. As
| simple as that.
| halJordan wrote:
| Definitions are not prescriptive. You cannot define a
| word and then coerce everyone to use that definition via
| your word.
|
| Definitions flow out of usage. The definition clarifies
| how the word is used and what people mean when they do
| use it.
|
| You are, in a very literal sense, doing what Orwell et
| al. were so desperately against by actively controlling how
| language is permitted to be used.
| autoexec wrote:
| Definitions are very often prescriptive. Communication
| only works when people are able to understand the
| language being used. Imagine how well networks would
| function if we didn't have documented protocols that
| define what things mean and how they should be
| understood.
|
| Nobody can "force" someone else to use the correct
| definitions of words, but when people disregard their
| established meanings they risk communication breaking
| down and the confusion and misunderstandings that follow.
| If I went around speaking nonsense or making up my own
| invented definitions for established words I shouldn't
| expect to be understood and others would be perfectly
| right to correct me or ask that I stick to using the well
| understood and documented meaning of words if I expect to
| have a productive conversation.
|
| It's also perfectly fair to call out people who twist the
| meaning of words intentionally so that they can lie,
| mislead, and manipulate others. When it comes to
| products, companies can't just say "Words can mean
| anything I say they do! There are no rules!" to get away
| with false advertising.
| JumpCrisscross wrote:
| > _when people disregard their established meanings they
| risk communication breaking down_
|
| The meaning of words drifts in every living language.
|
| > _perfectly fair to call out people who twist the
| meaning of words intentionally_
|
| We don't have consensus around what open source means for
| LLMs. Facebook is pretending we do. But so is everyone in
| this thread claiming there is a single true definition of
| an open source LLM.
| autoexec wrote:
| > The meaning of words drifts in every living language.
|
| And it does result in a lot of confusion and
| misunderstanding until gradually people are taught the
| new definitions and how they are used. There are also
| groups of people who deliberately and continuously
| redefine words because they don't want to be widely
| understood. Some want to develop a means to signal to and
| identify others within their in-group, and some want to
| keep outsiders from understanding them so they can speak
| more openly in mixed company.
|
| > We don't have consensus around what open source means
| for LLMs.
|
| There are people who will argue about what open source
| means for anything. It's okay that open source means
| different things to different people, but it does result
| in confusion in discussions until people make their
| definitions clear.
|
| I don't think that Facebook has earned the benefit of the
| doubt, in fact they've more than earned our skepticism,
| so it's very reasonable to see their new definition of
| "open source" as being nothing but marketing rhetoric at
| best, or at worst, as an attempt to twist our still
| developing consensus on what open source means into
| something that violates the philosophy/spirit of the open
| source movement.
| ghath wrote:
| Orwell was in large part against re-definitions of
| existing words and the removal of words in order to
| reduce the basis for productive thought.
|
| Re-definition example: War is peace, freedom is slavery.
|
| Since the new euphemism for downloadable models is a re-
| definition, Orwell would have been 100% against it. In
| fact the new use of "open source" is an Orwellian term.
| _proofs wrote:
| could not have said it more succinctly, imo.
|
| i'm inclined to believe Orwell would have disagreed with
| OP, and would be asking himself -- why is there such a
| distinct push by those who benefit from the reframing, to
| reframe what Open Source means (compared to its already
| established meaning).
| _proofs wrote:
| imo, this is patently not what Orwell documented and
| criticized via narrative example, and certainly was not
| what i took as his position on the evolution of naturally
| occurring languages, in the alluded to book -- he is a
| writer, and i imagine no doubt understands the importance
| of language and shared associations, and what it means
| for a language to naturally evolve its vernacular
| (accepted, common, or developing) -- through usage, or
| otherwise.
|
| Orwell highlighted and warned against the consequences of
| people in influential positions of power _intentionally_
| distorting the collective associations with their new,
| updated versions of existing words, campaigning around
| those distortions, and intentionally reframing
| associations over time such that, the associations are
| polarizing and obfuscated, motivated by manipulation to
| benefit a select few, not motivated by advancing
| communication -- it certainly was not an example of
| society and its language naturally evolving
| "definitions" through usage.
|
| and the novel wasn't a criticism against slang, or
| association/vernacular changing/evolving over time
| throughout collective use, nor was it a stance on
| requiring fixed, permanent, unwavering definitions -- it
| only emphasized how important it is to have consistent
| meaning.
|
| he just wanted to encourage people to be skeptical of
| those pushing for the "different" or updated meaning of
| words, that clearly had a well-defined context, and
| association, previously -- why are they so dedicated and
| determined to "push" for a new meaning to get accepted,
| when there is a previously established and well accepted
| meaning already.
|
| that doesn't sound natural to me, that sounds
| manufactured.
| tbrownaw wrote:
| > _The user needs the complete capability to reproduce what
| is distributed._
|
| GPLv3 defines "source code" as the preferred form for
| making changes.
|
| For most normal software that is identical to what you'd
| use to recreate it... but the way to make changes to an LLM
| isn't to rebuild it, but is to run fine-tuning on it.
| j_maffe wrote:
| The Llama 3.1 transformer is available. But it does have some
| minor use restrictions, yes.
| ein0p wrote:
| The weight releases for LLMs are equivalent to binary
| releases for software. The "source code" here is the
| dataset, which is not disclosed.
| jrm4 wrote:
| Obligatory "Stallman Was Right."
|
| Once again: for those who are new here. There is Free Software,
| which has a usefully strict definition.
|
| And there is Open Source, the business-friendly -- but
| consequently looser -- other thing.
|
| You can like one or both, they both have advantages and
| drawbacks.
|
| But you cannot insist that "Open Source" has a very strict
| definition. It just doesn't. That's why the whole Free Software
| thing is needed, and IMHO, more important.
| j_maffe wrote:
| I agree except for your opinion in the end. But I know you're
| not alone in this opinion and it has been discussed to death.
| At this point it's more political than anything else in CS.
| jrm4 wrote:
| The _age_ of an opinion is not in ANY WAY an indicator of
| how important it is, nor does reducing it to being
| "political."
|
| Statements like this remind me that I really need to KEEP
| GOING with this.
| ensignavenger wrote:
| Open Source has every bit as strict of a definition as Free
| Software. Open Source as a term was coined and popularized by
| the OSI. The term may have occasionally been used in
| different contexts prior to the OSI, but it was never
| commonly applied to software before that.
|
| One could argue that the OSI should have gotten a trademark
| on the term. But the FSF doesn't have a trademark on the term
| "free Software" either, so the terms have approximately equal
| legal protections.
|
| Meta using the term "open source" to apply to their model
| data when their license isn't an open source license is
| dishonest at best.
| jrm4 wrote:
| I think I agree that they shouldn't use "open source," but
| again, this confusion highlights that you have to _put in
| work_ when it comes to this topic.
|
| Free Software has the GPL, and all its related, healthy
| controversy. It's not perfectly clear, but it's far more
| battle-tested than the much more nebulous "Open source."
|
| People who like "free software" put in work, and better
| understood that, to some extent, you can't have your cake
| and eat it too. "Open Source" is much more about a whole
| lot (to me, naive) wishful thinking.
|
| (The OSI is a bunch of companies in a trenchcoat, the
| creators of the GPL were more principled.)
| ensignavenger wrote:
| The FSF accepts a lot more licenses than just the GPL
| family of licenses as free software. As for the OSI, I
| don't think they have ever hidden who they are, how the
| organization is run, etc. https://opensource.org/about
|
| I just noticed they are currently discussing what "Open
| Source AI" should mean. You can join in and add your
| thoughts to the discussion.
| HarHarVeryFunny wrote:
| Sure, but you can't give something away for free that you
| don't own. What people complaining about Llama not being open
| source are talking about is the training data, and that isn't
| something that Meta owns for the most part.
| insane_dreamer wrote:
| The OSI doesn't have a monopoly on the definition of the words
| "open source". There is "open source as per the OSI Open Source
| Definition" and there are other interpretations.
| intellectronica wrote:
| Right, but if your "open source" package doesn't include ....
| the source, then you need some other definition.
| JackYoustra wrote:
| Fwiw Meta, under oath in a congressional session, said Llama
| is not open-source.
| KolmogorovComp wrote:
| source and context? Would be interested to know more about
| this
| codingwagie wrote:
| Probably will get flagged, but I get so annoyed by the cynical
| takes on Meta and their open source strategy. Meta is the only
| company releasing true open source (React, PyTorch, GraphQL) and
| now Llama. This company has done more for software development
| than any other in the last decade. And now they are burning down
| the competition in AI, making it accessible to all. Meta's
| software engineering compensation strategy nearly doubled the
| high end of developer compensation. Enough with the weird
| cynicism about their licensing policy.
| wilsonnb3 wrote:
| > Meta is the only company releasing true open source
|
| What? There are so many open source projects from huge
| companies these days.
|
| VSCode, .NET, TypeScript from MS
|
| Angular, Flutter, Kubernetes, Go, Android, Chromium from Google
| SushiHippie wrote:
| The llama models use a non open source license [0].
|
| Yes it is still better than not being able to access the
| weights at all, but calling these weights open source is not
| correct.
|
| [0] https://huggingface.co/meta-llama/Meta-
| Llama-3.1-8B/blob/mai...
| codingwagie wrote:
| Dude they spent billions on the model and then just open
| sourced it
| martindevans wrote:
| No, they spent billions on a model and released the
| weights, and that's fantastic! It's not open source
| though.
| koolala wrote:
| Look at Apple spending a billion on ads to say they respect
| your privacy or the Earth. Meta is buying / licensing a
| market sector in an industry they dominate where they have
| full control of our data. Our data is what got them that
| billion dollars.
| cmur wrote:
| If something requires an EULA it isn't open at all; it is just
| publicly available. By your logic, public services are "open
| source." There are myriad corporations that release actual open
| source software that is truly free to use. If you experience
| massive success with anything regarding Meta's LLMs, they're
| going to take a cut according to their EULA.
| bunderbunder wrote:
| I'm trying to figure out the logic that makes "free for
| commercial use with less than 700 million monthly active
| users" less open than "free for non-commercial use", which is
| the traditional norm for non-copyleft open source machine
| learning products. But I just can't get there. Could somebody
| spell it out for me?
| koolala wrote:
| Ideals vs. Gut Instinct
| eduction wrote:
| You're certainly entitled to the opinion that an agreement
| (as in EULA) is distinct from a license (as in GPL, MIT etc).
|
| But many legal minds close to this issue have moved to the
| position that there is no meaningful distinction, at least
| when it comes to licenses like GPL.
|
| For example: https://writing.kemitchell.com/2023/10/13/Wrong-
| About-GPLs
| bschmidt1 wrote:
| React? Surely I'm not the only one who remembers
| https://news.ycombinator.com/item?id=15050841
|
| I don't think Facebook/Meta is the beacon of open-source
| goodness you think it is. The main reason they created Yarn
| instead of iterating on npm was to apply the patent-clause
| license they wanted for React (before the community
| flipped out and demanded they re-license it as MIT). Early Vue
| adoption seemed mostly driven by that React licensing fiasco.
| gwern wrote:
| There is nothing weirdly cynical about it. This is a fact of
| life in Silicon Valley - that a lot of FLOSS is released for
| strategic reasons (such as building up a community before
| enclosing it to extract a profit), and not because the Grinch's
| heart grew 2 sizes one day. "Commoditize your complement":
| https://gwern.net/complement
|
| You can benefit a lot from it, and I have... but do be sure you
| know what you are ferrying on your back _before_ you decide to
| offer it a ride across the river.
| xgb84j wrote:
| How do you think Meta profits off React and PyTorch? Just
| marketing to get good candidates?
| Der_Einzige wrote:
| The same way that the US benefits from being the reserve
| currency of the world. Control of the ecosystem allows meta
| to define the rules of the game.
|
| Also it's bad when HN is downvoting fking GWERN
| gwern wrote:
| No. I think they use that web & ML software to help run
| their $1.3 trillion marketcap online social network
| company, on which, I am given to understand from US
| Congressional hearings, they sell ads.
| talldayo wrote:
| > a lot of FLOSS is released for strategic reasons (such as
| building up a community before enclosing it
|
| Not only is "a lot" of FOSS _not_ released like this, both
| free software and Meta's models cannot be monetized post-
| release. If Meta decides to charge money for Llama4, then
| everyone with access to the prior models can keep their
| access and even finetune/redistribute their model. There is
| no strategic flip Meta can attempt here without shotgunning
| their own foot off.
| gwern wrote:
| It absolutely is released like that. Please note that 'such
| as' does not mean 'all', 'a majority', or anything like
| that. It simply means 'one example'. (I eat many things
| 'such as' honey, yet that doesn't make me Winnie the Pooh.)
| And enclosures come up regularly here on HN where suddenly
| a company has embarked on a new incredible journey
| involving 'not releasing all the source code anymore under
| an OSI-approved license', so they are not even that exotic
| an outcome.
| koolala wrote:
| True? The cynicism is arguments over "true". If you trick
| yourself into believing this is what Open Source looks like (no
| source data) then you lose out on imagining what a real open
| source AI with open source data would be like.
| JumpCrisscross wrote:
| > _you lose out on imagining what a real open source AI with
| open source data would be like_
|
| Zero privacy?
| koolala wrote:
| Sounds like God. What is private? What is public? There is
| a line between an omnipresent God that sees everything
| (zero privacy) and an Open Source AI that collects and
| indexes all public information. I want an AI to be
| programmed / trained on our conversation here. (if you're
| reading this, God / FB / OpenAI / 2025 AI Source Dataset <3)
| YetAnotherNick wrote:
| "Open source" and "open source model" are not terms that came
| from a dictionary; they are based on what the community thinks
| they mean. As long as "open source model" doesn't cause
| confusion, which it does not since an open source model today
| just means an open weights model, fighting over it is not
| worth it.
| koolala wrote:
| If open source model means open weight model then open
| source model means nothing.
|
| I want it to mean something!
| blackeyeblitzar wrote:
| Here we go again with the co-opting of open source and the
| open-washing marketing. Llama isn't open source. Sharing weights
| is like sharing a compiled program. Without visibility into the
| training data, curation / moderation decisions, the training
| code, etc., Llama could be doing anything and we wouldn't know.
|
| Also, open source means the license used should be something
| standard, not proprietary, without restrictions on how you can
| use it.
| talldayo wrote:
| > Sharing weights is like sharing a compiled program.
|
| Not at all. They're only similar in the sense that both are a
| build artifact.
|
| > Without visibility into the training data, curation /
| moderation decisions, the training code, etc., Llama could be
| doing anything and we wouldn't know.
|
| "could be doing anything" is quite the tortured phrase, there.
| For one, model training is not deterministic and having the
| full training data would not yield a byte-perfect Llama
| retrain. For two, the released models are not Turing-complete
| or filled with viruses; you can open the weights yourself and
| confirm they're static and harmless. For three, training code
| exists for all 3 Llama models, and the reason nobody uses it
| is that it's prohibitively expensive to reproduce and has
| zero positive potential compared to finetuning what we have
| already.
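|
| A quick way to check that yourself, assuming the safetensors
| release format (the shard file name here is a placeholder):
|
|     from safetensors import safe_open
|
|     # Weights are just named tensors; there is no executable code.
|     with safe_open("model-00001-of-00004.safetensors",
|                    framework="pt") as f:
|         for name in f.keys():
|             print(name, f.get_slice(name).get_shape())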
|
| > Also open source means the license used should be something
| standard not proprietary, without restrictions on how you can
| use it.
|
| There are very much restrictions on redistribution for nearly
| every single Open Source license. Permissive licensing may not
| mean what you think it means.
| thor-rodrigues wrote:
| I think that focusing primarily on the discussion of what is or
| isn't open source software makes us miss an interesting point
| here: Llama lets users get performance similar to frontier
| models on their own systems, without having to send data to
| third-party sources.
|
| My company is building an application for a university client,
| for examining research data written in "human language" (mostly
| notes and docs).
|
| Due to the high confidentiality of the subjects, as they often
| deal with not-yet-patented information, we couldn't risk using
| frontier models, as a disclosure could break the novelty of the
| invention, therefore losing patentability.
|
| Now with Llama 3.1, we can simply run these models locally, on
| systems that are not even connected to the internet. LLMs are
| mostly good at examining massive amounts of research papers and
| information, at least for the application we are aiming at,
| saving thousands of hours of tiresome (and very boring) human
| labour.
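|
| As a concrete (if simplified) sketch of the fully local setup,
| assuming the llama-cpp-python bindings and a quantized GGUF
| file already on disk (file names here are placeholders):
|
|     from llama_cpp import Llama
|
|     # Everything runs on the local machine; no network calls.
|     llm = Llama(model_path="Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
|                 n_ctx=8192)
|
|     notes = open("research_notes.txt").read()
|     out = llm.create_chat_completion(messages=[
|         {"role": "system",
|          "content": "You summarise confidential research notes."},
|         {"role": "user",
|          "content": f"Summarise the key findings:\n\n{notes}"},
|     ])
|     print(out["choices"][0]["message"]["content"])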
|
| I am not trying to endorse Meta or Zuckerberg or anything like
| that, but at least in this aspect, I think Llama being
| "open-source" is a very good thing.
| jstummbillig wrote:
| To me it's fairly interesting how relatively little money it
| takes Meta to pose a risk to other model makers' businesses.
| Those makers are _dependent_ on running the model after they
| create it (because that is how they make money), while Meta
| does not even have to bear the cost of providing inference
| infrastructure at all to pose that risk.
| phyrex wrote:
| That's a funny definition of "little money"
| cosmojg wrote:
| They did say "relatively little money" which is arguably
| true.
| honorious wrote:
| Can you expand on the risk of breaking novelty?
|
| Is the concern that prompts could be re-used for training by
| the provider and such knowledge become part of the model?
| koolala wrote:
| Can you imagine how incredible an open source model would be
| for research / humanity, beyond the business needs right in
| front of us?
|
| An Open-Knowledge source with an Open-Intelligence that can
| guide you through the entire massive digital library of its
| own brain. Semantic data light-years beyond a Search Engine.
| talldayo wrote:
| No, I really can't imagine it. Extrapolating from our free
| commercially-licensed offerings it would seem most people
| would ignore it or share stories on Reddit about how FreeGPT
| poisoned their family when generating a potato salad recipe.
| koolala wrote:
| An open source model would be able to give you the sources
| of its potato salad recipe inspiration. It would be the
| best of both worlds. AI Knowledge + Real Open Human
| Knowledge.
| JumpCrisscross wrote:
| > _open source model would be able to give you the sources
| of its potato salad recipe_
|
| Kagi's LLM can already do that. I believe so can
| Perplexity's. Citing sources isn't something only open
| models can do.
| koolala wrote:
| I'm pretty sure Kagi is like a normal search engine with
| AI integration like Google. Not an AI designed to be open
| source with an open dataset of knowledge it was trained
| on.
| JumpCrisscross wrote:
| > _pretty sure Kagi is like a normal search engine with
| AI integration like Google_
|
| Sure. The point is the thing you said only an open-source
| model can do, it can do. Plenty of proprietary LLMs can
| cite sources.
|
| The plain truth is most of the benefits of open models
| are _not_ on the consumer side. (Or at least, I haven't
| seen any articulated.) They're on the producers'. Open
| models are better for those of us training models. That's
| partly why the open data debate is academic--very few
| people are training large foundation models because the
| compute and electricity costs are prohibitive.
| koolala wrote:
| I'm kinda hoping World Governments will use their Public
| Library infrastructure to train AI. Japan is my #1 hope
| with how they are opening public science knowledge.
| Super-computers have been prohibitive for a long time but
| national science institutions could be a great place for
| open source & open weight AI.
| JumpCrisscross wrote:
| > _hoping World Governments will use their Public Library
| infrastructure to train AI_
|
| Genuinely blown away the EU isn't doing this.
|
| In the U.S., the solution may be in carving a legal safe
| harbor for companies that release their models per the
| OSI's draft definition of open source.
| talldayo wrote:
| I bet Nvidia would quite like that too. Private _and_
| public-sector funding, theirs for the taking! Few
| businesses are ever so lucky.
| valine wrote:
| Just because you have the dataset doesn't mean you can
| generate a reference. Let's say I hand you a potato salad
| recipe and a copy of the entire internet. Say you somehow
| extract all potato salad recipes from the dataset (non
| trivial btw) and none of them are an exact match for the
| recipe the model generated. Now what?
| JumpCrisscross wrote:
| > _Open-Knowledge source with an Open-Inteligence that can
| guide you through the entire massive digital library of its
| own brain. Semantic data light-years beyond a Search Engine_
|
| This sounds like the usual AI marketing with the word "open"
| thrown in. It's not articulating something you can only do
| with an open source LLM (and doesn't define what that means).
|
| I'm personally not thrilled with how locked down LLMs are.
| But we'll need to do a better job at articulating (a) a
| definition and (b) the benefits of adhering to it versus the
| "you can run it on your own metal" definition Facebook is
| promulgating. Because a model meeting Facebook's definition
| has obvious benefits over proprietary models run on someone
| else's servers.
| koolala wrote:
| You can't imagine it :( Open data :(
|
| I believe our world fights to destroy ideas like this
| because our economy drives our entire life.
| JumpCrisscross wrote:
| > Can you imagine
|
| >> No
|
| >>> You can't imagine it
|
| You haven't articulated the idea you claim the "world
| fights to destroy". (Just throwing around the word open
| without elaboration isn't an idea.)
| koolala wrote:
| Data that is accessible. Knowledge. Truth. With an AI
| trained on it that can expose it in any expert / layman
| terms into any human language.
| JumpCrisscross wrote:
| You're undermining the case for an open source LLM by
| stating things fully-proprietary models do.
| koolala wrote:
| They don't make the source data accessible :(
| JumpCrisscross wrote:
| > _they don't make the source data accessible_
|
| No. But you haven't articulated why making everyone's
| Facebook chats public is a net good. What does opening
| that data up confer in practical benefits?
|
| Given what we know about LLMs, one trained only on
| public-domain data will underperform one trained on that
| _plus_ proprietary data. If you want source data
| available, you have to either concede the "open" models
| will be structurally handicapped or that all data must be
| public.
| koolala wrote:
| You think Llama is trained on people's private messages?
| :( That isn't good...
| JumpCrisscross wrote:
| > _You think Llama is trained on people's private
| messages?_
|
| Facebook says no, at least for Llama [1].
|
| [1] https://itlogs.com/facebook-uses-user-data-to-train-
| ai-but-l...
| tourmalinetaco wrote:
| I'm not sure what they're talking about, but I'll throw
| my hat into the ring. Copyright and other such systems
| are destroying any chance that we, as humanity, have of
| letting LLMs progress in an open and transparent manner.
| We have to hide the training data and make the weights a
| black box because of antiquated notions such as
| copyright. While I am willing to permit some level of
| exclusivity with creative works, 100+ years is
| unreasonable and stagnates human creativity even outside
| of ML tasks. In the 19th century, I could take a book I
| was raised on and write my own fanfiction, and because
| that book would have been public domain by the time I was
| an adult, I could add onto the work and the other fans of
| the previous work could build upon it with me. We see this
| with Sherlock Holmes, for instance. If I wanted to publish
| a book set in the world of Harry Potter I'd need to wait
| for JK Rowling to croak, and _then_ wait another _70
| YEARS_.
|
| We need dramatic reforms on copyright, as we've really
| let corporate interests crowd out our rights to human
| culture and ideas. While I alone cannot decide what we as
| a country should find reasonable, I can say I find 20
| years plus a 5-year extension perfectly reasonable, and
| that corporations should never have been able to pay off
| politicians to get what they wanted. Let alone Sonny
| Bono, that bastard, pushing bills that specifically
| benefited him.
|
| So, to reiterate, the idea I feel that corporations want
| to destroy is the idea that we, as a people, have rights
| to the works that form our popular culture and that no
| one man, let alone a faceless corporation, should be able
| to profit from a singular work for hundreds of years.
| valine wrote:
| If you have the model weights you have roughly the same
| opportunities as the company that trained the model. The code
| you need to run inference on the Llama weights is very much
| open source. The only thing you're missing out on is the
| training code, which is prohibitively expensive to run for
| most people anyway. Open source training isn't going to
| give you
| any unique insights into the "digital brain library" of your
| model.
|
| Also just to be clear, if you want to set up a RAG with an
| open weight model and a large dataset there's nothing
| stopping you. Download Red Pajama and Llama and give it a
| try.
|
| https://github.com/togethercomputer/RedPajama-Data
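|
| Here's a minimal sketch of the RAG idea, in case it's
| useful (untested; the model path and documents are
| hypothetical, and it assumes llama-cpp-python,
| sentence-transformers and numpy are installed):
|
|     # Embed documents, retrieve the closest one by cosine
|     # similarity, and stuff it into a Llama prompt.
|     import numpy as np
|     from llama_cpp import Llama
|     from sentence_transformers import SentenceTransformer
|
|     docs = [
|         "RedPajama is an open ~1.2T-token dataset.",
|         "Llama weights run locally via llama.cpp.",
|     ]
|
|     embedder = SentenceTransformer("all-MiniLM-L6-v2")
|     # normalized vectors -> dot product == cosine similarity
|     doc_vecs = embedder.encode(docs, normalize_embeddings=True)
|
|     # hypothetical local GGUF file
|     llm = Llama(model_path="llama-8b.Q4_K_M.gguf")
|
|     def answer(question):
|         q = embedder.encode([question],
|                             normalize_embeddings=True)[0]
|         best = docs[int(np.argmax(doc_vecs @ q))]
|         prompt = (f"Context: {best}\n\n"
|                   f"Question: {question}\nAnswer:")
|         out = llm(prompt, max_tokens=128)
|         return out["choices"][0]["text"]
|
|     print(answer("What is RedPajama?"))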
| HarHarVeryFunny wrote:
| You're not really asking for an open source model though,
| you're asking for open source training data set(s), which
| isn't something that Meta can give you. There are open source
| web scrapes such as The Pile, but much of the more
| specialized data needs to be licensed.
| koolala wrote:
| I'm asking for an "Open Source AI" and Meta and everyone
| supporting them is convinced it's impossible in our
| lifetimes :( We are living in the Dark Ages where
| Information = $$$. I pray to AI we one day grow out of this
| pointless destructive economic spiral towards the heat
| death of the Earth and collect and share open knowledge
| across all human cultures and history.
| HarHarVeryFunny wrote:
| Well, as long as by "AI" you are referring to pre-trained
| transformers, then what you are effectively asking for is
| the data used to pre-train them.
|
| OTOH why you want the data is not clear. You don't need
| it to run Meta's models for free, or to fine-tune them
| for your own needs. The only thing the data would let you
| do is pre-train from scratch, in other words to
| obtain the exact same set of weights that Meta is giving
| you for free.
| tourmalinetaco wrote:
| All of that data is already available; just look into "shadow
| libraries". Now, I do wish Meta and other companies would
| publish their data sets so that we, as humanity, could
| improve upon them and train even better LLMs, but the
| unfortunate reality is that copyright is holding us back.
| Most of what you say is essentially gibberish, but there is
| truth to the idea that LLMs would be better if they could not
| only use their weights but also reference and search their
| training data (which is collectively owned by humanity, by
| the way) and answer with that, not just with what they
| "think".
| koolala wrote:
| "the first frontier-level open source AI"
|
| They are never going to stop saying this or show us the actual
| source data. Imagine if they did... Do they even entertain the
| idea? Can they really not imagine Open Source AI being possible
| because of all the personal data they train on?
| Spivak wrote:
| Source code, yes. Source data, probably never unless the US
| government gets real cool with a lot of things real quickly.
| j_maffe wrote:
| The source code is available though.
| Spivak wrote:
| The source for training? Since when?
| pj_mukh wrote:
| Has anyone heard what effect Meta has said California's SB
| 1047 would have if it passes [1]?
|
| Looking forward to continued updates and releases of Llama (and
| SAM!) from Meta.
|
| [1] https://www.theverge.com/2024/8/28/24229068/california-
| sb-10...
| ksajadi wrote:
| This is a good one for everything related to SB1047
| https://pca.st/episode/44b41e28-5772-41c4-bcd7-5d7aa48d5120
| pama wrote:
| Yoshua Bengio is a very respected scientist with a well-
| deserved reputation, but this discussion is upsetting...
| "academia now trains much smaller models... 10^26 FLOPs is 10
| to the 26 floating point operations per second.. yes.. how
| big is that compared to GPT-4? It is much bigger than all the
| existing ones..." (FLOPs has a different meaning here: there
| is no "per second" in the law, it is a total; a single H100
| from last year performs about 1e15 FLOPs per second; llama3.1
| came close to the 1e26 limit this year, and the total
| training FLOPs of other models are not published; the
| research picture could change once compute gets even cheaper,
| but state laws move at glacial speed...).
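|
| A back-of-envelope sketch of what 1e26 total FLOPs means in
| hardware terms (the per-GPU throughput and utilization here
| are my own rough assumptions, not figures from the law):
|
|     # Express the 1e26-FLOPs threshold as H100-time.
|     # Assumed: ~1e15 FLOP/s per H100, ~40% utilization.
|     threshold = 1e26
|     eff_flops_per_s = 1e15 * 0.4
|     gpu_seconds = threshold / eff_flops_per_s
|     gpu_years = gpu_seconds / (3600 * 24 * 365)
|     print(f"~{gpu_years:,.0f} H100-years")  # ~7,927
|     # i.e. roughly 8,000 H100s running flat out for a
|     # full year to reach the threshold.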
|
| It is disheartening to see this much damage capacity in the
| hands of a couple of paranoid people who perhaps read the
| wrong sci-fi and had lots of power to influence others. If
| California passes this law, in a few years the world economy
| will be very different.
| EasyMark wrote:
| I think companies will simply quit doing business in
| California. They are killing the golden goose with all
| these regulations, just as they poisoned their own real
| estate market by allowing NIMBYs to dictate rules that
| keep housing prices high, and let petty criminals run
| rampant while cities like San Francisco continue to
| diminish and suffer.
| trissi1996 wrote:
| IMHO it's not; it just parrots the same old arguments for
| "safety", arguing against straw men and framing the other
| side as having wrong assumptions about AI safety, being
| unfair, etc., all while never engaging with the principled
| counter-arguments or examining its own assumptions.
|
| Here are some counter-points:
|
| Regulation:
|
| - Very little effort is made to evaluate the risks of over-
| regulation, regulatory capture, and counterproductive or
| plain wrong regulation
|
| - The downside of under-regulating is vastly overemphasized;
| most arguments boil down to "we have to act FAST now or X BAD
| thing might happen"
|
| - The risk of over- or wrongly regulating is vastly
| underemphasized, with the same FUD-style reasoning
|
| - According to one of the many straw-man arguments in the
| pod, I'm a libertarian against any and all regulation because
| I criticize possible regulatory capture. On the contrary, I
| would enthusiastically support regulation that foundation
| models have to be:
|
| -- given freely to public researchers/academics for in-depth
| independent safety-research
|
| -- open-weighted after a while (e.g. after ~a year, safety
| concerns should be mostly ruled out and new generations are
| out, so the ROI is likely already there; e.g. there's NO
| safety reason at all for ClosedAI not to release gpt-3.5,
| llama3 is already better)
|
| Proposed FLOP cut-off of SB 1047:
|
| - According to the pod, the cutoff is much more advanced
| than anything currently released.
|
| - The 10^26 FLOP cutoff is way too low; llama-405b is
| already ~4x10^25 FLOPs (quick check after this list).
|
| - 405B is maybe 20% smarter than 70B while taking roughly 6x
| the FLOPs to train (same token count, ~6x the parameters), so
| a model at the cutoff is very likely not much smarter than
| the current SOTA.
|
| - IMO none of the current SOTA models are very dangerous, but
| kill switch regulation is.
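|
| (Quick check of the ~4e25 figure, using the standard 6*N*D
| training-FLOPs approximation; the ~15.6T-token count is
| Meta's reported figure for Llama 3.1, the rest is rough:)
|
|     N = 405e9        # parameters
|     D = 15.6e12      # training tokens
|     flops = 6 * N * D
|     print(f"{flops:.2e}")          # 3.79e+25
|     print(f"{flops / 1e26:.0%}")   # 38% of the cutoff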
|
| Kill-Switches:
|
| - SB 1047 is (non-explicitly) calling for kill-switches on
| models over the cutoff, via the liability it places on model
| creators and the resulting market dynamics
|
| - Any kill-switch requirement means a complete dead end for
| any advanced open-weights AI. It means that huge corporations
| and governments will control any and all advanced AI
| development. This is top-down control of the maths you are
| allowed to run on your own computer; IMO that is Orwellian as
| fuck.
|
| China:
|
| - mentioning china is FUD 101, it's basically AI's "think of
| the children"
|
| - If they think they can stop China from building its own
| advanced LLMs, they're delusional. This regulation might even
| help them get there faster. They don't even need to steal;
| they're maybe a year or two behind the SOTA and catching up
| fast.
|
| I just don't get how so many people on a site with "hacker"
| in the name want to make it impossible to hack on these
| things for anyone not employed by the big corporate AI
| research labs.
| malwrar wrote:
| I haven't heard anything specific from Meta themselves, but I
| think the bill is short enough that we can reason about it as
| non-lawyers. Almost certainly they would have to stop
| releasing LLM weights, based on the very specific
| qualifications in the legislation. I don't actually know what
| the specific size limit would be, but based on the translated
| $ value in the text of the bill it would probably cover their
| 70B+ models (rough math below).
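|
| Rough translation of the bill's dollar floor back into
| compute, to see how it lines up with the FLOP threshold
| (assuming the oft-cited $100M figure; the rental price and
| utilization are also assumptions on my part):
|
|     # Assumed: ~$2/H100-hour, ~1e15 FLOP/s, ~40% utilization.
|     budget = 100e6                        # dollars
|     gpu_hours = budget / 2.0
|     flops = gpu_hours * 3600 * 1e15 * 0.4
|     print(f"{flops:.1e}")  # 7.2e+25, same ballpark as 1e26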
|
| Disturbing approach to mitigating AI harms, imo: the bill
| basically hopes it can limit the number of operators of a
| given class of model so as to make AI use easier to govern.
| This ignores the reality that we _already_ have large, easily
| modifiable models openly released (outside CA jurisdiction)
| which are likely capable of perpetuating "critical harms",
| and that the information needed to achieve the defined
| "critical harms" could be acquired by an individual simply by
| reading a few books. There's also no reason to assume that
| future models will require millions of dollars of compute to
| create; fulfilling the goals of this regulatory philosophy
| long term almost certainly requires banning general-purpose
| compute to come anywhere close to the desired reduction in
| the probability of some "critical harm" being perpetrated.
|
| We should be focusing on hardening society against the
| realities of the "critical harms" this bill identifies,
| rather than implicitly assuming the only reason we don't see
| them more often irl is that everyone is stupid. The current
| wave of paranoia around LLMs is just a symptom of people
| waking up to the fragility of the world we live in.
___________________________________________________________________
(page generated 2024-08-29 23:01 UTC)