[HN Gopher] Open source AI is the path forward
       ___________________________________________________________________
        
       Open source AI is the path forward
        
       Author : atgctg
       Score  : 1018 points
       Date   : 2024-07-23 15:08 UTC (7 hours ago)
        
 (HTM) web link (about.fb.com)
 (TXT) w3m dump (about.fb.com)
        
       | amusingimpala75 wrote:
        | Sure, but under what license? Because slapping "open source" on
        | the model doesn't make it open source if it's not actually
        | licensed that way. The 3.1 license still contains their non-
        | commercial clause (over 700M users) and requires derivatives,
        | whether fine-tunes or models trained on generated data, to use
        | the Llama name.
        
         | redleader55 wrote:
         | "Use it for whatever you want(conditions apply), but not if you
         | are Google, Amazon, etc. If you become big enough talk to us."
         | That's how I read the license, but obviously I might be missing
         | some nuance.
        
           | mesebrec wrote:
           | You also can't use it for training or improving other models.
           | 
           | You also can't use it if you're the government of India.
           | 
           | Neither can sex workers use it. (Do you know if your
           | customers are sex workers?)
           | 
            | There are also very vague restrictions around things like
            | discrimination, racism, etc.
        
             | war321 wrote:
             | They're actually updating their license to allow LLAMA
             | outputs for training!
             | 
             | https://x.com/AIatMeta/status/1815766335219249513
        
       | aliljet wrote:
       | And this is happening RIGHT as a new potential leader is emerging
       | in Llama 3.1. I'm really curious about how this is going to match
       | up on the leaderboards...
        
       | kart23 wrote:
       | > This is how we've managed security on our social networks - our
       | more robust AI systems identify and stop threats from less
       | sophisticated actors who often use smaller scale AI systems.
       | 
       | Ok, first of all, has this really worked? AI moderators still
       | can't capture the mass of obvious spam/bots on all their
        | platforms, Threads included. Second, AI detection doesn't work,
       | and with how much better the systems are getting, it's probably
       | never going to, unless you keep the best models for yourself, and
        | it's clear from the rest of the note that it's not Zuck's
        | intention to do so.
       | 
       | > As long as everyone has access to similar generations of models
       | - which open source promotes - then governments and institutions
       | with more compute resources will be able to check bad actors with
       | less compute.
       | 
       | This just doesn't make sense. How are you going to prevent AI
       | spam, AI deepfakes from causing harm with more compute? What are
       | you gonna do with more compute about nonconsensual deepfakes?
       | People are already using AI to bypass identity verification on
       | your social media networks, and pump out loads of spam.
        
         | OpenComment wrote:
         | Interesting quotes. _Less sophisticated actors_ just means
         | humans who already write in 2020 what the NYT wrote in early
          | 2022 to prepare for Biden's State of the Union 180deg policy
         | reversals (manufacturing consent).
         | 
         | FB was notorious for censorship. Anyway, what is with the
         | "actions/actors" terminology? This is straightforward
         | totalitarian language.
        
         | simonw wrote:
         | "AI detection doesn't work, and with how much better the
         | systems are getting, it's probably never going to, unless you
         | keep the best models for yourself"
         | 
         | I don't think that's true. I don't think even the best
         | privately held models will be able to detect AI text reliably
         | enough for that to be worthwhile.
        
       | blackeyeblitzar wrote:
       | Only if it is truly open source (open data sets, transparent
       | curation/moderation/censorship of data sets, open training source
       | code, open evaluation suites, and an OSI approved open source
       | license).
       | 
        | Open weights (and open inference code) are NOT open source, just
        | weak open-washing marketing.
       | 
       | The model that comes closest to being TRULY open is AI2's OLMo.
       | See their blog post on their approach:
       | 
       | https://blog.allenai.org/hello-olmo-a-truly-open-llm-43f7e73...
       | 
       | I think the only thing they're not open about is how they've
       | curated/censored their "Dolma" training data set, as I don't
       | think they explicitly share each decision made or the original
       | uncensored dataset:
       | 
       | https://blog.allenai.org/dolma-3-trillion-tokens-open-llm-co...
       | 
       | By the way, OSI is working on defining open source for AI. They
       | post weekly updates to their blog. Example:
       | 
       | https://opensource.org/blog/open-source-ai-definition-weekly...
        
         | JumpCrisscross wrote:
         | > _Only if it is truly open source (open data sets, transparent
         | curation /moderation/censorship of data sets, open training
         | source code, open evaluation suites, and an OSI approved open
         | source license)_
         | 
         | You're missing a then to your if. What happens if it's "truly"
         | open per your definition versus not?
        
           | blackeyeblitzar wrote:
            | I think you are asking what the benefits are? The main
            | benefit is that we can better trust what these systems are
            | doing. Or we can self-host them. If we just take the
            | weights, then it is unclear how these systems might be lying
            | to us or manipulating us.
           | 
           | Another benefit is that we can learn from how the training
           | and other steps actually work. We can change them to suit our
           | needs (although costs are impractical today). Etc. It's all
           | the usual open source benefits.
        
         | haolez wrote:
          | There is also the risk of companies like Meta introducing ads
          | in the training itself, instead of at inference time.
        
         | itissid wrote:
          | Yeah, though I do wonder, for a big model like 405B, whether
          | the original training recipe really matters for where models
          | are heading, practically speaking - which is smaller and more
          | specific.
          | 
          | I imagine its main use would be to train other models by
          | distilling them down with LoRA/quantization, etc. (assuming we
          | have a tokenizer), or to use them to generate training data
          | for smaller models directly.
         | 
         | But, I do think there is always a way to share without
         | disclosing too many specifics, like this[1] lecture from this
         | year's spring course at Stanford. You can always say, for
         | example:
         | 
         | - The most common technique for filtering was using voting LLMs
         | ( _without disclosing said llms or quantity of data_ ).
         | 
         | - We built on top of a filtering technique for removing poor
          | code using ____ by ____ authors ( _without disclosing or even
          | handwaving exactly how you filtered, but saying that you had to
          | filter_ ).
         | 
          | - We mixed a certain proportion of this data with that data to
          | make it better ( _without saying what proportion_ ).
         | 
         | [1]
         | https://www.youtube.com/watch?v=jm2hyJLFfN8&list=PLoROMvodv4...
        
       | JumpCrisscross wrote:
       | "The Heavy Press Program was a Cold War-era program of the United
       | States Air Force to build the largest forging presses and
       | extrusion presses in the world." This "program began in 1944 and
       | concluded in 1957 after construction of four forging presses and
       | six extruders, at an overall cost of $279 million. Six of them
       | are still in operation today, manufacturing structural parts for
       | military and commercial aircraft" [1].
       | 
       | $279mm in 1957 dollars is about $3.2bn today [2]. A public
       | cluster of GPUs provided for free to American universities,
       | companies and non-profits might not be a bad idea.
       | 
       | [1] https://en.m.wikipedia.org/wiki/Heavy_Press_Program
       | 
       | [2] https://data.bls.gov/cgi-
       | bin/cpicalc.pl?cost1=279&year1=1957...
        
         | CardenB wrote:
         | Doubtful that GPUs purchased today would be in use for a
         | similar time scale. Govt investment would also drive the cost
         | of GPUs up a great deal.
         | 
         | Not sure why a publicly accessible GPU cluster would be a
         | better solution than the current system of research grants.
        
           | JumpCrisscross wrote:
           | > _Doubtful that GPUs purchased today would be in use for a
           | similar time scale_
           | 
           | Totally agree. That doesn't mean it can't generate massive
           | ROI.
           | 
           | > _Govt investment would also drive the cost of GPUs up a
           | great deal_
           | 
           | Difficult to say this _ex ante_. On its own, yes. But it
           | would displace some demand. And it could help boost chip
           | production in the long run.
           | 
           | > _Not sure why a publicly accessible GPU cluster would be a
           | better solution than the current system of research grants_
           | 
           | Those receiving the grants have to pay a private owner of the
           | GPUs. That gatekeeping might be both problematic, if there is
           | a conflict of interests, and inefficient. (Consider why the
           | government runs its own supercomputers versus contracting
           | everything to Oracle and IBM.)
        
             | rvnx wrote:
             | It would be better that the government removes IP on such
             | technology for public use, like drugs got generics.
             | 
              | This way the government pays 2'500 USD per card, not 40'000
              | USD or whatever absurd price.
        
               | JumpCrisscross wrote:
               | > _better that the government removes IP on such
               | technology for public use, like drugs got generics_
               | 
               | You want to punish NVIDIA for calling its shots
               | correctly? You don't see the many ways that backfires?
        
               | gpm wrote:
               | No. But I do want to limit the amount we reward NVIDIA
               | for calling the shots correctly to maximize the benefit
               | to society. For instance by reducing the duration of the
               | government granted monopolies on chip technology that is
               | obsolete well before the default duration of 20 years is
               | over.
               | 
                | That said, it strikes me that the actual limiting factor
                | is fab capacity, not Nvidia's designs, and we probably
                | need to lift the monopolies preventing competition there
                | if we want to reduce prices.
        
               | JumpCrisscross wrote:
               | > _reducing the duration of the government granted
               | monopolies on chip technology that is obsolete well
               | before the default duration of 20 years is over_
               | 
               | Why do you think these private entities are willing to
               | invest the massive capital it takes to keep the frontier
               | advancing at that rate?
               | 
               | > _I do want to limit the amount we reward NVIDIA for
               | calling the shots correctly to maximize the benefit to
               | society_
               | 
               | Why wouldn't NVIDIA be a solid steward of that capital
               | given their track record?
        
               | gpm wrote:
               | > Why do you think these private entities are willing to
               | invest the massive capital it takes to keep the frontier
               | advancing at that rate?
               | 
               | Because whether they make 100x or 200x they make a
               | shitload of money.
               | 
               | > Why wouldn't NVIDIA be a solid steward of that capital
               | given their track record?
               | 
                | The problem isn't who is the steward of the capital. The
                | problem is that the economically efficient thing for a
                | single company to do (given sufficient fab capacity and a
                | monopoly) is to raise prices to extract a greater share of
                | the pie at the expense of shrinking the size of the pie.
                | I'm not worried about who takes the profit, I'm worried
                | about the size of the pie.
        
               | whimsicalism wrote:
               | > Because whether they make 100x or 200x they make a
               | shitload of money.
               | 
               | It's not a certainty that they 'make a shitload of
               | money'. Reducing the right tail payoffs absolutely
               | reduces the capital allocated to solve problems - many of
               | which are _risky bets_.
               | 
               | Your solution absolutely decreases capital investment at
               | the margin, this is indisputable and basic economics.
               | Even worse when the taking is not due to some pre-
               | existing law, so companies have to deal with the
               | additional uncertainty of whether & when future people
               | will decide in retrospect that they got too large a
               | payoff and arbitrarily decide to take it from them.
        
               | gpm wrote:
               | You can't just look at the costs to an action, you also
               | have to look at the benefits.
               | 
                | Of course I agree I'm going to stop marginal investments
                | from occurring into research into patentable technologies
                | by reducing the expected profit. But I'm going to do so
                | _very slightly_ because I'm not shifting the expected
                | value by very much. Meanwhile I'm going to greatly
                | increase the investment into the existing technology we
                | already have, and allow many more people to try to
                | improve upon it, and I'm going to argue the benefits
                | greatly outweigh the costs.
               | 
               | Whether I'm right or wrong about the net benefit, the
               | basic economics here is that there are both costs and
               | benefits to my proposed action.
               | 
               | And yes I'm going to marginally reduce future investments
               | because the same might happen in the future and that
               | reduces expected value. In fact if I was in charge the
               | same _would_ happen in the future. And the trade-off I
               | get for this is that society gets the benefit of the same
               | _actually_ happening in the future and us not being
               | hamstrung by unbreachable monopolies.
        
               | JumpCrisscross wrote:
                | > _I'm going to do so very slightly because I'm not
                | shifting the expected value by very much_
               | 
               | You're massively increasing uncertainty.
               | 
               | > _the same would happen in the future. And the trade-off
               | I get for this is that society gets the benefit_
               | 
               | Why would you expect it would ever happen again? What you
               | want is an unrealized capital gains tax. Not to nuke our
               | semiconductor industry.
        
               | whimsicalism wrote:
               | > But I'm going to do so very slightly because I'm not
               | shifting the expected value by very much
               | 
               | I think you're shifting it by a lot. If the government
               | can post-hoc decide to invalidate patents because the
               | holder is getting too successful, you are introducing a
               | substantial impact on expectations and uncertainty. Your
               | action is not taken in a vacuum.
               | 
               | > Meanwhile I'm going to greatly increase the investment
               | into the existing technology we already have, and allow
               | many more people to try to improve upon it, and I'm going
               | to argue the benefits greatly outweigh the costs.
               | 
               | I think this is a much more speculative impact. Why will
               | people even fund the improvements if the government might
               | just decide they've gotten too large a slice of the pie
               | later on down the road?
               | 
               | > the trade-off I get for this is that society gets the
               | benefit of the same actually happening in the future and
               | us not being hamstrung by unbreachable monopolies.
               | 
                | No, the trade-off is that materially less is produced.
                | These incentive effects are not small. Take, for
                | instance, drug price controls - a similar post-facto
                | taking because we feel that the profits from R&D are too
                | high. Introducing the proposed price controls leads to
                | hundreds fewer drugs over the next decade [0] - and
                | likely millions of premature deaths downstream of these
                | incentive effects. And that's with a policy with a clear
                | path towards short-term upside (cheaper drug prices).
                | Discounting GPUs by invalidating Nvidia's patents has a
                | much more tenuous upside and a clear downside.
                | 
                | [0]: https://bpb-us-w2.wpmucdn.com/voices.uchicago.edu/dist/d/312...
        
               | hluska wrote:
               | You have proposed state ownership of all successful IP.
               | That is a massive change and yet you have demonstrated
               | zero understanding of the possible costs.
               | 
               | Your claim that removing a profit motivation will
               | increase investment is flat out wrong. Everything else
               | crumbles from there.
        
               | gpm wrote:
               | No, I've proposed removing or reducing IP protections,
               | not transferring them to the state. Allowing competitors
               | to enter the market will obviously increase investment in
               | competitors...
        
               | IG_Semmelweiss wrote:
                | This is already happening - it's called China. There's a
               | reason they don't innovate in anything, and they are
               | always playing catch-up, except in the art of copying
               | (stealing) from others.
               | 
               | I do think there are some serious IP issues, as IP rules
               | can be hijacked in the US, but that means you fix those
               | problems, not blow up IP that was rightfully earned
        
               | whimsicalism wrote:
               | there is no such thing as a lump-sum transfer, this will
               | shift expectations and incentives going forward and make
               | future large capital projects an increasingly uphill
               | battle
        
               | hluska wrote:
               | So, if a private company is successful, you will
               | nationalize its IP under some guise of maximizing the
               | benefit to society? That form of government was tried
               | once. It failed miserably.
               | 
               | Under your idea, we'll try a badly broken economic
               | philosophy again. And while we're at it, we will
               | completely stifle investment in innovation.
        
               | Teever wrote:
               | There was a post[0] on here recently about how the US
               | went from producing woefully insufficient numbers of
                | aircraft to producing 300k by the end of World War 2.
               | 
               | One of the things that the post mentioned was the meager
               | profit margin that the companies made during this time.
               | 
                | But the thing is that this set the American auto and
                | aviation industry up to rule the world for decades.
               | 
               | A government going to a company and saying 'we need you
                | to produce this product for us at a lower margin than
                | you'd like to' isn't the end of the world.
               | 
               | I don't know if this is one of those scenarios but they
               | exist.
               | 
               | [0] https://www.construction-physics.com/p/how-to-
               | build-300000-a...
        
               | rvnx wrote:
               | In the case of NVIDIA it's even more sneaky.
               | 
                | They are an intellectual property company holding the
                | rights on plans to make graphics cards, not even a
                | company actually making graphics cards.
               | 
               | The government could launch an initiative "OpenGPU" or
               | "OpenAI Accelerator", where the government orders GPUs
               | from TSMC directly, without the middleman.
               | 
                | It may require some tweaking of the law to allow an
                | exception to intellectual property for "public interest".
        
               | whimsicalism wrote:
               | y'all really don't understand how these actions would
               | seriously harm capital markets and make it difficult for
               | private capital formation to produce innovations going
               | forward.
        
               | freeone3000 wrote:
               | If we have public capital formation, we don't necessarily
               | need private capital. Private innovation in weather
               | modelling isn't outpacing government work by leaps and
               | bounds, for instance.
        
               | whimsicalism wrote:
                | because it is extremely challenging to capture the
                | additional value that is being produced by better weather
                | forecasts, and generally the forecasts we have right now
                | are pretty good.
               | 
               | private capital is absolutely the driving force for the
               | vast majority of innovations since the beginning of the
               | 20th century. public capital may be involved, but it is
               | dwarfed by private capital markets.
        
               | freeone3000 wrote:
               | It's challenging to capture the additional value and the
               | forecasts are pretty good _because_ of _continual_ large-
               | scale government investment into weather forecasting.
               | NOAA is launching satellites! it's a big deal!
               | 
               | Private nuclear research is heavily dependent on
               | governmental contracts to function. Solar was subsidized
               | to heck and back for years. Public investment does work,
                | and does make a difference.
               | 
               | I would even say governmental involvement is sometimes
               | even the deciding factor, to determine if research is
               | worth pursuing. Some major capital investors have decided
               | AI models cannot possibly gain enough money to pay for
               | their training costs. So what do we do when we believe
               | something is a net good for society, but isn't going to
               | be profitable?
        
               | inetknght wrote:
                | > _y'all really don't understand how these actions would
               | seriously harm capital markets and make it difficult for
               | private capital_
               | 
               | Reflexively, I count that harm as a feature. I don't like
               | private capital markets because I've been screwed by
               | private capital on multiple occasions.
               | 
               | But you are right: I don't understand how these actions
               | would harm. So please do expand your concerns.
        
               | panarky wrote:
               | To the extent these are incremental units that wouldn't
               | have been sold absent the government program, it's
               | difficult to see how NVIDIA is "harmed".
        
               | kube-system wrote:
               | > It would be better that the government removes IP on
               | such technology for public use, like drugs got generics.
               | 
               | 20-25 year old drugs are a lot more useful than 20-25
               | year old GPUs, and the manufacturing supply chain is not
               | a bottleneck.
               | 
               | There's no generics for the latest and greatest drugs,
               | and a fancy gene therapy might run a _lot_ more than
               | $40k.
        
           | ygjb wrote:
           | Of course they won't. The investment in the Heavy Press
           | Program was the initial build, and just citing one example,
           | the Alcoa 50,000 ton forging press was built in 1955,
           | operated until 2008, and needed ~$100M to get it operational
           | again in 2012.
           | 
           | The investment was made to build the press, which created
           | significant jobs and capital investment. The press, and
           | others like it, were subsequently operated by and then sold
           | to a private operator, which in turn enabled the massive
           | expansion of both military manufacturing, and commercial
           | aviation and other manufacturing.
           | 
           | The Heavy Press Program was a strategic investment that paid
           | dividends by both advancing the state of the art in
           | manufacturing at the time it was built, and improving
           | manufacturing capacity.
           | 
           | A GPU cluster might not be the correct investment, but a
           | strategic investment in increasing, for example, the
           | availability of training data, or interoperability of tools,
           | or ease of use for building, training, and distributing
           | models would probably pay big dividends.
        
             | JumpCrisscross wrote:
             | > _A GPU cluster might not be the correct investment, but a
             | strategic investment in increasing, for example, the
             | availability of training data, or interoperability of
             | tools, or ease of use for building, training, and
             | distributing models would probably pay big dividends_
             | 
             | Would you mind expanding on these options? Universal
             | training data sounds intriguing.
        
               | ygjb wrote:
                | Sure. Just on the training front: building and
               | maintaining a broad corpus of properly managed training
               | data with metadata that provides attribution (for
               | example, content that is known to be human generated
               | instead of model generated, what the source of data is
               | for datasets such as weather data, census data, etc), and
               | that also captures any licensing encumbrance so that
               | consumers of the training data can be confident in their
               | ability to use it without risk of legal challenge.
               | 
               | Much of this is already available to private sector
               | entities, but having a publicly funded organization
               | responsible for curating and publishing this would enable
               | new entrants to quickly and easily get a foundation
               | without having to scrape the internet again, especially
               | given how rapidly model generated content is being
               | published.
        
               | mnahkies wrote:
               | I think the EPC (energy performance certificate) dataset
               | in the UK is a nice example of this. Anyone can download
               | a full dataset of EPC data from
               | https://epc.opendatacommunities.org/
               | 
               | Admittedly it hasn't been cleaned all that much - you
               | still need to put a bit of effort into that (newer
               | certificates tend to be better quality), but it's very
               | low friction overall. I'd love to see them do this with
               | more datasets
        
             | dmix wrote:
             | I don't think there's a shortage of capital for AI...
             | probably the opposite
             | 
             | Of all the things to expand the scope of government
             | spending why would they choose AI, or more specifically
             | GPUs?
        
               | devmor wrote:
               | There may however, be a shortage of capital for _open
               | source_ AI, which is the subject under consideration.
               | 
               | As for the why... because there's no shortage of capital
               | for AI. It sounds like the government would like to
               | encourage redirecting that capital to something that's
               | good for the economy at large, rather than good for the
               | investors of a handful of Silicon Valley firms interested
               | only in their own short term gains.
        
               | hluska wrote:
               | Look at it from the perspective of an elected official:
               | 
               | If it succeeds, you were ahead of the curve. If it fails,
               | you were prudent enough to fund an investigation early.
               | Either way, bleeding edge tech gives you a W.
        
             | whimsicalism wrote:
              | there are many things I think are more capital-constrained,
              | if the government is trying to subsidize things.
        
           | jvanderbot wrote:
           | A much better investment would be to (somehow) revolutionize
           | production of chips for AI so that it's all cheaper, more
           | reliable, and faster to stand up new generations of software
           | and hardware codesign. This is probably much closer to the
           | program mentioned in the top level comment: It wasn't to
           | produce one type of thing, but to allow better production of
           | any large thing from lighter alloys.
        
         | light_hue_1 wrote:
         | The problem is that any public cluster would be outdated in 2
         | years. At the same time, GPUs are massively overpriced.
         | Nvidia's profit margins on the H100 are crazy.
         | 
         | Until we get cheaper cards that stand the test of time,
         | building a public cluster is just a waste of money. There are
         | far better ways to spend $1b in research dollars.
        
           | JumpCrisscross wrote:
           | > _any public cluster would be outdated in 2 years_
           | 
           | The private companies buying hundreds of billions of dollars
           | of GPUs aren't writing them off in 2 years. They won't be
           | cutting edge for long. But that's not the point--they'll
           | still be available.
           | 
            | > _Nvidia's profit margins on the H100 are crazy_
           | 
           | I don't see how the current practice of giving a researcher a
           | grant so they can rent time on a Google cluster that runs
           | H100s is more efficient. It's just a question of capex or
           | opex. As a state, the U.S. has a structual advantage in the
           | former.
           | 
           | > _far better ways to spend $1b in research dollars_
           | 
           | One assumes the U.S. government wouldn't be paying list
           | price. In any case, the purpose isn't purely research ROI.
           | Like the heavy presses, it's in making a prohibitively-
           | expensive capital asset generally available.
        
           | ninininino wrote:
           | What about dollar cost averaging your purchases of GPUs? So
           | that you're always buying a bit of the newest stuff every
           | year rather than just a single fixed investment in hardware
           | that will become outdated? Say 100 million a year every year
           | for 20 years instead of 2 billion in a single year?
        
         | fweimer wrote:
         | Don't these public clusters exist today, and have been around
         | for decades at this point, with varying architectures? In the
         | sense that you submit a proposal, it gets approved, and then
         | you get access for your research?
        
           | JumpCrisscross wrote:
           | Not--to my knowledge--for the GPUs necessary to train
           | cutting-edge LLMs.
        
             | Maxious wrote:
              | All of the major cloud providers offer grants for public
              | research:
              | https://www.amazon.science/research-awards
              | https://edu.google.com/intl/ALL_us/programs/credits/research
              | https://www.microsoft.com/en-us/azure-academic-research/
             | 
             | NVIDIA offers discounts
             | https://developer.nvidia.com/education-pricing
             | 
             | eg. for Australia, the National Computing Infrastructure
             | allows researchers to reserve time on:
             | 
             | - 160 nodes each containing four Nvidia V100 GPUs and two
             | 24-core Intel Xeon Scalable 'Cascade Lake' processors.
             | 
             | - 2 nodes of the NVIDIA DGX A100 system, with 8 A100 GPUs
             | per node.
             | 
             | https://nci.org.au/our-systems/hpc-systems
        
           | NewJazz wrote:
           | This is the most recent iteration of a national platform.
           | They have tons of GPUs (and CPUs, and flash storage) hooked
           | up as a Kubernetes cluster, available for teaching and
           | research.
           | 
           | https://nationalresearchplatform.org/
        
         | epaulson wrote:
         | The National Science Foundation has been doing this for
         | decades, starting with the supercomputing centers in the 80s.
         | Long before anyone talked about cloud credits, NSF has had a
         | bunch of different programs to allocate time on supercomputers
         | to researchers at no cost, these days mostly run out of the
          | Office of Advanced Cyberinfrastructure. (The office name is from
         | the early 00s) - https://new.nsf.gov/cise/oac
         | 
         | (To connect universities to the different supercomputing
         | centers, the NSF funded the NSFnet network in the 80s, which
         | was basically the backbone of the Internet in the 80s and early
         | 90s. The supercomputing funding has really, really paid off for
         | the USA)
        
           | JumpCrisscross wrote:
           | > _NSF has had a bunch of different programs to allocate time
           | on supercomputers to researchers at no cost, these days
            | mostly run out of the Office of Advanced Cyberinfrastructure_
           | 
           | This would be the logical place to put such a programme.
        
             | alephnerd wrote:
             | The DoE has also been a fairly active purchaser of GPUs for
             | almost two decades now thanks to the Exascale Computing
             | Project [0] and other predecessor projects.
             | 
              | The DoE helped subsidize development of Kepler, Maxwell,
              | Pascal, etc., along with the underlying stack like NVLink,
              | NGC, CUDA, etc., either via purchases or by allowing grants
              | to be commercialized by Nvidia. They also played matchmaker
              | by helping connect private sector research partners with
              | Nvidia.
             | 
             | The DoE also did the same thing for AMD and Intel.
             | 
             | [0] - https://www.exascaleproject.org/
        
           | jszymborski wrote:
           | As you've rightly pointed out, we have the mechanism, now
           | let's fund it properly!
           | 
           | I'm in Canada, and our science funding has likewise fallen
           | year after year as a proportion of our GDP. I'm still
            | benefiting from A100 clusters funded by taxpayer dollars,
           | but think of the advantage we'd have over industry if we
           | didn't have to fight over resources.
        
             | xena wrote:
             | Where do you get access to those as a member of the general
             | public?
        
           | cmdrk wrote:
           | Yeah, the specific AI/ML-focused program is NAIRR.
           | 
           | https://nairrpilot.org/
           | 
           | Terrible name unless they low-key plan to make AI
           | researchers' hair fall out.
        
         | blackeyeblitzar wrote:
         | What about distributed training on volunteer hardware? Is that
         | feasible?
        
           | oersted wrote:
            | It is an exciting concept - there's a huge wealth of gaming
            | hardware deployed that sits idle for most hours of the day,
            | and I'm sure people are willing to pay well above the
            | electricity cost for it.
           | 
           | Unfortunately, the dominant LLM architecture makes it
           | relatively infeasible right now.
           | 
           | - Gaming hardware has too limited VRAM for training any kind
           | of near-state-of-the-art model. Nvidia is being annoyingly
           | smart about this to sell enterprise GPUs at exorbitant
           | markups.
           | 
           | - Right now communication between machines seems to be the
           | bottleneck, and this is way worse with limited VRAM. Even
           | with data-centre-grade interconnect (mostly Infiniband, which
           | is also Nvidia, smart-asses), any failed links tend to cause
           | big delays in training.
           | 
           | Nevertheless, it is a good direction to push towards, and the
           | government could indeed help, but it will take time. We need
           | both a more healthy competitive landscape in hardware, and
           | research towards model architectures that are easy to train
           | in a distributed manner (this was also the key to the success
           | of Transformers, but we need to go further).
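            | 
            | As a rough back-of-envelope sketch of the interconnect
            | problem (all numbers below are assumptions, not
            | measurements): naive data-parallel training has to exchange
            | a full set of gradients every step, which home uplinks
            | simply can't keep up with.
            | 
            |     # Rough estimate of per-step gradient sync time; the
            |     # model size and bandwidth figures are assumptions.
            |     PARAMS = 8e9              # assume an 8B-parameter model
            |     GRAD_BYTES = PARAMS * 2   # fp16 gradients, ~16 GB/step
            | 
            |     links = {
            |         "consumer uplink (~30 MB/s)": 30e6,
            |         "datacenter InfiniBand (~25 GB/s)": 25e9,
            |     }
            |     for name, bw in links.items():
            |         print(f"{name}: ~{GRAD_BYTES / bw:,.1f} s per sync")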
        
           | codemusings wrote:
           | Ever heard of SETI@home?
           | 
           | https://setiathome.berkeley.edu
        
         | ks2048 wrote:
         | How about using some of that money to develop CUDA alternatives
         | so everyone is not paying the Nvidia tax?
        
           | lukan wrote:
            | It would probably be cheaper to negate some IP. There are
            | quite a few projects and initiatives to make CUDA code run on
            | AMD, for example, but as far as I know, they all stopped at
            | some point, probably out of fear of being sued into
            | oblivion.
        
           | whimsicalism wrote:
           | It seems like rocm is already fully ready for transformer
           | inference, so you are just referring to training?
        
             | janalsncm wrote:
             | ROCm is buggy and largely undocumented. That's why we don't
             | use it.
        
           | belter wrote:
           | Please start with the Windows Tax first for Linux users
           | buying hardware...and the Apple Tax for Android users...
        
           | zitterbewegung wrote:
            | Either you port TensorFlow (Apple) [1] or PyTorch to your
            | platform, or you allow CUDA to run on your hardware (AMD)
            | [2]. Companies are incentivized not to let NVIDIA have a
            | monopoly, but the thing is that CUDA is a huge moat due to
            | its compatibility with all the frameworks, and everyone knows
            | it. Also, all of the cloud and on-premises providers use
            | NVIDIA regardless.
           | 
           | [1] https://developer.apple.com/metal/tensorflow-plugin/ [2]
           | https://www.xda-developers.com/nvidia-cuda-amd-zluda/
        
           | erickj wrote:
           | That's the kind of work that can come out of academia and
           | open source communities when societies provide the resources
           | required.
        
         | prpl wrote:
         | Great idea, too bad the DOE and NSF were there first.
        
         | kjkjadksj wrote:
          | The size of the cluster would have to be massive or else your
          | job will be in the queue for a year. And even then, what are
          | you going to do - downsize the resources requested so you can
          | get in earlier? After a certain point it starts to make more
          | sense to just buy your own Xeons and run your own cluster.
        
         | Aperocky wrote:
          | Imagine if they had made a data center with 1957 electronics
          | that cost $279 million.
          | 
          | They probably wouldn't be using it now, because the phone in
          | your pocket is likely more powerful. Moore's law did end, but
          | data center hardware is still evolving orders of magnitude
          | faster than forging presses.
        
         | goda90 wrote:
         | I'd like to see big programs to increase the amount of cheap,
         | clean energy we have. AI compute would be one of many
         | beneficiaries of super cheap energy, especially since you
         | wouldn't need to chase newer, more efficient hardware just to
         | keep costs down.
        
           | Melatonic wrote:
            | Yeah, this would be the real equivalent of the program people
            | are talking about above. That, and investing in core
            | networking infrastructure (like cables) instead of just
            | giving huge handouts to certain corporations that then pocket
            | the money...
        
         | BigParm wrote:
         | So we'll have the government bypass markets and force the
         | working class to buy toys for the owning class?
         | 
         | If anything, allocate compute to citizens.
        
           | _fat_santa wrote:
           | > If anything, allocate compute to citizens.
           | 
           | If something like this were to become a reality, I could see
           | something like "CitizenCloud" where once you prove that you
           | are a US Citizen (or green card holder or some other
           | requirement), you can then be allocated a number of credits
           | every month for running workloads on the "CitizenCloud".
           | Everyone would get a baseline amount, from there if you can
           | prove you are a researcher or own a business related to AI
           | then you can get more credits.
        
         | aiauthoritydev wrote:
         | Overall government doing anything is a bad idea. There are
         | cases however where government is the only entity that can do
          | certain things. These are things that involve the military, law
          | enforcement, etc. Outside of this we should rely on private
          | industry and for-profit industry as much as possible.
        
           | pavlov wrote:
           | The American healthcare industry demonstrates the tremendous
           | benefits of rigidly applying this mindset.
           | 
           | Why couldn't law enforcement be private too? You call 911,
           | several private security squads rush to solve your immediate
           | crime issue, and the ones who manage to shoot the suspect
           | send you a $20k bill. Seems efficient. If you don't like the
           | size of the bill, you can always get private crime insurance.
        
             | sterlind wrote:
             | For a further exploration of this particular utopia, see
              | Snow Crash by Neal Stephenson.
        
           | chris_wot wrote:
           | That's not correct. The American health care system is an
           | extreme example of where private organisations fail overall
           | society.
        
           | fragmede wrote:
           | > Overall government doing anything is a bad idea.
           | 
            | that is so bereft of detail as to just be wrong. There are
           | things that government is good for and things that government
           | is bad for, but "anything" is just too broad, and reveals an
           | anti-government bias which just isn't well thought out.
        
           | goatlover wrote:
           | Why are governments a bad idea? Seems the human race has
           | opted for governments doing things since the dawn of
           | civilization. Building roads, providing defense, enforcing
            | rights, providing social safety nets, funding costly
            | scientific endeavors.
        
         | varenc wrote:
         | I just watched this 1950s DoD video on the heavy press program
         | and highly recommend it:
         | https://www.youtube.com/watch?v=iZ50nZU3oG8
        
         | spullara wrote:
        | It makes much more sense to invest in a next-generation fab for
        | GPUs than to buy GPUs, and it more closely matches this kind of
        | project.
        
       | maxdo wrote:
        | So that North Korea can create small call centers for cheaper,
        | since they can get these models for free?
        
         | HanClinto wrote:
         | The article argues that the threat of foreign espionage is not
         | solved by closing models.
         | 
         | > Some people argue that we must close our models to prevent
         | China from gaining access to them, but my view is that this
         | will not work and will only disadvantage the US and its allies.
         | Our adversaries are great at espionage, stealing models that
         | fit on a thumb drive is relatively easy, and most tech
         | companies are far from operating in a way that would make this
         | more difficult. It seems most likely that a world of only
         | closed models results in a small number of big companies plus
         | our geopolitical adversaries having access to leading models,
         | while startups, universities, and small businesses miss out on
         | opportunities.
        
         | tempfile wrote:
         | This argument implies that cheap phones are bad since
         | telemarketers can use them.
        
         | mrfinn wrote:
          | You guys really need to get over your bellicose view of the
          | world. Actually, before it destroys you. Really, it's not
          | necessary. Most people in the world just want to live in
          | peace, and see their children grow up happily. For each data
          | center NK would create there will be a thousand peaceful,
          | kind, and well-intentioned AI projects going on. Or maybe more.
        
       | the8thbit wrote:
       | "Eventually though, open source Linux gained popularity -
       | initially because it allowed developers to modify its code
       | however they wanted ..."
       | 
       | I find the language around "open source AI" to be confusing. With
       | "open source" there's usually "source" to open, right? As in,
       | there is human legible code that can be read and modified by the
       | user? If so, then how can current ML models be open source?
       | They're very large matrices that are, for the most part,
       | inscrutable to the user. They seem akin to binaries, which, yes,
       | can be modified by the user, but are extremely obscured to the
       | user, and require enormous effort to understand and effectively
       | modify.
       | 
       | "Open source" code is not just code that isn't executed remotely
       | over an API, and it seems like maybe its being conflated with
       | that here?
        
         | orthoxerox wrote:
         | Open training dataset + open steps sufficient to train exactly
         | the same model.
        
           | the8thbit wrote:
           | This isn't what Meta releases with their models, though I
           | would like to see more public training data. However, I still
           | don't think that would qualify as "open source". Something
            | isn't open source just because it's reproducible out of
            | composable parts. If one very critical and system-defining
            | part is a binary (or similar) without publicly available
            | source code, then I don't think it can be said to be "open
            | source". That would be like saying that Windows 11 is open
            | source because Windows Calculator is open source and it's a
            | component of Windows.
        
             | blackeyeblitzar wrote:
             | Here's one list of what is needed to be actually open
             | source:
             | 
             | https://blog.allenai.org/hello-olmo-a-truly-open-
             | llm-43f7e73...
        
             | orthoxerox wrote:
             | That's what I meant by "open steps", I guess I wasn't clear
             | enough.
        
               | the8thbit wrote:
               | Is that what you meant? I don't think releasing the
               | sequence of steps required to produce the model satisfies
               | "open source", which is how I interpreted you, because
               | there is still no source code for the model.
        
           | Yizahi wrote:
            | They can't release the training dataset if it was illegally
            | scraped from all over the web without permission :) (taps head)
        
         | bilsbie wrote:
         | Can't you do fine tuning on those binaries? That's a
         | modification.
        
           | the8thbit wrote:
           | You can fine tune the models, and you can modify binaries.
           | However, there is no human readable "source" to open in
           | either case. The act of "fine tuning" is essentially brute
           | forcing the system to gradually alter the weights such that
           | loss is reduced against a new training set. This limits what
           | you can actually do with the model vs an actual open source
           | system where you can understand how the system is working and
           | modify specific functionality.
           | 
            | Additionally, models can be (and are) fine-tuned via APIs, so
            | if that is the threshold required for a system to be "open
            | source", then that would also make the GPT-4 family and other
            | such API-only models which allow fine-tuning open source.
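            | 
            | For concreteness, a minimal sketch (assuming the Hugging
            | Face transformers and peft libraries, and a Llama checkpoint
            | held locally - the model name here is illustrative) of what
            | weight-level fine-tuning looks like when you actually have
            | the weights:
            | 
            |     import torch
            |     from transformers import AutoModelForCausalLM
            |     from transformers import AutoTokenizer
            |     from peft import LoraConfig, get_peft_model
            | 
            |     name = "meta-llama/Meta-Llama-3.1-8B"  # illustrative
            |     tok = AutoTokenizer.from_pretrained(name)
            |     model = AutoModelForCausalLM.from_pretrained(name)
            | 
            |     # Attach small low-rank adapters; only these get trained.
            |     cfg = LoraConfig(r=8, target_modules=["q_proj", "v_proj"],
            |                      task_type="CAUSAL_LM")
            |     model = get_peft_model(model, cfg)
            | 
            |     opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
            |     batch = tok("some training text", return_tensors="pt")
            |     loss = model(**batch, labels=batch["input_ids"]).loss
            |     loss.backward()   # gradient step nudges the weights
            |     opt.step()
            | 
            | The point being: every step follows a gradient over the
            | whole parameter set rather than editing a named piece of
            | functionality, which is the contrast with source code drawn
            | above.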
        
             | whimsicalism wrote:
             | I don't find this argument super convincing.
             | 
             | There's a pretty clear difference between the 'finetuning'
             | offered via API by GPT4 and the ability to do whatever sort
             | of finetuning you want and get the weights at the end that
             | you can do with open weights models.
             | 
             | "Brute forcing" is not the correct language to use for
             | describing fine-tuning. It is not as if you are trying
             | weights randomly and seeing which ones work on your dataset
             | - you are following a gradient.
        
               | the8thbit wrote:
               | "There's a pretty clear difference between the
               | 'finetuning' offered via API by GPT4 and the ability to
               | do whatever sort of finetuning you want and get the
               | weights at the end that you can do with open weights
               | models."
               | 
               | Yes, the difference is that one is provided over a remote
               | API, and the provider of the API can restrict how you
               | interact with it, while the other is performed directly
               | by the user. One is a SaaS solution, the other is a
               | compiled solution, and neither are open source.
               | 
               | ""Brute forcing" is not the correct language to use for
               | describing fine-tuning. It is not as if you are trying
               | weights randomly and seeing which ones work on your
               | dataset - you are following a gradient."
               | 
               | Whatever you want to call it, this doesn't sound like
               | modifying functionality in source code. When I modify
               | source code, I might make a change, check what that does,
               | change the same functionality again, check the new
               | change, etc... up to maybe a couple dozen times. What I
               | don't do is have a very simple routine make very small
               | modifications to all of the system's functionality, then
               | check the result of that small change across the broad
               | spectrum of functionality, and repeat millions of times.
        
               | Kubuxu wrote:
               | The gap between fine-tuning API and weights-available is
               | much more significant than you give it credit for.
               | 
               | You can take the weights and train LoRAs (which is close
               | to fine-tuning), but you can also build custom adapters
               | on top (classification heads). You can mix models from
               | different fine-tunes or perform model surgery (adding
               | additional layers, attention heads, MoE).
               | 
               | You can perform model decomposition and amplify some of
               | its characteristics. You can also train multi-modal
               | adapters for the model. Prompt tuning requires weights as
               | well.
               | 
               | I would even say that having the model is more potent in
               | the hands of individual users than having the dataset.
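                | 
                | For example, a minimal sketch of one of those options -
                | bolting a custom classification head onto the weights,
                | something a hosted fine-tuning API typically doesn't
                | expose (assumes transformers and torch; the checkpoint
                | name is illustrative):
                | 
                |     import torch.nn as nn
                |     from transformers import AutoModel, AutoTokenizer
                | 
                |     name = "meta-llama/Meta-Llama-3.1-8B"  # illustrative
                |     tok = AutoTokenizer.from_pretrained(name)
                |     backbone = AutoModel.from_pretrained(name)
                | 
                |     class Classifier(nn.Module):
                |         def __init__(self, backbone, num_labels=2):
                |             super().__init__()
                |             self.backbone = backbone
                |             self.head = nn.Linear(
                |                 backbone.config.hidden_size, num_labels)
                | 
                |         def forward(self, **inputs):
                |             h = self.backbone(**inputs).last_hidden_state
                |             return self.head(h[:, -1, :])  # last token
                | 
                |     inputs = tok("is this spam?", return_tensors="pt")
                |     logits = Classifier(backbone)(**inputs)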
        
               | thayne wrote:
               | That still doesn't make it open source.
               | 
               | There is a massive difference between a compiled binary
               | that you are allowed to do anything you want with,
               | including modifying it, building something else on top or
                | even pulling parts of it out and using them in something
                | else,
               | and a SaaS offering where you can't modify the software
               | at all. But that doesn't make the compiled binary open
               | source.
        
               | emporas wrote:
               | > When I modify source code, I might make a change, check
               | what that does, change the same functionality again,
               | check the new change, etc... up to maybe a couple dozen
               | times.
               | 
               | You can modify individual neurons if you are so inclined.
               | That's what Anthropic have done with the Claude family of
               | models [1]. You cannot do that using any closed model. So
               | "Open Weights" looks very much like "Open Source".
               | 
               | Techniques for introspection of weights are very
               | primitive, but I do think new techniques will be
               | developed, or even new architectures which will make it
               | much easier.
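               | 
               | As a toy illustration of that kind of poking (a plain
               | PyTorch forward-hook sketch, not Anthropic's technique;
               | the module path and unit index are made up):
               | 
               |     from transformers import AutoModelForCausalLM
               | 
               |     model = AutoModelForCausalLM.from_pretrained(
               |         "meta-llama/Meta-Llama-3.1-8B")
               | 
               |     def ablate_unit(module, inputs, output):
               |         # zero a single hidden unit at inference time
               |         output[..., 1234] = 0.0
               |         return output
               | 
               |     layer = model.model.layers[10].mlp  # assumed path
               |     handle = layer.register_forward_hook(ablate_unit)
               |     # ... run generation, compare behaviour, then:
               |     handle.remove()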
               | 
               | [1] https://www.anthropic.com/news/mapping-mind-language-
               | model
        
               | the8thbit wrote:
               | "You can modify individual neurons if you are so
               | inclined."
               | 
               | You can also modify a binary, but that doesn't mean that
               | binaries are open source.
               | 
               | "That's what Anthropic have done with the Claude family
               | of models [1]. ... Techniques for introspection of
               | weights are very primitive, but I do think new techniques
               | will be developed"
               | 
               | Yeah, I don't think what we have now is robust enough
               | interpretability to be capable of generating something
               | comparable to "source code", but I would like to see us
               | get there at some point. It might sound crazy, but a few
               | years ago the degree of interpretability we have today
               | (thanks in no small part to Anthropic's work) would have
               | sounded crazy.
               | 
               | I think getting to open sourcable models is probably
               | pretty important for producing models that actually do
               | what we want them to do, and as these models become more
               | powerful and integrated into our lives and production
               | processes the inability to make them do what we actually
               | want them to do may become increasingly dangerous.
               | Muddling the meaning of open source today to market your
               | product, then, can have troubling downstream effects, as
               | focus in the open source community may shift away from
               | interpretability and toward distributing and tuning
               | public weights.
        
             | bilsbie wrote:
             | You make a good point but those are also just limitations
             | of the technology (or at least our current understanding of
             | it)
             | 
             | Maybe an analogy would help. A family spent generations
             | breeding the perfect apple tree and they decided to "open
             | source" it. What would open sourcing look like?
        
               | the8thbit wrote:
               | "You make a good point but those are also just
               | limitations of the technology (or at least our current
               | understanding of it)"
               | 
               | Yeah, that _is_ my point. Things that don't have source
               | code can't be open source.
               | 
               | "Maybe an analogy would help. A family spent generations
               | breeding the perfect apple tree and they decided to "open
               | source" it. What would open sourcing look like?"
               | 
               | I think we need to be wary of dilemmas without solutions
               | here. For example, let's think about another analogy: I
               | was in a car accident last week. How can I open source my
               | car accident?
               | 
               | I don't think all, or even most things, are actually
               | "open sourcable". ML models could be open sourced, but it
               | would require a lot of work to interpret the models and
               | generate the source code from them.
        
         | jsheard wrote:
         | I also think that something like Chromium is a better analogy
         | for corporate open source models than a grassroots project like
         | Linux is. Chromium is technically open source, but Google has
         | absolute control over the direction of its development, and
         | realistically it's far too complex to maintain a fork without
         | Google's resources. Likewise, Meta has complete control over
         | what goes into their open models, and even if they did release
         | all the training data and code (which they don't), we mere
         | plebs could never afford to train a fork from scratch anyway.
        
           | skybrian wrote:
           | I think you're right from the perspective of an individual
           | developer. You and I are not about to fork Chromium any time
           | soon. If you presume that forking is impractical then sure,
           | the right to fork isn't worth much.
           | 
           | But just because a single developer couldn't do it doesn't
           | mean it couldn't be done. It means nobody has organized a
           | large enough effort yet.
           | 
           | For something like a browser, which is critical for security,
           | you need both the organization and the trust. Despite
           | frequent criticism, Mozilla (for example) is still considered
           | pretty trustworthy in a way that an unknown developer can't
           | be.
        
             | Yizahi wrote:
             | If Microsoft can't do it, then we can reasonably conclude
             | that it can't be done for any practical purpose. Discussing
             | infinitesimal possibilities is better left to philosophers.
        
               | skybrian wrote:
               | Doesn't Microsoft maintain its own fork of Chromium?
        
               | umbra07 wrote:
               | yes - their browser is chromium-based
        
         | candiddevmike wrote:
         | None of Meta's models are "open source" in the FOSS sense, even
         | the latest Llama 3.1. The license is restrictive. And no one
         | has bothered to release their training data either.
         | 
         | This post is an ad and trying to paint these things as
         | something they aren't.
        
           | JumpCrisscross wrote:
           | > _no one has bothered to release their training data_
           | 
           | If the FOSS community sets this as the benchmark for open
           | source in respect of AI, they're going to lose control of the
           | term. In most jurisdictions it would be illegal for the likes
           | of Meta to release training data.
        
             | exe34 wrote:
             | the training data is the source.
        
               | JumpCrisscross wrote:
               | > _the training data is the source_
               | 
               | Sure. But that's not going to be released. The term open
               | source AI cannot be expected to cover it because it's not
               | practical.
        
               | diggan wrote:
               | So because it's really hard to do proper Open Source with
               | these LLMs, we need to change the meaning of Open Source
               | so it fits with these PR releases?
        
               | JumpCrisscross wrote:
               | > _because it's really hard to do proper Open Source with
               | these LLMs, we need to change the meaning of Open Source
               | so it fits with these PR releases?_
               | 
               | Open training data is hard to the point of
               | impracticality. It requires excluding private and
               | proprietary data.
               | 
               | Meanwhile, the term "open source" is massively popular.
               | So it will get used. The question is how.
               | 
               | Meta _et al_ would love for the choice to be between, on
               | one hand, open weights only, and, on the other hand, open
               | training data, because the latter is impractical. That
               | dichotomy guarantees that when someone says open source
               | AI they'll mean open weights. (The way open source
               | software, today, generally means source available, not
               | FOSS.)
        
               | Palomides wrote:
               | source available is absolutely not the same as open
               | source
               | 
               | you are playing very loosely with terms that have
               | specific, widely accepted definitions (e.g.
               | https://opensource.org/osd )
               | 
               | I don't get why you think it would be useful to call LLMs
               | with published weights "open source"
        
               | JumpCrisscross wrote:
               | > _terms that have specific, widely accepted definitions_
               | 
               | The OSI's definition is far from the only one [1].
               | Switzerland is currently implementing CH Open's
               | definition, the EU another one, _et cetera_.
               | 
               | > _I don't get why you think it would be useful to call
               | LLMs with published weights "open source"_
               | 
               | I don't. I'm saying that if the choice is between open
               | weights or open weights + open training data, open
               | weights will win because the useful definition will
               | outcompete the pristine one in a public context.
               | 
               | [1] https://en.wikipedia.org/wiki/Open-
               | source_software#Definitio...
        
               | diggan wrote:
               | For the EU, I'm guessing you're talking about the EUPL,
               | which is FSF/OSI approved and GPL compatible, generally
               | considered copyleft.
               | 
               | For the CH Open, I'm not finding anything specific, even
               | from Swiss websites, could you help me understand what
               | you're referring to here?
               | 
               | I'm guessing that all these definitions have at least
               | some points in common, which involves (another guess) at
               | least being able to produce the output artifacts/binaries
               | by yourself, something that you cannot do with Llama,
               | just as an example.
        
               | JumpCrisscross wrote:
               | > _For the CH Open, I'm not finding anything specific,
               | even from Swiss websites, could you help me understand
               | what you're referring to here_
               | 
               | Was on the _HN_ front page earlier [1][2]. The definition
               | comes strikingly close to source on request with no use
               | restrictions.
               | 
               | > _all these definitions have at least some points in
               | common_
               | 
               | Agreed. But they're all different. There isn't an
               | accepted definition of open source even when it comes to
               | software; there is an accepted set of broad principles.
               | 
               | [1] https://news.ycombinator.com/item?id=41047172
               | 
               | [2] https://joinup.ec.europa.eu/collection/open-source-
               | observato...
        
               | diggan wrote:
               | > Agreed. But they're all different. There isn't an
               | accepted definition of open source even when it comes to
               | software; there is an accepted set of broad principles.
               | 
               | Agreed, but are we splitting hairs here and is it
               | relevant to the claim made earlier?
               | 
               | > (The way open source software, today, generally means
               | source available, not FOSS.)
               | 
               | Do any of these principles or definitions from these orgs
               | agree/disagree with that?
               | 
               | My hypothesis is that they generally would go against
               | that belief and instead argue that open source is
               | different from source available. But I haven't looked
               | specifically to confirm if that's true or not, just a
               | guess.
        
               | JumpCrisscross wrote:
               | > _are we splitting hairs here and is it relevant to the
               | claim made earlier?_
               | 
               | I don't think so. Take the Swiss definition. Source on
               | request, not even available. Yet being branded and
               | accepted as open source.
               | 
               | (To be clear, the Swiss example favours FOSS. But it also
               | permits source on request and bundles them together under
               | the same label.)
        
               | Palomides wrote:
               | diluting open source into a marketing term meaning "you
               | can download something" would be a sad result
        
               | SquareWheel wrote:
               | > specific, widely accepted definitions
               | 
               | Realistically, nobody outside of Hacker News commenters
               | has ever cared about the OSD. It's just not how the term
               | is used colloquially.
        
               | Palomides wrote:
               | who says open source colloquially? ime anyone who doesn't
               | care about software licenses will just say free (per free
               | beer)
               | 
               | and (strong personal opinion) any software developer
               | should have a firm grip on the terminology and details
               | for legal reasons
        
               | SquareWheel wrote:
               | > who says open source colloquially?
               | 
               | There is a large span of people between gray beard
               | programmer and lay person, and many in that span have
               | some concept of open-source. It's often used synonymously
               | with visible source, free software, or in this case, open
               | weights.
               | 
               | It seems unfortunate - though expected - that over half
               | of the comments in this thread are debating the OSD for
               | the umpteenth time instead of discussing the actual model
               | release or accompanying news posts. Meanwhile communities
               | like /r/LocalLlama are going hog wild with this release
               | and already seeing what it can do.
               | 
               | > any software developer should have a firm grip on the
               | terminology and details for legal reasons
               | 
               | They'd simply need to review the terms of the license to
               | see if it fits their usage. It doesn't really matter if
               | the license satisfies the OSD or not.
        
               | diggan wrote:
               | > Open training data is hard to the point of
               | impracticality. It requires excluding private and
               | proprietary data.
               | 
               | Right, so the onus is on Facebook/Meta to get that right;
               | then they could call something Open Source. Until then,
               | find another name that doesn't already have a specific
               | meaning.
               | 
               | > (The way open source software, today, generally means
               | source available, not FOSS.)
               | 
               | No, but it's going that way. Open Source, today, still
               | means that the things you need to build a project are
               | publicly available for you to download and run on your
               | own machine, granted you have the means to do so. What
               | you're thinking of is literally called "Source
               | Available", which is very different from "Open Source".
               | 
               | The intent of Open Source is for people to be able to
               | reproduce the work themselves, with modifications if they
               | want to. Is that something you can do today with the
               | various Llama models? No, because one core part of the
               | projects "source code" (what you need to reproduce it
               | from scratch), the training data, is being held back and
               | kept private.
        
               | unethical_ban wrote:
               | >Meanwhile, the term "open source" is massively popular.
               | So it will get used. The question is how.
               | 
               | Here's the source of the disagreement. You're justifying
               | the use of the term "open source" by saying it's logical
               | for Meta to want to use it for its popularity and layman
               | (incorrect) understanding.
               | 
               | Other person is saying it doesn't matter how convenient
               | it is or how much Meta wants to use it, that the term
               | "open source" is misleading for a product where the
               | "source" is the training data, _and_ the final product
               | has onerous restrictions on use.
               | 
               | This would be like Adobe giving Photoshop away for free,
               | but for personal use only and not for making ads for
               | Adobe's competitors. Sure, Adobe likes it and most users
               | may be fine with it, but it isn't open source.
               | 
               | >The way open source software, today, generally means
               | source available, not FOSS.
               | 
               | I don't agree with that. When a company says "open
               | source" but it's not free, the tech community is quick to
               | call it "source available" or "open core".
        
               | JumpCrisscross wrote:
               | > _You're justifying the use of the term "open source"
               | by saying it's logical for Meta to want to use it for its
               | popularity and layman (incorrect) understanding_
               | 
               | I'm actually not a fan of Meta's definition. I'm arguing
               | specifically against an unrealistic definition, because
               | for practical purposes that cedes the term to Meta.
               | 
               | > _the term "open source" is misleading for a product
               | where the "source" is the training data, and the final
               | product has onerous restrictions on use_
               | 
               | Agree. I think the focus should be on the use
               | restrictions.
               | 
               | > _When a company says "open source" but it's not free,
               | the tech community is quick to call it "source available"
               | or "open core"_
               | 
               | This isn't consistently applied. It's why we have the
               | free vs open vs FOSS fracture.
        
               | plsbenice34 wrote:
               | Of course it could be practical - provide the data. The
               | fact that society is a dystopian nightmare controlled
               | by a few megacorporations that don't want free
               | information does not justify outright changing the
               | meaning of the language.
        
               | JumpCrisscross wrote:
               | > _provide the data_
               | 
               | Who? It's not their data.
        
               | tintor wrote:
               | Meta can call it something else other than open source.
               | 
               | The synthetic part of the training data could be released.
        
               | JimDabell wrote:
               | I don't think it's that simple. The source is "the
               | preferred form of the work for making modifications to
               | it" (to use the GPL's wording).
               | 
               | For an LLM, that's not the training data. That's the
               | model itself. You don't make changes to an LLM by going
               | back to the training data and making changes to it, then
               | re-running the training. You update the model itself with
               | more training data.
               | 
               | You can't even use the training code and original
               | training data to reproduce the existing model. A lot of
               | it is non-deterministic, so you'll get different results
               | each time anyway.
               | 
               | Another complication is that the object code for normal
               | software is a clear derivative work of the source code.
               | It's a direct translation from one form to another. This
               | isn't the case with LLMs and their training data. The
               | models learn from it, but they aren't simply an
               | alternative form of it. I don't think you can describe an
               | LLM as a derivative work of its training data. It learns
               | from it, it isn't a copy of it. This is mostly the reason
               | why distributing training data is infeasible - the
               | model's creator may not have the license to do so.
               | 
               | Would it be extremely useful to have the original
               | training data? Definitely. Is distributing it the same as
               | distributing source code for normal software? I don't
               | think so.
               | 
               | I think new terminology is needed for open AI models. We
               | can't simply re-use what works for human-editable code
               | because it's a fundamentally different type of thing with
               | different technical and legal constraints.
        
               | root_axis wrote:
               | No. It's an asset used in the training process; the
               | source code can process arbitrary training data.
        
               | sangnoir wrote:
               | We've had a similar debate before, but the last time it
               | was about whether Linux device drivers based on non-
               | public datasheets under NDA were actually open source.
               | This debate occurred again over drivers that interact
               | with binary blobs.
               | 
               | I disagree with the purists - if you can _legally_ change
               | the source or weights - even without having access to the
               | data used by the upstream authors - it's open enough for
               | me. YMMV.
        
               | wrs wrote:
               | I don't think even that is true. I conjecture that
               | Facebook couldn't reproduce the model weights if they
               | started over with the same training data, because I doubt
               | such a huge training run is a reproducible deterministic
               | process. I don't think _anyone_ has "the" source.
        
               | exe34 wrote:
               | numpy.random.seed(1234)
        
             | mesebrec wrote:
             | Regardless of the training data, the license even heavily
             | restricts how you can use the model.
             | 
             | Please read through their "acceptable use" policy before
             | you decide whether this is really in line with open source.
        
               | JumpCrisscross wrote:
               | > _Please read through their "acceptable use" policy
               | before you decide whether this is really in line with
               | open source_
               | 
               | I'm not taking a specific position on this license. I
               | haven't read it closely. My broad point is simply that
               | open source AI, as a term, cannot practically require the
               | training data be made available.
        
         | causal wrote:
         | "Open weights" is a more appropriate term but I'll point out
         | that these weights are also largely inscrutable to the people
         | with the code that trained it. And for licensing reasons, the
         | datasets may not be possible to share.
         | 
         | There is still a lot of modifying you can do with a set of
         | weights, and they make great foundations for new stuff, but
         | yeah we may never see a competitive model that's 100% buildable
         | at home.
         | 
         | Edit: mkolodny points out that the model code is shared (under
         | llama license at least), which is really all you need to run
         | training https://github.com/meta-
         | llama/llama3/blob/main/llama/model.p...
        
           | aerzen wrote:
           | LLAMA is an open-weights model. I like this term, let's use
           | that instead of open source.
        
           | stavros wrote:
           | "Open weights" means you can use the weights for free (as in
           | beer). "Open source" means you get the training dataset and
           | the methodology. ~Nobody does open source LLMs.
        
             | _heimdall wrote:
             | Why is the dataset required for it to be open source?
             | 
             | If I self host a project that is open sourced rather than
             | paying for a hosted version, like Sentry.io for example, I
             | don't expect data to come along with the code. Licensing
             | rights are always up for debate in open source, but I
             | wouldn't expect more than the code to be available and
             | reviewable for anything needed to build and run the
             | project.
             | 
             | In the case of an LLM I would expect that to mean the code
             | used to train the model, the code for the model data
             | structure itself, and the control code for querying the
             | model should all be available. I'm not actually sure if
             | Meta does share all that, but training data is separate
             | from open source IMO.
        
               | solarmist wrote:
               | The sticking point is you can't build the model. To be
               | able to build the model from scratch you need methodology
               | and a complete description of the data set.
               | 
               | They only give you a blob of data you can run.
        
               | _heimdall wrote:
               | Got it, that makes sense. I still wouldn't expect them to
               | have to publicly share the data itself, but if you can't
               | take the code they share and run it against your own data
               | to build a model, that wouldn't be open source in my
               | understanding of it.
        
               | stavros wrote:
               | Data is to models what code is to software.
        
               | gowld wrote:
               | https://opensource.org/osd
               | 
               | "The source code must be the preferred form in which a
               | programmer would modify the program. Deliberately
               | obfuscated source code is not allowed. Intermediate forms
               | such as the output of a preprocessor or translator are
               | not allowed."
               | 
               | > In the case of an LLM I would expect that to mean the
               | code run to train the model, the code for the model data
               | structure itself, and the control code for querying the
               | model should all be available
               | 
               | The M in LLM is for "Model".
               | 
               | The code you describe is for an LLM _harness_, not for
               | an LLM. The code for the _LLM_ is whatever is needed to
               | enable a developer to _modify_ the inputs and then build a
               | modified output LLM (minus standard generally available
               | tools not custom-created for that product).
               | 
               | Training data is one way to provide this. Another way is
               | some sort of semantic model editor for an interpretable
               | model.
        
             | blackeyeblitzar wrote:
             | There is a comment elsewhere claiming there are a few dozen
             | fully open source models:
             | https://news.ycombinator.com/item?id=41048796
        
             | sigmoid10 wrote:
             | >Nobody does open source LLMs.
             | 
             | There are a bunch of independent, fully open source
             | foundation models from companies that share everything
             | (including all data). AMBER and MAP-NEO for example. But we
             | have yet to see one in the 100B+ parameter category.
        
               | stavros wrote:
               | Sorry, the tilde before "nobody" is my notation for
               | "basically nobody" or "almost nobody". I thought it was
               | more common.
        
         | mkolodny wrote:
         | Llama's code is open source: https://github.com/meta-
         | llama/llama3/blob/main/llama/model.p...
        
           | apsec112 wrote:
           | That's not the _training_ code, just the inference code. The
           | training code, running on thousands of high-end H100 servers,
           | is surely much more complex. They also don't open-source the
           | dataset, or the code they used for data
           | scraping/filtering/etc.
        
             | the8thbit wrote:
             | "just the inference code"
             | 
             | It's not the "inference code", its the code that specifies
             | the architecture of the model and loads the model. The
             | "inference code" is mostly the model, and the model is not
             | legible to a human reader.
             | 
             | Maybe someday open source models will be possible, but we
             | will need much better interpretability tools so we can
             | generate the source code from the model. In most software
             | projects you write the source as a specification that is
             | then used by the computer to implement the software, but in
             | this case the process is reversed.
        
           | blackeyeblitzar wrote:
           | That is just the inference code. Not training code or
           | evaluation code or whatever pre/post processing they do.
        
             | patrickaljord wrote:
             | Is there an LLM with actual open source training code and
             | dataset? Besides BLOOM
             | https://huggingface.co/bigscience/bloom
        
               | osanseviero wrote:
               | Yes, there are a few dozen full open source models
               | (license, code, data, models)
        
               | blackeyeblitzar wrote:
               | What are some of the other ones? I am aware mainly of
               | OLMo (https://blog.allenai.org/olmo-open-language-
               | model-87ccfc95f5...)
        
               | navinsylvester wrote:
               | Here you go - https://github.com/apple/corenet
        
           | mesebrec wrote:
           | This is like saying any Python program is open source because
           | the Python runtime is open source.
           | 
           | Inference code is the runtime; the code that runs the model.
           | Not the model itself.
        
             | mkolodny wrote:
             | I disagree. The file I linked to, model.py, contains the
             | Llama 3 model itself.
             | 
             | You can use that model with open data to train it from
             | scratch yourself. Or you can load Meta's open weights and
             | have a working LLM.
        
               | causal wrote:
               | Yeah, a lot of people here seem to not understand that
               | PyTorch really does make model definitions that simple,
               | and that file has everything you need to resume back-
               | propagation. Not to mention PyTorch itself being open-
               | sourced by Meta.
               | 
               | That said, the Llama license doesn't meet strict
               | definitions of OS, and I bet they have internal tooling
               | for datacenter-scale training that's not represented
               | here.
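               | 
               | For what it's worth, a minimal sketch of what resuming
               | back-propagation on the released weights can look like
               | (Hugging Face-style loading assumed, not Meta's internal
               | training stack):
               | 
               |     import torch
               |     from transformers import (AutoModelForCausalLM,
               |                               AutoTokenizer)
               | 
               |     name = "meta-llama/Meta-Llama-3.1-8B"
               |     tok = AutoTokenizer.from_pretrained(name)
               |     model = AutoModelForCausalLM.from_pretrained(name)
               |     opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
               | 
               |     batch = tok("some training text", return_tensors="pt")
               |     # labels=input_ids: the model shifts them internally
               |     # for next-token prediction
               |     out = model(**batch, labels=batch["input_ids"])
               |     out.loss.backward()
               |     opt.step()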
        
               | yjftsjthsd-h wrote:
               | > The file I linked to, model.py, contains the Llama 3
               | model itself.
               | 
               | That makes it source available (
               | https://en.wikipedia.org/wiki/Source-available_software
               | ), not open source
        
               | macrolime wrote:
               | Source available means you can see the source, but not
               | modify it. This is kinda the opposite: you can modify the
               | model, but you don't see all the details of its creation.
        
           | Flimm wrote:
           | No, it's not. The Llama 3 Community License Agreement is not
           | an open source license. Open source licenses need to meet the
           | criteria of the only widely accepted definition of "open
           | source", and that's the one formulated by the OSI [0]. This
           | license has multiple restrictions on use and distribution
           | which make it not open source. I know Facebook keeps calling
           | this stuff open source, maybe in order to get all the good
           | will that open source branding gets you, but that doesn't
           | make it true. It's like a company calling their candy vegan
           | while listing one of its ingredients as pork-based gelatin. No
           | matter how many times the company advertises that their
           | product is vegan, it's not, because it doesn't meet the
           | definition of vegan.
           | 
           | [0] - https://opensource.org/osd
        
             | CamperBob2 wrote:
             | _Open source licenses need to meet the criteria of the only
             | widely accepted definition of "open source", and that's the
             | one formulated by the OSI [0]_
             | 
             | Who died and made OSI God?
        
               | vbarrielle wrote:
               | The OSI was created about 20 years ago and defined and
               | popularized the term open source. Their definition has
               | been widely accepted over that period.
               | 
               | Recently, companies are trying to market things as open
               | source when in reality, they fail to adhere to the
               | definition.
               | 
               | I think we should not let these companies change the
               | meaning of the term, which means it's important to
               | explain every time they try to seem more open than they
               | are.
               | 
               | I'm afraid the battle is being lost though.
        
               | Suppafly wrote:
               | >The OSI was created about 20 years ago and defined and
               | popularized the term open source. Their definition has
               | been widely accepted over that period.
               | 
               | It was defined and accepted by the community well before
               | OSI came around though.
        
               | MaxBarraclough wrote:
               | This isn't helpful. The community defers to the OSI's
               | definition because it captures what they care about.
               | 
               | We've seen people try to deceptively describe non-OSS
               | projects as open source, and no doubt we will continue to
               | see it. Thankfully the community (including Hacker News)
               | is quick to call it out, and to insist on not cheapening
               | the term.
               | 
               | This is one of the topics that just keeps turning up:
               | 
               | * https://news.ycombinator.com/item?id=24483168
               | 
               | * https://news.ycombinator.com/item?id=31203209
               | 
               | * https://news.ycombinator.com/item?id=36591820
        
             | 8note wrote:
             | Isn't the MIT license the generally accepted "open source"
             | license? It's a community owned term, not OSI owned
        
               | henryfjordan wrote:
               | There are more licenses than just MIT that are "open
               | source". GPL, BSD, MIT, Apache, some of the Creative
               | Commons licenses, etc. MIT has become the de facto default
               | though
               | 
               | https://opensource.org/license (linking to OSI for the
               | list because it's convenient, not because they get to
               | decide)
        
               | yjftsjthsd-h wrote:
               | MIT is _a_ permissive open source license, not _the_ open
               | source license.
        
         | stale2002 wrote:
         | Ok call it Open Weights then if the dictionary definitions
         | matter so much to you.
         | 
         | The actual point that matters is that these models are
         | available for most people to use for a lot of stuff, and this
         | is way way better than what competitors like OpenAI offer.
        
           | the8thbit wrote:
           | They don't "[allow] developers to modify its code however
           | they want", which is a critical component of "open source",
           | and one that Meta is clearly trying to leverage in branding
           | around its products. I would like _them_ to start calling
           | these  "public weight models", because what they're doing now
           | is muddying the waters so much that "open source" now just
           | means providing an enormous binary and an open source harness
           | to run it in, rather than serving access to the same binary
           | via an API.
        
             | Voloskaya wrote:
             | Feels a bit like you are splitting hairs for the pleasure
             | of semantic arguments, to be honest. Yes, there is no
             | source in ML, so if we want to be pedantic it shouldn't be
             | called open source. But what really matters in the open source
             | movement is that we are able to take a program built by
             | someone and modify it to do whatever we want with it,
             | without having to ask someone for permission or get
             | scrutinized or have to pay someone.
             | 
             | The same applies here: you can take those models and modify
             | them to do whatever you want (provided you know how to
             | train ML models), without having to ask for permission, get
             | scrutinized or pay someone.
             | 
             | I personally think using the term open source is fine, as
             | it conveys the intent correctly, even if, yes, weights are
             | not sources you can read with your eyes.
        
               | wrs wrote:
               | Calling that "open source" renders the word "source"
               | meaningless. By your definition, I can release a binary
               | executable freely and call it "open source" because you
               | can modify it to do whatever you want.
               | 
               | Model weights are like a binary that _nobody_ has the
               | source for. We need another term.
        
               | Voloskaya wrote:
               | No, it's not the same as releasing a binary; feels like we
               | can't get out of the pedantics. I can in theory modify a
               | binary to do whatever I want. In practice it is
               | intractably hard to make any significant modification to
               | a binary, and even if you could, you would then not be
               | legally allowed to e.g. redistribute.
               | 
               | Here, modifying that model is not harder than doing
               | regular ML, and I can redistribute.
               | 
               | Meta isn't holding back some magic higher-level
               | abstraction for that model that would make working with
               | it easier.
               | 
               | The sources in ML are the architecture, the training and
               | inference code, and a paper describing the training
               | procedure. It's all there.
        
             | bornfreddy wrote:
             | "Public weight models" sounds about right, thanks for
             | coming up with a good term! Hope it catches on.
        
         | input_sh wrote:
         | Open Source Initiative (kind of a de-facto authority on what's
         | open source and what's not) is spending a whole lot of time
         | figuring out what it means for an AI system to be open source.
         | In other words, they're basically trying to come up with a new
         | license because the existing ones can't easily apply.
         | 
         | I believe this is the current draft:
         | https://opensource.org/deepdive/drafts/the-open-source-ai-de...
        
           | downWidOutaFite wrote:
           | OSI made themselves the authority because they hated Richard
           | Stallman and his Free Software movement. It's just marketing.
        
         | Zambyte wrote:
         | > If so, then how can current ML models be open source?
         | 
         | The source of a language model is the text it was trained on.
         | Llama models are not open source (contrary to their claims),
         | they are open weight.
        
           | thayne wrote:
           | I think it would also include the code used to train it
        
           | moffkalast wrote:
           | You can find the entire Llama 3.0 pretraining set here:
           | https://huggingface.co/datasets/HuggingFaceFW/fineweb
           | 
           | 15T tokens, 45 terabytes. Seems fairly open source to me.
        
             | Zambyte wrote:
             | Where has Facebook linked that? I can't find anywhere that
             | they actually published that.
        
       | Oras wrote:
       | This is obviously good news, but __personally__ I feel the open-
       | source models are just trying to catch up with whoever the market
       | leader is, based on some benchmarks.
       | 
       | The actual problem is running these models. Very few companies
       | can afford the hardware to run these models privately. If you run
       | them in the cloud, then I don't see any potential financial gain
       | for any company to fine-tune these huge models just to catch up
       | with OpenAI or Anthropic, when you can probably get a much better
       | deal by fine-tuning the closed-source models.
       | 
       | Also this point:
       | 
       | > We need to protect our data. Many organizations handle
       | sensitive data that they need to secure and can't send to closed
       | models over cloud APIs.
       | 
       | First, it's ironic that Meta is talking about privacy. Second,
       | most companies will run these models in the cloud anyway. You can
       | run OpenAI via Azure Enterprise and Anthropic on AWS Bedrock.
        
         | simonw wrote:
         | "Very few companies can afford the hardware to run these models
         | privately."
         | 
         | I can run Llama 3 70B on my (64GB RAM M2) laptop. I haven't
         | tried 3.1 yet but I expect to be able to run that 70B model
         | too.
         | 
         | As for the 405B model, the Llama 3.1 announcement says:
         | 
         | > To support large-scale production inference for a model at
         | the scale of the 405B, we quantized our models from 16-bit
         | (BF16) to 8-bit (FP8) numerics, effectively lowering the
         | compute requirements needed and allowing the model to run
         | within a single server node.
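         | 
         | The back-of-the-envelope arithmetic behind both claims (weights
         | only, ignoring KV cache and other overhead):
         | 
         |     # params (billions) x bytes per weight ~= GB of weights
         |     for params in (70, 405):
         |         for bits, fmt in ((16, "BF16"), (8, "FP8"), (4, "Q4")):
         |             gb = params * bits / 8
         |             print(f"{params}B @ {fmt}: ~{gb:.0f} GB")
         | 
         | A 4-bit quantized 70B is roughly 35 GB, which is why it fits on
         | a 64GB laptop, and the 405B at FP8 is roughly 405 GB, which is
         | why it can squeeze into a single 8x80GB H100 node.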
        
       | InDubioProRubio wrote:
       | CrowdStrike just added "Centralized Company Controlled Software
       | Ecosystem" to every risk data sheet on the planet. Everything
       | futureproof is self-hosted and open source.
        
       | mesebrec wrote:
       | Note that Meta's models are not open source in any interpretation
       | of the term.
       | 
       | * You can't use them for any purpose. For example, the license
       | prohibits using these models to train other models.
       | 
       | * You can't meaningfully modify them, given there is almost no
       | information available about the training data, how they were
       | trained, or how the training data was processed.
       | 
       | As such, the model itself is not available under an open source
       | license and the AI does not comply with the "open source AI"
       | definition by OSI.
       | 
       | It's an utter disgrace for Meta to write such a blogpost patting
       | themselves on the back while lying about how open these models
       | are.
        
         | ChadNauseam wrote:
         | > you can't meaningfully modify them given there is almost no
         | information available about the training data, how they were
         | trained, or how the training data was processed.
         | 
         | I was under the impression that you could still fine-tune the
         | models or apply your own RLHF on top of them. My understanding
         | is that the training data would mostly be useful for training
         | the model yourself from scratch (possibly after modifying the
         | training data), which would be extremely expensive and out of
         | reach for most people
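         | 
         | A hedged sketch of why that stays within reach: with a LoRA
         | adapter (peft library; the model id and hyperparameters are
         | illustrative) only a tiny fraction of the weights is actually
         | trained.
         | 
         |     from transformers import AutoModelForCausalLM
         |     from peft import LoraConfig, TaskType, get_peft_model
         | 
         |     base = AutoModelForCausalLM.from_pretrained(
         |         "meta-llama/Meta-Llama-3.1-8B")
         |     cfg = LoraConfig(task_type=TaskType.CAUSAL_LM, r=8,
         |                      lora_alpha=16, lora_dropout=0.05,
         |                      target_modules=["q_proj", "v_proj"])
         |     model = get_peft_model(base, cfg)
         |     model.print_trainable_parameters()
         |     # typically well under 1% of parameters end up trainable,
         |     # so the optimizer state fits on a single consumer GPU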
        
           | mesebrec wrote:
           | Indeed, fine-tuning is still possible, but you can only go so
           | far with fine-tuning before you need to completely retrain
           | the model.
           | 
           | This is why Silo AI, for example, had to start from scratch
           | to get better support for small European languages.
        
           | chasd00 wrote:
           | From what I understand, the training data and careful curation
           | of it is the hard part. Everyone wants training data sets to
           | train their own models instead of producing their own.
        
         | causal wrote:
         | You are definitely allowed to train other models with these
         | models, you just have to give credit in the name, per the
         | license:
         | 
         | > If you use the Llama Materials or any outputs or results of
         | the Llama Materials to create, train, fine tune, or otherwise
         | improve an AI model, which is distributed or made available,
         | you shall also include "Llama" at the beginning of any such AI
         | model name.
        
           | mesebrec wrote:
           | Indeed, this is something they changed in the 3.1 version of
           | the license.
           | 
           | Regardless, the license [1] still has many restrictions, such
           | as the acceptable use policy [2].
           | 
           | [1] https://huggingface.co/meta-llama/Meta-
           | Llama-3.1-8B/blob/mai...
           | 
           | [2] https://llama.meta.com/llama3_1/use-policy
        
       | tw04 wrote:
       | >In the early days of high-performance computing, the major tech
       | companies of the day each invested heavily in developing their
       | own closed source versions of Unix.
       | 
       | Because they sold the resultant code and systems built on it for
       | money... this is the gold miner saying that all shovels and jeans
       | should be free.
       | 
       | Am I happy Facebook open sources some of their code? Sure, I
       | think it's good for everyone. Do I think they're talking out of
       | both sides of their mouth? Absolutely.
       | 
       | Let me know when Facebook opens up the entirety of their Ad and
       | Tracking platforms and we can start talking about how it's silly
       | for companies to keep software closed.
       | 
       | I can say with 100% confidence if Facebook were selling their AI
       | advances instead of selling the output it produces, they wouldn't
       | be advocating for everyone else to open source their stacks.
        
         | JumpCrisscross wrote:
         | > _if Facebook were selling their AI advances instead of
         | selling the output it produces, they wouldn't be advocating
         | for everyone else to open source their stack_
         | 
         | You're acting as if commoditizing one's complements is either
         | new or reprehensible [1].
         | 
         | [1] https://gwern.net/complement
        
           | tw04 wrote:
           | >You're acting as if commoditizing one's complements is
           | either new or reprehensible [1].
           | 
           | I'm saying that calling on other companies to open source
           | their core product, just because it's a complement for you,
           | while acting as if it's for the benefit of mankind is
           | disingenuous, which it is.
        
             | stale2002 wrote:
             | > as if it's for the benefit of mankind
             | 
             | But it does benefit mankind.
             | 
             | More free tech products is good for the world.
             | 
             | This is a good thing. When people or companies do good
             | things, they should get the credit for doing good things.
        
             | JumpCrisscross wrote:
             | > _acting as if it's for the benefit of mankind is
             | disingenuous, which it is_
             | 
             | Is it bad for mankind that Meta publishes its weights?
             | Mutually beneficial is a valid game state--there is no
             | moral law that requires anything good be made as a
             | sacrifice.
        
         | rvnx wrote:
         | The source code to the ad-tracking platform is useless to users.
         | 
         | In the end, it's actually Facebook doing the right thing
         | (though they are known for being evil).
         | 
         | It's a bit of an irony.
         | 
         | The supposedly "good" and "open" people like Google or OpenAI,
         | haven't given their model weights.
         | 
         | A bit like Microsoft became the company that actually supports
         | the whole open-source ecosystem with GitHub.
        
           | tw04 wrote:
           | >The source code to the ad-tracking platform is useless to users.
           | 
           | It's absolutely not useless for developers looking to build a
           | competing project.
           | 
           | >The supposedly "good" and "open" people like Google or
           | OpenAI, haven't given their model weights.
           | 
           | Because they're monetizing it... the only reason Facebook is
           | giving it away is because it's a complement to their core
           | product of selling ads. If they were monetizing it, it would
           | be closed source. Just like their Ads platform...
        
       | abetusk wrote:
       | Another case of "open-washing". Llama is not available open
       | source, under the common definition of open source, as the
       | license doesn't allow for commercial re-use by default [0].
       | 
       | They provide their model, with weights and code, as "source
       | available" and it looks like they allow for commercial use until
       | a 700M monthly active user cap is surpassed. They also don't allow
       | you to train other AI models with their model:
       | 
       | """ ... v. You will not use the Llama Materials or any output or
       | results of the Llama Materials to improve any other large
       | language model (excluding Meta Llama 3 or derivative works
       | thereof). ... """
       | 
       | [0] https://github.com/meta-llama/llama3/blob/main/LICENSE
        
         | sillysaurusx wrote:
         | They cannot legally enforce this, because they don't have the
         | rights to the content they trained it on. Whoever's willing to
         | fund that court battle would likely win.
         | 
         | There's a legal precedent that says hard work alone isn't
         | enough to guarantee copyright, i.e. it doesn't matter that it
         | took millions of dollars to train.
        
         | whimsicalism wrote:
         | i think these clauses are unenforceable. it's telling that OAI
         | hasn't tried a similar suit despite multiple extremely well-
         | known cases of competitors training on OAI outputs
        
       | nuz wrote:
       | Everyone complaining about not having data access: Remember that
       | without Meta you would have OpenAI and Anthropic, and that's it.
       | I'm really thankful they're releasing this, and the reason they
       | can't release the data is obvious.
        
         | mesebrec wrote:
         | Without Meta, you would still have Mistral, Silo AI, and the
         | many other companies and labs producing much more open models
         | with similar performance.
        
       | Invictus0 wrote:
       | The irony of this letter being written by Mark Zuckerberg at
       | Meta, while OpenAI continues to be anything but open, is richer
       | than anyone could have imagined.
        
       | 1024core wrote:
       | "open source AI" ... "open" ... "open" ....
       | 
       | And you can't even try it without an FB/IG account.
       | 
       | Zuck will never change.
        
         | causal wrote:
         | I think you can use an HF account as well
         | https://huggingface.co/meta-llama
        
           | Gracana wrote:
           | You can also wait a bit for someone to upload quantized
           | variants, finetunes, etc, and download those. FWIW I'm not
           | making a claim about the legality of that, just saying it's
           | an easy way around needing to sign the agreement.
        
         | CamperBob2 wrote:
         | It doesn't require an account. You do have to fill in your name
         | and email (and birthdate, although it seems to accept whatever
         | you feed it.)
        
       | mvkel wrote:
       | It's a real shame that we're still calling Llama "open source"
       | when at best it's "open weights."
       | 
       | Not that anyone would go buy 100,000 H100s to train their own
       | Llama, but words matter. Definitions matter.
        
         | sidcool wrote:
         | Honest question. As far as LLMs are concerned, isn't open
         | weights same as open source?
        
           | mesebrec wrote:
           | Open source requires, at the very least, that you can use it
           | for any purpose. This is not the case with Llama.
           | 
           | The Llama license has a lot of restrictions, based on user
           | base size, type of use, etc.
           | 
           | For example you're not allowed to use Llama to train or
           | improve other models.
           | 
           | But it goes much further than that. The government of India
           | can't use Llama because they're too large. Sex workers are
           | not allowed to use Llama due to the acceptable use policy of
           | the license. Then there is also the vague language
           | prohibiting discrimination, racism, etc. Good luck getting
           | something like that approved by your legal team.
        
           | aloe_falsa wrote:
           | GPL defines the "source code" of a work as the preferred form
           | of the work for making modifications to it. If Meta released
           | a petabyte of raw training data, would that really be easier
           | to extend and adapt (as opposed to fine-tuning the weights)?
        
           | paulhilbert wrote:
           | No, I would argue that from the three main ingredients -
           | training data, model source code and weights - weights are
           | the furthest away from something akin to source code.
           | 
           | They're more like obfuscated binaries. When it comes to fine-
           | tuning only, however, things shift a little bit, yes.
        
         | lolinder wrote:
         | Source versus weights seems like a really pedantic distinction
         | to make. As you say, the training code and training data would
         | be worthless to anyone who doesn't have compute on the level
         | that Meta does. Arguably, the weights are source code
         | interpreted by an inference engine, and realistically it's the
         | weights that someone is going to want to modify through fine-
         | tuning, not the original training code and data.
         | 
         | The far more important distinction is "open" versus "not open",
         | and I disagree that we should cede that distinction while
         | trying to fight for "source". The Llama license is restrictive
         | in a number of ways (it incorporates an entire acceptable use
         | policy) that make it most definitely not "open" in the
         | customary sense.
        
           | mvkel wrote:
           | > training code and training data would be worthless to
           | anyone who doesn't have compute on the level that Meta does
           | 
           | I don't fully agree.
           | 
           | Isn't that like saying *nix being open source is worthless
           | unless you're planning to ship your own Linux distro?
           | 
           | Knowing how the sausage is made is important if you're an
           | animal rights activist.
        
           | JamesBarney wrote:
           | https://llama.meta.com/llama3_1/use-policy/
           | 
            | The acceptable use policy seems fine. Don't use it to
           | break the law, solicit sex, kill people, or lie.
        
             | lolinder wrote:
             | It's fine in that I'm happy to use it and don't think I'll
             | be breaking the terms anytime soon. It's not fine in that
             | one of the primary things that makes open source open is
             | that an open source license doesn't restrict groups of
             | people or whole fields from usage of the software. The
             | policy has a number of such blanket bans on industries,
             | which, while reasonable, make the license not truly open.
        
       | rybosworld wrote:
        | Huge companies like Facebook will often argue for solutions that
       | on the surface, seem to be in the public interest.
       | 
       | But I have strong doubts they (or any other company) actually
       | believe what they are saying.
       | 
       | Here is the reality:
       | 
       | - Facebook is spending untold billions on GPU hardware.
       | 
       | - Facebook is arguing in favor of open sourcing the models, that
       | they spent billions of dollars to generate, for free...?
       | 
       | It follows that companies with much smaller resources (money)
       | will not be able to match what Facebook is doing. Seems like an
       | attempt to kill off the competition (specifically, smaller
       | organizations) before they can take root.
        
         | Salgat wrote:
         | The reason for Meta making their model open source is rather
         | simple: They receive an unimaginable amount of free labor, and
         | their license only excludes their major competitors to ensure
         | mass adoption without benefiting their competition (Microsoft,
         | Google, Alibaba, etc). Public interest, philanthropy, etc are
         | just nice little marketing bonuses as far as they're concerned
         | (otherwise they wouldn't be including this licensing
         | restriction).
        
           | noiseinvacuum wrote:
           | All correct, Meta does obviously benefit.
           | 
            | It's helpful to also look at what the developers and
           | companies (everyone outside of top 5/10 big tech companies)
           | get out of this. They get open access to weights of SOTA LLM
           | models that take billions of dollars to train and 10s of
           | billions a year to run the AI labs that make these. They get
           | the freedom to fine tune them, to distill them, and to host
           | them on their own hardware in whatever way works best for
           | their products and services.
        
         | mattnewton wrote:
         | I actually think this is one of the rare times where the small
         | guys interests are aligned with Meta. Meta is scared of a world
         | where they are locked out of LLM platforms, one where OpenAI
         | gets to dictate rules around their use of the platform much
         | like Apple and Google dictates rules around advertiser data and
         | monetization on their mobile platforms. Small developers should
         | be scared of a world where the only competitive LLMs are owned
         | by those players too.
         | 
          | Through this lens, Meta's actions make more sense to me. Why
          | invest billions in VR/AR? The answer is simple: don't get
          | locked out of the next platform, and maybe you can own the
          | next one. Why invest in LLMs? Again, don't get locked out.
          | Google and OpenAI/Microsoft are far larger and ahead of Meta right now
         | and Meta genuinely believes the best way to make sure they have
         | an LLM they control is to make everyone else have an LLM they
         | can control. That way community efforts are unified around
         | their standard.
        
           | mupuff1234 wrote:
           | Sure, but don't you think the "not getting locked out" is
           | just the pre-requisite for their eventual goal of locking
           | everyone else out?
        
             | yesco wrote:
             | Does it really matter? Attributing goodwill to a company is
             | like attributing goodwill to a spider that happens to clean
             | up the bugs in your basement. Sure if they had the ability
             | to, I'm confident Meta would try something like that, but
             | they obviously don't, and will not for the foreseeable
             | future.
             | 
             | I have faith they will continue to do what's in their best
             | interests and if their best interests happen to align with
             | mine, then I will support that. Just like how I don't
             | bother killing the spider in my basement because it helps
             | clean up the other bugs.
        
               | mupuff1234 wrote:
               | But you also know that the spider has been laying eggs so
               | you better have an extermination plan ready.
        
             | noiseinvacuum wrote:
             | If by "everyone else" here you mean 3 or 4 large players
             | trying to create a regulatory moat around themselves then I
             | am fine with them getting locked out and not being able to
              | create a moat for the next 3 decades.
        
           | myaccountonhn wrote:
           | > I actually think this is one of the rare times where the
           | small guys interests are aligned with Meta
           | 
           | Small guys are the ones being screwed over by AI companies
           | and having their text/art/code stolen without any attribution
           | or adherence to license. I don't think Meta is on their side
            | at all.
        
             | MisterPea wrote:
             | That's a separate problem which affects small to large
             | players alike (e.g. ScarJo).
             | 
              | Small companies' interests are aligned with Meta as they
              | are now on an equal footing with large incumbent players.
              | They can now compete with a similarly sized team at a big
              | tech company instead of that team + dozens of AI scientists.
        
         | ketzo wrote:
         | Meta is, fundamentally, a user-generated-content distribution
         | company.
         | 
         | Meta wants to make sure they commoditize their complements:
         | they don't want a world where OpenAI captures all the value of
         | content generation, they want the cost of producing the best
         | content to be as close to free as possible.
        
           | chasd00 wrote:
            | I was thinking along the same lines. A lot of content
            | generated by LLMs is going to end up on Facebook or
            | Instagram. The easier it is to create AI-generated content,
            | the more content ends up on those applications.
        
           | Nesco wrote:
           | Especially because genAI is a copyright laundering system.
           | You can train it on copyrighted material and none of the
            | content generated with it is copyrightable, which is
            | perfect for social apps.
        
         | KaiserPro wrote:
          | The model itself isn't actually that valuable to Facebook.
         | The thing that's important is the dataset, the infrastructure
         | and the people to make the models.
         | 
          | There is still, just about, a strong ethos (especially in the
          | research teams) to chuck loads of stuff over the wall into
          | open source (pytorch, detectron, SAM, aria, etc).
         | 
          | But it's seen internally as a two-part strategy:
         | 
         | 1) strong recruitment tool (come work with us, we've done cool
         | things, and you'll be able to write papers)
         | 
         | 2) seeding the research community with a common toolset.
        
       | jorblumesea wrote:
       | Cynically I think this position is largely due to how they can
       | undercut OpenAI's moat.
        
         | wayeq wrote:
         | It's not cynical, it's just an awareness that public companies
         | have a fiduciary duty to their shareholders.
        
       | cs702 wrote:
       | _> We're releasing Llama 3.1 405B, the first frontier-level open
       | source AI model, as well as new and improved Llama 3.1 70B and 8B
       | models._
       | 
        |  _Bravo!_ While I don't agree with Zuck's views and actions on
       | many fronts, on this occasion I think he and the AI folks at Meta
       | deserve our praise and gratitude. With this release, they have
       | brought the cost of pretraining a frontier 400B+ parameter model
        | to ZERO for pretty much everyone -- well, everyone _except_
        | Meta's key competitors.[a] THANK YOU ZUCK.
       | 
       | Meanwhile, the business-minded people at Meta surely won't mind
       | if the release of these frontier models to the public happens to
       | completely mess up the AI plans of competitors like
       | OpenAI/Microsoft, Google, Anthropic, etc. Come to think of it,
       | the negative impact on such competitors was likely a key
       | motivation for releasing the new models.
       | 
       | ---
       | 
       | [a] The license is not open to the handful of companies worldwide
       | which have more than 700M users.
        
         | swyx wrote:
         | > the AI folks at Meta deserve our praise and gratitude
         | 
         | We interviewed Thomas who led Llama 2 and 3 post training here
         | in case you want to hear from someone closer to the ground on
         | the models https://www.latent.space/p/llama-3
        
         | throwaway_2494 wrote:
         | > We're releasing Llama 3.1 405B
         | 
         | Is it possible to run this with ollama?
        
           | jessechin wrote:
            | Sure, if you have an H100 cluster. If you quantize it to
            | int4 you might get away with using only 4 H100 GPUs!
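            | 
            | Rough back-of-envelope (a sketch, assuming ~405B parameters
            | and ignoring the KV cache and other overhead): at fp16 that
            | is roughly 405B x 2 bytes ~ 810 GB of weights, while at int4
            | it is about 405B x 0.5 bytes ~ 200 GB, which just fits in
            | the 320 GB of a 4x 80 GB H100 box.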
        
             | sheepscreek wrote:
             | Assuming $25k a pop, that's at least $100k in just the GPUs
              | alone. Throw in their linking technology (NVLink) and the
              | cost of the remaining parts, and I wouldn't be surprised
              | if you're looking at $150k for such a cluster. Which is
              | not bad to be honest, for something at this scale.
             | 
              | Can anyone share the cost of the pre-built clusters
              | they've recently started selling? (Sorry, feeling too lazy
              | to research atm; I might do that later when I have more
              | time.)
        
               | rty32 wrote:
               | You can rent H100 GPUs.
        
               | tomp wrote:
                | You're about right.
               | 
               | https://smicro.eu/nvidia-
               | hgx-h100-640gb-935-24287-0001-000-1
               | 
               | 8x H100 HGX cluster for EUR250k + VAT
        
           | vorticalbox wrote:
            | If you have the RAM for it.
            | 
            | Ollama will offload as many layers as it can to the GPU,
            | then the rest will run on the CPU/RAM.
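            | 
            | A minimal sketch of steering that split yourself, assuming
            | the ollama Python client is installed and the model has
            | already been pulled; num_gpu caps how many layers go to the
            | GPU, and the exact value here is illustrative:
            | 
            |     import ollama
            |     
            |     # Offload ~30 layers to the GPU; the rest stay on CPU/RAM.
            |     resp = ollama.generate(
            |         model="llama3.1:70b",
            |         prompt="One sentence on open-weight models.",
            |         options={"num_gpu": 30},
            |     )
            |     print(resp["response"])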
        
         | tambourine_man wrote:
         | Praising is good. Gratitude is a bit much. They got this big by
         | selling user generated content and private info to the highest
         | bidder. Often through questionable means.
         | 
         | Also, the underdog always touts Open Source and standards, so
          | it's good to remain skeptical when/if the tables turn.
        
           | sheepscreek wrote:
            | All said and done, it is a very _expensive_ and ballsy way
            | to undercut competitors. They've spent > $5B on hardware alone,
           | much of which will depreciate in value quickly.
           | 
           | Pretty sure the only reason Meta's managed to do this is
           | because of Zuck's iron grip on the board (majority voting
           | rights). This is great for Open Source and regular people
           | though!
        
             | wrsh07 wrote:
             | Zuck made a bet when they provisioned for reels to buy
             | enough GPUs to be able to spin up another reels-sized
             | service.
             | 
              | Llama is probably just running on spare capacity (I mean,
              | sure, they've kept increasing capex, but if they're worried
              | about an LLM-based FB competitor they sort of have to in
              | order to enact their copycat strategy).
        
             | fractalf wrote:
             | Well, he didn't do it to be "nice", you can be sure about
             | that. Obviously they see a financial gain
              | somewhere/sometime.
        
             | tambourine_man wrote:
              | At Meta's level, spending $5B to stay competitive is not
              | ballsy. It's a bargain.
        
           | ricardo81 wrote:
           | >selling user generated content and private info to the
           | highest bidder
           | 
            | It was always their modus operandi, surely. How else would
            | they have survived?
            | 
            | Thanks for returning everyone else's content, and never mind
            | all the content stealing your platform did.
        
           | jart wrote:
           | I'm perfectly happy with them draining the life essence out
           | of the people crazy enough to still use Facebook, if they're
           | funneling the profits into advancing human progress with AI.
           | It's an Alfred Nobel kind of thing to do.
        
         | germinalphrase wrote:
         | "Come to think of it, the negative impact on such competitors
         | was likely a key motivation for releasing the new models."
         | 
         | "Commoditize Your Complement" is often cited here:
         | https://gwern.net/complement
        
         | tintor wrote:
         | > they have brought the cost of pretraining a frontier 400B+
         | parameter model to ZERO
         | 
         | It is still far from zero.
        
           | cs702 wrote:
           | If the model is already pretrained, there's no need to
           | pretrain it, so the cost of pretraining is zero.
        
             | moffkalast wrote:
             | Yeah but you only have the one model, and so far it seems
             | to be only good on paper.
        
         | pwdisswordfishd wrote:
         | Makes me wonder why he's really doing this. Zuckerberg being
         | Zuckerberg, it can't be out of any genuine sense of altruism.
         | Probably just wants to crush all competitors before he
         | monetizes the next generation of Meta AI.
        
           | spiralk wrote:
            | It's certainly not altruism. Given that Facebook/Meta owns
            | the largest user data collection systems, any advancement in
            | AI ultimately strengthens their business model (which is
            | still mostly collecting private user data, amassing large
            | user datasets, and selling targeted ads).
           | 
           | There is a demo video that shows a user wearing a Quest VR
           | headset and asks the AI "what do you see" and it interprets
           | everything around it. Then, "what goes well with these
           | shorts"... You can see where this is going. Wearing headsets
           | with AIs monitoring everything the users see and collecting
           | even more data is becoming normalized. Imagine the private
           | data harvesting capabilities of the internet but anywhere in
            | the physical world. People need not even choose to wear a
            | Meta headset; simply passing a user wearing one in public
            | will be enough to have private data collected. This will be
            | the inevitable result of vision model improvements
           | integrated into mobile VR/AR headsets.
        
             | goatlover wrote:
             | That's very dystopian. It's bad enough having cameras
             | everywhere now. I never opted in to being recorded.
        
             | warkdarrior wrote:
             | That sounds fantastic. If they make the Meta headset easy
              | to wear and somewhat fashionable (closer to eyeglasses than
             | to a motorcycle helmet), I'd take it everywhere and record
             | everything. Give me a retrospective search and
             | conferences/meetings will be so much easier (I am terrible
             | with names).
        
           | phyrex wrote:
           | You can always listen to the investor calls for the
           | capitalist point of view. In short, attracting talent,
           | building the ecosystem, and making it really easy for users
           | to make stuff they want to share on Meta's social networks
        
           | bun_at_work wrote:
           | I really think the value of this for Meta is content
           | generation. More open models (especially state of the art)
           | means more content is being generated, and more content is
           | being shared on Meta platforms, so there is more advertising
           | revenue for Meta.
        
           | chasd00 wrote:
           | All the content generated by llms (good or bad) is going to
           | end up back in Facebook/Instagram and other social media
           | sites. This enables Meta to show growth and therefore demand
           | a higher stock price. So it makes sense to get content
           | generation tools out there as widely as possible.
        
         | troupo wrote:
         | There's nothing open source about it.
         | 
         | It's a proprietary dump of data you can't replicate or verify.
         | 
          | What were the sources? What datasets was it trained on? What
          | are the training parameters? And so on and so on.
        
         | advael wrote:
         | Look, absolutely zero people in the world should trust any tech
         | company when they say they care about or will keep commitments
         | to the open-source ecosystem in any capacity. Nevertheless, it
         | is occasionally strategic for them to do so, and there can be
         | ancillary benefits for said ecosystem in those moments where
         | this is the best play for them to harm their competitors
         | 
         | For now, Meta seems to release Llama models in ways that don't
         | significantly lock people into their infrastructure. If that
         | ever stops being the case, you should fork rather than trust
         | their judgment. I say this knowing full well that most of the
         | internet is on AWS or GCP, most brick and mortar businesses use
         | Windows, and carrying a proprietary smartphone is essentially
         | required to participate in many aspects of the modern economy.
         | All of this is a mistake. You can't resist all lock-in. The
         | players involved effectively run the world. You should still
         | try where you can, and we should still be happy when tech
         | companies either slip up or make the momentary strategic
         | decision to make this easier
        
           | ori_b wrote:
           | > _If that ever stops being the case, you should fork rather
           | than trust their judgment._
           | 
           | Fork what? The secret sauce is in the training data and
           | infrastructure. I don't think either of those is currently
           | open.
        
             | quasse wrote:
             | I'm just a lowly outsider to the AI space, but calling
             | these open source models seems kind of like calling a
             | compiled binary open source.
             | 
             | If you don't have a way to replicate what they did to
             | create the model, it seems more like freeware than open
             | source.
        
               | advael wrote:
               | As an ML researcher, I agree. Meta doesn't include
               | adequate information to replicate the models, and from
               | the perspective of fundamental research, the interest
               | that big tech companies have taken in this field has been
               | a significant impediment to independent researchers,
               | despite the fact that they are undeniably producing
               | groundbreaking results in many respects, due to this
               | fundamental lack of openness
               | 
               | This should also make everyone very skeptical of any
               | claim they are making, from benchmark results to the
               | legalities involved in their training process to the
               | prospect of future progress on these models. Without
               | being able to vet their results against the same datasets
               | they're using, there is no way to verify what they're
               | saying, and the credulity that otherwise smart people
               | have been exhibiting in this space has been baffling to
               | me
               | 
               | As a developer, if you have a working Llama model,
               | including the source code and weights, and it's crucial
               | for something you're building or have already built, it's
               | still fundamentally a good thing that Meta isn't gating
               | it behind an API and if they went away tomorrow, you
               | could still use, self-host, retrain, and study the models
        
               | warkdarrior wrote:
               | The model is public, so you can at least verify their
               | benchmark claims.
        
               | Nuzzerino wrote:
               | Which option would be better?
               | 
               | A) Release the data, and if it ends up causing a privacy
               | scandal, at least you can actually call it open this
               | time.
               | 
               | B) Neuter the dataset, and the model
               | 
               | All I ever see in these threads is a lot of whining and
               | no viable alternative solutions (I'm fine with the idea
               | of it being a hard problem, but when I see this attitude
               | from "researchers" it makes me less optimistic about the
               | future)
               | 
               | > and the credulity that otherwise smart people have been
               | exhibiting in this space has been baffling to me
               | 
               | Remove the "otherwise" and you're halfway to
               | understanding your error.
        
               | Nuzzerino wrote:
               | > it seems more like freeware than open source.
               | 
               | What would you have them do instead? Specifically?
        
               | wongarsu wrote:
               | > If you don't have a way to replicate what they did to
               | create the model, it seems more like freeware
               | 
               | Isn't that a bit like arguing that a linux kernel driver
               | isn't open source if I just give you a bunch of GPL-
               | licensed source code that speaks to my device, but no
               | documentation how my device works? If you take away the
               | source code you have no way to recreate it. But so far
               | that never caused anyone to call the code not open-
               | source. The closest is the whole GPL3 Tivoization debate
               | and that was very divisive.
               | 
               | The heart of the issue is that open source is kind of
               | hard to define for anything that isn't software. As a
               | proxy we could look at Stallman's free software
               | definition. Free software shares a common history with
               | open source and in most open source software is
               | free/libre, and the other way around, so this might be a
               | useful proxy.
               | 
               | So checking the four software freedoms:
               | 
               | - The freedom to run the program as you wish, for any
               | purpose: For most purposes. There's that 700M user
               | restriction, also Meta forbids breaking the law and
               | requires you to follow their acceptable use policy.
               | 
               | - The freedom to study how the program works, and change
               | it so it does your computing as you wish: yes. You can
               | change it by fine tuning it, and the weights allow you to
               | figure out how it works. At least as well as anyone knows
               | how any large neural network works, but it's not like
               | Meta is keeping something from you here
               | 
               | - The freedom to redistribute copies so you can help your
               | neighbor: Allowed, no real asterisks
               | 
               | - The freedom to distribute copies of your modified
               | versions to others: Yes
               | 
               | So is it Free Software(tm)? Not really, but it is pretty
               | close.
        
             | JKCalhoun wrote:
             | A good point.
             | 
              | Forgive me, I am AI-naive, but is there some way to
              | harness Llama to train one's own actually-open AI?
        
               | advael wrote:
               | Kinda. Since you can self-host the model on a linux
               | machine, there's no meaningful way for them to prevent
               | you from having the trained weights. You can use this to
               | bootstrap other models, or retrain on your own datasets,
               | or fine-tune from the starting point of the currently-
               | working model. What you can't do is be sure what they
               | trained it on
        
               | QuercusMax wrote:
               | How open is it _really_ though? If you 're starting from
               | their weights, do you actually have legal permission to
               | use derived models for commercial purposes? If it turns
               | out that Meta used datasets they didn't have licenses to
               | use in order to generate the model, then you might be in
               | a big heap of mess.
        
               | ein0p wrote:
               | I could be wrong but most "model" licenses prohibit the
               | use of the models to improve other models
        
             | logicchains wrote:
             | They actually did open source the infrastructure library
             | they developed. They don't open source the data but they
             | describe how they gathered/filtered it.
        
           | ladzoppelin wrote:
            | Is forking really possible with an LLM, or one the size of
            | future Llama versions? Have they even released the weights
            | and everything? Maybe I am just negative about it because I
            | feel Meta is the worst company ever invented and feel this
            | will hurt society in the long run, just like Facebook.
        
             | lawlessone wrote:
             | > have they even released the weights?
             | 
              | Isn't that what the model is? Just a collection of weights?
        
             | pmarreck wrote:
             | When you run `ollama pull llama3.1:70b`, which you can
              | literally do right now (assuming ollama from ollama.com is
              | installed and you're not afraid of the terminal), and it
              | downloads a 40-gigabyte model, _those are the weights_!
             | 
             | I'd consider the ability to admit when even your most hated
             | adversary is doing something right, a hallmark of acting
             | smarter.
             | 
             | Now, they haven't released the training data with the model
             | weights. THAT plus the training tooling would be "end to
             | end open source". Apple actually did _that very thing_
              | recently, and it flew under almost everyone's radar for
             | some reason:
             | 
             | https://x.com/vaishaal/status/1813956553042711006?s=46&t=qW
             | a...
        
               | mym1990 wrote:
               | Doing something right vs doing something that seems right
               | but has a hidden self interest that is harmful in the
               | long run can be vastly different things. Often this kind
               | of strategy will allow people to let their guard down,
               | and those same people will get steamrolled down the road,
               | left wondering where it all went wrong. Get smarter.
        
               | pmarreck wrote:
               | How in the heck is an open source model that is free and
               | open today going to lock me down, down the line? This is
               | nonsense. You can literally run this model forever if you
               | use NixOS (or never touch your windows, macos or linux
               | install again). Zuck can't come back and molest it. Ever.
               | 
               | The best I can tell is that their self-interest here is
               | more about gathering mindshare. That's not a terrible
               | motive; in fact, that's a pretty decent one. It's not the
               | bully pressing you into their ecosystem with a tit-for-
               | tat; it's the nerd showing off his latest and going
               | "Here. Try it. Join me. Join us."
        
           | holoduke wrote:
           | In tech you can trust the underdogs. Once they turn into
            | dominant players, they turn evil in 99% of cases.
        
         | sandworm101 wrote:
         | >> Bravo! While I don't agree with Zuck's views and actions on
         | many fronts, on this occasion I think he and the AI folks at
         | Meta deserve our praise and gratitude.
         | 
         | Nope. Not one bit. Supporting F/OSS when it suits you in one
         | area and then being totally dismissive of it in _every other
          | area_ should not be lauded. How about open sourcing some of
          | FB's VR efforts?
        
         | y04nn wrote:
          | Don't be fooled, it is an "embrace, extend, extinguish"
          | strategy. Once they have enough usage and are the default
          | standard, they will start to find any possible way to make
          | you pay.
        
           | war321 wrote:
           | Hasn't really happened with PyTorch or any of their other
           | open sourced releases tbh.
        
         | tyler-jn wrote:
         | So far, it seems like this release has done ~nothing to the
         | stock price for GOOGL/MSFT, which we all know has been propped
         | up largely on the basis of their AI plans. So it's probably
         | premature to say that this has messed it up for them.
        
       | userabchn wrote:
       | Interview with Mark Zuckerberg released today:
       | https://www.bloomberg.com/news/videos/2024-07-23/mark-zucker...
        
       | starship006 wrote:
       | > Our adversaries are great at espionage, stealing models that
       | fit on a thumb drive is relatively easy, and most tech companies
       | are far from operating in a way that would make this more
       | difficult.
       | 
       | Mostly unrelated to the correctness of the article, but this
       | feels like a bad argument. AFAIK, Anthropic/OpenAI/Google are not
       | having issues with their weights being leaked (are they?). Why is
       | it that Meta's model weights are?
        
         | meowface wrote:
         | >AFAIK, Anthropic/OpenAI/Google are not having issues with
         | their weights being leaked. Why is it that Meta's model weights
         | are?
         | 
         | The main threat actors there would be powerful nation-states,
         | in which case they'd be unlikely to leak what they've taken.
         | 
         | It is a bad argument though, because one day possession of AI
         | models (and associated resources) might confer great and
         | dangerous power, and we can't just throw up our hands and say
         | "welp, no point trying to protect this, might as well let
         | everyone have it". I don't think that'll happen anytime soon,
         | but I am personally somewhat in the AI doomer camp.
        
         | whimsicalism wrote:
         | We have no way of knowing whether nation-state level actors
         | have access to those weights.
        
         | skybrian wrote:
         | I think it's hard to say. We simply don't know much from the
         | outside. Microsoft has had some pretty bad security lapses, for
         | example around guarding access to Windows source code. I don't
         | think we've seen a bad security break-in at Google in quite a
         | few years? It would surprise me if Anthropic and OpenAI had
         | good security since they're pretty new, and fast-growing
         | startups have a lot of organizational challenges.
         | 
         | It seems safe to assume that not all the companies doing
         | leading-edge LLM's have good security and that the industry as
         | a whole isn't set up to keep secrets for long. Things aren't
         | locked down to the level of classified research. And it sounds
         | like Zuckerberg doesn't want to play the game that way.
         | 
         | At the state level, China has independent AI research efforts
         | and they're going to figure it out. It's largely a matter of
         | timing, which could matter a lot.
         | 
         | There's still an argument to be made against making
         | proliferation too easy. Just because states have powerful
         | weapons doesn't mean you want them in the hands of people on
         | the street.
        
         | dfadsadsf wrote:
         | We have nationals/citizens of every major US adversary working
          | in those companies, with looser security practices than a
          | local warehouse. The security check before hiring is a joke
          | (it mostly verifies that the resume checks out), laptops can
          | be taken home, and internal communications are not segmented
          | on a need-to-know basis. Essentially, if China wants weights
          | or source code, it will have hundreds of people to choose
          | from who can provide it.
        
       | probablybetter wrote:
       | I would avoid Facebook and Meta products in general. I do NOT
       | trust them. We have approx. 20 years of their record to go upon.
        
       | diggan wrote:
       | > Today we're taking the next steps towards open source AI
       | becoming the industry standard. We're releasing Llama 3.1 405B,
       | the first frontier-level open source AI model,
       | 
       | Why do people keep mislabeling this as Open Source? The whole
       | point of calling something Open Source is that the "magic sauce"
        | of how to build something is publicly available, so I could build
       | it myself if I have the means. But without the training data
       | publicly available, could I train Llama 3.1 if I had the means?
       | No wonder Zuckerberg doesn't start with defining what Open Source
       | actually means, as then the blogpost would have lost all meaning
       | from the get go.
       | 
       | Just call it "Open Model" or something. As it stands right now,
       | the meaning of Open Source is being diluted by all these
        | companies pretending to do one thing, while actually doing
       | something else.
       | 
        | I initially got very excited seeing the title and the domain,
       | but hopelessly sad after reading through the article and
       | realizing they're still trying to pass their artifacts off as
       | Open Source projects.
        
         | valine wrote:
         | The codebase to do the training is way less valuable than the
         | weights for the vast majority of people. Releasing the training
         | code would be nice, but it doesn't really help anyone but
         | Meta's direct competitors.
         | 
         | If you want to train on top of Llama there's absolutely nothing
         | stopping you. Plenty of open source tools to do parameter
         | optimization.
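          | 
          | A minimal sketch of what that looks like with common open
          | tooling (transformers + peft); the HF repo id below is an
          | assumption, and gated access to the weights is required:
          | 
          |     import torch
          |     from peft import LoraConfig, get_peft_model
          |     from transformers import AutoModelForCausalLM, AutoTokenizer
          |     
          |     model_id = "meta-llama/Meta-Llama-3.1-8B"  # assumed repo id
          |     tok = AutoTokenizer.from_pretrained(model_id)
          |     model = AutoModelForCausalLM.from_pretrained(
          |         model_id, torch_dtype=torch.bfloat16, device_map="auto"
          |     )
          |     
          |     # Freeze the base weights; train only small low-rank adapters.
          |     lora = LoraConfig(r=16, lora_alpha=32,
          |                       target_modules=["q_proj", "v_proj"],
          |                       task_type="CAUSAL_LM")
          |     model = get_peft_model(model, lora)
          |     model.print_trainable_parameters()
          |     # ...then run any standard causal-LM training loop
          |     # (e.g. the transformers Trainer) over your own data.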
        
           | diggan wrote:
            | Not just the training code but the training data as well
            | should be under a permissive license; otherwise you cannot
            | call the project itself Open Source, which Facebook does
            | here.
           | 
           | > is way less valuable than the weights for the vast majority
           | of people
           | 
           | The same is true for most Open Source projects, most people
           | use the distributed binaries or other artifacts from the
           | projects, and couldn't care less about the code itself. But
           | that doesn't warrant us changing the meaning of Open Source
           | just because companies feel like it's free PR.
           | 
           | > If you want to train on top of Llama there's absolutely
           | nothing stopping you.
           | 
           | Sure, but in order for the intent of Open Source to be true
           | for Llama, I should be able to build this project from
           | scratch. Say I have a farm of 100 A100's, could I reproduce
           | the Llama model from scratch today?
        
             | unshavedyak wrote:
             | > Not just the training code but the training data as well,
             | should be under a permissive license, otherwise you cannot
             | call the project itself Open Source, which Facebook does
             | here.
             | 
             | Does FB even have the capability to do that? I'd assume
             | there's a bunch of data that's not theirs and they can't
             | even release it. Let alone some data that they might not
             | want to admit is in the source.
        
               | bornfreddy wrote:
               | If not, it is questionable if they should train on such
               | data anyway.
               | 
               | Also, that doesn't matter in this discussion - if you are
               | unable to release the source under appropriate licence
               | (for whatever reason), you should not call it Open
               | Source.
        
             | talldayo wrote:
             | I will steelman the idea that a tokenizer and weights are
             | all you need for the "source" of an LLM. They are
             | components that can be modified, redistributed and when put
             | together, reproduce the full experience intended.
             | 
             | If we _insist_ upon the release of training data with Open
             | models, you might as well kiss the idea of usable Open LLMs
              | goodbye. Most of the content in training datasets like
             | The Pile are not licensed for redistribution in any way
             | shape or form. It would jeopardize projects that _do_ use
             | transparent training data while not offering anything of
             | value to the community compared to the training code.
             | Republishing all training data is an absolute trap.
        
               | enriquto wrote:
               | > Most of the content in training datasets like The Pile
               | are not licensed for redistribution in any way shape or
               | form.
               | 
               | But distributing the weights is a "form" of distribution.
               | You can recover many items of the dataset (most easily,
               | the outliers) by using the weights.
               | 
               | Just because they are codified in a non-readily
               | accessible way, does not mean that you are not
               | distributing them.
               | 
               | It's scary to think that "training" is becoming a thinly
               | veiled way to strip copyright of works.
        
               | talldayo wrote:
               | The weights are a transformed, lossy and non-complete
               | permutation of the training material. You _cannot_
               | recover most of the dataset reliably, which is what stops
               | it from being an outright replacement for the work it 's
               | trained on.
               | 
               | > does not mean that you are not distributing them.
               | 
               | Except you literally aren't distributing them. It's like
               | accusing me of pirating a movie because I sent a
               | screenshot or a scene description to my friend.
               | 
               | > It's scary to think that "training" is becoming a
               | thinly veiled way to strip copyright of works.
               | 
               | This is the way it's been for years. Google is given Fair
                | Use for redistributing incomplete parts of copyrighted
               | text materials verbatim, since their application is
               | transformative: https://en.wikipedia.org/wiki/Authors_Gui
               | ld,_Inc._v._Google,....
               | 
                | Or Corellium, who won their case to use copyrighted Apple
               | code in novel and transformative ways: https://www.forbes
               | .com/sites/thomasbrewster/2023/12/14/apple...
               | 
               | Copyright has always been a limited power.
        
             | jncfhnb wrote:
             | People don't typically modify distributed binaries.
             | 
              | People do typically modify model weights. They are the
              | preferred form in which to modify the model.
             | 
             | Saying "build" llama is just a nonsense comparison to
             | traditional compiled software. "Building llama" is more
             | akin to taking the raw weights as text and putting them
             | into a nice pickle file. Or loading it into an inference
             | engine.
             | 
             | Demanding that you have everything needed to recreate the
             | weights from scratch is like arguing an application cannot
             | be open source unless it also includes the user testing
             | history and design documents.
             | 
             | And of course some idiots don't understand what a pickled
             | weights file is and claim it's as useless as a distributed
             | binary if you want to modify the program just because it is
             | technically compiled; not understanding that the point of
             | the pickled file is "convenience" and that it unpacks back
             | to the original form. Like arguing open source software
             | can't be distributed in zip files.
             | 
             | > Say I have a farm of 100 A100's, could I reproduce the
             | Llama model from scratch today?
             | 
             | Say you have a piece of paper. Can you reproduce
             | `print("hello world")` from scratch?
        
         | vngzs wrote:
         | Agreed. The Linux kernel source contains everything you need to
         | produce Linux kernel binaries. The llama source does not
         | contain what you need to produce llama models. Facebook is
         | using sleight of hand to garner favor with open model weights.
         | 
         | Open model weights are still commendable, but it's a far cry
         | from open-source (or even _libre_ ) software!
        
         | elromulous wrote:
         | 100%. With this licensing model, meta gets to reap the benefits
         | of open source (people contributing, social cachet), without
         | any of the real detriment (exposing secret sauce).
        
         | hbn wrote:
         | Is that even something they keep on hand? Or would WANT to keep
         | on hand? I figured they're basically sending a crawler to go
         | nuts reading things and discard the data once they've trained
         | on it.
         | 
          | If that included, e.g., reading all of GitHub for code, I
          | wouldn't expect them to host an entire separate read-only copy
          | of GitHub because they trained on it and say "this is part of
         | our open source model"
        
         | jdminhbg wrote:
         | > Why do people keep mislabeling this as Open Source? The whole
         | point of calling something Open Source is that the "magic
         | sauce" of how to build something is publicly available, so I
         | could built it myself if I have the means. But without the
         | training data publicly available, could I train Llama 3.1 if I
         | had the means?
         | 
         | I don't think not releasing the commit history of a project
         | makes it not Open Source, this seems like that to me. What's
         | important is you can download it, run it, modify it, and re-
         | release it. Being able to see how the sausage was made would be
         | interesting, but I don't think Meta have to show their training
         | data any more than they are obligated to release their planning
         | meeting notes for React development.
         | 
         | Edit: I think the restrictions in the license itself are good
         | cause for saying it shouldn't be called Open Source, fwiw.
        
           | thenoblesunfish wrote:
           | You don't need to have the commit history to see "how it
           | works". ML that works well does so in huge part due to the
           | training data used. The leading models today aren't
           | distinguished by the way they're trained, but what they're
           | trained on.
        
             | jdminhbg wrote:
             | I agree that you need training data to build AI from
             | scratch, much like you need lots of really smart developers
             | and a mailing list and servers and stuff to build the Linux
             | kernel from scratch. But it's not like having the training
             | data and training code will get you the same result, in the
             | way something like open data in science is about
             | replicating results.
        
           | tempfile wrote:
           | For the freedom to change to be effective, a user must be
           | given the software in a form they can modify. Can you tweak
           | an LLM once it's built? (I genuinely don't know the answer)
        
             | jdminhbg wrote:
             | Yes, you can finetune Llama:
             | https://llama.meta.com/docs/how-to-guides/fine-tuning/
        
           | diggan wrote:
           | > I don't think not releasing the commit history of a project
           | makes it not Open Source,
           | 
           | Right, I'm not talking about the commit history, but rather
           | that anyone (with means) should be able to produce the final
           | artifact themselves, if they want. For weights like this,
           | that requires at least the training script + the training
           | data. Without that, it's very misleading to call the project
           | Open Source, when only the result of the training is
           | released.
           | 
           | > What's important is you can download it, run it, modify it,
           | and re-release it
           | 
           | But I literally cannot download the project, build it and run
           | it myself? I can only use the binaries (weights) provided by
           | Meta. No one can modify how the artifact is produced, only
           | modify the already produced artifact.
           | 
           | That's like saying that Slack is Open Source because if I
           | want to, I could patch the binary with a hex editor and
           | add/remove things as I see fit? No one believes Slack should
           | be called Open Source for that.
        
             | jdminhbg wrote:
             | > Right, I'm not talking about the commit history, but
             | rather that anyone (with means) should be able to produce
             | the final artifact themselves, if they want. For weights
             | like this, that requires at least the training script + the
             | training data.
             | 
             | You cannot produce the final artifact with the training
             | script + data. Meta also cannot reproduce the current
             | weights with the training script + data. You could produce
             | some other set of weights that are just about as good, but
             | it's not a deterministic process like compiling source
             | code.
             | 
             | > That's like saying that Slack is Open Source because if I
             | want to, I could patch the binary with a hex editor and
             | add/remove things as I see fit? No one believes Slack
             | should be called Open Source for that.
             | 
             | This analogy doesn't work because it's not like Meta can
             | "patch" Llama any more than you can. They can only finetune
             | it like everyone else, or produce an entirely different LLM
             | by training from scratch like everyone else.
             | 
             | The right to release your changes is another difference; if
             | you patch Slack with a hex editor to do some useful thing,
             | you're not allowed to release that changed Slack to others.
             | 
             | If Slack lost their source code, went out of business, and
             | released a decompiled version of the built product into the
             | public domain, that would in some sense be "open source,"
             | even if not as good as something like Linux. LLMs though do
             | not have a source code-like representation that is easily
             | and deterministically modifiable like that, no matter who
             | the owner is or what the license is.
        
         | unraveller wrote:
          | Open weights is not open source, for sure, but I don't mind it
          | being stated as an aspirational goal; the moment it is legally
          | possible to publish the source without shooting themselves in
          | the foot, they should do it.
         | 
         | They could release 50% of their best data but that would only
         | stop them from attracting the best talent.
        
         | JeremyNT wrote:
         | > _Why do people keep mislabeling this as Open Source?_
         | 
         | I guess this is a rhetorical question, but this is a press
         | release from Meta itself. It's just a marketing ploy, of
         | course.
        
         | blcknight wrote:
         | InstructLab and the Granite Models from IBM seem the closest to
         | being open source. Certainly more than whatever FB is doing
         | here.
         | 
         | (Disclaimer: I work for an IBM subsidiary but not on any of
         | these products)
        
       | hubraumhugo wrote:
       | The big winners of this: devs and AI startups
       | 
       | - No more vendor lock-in
       | 
       | - Instead of just wrapping proprietary API endpoints, developers
       | can now integrate AI deeply into their products in a very cost-
       | effective and performant way
       | 
        | - A price race to the bottom, with near-instant LLM responses
        | at very low prices, is on the horizon
       | 
       | As a founder, it feels like a very exciting time to build a
       | startup as your product automatically becomes better, cheaper,
       | and more scalable with every major AI advancement. This leads to
       | a powerful flywheel effect: https://www.kadoa.com/blog/ai-
       | flywheel
        
         | danielmarkbruce wrote:
         | It creates the opposite of a flywheel effect for you. It
         | creates a leapfrog effect.
        
           | boringg wrote:
            | AI might cannibalize a lot of first-gen AI businesses.
        
         | boringg wrote:
         | - Price race to the bottom with near-instant LLM responses at
         | very low prices are on the horizon
         | 
          | Maybe a big price war while the market majors fight it out for
          | positioning, but they still need to make money off their
          | investments, so someone is going to have to raise prices at
          | some point, and you'll be locked into their system if you
          | build on it.
        
       | mav3ri3k wrote:
        | I am not deep into LLMs, so I ask this. From my understanding,
        | their last model was open source in the sense that you could use
        | it, but the inner workings were "hidden"/not transparent.
        | 
        | With the new model, I am seeing a lot about how open source they
        | are and how they can be built upon. Is it now completely open
        | source, or is it similar to their last models?
        
         | whimsicalism wrote:
         | It's intrinsic to transformers that the inner workings are
         | largely inscrutable. This is no different, but it does not mean
         | they cannot be built upon.
         | 
         | Gradient descent works on these models just like the prior
         | ones.
        
       | carimura wrote:
       | Looks like you can already try out Llama-3.1-405b on Groq,
       | although it's timing out. So. Hugged I guess.
        
         | TechDebtDevin wrote:
         | All the big providers should have it up by end of day. They
         | just change their API configs (they're just reselling you AWS
         | Bedrock).
        
           | jamiedg wrote:
           | 405B and the other Llama 3.1 models are working and available
           | on Together AI. https://api.together.ai
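            | 
            | A quick sketch of calling it through an OpenAI-compatible
            | endpoint; the base URL and model id below are assumptions,
            | so check the provider's docs, and set TOGETHER_API_KEY
            | first:
            | 
            |     import os
            |     from openai import OpenAI
            |     
            |     client = OpenAI(
            |         api_key=os.environ["TOGETHER_API_KEY"],
            |         base_url="https://api.together.xyz/v1",  # assumed
            |     )
            |     resp = client.chat.completions.create(
            |         # assumed model id for the 405B instruct model
            |         model="meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo",
            |         messages=[{"role": "user", "content": "Hello, Llama"}],
            |     )
            |     print(resp.choices[0].message.content)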
        
       | mensetmanusman wrote:
       | It's easy to support open source AI when the code is 1,000 lines
       | and the execution costs $100,000,000 of electricity.
       | 
       | Only the big players can afford to push go, and FB would love to
       | see OpenAI's code so they can point it to their proprietary user
       | data.
        
       | bun_at_work wrote:
       | Meta makes their money off advertising, which means they profit
       | from attention.
       | 
       | This means they need content that will grab attention, and
       | creating open source models that allow anyone to create any
       | content on their own becomes good for Meta. The users of the
       | models can post it to their Instagram/FB/Threads account.
       | 
       | Releasing an open model also releases Meta from the burden of
       | having to police the content the model generates, once the open
       | source community fine-tunes the models.
       | 
        | Overall, this is a good business move for Meta - the post
        | doesn't really talk about the true benefit, instead moralizing
        | about open source, but it is a sound one nonetheless.
        
         | jklinger410 wrote:
         | This is a great point. Eventually, Meta will only allow Llama-
         | generated visual AI content on its platforms. They'll put a
         | little key in the image that clears it with the platform.
         | 
         | Then all other visual AI content will be banned, if that is
         | where legislation is heading.
        
         | natural219 wrote:
         | AI moderators too would be an enormous boon if they could get
         | that right.
        
           | KaiserPro wrote:
           | It would be good, but the cost per moderation is still really
           | high for it to be practical.
        
         | noiseinvacuum wrote:
         | Creating content with AI will surely be helpful for social
         | media to some extent, but I think it's not that important in
         | the larger scheme of things. There's already a vast sea of
         | content being created by humans, and the differentiation is
         | already in recommending the right content to the right people
         | at the right time.
         | 
         | More important are the products that Meta will be able to make
         | if the industry standardizes on Llama. They would have a front
         | seat, not just with access to the latest unreleased models but
         | also in setting the direction of progress and what the next
         | generation of LLMs optimizes for. If you're Twitter or Snap or
         | TikTok or otherwise compete with Meta on product, then good
         | luck trying to keep up.
        
         | apwell23 wrote:
         | I am not sure I follow this.
         | 
         | 1. Is there such a thing as 'attention grabbing AI content'?
         | Most AI content I see is the opposite of 'attention grabbing'.
         | The Kindle store is flooded with this garbage and none of it
         | is particularly 'attention grabbing'.
         | 
         | 2. Why would creation of such content, even if it was truly
         | attention grabbing, benefit Meta in particular?
         | 
         | 3. How would proliferation of AI content lead to more ad spend
         | in the economy? Ad budgets won't increase because of AI
         | content.
         | 
         | To me this is a typical Zuckerberg play. Attach Meta's name to
         | whatever is trendy at the moment, like the (now forgotten)
         | metaverse, cryptocoins, and a bunch of other failed stuff that
         | was trendy for a second. Meta is NOT a Gen AI company, despite
         | him scamming (more like colluding with) the market to believe
         | it is. A mere distraction from slowing user growth on ALL of
         | Meta's apps.
        
           | bun_at_work wrote:
           | Sure - there is plenty of attention grabbing AI content - it
           | doesn't have to grab _your_ attention, and it won't work for
           | everyone. I have seen people engaging with apps that redo a
           | selfie to look like a famous character or put the person in a
           | movie scene, for example.
           | 
           | Every piece of content in any feed (good, bad, or otherwise)
           | benefits the aggregator (Meta, YouTube, whatever), because
           | someone will look at it. Not everything will go viral, but it
           | doesn't matter. Scroll whatever on Twitter, YouTube Shorts,
           | Reddit, etc. Meta has a massive presence in social media, so
           | content being generated is shared there.
           | 
           | More content of any type leads to more engagement on the
           | platforms where it's being shared. Every Meta feed serves the
           | viewer an ad (for which Meta is paid) every 3 or so posts
           | (pieces of content). It doesn't matter if the user dislikes
           | 1 in 5 posts or whatever, the number of ads still goes up.
        
             | apwell23 wrote:
             | > it doesn't have to grab _your_ attention
             | 
             | I am talking about in general, not me personally. No
             | popular content on any website/platform is AI generated.
             | Maybe you have examples that lead you to believe that it's
             | possible on a mass scale.
             | 
             | > look like a famous character or put the person in a movie
             | scene
             | 
             | What attention-grabbing movie used gen AI persons?
        
       | resters wrote:
       | This is really good news. Zuck sees the inevitability of it and
       | the dystopian regulatory landscape and decided to go all in.
       | 
       | This also has the important effect of neutralizing the critique
       | of US Government AI regulation because it will democratize
       | "frontier" models and make enforcement nearly impossible. Thank
       | you, Zuck, this is an important and historic move.
       | 
       | It also opens up the market to a lot more entry in the area of
       | "ancillary services to support the effective use of frontier
       | models" (including safety-oriented concerns), which should really
       | be the larger market segment.
        
         | passion__desire wrote:
         | Probably, Yann Lecun is the Lord Varys here. He has Mark's ear
         | and Mark believes in Yann's vision.
        
         | war321 wrote:
         | Unfortunately, there are a number of AI safety people that are
         | still crowing about how AI models need to be locked down, with
         | some of them loudly pivoting to talking about how open source
         | models aid China.
         | 
         | Plus there's still the spectre of SB-1047 hanging around.
        
       | amelius wrote:
       | > One of my [Mark Zuckerberg, ed.] formative experiences has been
       | building our services constrained by what Apple will let us build
       | on their platforms. Between the way they tax developers, the
       | arbitrary rules they apply, and all the product innovations they
       | block from shipping, it's clear that Meta and many other
       | companies would be freed up to build much better services for
       | people if we could build the best versions of our products and
       | competitors were not able to constrain what we could build.
       | 
       | This is hard to disagree with.
        
         | glhaynes wrote:
         | I think it's very easy to disagree with!
         | 
         | If Zuckerberg had his way, mobile device OSes would let Meta
         | ingest microphone and GPS data 24/7 (just like much of the
         | general public already _thinks_ they do because of the
         | effectiveness of the other sorts of tracking they are able to
         | do).
         | 
         | There are certainly legit innovations that haven't shipped
         | because gatekeepers don't allow them. But there've been lots of
         | harmful "innovations" blocked, too.
        
       | throwaway1194 wrote:
       | I strongly suspect that what AI will end up doing is push
       | companies and organizations towards open source, they will
       | eventually realize that code is already being shared via AI
       | channels, so why not do it legally with open source?
        
         | talldayo wrote:
         | > they will eventually realize that code is already being
         | shared via AI channels
         | 
         | Private repos are not being reproduced by any modern AI. Their
         | source code is safe, although AI arguably lowers the bar to
         | compete with them.
        
       | whimsicalism wrote:
       | OpenAI needs to release a new model that sets a new capabilities
       | high point. This is existential for them now.
        
       | ChrisArchitect wrote:
       | Related:
       | 
       |  _Llama 3.1 Official Launch_
       | 
       | https://news.ycombinator.com/item?id=41046540
        
       | m3kw9 wrote:
       | The truth is we need both closed and open source; each has its
       | own discovery path, advantages, and disadvantages, and there
       | shouldn't be a system where one is eliminated in favor of the
       | other. They also seem to be driving each other forward via
       | competition.
        
       | typpo wrote:
       | Thanks to Meta for their work on safety, particularly Llama
       | Guard. Llama Guard 3 adds defamation, elections, and code
       | interpreter abuse as detection categories.
       | 
       | Having run many red teams recently as I build out promptfoo's red
       | teaming featureset [0], I've noticed the Llama models punch above
       | their weight in terms of accuracy when it comes to safety. People
       | hate excessive guardrails and Llama seems to thread the needle.
       | 
       | Very bullish on open source.
       | 
       | [0] https://www.promptfoo.dev/docs/red-team/
        
         | swyx wrote:
         | Is there a #2 to Llama Guard? Meta seems curiously alone in
         | doing this kind of, let's call it, "practical safety" work.
        
       | enriquto wrote:
       | It's alarming that he refers to llama as if it was open source.
       | 
       | The definition of free software (and open source, for that
       | matter) is well-established. The same definition applies to all
       | programs, whether they are "AI" or not. In any case, if a program
       | was built by training against a dataset, the whole dataset is
       | part of the source code.
       | 
       | Llama is distributed in binary form, and it was built based on a
       | secret dataset. Referring to it as "open source" is not
       | ignorance, it's malice.
        
         | Nesco wrote:
         | The training data most likely contains insane amounts of
         | copyrighted material. That's why virtually none of the "open
         | models" come with their training data.
        
           | enriquto wrote:
           | > The training data contains most likely insane amounts of
           | copyrighted material.
           | 
           | If that is the case then the weights must inherit all these
           | copyrights. It has been shown (at least in image processing)
           | that you can extract many training images from the weights,
           | almost verbatim. Hiding the training data does not solve this
           | issue.
           | 
           | But regardless of copyright issues, people here are
           | complaining about the malicious use of the term "open
           | source", to signify a completely different thing (more like
           | "open api").
        
             | tempfile wrote:
             | > If that is the case then the weights must inherit all
             | these copyrights.
             | 
             | Not if it's a fair use (which is obviously the defence
             | they're hoping for)
        
         | jdminhbg wrote:
         | > In any case, if a program was built by training against a
         | dataset, the whole dataset is part of the source code.
         | 
         | I'm not sure why I keep seeing this. What is the equivalent of
         | the training data for something like the Linux kernel?
        
           | enriquto wrote:
           | > What is the equivalent of the training data for something
           | like the Linux kernel?
           | 
           | It's the source code.
           | 
           | For the linux kernel:
           | 
           |     compile(sourcecode) = binary
           | 
           | For llama:
           | 
           |     train(data) = weights
        
             | jdminhbg wrote:
             | That analogy doesn't work. `train` is not a deterministic
             | process. Meta has all of the training data and all of the
             | supporting source code and they still won't get the same
             | `weights` if they re-run the process.
             | 
             | The weights are the result of the development process, like
             | the source code of a program is the result of a development
             | process.
        
       | indus wrote:
       | Is there an argument against Open Source AI?
       | 
       | Not the usual nation-state rhetoric, but something that justifies
       | that closed source leads to better user-experience and fewer
       | security and privacy issues.
       | 
       | An ecosystem that benefits vendors, customers, and the makers of
       | closed source?
       | 
       | Are there historical analogies other than Microsoft Windows or
       | Apple iPhone / iOS?
        
         | kjkjadksj wrote:
         | Let's take the iPhone. Secured by the industry's best security
         | teams, I am sure. Closed source, yet teenagers in Eastern
         | Europe have cracked into it dozens of times making jailbreaks.
         | Every law enforcement agency can crack into it. Closed source
         | is not a security moat, but a trade protection moat.
        
         | finolex1 wrote:
         | Replace "Open Source AI" in "is there an argument against xxx"
         | with bioweapons or nuclear missiles. We are obviously not at
         | that stage yet, but it could be a real, non-trivial concern in
         | the near future.
        
       | GaggiX wrote:
       | Llama 3.1 405B is on par with GPT-4o and Claude 3.5 Sonnet, the
       | 70B model is better than GPT 3.5 turbo, incredible.
        
       | itissid wrote:
       | How are smaller models distilled from large models? I know of
       | LoRA, quantization, and similar techniques, but does distilling
       | also mean generating new conversational datasets entirely from
       | the big models to train smaller models on many simpler tasks?
        
         | tintor wrote:
         | Smaller models can be trained to match the log probs of the
         | larger model. The larger model can also be used to generate
         | synthetic data for the smaller model.
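         | 
         | A minimal sketch of the log-prob matching idea, assuming
         | PyTorch and a student/teacher pair that share a tokenizer (not
         | any official Meta recipe):
         | 
         |     import torch.nn.functional as F
         | 
         |     def distillation_loss(student_logits, teacher_logits,
         |                           temperature=2.0):
         |         # KL divergence between softened teacher and student
         |         # next-token distributions.
         |         t = temperature
         |         s = F.log_softmax(student_logits / t, dim=-1)
         |         p = F.softmax(teacher_logits / t, dim=-1)
         |         # t^2 keeps gradient scale comparable across temperatures.
         |         return F.kl_div(s, p, reduction="batchmean") * (t * t)
         | 
         | Per batch: run the frozen teacher under torch.no_grad(), run
         | the trainable student, then backprop this loss (optionally
         | mixed with ordinary cross-entropy on the labels) through the
         | student only.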
        
       | popcorncowboy wrote:
       | > Developers can run inference on Llama 3.1 405B on their own
       | infra at roughly 50% the cost of using closed models like GPT-4o
       | 
       | Does anyone have details on exactly what this means or where/how
       | this metric gets derived?
        
         | rohansood15 wrote:
         | I am guessing these are prices on services like AWS Bedrock
         | (their post is down right now).
        
         | PlattypusRex wrote:
         | a big chunk of that is probably the fact that you don't need to
         | pay someone who is trying to make a profit by running inference
         | off-premises.
        
       | wesleyyue wrote:
       | Just added Llama 3.1 405B/70B/8B to https://double.bot (VSCode
       | coding assistant) if anyone would like to try it.
       | 
       | ---
       | 
       | Some observations:
       | 
       | * The model is much better at trajectory correcting and putting
       | out a chain of tangential thoughts than other frontier models
       | like Sonnet or GPT-4o. Usually, these models are limited to
       | outputting "one thought", no matter how verbose that thought
       | might be.
       | 
       | * I remember in Dec of 2022 telling famous "tier 1" VCs that
       | frontier models would eventually be like databases: extremely
       | hard to build, but the best ones will eventually be open and win
       | as it's too important to too many large players. I remember the
       | confidence in their ridicule at the time but it seems
       | increasingly more likely that this will be true.
        
       | didip wrote:
       | Is it really open source though? You can't run these models for
       | your company. The license is extremely restrictive and there's NO
       | SOURCE CODE.
        
       | jamiedg wrote:
       | Looks like it's easy to test out these models now on Together AI
       | - https://api.together.ai
        
       | KingOfCoders wrote:
       | Open Source AI needs to include training data.
        
       | fsndz wrote:
       | Small language models are the path forward
       | https://medium.com/thoughts-on-machine-learning/small-langua...
        
       | pja wrote:
       | "Commoditise your complement" in action!
        
       | manishrana wrote:
       | really useful insights
        
       | bufferoverflow wrote:
       | Hard disagree. So far every big important model is closed-source.
       | Grok is sort-of the only exception, and it's not even that big
       | compared to the (already old) GPT-4.
       | 
       | I don't see open source being able to compete with the cutting-
       | edge proprietary models. There's just not enough money. GPT-5
       | will take an estimated $1.2 billion to train. MS and OpenAI are
       | already talking about building a $100 billion training data
       | center.
       | 
       | How can you compete with that if your plan is to give away the
       | training result for free?
        
         | sohamgovande wrote:
         | Where is the $1.2b number from?
        
           | bufferoverflow wrote:
           | There are a few numbers floating around, $1.2B being the
           | lowest estimate.
           | 
           | HSBC estimates the training cost for GPT-5 between $1.7B and
           | $2.5B.
           | 
           | Vlad Bastion Research estimates $1.25B - 2.25B.
           | 
           | Some people on HN estimate $10B:
           | 
           | https://news.ycombinator.com/item?id=39860293
        
       | smusamashah wrote:
       | Meta's article with more details on the new LLAMA 3.1
       | https://ai.meta.com/blog/meta-llama-3-1/
        
       | 6gvONxR4sf7o wrote:
       | > Third, a key difference between Meta and closed model providers
       | is that selling access to AI models isn't our business model.
       | That means openly releasing Llama doesn't undercut our revenue,
       | sustainability, or ability to invest in research like it does for
       | closed providers. (This is one reason several closed providers
       | consistently lobby governments against open source.)
       | 
       | The whole thing is interesting, but this part strikes me as
       | potentially anticompetitive reasoning. I wonder what the lines
       | are that they have to avoid crossing here?
        
         | phkahler wrote:
         | >> ...but this part strikes me as potentially anticompetitive
         | reasoning.
         | 
         | "Commoditize your complements" is an accepted strategy. And
         | while pricing below cost to harm competitors is often illegal,
         | the reality is that the marginal cost of software is zero.
        
           | Palomides wrote:
           | spending a very quantifiable large amount of money to release
           | something your nominal competitors charge for without having
           | your own direct business case for it seems a little much
        
             | phkahler wrote:
             | Companies spend very large amounts of money on all sorts of
             | things that never even get released. Nothing wrong with
             | releasing something for free that no longer costs you
             | anything. Who knows why they developed it in the first
             | place, it makes no difference.
        
       | frabjoused wrote:
       | Who knew FB would end up holding OpenAI's original ideals, while
       | OpenAI now holds early FB's ideals/integrity.
        
         | boringg wrote:
         | FB needed to differentiate drastically. FB is at its best
         | creating large data infra.
        
       | jmward01 wrote:
       | I never thought I would say this but thanks Meta.
       | 
       | *I reserve the right to remove this praise if they abuse this
       | open source model position in the future.
        
       | gooob wrote:
       | why do they keep training on publicly available online data, god
       | dammit? what the fuck. don't they want to make a good LLM? train
       | on the classics, on the essentials reference manuals for
       | different technologies, on history books, medical encyclopedias,
       | journal notes from the top surgeons and engineers, scientific
       | papers of the experiments that back up our fundamental theories.
       | we want quality information, not recent information. we already
       | have plenty of recent information.
        
       | mmmore wrote:
       | I appreciate that Mark Zuckerberg soberly and neutrally talked
       | about some of the risks from advances in AI technology. I agree
       | with others in this thread that this is more accurately called
       | "public weights" instead of open source, and in that vein I
       | noticed some issues in the article.
       | 
       | > This is one reason several closed providers consistently lobby
       | governments against open source.
       | 
       | Is this substantially true? I've noticed a tendency of those who
       | support the general arguments in this post to conflate the
       | beliefs of people concerned about AI existential risk, some of
       | whom work at the leading AI labs, with the position of the labs
       | themselves. In most cases I've seen, the AI labs (especially
       | OpenAI) have lobbied against any additional regulation on AI,
       | including with SB1047[1] and the EU AI Act[2]. Can anyone provide
       | an example of this in the context of actual legislation?
       | 
       | > On this front, open source should be significantly safer since
       | the systems are more transparent and can be widely scrutinized.
       | Historically, open source software has been more secure for this
       | reason.
       | 
       | This may be true if we could actually understand what was
       | happening in neural networks, or train them to consistently avoid
       | unwanted behaviors. As things are, the public weights are simply
       | inscrutable black boxes, and the existence of jailbreaks and
       | other strange LLM behaviors show that we don't understand how our
       | training processes create models' emergent behaviors. The
       | capabilities of these models and their influence are growing
       | faster than our understanding of them, and our ability to steer them
       | to behave precisely how we want, and that will only get harder as
       | the models get more powerful.
       | 
       | > At this point, the balance of power will be critical to AI
       | safety. I think it will be better to live in a world where AI is
       | widely deployed so that larger actors can check the power of
       | smaller bad actors.
       | 
       | This paragraph ignores the concept of offense/defense balance.
       | It's much easier to cause a pandemic than to stop one, and
       | cyberattacks, while not as bad as pandemics, seem to also favor
       | the attacker (this one is contingent on how much AI tools can
       | improve our ability to write secure code). At the extreme, it
       | would clearly be bad if everyone had access to an anti-matter
       | weapon large enough to destroy the Earth; at some level of
       | capability, we have to limit the commands an advanced AI will
       | follow from an arbitrary person.
       | 
       | That said, I'm unsure if limiting public weights at this time
       | would be good regulation. They do seem to have some benefits in
       | increasing research around alignment/interpretability, and I
       | don't know if I buy the argument that public weights are
       | significantly more dangerous from a "misaligned ASI" perspective
       | than many competing closed companies. I also don't buy the view
       | of some in the leading labs that we'll likely have "human level"
       | systems by the end of the decade; it seems possible but unlikely.
       | But I worry that Zuckerberg's vision of the future does not
       | adequately guard against downside risks, and is not compatible
       | with the way the technology will actually develop.
       | 
       | [1] https://thebulletin.org/2024/06/california-ai-bill-
       | becomes-a...
       | 
       | [2] https://time.com/6288245/openai-eu-lobbying-ai-act/
        
       | btbuildem wrote:
       | The "open source" part sounds nice, though we all know there's
       | nothing particularly open about the models (or their weights).
       | The barriers to entry remain the same - huge upfront investments
       | to train your own, and steep ongoing costs for "inference".
       | 
       | Is the vision here to treat LLM-based AI as a "public good", akin
       | to a utility provider in a civilized country (taxpayer funded,
       | govt maintained, not-for-profit)?
       | 
       | I think we could arguably call this "open source" when all the
       | infra blueprints, scripts and configs are freely available for
       | anyone to try and duplicate the state-of-the-art (resource and
       | grokking requirements notwithstanding)
        
         | brrrrrm wrote:
         | check out the paper. it's pretty comprehensive
         | https://ai.meta.com/research/publications/the-llama-3-herd-o...
        
       | openrisk wrote:
       | Open source "AI" is a proxy for democratising and making (much)
       | more widely useful the goodies of high performance computing
       | (HPC).
       | 
       | The HPC domain (data and compute intensive applications that
       | typically need vector, parallel or other such architectures) has
       | been around for the longest time, but confined to academic /
       | government tasks.
       | 
       | LLMs, with their famous "matrix multiply" at their very core, are
       | basically demolishing an ossified frontier where a few commercial
       | entities (Intel, Microsoft, Apple, Google, Samsung etc) have
       | defined for decades what computing looks like _for most people_.
       | 
       | Assuming that the genie is out of the bottle, the question is:
       | what is the shape of end-user devices that are optimally designed
       | to use compute intensive open source algorithms? The "AI PC" is
       | already a marketing gimmick, but could it be that Linux desktops
       | and smartphones will suddenly be "AI natives"?
       | 
       | For sure it's a transformational period and the landscape T+10 yrs
       | could be drastically different...
        
       | LarsDu88 wrote:
       | Obligatory reminder of why tech companies subsidize open source
       | projects: https://www.joelonsoftware.com/2002/06/12/strategy-
       | letter-v/
        
       | avivo wrote:
       | The FTC also recently put out a statement that is fairly pro-open
       | source: https://www.ftc.gov/policy/advocacy-research/tech-at-
       | ftc/202...
       | 
       | I think it's interesting to think about this question of open
       | source, benefits, risk, and even competition, without all of the
       | baggage that Meta brings.
       | 
       | I agree with the FTC, that the benefits of open-weight models are
       | significant for competition. _The challenge is in distinguishing
       | between good competition and bad competition._
       | 
       | Some kind of competition can harm consumers and critical public
       | goods, including democracy itself. For example, competing for
       | people's scarce attention or for their food buying, with
       | increasingly optimized and addictive innovations. Or competition
       | to build the most powerful biological weapons.
       | 
       | Other kinds of competition can massively accelerate valuable
       | innovation. The FTC must navigate a tricky balance here --
       | leaning into competition that serves consumers and the broader
       | public, while being careful about what kind of competition it is
       | accelerating that could cause significant risk and harm.
       | 
       | It's also obviously not just "big tech" that cares about the
       | risks behind open-weight foundation models. Many people have
       | written about these risks even before it became a subject of
       | major tech investment. (In other words, A16Z's framing is often
       | rather misleading.) There are many non-big tech actors who are
       | very concerned about current and potential negative impacts of
       | open-weight foundation models.
       | 
       | One approach which can provide the best of both worlds, is for
       | cases where there are significant potential risks, to ensure that
       | there is at least some period of time where weights are not
       | provided openly, in order to learn a bit about the potential
       | implications of new models.
       | 
       | Longer-term, there may be a line where models are too risky to
       | share openly, and it may be unclear what that line is. In that
       | case, it's important that we have governance systems for such
       | decisions that are not just profit-driven, and which can help us
       | continue to get the best of all worlds. (Plug: my organization,
       | the AI & Democracy Foundation; https://ai-dem.org/; is working to
       | develop such systems and hiring.)
        
         | whimsicalism wrote:
         | making food that people want to buy is good actually
         | 
         | i am not down with this concept of the chattering class
         | deciding what are good markets and what are bad, unless it is
         | due to broad-based and obvious moral judgements.
        
       | tpurves wrote:
       | 405 is a lot of B's. What does it take to run or host that?
        
         | danielmarkbruce wrote:
         | Quantize to 0 bits. Run on a potato.
         | 
         | Jokes aside: ~405B params x 2 bytes each (FP16) is roughly 810
         | GB, so maybe 1000 GB or so required in reality; you'd need
         | maybe 2 AWS p5 instances?
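         | 
         | A back-of-envelope sketch of that arithmetic (weights only,
         | ignoring KV cache and activations, which is why real
         | deployments budget extra headroom):
         | 
         |     PARAMS = 405e9
         |     for name, bytes_per_param in [("FP16/BF16", 2),
         |                                   ("INT8", 1),
         |                                   ("4-bit", 0.5)]:
         |         gb = PARAMS * bytes_per_param / 1e9
         |         print(f"{name}: ~{gb:.0f} GB of weights")
         | 
         |     # FP16/BF16 ~810 GB, INT8 ~405 GB, 4-bit ~202 GB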
        
       | dang wrote:
       | Related ongoing thread:
       | 
       |  _Llama 3.1_ - https://news.ycombinator.com/item?id=41046540 -
       | July 2024 (114 comments)
        
       | littlestymaar wrote:
       | I love how Zuck decided to play a new game called "commoditize
       | some other billionaire's business to piss him off". I can't wait
       | until this becomes a trend and we get plenty of cool open source
       | stuff.
       | 
       | If he really wants to replicate Linux's success against
       | proprietary Unices, he needs to release Llama with some kind of
       | GPL equivalent, that forces everyone to play the open source
       | game.
        
       | Dwedit wrote:
       | Without the raw data that trained the model, how is it open
       | source?
        
       | suyash wrote:
       | Open source is a welcome step but what we really need is complete
       | decentralisation so people can run their own private AI Models
       | that keep all the data private to them. We need this to happen
       | locally on laptops, mobile phones, smart devices etc. Waiting for
       | when that will become ubiquitous.
        
       | zoogeny wrote:
       | Totally tangential thought, probably doomed to be lost in the
       | flood of comments on this very interesting announcement.
       | 
       | I was thinking today about Musk, Zuckerberg and Altman. Each
       | claims that the next version of their big LLMs will be the best.
       | 
       | For some reason it reminded me of one apocryphal cause of WW1,
       | which was that the kings of Europe were locked in a kind of ego
       | driven contest. It made me think about the Nation State as a
       | technology. In some sense, the kings were employing the new
       | technology which was clearly going to be the basis for the future
       | political order. And they were pitting their own implementation
       | of this new technology against the other kings.
       | 
       | I feel we are seeing a similar clash of kings playing out. The
       | claims that this is all just business or some larger claim about
       | the good of humanity seem secondary to the ego stakes of the
       | major players. And when it was about who built the biggest
       | rocket, it felt less dangerous.
       | 
       | It breaks my heart just a little bit. I feel sympathy in some
       | sense for the AIs we will create, especially if they do reach the
       | level of AGI. As another tortured analogy, it is like a bunch of
       | competitive parents forcing their children into adversarial
       | relationships to satisfy the parents' egos.
        
       | light_triad wrote:
       | They are positioning themselves as champions of AI open source
       | mostly because they were blindsided by OpenAI, are not in the
       | infra game, and want to commoditize their complements as much as
       | possible.
       | 
       | This is not altruism, although it's still great for devs and
       | startups. All of FB's GPU investment is primarily for new AI
       | products ("friends"), recommendations, and selling ads.
       | 
       | https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
        
         | baby wrote:
         | Meta does a good thing
         | 
         | HN spends a day figuring out how it's actually bad
        
           | shnock wrote:
           | It's not actually bad, OP's point is that it is not motivated
           | by altruism. An action can be beneficial to the people
           | without that effect being the incentive
        
         | war321 wrote:
         | They've been working on AI for a good bit now. Open source
         | especially is something they've championed since the mid 2010s
         | at least with things like PyTorch, GraphQL, and React. It's not
         | something they've suddenly pivoted to since ChatGPT came in
         | 2022.
        
         | kertoip_1 wrote:
         | They are giving it "for free" because:
         | 
         | * they need LLMs that they can control for features on their
         | platforms (Fb/Instagram, but I can see many use cases on VR
         | too)
         | 
         | * they cannot sell it. They have no cloud services to offer.
         | 
         | So they would spend this money anyway, but to compensate for
         | some of the losses they decided to use it to fix their PR by
         | keeping developers happy.
        
           | sterlind wrote:
           | They also reap the benefits of AI researchers across the
           | world using Llama as a base. All their research is
           | immediately applicable to their models. It's also likely a
           | strategic decision to reduce the moat OpenAI is building
           | around itself.
           | 
           | I also think LeCun opposes OpenAI's gatekeeping at a
           | philosophical/political level. He's using his position to
           | strengthen open-source AI. Sure, there's strategic business
           | considerations, but I wouldn't rule out principled
           | motivations too.
        
       | anthomtb wrote:
       | > My framework for understanding safety is that we need to
       | protect against two categories of harm: unintentional and
       | intentional. Unintentional harm is when an AI system may cause
       | harm even when it was not the intent of those running it to do
       | so. For example, modern AI models may inadvertently give bad
       | health advice. Or, in more futuristic scenarios, some worry that
       | models may unintentionally self-replicate or hyper-optimize goals
       | to the detriment of humanity. Intentional harm is when a bad
       | actor uses an AI model with the goal of causing harm.
       | 
       | Okay then Mark. Replace "modern AI models" with "social media"
       | and repeat this statement with a straight face.
        
       | j_m_b wrote:
       | > We need to protect our data.
       | 
       | This is a very important concern in Health Care because of HIPAA
       | compliance. You can't just send your data over the wire to
       | someone's proprietary API. You would at least need to de-identify
       | your data. This can be a tricky task, especially with
       | unstructured text.
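       | 
       | An illustrative (and deliberately incomplete) redaction pass
       | over unstructured clinical text; real HIPAA de-identification
       | covers far more identifiers (names, dates, geography, etc.) and
       | usually needs NER, not just regexes:
       | 
       |     import re
       | 
       |     PATTERNS = {
       |         "PHONE": r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b",
       |         "SSN": r"\b\d{3}-\d{2}-\d{4}\b",
       |         "EMAIL": r"\b[\w.+-]+@[\w-]+\.[\w.]+\b",
       |         "MRN": r"\bMRN[:\s]*\d+\b",
       |     }
       | 
       |     def redact(text: str) -> str:
       |         for label, pattern in PATTERNS.items():
       |             text = re.sub(pattern, f"[{label}]", text,
       |                           flags=re.IGNORECASE)
       |         return text
       | 
       |     print(redact("Pt MRN: 448812, call 555-867-5309 re: labs."))
       |     # -> Pt [MRN], call [PHONE] re: labs.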
        
       | xpe wrote:
       | Zuck needs to get real. They are Open Weights not Open Source.
        
       | Sparkyte wrote:
       | The real path forward is recognizing what AI is good at and what
       | it is bad at. Focus on making what it is good at even better and
       | faster. Open AI will definitely give us that option but it isn't
       | a miracle worker.
       | 
       | My impression is that AI, if done correctly, will be the new way
       | to build APIs over large data sets and information. It can't
       | write code unless you want to dump billions of dollars into a
       | solution with millions of dollars of operational costs. As it
       | stands it loses context too quickly to do advanced human tasks.
       | BUT it is great at assembling data and information. You know
       | what else is great at assembling data and information? APIs.
       | 
       | Think of it this way: if we can make it faster and train it on a
       | company's data lake, it could be used to return information
       | faster than a nested micro-service architecture that is just a
       | spiderweb of dependencies.
       | 
       | Because AI loses context simple API requests could actually be
       | more efficient.
        
       | Bluescreenbuddy wrote:
       | >This is how we've managed security on our social networks - our
       | more robust AI systems identify and stop threats from less
       | sophisticated actors who often use smaller scale AI systems.
       | 
       | So about all the bots and sock puppets on social media..
        
       | pjkundert wrote:
       | Deployment of PKI-signed distributed software systems to use
       | community-provisioned compute, bandwidth and storage at scale is,
       | now quite literally, the future.
       | 
       | We mostly don't all want or need the hardware to run these AIs
       | ourselves, all the time. But, when we do, we need lots of it for
       | a little while.
       | 
       | This is what Holochain was born to do. We can rent massive
       | capacity when we need it, or earn money renting ours when we
       | don't.
       | 
       | All running cryptographically trusted software at Internet scale,
       | without the knowledge or authorization of commercial or
       | government "do-gooders".
       | 
       | Exciting times!
        
       | ayakang31415 wrote:
       | Massive props to AI teams at Meta that released this model open
       | source
        
       | ceva wrote:
       | They have earned so much money off all of their users; this is
       | the least they can do to give back to the community, if this can
       | be considered that ;)
        
       | animanoir wrote:
       | "Says the Meta Inc".
        
       | seydor wrote:
       | That assumes LLMs are the path to AI, which is increasingly
       | becoming an unpopular opinion
        
       | tmsh wrote:
       | Software 2.0 is about open licensing.
       | 
       | I.e., the more important thing - the more "free" thing - is the
       | licensing now.
       | 
       | E.g., I play around with different image diffusion models like
       | Stable Diffusion and specific fine-tuned variations for
       | ControlNet or LoRA that I plug into ComfyUI.
       | 
       | But I can't use it at work because of the licensing. I have to
       | use InvokeAI instead of ComfyUI if I want to be careful, and use
       | only very specific image diffusion models without the latest and
       | greatest fine-tuning. As others have said - the weights
       | themselves are rather inscrutable. So we're building on more
       | abstract shapes now.
       | 
       | But the key open thing is making sure (1) the tools to modify the
       | weights are open and permissive (ComfyUI, related scripts or
       | parts of both the training and deployment) and (2) the underlying
       | weights of the base models and the tools to recreate them have
       | MIT or other generous licensing. As well as the fine-tuned
       | variants for specific tasks.
       | 
       | It's not going to be the naive construction in the future where
       | you take a base model and as company A you produce company A's
       | fine tuned model and you're done.
       | 
       | It's going to be a tree of fine-tuned models as a node-based
       | editor like ComfyUI already shows and that whole tree has to be
       | open if we're to keep the same hacker spirit where anyone can
       | tinker with it and also at some point make money off of it. Or go
       | free software the whole way (i.e., LGPL or equivalent the whole
       | tree of tools).
       | 
       | In that sense unfortunately Llama has a ways to go to be truly
       | open: https://news.ycombinator.com/item?id=36816395
        
       | jameson wrote:
       | It's hard to say Llama is an "open source" when their license
       | states Meta has full control under certain circumstances
       | 
       | https://raw.githubusercontent.com/meta-llama/llama-models/ma...
       | 
       | > 2. Additional Commercial Terms. If, on the Llama 3.1 version
       | release date, the monthly active users of the products or
       | services made available by or for Licensee, or Licensee's
       | affiliates, is greater than 700 million monthly active users in
       | the preceding calendar month, you must request a license from
       | Meta, which Meta may grant to you in its sole discretion, and you
       | are not authorized to exercise any of the rights under this
       | Agreement unless or until Meta otherwise expressly grants you
       | such rights.
        
         | __loam wrote:
         | It should be transparently clear that this move was taken by
         | Meta to drive their competitors out of business in a capital
         | intensive space.
        
           | apwell23 wrote:
           | Not sure how it drives competitors out of business. OpenAI is
           | losing money on queries, not on model creation. This open
           | source model has no impact on their business model of
           | charging users money to run queries.
           | 
           | On a side note, OpenAI is losing users on its own. It doesn't
           | need Meta to put it out of business.
        
         | systemvoltage wrote:
         | Tbh, it's incredibly generous.
        
       | nailer wrote:
       | Llama isn't open source. The license is at
       | https://llama.meta.com/llama3/license/ and includes various
       | restrictions on use, which means it falls outside the rules
       | created by the https://opensource.org/osd
        
       | war321 wrote:
       | Even if it's just open weights and not "true" open source, I'll
       | still give Meta the appreciation of being one of the few big AI
       | companies actually committed to open models. In an ecosystem
       | where groups like Anthropic and OpenAI keep hemming and hawing
       | about safety and the necessity of closed AI systems "for our
       | sake", they stand out among the rest.
        
       | rednafi wrote:
       | How is only sharing the binary artifact open source? There's the
       | data aspect of things that they can't share because of licensing,
       | and the code itself isn't accessible.
        
       ___________________________________________________________________
       (page generated 2024-07-23 23:00 UTC)