[HN Gopher] Open source AI is the path forward
___________________________________________________________________
Open source AI is the path forward
Author : atgctg
Score : 2262 points
Date : 2024-07-23 15:08 UTC (1 day ago)
(HTM) web link (about.fb.com)
(TXT) w3m dump (about.fb.com)
| amusingimpala75 wrote:
| Sure, but under what license? Because slapping "open source" on
| the model doesn't make it open source if it's not actually
| licensed that way. The 3.1 license still contains their non-
| commercial clause (over 700m users) and requires derivatives,
| whether fine-tunes or models trained on generated data, to use
| the Llama name.
| redleader55 wrote:
| "Use it for whatever you want(conditions apply), but not if you
| are Google, Amazon, etc. If you become big enough talk to us."
| That's how I read the license, but obviously I might be missing
| some nuance.
| mesebrec wrote:
| You also can't use it for training or improving other models.
|
| You also can't use it if you're the government of India.
|
| Nor can sex workers use it. (Do you know whether your
| customers are sex workers?)
|
| There are also very vague restrictions on things like
| discrimination, racism, etc.
| war321 wrote:
| They're actually updating their license to allow Llama
| outputs to be used for training!
|
| https://x.com/AIatMeta/status/1815766335219249513
| sumedh wrote:
| > You also can't use it if you're the government of India.
|
| Why is that?
| frabcus wrote:
| Also, it isn't source code, it is a binary. You need at least
| the data curation code, and preferably the data itself, for it
| to actually be source code in the practical sense that anyone
| can remake the build.
|
| Meta could change the license on later versions to kill your
| business, and you'd have no recourse, since you neither know
| how they trained it nor have the budget to retrain it yourself.
|
| It's not much more free than binary software.
| aliljet wrote:
| And this is happening RIGHT as a new potential leader is emerging
| in Llama 3.1. I'm really curious about how this is going to match
| up on the leaderboards...
| kart23 wrote:
| > This is how we've managed security on our social networks - our
| more robust AI systems identify and stop threats from less
| sophisticated actors who often use smaller scale AI systems.
|
| Ok, first of all, has this really worked? AI moderators still
| can't capture the mass of obvious spam/bots on all their
| platforms, Threads included. Second, AI detection doesn't work,
| and with how much better the systems are getting, it probably
| never will, unless you keep the best models for yourself, and it
| is clear from the rest of the note that that's not Zuck's
| intention.
|
| > As long as everyone has access to similar generations of models
| - which open source promotes - then governments and institutions
| with more compute resources will be able to check bad actors with
| less compute.
|
| This just doesn't make sense. How are you going to prevent AI
| spam and AI deepfakes from causing harm with more compute? What
| are you going to do with more compute about nonconsensual
| deepfakes? People are already using AI to bypass identity
| verification on your social media networks and to pump out
| loads of spam.
| OpenComment wrote:
| Interesting quotes. _Less sophisticated actors_ just means
| humans who were already writing in 2020 what the NYT wrote in
| early 2022 to prepare for Biden's State of the Union 180-degree
| policy reversals (manufacturing consent).
|
| FB was notorious for censorship. Anyway, what is with the
| "actions/actors" terminology? This is straightforward
| totalitarian language.
| simonw wrote:
| "AI detection doesn't work, and with how much better the
| systems are getting, it's probably never going to, unless you
| keep the best models for yourself"
|
| I don't think that's true. I don't think even the best
| privately held models will be able to detect AI text reliably
| enough for that to be worthwhile.
| zmmmmm wrote:
| I found this dubious as well, especially how it is portrayed as
| a simple game of compute power. For a start, there is an
| enormous asymmetry which is why we have a spam problem in the
| first place. For example a single bot can send out millions of
| emails at almost no cost and we have to expend a lot more
| "energy" to classify each one and decide if it's spam or not.
| So you don't just need more compute power, you need drastically
| more compute power, and as AI models improve and get refined,
| the operation at ten times the scale is probably going to be
| marginally better, not orders of magnitude better.
|
| I still agree with his general take - bad actors will get these
| models or make them themselves, you can't stop it. But the
| logic about compute power is odd.
| blackeyeblitzar wrote:
| Only if it is truly open source (open data sets, transparent
| curation/moderation/censorship of data sets, open training source
| code, open evaluation suites, and an OSI approved open source
| license).
|
| Open weights (and open inference code) are NOT open source,
| just weak open-washing marketing.
|
| The model that comes closest to being TRULY open is AI2's OLMo.
| See their blog post on their approach:
|
| https://blog.allenai.org/hello-olmo-a-truly-open-llm-43f7e73...
|
| I think the only thing they're not open about is how they've
| curated/censored their "Dolma" training data set, as I don't
| think they explicitly share each decision made or the original
| uncensored dataset:
|
| https://blog.allenai.org/dolma-3-trillion-tokens-open-llm-co...
|
| By the way, OSI is working on defining open source for AI. They
| post weekly updates to their blog. Example:
|
| https://opensource.org/blog/open-source-ai-definition-weekly...
| JumpCrisscross wrote:
| > _Only if it is truly open source (open data sets, transparent
| curation /moderation/censorship of data sets, open training
| source code, open evaluation suites, and an OSI approved open
| source license)_
|
| You're missing a "then" to go with your "if". What happens if
| it's "truly" open per your definition versus not?
| blackeyeblitzar wrote:
| I think you are asking what the benefits are? The main benefit
| is that we can better trust what these systems are doing. Or we
| can self-host them. If we just take the weights, then it is
| unclear how these systems might be lying to us or manipulating
| us.
|
| Another benefit is that we can learn from how the training
| and other steps actually work. We can change them to suit our
| needs (although costs are impractical today). Etc. It's all
| the usual open source benefits.
| haolez wrote:
| There is also the risk of companies like Meta introducing ads
| in the training itself, instead of at inference time.
| itissid wrote:
| Yeah, though for a big model like the 405B I do wonder whether
| the original training recipe really matters for where models
| are heading practically, which is smaller and more specific.
|
| I imagine its main use would be to train other models by
| distilling it down with LoRA/quantization, etc. (assuming we
| have a tokenizer), or to use it to generate training data for
| smaller models directly.
|
| But, I do think there is always a way to share without
| disclosing too many specifics, like this[1] lecture from this
| year's spring course at Stanford. You can always say, for
| example:
|
| - The most common technique for filtering was using voting LLMs
| ( _without disclosing said LLMs or the quantity of data_ ).
|
| - We built on top of a filtering technique for removing poor
| code using ____ by ____ authors ( _without disclosing or
| handwaving how exactly you filtered, but saying that you had to
| filter_ ).
|
| - We mixed a certain proportion of this data with that data to
| make it better ( _without saying what proportion_ ).
|
| [1]
| https://www.youtube.com/watch?v=jm2hyJLFfN8&list=PLoROMvodv4...
| JumpCrisscross wrote:
| "The Heavy Press Program was a Cold War-era program of the United
| States Air Force to build the largest forging presses and
| extrusion presses in the world." This "program began in 1944 and
| concluded in 1957 after construction of four forging presses and
| six extruders, at an overall cost of $279 million. Six of them
| are still in operation today, manufacturing structural parts for
| military and commercial aircraft" [1].
|
| $279mm in 1957 dollars is about $3.2bn today [2]. A public
| cluster of GPUs provided for free to American universities,
| companies and non-profits might not be a bad idea.
|
| [1] https://en.m.wikipedia.org/wiki/Heavy_Press_Program
|
| [2] https://data.bls.gov/cgi-
| bin/cpicalc.pl?cost1=279&year1=1957...
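|
| (The conversion is just a CPI ratio. A rough Python
| back-of-the-envelope, using approximate CPI index values
| rather than the official calculator in [2]:)
|
|     # Rough CPI adjustment; index values are approximate.
|     cpi_1957 = 28.1    # CPI-U annual average, 1957 (approx.)
|     cpi_2024 = 313.0   # CPI-U, mid-2024 (approx.)
|     cost_1957 = 279e6  # program cost in 1957 dollars
|     cost_today = cost_1957 * cpi_2024 / cpi_1957
|     print(f"${cost_today / 1e9:.1f}bn")  # roughly $3.1bn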
| CardenB wrote:
| Doubtful that GPUs purchased today would be in use for a
| similar time scale. Govt investment would also drive the cost
| of GPUs up a great deal.
|
| Not sure why a publicly accessible GPU cluster would be a
| better solution than the current system of research grants.
| JumpCrisscross wrote:
| > _Doubtful that GPUs purchased today would be in use for a
| similar time scale_
|
| Totally agree. That doesn't mean it can't generate massive
| ROI.
|
| > _Govt investment would also drive the cost of GPUs up a
| great deal_
|
| Difficult to say this _ex ante_. On its own, yes. But it
| would displace some demand. And it could help boost chip
| production in the long run.
|
| > _Not sure why a publicly accessible GPU cluster would be a
| better solution than the current system of research grants_
|
| Those receiving the grants have to pay a private owner of the
| GPUs. That gatekeeping might be both problematic, if there is
| a conflict of interests, and inefficient. (Consider why the
| government runs its own supercomputers versus contracting
| everything to Oracle and IBM.)
| rvnx wrote:
| It would be better if the government removed IP protections
| on such technology for public use, the way drugs get
| generics.
|
| This way the government pays 2'500 USD per card, not 40'000
| USD or whatever absurd price.
| JumpCrisscross wrote:
| > _better that the government removes IP on such
| technology for public use, like drugs got generics_
|
| You want to punish NVIDIA for calling its shots
| correctly? You don't see the many ways that backfires?
| gpm wrote:
| No. But I do want to limit the amount we reward NVIDIA
| for calling the shots correctly, to maximize the benefit
| to society. For instance, by reducing the duration of the
| government-granted monopolies on chip technology, which
| is obsolete well before the default duration of 20 years
| is over.
|
| That said, it strikes me that the actual limiting factor
| is fab capacity, not Nvidia's designs, and we probably
| need to lift the monopolies preventing competition there
| if we want to reduce prices.
| JumpCrisscross wrote:
| > _reducing the duration of the government granted
| monopolies on chip technology that is obsolete well
| before the default duration of 20 years is over_
|
| Why do you think these private entities are willing to
| invest the massive capital it takes to keep the frontier
| advancing at that rate?
|
| > _I do want to limit the amount we reward NVIDIA for
| calling the shots correctly to maximize the benefit to
| society_
|
| Why wouldn't NVIDIA be a solid steward of that capital
| given their track record?
| gpm wrote:
| > Why do you think these private entities are willing to
| invest the massive capital it takes to keep the frontier
| advancing at that rate?
|
| Because whether they make 100x or 200x they make a
| shitload of money.
|
| > Why wouldn't NVIDIA be a solid steward of that capital
| given their track record?
|
| The problem isn't who is the steward of the capital. The
| problem is that the economically efficient thing for a
| single company to do (given sufficient fab capacity and a
| monopoly) is to raise prices to extract a greater share of
| the pie, at the expense of shrinking the size of the pie.
| I'm not worried about who takes the profit, I'm worried
| about the size of the pie.
| whimsicalism wrote:
| > Because whether they make 100x or 200x they make a
| shitload of money.
|
| It's not a certainty that they 'make a shitload of
| money'. Reducing the right tail payoffs absolutely
| reduces the capital allocated to solve problems - many of
| which are _risky bets_.
|
| Your solution absolutely decreases capital investment at
| the margin, this is indisputable and basic economics.
| Even worse when the taking is not due to some pre-
| existing law, so companies have to deal with the
| additional uncertainty of whether & when future people
| will decide in retrospect that they got too large a
| payoff and arbitrarily decide to take it from them.
| gpm wrote:
| You can't just look at the costs to an action, you also
| have to look at the benefits.
|
| Of course I agree I'm going to stop marginal investments
| from occurring in research into patentable technologies
| by reducing the expected profit. But I'm going to do so
| _very slightly_ because I'm not shifting the expected
| value by very much. Meanwhile I'm going to greatly
| increase the investment into the existing technology we
| already have, and allow many more people to try to
| improve upon it, and I'm going to argue the benefits
| greatly outweigh the costs.
|
| Whether I'm right or wrong about the net benefit, the
| basic economics here is that there are both costs and
| benefits to my proposed action.
|
| And yes I'm going to marginally reduce future investments
| because the same might happen in the future and that
| reduces expected value. In fact if I was in charge the
| same _would_ happen in the future. And the trade-off I
| get for this is that society gets the benefit of the same
| _actually_ happening in the future and us not being
| hamstrung by unbreachable monopolies.
| JumpCrisscross wrote:
| > _I'm going to do so very slightly because I'm not
| shifting the expected value by very much_
|
| You're massively increasing uncertainty.
|
| > _the same would happen in the future. And the trade-off
| I get for this is that society gets the benefit_
|
| Why would you expect it would ever happen again? What you
| want is an unrealized capital gains tax. Not to nuke our
| semiconductor industry.
| whimsicalism wrote:
| > But I'm going to do so very slightly because I'm not
| shifting the expected value by very much
|
| I think you're shifting it by a lot. If the government
| can post-hoc decide to invalidate patents because the
| holder is getting too successful, you are introducing a
| substantial impact on expectations and uncertainty. Your
| action is not taken in a vacuum.
|
| > Meanwhile I'm going to greatly increase the investment
| into the existing technology we already have, and allow
| many more people to try to improve upon it, and I'm going
| to argue the benefits greatly outweigh the costs.
|
| I think this is a much more speculative impact. Why will
| people even fund the improvements if the government might
| just decide they've gotten too large a slice of the pie
| later on down the road?
|
| > the trade-off I get for this is that society gets the
| benefit of the same actually happening in the future and
| us not being hamstrung by unbreachable monopolies.
|
| No, the trade-off is that materially less is produced.
| These incentive effects are not small. Take, for
| instance, drug price controls - a similar post-facto
| taking because we feel that the profits from R&D are too
| high. Introducing the proposed price controls leads to
| hundreds fewer drugs over the next decade [0] - and
| likely millions of premature deaths downstream of these
| incentive effects. And that's with a policy with a clear
| path towards short-term upside (cheaper drug prices).
| Discounting GPUs by invalidating Nvidia's patents has a
| much more tenuous upside and a clear downside.
|
| [0]: https://bpb-
| us-w2.wpmucdn.com/voices.uchicago.edu/dist/d/312...
| hluska wrote:
| You have proposed state ownership of all successful IP.
| That is a massive change and yet you have demonstrated
| zero understanding of the possible costs.
|
| Your claim that removing a profit motivation will
| increase investment is flat out wrong. Everything else
| crumbles from there.
| gpm wrote:
| No, I've proposed removing or reducing IP protections,
| not transferring them to the state. Allowing competitors
| to enter the market will obviously increase investment in
| competitors...
| IG_Semmelweiss wrote:
| This is already happening - it's called China. There's a
| reason they don't innovate in anything, and they are
| always playing catch-up, except in the art of copying
| (stealing) from others.
|
| I do think there are some serious IP issues, as IP rules
| can be hijacked in the US, but that means you fix those
| problems, not blow up IP that was rightfully earned.
| psd1 wrote:
| > they don't innovate in anything
|
| They are leaders in solar and EVs.
|
| Remember how Japan leapfrogged the western car industry,
| and six sigma became required reading for managers in
| every industry?
| hluska wrote:
| Removing IP restrictions transfers them to the state.
| Grow up.
| salawat wrote:
| >Why wouldn't NVIDIA be a solid steward of that capital
| given their track record?
|
| Past performance is not indicative of future results.
| whimsicalism wrote:
| there is no such thing as a lump-sum transfer, this will
| shift expectations and incentives going forward and make
| future large capital projects an increasingly uphill
| battle
| hluska wrote:
| So, if a private company is successful, you will
| nationalize its IP under some guise of maximizing the
| benefit to society? That form of government was tried
| once. It failed miserably.
|
| Under your idea, we'll try a badly broken economic
| philosophy again. And while we're at it, we will
| completely stifle investment in innovation.
| tick_tock_tick wrote:
| > That said, it strikes me that the actual limiting
| factor is fab capacity not nvidia's designs and we
| probably need to lift the monopolies preventing
| competition there if we want to reduce prices.
|
| Lol it's not "monopolies" limiting fab capacity. Existing
| fab companies can barely manage to stand-up a new fab in
| different cities. Fabs are impossibly complex and beyond
| risky to fund.
|
| It's the kind of thing you'd put government money to
| making but it's so risky government really don't want to
| spend billions and fail so they give existing companies
| billions so if they fail it's not the governments fault.
| Teever wrote:
| There was a post[0] on here recently about how the US
| went from producing woefully insufficient numbers of
| aircraft to producing 300k by the end of World War 2.
|
| One of the things that the post mentioned was the meager
| profit margin that the companies made during this time.
|
| But the thing is that this set the American auto and
| aviation industries up to rule the world for decades.
|
| A government going to a company and saying 'we need you
| to produce this product for us at a lower margin than
| you'd like to' isn't the end of the world.
|
| I don't know if this is one of those scenarios, but they
| exist.
|
| [0] https://www.construction-physics.com/p/how-to-
| build-300000-a...
| rvnx wrote:
| In the case of NVIDIA it's even more sneaky.
|
| They are an intellectual property company holding the
| rights to the plans for making graphics cards, not a
| company actually making graphics cards.
|
| The government could launch an initiative, "OpenGPU" or
| "OpenAI Accelerator", where the government orders GPUs
| from TSMC directly, without the middleman.
|
| It may require some tweaking of the law to allow an
| exception to intellectual property for the "public
| interest".
| whimsicalism wrote:
| y'all really don't understand how these actions would
| seriously harm capital markets and make it difficult for
| private capital formation to produce innovations going
| forward.
| freeone3000 wrote:
| If we have public capital formation, we don't necessarily
| need private capital. Private innovation in weather
| modelling isn't outpacing government work by leaps and
| bounds, for instance.
| whimsicalism wrote:
| because it is extremely challenging to capture the
| additional value that is being produced by better weather
| forecasts and generally the forecasts we have right now
| are pretty good.
|
| private capital is absolutely the driving force for the
| vast majority of innovations since the beginning of the
| 20th century. public capital may be involved, but it is
| dwarfed by private capital markets.
| freeone3000 wrote:
| It's challenging to capture the additional value, and the
| forecasts are pretty good, _because_ of _continual_ large-
| scale government investment in weather forecasting. NOAA
| is launching satellites! It's a big deal!
|
| Private nuclear research is heavily dependent on
| governmental contracts to function. Solar was subsidized
| to heck and back for years. Public investment does work,
| and does make a difference.
|
| I would even say governmental involvement is sometimes
| the deciding factor in determining whether research is
| worth pursuing at all. Some major capital investors have
| decided AI models cannot possibly earn enough money to
| pay for their training costs. So what do we do when we
| believe something is a net good for society, but isn't
| going to be profitable?
| inetknght wrote:
| > _y 'all really don't understand how these actions would
| seriously harm capital markets and make it difficult for
| private capital_
|
| Reflexively, I count that harm as a feature. I don't like
| private capital markets because I've been screwed by
| private capital on multiple occasions.
|
| But you are right: I don't understand how these actions
| would harm. So please do expand your concerns.
| panarky wrote:
| To the extent these are incremental units that wouldn't
| have been sold absent the government program, it's
| difficult to see how NVIDIA is "harmed".
| nickpsecurity wrote:
| They said remove legally-enforced monopolies on what they
| produce. Many of these big firms made their tech with
| millions to billions of taxpayer dollars at various
| points in time. If we've given them millions, shouldn't
| we at least get to make independent implementations of
| the tech we already paid for?
| kube-system wrote:
| > It would be better that the government removes IP on
| such technology for public use, like drugs got generics.
|
| 20-25 year old drugs are a lot more useful than 20-25
| year old GPUs, and the manufacturing supply chain is not
| a bottleneck.
|
| There are no generics for the latest and greatest drugs,
| and a fancy gene therapy might run a _lot_ more than
| $40k.
| latchkey wrote:
| > Those receiving the grants have to pay a private owner of
| the GPUs.
|
| Along similar lines, I'm trying to build a developer
| credits program where I get whoever (AMD/Dell) to purchase
| credits on my supercomputers, which we then give away to
| developers to build solutions, which drives more demand for
| our hardware, and we commit to reinvest those credits back
| into more hardware. The idea is to create a win-win-win
| (us, them, you) developer flywheel ecosystem. It isn't a
| new idea at all; Nvidia and the hyperscalers have been doing
| this for ages.
| ygjb wrote:
| Of course they won't. The investment in the Heavy Press
| Program was the initial build, and to cite just one example,
| the Alcoa 50,000 ton forging press was built in 1955,
| operated until 2008, and needed ~$100M to get it operational
| again in 2012.
|
| The investment was made to build the press, which created
| significant jobs and capital investment. The press, and
| others like it, were subsequently operated by and then sold
| to a private operator, which in turn enabled the massive
| expansion of both military manufacturing, and commercial
| aviation and other manufacturing.
|
| The Heavy Press Program was a strategic investment that paid
| dividends by both advancing the state of the art in
| manufacturing at the time it was built, and improving
| manufacturing capacity.
|
| A GPU cluster might not be the correct investment, but a
| strategic investment in increasing, for example, the
| availability of training data, or interoperability of tools,
| or ease of use for building, training, and distributing
| models would probably pay big dividends.
| JumpCrisscross wrote:
| > _A GPU cluster might not be the correct investment, but a
| strategic investment in increasing, for example, the
| availability of training data, or interoperability of
| tools, or ease of use for building, training, and
| distributing models would probably pay big dividends_
|
| Would you mind expanding on these options? Universal
| training data sounds intriguing.
| ygjb wrote:
| Sure. Just on the training front: build and maintain a
| broad corpus of properly managed training data, with
| metadata that provides attribution (for example, whether
| content is known to be human-generated instead of model-
| generated, and what the source is for datasets such as
| weather data, census data, etc.) and that also captures
| any licensing encumbrance, so that consumers of the
| training data can be confident in their ability to use it
| without risk of legal challenge.
|
| Much of this is already available to private sector
| entities, but having a publicly funded organization
| responsible for curating and publishing this would enable
| new entrants to quickly and easily get a foundation
| without having to scrape the internet again, especially
| given how rapidly model-generated content is being
| published.
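|
| To make the metadata idea concrete, here is a minimal
| sketch of what one record in such a corpus might look
| like, written as a Python dataclass (all field names are
| hypothetical, not an existing schema):
|
|     from dataclasses import dataclass
|
|     @dataclass
|     class CorpusRecord:
|         """One hypothetical entry in a publicly curated corpus."""
|         text: str
|         source: str        # e.g. census data, weather data, a URL
|         provenance: str    # "human-generated" vs "model-generated"
|         license: str       # SPDX id or other encumbrance note
|         collected_at: str  # ISO 8601 date the item was collected
|
|     record = CorpusRecord(
|         text="...",
|         source="https://example.gov/some-dataset",
|         provenance="human-generated",
|         license="CC-BY-4.0",
|         collected_at="2024-07-23",
|     )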
| mnahkies wrote:
| I think the EPC (energy performance certificate) dataset
| in the UK is a nice example of this. Anyone can download
| a full dataset of EPC data from
| https://epc.opendatacommunities.org/
|
| Admittedly it hasn't been cleaned all that much - you
| still need to put a bit of effort into that (newer
| certificates tend to be better quality), but it's very
| low friction overall. I'd love to see them do this with
| more datasets
| dmix wrote:
| I don't think there's a shortage of capital for AI...
| probably the opposite
|
| Of all the things to expand the scope of government
| spending why would they choose AI, or more specifically
| GPUs?
| devmor wrote:
| There may however, be a shortage of capital for _open
| source_ AI, which is the subject under consideration.
|
| As for the why... because there's no shortage of capital
| for AI. It sounds like the government would like to
| encourage redirecting that capital to something that's
| good for the economy at large, rather than good for the
| investors of a handful of Silicon Valley firms interested
| only in their own short term gains.
| hluska wrote:
| Look at it from the perspective of an elected official:
|
| If it succeeds, you were ahead of the curve. If it fails,
| you were prudent enough to fund an investigation early.
| Either way, bleeding edge tech gives you a W.
| Geezus_42 wrote:
| Or you wasted a bunch of taxpayer money on some overhyped
| and overfunded nonsense.
| alickz wrote:
| how would you determine that without investigation?
| seunosewa wrote:
| You'll be long gone before they find out.
| ygjb wrote:
| Yeah. There is a lot of overhyped and overfunded nonsense
| that comes out of NASA. Some of it is hype from the
| marketing and press teams; other hype comes from
| misinterpretation of releases.
|
| None of that changes that there have been major technical
| breakthroughs, and entire classes of products and
| services that didn't exist before those investments in
| NASA (see https://en.wikipedia.org/wiki/NASA_spin-
off_technologies for a short list). There are 15
| departments and dozens of agencies that comprise the US
| federal government, many of which make investments in
| science and technology as part of their mandates, and
| most of that is delivered through some structure of
| public-private partnerships.
|
| What you see as over-hyped and over-funded nonsense could
| be the next ground breaking technology, and that is why
| we need both elected leaders who (at least in theory)
| represent the will of the people, and appointed, skilled
| bureaucrats who provide the elected leaders with the
| skills, domain expertise, and experience that the winners
| of the popularity contest probably don't have.
|
| Yep, there will be waste, but at least with public funds
| there is the appearance of accountability that just
| doesn't exist with private sector funds.
| hluska wrote:
| Which happens every single day in every government in the
| world.
| phatfish wrote:
| If it succeeds the idea gets sold to private corporations
| or the technology is made public and everyone thinks the
| corporation with the most popular version created it.
|
| If it fails certain groups ensure everyone knows the
| government "wasted" taxpayer money.
| whimsicalism wrote:
| There are many things I think are more capital-constrained,
| if the government is trying to subsidize something.
| jvanderbot wrote:
| A much better investment would be to (somehow) revolutionize
| production of chips for AI so that it's all cheaper, more
| reliable, and faster to stand up new generations of software
| and hardware codesign. This is probably much closer to the
| program mentioned in the top level comment: It wasn't to
| produce one type of thing, but to allow better production of
| any large thing from lighter alloys.
| photonthug wrote:
| > Not sure why a publicly accessible GPU cluster would be a
| better solution than the current system of research grants.
|
| You mean a better solution than different teams paying AWS
| over and over, potentially spending 10x on rent rather than
| using all that cash as a down payment on actually owning
| hardware? I can't really speak for the total costs of
| depreciation/hardware maintenance but renting forever isn't
| usually a great alternative to buying.
| CardenB wrote:
| Do you have some information to share to support your bias
| against leasing especially with a depreciating asset?
| manux wrote:
| In Canada, all three major AI research centers use
| clusters created with public money. These clusters
| receive regular additional hardware as new generations of
| GPUs become available. Considering how these institutions
| work, I'm pretty confident they've considered the
| alternatives (renting, AWS, etc). So that's one data
| point.
| photonthug wrote:
| sure, I'll hand it over after you spend your own time
| first to show that everything everywhere that's owned
| instead of leased is a poor financial decision.
| vasili111 wrote:
| AWS is not only hardware but also software, documentation,
| support and more.
| light_hue_1 wrote:
| The problem is that any public cluster would be outdated in 2
| years. At the same time, GPUs are massively overpriced.
| Nvidia's profit margins on the H100 are crazy.
|
| Until we get cheaper cards that stand the test of time,
| building a public cluster is just a waste of money. There are
| far better ways to spend $1b in research dollars.
| JumpCrisscross wrote:
| > _any public cluster would be outdated in 2 years_
|
| The private companies buying hundreds of billions of dollars
| of GPUs aren't writing them off in 2 years. They won't be
| cutting edge for long. But that's not the point--they'll
| still be available.
|
| > _Nvidia 's profit margins on the H100 are crazy_
|
| I don't see how the current practice of giving a researcher a
| grant so they can rent time on a Google cluster that runs
| H100s is more efficient. It's just a question of capex or
| opex. As a state, the U.S. has a structural advantage in the
| former.
|
| > _far better ways to spend $1b in research dollars_
|
| One assumes the U.S. government wouldn't be paying list
| price. In any case, the purpose isn't purely research ROI.
| Like the heavy presses, it's in making a prohibitively-
| expensive capital asset generally available.
| ninininino wrote:
| What about dollar cost averaging your purchases of GPUs? So
| that you're always buying a bit of the newest stuff every
| year rather than just a single fixed investment in hardware
| that will become outdated? Say 100 million a year every year
| for 20 years instead of 2 billion in a single year?
| fweimer wrote:
| Don't these public clusters exist today, and have been around
| for decades at this point, with varying architectures? In the
| sense that you submit a proposal, it gets approved, and then
| you get access for your research?
| JumpCrisscross wrote:
| Not--to my knowledge--for the GPUs necessary to train
| cutting-edge LLMs.
| Maxious wrote:
| All of the major cloud providers offer grants for public
| research https://www.amazon.science/research-awards https:/
| /edu.google.com/intl/ALL_us/programs/credits/research
| https://www.microsoft.com/en-us/azure-academic-research/
|
| NVIDIA offers discounts
| https://developer.nvidia.com/education-pricing
|
| eg. for Australia, the National Computing Infrastructure
| allows researchers to reserve time on:
|
| - 160 nodes each containing four Nvidia V100 GPUs and two
| 24-core Intel Xeon Scalable 'Cascade Lake' processors.
|
| - 2 nodes of the NVIDIA DGX A100 system, with 8 A100 GPUs
| per node.
|
| https://nci.org.au/our-systems/hpc-systems
| NewJazz wrote:
| This is the most recent iteration of a national platform.
| They have tons of GPUs (and CPUs, and flash storage) hooked
| up as a Kubernetes cluster, available for teaching and
| research.
|
| https://nationalresearchplatform.org/
| epaulson wrote:
| The National Science Foundation has been doing this for
| decades, starting with the supercomputing centers in the 80s.
| Long before anyone talked about cloud credits, NSF has had a
| bunch of different programs to allocate time on supercomputers
| to researchers at no cost, these days mostly run out of the
| Office of Advanced Cyberinfrastructure. (The office name is from
| the early 00s) - https://new.nsf.gov/cise/oac
|
| (To connect universities to the different supercomputing
| centers, the NSF funded the NSFnet network in the 80s, which
| was basically the backbone of the Internet in the 80s and early
| 90s. The supercomputing funding has really, really paid off for
| the USA)
| JumpCrisscross wrote:
| > _NSF has had a bunch of different programs to allocate time
| on supercomputers to researchers at no cost, these days
| mostly run out of the Office of Advanced Cyberinfrastructure_
|
| This would be the logical place to put such a programme.
| alephnerd wrote:
| The DoE has also been a fairly active purchaser of GPUs for
| almost two decades now thanks to the Exascale Computing
| Project [0] and other predecessor projects.
|
| The DoE helped subsidize development of Kepler, Maxwell,
| Pascal, etc along with the underlying stack like NVLink,
| NGC, CUDA, etc either via purchases or allowing grants to
| be commercialized by Nvidia. They also played matchmaker by
| helping connect private sector research partners with
| Nvidia.
|
| The DoE also did the same thing for AMD and Intel.
|
| [0] - https://www.exascaleproject.org/
| PostOnce wrote:
| The DoE subsidized the development of GPUs, but so did
| Bitcoin.
|
| But before that, it was video games, like quake. Nvidia
| wouldn't be viable if not for games.
|
| But before that, graphics research was subsidized by the
| DoD, back when visualizing things in 3D cost serious
| money.
|
| It's funny how technology advances.
| Retric wrote:
| It was really Ethereum / altcoins, not Bitcoin, that
| caused the GPU demand in 2021.
|
| Bitcoin moved to FPGAs/ASICs very quickly because
| dedicated hardware was vastly more efficient; GPUs were
| only viable from Oct 2010. By 2013, when ASICs came
| online, GPUs only made sense if someone else was paying
| for both the hardware and the electricity.
| jszymborski wrote:
| As you've rightly pointed out, we have the mechanism, now
| let's fund it properly!
|
| I'm in Canada, and our science funding has likewise fallen
| year after year as a proportion of our GDP. I'm still
| benefiting from A100 clusters funded by tax payer dollars,
| but think of the advantage we'd have over industry if we
| didn't have to fight over resources.
| xena wrote:
| Where do you get access to those as a member of the general
| public?
| kiwih wrote:
| In Australia at least, anyone who is enrolled at or works
| at a university can use the taxpayer-subsidised "Gadi"
| HPC which is part of the National Computing
| Infrastructure (https://nci.org.au/our-systems/hpc-
| systems). I also do mean anyone: I have an undergraduate
| student using it right now (for free*) to fine-tune
| several LLMs.
|
| It also says commercial orgs can get access via
| negotiation, I expect a random member of the public would
| be able to go that route as well. I expect that there
| would be some hurdles to cross, it isn't really common
| for random members of the public to be doing the kinds of
| research Gadi was created to benefit. I expect it is the
| same way in this case in Canada. I suppose the argument
| is if there weren't any gatekeeping at all, you might end
| up with all kinds of unsuitable stuff on the cluster,
| e.g. crypto miners and such.
|
| Possibly another way for a true random person to get
| access would be to get some kind of 0-hour academic
| affiliation via someone willing to back you up, or one
| could enrol in a random AI course or something and then
| talk to the lecturer in charge.
|
| *In reality, the (also taxpayer-subsidised) university
| pays some fee for access, but it doesn't come from any of
| our budgets.
| jph00 wrote:
| Australia's peak HPC has a total of: "2 nodes of the
| NVIDIA DGX A100 system, with 8 A100 GPUs per node".
|
| It's pretty meagre pickings!
| FireBeyond wrote:
| Well, one, it has:
|
| > 160 nodes each containing four Nvidia V100 GPUs
|
| and two, well, it's a CPU-based supercomputer.
| mmastrac wrote:
| I'm going to guess it's Compute Canada, which I don't
| think we non-academics have access to.
| jszymborski wrote:
| That's correct (they go by the Digital Research Alliance
| of Canada now... how boring).
|
| I wish that wasn't the case though!
| jszymborski wrote:
| I get my resources through a combination of servers my
| lab bought using a government grant and the Digital
| Research Alliance of Canada (née Compute Canada)
| cluster.
|
| These resources aren't available to the public, but if I
| were king for a day we'd increase science funding such
| that we'd have compute resources available to high-school
| students and the general public (possibly following
| training on how to use it).
|
| Making sure folks didn't use it to mine bitcoin would be
| important, though ;)
| cmdrk wrote:
| Yeah, the specific AI/ML-focused program is NAIRR.
|
| https://nairrpilot.org/
|
| Terrible name unless they low-key plan to make AI
| researchers' hair fall out.
| dastbe wrote:
| The US already pays for 2+ AWS regions for the CIA/DoD. Why
| not pay for a region that is only available to researchers?
| blackeyeblitzar wrote:
| What about distributed training on volunteer hardware? Is that
| feasible?
| oersted wrote:
| It is an exciting concept: there's a huge wealth of gaming
| hardware deployed that sits inactive most hours of the day.
| And I'm sure people are willing to pay well above the
| electricity cost for it.
|
| Unfortunately, the dominant LLM architecture makes it
| relatively infeasible right now.
|
| - Gaming hardware has too limited VRAM for training any kind
| of near-state-of-the-art model. Nvidia is being annoyingly
| smart about this to sell enterprise GPUs at exorbitant
| markups.
|
| - Right now communication between machines seems to be the
| bottleneck, and this is way worse with limited VRAM. Even
| with data-centre-grade interconnect (mostly Infiniband, which
| is also Nvidia, smart-asses), any failed links tend to cause
| big delays in training.
|
| Nevertheless, it is a good direction to push towards, and the
| government could indeed help, but it will take time. We need
| both a more healthy competitive landscape in hardware, and
| research towards model architectures that are easy to train
| in a distributed manner (this was also the key to the success
| of Transformers, but we need to go further).
| sharpshadow wrote:
| Couldn't VRAM be supplemented with SSDs on a lower-end
| machine? It would make it slower, but maybe still useful.
| oersted wrote:
| Perhaps. The landscape has improved a lot in the last
| couple of years; there are lots of implementation tricks
| to improve efficiency on consumer hardware, particularly
| for inference.
|
| Although it is clear that the computing capacity of the
| GPU would be very underutilized with the SSD as the
| bottleneck. Even using RAM instead of VRAM is pretty
| impractical. It might be a bit better for chips like
| Apple's where the CPU, RAM and GPU are all tightly
| connected on the same SoC, and the main RAM is used as
| the VRAM.
|
| Would that performance still be worth more than the
| electricity cost? Would the earnings be high enough for a
| wide population to be motivated to go through the hassle
| of setting up their machine to serve requests?
| codemusings wrote:
| Ever heard of SETI@home?
|
| https://setiathome.berkeley.edu
| tessellated wrote:
| Followed the link and learned two things that were new to me:
| both the project and Drake are dead.
|
| Used to contribute in the early 2000s with my Pentium for a
| while.
|
| Ever got any results?
|
| Also, for training LLMs, I understand there is a huge
| bandwidth problem with this approach.
| ks2048 wrote:
| How about using some of that money to develop CUDA alternatives
| so everyone is not paying the Nvidia tax?
| lukan wrote:
| It would probably be cheaper to negate some IP. There are
| quite a few projects and initiatives to make CUDA code run
| on AMD, for example, but as far as I know they all stopped
| at some point, probably out of fear of being sued into
| oblivion.
| whimsicalism wrote:
| It seems like ROCm is already fully ready for transformer
| inference, so you are just referring to training?
| janalsncm wrote:
| ROCm is buggy and largely undocumented. That's why we don't
| use it.
| latchkey wrote:
| It is actively improving every day.
|
| https://news.ycombinator.com/item?id=41052750
| belter wrote:
| Please start with the Windows Tax first for Linux users
| buying hardware...and the Apple Tax for Android users...
| zitterbewegung wrote:
| Either you port Tensorflow (Apple)[1] or PyTorch to your
| platform or you allow CUDA to run on your hardware (AMD) [2].
| Companies are incentivized to not let NVIDIA have a monopoly,
| but the thing is that CUDA is a huge moat due to its
| compatibility with all frameworks, and everyone knows it.
| Also, all of the cloud and on-premises providers use NVIDIA
| regardless.
|
| [1] https://developer.apple.com/metal/tensorflow-plugin/ [2]
| https://www.xda-developers.com/nvidia-cuda-amd-zluda/
| TuringNYC wrote:
| >> Either you port Tensorflow (Apple)[1] or PyTorch to your
| platform or you allow CUDA to run on your hardware (AMD)
| [2]. Companies are incentivized to not let NVIDIA have a
| monopoly, but the thing is that CUDA is a huge moat due to
| its compatibility with all frameworks, and everyone knows
| it. Also, all of the cloud and on-premises providers use
| NVIDIA regardless.
|
| This never made sense to me -- Apple could easily hire top
| talent to write Apple Silicon bindings for these popular
| libraries. I work at a creative ad agency; we have tons of
| high-end Apple devices, yet the neural cores sit unused most
| of the time.
| jcheng wrote:
| A lot of libraries seem to be working on Apple Silicon
| GPUs but not on ANE. I found this discussion interesting,
| seems like the ANE has a lot of limitations, is not well
| documented, and can only be used indirectly through Core
| ML.
| https://github.com/ggerganov/llama.cpp/discussions/336
| erickj wrote:
| That's the kind of work that can come out of academia and
| open source communities when societies provide the resources
| required.
| latchkey wrote:
| It is being done already...
|
| https://docs.scale-lang.com/
| dogcomplex wrote:
| Or just develop the next wave of chips designed specifically
| for transformer-based architectures (and ternary computing),
| and bypass the need for GPUs and CUDA altogether.
| Zambyte wrote:
| That would be betting against other architectures like
| Mamba, which does not seem like an obviously good bet to
| make yet. Maybe it is though.
| prpl wrote:
| Great idea, too bad the DOE and NSF were there first.
| kjkjadksj wrote:
| The size of the cluster would have to be massive, or else your
| job will sit in the queue for a year. And even then, what are
| you going to do, downsize the resources requested so you can
| get in earlier? After a certain point it starts to make more
| sense to just buy your own Xeons and run your own cluster.
| Aperocky wrote:
| Imagine if they had made a data center with 1957 electronics
| that cost $279 million.
|
| They probably wouldn't be using it now, because the phone in
| your pocket is likely more powerful. Moore's law did end, but
| data center hardware is still evolving orders of magnitude
| faster than forging presses.
| goda90 wrote:
| I'd like to see big programs to increase the amount of cheap,
| clean energy we have. AI compute would be one of many
| beneficiaries of super cheap energy, especially since you
| wouldn't need to chase newer, more efficient hardware just to
| keep costs down.
| Melatonic wrote:
| Yeah, this would be the real equivalent of the program people
| are talking about above. That, and investing in core
| networking infrastructure (like cables) instead of just
| giving huge handouts to certain corporations that then pocket
| the money...
| BigParm wrote:
| So we'll have the government bypass markets and force the
| working class to buy toys for the owning class?
|
| If anything, allocate compute to citizens.
| _fat_santa wrote:
| > If anything, allocate compute to citizens.
|
| If something like this were to become a reality, I could see
| something like "CitizenCloud" where once you prove that you
| are a US Citizen (or green card holder or some other
| requirement), you can then be allocated a number of credits
| every month for running workloads on the "CitizenCloud".
| Everyone would get a baseline amount, from there if you can
| prove you are a researcher or own a business related to AI
| then you can get more credits.
| aiauthoritydev wrote:
| Overall government doing anything is a bad idea. There are
| cases however where government is the only entity that can do
| certain things. These are things that involve military, law
| enforcement etc. Outside of this we should rely on private
| industry and for-profit industry as much as possible.
| pavlov wrote:
| The American healthcare industry demonstrates the tremendous
| benefits of rigidly applying this mindset.
|
| Why couldn't law enforcement be private too? You call 911,
| several private security squads rush to solve your immediate
| crime issue, and the ones who manage to shoot the suspect
| send you a $20k bill. Seems efficient. If you don't like the
| size of the bill, you can always get private crime insurance.
| sterlind wrote:
| For a further exploration of this particular utopia, see
| Snow Crash by Neal Stephenson.
| chris_wot wrote:
| That's not correct. The American health care system is an
| extreme example of where private organisations fail overall
| society.
| fragmede wrote:
| > Overall government doing anything is a bad idea.
|
| That is so bereft of detail as to just be wrong. There are
| things that government is good for and things that government
| is bad for, but "anything" is just too broad, and reveals an
| anti-government bias which just isn't well thought out.
| goatlover wrote:
| Why are governments a bad idea? Seems the human race has
| opted for governments doing things since the dawn of
| civilization. Building roads, providing defense, enforcing
| rights, providing social safety nets, funding costly
| scientific endeavors.
| com2kid wrote:
| Ugh.
|
| Government distorting undeveloped markets that have a lot of
| room for competition to increase efficiencies is a bad thing.
|
| Government agencies running programs that should not be
| profitable, or where the only profit to be left comes at the
| expense of society as a whole, is a good thing.
|
| Lots of basic medicine is the go-to example here: treating
| cancer isn't going to be "profitable", and attempting to make
| it such just leads to dead people.
|
| On the flip side, one can argue that dentistry has seen
| amazing strides in affordability and technological progress
| through the free market, from dental X-rays to improvements
| in dental procedures that make them less painful for
| patients.
|
| Eye surgery is another area where competition has led to
| good consumer outcomes.
|
| But life or death situations where people can't spend time
| researching? The only profit there comes through exploiting
| people.
| Angostura wrote:
| To summarise: There are some things where government action
| is the best solution, however by default see if the private
| sector can sort it first.
| varenc wrote:
| I just watched this 1950s DoD video on the heavy press program
| and highly recommend it:
| https://www.youtube.com/watch?v=iZ50nZU3oG8
| newzisforsukas wrote:
| https://loc.gov/pictures/search/?q=Photograph:%20oh1540&fi=n.
| ..
| spullara wrote:
| It makes much more sense to invest in a next generation fab for
| GPUs than to buy GPUs and more closely matches this kind of
| project.
| epolanski wrote:
| Does it? You're looking at a gargantuan investment in terms
| of money that would also require thousands of staff.
|
| That just doesn't seem a good idea.
| inhumantsar wrote:
| > gargantuan investment
|
| it's a bigger investment, but it's an investment which will
| pay dividends for decades. with a compute cluster, the
| government is taking on an asset in the form of the cluster
| but also liabilities in the form of operations and
| administration.
|
| with a fab, the government takes on either a promise of
| lower taxes for N years or hands over a bag of cash. after
| that they're clear of it. the company operating the fab
| will be responsible for the risks and on-going expenses.
|
| on top of that...
|
| > thousand of staff
|
| the company will employ/attract even more top talent, each
| of whom will pay taxes and eventually go on to found
| related companies or teach the next generation or what have
| you. not to mention the risk reduction that comes with on-
| shoring something as critical to national security and the
| economy as a fab.
|
| a public-access compute cluster isn't a bad idea, but it
| probably makes more sense to fund/operate it in similar PPP
| model. non-profit consortium of universities and business
| pool resources to plan, build, and operate it, government
| recognizes it as a public good and chips in a significant
| amount of money to help.
| rkique wrote:
| Very much in this spirit is the NSF-funded National Deep
| Inference Fabric, which lets researchers run remote experiments
| on foundation models: https://ndif.us. They just announced a
| pilot program for Llama405b!
| cyanydeez wrote:
| A better idea would be to treat various open source packages
| as utilities and put maintainers everywhere, funded as a
| public good.
|
| AI is a fad; the bricks and mortar of the future are open
| source tools.
| fintler wrote:
| For the DoE, take a look at:
|
| https://doeleadershipcomputing.org/
| carschno wrote:
| In the Netherlands, for instance, there is "the national
| supercomputer" Snellius:
| https://www.surf.nl/en/services/snellius-the-national-superc...
| I am not sure about its budget, but my impression as a user is
| that its resources are never fully used. At least, none of my
| jobs ever had to queue. I doubt that it can compete with the
| scale of resources that FAANG companies have available, but
| then again, I also doubt how research would benefit.
|
| Sure, academia could build LLMs, and there is at least one
| large-scale project for that: https://gpt-nl.com/ On the other
| hand, this kind of model still needs to demonstrate specific
| scientific value that goes beyond using a chatbot for
| generating ideas and summarizing documents.
|
| So I fully agree that the research budget cuts in the past
| decades have been catastrophic, and probably have contributed
| to all the disasters the world is currently facing. But I think
| that funding prestigious super-projects is not the best way to
| spend funds.
| teekert wrote:
| Snellius is a nice resource: a powerful Slurm-based HTC
| cluster with different queues for different workloads
| (cpu/genomics, gpu/deep learning).
|
| To access the resource I had to go through EuroCC [0], which
| is a network facilitating access to and exploitation of
| HPC/HTC infra. It is (or can be) a great competing model to
| US cloud providers.
|
| As a small business I got 8 hrs of consultancy and 10k
| compute hours for free. I'm still learning the details, but my
| understanding is that after that the prices are very
| competitive.
|
| [0] https://www.eurocc-access.eu/
| matteocontrini wrote:
| Italy built the Leonardo HPC cluster; it's one of the largest
| in the EU and was created by a consortium of universities.
| After just over a year it's already at full capacity, and
| expansion plans have been brought forward because of this.
| B4CKlash wrote:
| Eric Schmidt advocated for this exact thing in an Op-ed piece
| in the latest MIT Technology Review.
|
| [1] https://www.technologyreview.com/2024/05/13/1092322/why-
| amer...
| maxdo wrote:
| So that North Korea can set up small call centers more
| cheaply, since they can get these models for free?
| HanClinto wrote:
| The article argues that the threat of foreign espionage is not
| solved by closing models.
|
| > Some people argue that we must close our models to prevent
| China from gaining access to them, but my view is that this
| will not work and will only disadvantage the US and its allies.
| Our adversaries are great at espionage, stealing models that
| fit on a thumb drive is relatively easy, and most tech
| companies are far from operating in a way that would make this
| more difficult. It seems most likely that a world of only
| closed models results in a small number of big companies plus
| our geopolitical adversaries having access to leading models,
| while startups, universities, and small businesses miss out on
| opportunities.
| tempfile wrote:
| This argument implies that cheap phones are bad since
| telemarketers can use them.
| mrfinn wrote:
| You guys really need to get over your bellicose view of the
| world. Actually, before it destroys you. Really, it's not
| necessary. Most people in the world just want to live in
| peace and see their children grow up happily. For each data
| center NK would create, there will be a thousand peaceful,
| kind, and well-intentioned AI projects going on. Or maybe more.
| the8thbit wrote:
| "Eventually though, open source Linux gained popularity -
| initially because it allowed developers to modify its code
| however they wanted ..."
|
| I find the language around "open source AI" to be confusing. With
| "open source" there's usually "source" to open, right? As in,
| there is human legible code that can be read and modified by the
| user? If so, then how can current ML models be open source?
| They're very large matrices that are, for the most part,
| inscrutable to the user. They seem akin to binaries, which, yes,
| can be modified by the user, but are extremely obscured to the
| user, and require enormous effort to understand and effectively
| modify.
|
| "Open source" code is not just code that isn't executed remotely
| over an API, and it seems like maybe it's being conflated with
| that here?
| orthoxerox wrote:
| Open training dataset + open steps sufficient to train exactly
| the same model.
| the8thbit wrote:
| This isn't what Meta releases with their models, though I
| would like to see more public training data. However, I still
| don't think that would qualify as "open source". Something
| isn't open source just because its reproducible out of
| composable parts. If one, very critical and system defining
| part is a binary (or similar) without publicly available
| source code, then I don't think it can be said to be "open
| source". That would be like saying that Windows 11 is open
| source because Windows Calculator is open source, and its a
| component of Windows.
| blackeyeblitzar wrote:
| Here's one list of what is needed to be actually open
| source:
|
| https://blog.allenai.org/hello-olmo-a-truly-open-
| llm-43f7e73...
| orthoxerox wrote:
| That's what I meant by "open steps", I guess I wasn't clear
| enough.
| the8thbit wrote:
| Is that what you meant? I don't think releasing the
| sequence of steps required to produce the model satisfies
| "open source", which is how I interpreted you, because
| there is still no source code for the model.
| Yizahi wrote:
| They can't release the training dataset if it was illegally
| scraped all over the web without permission :) (taps head)
| bilsbie wrote:
| Can't you do fine tuning on those binaries? That's a
| modification.
| the8thbit wrote:
| You can fine tune the models, and you can modify binaries.
| However, there is no human readable "source" to open in
| either case. The act of "fine tuning" is essentially brute
| forcing the system to gradually alter the weights such that
| loss is reduced against a new training set. This limits what
| you can actually do with the model vs an actual open source
| system where you can understand how the system is working and
| modify specific functionality.
|
| Additionally, models can be (and are) fine tuned via APIs, so
| if that is the threshold required for a system to be "open
| source", then that would also make the GPT4 family and other
| such API only models which allow finetuning open source.
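|
| For concreteness, here is a rough sketch of what locally
| fine-tuning open weights involves (assuming PyTorch and the
| Hugging Face transformers API; the checkpoint name and the
| tiny corpus are placeholders, not anything Meta ships):
|
|     import torch
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     name = "some-org/open-weights-7b"   # hypothetical checkpoint
|     model = AutoModelForCausalLM.from_pretrained(name)
|     tok = AutoTokenizer.from_pretrained(name)
|     opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
|
|     my_texts = ["an example training document"]  # placeholder corpus
|     model.train()
|     for text in my_texts:
|         batch = tok(text, return_tensors="pt")
|         out = model(**batch, labels=batch["input_ids"])
|         out.loss.backward()   # gradients w.r.t. every weight
|         opt.step()            # nudge all weights slightly
|         opt.zero_grad()
|
|     model.save_pretrained("my-fine-tune")  # the weights stay yours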
| whimsicalism wrote:
| I don't find this argument super convincing.
|
| There's a pretty clear difference between the 'finetuning'
| offered via API by GPT4 and the ability to do whatever sort
| of finetuning you want and get the weights at the end that
| you can do with open weights models.
|
| "Brute forcing" is not the correct language to use for
| describing fine-tuning. It is not as if you are trying
| weights randomly and seeing which ones work on your dataset
| - you are following a gradient.
| the8thbit wrote:
| "There's a pretty clear difference between the
| 'finetuning' offered via API by GPT4 and the ability to
| do whatever sort of finetuning you want and get the
| weights at the end that you can do with open weights
| models."
|
| Yes, the difference is that one is provided over a remote
| API, and the provider of the API can restrict how you
| interact with it, while the other is performed directly
| by the user. One is a SaaS solution, the other is a
| compiled solution, and neither are open source.
|
| ""Brute forcing" is not the correct language to use for
| describing fine-tuning. It is not as if you are trying
| weights randomly and seeing which ones work on your
| dataset - you are following a gradient."
|
| Whatever you want to call it, this doesn't sound like
| modifying functionality in source code. When I modify
| source code, I might make a change, check what that does,
| change the same functionality again, check the new
| change, etc... up to maybe a couple dozen times. What I
| don't do is have a very simple routine make very small
| modifications to all of the system's functionality, then
| check the result of that small change across the broad
| spectrum of functionality, and repeat millions of times.
| Kubuxu wrote:
| The gap between fine-tuning API and weights-available is
| much more significant than you give it credit for.
|
| You can take the weights and train LoRAs (which is close
| to fine-tuning), but you can also build custom adapters
| on top (classification heads). You can mix models from
| different fine-tunes or perform model surgery (adding
| additional layers, attention heads, MoE).
|
| You can perform model decomposition and amplify some of
| its characteristics. You can also train multi-modal
| adapters for the model. Prompt tuning requires weights as
| well.
|
| I would even say that having the model is more potent in
| the hands of individual users than having the dataset.
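|
| As a rough illustration of the "custom adapter" case, a
| classification head bolted onto frozen open weights might look
| something like this (PyTorch sketch; the base checkpoint name
| is a placeholder):
|
|     import torch.nn as nn
|     from transformers import AutoModel
|
|     class ClassifierOnOpenWeights(nn.Module):
|         def __init__(self, base="some-org/open-weights-7b",
|                      n_labels=4):
|             super().__init__()
|             self.backbone = AutoModel.from_pretrained(base)
|             for p in self.backbone.parameters():
|                 p.requires_grad = False   # keep the base frozen
|             d = self.backbone.config.hidden_size
|             self.head = nn.Linear(d, n_labels)  # new, trainable part
|
|         def forward(self, input_ids, attention_mask=None):
|             h = self.backbone(input_ids,
|                               attention_mask=attention_mask)
|             return self.head(h.last_hidden_state[:, -1, :])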
| thayne wrote:
| That still doesn't make it open source.
|
| There is a massive difference between a compiled binary
| that you are allowed to do anything you want with,
| including modifying it, building something else on top or
| even pulling parts of it out and using in something else,
| and a SaaS offering where you can't modify the software
| at all. But that doesn't make the compiled binary open
| source.
| emporas wrote:
| > When I modify source code, I might make a change, check
| what that does, change the same functionality again,
| check the new change, etc... up to maybe a couple dozen
| times.
|
| You can modify individual neurons if you are so inclined.
| That's what Anthropic have done with the Claude family of
| models [1]. You cannot do that using any closed model. So
| "Open Weights" looks very much like "Open Source".
|
| Techniques for introspection of weights are very
| primitive, but I do think new techniques will be
| developed, or even new architectures which will make it
| much easier.
|
| [1] https://www.anthropic.com/news/mapping-mind-language-
| model
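|
| In the crudest sense you can already do that today with open
| weights, e.g. zeroing out a single hidden unit directly (a
| sketch with a made-up layer name, and much blunter than the
| feature-level work in [1]):
|
|     import torch
|
|     sd = torch.load("model.pth", map_location="cpu")
|     # "neuron" 1234 of one MLP layer: a single row of the
|     # up-projection matrix (the key name is hypothetical)
|     sd["layers.10.mlp.up_proj.weight"][1234, :] = 0.0
|     torch.save(sd, "model_edited.pth")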
| the8thbit wrote:
| "You can modify individual neurons if you are so
| inclined."
|
| You can also modify a binary, but that doesn't mean that
| binaries are open source.
|
| "That's what Anthropic have done with the Claude family
| of models [1]. ... Techniques for introspection of
| weights are very primitive, but i do think new techniques
| will be developed"
|
| Yeah, I don't think what we have now is robust enough
| interpretability to be capable of generating something
| comparable to "source code", but I would like to see us
| get there at some point. It might sound crazy, but a few
| years ago the degree of interpretability we have today
| (thanks in no small part to Anthropic's work) would have
| sounded crazy.
|
| I think getting to open sourcable models is probably
| pretty important for producing models that actually do
| what we want them to do, and as these models become more
| powerful and integrated into our lives and production
| processes the inability to make them do what we actually
| want them to do may become increasingly dangerous.
| Muddling the meaning of open source today to market your
| product, then, can have troubling downstream effects as
| focus in the open source community may be taken away from
| interpretability and put instead on distributing and tuning
| public weights.
| bilsbie wrote:
| You make a good point but those are also just limitations
| of the technology (or at least our current understanding of
| it)
|
| Maybe an analogy would help. A family spent generations
| breeding the perfect apple tree and they decided to "open
| source" it. What would open sourcing look like?
| the8thbit wrote:
| "You make a good point but those are also just
| limitations of the technology (or at least our current
| understanding of it)"
|
| Yeah, that _is_ my point. Things that don't have source
| code can't be open source.
|
| "Maybe an analogy would help. A family spent generations
| breeding the perfect apple tree and they decided to "open
| source" it. What would open sourcing look like?"
|
| I think we need to be wary of dilemmas without solutions
| here. For example, let's think about another analogy: I
| was in a car accident last week. How can I open source my
| car accident?
|
| I don't think all, or even most things, are actually
| "open sourcable". ML models could be open sourced, but it
| would require a lot of work to interpret the models and
| generate the source code from them.
| gowld wrote:
| Be charitable and intellectually curious. What would
| "open" look like?
|
| GNU says "The GNU GPL can be used for general data which
| is not software, as long as one can determine what the
| definition of "source code" refers to in the particular
| case. As it turns out, the DSL (see below) also requires
| that you determine what the "source code" is, using
| approximately the same definition that the GPL uses."
|
| and offers these categories, for example:
|
| https://www.gnu.org/licenses/license-
| list.en.html#NonFreeSof...
|
| * Software Licenses
|
| * * GPL-Compatible Free Software Licenses
|
| * * GPL-Incompatible Free Software Licenses
|
| * Licenses For Documentation
|
| * * Free Documentation Licenses
|
| * Licenses for Other Works
|
| * * Licenses for Works of Practical Use besides Software
| and Documentation
|
| * * Licenses for Fonts
|
| * * Licenses for Works stating a Viewpoint (e.g., Opinion
| or Testimony)
|
| * * Licenses for Designs for Physical Objects
| the8thbit wrote:
| "Be charitable and intellectually curious. What would
| "open" look like?"
|
| To really be intellectually curious we need to be open to
| the idea that there is not (yet) a solution to this
| problem. Or in the analogy you laid out, that it is
| simply not possible for the system to be "open source".
|
| Note that most of the licenses listed under the "Licenses
| for Other Works" section say "It is incompatible with the
| GNU GPL. Please don't use it for software or
| documentation, since it is incompatible with the GNU GPL
| and with the GNU FDL." This is because these are not free
| software/open source licenses. They are licenses that the
| FSF endorses because they encourage openness and copyleft
| in non-software mediums, and play nicely with the GPL
| _when used appropriately_ (i.e. not for software).
|
| The GPL _is_ appropriate for many works that we wouldn't
| conventionally view as software, but in those contexts
| the analogy is usually so close to the literal nature of
| software that it stops being an analogy. The major
| difference is public perception. For example, we don't
| generally view jpegs as software. However, jpegs, at
| their heart, are executable binaries with very domain
| specific instructions that are executed in a very much
| non-Turing complete context. The source code for the jpeg
| is the XCF or similar (if it exists) which contains a
| specification (code) for building the binary. The code
| becomes human readable once loaded into an IDE, such as
| GIMP, designed to display and interact with the
| specification. This is code that is most easily
| interacted with using a visual IDE, but that doesn't
| change the fact that it _is_ code.
|
| There are some scenarios where you could identify a
| "source code" but not a "software". For example, a cake
| can be open sourced by releasing the recipe. In such a
| context, though, there is literally source code. It's
| just that the code never produces a binary, and is
| compiled by a human and kitchen instead of a computer.
| There is open source hardware, where the source code is a
| human readable hardware specification which can be easily
| modified, and the hardware is compiled by a human or
| machine using that specification.
|
| The scenario where someone has bred a specific plant,
| however, can not be open source, unless they have also
| deobfuscated the genome, released the genome publicly,
| and there is also some feasible way to convert the
| deobfuscated genome, or a modification of it, into a
| seed.
| jpadkins wrote:
| > vs an actual open source system where you can understand
| how the system is working and modify specific
| functionality.
|
| No one on the planet understands how the model weights work
| exactly, nor can they modify them specifically (i.e. hand
| modifying the weights to get the result they want). This is
| an impossible standard.
|
| The source code is open (sorta, it does have some
| restrictions). The weights are open. The training data is
| closed.
| the8thbit wrote:
| > No one on the planet understands how the model weights
| work exactly
|
| Which is my point. These models aren't open source
| because there is no source code to open. Maybe one day we
| will have strong enough interpretability to generate
| source from these models, and _then_ we could have open
| source models. But today it's not possible, and changing
| the meaning of open source such that it is possible
| probably isn't a great idea.
| jsheard wrote:
| I also think that something like Chromium is a better analogy
| for corporate open source models than a grassroots project like
| Linux is. Chromium is technically open source, but Google has
| absolute control over the direction of its development, and
| realistically it's far too complex to maintain a fork without
| Google's resources, just like Meta has complete control over
| what goes into their open models, and even if they did release
| all the training data and code (which they don't) us mere plebs
| could never afford to train a fork from scratch anyway.
| skybrian wrote:
| I think you're right from the perspective of an individual
| developer. You and I are not about to fork Chromium any time
| soon. If you presume that forking is impractical then sure,
| the right to fork isn't worth much.
|
| But just because a single developer couldn't do it doesn't
| mean it couldn't be done. It means nobody has organized a
| large enough effort yet.
|
| For something like a browser, which is critical for security,
| you need both the organization and the trust. Despite
| frequent criticism, Mozilla (for example) is still considered
| pretty trustworthy in a way that an unknown developer can't
| be.
| Yizahi wrote:
| If Microsoft can't do it, then we can reasonably conclude
| that it can't be done for any practical purpose. Discussing
| infinitesimal possibilities is better left to philosophers.
| skybrian wrote:
| Doesn't Microsoft maintain its own fork of Chromium?
| umbra07 wrote:
| yes - their browser is chromium-based
| candiddevmike wrote:
| None of Meta's models are "open source" in the FOSS sense, even
| the latest Llama 3.1. The license is restrictive. And no one
| has bothered to release their training data either.
|
| This post is an ad and trying to paint these things as
| something they aren't.
| JumpCrisscross wrote:
| > _no one has bothered to release their training data_
|
| If the FOSS community sets this as the benchmark for open
| source in respect of AI, they're going to lose control of the
| term. In most jurisdictions it would be illegal for the likes
| of Meta to release training data.
| exe34 wrote:
| the training data is the source.
| JumpCrisscross wrote:
| > _the training data is the source_
|
| Sure. But that's not going to be released. The term open
| source AI cannot be expected to cover it because it's not
| practical.
| diggan wrote:
| So because it's really hard to do proper Open Source with
| these LLMs, means we need to change the meaning of Open
| Source so it fits with these PR releases?
| JumpCrisscross wrote:
| > _because it's really hard to do proper Open Source
| with these LLMs, means we need to change the meaning of
| Open Source so it fits with these PR releases?_
|
| Open training data is hard to the point of
| impracticality. It requires excluding private and
| proprietary data.
|
| Meanwhile, the term "open source" is massively popular.
| So it will get used. The question is how.
|
| Meta _et al_ would love for the choice to be between, on
| one hand, open weights only, and, on the other hand, open
| training data, because the latter is impractical. That
| dichotomy guarantees that when someone says open source
| AI they'll mean open weights. (The way open source
| software, today, generally means source available, not
| FOSS.)
| Palomides wrote:
| source available is absolutely not the same as open
| source
|
| you are playing very loosely with terms that have
| specific, widely accepted definitions (e.g.
| https://opensource.org/osd )
|
| I don't get why you think it would be useful to call LLMs
| with published weights "open source"
| JumpCrisscross wrote:
| > _terms that have specific, widely accepted definitions_
|
| The OSI's definition is far from the only one [1].
| Switzerland is currently implementing CH Open's
| definition, the EU another one, _et cetera_.
|
| > _I don't get why you think it would be useful to call
| LLMs with published weights "open source"_
|
| I don't. I'm saying that if the choice is between open
| weights or open weights + open training data, open
| weights will win because the useful definition will
| outcompete the pristine one in a public context.
|
| [1] https://en.wikipedia.org/wiki/Open-
| source_software#Definitio...
| diggan wrote:
| For the EU, I'm guessing you're talking about the EUPL,
| which is FSF/OSI approved and GPL compatible, generally
| considered copyleft.
|
| For the CH Open, I'm not finding anything specific, even
| from Swiss websites, could you help me understand what
| you're referring to here?
|
| I'm guessing that all these definitions have at least
| some points in common, which involves (another guess) at
| least being able to produce the output artifacts/binaries
| by yourself, something that you cannot do with Llama,
| just as an example.
| JumpCrisscross wrote:
| > _For the CH Open, I'm not finding anything specific,
| even from Swiss websites, could you help me understand
| what you're referring to here_
|
| Was on the _HN_ front page earlier [1][2]. The definition
| comes strikingly close to source on request with no use
| restrictions.
|
| > _all these definitions have at least some points in
| common_
|
| Agreed. But they're all different. There isn't an
| accepted definition of open source even when it comes to
| software; there is an accepted set of broad principles.
|
| [1] https://news.ycombinator.com/item?id=41047172
|
| [2] https://joinup.ec.europa.eu/collection/open-source-
| observato...
| diggan wrote:
| > Agreed. But they're all different. There isn't an
| accepted definition of open source even when it comes to
| software; there is an accepted set of broad principles.
|
| Agreed, but are we splitting hairs here and is it
| relevant to the claim made earlier?
|
| > (The way open source software, today, generally means
| source available, not FOSS.)
|
| Do any of these principles or definitions from these orgs
| agree/disagree with that?
|
| My hypothesis is that they generally would go against
| that belief and instead argue that open source is
| different from source available. But I haven't looked
| specifically to confirm if that's true or not, just a
| guess.
| JumpCrisscross wrote:
| > _are we splitting hairs here and is it relevant to the
| claim made earlier?_
|
| I don't think so. Take the Swiss definition. Source on
| request, not even available. Yet being branded and
| accepted as open source.
|
| (To be clear, the Swiss example favours FOSS. But it also
| permits source on request and bundles them together under
| the same label.)
| Palomides wrote:
| diluting open source into a marketing term meaning "you
| can download something" would be a sad result
| SquareWheel wrote:
| > specific, widely accepted definitions
|
| Realistically, nobody outside of Hacker News commenters
| has ever cared about the OSD. It's just not how the term
| is used colloquially.
| Palomides wrote:
| who says open source colloquially? ime anyone who doesn't
| care about software licenses will just say free (per free
| beer)
|
| and (strong personal opinion) any software developer
| should have a firm grip on the terminology and details
| for legal reasons
| SquareWheel wrote:
| > who says open source colloquially?
|
| There is a large span of people between gray beard
| programmer and lay person, and many in that span have
| some concept of open-source. It's often used synonymously
| with visible source, free software, or in this case, open
| weights.
|
| It seems unfortunate - though expected - that over half
| of the comments in this thread are debating the OSD for
| the umpteenth time instead of discussing the actual model
| release or accompanying news posts. Meanwhile communities
| like /r/LocalLlama are going hog wild with this release
| and already seeing what it can do.
|
| > any software developer should have a firm grip on the
| terminology and details for legal reasons
|
| They'd simply need to review the terms of the license to
| see if it fits their usage. It doesn't really matter if
| the license satisfies the OSD or not.
| diggan wrote:
| > Open training data is hard to the point of
| impracticality. It requires excluding private and
| proprietary data.
|
| Right, so the onus is on Facebook/Meta to get that right,
| then they could call something Open Source, until then,
| find another name that already doesn't have a specific
| meaning.
|
| > (The way open source software, today, generally means
| source available, not FOSS.)
|
| No, but it's heading that way. Open Source, today, still
| means that the things you need to build a project are
| publicly available for you to download and run on your
| own machine, granted you have the means to do so. What
| you're thinking of is literally called "Source Available"
| which is very different from "Open Source".
|
| The intent of Open Source is for people to be able to
| reproduce the work themselves, with modifications if they
| want to. Is that something you can do today with the
| various Llama models? No, because one core part of the
| project's "source code" (what you need to reproduce it
| from scratch), the training data, is being held back and
| kept private.
| unethical_ban wrote:
| >Meanwhile, the term "open source" is massively popular.
| So it will get used. The question is how.
|
| Here's the source of the disagreement. You're justifying
| the use of the term "open source" by saying it's logical
| for Meta to want to use it for its popularity and layman
| (incorrect) understanding.
|
| Other person is saying it doesn't matter how convenient
| it is or how much Meta wants to use it, that the term
| "open source" is misleading for a product where the
| "source" is the training data, _and_ the final product
| has onerous restrictions on use.
|
| This would be like Adobe giving Photoshop away for free,
| but for personal use only and not for making ads for
| Adobe's competitors. Sure, Adobe likes it and most users
| may be fine with it, but it isn't open source.
|
| >The way open source software, today, generally means
| source available, not FOSS.
|
| I don't agree with that. When a company says "open
| source" but it's not free, the tech community is quick to
| call it "source available" or "open core".
| JumpCrisscross wrote:
| > _You're justifying the use of the term "open source"
| by saying it's logical for Meta to want to use it for its
| popularity and layman (incorrect) understanding_
|
| I'm actually not a fan of Meta's definition. I'm arguing
| specifically against an unrealistic definition, because
| for practical purposes that cedes the term to Meta.
|
| > _the term "open source" is misleading for a product
| where the "source" is the training data, and the final
| product has onerous restrictions on use_
|
| Agree. I think the focus should be on the use
| restrictions.
|
| > _When a company says "open source" but it's not free,
| the tech community is quick to call it "source available"
| or "open core"_
|
| This isn't consistently applied. It's why we have the
| free vs open vs FOSS fracture.
| plsbenice34 wrote:
| Of course it could be practical - provide the data. The
| fact that society is a dystopian nightmare controlled
| by a few megacorporations that don't want free
| information does not justify outright changing the
| meaning of the language.
| JumpCrisscross wrote:
| > _provide the data_
|
| Who? It's not their data.
| exe34 wrote:
| why are they using it?
| guitarlimeo wrote:
| And why legislation allows them to use the data to train
| their LLM and release that, but not release the data?
| tintor wrote:
| Meta can call it something else other than open source.
|
| Synthetic part of the training data could be released.
| JimDabell wrote:
| I don't think it's that simple. The source is "the
| preferred form of the work for making modifications to
| it" (to use the GPL's wording).
|
| For an LLM, that's not the training data. That's the
| model itself. You don't make changes to an LLM by going
| back to the training data and making changes to it, then
| re-running the training. You update the model itself with
| more training data.
|
| You can't even use the training code and original
| training data to reproduce the existing model. A lot of
| it is non-deterministic, so you'll get different results
| each time anyway.
|
| Another complication is that the object code for normal
| software is a clear derivative work of the source code.
| It's a direct translation from one form to another. This
| isn't the case with LLMs and their training data. The
| models learn from it, but they aren't simply an
| alternative form of it. I don't think you can describe an
| LLM as a derivative work of its training data. It learns
| from it, it isn't a copy of it. This is mostly the reason
| why distributing training data is infeasible - the
| model's creator may not have the license to do so.
|
| Would it be extremely useful to have the original
| training data? Definitely. Is distributing it the same as
| distributing source code for normal software? I don't
| think so.
|
| I think new terminology is needed for open AI models. We
| can't simply re-use what works for human-editable code
| because it's a fundamentally different type of thing with
| different technical and legal constraints.
| jononor wrote:
| No, the preferred way to make modifications is using the
| training code. One may also input snapshot weights to
| start from, but the training code is definitely what you
| would modify to make a change.
| exe34 wrote:
| how do you train it in a different language by changing
| the training code?
| jononor wrote:
| By selecting a different dataset. Of course this dataset
| does need to exist. In practice building and curating
| datasets also involves a lot of code.
| exe34 wrote:
| sounds like you need the data to train the model.
| root_axis wrote:
| No. It's an asset used in the training process, the
| source code can process arbitrary training data.
| sangnoir wrote:
| We've had a similar debate before, but the last time it
| was about whether Linux device drivers based on non-public
| datasheets under NDA were actually open source. This
| debate occurred again over drivers that interact with
| binary blobs.
|
| I disagree with the purists - if you can _legally_ change
| the source or weights - even without having access to the
| data used by the upstream authors - it's open enough for
| me. YMMV.
| wrs wrote:
| I don't think even that is true. I conjecture that
| Facebook couldn't reproduce the model weights if they
| started over with the same training data, because I doubt
| such a huge training run is a reproducible deterministic
| process. I don't think _anyone_ has "the" source.
| exe34 wrote:
| numpy.random.seed(1234)
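|
| (Or, more completely, something like the sketch below - though
| even with every seed pinned, large multi-GPU runs still hit
| nondeterministic kernels and data-ordering effects, so this is
| only a partial answer:)
|
|     import random
|     import numpy as np
|     import torch
|
|     random.seed(1234)
|     np.random.seed(1234)
|     torch.manual_seed(1234)
|     # raise an error if a nondeterministic CUDA op is used
|     torch.use_deterministic_algorithms(True)
|     torch.backends.cudnn.benchmark = False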
| mesebrec wrote:
| Regardless of the training data, the license even heavily
| restricts how you can use the model.
|
| Please read through their "acceptable use" policy before
| you decide whether this is really in line with open source.
| JumpCrisscross wrote:
| > _Please read through their "acceptable use" policy
| before you decide whether this is really in line with
| open source_
|
| I'm not taking a specific position on this license. I
| haven't read it closely. My broad point is simply that
| open source AI, as a term, cannot practically require the
| training data be made available.
| guitarlimeo wrote:
| > In most jurisdictions it would be illegal for the likes
| of Meta to release training data.
|
| How come releasing an LLM trained on that data is not
| illegal then? I think it should be.
| blackeyeblitzar wrote:
| AI2 has released training data in their OLMo model:
| https://blog.allenai.org/hello-olmo-a-truly-open-
| llm-43f7e73...
| causal wrote:
| "Open weights" is a more appropriate term but I'll point out
| that these weights are also largely inscrutable to the people
| with the code that trained it. And for licensing reasons, the
| datasets may not be possible to share.
|
| There is still a lot of modifying you can do with a set of
| weights, and they make great foundations for new stuff, but
| yeah we may never see a competitive model that's 100% buildable
| at home.
|
| Edit: mkolodny points out that the model code is shared (under
| llama license at least), which is really all you need to run
| training https://github.com/meta-
| llama/llama3/blob/main/llama/model.p...
| aerzen wrote:
| LLAMA is an open-weights model. I like this term, let's use
| that instead of open source.
| gowld wrote:
| Can a human programmer edit the weights according to some
| semantics?
| sebastiennight wrote:
| It is possible to merge two fine-tunes of models from the
| same family by... wait for it... averaging or combining
| their weights[0].
|
| I am still amazed that we can do that.
|
| [0]: https://arxiv.org/abs/2212.09849
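|
| A rough sketch of the simplest version of such a merge,
| assuming two fine-tunes of the same architecture saved as
| PyTorch state dicts (file names are placeholders):
|
|     import torch
|
|     a = torch.load("finetune_a.pth", map_location="cpu")
|     b = torch.load("finetune_b.pth", map_location="cpu")
|
|     # uniform average; [0] and later work explore fancier
|     # weighted / task-vector combinations
|     merged = {k: (a[k] + b[k]) / 2 for k in a}
|     torch.save(merged, "merged.pth")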
| root_axis wrote:
| Yes. Using fine tuning.
| sitkack wrote:
| Yes, there is the concept of a "frakenmerge" and folks
| have also bolted on vision and audio models to LLMs.
| stavros wrote:
| "Open weights" means you can use the weights for free (as in
| beer). "Open source" means you get the training dataset and
| the methodology. ~Nobody does open source LLMs.
| _heimdall wrote:
| Why is the dataset required for it to be open source?
|
| If I self host a project that is open sourced rather than
| paying for a hosted version, like Sentry.io for example, I
| don't expect data to come along with the code. Licensing
| rights are always up for debate in open source, but I
| wouldn't expect more than the code to be available and
| reviewable for anything needed to build and run the
| project.
|
| In the case of an LLM I would expect that to mean the code
| run to train the model, the code for the model data
| structure itself, and the control code for querying the
| model should all be available. I'm not actually sure if
| Meta does share all that, but training data is separate
| from open source IMO.
| solarmist wrote:
| The sticking point is you can't build the model. To be
| able to build the model from scratch you need methodology
| and a complete description of the data set.
|
| They only give you a blob of data you can run.
| _heimdall wrote:
| Got it, that makes sense. I still wouldn't expect them to
| have to publicly share the data itself, but if you can't
| take the code they share and run it against your own data
| to build a model that wouldn't be open source in my
| understanding of it.
| TeMPOraL wrote:
| Data _is_ the source code here, though. Training code is
| effectively a build script. Data that goes into training
| a model does _not_ function like assets in videogames;
| you can't swap out the training dataset after release
| and get substantially the same thing. If anything, you
| can imagine the weights themselves are the asset - and
| even if the vendor is granting most users a license to
| copy and modify it (unlike with videogames), the _asset
| itself_ isn't open source.
|
| So, the only bit that's actually open-sourced in these
| models is the inference code. But that's a trivial part
| that people can procure equivalents of elsewhere or
| reproduce from published papers. In this sense, even if
| you think calling the models "open source" is correct, it
| doesn't really mean much, because the only parts that
| matter are _not_ open sourced.
| derefr wrote:
| Compare/contrast:
|
| DOOM-the-engine is open source (https://github.com/id-
| Software/DOOM), even though DOOM-the-asset-and-scenario-
| data is not. While you need a copy of DOOM-the-asset-and-
| scenario-data to "use DOOM to run DOOM", you are free to
| build _other games_ using DOOM-the-engine.
| echoangle wrote:
| I think no one would claim that "Doom" is open source
| though, if that's the situation.
| camgunz wrote:
| That's what op is saying, the engine is GPLv2, but the
| assets are copyrighted. There's Freedoom though and it's
| pretty good [0].
|
| [0]: https://freedoom.github.io/
| stavros wrote:
| Data is to models what code is to software.
| _heimdall wrote:
| I don't quite agree there. Based on other comments it
| sounds like Meta doesn't open source the code used to
| train the model, that would make it not open source in my
| book.
|
| The trained model doesn't need to be open source though,
| and frankly I'm not sure what the value there is
| specifically with regards to OSS. I'm not aware of a
| solution to the interpretability problem; even if the model
| is shared we can't understand what's in it.
|
| Microsoft ships obfuscated code with Windows builds, but
| that doesn't make it open source.
| Xelynega wrote:
| Wouldn't the "source code" of the model be closer to the
| source code of a compiler or the runtime library?
|
| IMO a pre-trained model given with the source code used
| to train/run it is analogous to a company shipping a
| compiler and a compiled binary without any of the source,
| which is why I don't think it's "open source" without the
| training data.
| _heimdall wrote:
| You really should be able to train a model on whatever
| data you choose to use though.
|
| Training data isn't source code at all; it's content
| fed into the ingestion side to train a model. As long as
| the source for ingesting and training a model is available,
| which it sounds like isn't the case for Meta, that would
| be open source as best I understand it.
|
| Said a little differently, I would need to be able to
| review all code used to generate a model and all code
| used to query the model for it to be OSS. I don't need
| Meta's training data or their actual model at all, I can
| train my own with code that I can fully audit and modify
| if I choose to.
| gowld wrote:
| https://opensource.org/osd
|
| "The source code must be the preferred form in which a
| programmer would modify the program. Deliberately
| obfuscated source code is not allowed. Intermediate forms
| such as the output of a preprocessor or translator are
| not allowed."
|
| > In the case of an LLM I would expect that to mean the
| code run to train the model, the code for the model data
| structure itself, and the control code for querying the
| model should all be available
|
| The M in LLM is for "Model".
|
| The code you describe is for an LLM _harness_, not for
| an LLM. The code for the _LLM_ is whatever is needed to
| enable a developer to _modify_ the inputs and then build a
| modified output LLM (minus standard generally available
| tools not custom-created for that product).
|
| Training data is one way to provide this. Another way is
| some sort of semantic model editor for an interpretable
| model.
| _heimdall wrote:
| I still don't quite follow. If Meta were to provide all
| code required to train a model (it sounds like they
| don't), and they provided the code needed to query the
| model you train to get answers how is that not open
| source?
|
| > Deliberately obfuscated source code is not allowed.
| Intermediate forms such as the output of a preprocessor
| or translator are not allowed.
|
| This definition actually makes it impossible for _any_
| LLM to be considered open source until the
| interpretability problem is solved. The trained model is
| functionally obfuscated code; it can't be read or
| interpreted by a human.
|
| We may be saying the same thing here, I'm not quite sure
| if you're saying the model must be available or if what
| is missing is the code to train your own model.
| the8thbit wrote:
| I'm not the person you replied directly to so I can't
| speak for them, but I did start this thread, and I just
| wanted to clarify what I meant in my OP, because I see a
| lot of people misinterpreting what I meant.
|
| I did _not_ mean that LLM training data needs to be
| released for the model to be open source. It would be a
| good thing if creators of models did release their
| training data, and I wouldn't even be opposed to
| regulation which encourages or even requires that
| training data be released when models meet certain
| specifications. I don't even think the bar needs to be
| high there- We could require or encourage smaller
| creators to release their training data too and the
| result would be a net positive when it comes to public
| understanding of ML models, control over outputs, safety,
| and probably even capabilities.
|
| Sure, it's possible that training data is being used
| illegally, but I don't think the solution to that is to
| just have everyone hide that and treat it as an open
| secret. We should either change the law, or apply it
| equally.
|
| But that being said, I don't think it has anything to do
| with whether the model is "open source". Training data
| simply isn't source code.
|
| I also don't mean that the license that these models are
| released under is too restrictive to be open source.
| Though that is _also_ true, and if these models had
| source code, that would also prevent them from being open
| source. (Rather, they would be "source available"
| models)
|
| What I mean is "The trained model is functionally
| obfuscated code, it can't be read or interpreted by a
| human." As you point out, it is definitionally impossible
| for any contemporary LLM to be considered open source.
| (Except for maybe some very, very small research models?)
| There's no source code (yet) so there is no source to
| open.
|
| I think it is okay to acknowledge when something is
| technically infeasible, and then proceed to not claim to
| have done that technically infeasible thing. I don't
| think the best response to that situation is to, instead,
| use that as justification for muddying the language to
| such a degree that it's no longer useful. And I don't
| think the distinction is trivial or purely semantic.
| Using the language of open source in this way is
| dangerous for two reason.
|
| The first is that it could conceivably make it more
| challenging for copyleft licenses such as the GPL to
| protect the works licensed with them. If the "public" no
| longer treats software with public binaries and without
| public source code as closed source, then who's to say
| you can't fork the Linux kernel, release the binary, and
| keep the code behind closed doors? Wouldn't that also be
| open source?
|
| The second is that I think convincing a significant
| portion of the open source community that releasing a
| model's weights is sufficient to open source a model will
| cause the community to put more focus on distributing and
| tuning weights, and less time actually figuring out how
| to construct source code for these models. I suspect that
| solving interpretability and generating something
| resembling source code may be necessary to get these
| models to _actually_ do what we want them to do. As ML
| models become increasingly integrated into our lives and
| production processes, and become increasingly
| sophisticated, the danger created by having models
| optimized towards something other than what we would
| actually like them optimized towards increases.
| achrono wrote:
| > not actually sure if Meta does share all that
|
| Meta shares the code for inference but not for training,
| so even if we say it can be open-source without the
| training data, Meta's models are not open-source.
|
| I can appreciate Zuck's enthusiasm for open-source but
| not his willingness to mislead the larger public about
| how open they actually are.
| swatcoder wrote:
| The open source _movement_ , from which the name derives,
| was about the freedom to make bespoke alterations to the
| software you choose to run. Provided you have reasonably
| widespread proficiency in industry standard tools, you
| can take something that's open source, modify that
| source, and rebuild/redeploy/reinterpret/re-whatever to
| make it behave the way that you want or need it to
| behave.
|
| This is in contrast to a compiled binary or obfuscated
| source image, where alteration may be possible with
| extraordinary skill and effort but is not expected and
| possibly even specifically discouraged.
|
| In this sense, weights are entirely like those compiled
| binaries or obfuscated sources rather than the source
| code usually associated with "open source".
|
| To be "open source" we would want LLM's where one might
| be able to manipulate the original training data or
| training algorithm to produce a set of weights more
| suited to one's own desires and needs.
|
| Facebook isn't giving us that yet, and very probably
| can't. They're just trading on the weird boundary state
| of the term "open source" -- it still carries prestige
| and garners good will from its original techno-populist
| ideals, but is so diluted by twenty years of naive
| consumers who just take it to mean "I don't have to pay
| to use this" that the prestige and good will is now
| misplaced.
| llm_trw wrote:
| >The open source movement, from which the name derives,
| was about the freedom to make bespoke alterations to the
| software you choose to run.
|
| The open source movement was a cash grab to make the free
| software movement more palatable to big corp by moving
| away from copyleft licenses. The MIT license is
| perfectly open source and means that you can buy software
| without ever seeing its code.
| Tepix wrote:
| If you obtain open source licensed software you can pass
| it on legally (and freely). With some licenses you also
| have to provide the source code.
| saurik wrote:
| The thing they are pointing at and which is the thing
| people want is the output of the training engine, not the
| inputs. This is like someone saying they have an open
| source kernel, but they only release a compiler and a
| binary... the kernel code is never released, but the
| kernel is the only reason anyone even wants the compiler.
| (For avoidance of anyone being somehow confused: the
| training code is a compiler which takes training data and
| outputs model weights.)
| _heimdall wrote:
| The output of the training engine, i.e. the model itself,
| isn't source code at all though. The best approximation
| would be considering it obfuscated code, and even then
| it's a stretch since it is more similar to compressed
| data.
|
| It sounds like Meta doesn't share source for the training
| logic. That would be necessary for it to really be open
| source, you need to be able to recreate and modify the
| codebase but that has nothing to do with the training
| data or the trained model.
| saurik wrote:
| I didn't claim the output is source code, any more than
| the kernel is. Are you sure you don't simply agree with
| me?
| croemer wrote:
| But surely you wouldn't call it open source if Sentry
| just gave you a binary - and the source code wasn't
| available.
| blackeyeblitzar wrote:
| There is a comment elsewhere claiming there are a few dozen
| fully open source models:
| https://news.ycombinator.com/item?id=41048796
| sigmoid10 wrote:
| >Nobody does open source LLMs.
|
| There are a bunch of independent, fully open source
| foundation models from companies that share everything
| (including all data). AMBER and MAP-NEO for example. But we
| have yet to see one in the 100B+ parameter category.
| stavros wrote:
| Sorry, the tilde before "nobody" is my notation for
| "basically nobody" or "almost nobody". I thought it was
| more common.
| plausibility wrote:
| It is more common when it comes to numbers, I guess. There
| are ~5 ancestors in this comment chain, where I would agree
| roughly 4-6 is acceptable.
| politelemon wrote:
| It's the literal (figurative) nobody rather than the
| literal (literal) nobody.
| larodi wrote:
| Indeed, since when is the deliverable being a jpeg/exe -
| which is roughly what the model file is - considered the
| source? It is more like an open result or a freely available
| VM image that works, but has its core FS scrambled or
| encrypted.
|
| Zuck knows this very well, and it does him no honour to speak
| like this; from his position it amounts to an attempt at
| changing the present semantics of open source. Of course,
| others do that too - using the notion of open source to
| describe something very far from open.
|
| What Meta is doing under his command is better described as
| releasing the resulting... build, so that it can be freely
| poked around and even put to work. But the result cannot be
| effectively reverse engineered.
|
| What's more ridiculous is that it is precisely because the
| result is not the source in its whole form that these
| graphical structures can be made available - only thanks to
| the fact that it is not traceable to the source. That makes
| the whole game not only closed, but... sealed forever. An
| unfair retelling of humanity's knowledge, tossed around in a
| very obscure container that nobody can reverse engineer.
|
| How's that even remotely similar to open source?
| proteal wrote:
| Even if everything was released how you described, what
| good would that really do for an individual without
| access to heaps of compute? Functionally there seems to
| be no difference between open weights and open compute
| because nobody could train a facsimile model.
| Furthermore, all frontier models are inscrutable due to
| their construction. It's wild to me seeing people
| complain about semantics when Meta dropped their model for
| cheap. Now I'm not saying we should suck the zuck for
| this act of charity, but you have to imagine that the other
| frontier labs are not thrilled that Meta has
| invalidated their compute moats with the release of
| Llama. Whether we like it or not, we're on this AI
| rollercoaster and I'm glad that it's not just
| oligopolists dictating the direction forward. I'm happy
| to see meta take this direction, knowing that the
| alternatives are much worse.
| frabcus wrote:
| I'd find knowing what's in the training data hugely
| valuable - can analyse it to understand and predict
| capabilities.
| stavros wrote:
| That's not the discussion. We're talking about what open
| source is, and it's having the weights and the method to
| recreate the model.
|
| If someone gives me an executable that I can run for
| free, and then says "eh why do you want the source, it
| would take you a long time to compile", that doesn't make
| it open source, it just makes it gratis.
| nightski wrote:
| Calling weights an executable is disingenuous and not a
| serious discussion. You can do a lot more with weights
| than you could with a binary executable.
| rizky05 wrote:
| This is debatable; even an executable is a valuable
| artifact. You can also do a lot with an executable in expert
| hands.
| _flux wrote:
| You can do a lot more with an executable as well than
| just execute it. So maybe the _analogy_ is apt, even if
| not exact.
|
| Actually, you can reverse engineer an executable into
| something that could be compiled back into an executable
| with the exact same functionality, which is AFAIK
| impossible to do with "open weights". Still, we don't
| call free executables "open source".
| nine_k wrote:
| Linux is open source and is mostly C code. You cannot run C
| code directly, you have to compile it and produce binaries.
| But it's the C code, not binary form, where the
| collaboration happens.
|
| With LLMs, weights are the binary code: it's how you run
| the model. But to be able to train the model from scratch,
| or to collaborate on new approaches, you have to operate at
| the level of architecture, methods, and training data
| sets. They are the source code.
| verdverm wrote:
| Analogies are always going to fall short. With LLM
| weights, you can modify them (quant, fine-tuning) to get
| something different, which is not something you do with
| compiled binaries. There are ample areas for
| collaboration even without being able to reproduce from
| scratch, which takes $X millions of dollars - a cost
| that a typical binary build does not come with.
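|
| The "quant" part, for example, is something you can only do
| when you hold the weights - a naive int8 sketch over one
| weight matrix (the file name is a placeholder):
|
|     import torch
|
|     w = torch.load("layer_weight.pth")       # one fp32 matrix
|     scale = w.abs().max() / 127.0
|     w_q = torch.round(w / scale).to(torch.int8)  # ~4x smaller
|     w_approx = w_q.float() * scale   # dequantize at load time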
| piperswe wrote:
| You can absolutely modify compiled binaries to get
| something different. That's how lots of video game
| modding and ROM hacks work.
| krisoft wrote:
| And we would absolutely do it more often if compiling
| cost as much as training an LLM costs now.
| verdverm wrote:
| I considered adding "normally" to the binary
| modifications expecting a response like this. The
| concepts are still worlds apart
|
| Weights aren't really a binary in the same sense that a
| compiler produces, they lack instructions and are more
| just a bunch of floating point values. Nor can you run
| model weights without separate code to interpret them
| correctly. In this sense, they are more like a JPEG or 3d
| model
| llm_trw wrote:
| This is bending the definition to the other extreme.
|
| Linux doesn't ship you the compiler you need to build the
| binaries either, that doesn't mean it's closed source.
|
| LLMs are fundamentally different to software and using
| terms from software just muddies the waters.
| saurik wrote:
| Then what is the "source"? If we are to use the term
| "source" then what does that mean here, as distinct from
| it merely being free?
| llm_trw wrote:
| It means nothing because LLMs aren't software.
| Phelinofist wrote:
| Do they not run on a computer?
| llm_trw wrote:
| So does a video. Is a video open source if you're given
| the permissions to edit it? To distribute it? Given the
| files to generate it? What if the files can only be open
| in a proprietary program?
|
| Videos aren't software and neither are llms.
| saurik wrote:
| If a video doesn't have source code, then it can't be
| open source. Likewise, if you feel that an LLM doesn't
| have source code because of some property of what it is
| -- as you claim it isn't software and somehow that means
| that it abstractly removes it from consideration for this
| concept (an idea I think is ridiculous, FWIW: an LLM is
| clearly software that runs in a particularly interesting
| virtual machine defined by the model architecture) --
| then, somewhat trivially, it also can't be open source.
| It is, as the person you are responding to says, at best
| "open weights".
|
| If a video somehow _does_ have source code which can
| "generate it", then the question of what it means for the
| source code to the video to be open even if the only
| program which can read it and generate the video is
| closed source is equivalent to asking if a program
| written in Visual Basic can ever be open source given
| that the Visual Basic compiler is closed source.
| Personally, I can see arguments either way on this issue,
| though _most people_ seem to agree that the program is
| still open source in such a situation.
|
| However, we need not care too much about the answer to
| that specific conundrum, as the moral equivalent of both
| the compiler and the runtime virtual machine are almost
| always open source. What is then important is much
| easier: if you don't provide the source code to the
| project, even if the compiler is open source and even if
| it runs on an open source machine, clearly the project --
| whatever it is that we might try to be discussing,
| including video files -- cannot be open source. The idea
| that a video can be open source when what you mean is the
| video is unencrypted and redistributable but was merely
| intended to be played in an open source video player is
| absurd.
| dns_snek wrote:
| > Is a video open source if you're given the permissions
| to edit it? To distribute it? Given the files to generate
| it?
|
| If you're given the source material and project files to
| continue editing where the original editors finished, and
| you're granted the rights to re-distribute - Yes, that
| would be open source[1].
|
| Much like we have "open source hardware" where the
| "source" consists of original schematics, PCB layouts,
| BOM, etc. [2]
|
| [1] https://en.wikipedia.org/wiki/Open-source_film
|
| [2] https://en.wikipedia.org/wiki/Open-source_hardware
| the8thbit wrote:
| Videos and images are software. They are compiled
| binaries with very domain specific instructions executed
| in a very non-turing complete context. They are generally
| not released as open source, and in many cases the source
| code (the file used to edit the video or image) is lost.
| They are not seen, colloquially, as software, but that
| does not mean that they are not software.
|
| If a video lacks a specification file (the source code)
| which can be used by a human reader to modify specific
| features in the video, then it is software that is simply
| incapable of being open sourced.
| TeMPOraL wrote:
| And LLMs don't ship with a Python distribution.
|
| Linux sources :: dataset that goes into training
|
| Linux sources' build confs and scripts :: training code +
| hyperparameters
|
| GCC :: Python + PyTorch or whatever they use in training
|
| Compiled Linux kernel binary :: model weights
| llm_trw wrote:
| Just because you keep saying it doesn't make it true.
|
| LLMs are not software any more than photographs are.
| the8thbit wrote:
| "LLMs are fundamentally different to software and using
| terms from software just muddies the waters."
|
| They're still software, they just don't have source code
| (yet).
| mattnewton wrote:
| There are plenty of open source LLMs, they just aren't at
| the top of the leaderboards yet. Here's a recent example, I
| think from Apple: https://huggingface.co/apple/DCLM-7B
|
| Using open data and dclm:
| https://github.com/mlfoundations/dclm
| Aeolun wrote:
| I suspect that even if you allowed people to take the data,
| nobody but a FAANG-like organisation could even store it?
| jlokier wrote:
| My impression is the training data for foundation models
| isn't that large. It won't fit on your laptop drive, but
| it will fit comfortably in a few racks of high-density
| SSDs.
| jijji wrote:
| yeah, according to the article [0] about the release of
| Llama 3.1 405B, it was trained on 15 trillion tokens
| using 16,000 Nvidia H100s. Even if they did release the
| training data, I don't think many people would have the
| number of GPUs required to actually do any real training
| to create the model....
|
| [0] https://ai.meta.com/blog/meta-llama-3-1/
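|
| As a rough back-of-envelope check (a sketch using the common
| ~6 x parameters x tokens FLOPs approximation; the per-GPU
| throughput and utilization figures are assumptions):
|
|     params = 405e9            # 405B parameters
|     tokens = 15e12            # 15T training tokens
|     flops = 6 * params * tokens          # ~3.6e25 FLOPs
|
|     h100 = 1e15               # ~1 PFLOP/s peak bf16 (assumed)
|     mfu = 0.4                 # assumed utilization
|     secs = flops / (16_000 * h100 * mfu)
|     print(secs / 86_400)      # on the order of two months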
| WithinReason wrote:
| If weights are not the source, then if they gave you the
| training data and scripts but not the weights, would that
| be "open source"?
| guappa wrote:
| Yes, but they won't do that. Possibly because of extensive
| copyright violations in the training data that they're not
| legally allowed to share.
| sharpshadow wrote:
| If somebody were to leak the training data and they denied
| that it's real - ergo not getting sued - the data would
| be available.
|
| Edit typo.
| guappa wrote:
| It's not available if you can't use it because you don't
| have as many lawyers as facebook and can't ignore laws so
| easily.
| ab5tract wrote:
| If you can't share the dataset, under what twisted reality
| are you fine to share the derivative models based on those
| unsharable datasets?
|
| In a better world, there would be no "I ran some algos on it
| and now it's mine" defense.
| guitarlimeo wrote:
| Yeah was gonna say exactly the same thing. Weird how the
| legislation allows releasing LLMs trained on data that is
| not allowed to be shared otherwise.
| floydnoel wrote:
| Meta might possibly have a license to use (some of) that
| data, but not a license to distribute it. Legislation has
| little to do with it, I imagine.
| yangcheng wrote:
| latest llama 3.1 is in a different repo,
| https://github.com/meta-llama/llama-
| models/blob/main/models/... , but yes, the code is shared.
| It's astonishing that in the software 2.0 era, powerful
| applications like Llama have only hundreds of lines of code,
| with most of the work hidden in the training data. Source
| code alone is no longer as informative as it was in
| Software 1.0.
| twelvechairs wrote:
| Open training data would be great too.
|
| If you have open data and open source code you can reproduce
| the weights
| blharr wrote:
| Not easily for these large scale models, but theoretically
| maybe
| ajxlasA wrote:
| Really? I have to check out the training code again. Last
| time I looked the training and inference code were just
| example toys that were barely usable.
|
| Has that changed?
| danielrhodes wrote:
| For models of this size, the code used to train them is going
| to be very custom to the architecture/cluster they are built
| on. It would be almost useless to anybody outside of Meta.
| The dataset would be a lot more interesting, as it would
| at the very least show everybody how they got it to behave in
| certain ways.
| mkolodny wrote:
| Llama's code is open source: https://github.com/meta-
| llama/llama3/blob/main/llama/model.p...
| apsec112 wrote:
| That's not the _training_ code, just the inference code. The
| training code, running on thousands of high-end H100 servers,
| is surely much more complex. They also don't open-source
| the dataset, or the code they used for data
| scraping/filtering/etc.
| the8thbit wrote:
| "just the inference code"
|
| It's not the "inference code", its the code that specifies
| the architecture of the model and loads the model. The
| "inference code" is mostly the model, and the model is not
| legible to a human reader.
|
| Maybe someday open source models will be possible, but we
| will need much better interpretability tools so we can
| generate the source code from the model. In most software
| projects you write the source as a specification that is
| then used by the computer to implement the software, but in
| this case the process is reversed.
| blackeyeblitzar wrote:
| That is just the inference code. Not training code or
| evaluation code or whatever pre/post processing they do.
| patrickaljord wrote:
| Is there an LLM with actual open source training code and
| dataset? Besides BLOOM
| https://huggingface.co/bigscience/bloom
| osanseviero wrote:
| Yes, there are a few dozen full open source models
| (license, code, data, models)
| blackeyeblitzar wrote:
| What are some of the other ones? I am aware mainly of
| OLMo (https://blog.allenai.org/olmo-open-language-
| model-87ccfc95f5...)
| navinsylvester wrote:
| Here you go - https://github.com/apple/corenet
| mesebrec wrote:
| This is like saying any python program is open source because
| the python runtime is open source.
|
| Inference code is the runtime; the code that runs the model.
| Not the model itself.
| mkolodny wrote:
| I disagree. The file I linked to, model.py, contains the
| Llama 3 model itself.
|
| You can use that model with open data to train it from
| scratch yourself. Or you can load Meta's open weights and
| have a working LLM.
| causal wrote:
| Yeah a lot of people here seem to not understand that
| PyTorch really does make model definitions that simple,
| and that has everything you need to resume back-
| propagation. Not to mention PyTorch itself being open-
| sourced by Meta.
|
| That said, the Llama license doesn't meet strict
| definitions of OS, and I bet they have internal tooling
| for datacenter-scale training that's not represented
| here.
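|
| A minimal sketch of what "the definition plus the released
| weights is enough to resume training" looks like (a toy
| stand-in module, not Meta's actual model.py; the weights
| file name is hypothetical):
|
|     import torch
|     import torch.nn as nn
|
|     class TinyLM(nn.Module):            # stand-in for model.py
|         def __init__(self, vocab=32_000, dim=64):
|             super().__init__()
|             self.emb = nn.Embedding(vocab, dim)
|             self.out = nn.Linear(dim, vocab)
|         def forward(self, ids):
|             return self.out(self.emb(ids))
|
|     model = TinyLM()
|     # model.load_state_dict(torch.load("weights.pt"))  # released weights
|     opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
|     ids = torch.randint(0, 32_000, (2, 16))
|     loss = nn.functional.cross_entropy(
|         model(ids[:, :-1]).flatten(0, 1), ids[:, 1:].flatten())
|     loss.backward()                     # gradients flow again
|     opt.step()                          # back-propagation resumed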
| yjftsjthsd-h wrote:
| > The file I linked to, model.py, contains the Llama 3
| model itself.
|
| That makes it source available (
| https://en.wikipedia.org/wiki/Source-available_software
| ), not open source
| macrolime wrote:
| Source available means you can see the source, but not
| modify it. This is kinda the opposite, you can modify the
| model, but you don't see all the details of its creation.
| yjftsjthsd-h wrote:
| > Source available means you can see the source, but not
| modify it.
|
| No, it doesn't mean that. To quote the page I linked,
| emphasis mine,
|
| > Source-available software is software released through
| a source code distribution model that includes
| arrangements where the source can be viewed, _and in some
| cases modified_ , but without necessarily meeting the
| criteria to be called open-source. The licenses
| associated with the offerings range from allowing code to
| be viewed for reference to allowing code to be modified
| and redistributed for both commercial and non-commercial
| purposes.
|
| > This is kinda the opposite, you can modify the model,
| but you don't see all the details of its creation.
|
| Per https://github.com/meta-
| llama/llama3/blob/main/LICENSE there's also a laundry
| list of ways you're not allowed to use it, including
| restrictions on commercial use. So not Open Source.
| Flimm wrote:
| No, it's not. The Llama 3 Community License Agreement is not
| an open source license. Open source licenses need to meet the
| criteria of the only widely accepted definition of "open
| source", and that's the one formulated by the OSI [0]. This
| license has multiple restrictions on use and distribution
| which make it not open source. I know Facebook keeps calling
| this stuff open source, maybe in order to get all the good
| will that open source branding gets you, but that doesn't
| make it true. It's like a company calling their candy vegan
| while listing one of its ingredients as pork-based gelatin. No
| matter how many times the company advertises that their
| product is vegan, it's not, because it doesn't meet the
| definition of vegan.
|
| [0] - https://opensource.org/osd
| CamperBob2 wrote:
| _Open source licenses need to meet the criteria of the only
| widely accepted definition of "open source", and that's the
| one formulated by the OSI [0]_
|
| Who died and made OSI God?
| vbarrielle wrote:
| The OSI was created in 1998 and defined and
| popularized the term open source. Their definition has
| been widely accepted over that period.
|
| Recently, companies are trying to market things as open
| source when in reality, they fail to adhere to the
| definition.
|
| I think we should not let these companies change the
| meaning of the term, which means it's important to
| explain every time they try to seem more open than they
| are.
|
| I'm afraid the battle is being lost though.
| Suppafly wrote:
| >The OSI was created about 20 years ago and defined and
| popularized the term open source. Their definition has
| been widely accepted over that period.
|
| It was defined and accepted by the community well before
| OSI came around though.
| MaxBarraclough wrote:
| This isn't helpful. The community defers to the OSI's
| definition because it captures what they care about.
|
| We've seen people try to deceptively describe non-OSS
| projects as open source, and no doubt we will continue to
| see it. Thankfully the community (including Hacker News)
| is quick to call it out, and to insist on not cheapening
| the term.
|
| This is one of the topics that just keeps turning up:
|
| * https://news.ycombinator.com/item?id=24483168
|
| * https://news.ycombinator.com/item?id=31203209
|
| * https://news.ycombinator.com/item?id=36591820
| CamperBob2 wrote:
| _This isn 't helpful. The community..._
|
| Speak for yourself, please. The term is much older than
| 1998, with one easily-Googled example being
| https://www.cia.gov/readingroom/docs/DOC_0000639879.pdf ,
| and an explicit case of IT-related usage being
| https://i.imgur.com/Nw4is6s.png from https://www.google.c
| om/books/edition/InfoWarCon/09X3Ove9uKgC... .
|
| Unless a registered trademark is involved (spoiler: it's
| not) no one, whether part of a so-called "community" or
| not, has any authority to gatekeep or dictate the terms
| under which a generic phrase like "open source" can be
| used.
| Flimm wrote:
| Neither of those usages relate to IT, they both are about
| sources of intelligence (espionage). Even if they were,
| the OSI definition won, nobody is using the definitions
| from 1995 CIA or the 1996 InfoWarCon book in the realm of
| IT, not even Facebook.
|
| The community has the authority to complain about
| companies mis-labelling their pork products as vegan,
| even if nobody has a registered trademark on the term
| vegan. Would you tell people to shut up about that case
| because they don't have a registered trademark? Likewise,
| the community has authority to complain about
| Meta/Facebook mis-labelling code as open source even when
| they put restrictions on usage. It's not gate-keeping or
| dictatorship to complain about being misled or being lied
| to.
| CamperBob2 wrote:
| _Would you tell people to shut up about that case because
| they don 't have a registered trademark?_
|
| I especially like how _I 'm_ the one telling people to
| "shut up" all of a sudden.
|
| As for the rest, see my other reply.
| Flimm wrote:
| You're right, I and those who agree with me were the
| first to ask people to "shut up", in this case, to ask
| Meta to stop misusing the term open source. And I was the
| first to say "shut up", and I know that can be
| inflammatory and disrespectful, so I shouldn't have used
| it. I'm sorry. We're here in a discussion forum, I want
| you to express your opinion even if it is to complain about
| my complaints. For what it's worth, your counter-
| arguments have been stronger and better referenced than
| any other I have read (for the case of accepting a looser
| definition of the term open source in the realm of IT).
| CamperBob2 wrote:
| All good, and I also apologize if my objection came
| across as disrespectful.
|
| This whole 'Open Source' thing is a bigger pet peeve than
| it should be, because I've received criticism for using
| the term on a page where I literally just posted a .zip
| file full of source code. The smart thing to do would
| have been to ignore and forget the criticism, which I
| will now work harder at doing.
|
| In the case of a pork producer who labels their products
| as 'vegan', that's different because there _is_ some
| authority behind the usage of 'vegan'. It's a standard
| English-language word that according to Merriam-Webster
| goes back to 1944. So that would amount to an open-and-
| shut case of false advertising, which I don't think
| applies here at all.
| MaxBarraclough wrote:
| > In the case of a pork producer who labels their
| products as 'vegan', that's different because there is
| some authority behind the usage of 'vegan'.
|
| I don't see the difference. _Open source software_ is a
| term of art with a specific meaning accepted by its
| community. When people misuse the term, invariably in
| such a way as to broaden it to include whatever it is
| they 're pushing, it's right that the community responds
| harshly.
| CamperBob2 wrote:
| Terms of art do not require licenses. A given term is
| either an ordinary dictionary word that everyone
| including the courts will readily recognize ("Vegan"), a
| trademark ("Microsoft(r) Office 365(tm)"), or a fragment
| of language that everyone can feel free to use for their
| own purposes without asking permission. "Open Source"
| falls into the latter category.
|
| This kind of argument is literally why trademark law
| exists. OSI did not elect to go down that path. Maybe
| they should have, but I respect their decision not to,
| and perhaps you should, too.
| 8note wrote:
| Isn't the MIT license the generally accepted "open source"
| license? It's a community owned term, not OSI owned
| henryfjordan wrote:
| There are more licenses than just MIT that are "open
| source". GPL, BSD, MIT, Apache, some of the Creative
| Commons licenses, etc. MIT has become the de facto default
| though
|
| https://opensource.org/license (linking to OSI for the
| list because it's convenient, not because they get to
| decide)
| yjftsjthsd-h wrote:
| MIT is _a_ permissive open source license, not _the_ open
| source license.
| NiloCK wrote:
| These discussions (ie, everything that follows here) would
| be much easier if the crowd insisting on the OSI definition
| of open source would capitalize Open Source.
|
| In English, proper nouns are capitalized.
|
| "Open" and "source" are both very normal English words.
| English speakers have "the right" to use them according to
| their own perspective and with personal context. It's the
| difference between referring to a blue tooth, and
| Bluetooth, or to an apple store or an Apple store.
| stale2002 wrote:
| Ok call it Open Weights then if the dictionary definitions
| matter so much to you.
|
| The actual point that matters is that these models are
| available for most people to use for a lot of stuff, and this
| is way way better than what competitors like OpenAI offer.
| the8thbit wrote:
| They don't "[allow] developers to modify its code however
| they want", which is a critical component of "open source",
| and one that Meta is clearly trying to leverage in branding
| around its products. I would like _them_ to start calling
| these "public weight models", because what they're doing now
| is muddying the waters so much that "open source" now just
| means providing an enormous binary and an open source harness
| to run it in, rather than serving access to the same binary
| via an API.
| Voloskaya wrote:
| Feels a bit like you are splitting hairs for the pleasure of
| semantic arguments, to be honest. Yes, there is no source in
| ML, so if we want to be pedantic it shouldn't be called
| open source. But what really matters in the open source
| movement is that we are able to take a program built by
| someone and modify it to do whatever we want with it,
| without having to ask someone for permission or get
| scrutinized or have to pay someone.
|
| The same applies here, you can take those models and modify
| them to do whatever you want (provided you know how to
| train ML models), without having to ask for permission, get
| scrutinized or pay someone.
|
| I personally think using the term open source is fine, as
| it conveys the intent correctly, even if, yes, weights are
| not sources you can read with your eyes.
| wrs wrote:
| Calling that "open source" renders the word "source"
| meaningless. By your definition, I can release a binary
| executable freely and call it "open source" because you
| can modify it to do whatever you want.
|
| Model weights are like a binary that _nobody_ has the
| source for. We need another term.
| Voloskaya wrote:
| No it's not the same as releasing a binary, feels like we
| can't get out of the pedantics. I can in theory modify a
| binary to do whatever I want. In practice it is
| intractably hard to make any significant modification to
| a binary, and even if you could, you would then not be
| legally allowed to e.g. redistribute.
|
| Here, modifying that model is not harder than doing
| regular ML, and I can redistribute.
|
| Meta doesn't have access to some magic higher-level
| abstraction for that model, withheld from the release,
| that would make working with it easier.
|
| The sources in ML are the architecture, the training and
| inference code, and a paper describing the training
| procedure. It's all there.
| the8thbit wrote:
| "In practice it is intractably hard to make any
| significant modification to a binary, and even if you
| could, you would then not be legally allowed to e.g.
| redistribute."
|
| It depends on the binary and the license the binary is
| released under. If the binary is released to the public
| domain, for example, you are free to make whatever
| modifications you wish. And there are plenty of licenses
| like this, that allow closed source software to be used
| as the user wishes. That doesn't make it open source.
|
| Likewise, there are plenty of closed source projects
| whose binaries we can poke and prod with much higher
| understanding of what our changes are actually doing than
| we're able to get when we poke and prod LLMs. If you want
| to make a Pokemon Red/Blue or Minecraft mod you have a
| lot of tools at your disposal.
|
| A project that only exists as a binary which the
| copyright holder has relinquished rights to, or has
| released under some similar permissive closed source
| license, but people have poked around enough to figure
| out how to modify certain parts of the binary with some
| degree of predictability is a more apt analogy.
| Especially if the original author has lost the source
| code, as there is no source code to speak of when
| discussing these models.
|
| I would not call that binary "open source", because the
| source would, in fact, not be open.
| wrs wrote:
| Can you change the tokenizer? No, because all you have is
| the weights trained with the current tokenizer.
| Therefore, by any normal definition, you don't have the
| source. You have a giant black box of numbers with no
| ability to reproduce it.
| Voloskaya wrote:
| > Can you change the tokenizer?
|
| Yes.
|
| You can change it however you like, then look at the
| paper [1] under section 3.2. to know which
| hyperparameters were used during training and finetune
| the model to work with your new tokenizer using e.g.
| FineWeb [2] dataset.
|
| You'll need to do only a fraction of the training you
| would have needed to do if you were to start a training
| from scratch for your tokenizer of choice. The weights
| released by Meta give you a massive head start and cost
| saving.
|
| The fact that it's not trivial to do and out of reach of
| most consumers is not a matter of openness. That's just
| how ML is today.
|
| [1]: https://scontent-
| sjc3-1.xx.fbcdn.net/v/t39.2365-6/452387774_...
|
| [2]:
| https://huggingface.co/datasets/HuggingFaceFW/fineweb
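|
| A rough sketch of the mechanical steps, assuming the Hugging
| Face stack (the new tokenizer name is hypothetical, and the
| continued-pretraining loop itself is omitted):
|
|     from datasets import load_dataset
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     model = AutoModelForCausalLM.from_pretrained(
|         "meta-llama/Meta-Llama-3.1-8B")          # released weights
|     new_tok = AutoTokenizer.from_pretrained("my/new-tokenizer")
|     model.resize_token_embeddings(len(new_tok))  # rows for the new vocab
|     data = load_dataset("HuggingFaceFW/fineweb", streaming=True)
|     # ...then continue pretraining on `data` with the usual
|     # causal-LM objective until the model adapts to `new_tok`.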
| wrs wrote:
| You can change the tokenizer and build _another_ model,
| if you can come up with your own version of the rest of
| the source (e.g., the training set, RLHF, etc.). You
| can't change the tokenizer for _this_ model, because you
| don't have all of its source.
| slavik81 wrote:
| > The same applies here, you can take those models and
| modify them to do whatever you want without having to ask
| for permission, get scrutinized or pay someone.
|
| The "Additional Commercial Terms" section of the license
| includes restrictions that would not meet the OSI
| definition of open source. You must ask for permission if
| you have too many users.
| bornfreddy wrote:
| "Public weight models" sounds about right, thanks for
| coming up with a good term! Hope it catches on.
| stale2002 wrote:
| My central point is this:
|
| "are available for most people to use for a lot of stuff,
| and this is way way better than what competitors like
| OpenAI offer."
|
| I presume you agree with it.
|
| > rather than serving access
|
| It's not the same access though.
|
| I am sure that you are creative enough to think of many
| questions that you could ask llama3, that would instead get
| you kicked off of OpenAI.
|
| > They don't "[allow] developers to modify its code however
| they want"
|
| Actually, the fact that the model weights are available
| means that you can even ignore any limitations that you
| think are on it, and you'll probably just get away with it.
| You are also ignoring the fact that the limitations are
| minimal to most people.
|
| That's a huge deal!
|
| And it is dishonest to compare a situation where
| limitations are both minimal and almost unenforceable
| (Except against maybe Google) to a situation where its
| physically not possible to get access to the model weights
| to do what you want with them.
| the8thbit wrote:
| > Actually, the fact that the model weights are available
| means that you can even ignore any limitations that you
| think are on it, and you'll probably just get away with
| it. You are also ignoring the fact that the limitations
| are minimal to most people.
|
| The limitations here are technical, not legal. (Though I
| am aware of the legal restrictions as well, and I think
| it's worth noting that no _other_ project would get by
| calling themselves open source while imposing a
| restriction which prevents competitors from using the
| system to build their competing systems.) There isn 't
| any source code to read and modify. Yes, you can fine
| tune a model just like you can modify a binary but this
| isn't _source code_. Source code is a human readable
| specification that a computer can use to transform into
| executable code. This allows the human to directly modify
| functionality in the specification. We simply don 't have
| that, and it will not be possible unless we make a lot of
| strides in interpretability research.
|
| > Its not the same access though.
|
| > I am sure that you are creative enough to think of many
| questions that you could ask llama3, that would instead
| get you kicked off of OpenAI.
|
| I'm not saying that systems that are provided as SaaS
| don't tend to be more restrictive in terms of what they
| let you do through the API they expose vs what is
| possible if you run the same system locally. That may not
| always be true, but sure, as a general rule it is. I
| mean, it can't be _less_ restrictive. However, that doesn
| 't mean that being able to run code on your own machine
| makes the code open source. I wouldn't consider Windows
| open source, for example. Why? Because they haven't
| released the source code for Windows. Likewise, I
| wouldn't consider these models open source because their
| creators haven't released source code for them. The fact
| that this is technically infeasible doesn't mean we should
| change the definition so that it's no longer infeasible.
| It is simply infeasible, and if we want to
| change that, we need to do work in interpretability, not
| pretend like the problem is already solved.
| stale2002 wrote:
| So then yes you agree with this:
|
| "are available for most people to use for a lot of stuff,
| and this is way way better than what competitors like
| OpenAI offer." And that this is very significant.
| input_sh wrote:
| Open Source Initiative (kind of a de facto authority on what's
| open source and what's not) is spending a whole lot of time
| figuring out what it means for an AI system to be open source.
| In other words, they're basically trying to come up with a new
| license because the existing ones can't easily apply.
|
| I believe this is the current draft:
| https://opensource.org/deepdive/drafts/the-open-source-ai-de...
| downWidOutaFite wrote:
| OSI made themselves the authority because they hated Richard
| Stallman and his Free Software movement. It's just marketing.
| gowld wrote:
| RMS has no interest in governing Open Source, so your
| comment bears no particular relevance.
|
| RMS is an advocate for Free Software. Free Software
| generally implies Open Source, but not the converse.
|
| RMS considers openness of source to be a separate category
| from the freeness of software. "Free software is a
| political movement; open source is a development model."
|
| https://www.gnu.org/licenses/license-list.en.html
| ab5tract wrote:
| Are you really pretending that OSI and the open source
| label itself wasn't a reactionary movement that vilified
| free software principles in hopes of gaining corporate
| traction?
|
| Most of us who were there remember it differently. True
| open source advocates will find little to refute in what
| I've said.
| cheema33 wrote:
| > True open source advocates will find little to refute
| in what I've said.
|
| No true Scotsman
| https://en.wikipedia.org/wiki/No_true_Scotsman
|
| OSI helped popularize the open source movement. They not
| only make it palatable to businesses, but got them
| excited about it. I think that FSF/Stallman alone would
| not have been very successful on this front with
| GPL/AGPL.
| ab5tract wrote:
| Like I said, honest open source advocates won't take
| issue with how I framed their position.
|
| Here's a more important point: how far would the open
| source people have gotten without GCC and glibc?
|
| Much less far than they will ever admit, in my
| experience.
| miffy900 wrote:
| > Most of us who were there remember it differently. True
| open source advocates will find little to refute in what
| I've said.
|
| > Like I said, honest open source advocates won't take
| issue with how I framed their position.
|
| Yet you've failed to provide even a single point of
| evidence to back up your claim.
|
| > "honest open source advocates"
|
| You've literally just made this term up. It's
| meaningless.
| halostatue wrote:
| For some advocates, sure. I was there, too -- although at
| the beginning of my career and not deeply involved in
| most licensing discussions until the founding of Mozilla
| (where I argued _against_ the GNU GPL and was generally
| pleased with the result of the MPL). However, from ~1990,
| I remember sharing some code where I "more or less" made
| my code public domain but recommended people consider the
| GNU GPL as part of the README (I don't have the source
| code available, so I don't recall).
|
| Your characterization is quite easily refutable, because
| at the time that OSI was founded, there was _already_ an
| explosion of possible licenses and RMS and other
| GNUnatics were making lots of noise about GNU /Linux and
| trying to be as maximalist as possible while presenting
| any choice _other_ than the GNU GPL as "against
| freedom".
|
| This _certainly_ would not have held well with people who
| were using the MIT Licence or BSD licences (created
| around the same time as the GNU GPL v1), who believed
| (and continue to believe) that there were options _other_
| than a restrictive viral licence++. Yes, some of the
| people involved vilified the "free software principles",
| but there were also GNU "advocates" who were making RMS
| look tame with their wording (I recall someone telling me
| to enjoy "software slavery" because I preferred licences
| other than the GNU GPL).
|
| The "Free Software" advocates were pretending that the
| goals of their licence were the only goals that should
| matter for all authors and consumers of software. That is
| not and never has been the case, so it is unsurprising
| that there was a bit of reaction to such extremism.
|
| OSI and the open source label _were_ a move to make
| things easier for corporations to accept and understand
| by providing (a) a clear unifying definition, and (b) a
| set of licences and guidelines for knowing what licenses
| did what and the risks and obligations they presented to
| people who used software under those licences.
|
| ++ Don't @ me on this, because both the virality and
| restrictiveness are features of the GNU GPL. If it
| weren't for the nonsense in the preamble, it would be a
| _good_ licence. As it is, it is an _effective_ if
| rampantly misrepresented licence.
| dogleash wrote:
| Didn't the Open Source Definition start as the DFSG? You
| telling me Debian hates the Free Software movement? Unless
| you define "hating Free Software" as "not banning the BSD
| license", then I'll have to disagree.
| Zambyte wrote:
| > If so, then how can current ML models be open source?
|
| The source of a language model is the text it was trained on.
| Llama models are not open source (contrary to their claims),
| they are open weight.
| thayne wrote:
| I think it would also include the code used to train it
| pphysch wrote:
| That would be more analogous to the build toolchain than
| the source code, but yes
| tshaddox wrote:
| Surely traditional "open source" also needs some notion
| of a reproducible build toolchain, otherwise the source
| code itself is approximately useless.
|
| Imagine if the source code was in a programming language
| of which the basic syntax and semantics were known to no
| one but the original developers.
|
| Or more realistically, I think it's a major problem if an
| open source project can only be built by an esoteric
| process that only the original developers have access to.
| pphysch wrote:
| Source code in a vacuum is still valuable as a way to
| deal with missing/inaccurate documentation and diagnose
| faults and their causes.
|
| A raw training dataset similarly has some value, as you can
| analyze it for different characteristics to understand
| why the trained model is under/over-representing
| different concepts.
|
| But yes real FOSS should be "open-build" and allow anyone
| to build a test-passing artifact from raw source
| material.
| moffkalast wrote:
| You can find the entire Llama 3.0 pretraining set here:
| https://huggingface.co/datasets/HuggingFaceFW/fineweb
|
| 15T tokens, 45 terabytes. Seems fairly open source to me.
| Zambyte wrote:
| Where has Facebook linked that? I can't find anywhere that
| they actually published that.
| moffkalast wrote:
| Yeah I don't think I've seen it linked officially, but
| Meta does this sort of semi-official stuff all the time,
| leaking models ahead of time for PR, they even have a
| dedicated Reddit account for releasing unofficial info.
|
| Regardless, it fits the compute used and the claim that
| they trained from public web data, and was suspiciously
| published by HF staff shortly after L3 released. It's
| about as official as the Mistral 7B v0.2 base model. I.e.
| mostly, but not entirely, probably for some weird legal
| reasons.
| nickpsecurity wrote:
| Many companies stopped publishing their data sets after
| people published evidence that they were mass copyright
| infringement. They dropped the specifics of pretraining
| data from the model cards.
|
| Aside from licensing content, the fact that content
| creators don't like redistribution means a lawful model
| would probably only use Gutenberg's collection and
| permissive code.
| Anything else, including Wikipedia, usually has licensing
| requirements they might violate.
| verdverm wrote:
| Says it is ~94TB, with >130k downloads, implying more than
| 12 exabytes of copying, seems a bit off, wonder how they
| are calculating downloads
| root_axis wrote:
| No. The text is an asset used by the source to train the
| model. The source can process arbitrary text. Text is just
| text, it was written for communication purposes, software
| (defined by source code) processes that text in a particular
| way to train a model.
| Zambyte wrote:
| In programming, "source" and "asset" have specific meanings
| that conflict with how you used them.
|
| Source is the input to some built artifact. It is the
| _source_ of that artifact. As in: where the artifact comes
| from. Textual input is absolutely the source of the ML
| model. What you are using "source" as is analogous to the
| source of the compiler in traditional programming.
|
| Asset is an artifact used as input, that is revered
| verbatim by the output. For example, a logo baked into an
| application to be rendered in the UI. The compilation of
| the program doesn't make a new logo, it just moves the
| asset into the built artifact.
| Zambyte wrote:
| I hadn't had my morning coffee yet when I wrote this and
| I have no idea what I meant instead of "revered", but you
| get the idea :D
| gorgoiler wrote:
| One counterpoint is that major publications (eg New York Times)
| would have you believe that AI is a mildly lossy compression
| algorithm capable of reconstructing the original source
| material.
| actinium226 wrote:
| It's not?
| _flux wrote:
| I believe it is able to reconstruct parts of the original
| source material--if the interrogator already knows the
| original source material to prompt the model appropriately.
| halflings wrote:
| Training code is only useful to people in academia, and the
| closest thing to "code you can modify" is open weights.
|
| People are framing this as if it was an open-source hierarchy,
| with "actual" open-source requiring all training code to be
| shared. This is not obvious to me, as I'm not asking people
| that share open-source libraries to also share the tools they
| used to develop them. I'm also not asking them to share all the
| design documents/architecture discussion behind this software.
| It's sufficient that I can take the end result and reshape it
| in any way I desire.
|
| This is coming from an LLM practitioner that finetunes models
| for a living; and this constant debate about open-source vs
| open-weights seems like a huge distraction vs the impact open-
| sourcing something like Llama has... this is truly a Linux-like
| moment. (at a much smaller scale of course, for now at least)
| kemiller wrote:
| I dunno -- if an open source project required, say, a
| proprietary compiler, that would diminish its open source-
| ness. But I agree it's not totally comparable, since the
| weights are not particularly analogous to machine code. We
| probably need a new term. Open Weights.
| 0-_-0 wrote:
| There are many "compilers", you can download The Pile
| yourself.
| nothrowaways wrote:
| Weight is the new code.
| nomel wrote:
| I think saying it's the new binary is closer to the truth.
| You can't reproduce it, but you can use it. In this new
| version, you can even nudge it a bit to do something a
| _little_ different.
|
| New stuff, so probably not good to force old words, with
| known meanings, onto new stuff.
| GreenWatermelon wrote:
| The model is more akin to a python script than a compiled C
| binary. This is how I see it:
|
| Training Code and dataset are analogous to the developer
| who wrote the script
|
| Model and weights are end product that is then released
|
| Inference Code is the runtime that could execute the code.
| That would be e.g. PyTorch, which can import the weights
| and run inference.
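|
| A minimal sketch of that split, with a toy stand-in (the
| weights file is hypothetical, not the real Llama artifacts):
|
|     import torch
|     import torch.nn as nn
|
|     # "runtime" = architecture definition executed by PyTorch
|     runtime = nn.Sequential(nn.Embedding(32_000, 64),
|                             nn.Linear(64, 32_000))
|     torch.save(runtime.state_dict(), "weights.pt")     # the "end product"
|     runtime.load_state_dict(torch.load("weights.pt"))  # import the weights
|     with torch.no_grad():
|         print(runtime(torch.tensor([[1, 2, 3]])).shape)  # run inference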
| nomel wrote:
| > The model is more akin to a python script than a
| compiled C binary.
|
| No, I completely disagree. Python is near pseudo-text
| source. Source exists for the specific purpose of being
| easily and _completely_ understood, by humans, because it
| 's for and from humans. You can turn a python calculator
| into a web server, because it can be split and separated
| at any point, because it can be _completely understood_
| at any point, and it 's _deterministic at every point_.
|
| A model _cannot be understood_ by a human. It isn 't
| meant to be. It's meant to be used, very close to as is.
| You can't fundamentally change the model, or dissect it,
| you can only nudge it in a direction, with the force of
| that nudge being proportional to the money you can burn,
| along with hope that it turns out how you want.
|
| That's why I say it's closer to a binary: more of a black
| box you can use. You can't easily make a binary do
| something fundamentally different without changing the
| source. You can't easily see into that black box, or even
| know what it will do without trying. You can only nudge
| it to act a little differently, or use it as part of a
| workflow. (decompilation tools aside ;))
| GuB-42 wrote:
| I like the term "open weights". Open source would be the
| dataset and code that generates these weights.
|
| There is still a lot you can do with weights, like fine tuning,
| and it is arguably more useful as retraining the entire model
| would cost millions in compute.
| szundi wrote:
| Open source = reproducible binaries (weights) by you on your
| computer, IMO.
|
| FB's strategy is that they are happy to be just a user, and
| fine with ruining competitors' businesses with good-enough
| free alternatives while collecting awards as saviors of
| whatever.
| ric2b wrote:
| If that were the definition then any software you can install
| on your computer would be open source. It makes open source
| lose nearly all meaning.
|
| Just say "open weights", not "open source".
| rmbyrro wrote:
| If you think about LLMs as a new kind of programming runtime,
| the matrices are the source.
| beloch wrote:
| It's no secret that implementing AI usually involves _far_ more
| investment into training and teaching than actual code. You can
| know how a neural net or other ML model works. You can have all
| the code before you. It 's still a _huge_ job (and investment)
| to do anything practical with that. If Meta shares the code
| their AI runs on with you, you 're not going to be able to do
| much with it unless you make the same investment in gathering
| data and teaching to train that AI. That would probably require
| data Meta _won 't_ share. You'd effectively need your own
| Facebook.
|
| If everyone open sources their AI code, Meta can snatch the
| bits that help them without much fear of helping their direct
| competitors.
| the8thbit wrote:
| I think you're misunderstanding what I'm saying. I don't
| think it's technically feasible for current models to be open
| source, because there is no source code to open. Yes, there
| is a harness that runs the model, but the vast, vast amount
| of instructions are contained in the model weights, which are
| akin to a compiled binary.
|
| If we make large strides in interpretability we may have
| something resembling source code, but we're certainly not
| there yet. I don't think the solution to that problem should
| be to change the definition of open source and pretend the
| problem has been solved.
| seoulmetro wrote:
| Unfortunately open source really just means an open API these
| days. The API is heavily intertwined with closed source.
| langcss wrote:
| Coming up with the words and concepts to describe the models is
| a challenge.
|
| Does the training data require permission from the copyright
| holder to use? Are the weights really open source or more like
| compiled assembly?
| shdjkKA wrote:
| Of course you are right, I'd put it less carefully: The quoted
| Linux line is deceptive marketing.
|
| - If we start with the closed training set, that is closed and
| stolen, so call it Stolen Source.
|
| - What is distributed is a bunch of float arrays. The Llama
| architecture is published, but not the training or inference
| code. Without code there is no open source. You might as well
| call a compiler book open source, because it tells you how to
| build a compiler.
|
| Pure marketing, but predictably many people follow their
| corporate overlords and eagerly adopt the co-opted terms.
|
| Reminder again that FB is not releasing this out of altruism,
| but because they have an existing profitable business model
| that does not depend on generated chats. They probably do use
| it internally for tracking and building profiles, but that is
| the same as using Linux internally, so they release the weights
| to destroy the competition.
|
| Isn't price dumping an antitrust issue?
| bjornsing wrote:
| The term "source code" can mean many things. In a legal context
| it's often just defined as the preferred format for
| modification. It can be argued that for artificial neural
| networks that's the weights (along with code and preferably
| training data).
| kashyapc wrote:
| I agree; there's a lot of muddiness in the term "open source
| AI". Earlier this year there was a talk[1] at FOSDEM, titled _"
| Moving a step closer to defining Open Source AI"_. It is from
| someone at the Open Source Initiative. The video and slides are
| available in the link below[1]. From the abstract:
|
| _" Finding an agreement on what constitutes Open Source AI is
| the most important challenge facing the free software (also
| known as open source) movement. European regulation already
| started referring to "free and open source AI", large economic
| actors like Meta are calling their systems "open source"
| despite the fact that their license contain restrictions on
| fields-of-use (among other things) and the landscape is
| evolving so quickly that if we don't keep up, we'll be
| irrelevant."_
|
| [1]
| https://fosdem.org/2024/schedule/event/fosdem-2024-2805-movi...
| defining-open-source-ai/
| rbits wrote:
| You release all the technology and the training data.
| Everything that was used to create the model, including
| instructions.
|
| I'm not sure if facebook has done that
| Oras wrote:
| This is obviously good news, but __personally__ I feel the open-
| source models are just trying to catch up with whoever the market
| leader is, based on some benchmarks.
|
| The actual problem is running these models. Very few companies
| can afford the hardware to run these models privately. If you run
| them in the cloud, then I don't see any potential financial gain
| for any company to fine-tune these huge models just to catch up
| with OpenAI or Anthropic, when you can probably get a much better
| deal by fine-tuning the closed-source models.
|
| Also this point:
|
| > We need to protect our data. Many organizations handle
| sensitive data that they need to secure and can't send to closed
| models over cloud APIs.
|
| First, it's ironic that Meta is talking about privacy. Second,
| most companies will run these models in the cloud anyway. You can
| run OpenAI via Azure Enterprise and Anthropic on AWS Bedrock.
| simonw wrote:
| "Very few companies can afford the hardware to run these models
| privately."
|
| I can run Llama 3 70B on my (64GB RAM M2) laptop. I haven't
| tried 3.1 yet but I expect to be able to run that 70B model
| too.
|
| As for the 405B model, the Llama 3.1 announcement says:
|
| > To support large-scale production inference for a model at
| the scale of the 405B, we quantized our models from 16-bit
| (BF16) to 8-bit (FP8) numerics, effectively lowering the
| compute requirements needed and allowing the model to run
| within a single server node.
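|
| The weight-memory arithmetic behind both points, as a rough
| estimate that ignores KV cache and activation overhead (the
| 8x80GB node size and the 4-bit laptop quant are assumptions):
|
|     params_405b = 405e9
|     bf16_gb = params_405b * 2 / 1e9   # ~810 GB, too big for one node
|     fp8_gb  = params_405b * 1 / 1e9   # ~405 GB
|     node_gb = 8 * 80                  # e.g. one 8x80GB GPU server = 640 GB
|     print(fp8_gb < node_gb)           # True: fits on a single node
|
|     laptop_gb = 70e9 * 0.5 / 1e9      # 70B at ~4-bit: ~35 GB, fits in 64 GB RAM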
| InDubioProRubio wrote:
| CrowdStrike just added "Centralized Company Controlled Software
| Ecosystem" to every risk data sheet on the planet. Everything
| futureproof is self-hosted and open source.
| mesebrec wrote:
| Note that Meta's models are not open source in any interpretation
| of the term.
|
| * You can't use them for any purpose. For example, the license
| prohibits using these models to train other models.
|
| * You can't meaningfully modify them given there is almost no
| information available about the training data, how they were
| trained, or how the training data was processed.
|
| As such, the model itself is not available under an open source
| license and the AI does not comply with the "open source AI"
| definition by OSI.
|
| It's an utter disgrace for Meta to write such a blogpost patting
| themselves on the back while lying about how open these models
| are.
| ChadNauseam wrote:
| > you can't meaningfully modify them given there is almost no
| information available about the training data, how they were
| trained, or how the training data was processed.
|
| I was under the impression that you could still fine-tune the
| models or apply your own RLHF on top of them. My understanding
| is that the training data would mostly be useful for training
| the model yourself from scratch (possibly after modifying the
| training data), which would be extremely expensive and out of
| reach for most people
| mesebrec wrote:
| Indeed, fine-tuning is still possible, but you can only go so
| far with fine-tuning before you need to completely retrain
| the model.
|
| This is why Silo AI, for example, had to start from scratch
| to get better support for small European languages.
| chasd00 wrote:
| From what i understand the training data and careful curation
| of it is the hard part. Everyone wants training data sets to
| train their own models instead of producing their own.
| causal wrote:
| You are definitely allowed to train other models with these
| models, you just have to give credit in the name, per the
| license:
|
| > If you use the Llama Materials or any outputs or results of
| the Llama Materials to create, train, fine tune, or otherwise
| improve an AI model, which is distributed or made available,
| you shall also include "Llama" at the beginning of any such AI
| model name.
| mesebrec wrote:
| Indeed, this is something they changed in the 3.1 version of
| the license.
|
| Regardless, the license [1] still has many restrictions, such
| as the acceptable use policy [2].
|
| [1] https://huggingface.co/meta-llama/Meta-
| Llama-3.1-8B/blob/mai...
|
| [2] https://llama.meta.com/llama3_1/use-policy
| tw04 wrote:
| >In the early days of high-performance computing, the major tech
| companies of the day each invested heavily in developing their
| own closed source versions of Unix.
|
| Because they sold the resultant code and systems built on it for
| money... this is the gold miner saying that all shovels and jeans
| should be free.
|
| Am I happy Facebook open sources some of their code? Sure, I
| think it's good for everyone. Do I think they're talking out of
| both sides of their mouth? Absolutely.
|
| Let me know when Facebook opens up the entirety of their Ad and
| Tracking platforms and we can start talking about how it's silly
| for companies to keep software closed.
|
| I can say with 100% confidence if Facebook were selling their AI
| advances instead of selling the output it produces, they wouldn't
| be advocating for everyone else to open source their stacks.
| JumpCrisscross wrote:
| > _if Facebook were selling their AI advances instead of
| selling the output it produces, they wouldn 't be advocating
| for everyone else to open source their stack_
|
| You're acting as if commoditizing one's complements is either
| new or reprehensible [1].
|
| [1] https://gwern.net/complement
| tw04 wrote:
| >You're acting as if commoditizing one's complements is
| either new or reprehensible [1].
|
| I'm acting as if calling on other companies to open source
| their core product, just because it's a complement for you,
| and acting as if it's for the benefit of mankind is
| disingenuous, which it is.
| stale2002 wrote:
| > as if it's for the benefit of mankind
|
| But it does benefit mankind.
|
| More free tech products is good for the world.
|
| This is a good thing. When people or companies do good
| things, they should get the credit for doing good things.
| JumpCrisscross wrote:
| > _acting as if it 's for the benefit of mankind is
| disingenuous, which it is_
|
| Is it bad for mankind that Meta publishes its weights?
| Mutually beneficial is a valid game state--there is no
| moral law that requires anything good be made as a
| sacrifice.
| rvnx wrote:
| The source code to the ad-tracking platform is useless to
| users.
|
| In the end, it's actually Facebook doing the right thing
| (though they are known for being evil).
|
| It's a bit of an irony.
|
| The supposedly "good" and "open" people like Google or OpenAI,
| haven't given their model weights.
|
| A bit like Microsoft became the company that actually supports
| the whole open-source ecosystem with GitHub.
| tw04 wrote:
| >The source-code to Ad tracking platform is useless to users.
|
| It's absolutely not useless for developers looking to build a
| competing project.
|
| >The supposedly "good" and "open" people like Google or
| OpenAI, haven't given their model weights.
|
| Because they're monetizing it... the only reason Facebook is
| giving it away is because it's a complement to their core
| product of selling ads. If they were monetizing it, it would
| be closed source. Just like their Ads platform...
| abetusk wrote:
| Another case of "open-washing". Llama is not available open
| source, under the common definition of open source, as the
| license doesn't allow for commercial re-use by default [0].
|
| They provide their model, with weights and code, as "source
| available" and it looks like they allow for commercial use until
| a 700M monthly active user cap is surpassed. They also don't allow
| you to train other AI models with their model:
|
| """ ... v. You will not use the Llama Materials or any output or
| results of the Llama Materials to improve any other large
| language model (excluding Meta Llama 3 or derivative works
| thereof). ... """
|
| [0] https://github.com/meta-llama/llama3/blob/main/LICENSE
| sillysaurusx wrote:
| They cannot legally enforce this, because they don't have the
| rights to the content they trained it on. Whoever's willing to
| fund that court battle would likely win.
|
| There's a legal precedent that says hard work alone isn't
| enough to guarantee copyright, i.e. it doesn't matter that it
| took millions of dollars to train.
| whimsicalism wrote:
| i think these clauses are unenforceable. it's telling that OAI
| hasn't tried a similar suit despite multiple extremely well-
| known cases of competitors training on OAI outputs
| nuz wrote:
| Everyone complaining about not having data access: Remember that
| without meta you would have openai and anthropic and that's it.
| I'm really thankful they're releasing this, and the reason they
| can't release the data is obvious.
| mesebrec wrote:
| Without Meta, you would still have Mistral, Silo AI, and the
| many other companies and labs producing much more open models
| with similar performance.
| Invictus0 wrote:
| The irony of this letter being written by Mark Zuckerberg at
| Meta, while OpenAI continues to be anything but open, is richer
| than anyone could have imagined.
| 1024core wrote:
| "open source AI" ... "open" ... "open" ....
|
| And you can't even try it without an FB/IG account.
|
| Zuck will never change.
| causal wrote:
| I think you can use an HF account as well
| https://huggingface.co/meta-llama
| Gracana wrote:
| You can also wait a bit for someone to upload quantized
| variants, finetunes, etc, and download those. FWIW I'm not
| making a claim about the legality of that, just saying it's
| an easy way around needing to sign the agreement.
| CamperBob2 wrote:
| It doesn't require an account. You do have to fill in your name
| and email (and birthdate, although it seems to accept whatever
| you feed it.)
| mvkel wrote:
| It's a real shame that we're still calling Llama "open source"
| when at best it's "open weights."
|
| Not that anyone would go buy 100,000 H100s to train their own
| Llama, but words matter. Definitions matter.
| sidcool wrote:
| Honest question. As far as LLMs are concerned, isn't open
| weights same as open source?
| mesebrec wrote:
| Open source requires, at the very least, that you can use it
| for any purpose. This is not the case with Llama.
|
| The Llama license has a lot of restrictions, based on user
| base size, type of use, etc.
|
| For example you're not allowed to use Llama to train or
| improve other models.
|
| But it goes much further than that. The government of India
| can't use Llama because they're too large. Sex workers are
| not allowed to use Llama due to the acceptable use policy of
| the license. Then there is also the vague language
| probibiting discrimination, racism etc.. good luck getting
| prohibiting discrimination, racism, etc. Good luck getting
| aloe_falsa wrote:
| GPL defines the "source code" of a work as the preferred form
| of the work for making modifications to it. If Meta released
| a petabyte of raw training data, would that really be easier
| to extend and adapt (as opposed to fine-tuning the weights)?
| paulhilbert wrote:
| No, I would argue that from the three main ingredients -
| training data, model source code and weights - weights are
| the furthest away from something akin to source code.
|
| They're more like obfuscated binaries. When it comes to fine-
| tuning only however things shift a little bit, yes.
| sidcool wrote:
| I don't expect them to release the data used to train the
| models. But I agree that the code is an important
| ingredient of 'open'.
| frabcus wrote:
| Must include the code that curates the data
| blackeyeblitzar wrote:
| No. Open weights are the output of a proprietary and secretive
| process of training. It's like sharing a precompiled
| application instead of what you need to reproduce the
| compiled application.
|
| AI2's OLMo is an example of what open source actually looks
| like for LLMs:
|
| https://blog.allenai.org/hello-olmo-a-truly-open-
| llm-43f7e73...
| lolinder wrote:
| Source versus weights seems like a really pedantic distinction
| to make. As you say, the training code and training data would
| be worthless to anyone who doesn't have compute on the level
| that Meta does. Arguably, the weights are source code
| interpreted by an inference engine, and realistically it's the
| weights that someone is going to want to modify through fine-
| tuning, not the original training code and data.
|
| The far more important distinction is "open" versus "not open",
| and I disagree that we should cede that distinction while
| trying to fight for "source". The Llama license is restrictive
| in a number of ways (it incorporates an entire acceptable use
| policy) that make it most definitely not "open" in the
| customary sense.
| mvkel wrote:
| > training code and training data would be worthless to
| anyone who doesn't have compute on the level that Meta does
|
| I don't fully agree.
|
| Isn't that like saying *nix being open source is worthless
| unless you're planning to ship your own Linux distro?
|
| Knowing how the sausage is made is important if you're an
| animal rights activist.
| JamesBarney wrote:
| https://llama.meta.com/llama3_1/use-policy/
|
| The acceptable use policy seems fine. Don't use it to
| break the law, solicit sex, kill people, or lie.
| lolinder wrote:
| It's fine in that I'm happy to use it and don't think I'll
| be breaking the terms anytime soon. It's not fine in that
| one of the primary things that makes open source open is
| that an open source license doesn't restrict groups of
| people or whole fields from usage of the software. The
| policy has a number of such blanket bans on industries,
| which, while reasonable, make the license not truly open.
| mvkel wrote:
| This is like saying "You have the right to privacy. The
| police can tap your phone, but you have nothing to worry
| about as long as you're not breaking the law."
|
| "we're open source, you can use it for anything you can
| imagine. But you can't use it for these specific things."
|
| Then there's the added rub of the source not really being
| source code, but a CSV file.
|
| That's fine. If you want to set that expectation, great!
| But don't call it open source.
| frabcus wrote:
| Meta could change the license of future releases of Llama and
| kill your business built on it.
|
| If the training data was openly available, even if you can't
| afford to retrain a new version, a competitor like Amazon
| could do it for you
| lolinder wrote:
| > Meta could change the license of future releases of Llama
| and kill your business built on it.
|
| If you built a business on Llama 3.1, you're not going to
| suddenly go down in flames because you can't upgrade to
| Llama 4.
|
| Even saying you really needed to upgrade, Llama 4 would be
| a new model that you'd have to adapt your prompts for
| anyway, you can't just version bump and call it good. If
| you're going to update prompts anyway, at that point you
| can just switch to any other competitor model. Updating
| models isn't urgent, you have time to do it slowly and
| right.
|
| > If the training data was openly available, even if you
| can't afford to res train a new version, a competitor like
| Amazon could do it for you
|
| If Llama 4 changed the license then presumably you wouldn't
| have access to its training data even if you did have
| access to Llama 3.1's. So now you have access to Llama
| 3.1's training data... now what? You want to recreate the
| Llama 3.1 weights in response to the Llama 4 release?
| rybosworld wrote:
| Huge companies like facebook will often argue for solutions that
| on the surface, seem to be in the public interest.
|
| But I have strong doubts they (or any other company) actually
| believe what they are saying.
|
| Here is the reality:
|
| - Facebook is spending untold billions on GPU hardware.
|
| - Facebook is arguing in favor of open sourcing the models, that
| they spent billions of dollars to generate, for free...?
|
| It follows that companies with much smaller resources (money)
| will not be able to match what Facebook is doing. Seems like an
| attempt to kill off the competition (specifically, smaller
| organizations) before they can take root.
| Salgat wrote:
| The reason for Meta making their model open source is rather
| simple: They receive an unimaginable amount of free labor, and
| their license only excludes their major competitors to ensure
| mass adoption without benefiting their competition (Microsoft,
| Google, Alibaba, etc). Public interest, philanthropy, etc are
| just nice little marketing bonuses as far as they're concerned
| (otherwise they wouldn't be including this licensing
| restriction).
| noiseinvacuum wrote:
| All correct, Meta does obviously benefit.
|
| It's helpful to also look at what do the developers and
| companies (everyone outside of top 5/10 big tech companies)
| get out of this. They get open access to weights of SOTA LLM
| models that take billions of dollars to train and 10s of
| billions a year to run the AI labs that make these. They get
| the freedom to fine tune them, to distill them, and to host
| them on their own hardware in whatever way works best for
| their products and services.
| frabcus wrote:
| Meta haven't made an open source model. They have released a
| binary with a proprietary but relatively liberal license.
| Binaries are not source and their license isn't free.
| mattnewton wrote:
| I actually think this is one of the rare times where the small
| guys interests are aligned with Meta. Meta is scared of a world
| where they are locked out of LLM platforms, one where OpenAI
| gets to dictate rules around their use of the platform much
| like Apple and Google dictates rules around advertiser data and
| monetization on their mobile platforms. Small developers should
| be scared of a world where the only competitive LLMs are owned
| by those players too.
|
| Through this lens, Meta's actions make more sense to me. Why
| invest billions in VR/AR? The answer is simple, don't get
| locked out of the next platform, maybe you can own the next
| one. Why invest in LLMs? Again, don't get locked out. Google
| and OpenAi/Microsoft are far larger and ahead of Meta right now
| and Meta genuinely believes the best way to make sure they have
| an LLM they control is to make everyone else have an LLM they
| can control. That way community efforts are unified around
| their standard.
| mupuff1234 wrote:
| Sure, but don't you think the "not getting locked out" is
| just the pre-requisite for their eventual goal of locking
| everyone else out?
| yesco wrote:
| Does it really matter? Attributing goodwill to a company is
| like attributing goodwill to a spider that happens to clean
| up the bugs in your basement. Sure if they had the ability
| to, I'm confident Meta would try something like that, but
| they obviously don't, and will not for the foreseeable
| future.
|
| I have faith they will continue to do what's in their best
| interests and if their best interests happen to align with
| mine, then I will support that. Just like how I don't
| bother killing the spider in my basement because it helps
| clean up the other bugs.
| mupuff1234 wrote:
| But you also know that the spider has been laying eggs so
| you better have an extermination plan ready.
| whitepaint wrote:
| Everyone is aware of that. No one thinks Facebook or Mark
| are some saint entities. But while the spider is doing
| some good deeds why not just go "yeah! go spider!". Once
| it becomes an asshole, we will kill it. People are not
| dumb.
| mupuff1234 wrote:
| It's not even truly open source, they set a user limit.
| xvector wrote:
| I'm not particularly concerned about the user limit. The
| companies for which those limits will matter are so large
| that they should consider contributing back to humanity
| by developing their own SOTA foundation models.
| noiseinvacuum wrote:
| If by "everyone else" here you mean 3 or 4 large players
| trying to create a regulatory moat around themselves then I
| am fine with them getting locked out and not being able to
| create a moat for next 3 decades.
| myaccountonhn wrote:
| > I actually think this is one of the rare times where the
| small guys interests are aligned with Meta
|
| Small guys are the ones being screwed over by AI companies
| and having their text/art/code stolen without any attribution
| or adherence to license. I don't think Meta is on their side
| at all
| MisterPea wrote:
| That's a separate problem which affects small to large
| players alike (e.g. ScarJo).
|
| Small companies interests are aligned with Meta as they are
| now on an equal footing with large incumbent players. They
| can now compete with a similarly sized team at a big tech
| company instead of that team + dozens of AI scientists
| ketzo wrote:
| Meta is, fundamentally, a user-generated-content distribution
| company.
|
| Meta wants to make sure they commoditize their complements:
| they don't want a world where OpenAI captures all the value of
| content generation, they want the cost of producing the best
| content to be as close to free as possible.
| chasd00 wrote:
| I was thinking along the same lines. A lot of content generated by
| LLMs is going to end up on Facebook or Instagram. The easier
| it is to create AI generated content the more content ends up
| on those applications.
| Nesco wrote:
| Especially because genAI is a copyright laundering system.
| You can train it on copyrighted material and none of the
| content generated with it is copyrightable, which is
| perfect for social apps
| KaiserPro wrote:
| The model itself isn't actually that valuable to Facebook.
| The thing that's important is the dataset, the infrastructure
| and the people to make the models.
|
| There is still, just about, a strong ethos (especially in the
| research teams) to chuck loads of stuff over the wall into
| open source (pytorch, detectron, SAM, aria, etc).
|
| But it's seen internally as a two-part strategy:
|
| 1) strong recruitment tool (come work with us, we've done cool
| things, and you'll be able to write papers)
|
| 2) seeding the research community with a common toolset.
| jorblumesea wrote:
| Cynically I think this position is largely due to how they can
| undercut OpenAI's moat.
| wayeq wrote:
| It's not cynical, it's just an awareness that public companies
| have a fiduciary duty to their shareholders.
| cs702 wrote:
| _> We're releasing Llama 3.1 405B, the first frontier-level open
| source AI model, as well as new and improved Llama 3.1 70B and 8B
| models._
|
| _Bravo!_ While I don't agree with Zuck's views and actions on
| many fronts, on this occasion I think he and the AI folks at Meta
| deserve our praise and gratitude. With this release, they have
| brought the cost of pretraining a frontier 400B+ parameter model
| to ZERO for pretty much everyone -- well, everyone _except_
| Meta's key competitors.[a] THANK YOU ZUCK.
|
| Meanwhile, the business-minded people at Meta surely won't mind
| if the release of these frontier models to the public happens to
| completely mess up the AI plans of competitors like
| OpenAI/Microsoft, Google, Anthropic, etc. Come to think of it,
| the negative impact on such competitors was likely a key
| motivation for releasing the new models.
|
| ---
|
| [a] The license is not open to the handful of companies worldwide
| which have more than 700M users.
| swyx wrote:
| > the AI folks at Meta deserve our praise and gratitude
|
| We interviewed Thomas who led Llama 2 and 3 post training here
| in case you want to hear from someone closer to the ground on
| the models https://www.latent.space/p/llama-3
| throwaway_2494 wrote:
| > We're releasing Llama 3.1 405B
|
| Is it possible to run this with ollama?
| jessechin wrote:
| Sure, if you have an H100 cluster. If you quant it to int4 you
| might get away with using only 4 H100 GPUs!
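|
| Back-of-the-envelope math (a rough sketch; real deployments also
| need headroom for the KV cache and activations):
|     params = 405e9          # Llama 3.1 405B parameter count
|     bytes_per_param = 0.5   # int4 quantization = 4 bits per weight
|     weights_gb = params * bytes_per_param / 1e9
|     print(weights_gb)       # ~202 GB of quantized weights
|     print(4 * 80)           # 320 GB across four 80 GB H100s
| So the int4 weights alone just about fit on 4 cards, which is
| where that figure comes from.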
| sheepscreek wrote:
| Assuming $25k a pop, that's at least $100k in just the GPUs
| alone. Throw in their linking technology (NVLink) and cost
| for the remaining parts, won't be surprised if you're
| looking at $150k for such a cluster. Which is not bad to be
| honest, for something at this scale.
|
| Can anyone share the cost of their pre-built clusters,
| they've recently started selling? (sorry feeling lazy to
| research atm, I might do that later when I have more time).
| rty32 wrote:
| You can rent H100 GPUs.
| tomp wrote:
| you're about right.
|
| https://smicro.eu/nvidia-hgx-h100-640gb-935-24287-0001-000-1
|
| 8x H100 HGX cluster for EUR250k + VAT
| vorticalbox wrote:
| If you have the ram for it.
|
| Ollama will offload as many layers as it can to the GPU, then
| the rest will run on the CPU/RAM.
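|
| A minimal sketch of what that looks like from the Python client
| (assuming the `ollama` package and a local Ollama server; the
| `num_gpu` option sets how many layers get offloaded to the GPU,
| with the remainder running on CPU/RAM):
|     import ollama
|     resp = ollama.chat(
|         model="llama3.1:70b",
|         messages=[{"role": "user", "content": "Hello"}],
|         options={"num_gpu": 30},  # offload only 30 layers to the GPU
|     )
|     print(resp["message"]["content"])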
| tambourine_man wrote:
| Praising is good. Gratitude is a bit much. They got this big by
| selling user generated content and private info to the highest
| bidder. Often through questionable means.
|
| Also, the underdog always touts Open Source and standards, so
| it's good to remain skeptical when/if tables turn.
| sheepscreek wrote:
| All said and done, it is a very _expensive_ and ballsy way to
| undercut competitors. They've spent > $5B on hardware alone,
| much of which will depreciate in value quickly.
|
| Pretty sure the only reason Meta's managed to do this is
| because of Zuck's iron grip on the board (majority voting
| rights). This is great for Open Source and regular people
| though!
| wrsh07 wrote:
| Zuck made a bet when they provisioned for reels to buy
| enough GPUs to be able to spin up another reels-sized
| service.
|
| Llama is probably just running on spare capacity (I mean,
| sure, they've kept increasing capex, but if they're worried
| about an llm-based fb competitor they sort of have to in
| order to enact their copycat strategy)
| fractalf wrote:
| Well, he didn't do it to be "nice", you can be sure about
| that. Obviously they see a financial gain
| somewhere/sometime
| tambourine_man wrote:
| At Meta level, spending $5B to stay competitive is not
| ballsy. It's a bargain.
| ricardo81 wrote:
| >selling user generated content and private info to the
| highest bidder
|
| Was always their modus operandi, surely. How else would they
| have survived?
|
| Thanks for returning everyone else's content, and never mind
| all the content stealing your platform did.
| jart wrote:
| I'm perfectly happy with them draining the life essence out
| of the people crazy enough to still use Facebook, if they're
| funneling the profits into advancing human progress with AI.
| It's an Alfred Nobel kind of thing to do.
| kataklasm wrote:
| It's not often you see a take this bad on HN. Wow!
|
| You are aware Facebook tracks everyone, not just people
| with Facebook accounts, right? They have a history of being
| anti-consumer in every sense of the word. So while I can
| understand where you're coming from, it's just not anywhere
| close to being reality.
|
| Whether you want it or not, whether you consent or not, Facebook
| is tracking and selling you.
| germinalphrase wrote:
| "Come to think of it, the negative impact on such competitors
| was likely a key motivation for releasing the new models."
|
| "Commoditize Your Complement" is often cited here:
| https://gwern.net/complement
| tintor wrote:
| > they have brought the cost of pretraining a frontier 400B+
| parameter model to ZERO
|
| It is still far from zero.
| cs702 wrote:
| If the model is already pretrained, there's no need to
| pretrain it, so the cost of pretraining is zero.
| moffkalast wrote:
| Yeah but you only have the one model, and so far it seems
| to be only good on paper.
| pwdisswordfishd wrote:
| Makes me wonder why he's really doing this. Zuckerberg being
| Zuckerberg, it can't be out of any genuine sense of altruism.
| Probably just wants to crush all competitors before he
| monetizes the next generation of Meta AI.
| spiralk wrote:
| It's certainly not altruism. Given that Facebook/Meta owns the
| largest user data collection systems, any advancement in AI
| ultimately strengthens their business model (which is still
| mostly collecting private user data, amassing large user
| datasets, and selling targeted ads).
|
| There is a demo video that shows a user wearing a Quest VR
| headset and asks the AI "what do you see" and it interprets
| everything around it. Then, "what goes well with these
| shorts"... You can see where this is going. Wearing headsets
| with AIs monitoring everything the users see and collecting
| even more data is becoming normalized. Imagine the private
| data harvesting capabilities of the internet but anywhere in
| the physical world. People need not even choose to wear a
| Meta headset, simply passing a user with a Meta headset in
| public will be enough to have private data collected. This
| will be the inevitable result of vision models improvements
| integrated into mobile VR/AR headsets.
| goatlover wrote:
| That's very dystopian. It's bad enough having cameras
| everywhere now. I never opted in to being recorded.
| warkdarrior wrote:
| That sounds fantastic. If they make the Meta headset easy
| to wear and somewhat fashionable (closer to eyeglass than
| to a motorcycle helmet), I'd take it everywhere and record
| everything. Give me a retrospective search and
| conferences/meetings will be so much easier (I am terrible
| with names).
| meroes wrote:
| I wouldn't even say hi, let alone my name, to someone wearing
| a Meta headset out in public. And if facial recognition
| becomes that common for wearers, most of the population
| is going to adorn something to prevent that. And if it's
| at work, I'm not working there and I have to think many
| would agree. Coworkers don't and wouldn't tolerate
| coworkers taking videos or pictures of them.
| sebastiennight wrote:
| This is not how the overwhelming majority of the world
| works though.
|
| > if facial recognition becomes that common for wearers,
| most of the population is going to adorn something to
| prevent that
|
| "Most of the population" is going to be "the wearers".
|
| > Coworkers don't and wouldn't tolerate coworkers taking
| videos or pictures of them.
|
| Here is a fun experience you can try: just hit "record"
| on every single Teams or Meet meeting you're ever on (or
| just set recording as the default setting in the app).
|
| See how many coworkers comment on it, let alone protest.
|
| I can tell you from experience (of having been in
| thousands of hours of recorded meetings in the last 3
| years) that the answer is zero.
| spiralk wrote:
| You are probably right, but that is truly a cyberpunk
| dystopian situation. A few megacorps will catalog every
| human interaction and there will be no way to opt out.
| xvector wrote:
| I'd gladly wear a headset like that! I think you
| dramatically overestimate the number of people that would
| actually care about any theoretical privacy infringement
| here.
|
| > And if facial recognition becomes that common for
| wearers, most of the population is going to adorn
| something to prevent that.
|
| In my opinion, you do not have an accurate view of how
| much the average person cares about this. London is the
| most surveilled city on the planet with widespread
| CCTV/facial recognition, as is Washington D.C. and China.
| But literally no one bothers with anti-surveillance
| measures.
|
| > I'm not working there and I have to think many would
| agree. Coworkers don't and wouldn't tolerate coworkers
| taking videos or pictures of them.
|
| This is a very antiquated view IMO. You are already being
| filmed and monitored at work. I see no issue with a local
| LLM interpreting my environment, or even a privacy-aware
| secure LLM deployment like Apple's Private Cloud Compute:
| https://security.apple.com/blog/private-cloud-compute/
| troupo wrote:
| > I think you dramatically overestimate the number of
| people that would actually care about any theoretical
| privacy infringement
|
| Not really surprised that you don't see it as a problem
|
| > This is a very antiquated view IMO. You are already
| being filmed and monitored at work.
|
| Not really surprised that you don't see it as a problem
| talldayo wrote:
| > a privacy-aware secure LLM
|
| Funniest thing I've heard all month.
| talldayo wrote:
| Of course, no Hacker News thread is complete without the
| "I would _never_ shake hands with an Android user " guy
| who just _has_ to virtue signal.
|
| > And if facial recognition becomes that common for
| wearers, most of the population is going to adorn
| something to prevent that
|
| My brother in Christ, you sincerely underestimate how
| much "most of the population" gives a shit. Most people
| are being tracked by Google Maps or FindMy, are
| triangulated with cell towers that know their exact
| coordinates, and willingly use social media that profiles
| them individually. The population doesn't even try in the
| slightest to resist any of it.
| phyrex wrote:
| You can always listen to the investor calls for the
| capitalist point of view. In short, attracting talent,
| building the ecosystem, and making it really easy for users
| to make stuff they want to share on Meta's social networks
| bun_at_work wrote:
| I really think the value of this for Meta is content
| generation. More open models (especially state of the art)
| means more content is being generated, and more content is
| being shared on Meta platforms, so there is more advertising
| revenue for Meta.
| chasd00 wrote:
| All the content generated by llms (good or bad) is going to
| end up back in Facebook/Instagram and other social media
| sites. This enables Meta to show growth and therefore demand
| a higher stock price. So it makes sense to get content
| generation tools out there as widely as possible.
| zmmmmm wrote:
| He's not even pretending it's altruism. Literally about 1/3
| of the entire post is the section titled "Why Open Source AI
| Is Good for Meta". I find it really weird that there are
| whole debates in threads here about whether it's altruistic
| when Zuckerberg isn't making that claim in the first place.
| cageface wrote:
| He addresses this pretty clearly in the post. They don't want
| to be beholden to other companies to build the products they
| want to build. Their experience being under Apple's thumb on
| mobile strongly shaped this point of view.
| GreenWatermelon wrote:
| Zuckerberg didn't really say anything about altruism. The
| point he was making is an explicit "I believe open models are
| best for our business"
|
| He was clear in that one of their motivations is avoiding
| vendor lockin. He doesn't want Meta to be under the control
| of their competitors or other AI providers.
|
| He also recognizes the value brought to his company by open
| sourcing products. Just look at React, PyTorch, and GraphQL.
| All industry standards, and all brought tremendous value to
| Facebook.
| troupo wrote:
| There's nothing open source about it.
|
| It's a proprietary dump of data you can't replicate or verify.
|
| What were the sources? What datasets was it trained on? What
| are the training parameters? And so on and so on.
| advael wrote:
| Look, absolutely zero people in the world should trust any tech
| company when they say they care about or will keep commitments
| to the open-source ecosystem in any capacity. Nevertheless, it
| is occasionally strategic for them to do so, and there can be
| ancillary benefits for said ecosystem in those moments where
| this is the best play for them to harm their competitors
|
| For now, Meta seems to release Llama models in ways that don't
| significantly lock people into their infrastructure. If that
| ever stops being the case, you should fork rather than trust
| their judgment. I say this knowing full well that most of the
| internet is on AWS or GCP, most brick and mortar businesses use
| Windows, and carrying a proprietary smartphone is essentially
| required to participate in many aspects of the modern economy.
| All of this is a mistake. You can't resist all lock-in. The
| players involved effectively run the world. You should still
| try where you can, and we should still be happy when tech
| companies either slip up or make the momentary strategic
| decision to make this easier
| ori_b wrote:
| > _If that ever stops being the case, you should fork rather
| than trust their judgment._
|
| Fork what? The secret sauce is in the training data and
| infrastructure. I don't think either of those is currently
| open.
| quasse wrote:
| I'm just a lowly outsider to the AI space, but calling
| these open source models seems kind of like calling a
| compiled binary open source.
|
| If you don't have a way to replicate what they did to
| create the model, it seems more like freeware than open
| source.
| advael wrote:
| As an ML researcher, I agree. Meta doesn't include
| adequate information to replicate the models, and from
| the perspective of fundamental research, the interest
| that big tech companies have taken in this field has been
| a significant impediment to independent researchers,
| despite the fact that they are undeniably producing
| groundbreaking results in many respects, due to this
| fundamental lack of openness
|
| This should also make everyone very skeptical of any
| claim they are making, from benchmark results to the
| legalities involved in their training process to the
| prospect of future progress on these models. Without
| being able to vet their results against the same datasets
| they're using, there is no way to verify what they're
| saying, and the credulity that otherwise smart people
| have been exhibiting in this space has been baffling to
| me
|
| As a developer, if you have a working Llama model,
| including the source code and weights, and it's crucial
| for something you're building or have already built, it's
| still fundamentally a good thing that Meta isn't gating
| it behind an API and if they went away tomorrow, you
| could still use, self-host, retrain, and study the models
| warkdarrior wrote:
| The model is public, so you can at least verify their
| benchmark claims.
| advael wrote:
| Generally speaking, no. An important part of a lot of
| benchmarks in ML research is generalization. What this
| means is that it's often a lot easier to get a machine
| learning model to memorize the test cases in a benchmark
| than it is to train it to perform a general capability
| the benchmark is trying to test for. For that reason, the
| dataset is important, as if it includes the benchmark
| test cases in some way, it invalidates the test
|
| When AI research was still mostly academic, I'm sure a
| lot of people still cheated, but there was somewhat less
| incentive to, and norms like publishing datasets made it
| easier to verify claims made in research papers. In a
| world where people don't, and there's significant
| financial incentive to lie, I just kind of assume they're
| lying
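|
| A toy illustration of why the dataset matters (a sketch, not any
| lab's actual decontamination pipeline): a crude check is to look
| for long n-gram overlaps between training documents and benchmark
| test items, which you can only do if the training data is open.
|     def ngrams(text, n=13):
|         toks = text.lower().split()
|         return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}
|
|     def looks_contaminated(train_doc, test_item, n=13):
|         # Flag a test item if any 13-gram from it also appears
|         # verbatim in a training document (similar in spirit to
|         # the overlap checks described in LLM papers).
|         return bool(ngrams(train_doc, n) & ngrams(test_item, n))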
| Nuzzerino wrote:
| Which option would be better?
|
| A) Release the data, and if it ends up causing a privacy
| scandal, at least you can actually call it open this
| time.
|
| B) Neuter the dataset, and the model
|
| All I ever see in these threads is a lot of whining and
| no viable alternative solutions (I'm fine with the idea
| of it being a hard problem, but when I see this attitude
| from "researchers" it makes me less optimistic about the
| future)
|
| > and the credulity that otherwise smart people have been
| exhibiting in this space has been baffling to me
|
| Remove the "otherwise" and you're halfway to
| understanding your error.
| wanderingbort wrote:
| > Release the data, and if it ends up causing a privacy
| scandal...
|
| We can't prove that a model like llama will never produce
| a segment of its training data set verbatim.
|
| Any potential privacy scandal is already in motion.
|
| My cynical assumption is that Meta knows that competitors
| like OpenAI have PR-bombs in their trained model and
| therefore would never opensource the weights.
| advael wrote:
| This isn't a dilemma at all. If Facebook can't release
| data it trains on because it would compromise user
| privacy, it is already a significant privacy violation
| that should be a scandal, and if it would prompt some
| regulatory or legislative remedies against Facebook for
| them to release the data, it should do the same for
| releasing the trained model, even through an API. The
| only reason people don't think about it this way is that
| public awareness of how these technologies work isn't
| pervasive enough for the general public to think it
| through, and it's hard to prove definitively. Basically,
| if this is Facebook's position, it's saying that the
| release of the model already constitutes a violation of
| user privacy, but they're betting no one will catch them
|
| If the company wants to help research, it should full-
| throatedly endorse the position that it doesn't consider
| it a violation of privacy to train on the data it does,
| and release it so that it can be useful for research. If
| the company thinks it's safeguarding user privacy, it
| shouldn't be training models on data it considers private
| and then using them in public-facing ways at all
|
| As it stands, Facebook seems to take the position that it
| wants to help the development of software built on models
| like Llama, but not really the fundamental research that
| goes into building those models in the same way
| xvector wrote:
| > If Facebook can't release data it trains on because it
| would compromise user privacy, it is already a
| significant privacy violation that should be a scandal
|
| Thousands of entities would scramble to sue Facebook over
| any released dataset _no matter what the privacy
| implications of the dataset are._
|
| It's just not worth it in _any_ world. I believe you are
| not thinking of this problem from the view of the PM or
| VPs that would actually have to approve this: if I were a
| VP and I was 99% confident that the dataset had no
| privacy implications, I still wouldn 't release it. Just
| not worth the inevitable long, drawn out lawsuits from
| people and regulators trying to get their pound of flesh.
|
| I feel the world is too hostile to big tech and AI to
| enable something like this. So, unless we want to kill
| AGI development in the cradle, this is what we get - and
| we can thank modern populist techno-pessimism for
| cultivating this environment.
| troupo wrote:
| Translation: "we train our data on private user data and
| copyrighted material so of course we cannot disclose any
| of our datasets or we'll be sued into oblivion"
|
| There's no AGI development in the cradle. And the world
| isn't "hostile". The world is increasingly tired of
| predatory behavior by supranational corporations
| advael wrote:
| This post demonstrates a willful ignorance of the factors
| driving so-called "populist techno-pessimism" and I'm
| sure every time a member of the public is exposed to
| someone talking like this, their "techno-pessimism" is
| galvanized
|
| The ire people have toward tech companies right now is,
| like most ire, perhaps in places overreaching. But it is
| mostly justified by the real actions of tech companies,
| and facebook has done more to deserve it than most. The
| thought process you just described sounds like an
| accurate prediction of the mindset and culture of a VP
| within Facebook, and I'd like you to reflect on it for a
| sec. Basically, you rightly point out that the org
| releasing what data they have would likely invite
| lawsuits, and then you proceeded to do some kind of
| insane offscreen mental gymnastics that allow this
| reality to mean nothing to you but that the unwashed
| masses irrationally hate the company for some unknowable
| reason
|
| Like you're talking about a company that has spent the
| last decade buying competitors to maintain an insane
| amount of control over billions of users' access to their
| friends, feeding them an increasingly degraded and
| invasive channel of information that also from time to
| time runs nonconsensual social experiments on them, and
| following even people who didn't opt in around the
| internet through shady analytics plugins in order to sell
| dossiers of information on them to whoever will pay. What
| do you think it is? Are people just jealous of their
| success, or might they have some legit grievances that
| may cause them to distrust and maybe even loathe such an
| entity? It is hard for me to believe Facebook has a
| dataset large enough to train a current-gen LLM that
| wouldn't also feel, viscerally, to many, like a privacy
| violation. Whether any party that felt this way could
| actually win a lawsuit is questionable though, as the US
| doesn't really have signficant privacy laws, and this is
| partially due to extensive collaboration with, and
| lobbying by, Facebook and other tech companies who do
| mass-surveillance of this kind
|
| I remember a movie called Das Leben der Anderen (2006)
| (Officially translated as "the lives of others") which
| got accolades for how it could make people who hadn't
| experienced it feel how unsettling the surveillance state
| of East Germany was, and now your average American is
| more comprehensively surveilled than the Stasi could have
| imagined, and this is in large part due to companies like
| facebook
|
| Frankly, I'm not an AGI doomer, but if the capabilities
| of near-future AI systems are even in the vague ballpark
| of the (fairly unfounded) claims the American tech
| monopolies make about them, it would be an unprecedented
| disaster on a global scale if those companies got there
| first, so inasmuch as we view "AGI research" as something
| that's inevitably going to hit milestones in corporate
| labs with secretive datasets, I think we should
| absolutely kill it to whatever degree is possible, and
| that's as someone who truly, deeply believes that AI
| research has been beneficial to humanity and could
| continue to become moreso
| sensanaty wrote:
| > I feel the world is too hostile to big tech
|
| Lmao what? If the world were sane and hostile to big
| tech, we would've nuked them all years ago for all the
| bullshit they pulled and continue to pull. Big tech has
| politicians in their pockets, but thankfully the
| "populist techno-pessimist" (read: normal people who are
| sick of billionaires exploiting the entire planet) are
| finally starting to turn their opinions, albeit slowly.
|
| If we lived in a sane world Cambridge Analytica would've
| been the death knell of Facebook and all of the people
| involved with it. But we instead live in a world where
| psychopathic pieces of shit like Zucc get away with it,
| because they can just buy off any politician who knocks
| on their doors.
| Nuzzerino wrote:
| > it seems more like freeware than open source.
|
| What would you have them do instead? Specifically?
| wongarsu wrote:
| > If you don't have a way to replicate what they did to
| create the model, it seems more like freeware
|
| Isn't that a bit like arguing that a linux kernel driver
| isn't open source if I just give you a bunch of GPL-
| licensed source code that speaks to my device, but no
| documentation how my device works? If you take away the
| source code you have no way to recreate it. But so far
| that never caused anyone to call the code not open-
| source. The closest is the whole GPL3 Tivoization debate
| and that was very divisive.
|
| The heart of the issue is that open source is kind of
| hard to define for anything that isn't software. As a
| proxy we could look at Stallman's free software
| definition. Free software shares a common history with
| open source, and in most cases open source software is also
| free/libre (and the other way around), so this might be a
| useful proxy.
|
| So checking the four software freedoms:
|
| - The freedom to run the program as you wish, for any
| purpose: For most purposes. There's that 700M user
| restriction, also Meta forbids breaking the law and
| requires you to follow their acceptable use policy.
|
| - The freedom to study how the program works, and change
| it so it does your computing as you wish: yes. You can
| change it by fine tuning it, and the weights allow you to
| figure out how it works. At least as well as anyone knows
| how any large neural network works, but it's not like
| Meta is keeping something from you here
|
| - The freedom to redistribute copies so you can help your
| neighbor: Allowed, no real asterisks
|
| - The freedom to distribute copies of your modified
| versions to others: Yes
|
| So is it Free Software(tm)? Not really, but it is pretty
| close.
| advael wrote:
| The model is "open-source" for the purpose of software
| engineering, and it's "closed data" for the purpose of AI
| research. These are separate issues and it's not
| necessary to conflate them under one term
| JKCalhoun wrote:
| A good point.
|
| Forgive me, I am AI naive, is there some way to harness
| Llama to train one's own actually-open AI?
| advael wrote:
| Kinda. Since you can self-host the model on a linux
| machine, there's no meaningful way for them to prevent
| you from having the trained weights. You can use this to
| bootstrap other models, or retrain on your own datasets,
| or fine-tune from the starting point of the currently-
| working model. What you can't do is be sure what they
| trained it on
| QuercusMax wrote:
| How open is it _really_ though? If you're starting from
| their weights, do you actually have legal permission to
| use derived models for commercial purposes? If it turns
| out that Meta used datasets they didn't have licenses to
| use in order to generate the model, then you might be in
| a big heap of mess.
| ein0p wrote:
| I could be wrong but most "model" licenses prohibit the
| use of the models to improve other models
| logicchains wrote:
| They actually did open source the infrastructure library
| they developed. They don't open source the data but they
| describe how they gathered/filtered it.
| ladzoppelin wrote:
| Is forking really possible with an LLM, or one the size of
| future Llama versions? Have they even released the weights and
| everything? Maybe I am just negative about it because I feel
| Meta is the worst company ever invented and that this will
| hurt society in the long run, just like Facebook.
| lawlessone wrote:
| > have they even released the weights?
|
| Isn't that what the model is? Just a collection of weights?
| pmarreck wrote:
| When you run `ollama pull llama3.1:70b`, which you can
| literally do right now (assuming ollama from ollama.com is
| installed and you're not afraid of the terminal), and it
| downloads a 40 gigabyte model, _that is the weights_!
|
| I'd consider the ability to admit when even your most hated
| adversary is doing something right, a hallmark of acting
| smarter.
|
| Now, they haven't released the training data with the model
| weights. THAT plus the training tooling would be "end to
| end open source". Apple actually did _that very thing_
| recently, and it flew under almost everyone's radar for
| some reason:
|
| https://x.com/vaishaal/status/1813956553042711006?s=46&t=qW
| a...
| mym1990 wrote:
| Doing something right vs doing something that seems right
| but has a hidden self interest that is harmful in the
| long run can be vastly different things. Often this kind
| of strategy will allow people to let their guard down,
| and those same people will get steamrolled down the road,
| left wondering where it all went wrong. Get smarter.
| pmarreck wrote:
| How in the heck is an open source model that is free and
| open today going to lock me down, down the line? This is
| nonsense. You can literally run this model forever if you
| use NixOS (or never touch your windows, macos or linux
| install again). Zuck can't come back and molest it. Ever.
|
| The best I can tell is that their self-interest here is
| more about gathering mindshare. That's not a terrible
| motive; in fact, that's a pretty decent one. It's not the
| bully pressing you into their ecosystem with a tit-for-
| tat; it's the nerd showing off his latest and going
| "Here. Try it. Join me. Join us."
| mym1990 wrote:
| Yeah because history isn't absolutely littered with
| examples of shiny things being dangled in front of people
| with the intent to entrap them /s.
|
| Can you really say this model will still be useful in 2
| years, 5 years for _you_? And that FB's stance on these
| models will still be open source at that time once they
| incrementally make improvements? Maybe, maybe not. But FB
| doesn't give anything away for free, and the fact that
| you think so is your blindness, not mine. In case you
| haven't figured it out, this isn't a technology problem,
| this is a "FB needs marketshare and it needs it fast"
| problem.
| pmarreck wrote:
| > But FB doesn't give anything away for free, and the
| fact that you think so is your blindness, not mine
|
| Is it, though? They are literally giving this away "for
| free": https://dev.to/llm_explorer/llama3-license-explained-2915
| Unless you build a service with it that
| has over 700 million monthly users (read: "problem anyone
| would love to have"), you do not have to re-negotiate a
| license agreement with them. Beyond that, it can't "phone
| home" or do any other sorts of nefarious shite. The other
| limitations there, which you can plainly read, seem not
| very restrictive.
|
| Is there a magic secret clause conspiracy buried within
| the license agreement that you believe will be magically
| pulled out at the worst possible moment? >..<
|
| Sometimes, good things happen. Sorry you're "too blinded"
| by past hurt experience to see that, I guess
| troupo wrote:
| > How in the heck is an open source model that is free
| and open today
|
| It's free, but it's not open source
| holoduke wrote:
| In tech you can trust the underdogs. Once they turn into
| dominant players they turn evil. 99% of the cases.
| sandworm101 wrote:
| >> Bravo! While I don't agree with Zuck's views and actions on
| many fronts, on this occasion I think he and the AI folks at
| Meta deserve our praise and gratitude.
|
| Nope. Not one bit. Supporting F/OSS when it suits you in one
| area and then being totally dismissive of it in _every other
| area_ should not be lauded. How about open sourcing some of
| FB's VR efforts?
| y04nn wrote:
| Don't be fooled, it is an "embrace, extend, extinguish" strategy.
| Once they have enough usage and become the default standard, they
| will start to find any possible way to make you pay.
| war321 wrote:
| Hasn't really happened with PyTorch or any of their other
| open sourced releases tbh.
| GreenWatermelon wrote:
| Credit where due: Facebook didn't do that with React or
| PyTorch. Meta will reap benefit for sure, but they don't seem
| to be betting on selling the model itself, rather they will
| benefit from being at the forefront of a new ecosystem.
| tyler-jn wrote:
| So far, it seems like this release has done ~nothing to the
| stock price for GOOGL/MSFT, which we all know has been propped
| up largely on the basis of their AI plans. So it's probably
| premature to say that this has messed it up for them.
| userabchn wrote:
| Interview with Mark Zuckerberg released today:
| https://www.bloomberg.com/news/videos/2024-07-23/mark-zucker...
| starship006 wrote:
| > Our adversaries are great at espionage, stealing models that
| fit on a thumb drive is relatively easy, and most tech companies
| are far from operating in a way that would make this more
| difficult.
|
| Mostly unrelated to the correctness of the article, but this
| feels like a bad argument. AFAIK, Anthropic/OpenAI/Google are not
| having issues with their weights being leaked (are they?). Why is
| it that Meta's model weights are?
| meowface wrote:
| >AFAIK, Anthropic/OpenAI/Google are not having issues with
| their weights being leaked. Why is it that Meta's model weights
| are?
|
| The main threat actors there would be powerful nation-states,
| in which case they'd be unlikely to leak what they've taken.
|
| It is a bad argument though, because one day possession of AI
| models (and associated resources) might confer great and
| dangerous power, and we can't just throw up our hands and say
| "welp, no point trying to protect this, might as well let
| everyone have it". I don't think that'll happen anytime soon,
| but I am personally somewhat in the AI doomer camp.
| whimsicalism wrote:
| We have no way of knowing whether nation-state level actors
| have access to those weights.
| skybrian wrote:
| I think it's hard to say. We simply don't know much from the
| outside. Microsoft has had some pretty bad security lapses, for
| example around guarding access to Windows source code. I don't
| think we've seen a bad security break-in at Google in quite a
| few years? It would surprise me if Anthropic and OpenAI had
| good security since they're pretty new, and fast-growing
| startups have a lot of organizational challenges.
|
| It seems safe to assume that not all the companies doing
| leading-edge LLM's have good security and that the industry as
| a whole isn't set up to keep secrets for long. Things aren't
| locked down to the level of classified research. And it sounds
| like Zuckerberg doesn't want to play the game that way.
|
| At the state level, China has independent AI research efforts
| and they're going to figure it out. It's largely a matter of
| timing, which could matter a lot.
|
| There's still an argument to be made against making
| proliferation too easy. Just because states have powerful
| weapons doesn't mean you want them in the hands of people on
| the street.
| dfadsadsf wrote:
| We have nationals/citizens of every major US adversary working
| in those companies, with looser security practices than a local
| warehouse. Security checks before hiring are a joke (mostly
| verifying that the resume checks out), laptops can be taken
| home, and internal communications are not segmented on a
| need-to-know basis. Essentially, if China wants weights or
| source code, it will have hundreds of people to choose from who
| can provide it.
| probablybetter wrote:
| I would avoid Facebook and Meta products in general. I do NOT
| trust them. We have approx. 20 years of their record to go upon.
| diggan wrote:
| > Today we're taking the next steps towards open source AI
| becoming the industry standard. We're releasing Llama 3.1 405B,
| the first frontier-level open source AI model,
|
| Why do people keep mislabeling this as Open Source? The whole
| point of calling something Open Source is that the "magic sauce"
| of how to build something is publicly available, so I could build
| it myself if I have the means. But without the training data
| publicly available, could I train Llama 3.1 if I had the means?
| No wonder Zuckerberg doesn't start with defining what Open Source
| actually means, as then the blogpost would have lost all meaning
| from the get go.
|
| Just call it "Open Model" or something. As it stands right now,
| the meaning of Open Source is being diluted by all these
| companies pretending to do one thing while actually doing
| something else.
|
| I initially got very excited seeing the title and the domain,
| but hopelessly sad after reading through the article and
| realizing they're still trying to pass their artifacts off as
| Open Source projects.
| valine wrote:
| The codebase to do the training is way less valuable than the
| weights for the vast majority of people. Releasing the training
| code would be nice, but it doesn't really help anyone but
| Meta's direct competitors.
|
| If you want to train on top of Llama there's absolutely nothing
| stopping you. Plenty of open source tools to do parameter
| optimization.
| diggan wrote:
| Not just the training code but the training data as well
| should be under a permissive license; otherwise you cannot
| call the project itself Open Source, which Facebook does
| here.
|
| > is way less valuable than the weights for the vast majority
| of people
|
| The same is true for most Open Source projects, most people
| use the distributed binaries or other artifacts from the
| projects, and couldn't care less about the code itself. But
| that doesn't warrant us changing the meaning of Open Source
| just because companies feel like it's free PR.
|
| > If you want to train on top of Llama there's absolutely
| nothing stopping you.
|
| Sure, but in order for the intent of Open Source to be true
| for Llama, I should be able to build this project from
| scratch. Say I have a farm of 100 A100's, could I reproduce
| the Llama model from scratch today?
| unshavedyak wrote:
| > Not just the training code but the training data as well,
| should be under a permissive license, otherwise you cannot
| call the project itself Open Source, which Facebook does
| here.
|
| Does FB even have the capability to do that? I'd assume
| there's a bunch of data that's not theirs and they can't
| even release it. Let alone some data that they might not
| want to admit is in the source.
| bornfreddy wrote:
| If not, it is questionable if they should train on such
| data anyway.
|
| Also, that doesn't matter in this discussion - if you are
| unable to release the source under appropriate licence
| (for whatever reason), you should not call it Open
| Source.
| talldayo wrote:
| I will steelman the idea that a tokenizer and weights are
| all you need for the "source" of an LLM. They are
| components that can be modified, redistributed and when put
| together, reproduce the full experience intended.
|
| If we _insist_ upon the release of training data with Open
| models, you might as well kiss the idea of usable Open LLMs
| out the door. Most of the content in training datasets like
| The Pile are not licensed for redistribution in any way
| shape or form. It would jeopardize projects that _do_ use
| transparent training data while not offering anything of
| value to the community compared to the training code.
| Republishing all training data is an absolute trap.
| enriquto wrote:
| > Most of the content in training datasets like The Pile
| are not licensed for redistribution in any way shape or
| form.
|
| But distributing the weights is a "form" of distribution.
| You can recover many items of the dataset (most easily,
| the outliers) by using the weights.
|
| Just because they are codified in a non-readily
| accessible way, does not mean that you are not
| distributing them.
|
| It's scary to think that "training" is becoming a thinly
| veiled way to strip copyright of works.
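|
| A crude way to probe that claim (a sketch; it assumes a locally
| served model via the `ollama` Python client and only catches
| verbatim regurgitation, which is the easy case):
|     import ollama
|
|     def looks_memorized(doc, prefix_words=50, check_words=30):
|         words = doc.split()
|         prefix = " ".join(words[:prefix_words])
|         target = " ".join(words[prefix_words:prefix_words + check_words])
|         out = ollama.generate(model="llama3.1:8b", prompt=prefix,
|                               options={"temperature": 0})["response"]
|         # If the model continues a held-out document verbatim,
|         # that document was very likely in the training set.
|         return target in out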
| talldayo wrote:
| The weights are a transformed, lossy and non-complete
| permutation of the training material. You _cannot_
| recover most of the dataset reliably, which is what stops
| it from being an outright replacement for the work it 's
| trained on.
|
| > does not mean that you are not distributing them.
|
| Except you literally aren't distributing them. It's like
| accusing me of pirating a movie because I sent a
| screenshot or a scene description to my friend.
|
| > It's scary to think that "training" is becoming a
| thinly veiled way to strip copyright of works.
|
| This is the way it's been for years. Google is given Fair
| Use for redistributing incomplete parts of copyrighted
| text materials verbatim, since their application is
| transformative:
| https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,....
|
| Or Corellium, who won their case to use copyrighted Apple
| code in novel and transformative ways:
| https://www.forbes.com/sites/thomasbrewster/2023/12/14/apple...
|
| Copyright has always been a limited power.
| jncfhnb wrote:
| People don't typically modify distributed binaries.
|
| People do typically modify model weights. They are the
| preferred form in which to modify the model.
|
| Saying "build" llama is just a nonsense comparison to
| traditional compiled software. "Building llama" is more
| akin to taking the raw weights as text and putting them
| into a nice pickle file. Or loading it into an inference
| engine.
|
| Demanding that you have everything needed to recreate the
| weights from scratch is like arguing an application cannot
| be open source unless it also includes the user testing
| history and design documents.
|
| And of course some idiots don't understand what a pickled
| weights file is and claim it's as useless as a distributed
| binary if you want to modify the program just because it is
| technically compiled; not understanding that the point of
| the pickled file is "convenience" and that it unpacks back
| to the original form. Like arguing open source software
| can't be distributed in zip files.
|
| > Say I have a farm of 100 A100's, could I reproduce the
| Llama model from scratch today?
|
| Say you have a piece of paper. Can you reproduce
| `print("hello world")` from scratch?
| vngzs wrote:
| Agreed. The Linux kernel source contains everything you need to
| produce Linux kernel binaries. The llama source does not
| contain what you need to produce llama models. Facebook is
| using sleight of hand to garner favor with open model weights.
|
| Open model weights are still commendable, but it's a far cry
| from open-source (or even _libre_) software!
| elromulous wrote:
| 100%. With this licensing model, meta gets to reap the benefits
| of open source (people contributing, social cachet), without
| any of the real detriment (exposing secret sauce).
| hbn wrote:
| Is that even something they keep on hand? Or would WANT to keep
| on hand? I figured they're basically sending a crawler to go
| nuts reading things and discard the data once they've trained
| on it.
|
| If that included, e.g. reading all of Github for code, I
| wouldn't expect them to host an entire separate read-only copy
| of Github because they trained on it and say "this is part of
| our open source model"
| jdminhbg wrote:
| > Why do people keep mislabeling this as Open Source? The whole
| point of calling something Open Source is that the "magic
| sauce" of how to build something is publicly available, so I
| could built it myself if I have the means. But without the
| training data publicly available, could I train Llama 3.1 if I
| had the means?
|
| I don't think not releasing the commit history of a project
| makes it not Open Source, this seems like that to me. What's
| important is you can download it, run it, modify it, and re-
| release it. Being able to see how the sausage was made would be
| interesting, but I don't think Meta have to show their training
| data any more than they are obligated to release their planning
| meeting notes for React development.
|
| Edit: I think the restrictions in the license itself are good
| cause for saying it shouldn't be called Open Source, fwiw.
| thenoblesunfish wrote:
| You don't need to have the commit history to see "how it
| works". ML that works well does so in huge part due to the
| training data used. The leading models today aren't
| distinguished by the way they're trained, but what they're
| trained on.
| jdminhbg wrote:
| I agree that you need training data to build AI from
| scratch, much like you need lots of really smart developers
| and a mailing list and servers and stuff to build the Linux
| kernel from scratch. But it's not like having the training
| data and training code will get you the same result, in the
| way something like open data in science is about
| replicating results.
| frabcus wrote:
| Reproducible builds of software binaries are a thing, but
| they aren't routinely done. Likewise training an AI is
| deterministic if you do it exactly the same way each time. And
| slight variances lead to models of similar capability.
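|
| A sketch of what "doing it the same way" typically means in
| PyTorch (in practice bitwise reproducibility also needs identical
| hardware, library versions, and data order, and some CUDA ops
| require an extra environment flag):
|     import random
|     import numpy as np
|     import torch
|
|     def make_deterministic(seed=0):
|         random.seed(seed)
|         np.random.seed(seed)
|         torch.manual_seed(seed)
|         # Prefer deterministic CUDA kernels; ops without a
|         # deterministic variant will raise an error.
|         torch.use_deterministic_algorithms(True)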
| tempfile wrote:
| For the freedom to change to be effective, a user must be
| given the software in a form they can modify. Can you tweak
| an LLM once it's built? (I genuinely don't know the answer)
| jdminhbg wrote:
| Yes, you can finetune Llama:
| https://llama.meta.com/docs/how-to-guides/fine-tuning/
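|
| For a concrete picture, a minimal LoRA fine-tuning sketch with
| the Hugging Face transformers + peft libraries (the model id and
| my_corpus.txt are placeholders, it assumes you've been granted
| access to the weights on the Hub, and for anything bigger than 8B
| you'd want quantization and multi-GPU tooling on top of this):
|     from datasets import load_dataset
|     from peft import LoraConfig, get_peft_model
|     from transformers import (AutoModelForCausalLM, AutoTokenizer,
|                               DataCollatorForLanguageModeling,
|                               Trainer, TrainingArguments)
|
|     model_id = "meta-llama/Meta-Llama-3.1-8B"  # placeholder repo name
|     tok = AutoTokenizer.from_pretrained(model_id)
|     tok.pad_token = tok.eos_token
|     model = AutoModelForCausalLM.from_pretrained(model_id)
|
|     # Attach small LoRA adapters instead of updating all weights.
|     model = get_peft_model(model, LoraConfig(
|         r=16, lora_alpha=32, task_type="CAUSAL_LM",
|         target_modules=["q_proj", "v_proj"]))
|
|     ds = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
|     ds = ds.map(lambda x: tok(x["text"], truncation=True, max_length=512),
|                 remove_columns=["text"])
|
|     Trainer(model=model,
|             args=TrainingArguments("llama31-lora",
|                                    per_device_train_batch_size=1,
|                                    num_train_epochs=1),
|             train_dataset=ds,
|             data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
|             ).train()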
| diggan wrote:
| > I don't think not releasing the commit history of a project
| makes it not Open Source,
|
| Right, I'm not talking about the commit history, but rather
| that anyone (with means) should be able to produce the final
| artifact themselves, if they want. For weights like this,
| that requires at least the training script + the training
| data. Without that, it's very misleading to call the project
| Open Source, when only the result of the training is
| released.
|
| > What's important is you can download it, run it, modify it,
| and re-release it
|
| But I literally cannot download the project, build it and run
| it myself? I can only use the binaries (weights) provided by
| Meta. No one can modify how the artifact is produced, only
| modify the already produced artifact.
|
| That's like saying that Slack is Open Source because if I
| want to, I could patch the binary with a hex editor and
| add/remove things as I see fit? No one believes Slack should
| be called Open Source for that.
| jdminhbg wrote:
| > Right, I'm not talking about the commit history, but
| rather that anyone (with means) should be able to produce
| the final artifact themselves, if they want. For weights
| like this, that requires at least the training script + the
| training data.
|
| You cannot produce the final artifact with the training
| script + data. Meta also cannot reproduce the current
| weights with the training script + data. You could produce
| some other set of weights that are just about as good, but
| it's not a deterministic process like compiling source
| code.
|
| > That's like saying that Slack is Open Source because if I
| want to, I could patch the binary with a hex editor and
| add/remove things as I see fit? No one believes Slack
| should be called Open Source for that.
|
| This analogy doesn't work because it's not like Meta can
| "patch" Llama any more than you can. They can only finetune
| it like everyone else, or produce an entirely different LLM
| by training from scratch like everyone else.
|
| The right to release your changes is another difference; if
| you patch Slack with a hex editor to do some useful thing,
| you're not allowed to release that changed Slack to others.
|
| If Slack lost their source code, went out of business, and
| released a decompiled version of the built product into the
| public domain, that would in some sense be "open source,"
| even if not as good as something like Linux. LLMs though do
| not have a source code-like representation that is easily
| and deterministically modifiable like that, no matter who
| the owner is or what the license is.
| unraveller wrote:
| Open-weights is not open-source, for sure, but I don't mind it
| being stated as an aspiration goal, the moment it is legally
| possible to publish a source without shooting themselves in the
| foot they should do it.
|
| They could release 50% of their best data but that would only
| stop them from attracting the best talent.
| JeremyNT wrote:
| > _Why do people keep mislabeling this as Open Source?_
|
| I guess this is a rhetorical question, but this is a press
| release from Meta itself. It's just a marketing ploy, of
| course.
| blcknight wrote:
| InstructLab and the Granite Models from IBM seem the closest to
| being open source. Certainly more than whatever FB is doing
| here.
|
| (Disclaimer: I work for an IBM subsidiary but not on any of
| these products)
| hubraumhugo wrote:
| The big winners of this: devs and AI startups
|
| - No more vendor lock-in
|
| - Instead of just wrapping proprietary API endpoints, developers
| can now integrate AI deeply into their products in a very cost-
| effective and performant way
|
| - Price race to the bottom with near-instant LLM responses at
| very low prices are on the horizon
|
| As a founder, it feels like a very exciting time to build a
| startup as your product automatically becomes better, cheaper,
| and more scalable with every major AI advancement. This leads to
| a powerful flywheel effect: https://www.kadoa.com/blog/ai-flywheel
| danielmarkbruce wrote:
| It creates the opposite of a flywheel effect for you. It
| creates a leapfrog effect.
| boringg wrote:
| AI might cannibalize a lot of first gen AI businesses.
| jstummbillig wrote:
| What Meta is doing is borderline market distortion. It's
| not that they have figured out some magic sauce they are
| happy to share. They are just deciding to burn brute force
| money that they made elsewhere and give their stuff away
| below cost, first of all because they can.
| anon373839 wrote:
| I know, and it's beautiful to see. Bad actors like
| "Open"AI tried to get in first and monopolize this tech
| with lawfare. But that game plan has been mooted by
| Meta's scorched-earth generosity.
| jstummbillig wrote:
| Meta has actually figured out where the moat is:
| ecosystem, tooling. As soon as "we" build it, they can
| still do whatever they want with the core/LLM, starting
| with Llama 4 or at any other point in the future.
|
| The best kind of open source: All the important
| ingredients to make it work (more and more data and
| money) are either not open source or in the hands of
| Meta. It's prohibitive by design.
|
| People seem happy to help build Metas empire once again
| in return for scraps.
| danielmarkbruce wrote:
| It's strange you are downvoted for this. It is a
| legitimate take on things (even if it is likely not
| accurate as far as intent is concerned).
| boringg wrote:
| To be fair, MSFT's investment of credits into OpenAI is
| also almost market distortion. All the investments done
| with credits posing as dollars have made the VC investment
| world very chaotic in the AI space. No real money is
| changing hands, and the revenue on the books of MSFT and
| AMAZON is low quality revenue. Those companies' AI moves
| are overvalued.
| boringg wrote:
| - Price race to the bottom with near-instant LLM responses at
| very low prices are on the horizon
|
| Maybe a big price war while the market majors fight it out for
| positioning, but they still need to make money off their
| investments, so someone is going to have to raise prices at some
| point and you'll be locked into their system if you build on it.
| Havoc wrote:
| >locked into their system
|
| There are going to be loads of providers for these open
| models. Openrouter already has 3 providers for the new 405B
| model within hours.
| boringg wrote:
| Maybe for the time being. I don't see how else they
| monetize the incredible amount they spent on the models
| without forcing people to lock into models or benefits or
| something else.
|
| It's not going to stay like this I can assure you that :).
| Havoc wrote:
| Not sure whether by that post you mean Openrouter
| serving the 405B or Meta producing more.
|
| Openrouter is a paid API so that can absolutely be
| sustainable.
|
| And Meta has multiple reasons for going the open route - some
| explained in their post, some less so (it harms their
| competitors).
|
| I reckon there will be a llama 4 and beyond
| tim333 wrote:
| Meta will make money like it has in the past by having
| data about users and advertising to them. Commoditizing
| AI helps them keep at that.
|
| See Joel on Software "Smart companies try to commoditize
| their products' complements"
| https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
| wavemode wrote:
| > they still need to make money off their investments
|
| Depends on how you define this. Most of the top companies
| don't care as much about making a profit off of AI inference
| itself, if the existence of the -feature- of AI inference
| drives more usage and/or sales of their other products
| (phones, computers, operating systems, etc.)
|
| That's why, for example, Google and Bing searches
| automatically perform LLM inference at no cost to the user.
| choppaface wrote:
| Also the opportunity to run on user compute and on private
| data. That supports a slate of business models that are
| incompatible with the mainframe approach.
|
| Including adtech models, which are predominantly cloud-based.
| drcode wrote:
| and Xi Jingping
| mav3ri3k wrote:
| I am not deep into LLMs, so I'll ask this. From my understanding,
| their last model was open source, but in a way that you could use
| it while the inner workings were "hidden"/not transparent.
|
| With the new model, I am seeing a lot about how open source they
| are and how it can be built upon. Is it now completely open
| source or similar to their last models?
| whimsicalism wrote:
| It's intrinsic to transformers that the inner workings are
| largely inscrutable. This is no different, but it does not mean
| they cannot be built upon.
|
| Gradient descent works on these models just like the prior
| ones.
| carimura wrote:
| Looks like you can already try out Llama-3.1-405b on Groq,
| although it's timing out. So. Hugged I guess.
| TechDebtDevin wrote:
| All the big providers should have it up by end of day. They
| just change their API configs (they're just reselling you AWS
| Bedrock).
| jamiedg wrote:
| 405B and the other Llama 3.1 models are working and available
| on Together AI. https://api.together.ai
| Havoc wrote:
| >they're just reselling you AWS Bedrock
|
| Meta announced they have 25 providers ready on day 1, so no
| it's not all AWS.
| mensetmanusman wrote:
| It's easy to support open source AI when the code is 1,000 lines
| and the execution costs $100,000,000 of electricity.
|
| Only the big players can afford to push go, and FB would love to
| see OpenAI's code so they can point it to their proprietary user
| data.
| bun_at_work wrote:
| Meta makes their money off advertising, which means they profit
| from attention.
|
| This means they need content that will grab attention, and
| creating open source models that allow anyone to create any
| content on their own becomes good for Meta. The users of the
| models can post it to their Instagram/FB/Threads account.
|
| Releasing an open model also releases Meta from the burden of
| having to police the content the model generates, once the open
| source community fine-tunes the models.
|
| Overall, this is a sound business move for Meta - the post
| doesn't really talk about the true benefit, instead moralizing
| about open source.
| jklinger410 wrote:
| This is a great point. Eventually, META will only allow LLAMA
| generated visual AI content on its platforms. They'll put a
| little key in the image that clears it with the platform.
|
| Then all other visual AI content will be banned. If that is
| where legislation is heading.
| natural219 wrote:
| AI moderators too would be an enormous boon if they could get
| that right.
| KaiserPro wrote:
| It would be good, but the cost per moderation is still really
| high for it to be practical.
| noiseinvacuum wrote:
| Creating content with AI will surely be helpful for social
| media to some extent, but I think it's not that important in the
| larger scheme of things; there's already a vast sea of content
| being created by humans, and differentiation is already in
| recommending the right content to the right people at the right
| time.
|
| More important are the products that Meta will be able to make
| if the industry standardizes on Llama. They would have a front
| seat, not just with access to the latest unreleased models but
| also in setting the direction of progress and what next-gen LLMs
| optimize for. If you're Twitter or Snap or TikTok or compete
| with Meta on the product, then good luck trying to keep up.
| apwell23 wrote:
| I am not sure I follow this.
|
| 1. Is there such a thing as 'attention grabbing AI content' ?
| Most AI content I see is the opposite of 'attention grabbing'.
| Kindle store is flooded with this garbage and none of it is
| particularly 'attention grabbing'.
|
| 2. Why would creation of such content, even if it was truly
| attention grabbing, benefit meta in particular ?
|
| 3. How would proliferation of AI content lead to more ad spend
| in the economy? Ad budgets won't increase because of AI
| content.
|
| To me this is a typical Zuckerberg play. Attach Meta's name to
| whatever is trendy at the moment, like the (now forgotten)
| metaverse, cryptocoins, and a bunch of other failed stuff that
| was trendy for a second. Meta is NOT a Gen AI company (or a
| metaverse company, or a crypto company), as he is scamming (more
| like colluding with) the market to believe. A mere distraction
| from slowing user growth on ALL of Meta's apps.
|
| ppl seem to have just forgotten this
| https://en.wikipedia.org/wiki/Diem_(digital_currency)
| bun_at_work wrote:
| Sure - there is plenty of attention grabbing AI content - it
| doesn't have to grab _your_ attention, and it won't work for
| everyone. I have seen people engaging with apps that redo a
| selfie to look like a famous character or put the person in a
| movie scene, for example.
|
| Every piece of content in any feed (good, bad, or otherwise)
| benefits the aggregator (Meta, YouTube, whatever), because
| someone will look at it. Not everything will go viral, but it
| doesn't matter. Scroll whatever on Twitter, YouTube Shorts,
| Reddit, etc. Meta has a massive presence in social media, so
| content being generated is shared there.
|
| More content of any type leads to more engagement on the
| platforms where it's being shared. Every Meta feed serves the
| viewer an ad (for which Meta is paid) every 3 or so posts
| (pieces of content). It doesn't matter if the user doesn't
| like 1/5 posts or whatever, the number of ads still goes up.
| apwell23 wrote:
| > it doesn't have to grab _your_ attention
|
| I am talking in general, not about me personally. No
| popular content on any website/platform is AI generated.
| Maybe you have examples that lead you to believe that it's
| possible on a mass scale.
|
| > look like a famous character or put the person in a movie
| scene
|
| What attention-grabbing movie used gen AI persons?
| visarga wrote:
| > Meta makes their money off advertising, which means they
| profit from attention. This means they need content that will
| grab attention
|
| That is why they hopped on the Attention is All You Need train
| resters wrote:
| This is really good news. Zuck sees the inevitability of it and
| the dystopian regulatory landscape and decided to go all in.
|
| This also has the important effect of neutralizing the critique
| of US Government AI regulation because it will democratize
| "frontier" models and make enforcement nearly impossible. Thank
| you, Zuck, this is an important and historic move.
|
| It also opens up the market to a lot more entry in the area of
| "ancillary services to support the effective use of frontier
| models" (including safety-oriented concerns), which should really
| be the larger market segment.
| passion__desire wrote:
| Probably, Yann LeCun is the Lord Varys here. He has Mark's ear
| and Mark believes in Yann's vision.
| war321 wrote:
| Unfortunately, there are a number of AI safety people that are
| still crowing about how AI models need to be locked down, with
| some of them loudly pivoting to talking about how open source
| models aid China.
|
| Plus there's still the spectre of SB-1047 hanging around.
| amelius wrote:
| > One of my [Mark Zuckerberg, ed.] formative experiences has been
| building our services constrained by what Apple will let us build
| on their platforms. Between the way they tax developers, the
| arbitrary rules they apply, and all the product innovations they
| block from shipping, it's clear that Meta and many other
| companies would be freed up to build much better services for
| people if we could build the best versions of our products and
| competitors were not able to constrain what we could build.
|
| This is hard to disagree with.
| glhaynes wrote:
| I think it's very easy to disagree with!
|
| If Zuckerberg had his way, mobile device OSes would let Meta
| ingest microphone and GPS data 24/7 (just like much of the
| general public already _thinks_ they do because of the
| effectiveness of the other sorts of tracking they are able to
| do).
|
| There are certainly legit innovations that haven't shipped
| because gatekeepers don't allow them. But there've been lots of
| harmful "innovations" blocked, too.
| throwaway1194 wrote:
| I strongly suspect that what AI will end up doing is push
| companies and organizations towards open source, they will
| eventually realize that code is already being shared via AI
| channels, so why not do it legally with open source?
| talldayo wrote:
| > they will eventually realize that code is already being
| shared via AI channels
|
| Private repos are not being reproduced by any modern AI. Their
| source code is safe, although AI arguably lowers the bar to
| compete with them.
| whimsicalism wrote:
| OpenAI needs to release a new model setting a new capabilities
| highpoint. This is existential for them now.
| ChrisArchitect wrote:
| Related:
|
| _Llama 3.1 Official Launch_
|
| https://news.ycombinator.com/item?id=41046540
| baceto123 wrote:
| The value of AI is in the information used to train the models,
| not the hardware.
| m3kw9 wrote:
| The truth is we need both closed and open source, they both have
| their discovery path and advantages and disadvantages, there
| shouldn't be a system where one is eliminated over the other.
| They also seem to be driving each other forward via competition.
| typpo wrote:
| Thanks to Meta for their work on safety, particularly Llama
| Guard. Llama Guard 3 adds defamation, elections, and code
| interpreter abuse as detection categories.
|
| Having run many red teams recently as I build out promptfoo's red
| teaming featureset [0], I've noticed the Llama models punch above
| their weight in terms of accuracy when it comes to safety. People
| hate excessive guardrails and Llama seems to thread the needle.
|
| Very bullish on open source.
|
| [0] https://www.promptfoo.dev/docs/red-team/
| swyx wrote:
| is there a #2 to llamaguard? Meta seems curiously alone in
| doing this kind of, let's call it, "practical safety" work
| enriquto wrote:
| It's alarming that he refers to llama as if it was open source.
|
| The definition of free software (and open source, for that
| matter) is well-established. The same definition applies to all
| programs, whether they are "AI" or not. In any case, if a program
| was built by training against a dataset, the whole dataset is
| part of the source code.
|
| Llama is distributed in binary form, and it was built based on a
| secret dataset. Referring to it as "open source" is not
| ignorance, it's malice.
| Nesco wrote:
| The training data most likely contains insane amounts of
| copyrighted material. That's why virtually none of the "open
| models" come with their training data.
| enriquto wrote:
| > The training data contains most likely insane amounts of
| copyrighted material.
|
| If that is the case then the weights must inherit all these
| copyrights. It has been shown (at least in image processing)
| that you can extract many training images from the weights,
| almost verbatim. Hiding the training data does not solve this
| issue.
|
| But regardless of copyright issues, people here are
| complaining about the malicious use of the term "open
| source", to signify a completely different thing (more like
| "open api").
| tempfile wrote:
| > If that is the case then the weights must inherit all
| these copyrights.
|
| Not if it's a fair use (which is obviously the defence
| they're hoping for)
| anon373839 wrote:
| Also, fair use is just one defense to a copyright
| infringement claim. The plaintiff first has to prove the
| elements of infringement; if they can't do this, no
| defense is needed.
| jdminhbg wrote:
| > In any case, if a program was built by training against a
| dataset, the whole dataset is part of the source code.
|
| I'm not sure why I keep seeing this. What is the equivalent of
| the training data for something like the Linux kernel?
| enriquto wrote:
| > What is the equivalent of the training data for something
| like the Linux kernel?
|
| It's the source code.
|
| For the linux kernel:
| compile(sourcecode) = binary
|
| For llama: train(data) = weights
| jdminhbg wrote:
| That analogy doesn't work. `train` is not a deterministic
| process. Meta has all of the training data and all of the
| supporting source code and they still won't get the same
| `weights` if they re-run the process.
|
| The weights are the result of the development process, like
| the source code of a program is the result of a development
| process.
| indus wrote:
| Is there an argument against Open Source AI?
|
| Not the usual nation-state rhetoric, but something that justifies
| that closed source leads to better user-experience and fewer
| security and privacy issues.
|
| An ecosystem that benefits vendors, customers, and the makers of
| close source?
|
| Are there historical analogies other than Microsoft Windows or
| Apple iPhone / iOS?
| kjkjadksj wrote:
| Let's take the iPhone. Secured by the industry's best security
| teams, I am sure. Closed source, yet teenagers in Eastern Europe
| have cracked into it dozens of times making jailbreaks. Every
| law enforcement agency can crack into it. Closed source is not
| a security moat, but a trade protection moat.
| finolex1 wrote:
| Replace "Open Source AI" in "is there an argument against xxx"
| with bioweapons or nuclear missiles. We are obviously not at
| that stage yet, but it could be a real, non-trivial concern in
| the near future.
| GaggiX wrote:
| Llama 3.1 405B is on par with GPT-4o and Claude 3.5 Sonnet, the
| 70B model is better than GPT 3.5 turbo, incredible.
| itissid wrote:
| How are smaller models distilled from large models, I know of
| LoRA, quantization like technique; but does distilling also mean
| generating new datasets for conversing with smaller models
| entirely from the big models for many simpler tasks?
| tintor wrote:
| Smaller models can be trained to match log probs of the larger
| model. The larger model can be used to generate synthetic data
| for
| the smaller model.
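| For concreteness, a minimal sketch of the logit-matching idea
| (assuming PyTorch; the function and its arguments are
| illustrative, not any particular library's API):
|
|     import torch.nn.functional as F
|
|     def distillation_loss(student_logits, teacher_logits, T=2.0):
|         # Soften both distributions with temperature T, then
|         # push the student's log-probs toward the teacher's.
|         s = F.log_softmax(student_logits / T, dim=-1)
|         t = F.softmax(teacher_logits / T, dim=-1)
|         return F.kl_div(s, t, reduction="batchmean") * T * T
|
| In practice this term is usually mixed with the ordinary
| next-token cross-entropy loss on real or teacher-generated data.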
| popcorncowboy wrote:
| > Developers can run inference on Llama 3.1 405B on their own
| infra at roughly 50% the cost of using closed models like GPT-4o
|
| Does anyone have details on exactly what this means or where/how
| this metric gets derived?
| rohansood15 wrote:
| I am guessing these are prices on services like AWS Bedrock
| (their post is down right now).
| PlattypusRex wrote:
| a big chunk of that is probably the fact that you don't need to
| pay someone who is trying to make a profit by running inference
| off-premises.
| wesleyyue wrote:
| Just added Llama 3.1 405B/70B/8B to https://double.bot (VSCode
| coding assistant) if anyone would like to try it.
|
| ---
|
| Some observations:
|
| * The model is much better at trajectory correcting and putting
| out a chain of tangential thoughts than other frontier models
| like Sonnet or GPT-4o. Usually, these models are limited to
| outputting "one thought", no matter how verbose that thought
| might be.
|
| * I remember in Dec of 2022 telling famous "tier 1" VCs that
| frontier models would eventually be like databases: extremely
| hard to build, but the best ones will eventually be open and win
| as it's too important to too many large players. I remember the
| confidence in their ridicule at the time but it seems
| increasingly more likely that this will be true.
| didip wrote:
| Is it really open source though? You can't run these models for
| your company. The license is extremely restrictive and there's NO
| SOURCE CODE.
| jamiedg wrote:
| Looks like it's easy to test out these models now on Together AI
| - https://api.together.ai
| KingOfCoders wrote:
| Open Source AI needs to include training data.
| fsndz wrote:
| Small language models is the path forward
| https://medium.com/thoughts-on-machine-learning/small-langua...
| pja wrote:
| "Commoditise your complement" in action!
| manishrana wrote:
| really useful insights
| bufferoverflow wrote:
| Hard disagree. So far every big important model is closed-source.
| Grok is sort-of the only exception, and it's not even that big
| compared to the (already old) GPT-4.
|
| I don't see open source being able to compete with the cutting-
| edge proprietary models. There's just not enough money. GPT-5
| will take an estimated $1.2 billion to train. MS and OpenAI are
| already talking about building a $100 billion training data
| center.
|
| How can you compete with that if your plan is to give away the
| training result for free?
| sohamgovande wrote:
| Where is the $1.2b number from?
| bufferoverflow wrote:
| There are a few numbers floating around, $1.2B being the
| lowest estimate.
|
| HSBC estimates the training cost for GPT-5 between $1.7B and
| $2.5B.
|
| Vlad Bastion Research estimates $1.25B - 2.25B.
|
| Some people on HN estimate $10B:
|
| https://news.ycombinator.com/item?id=39860293
| smusamashah wrote:
| Meta's article with more details on the new LLAMA 3.1
| https://ai.meta.com/blog/meta-llama-3-1/
| 6gvONxR4sf7o wrote:
| > Third, a key difference between Meta and closed model providers
| is that selling access to AI models isn't our business model.
| That means openly releasing Llama doesn't undercut our revenue,
| sustainability, or ability to invest in research like it does for
| closed providers. (This is one reason several closed providers
| consistently lobby governments against open source.)
|
| The whole thing is interesting, but this part strikes me as
| potentially anticompetitive reasoning. I wonder what the lines
| are that they have to avoid crossing here?
| phkahler wrote:
| >> ...but this part strikes me as potentially anticompetitive
| reasoning.
|
| "Commoditize your complements" is an accepted strategy. And
| while pricing below cost to harm competitors is often illegal,
| the reality is that the marginal cost of software is zero.
| Palomides wrote:
| spending a very quantifiable large amount of money to release
| something your nominal competitors charge for without having
| your own direct business case for it seems a little much
| phkahler wrote:
| Companies spend very large amounts of money on all sorts of
| things that never even get released. Nothing wrong with
| releasing something for free that no longer costs you
| anything. Who knows why they developed it in the first
| place, it makes no difference.
| frabjoused wrote:
| Who knew FB would hold OpenAI's original ideals, and OpenAI now
| holds early FB ideals/integrity.
| boringg wrote:
| FB needed to differentiate drastically. FB is at its best
| creating large data infra.
| krmboya wrote:
| Mark Zuckerberg was attacked by the media when it suited their
| tech billionaire villain narrative. Now there's Elon Musk so
| Zuckerberg gets to be on the good side again
| jmward01 wrote:
| I never thought I would say this but thanks Meta.
|
| *I reserve the right to remove this praise if they abuse this
| open source model position in the future.
| frabcus wrote:
| If it was actually open source with data and the data curation
| code released, they wouldn't be able to abuse it the same way.
| It is open weights, closed training data.
| gooob wrote:
| why do they keep training on publicly available online data, god
| dammit? what the fuck. don't they want to make a good LLM? train
| on the classics, on the essentials reference manuals for
| different technologies, on history books, medical encyclopedias,
| journal notes from the top surgeons and engineers, scientific
| papers of the experiments that back up our fundamental theories.
| we want quality information, not recent information. we already
| have plenty of recent information.
| mmmore wrote:
| I appreciate that Mark Zuckerberg soberly and neutrally talked
| about some of the risks from advances in AI technology. I agree
| with others in this thread that this is more accurately called
| "public weights" instead of open source, and in that vein I
| noticed some issues in the article.
|
| > This is one reason several closed providers consistently lobby
| governments against open source.
|
| Is this substantially true? I've noticed a tendency of those who
| support the general arguments in this post to conflate the
| beliefs of people concerned about AI existential risk, some of
| whom work at the leading AI labs, with the position of the labs
| themselves. In most cases I've seen, the AI labs (especially
| OpenAI) have lobbied against any additional regulation on AI,
| including with SB1047[1] and the EU AI Act[2]. Can anyone provide
| an example of this in the context of actual legislation?
|
| > On this front, open source should be significantly safer since
| the systems are more transparent and can be widely scrutinized.
| Historically, open source software has been more secure for this
| reason.
|
| This may be true if we could actually understand what was
| happening in neural networks, or train them to consistently avoid
| unwanted behaviors. As things are, the public weights are simply
| inscrutable black boxes, and the existence of jailbreaks and
| other strange LLM behaviors show that we don't understand how our
| training processes create models' emergent behaviors. The
| capabilities of these models and their influence are growing
| faster than our understanding of them and our ability to steer them
| to behave precisely how we want, and that will only get harder as
| the models get more powerful.
|
| > At this point, the balance of power will be critical to AI
| safety. I think it will be better to live in a world where AI is
| widely deployed so that larger actors can check the power of
| smaller bad actors.
|
| This paragraph ignores the concept of offense/defense balance.
| It's much easier to cause a pandemic than to stop one, and
| cyberattacks, while not as bad as pandemics, seem to also favor
| the attacker (this one is contingent on how much AI tools can
| improve our ability to write secure code). At the extreme, it
| would clearly be bad if everyone had access to an anti-matter
| weapon large enough to destroy the Earth; at some level of
| capability, we have to limit the commands an advanced AI will
| follow from an arbitrary person.
|
| That said, I'm unsure if limiting public weights at this time
| would be good regulation. They do seem to have some benefits in
| increasing research around alignment/interpretability, and I
| don't know if I buy the argument that public weights are
| significantly more dangerous from a "misaligned ASI" perspective
| than many competing closed companies. I also don't buy the view
| of some in the leading labs that we'll likely have "human level"
| systems by the end of the decade; it seems possible but unlikely.
| But I worry that Zuckerberg's vision of the future does not
| adequately guard against downside risks, and is not compatible
| with the way the technology will actually develop.
|
| [1] https://thebulletin.org/2024/06/california-ai-bill-
| becomes-a...
|
| [2] https://time.com/6288245/openai-eu-lobbying-ai-act/
| btbuildem wrote:
| The "open source" part sounds nice, though we all know there's
| nothing particularly open about the models (or their weights).
| The barriers to entry remain the same - huge upfront investments
| to train your own, and steep ongoing costs for "inference".
|
| Is the vision here to treat LLM-based AI as a "public good", akin
| to a utility provider in a civilized country (taxpayer funded,
| govt maintained, non-for-profit)?
|
| I think we could arguably call this "open source" when all the
| infra blueprints, scripts and configs are freely available for
| anyone to try and duplicate the state-of-the-art (resource and
| grokking requirements notwithstanding)
| brrrrrm wrote:
| check out the paper. it's pretty comprehensive
| https://ai.meta.com/research/publications/the-llama-3-herd-o...
| openrisk wrote:
| Open source "AI" is a proxy for democratising and making (much)
| more widely useful the goodies of high performance computing
| (HPC).
|
| The HPC domain (data and compute intensive applications that
| typically need vector, parallel or other such architectures) has
| been around for the longest time, but has been confined to
| academic / government tasks.
|
| LLMs, with their famous "matrix multiply" at their very core, are
| basically demolishing an ossified frontier where a few commercial
| entities (Intel, Microsoft, Apple, Google, Samsung etc) have
| defined for decades what computing looks like _for most people_.
|
| Assuming that the genie is out of the bottle, the question is:
| what is the shape of end-user devices that are optimally designed
| to use compute intensive open source algorithms? The "AI PC" is
| already a marketing gimmick, but could it be that Linux desktops
| and smartphones will suddenly be "AI natives"?
|
| For sure it's a transformational period and the landscape T+10 yrs
| could be drastically different...
| frabcus wrote:
| Unfortunately it is barely more open source than Windows. Llama
| 3 weights are binary code and while the license is pretty good
| it isn't open source.
| LarsDu88 wrote:
| Obligatory reminder of why tech companies subsidize open source
| projects: https://www.joelonsoftware.com/2002/06/12/strategy-
| letter-v/
| avivo wrote:
| The FTC also recently put out a statement that is fairly pro-open
| source: https://www.ftc.gov/policy/advocacy-research/tech-at-
| ftc/202...
|
| I think it's interesting to think about this question of open
| source, benefits, risk, and even competition, without all of the
| baggage that Meta brings.
|
| I agree with the FTC, that the benefits of open-weight models are
| significant for competition. _The challenge is in distinguishing
| between good competition and bad competition._
|
| Some kind of competition can harm consumers and critical public
| goods, including democracy itself. For example, competing for
| people's scarce attention or for their food buying, with
| increasingly optimized and addictive innovations. Or competition
| to build the most powerful biological weapons.
|
| Other kinds of competition can massively accelerate valuable
| innovation. The FTC must navigate a tricky balance here --
| leaning into competition that serves consumers and the broader
| public, while being careful about what kind of competition it is
| accelerating that could cause significant risk and harm.
|
| It's also obviously not just "big tech" that cares about the
| risks behind open-weight foundation models. Many people have
| written about these risks even before it became a subject of
| major tech investment. (In other words, A16Z's framing is often
| rather misleading.) There are many non-big tech actors who are
| very concerned about current and potential negative impacts of
| open-weight foundation models.
|
| One approach which can provide the best of both worlds, is for
| cases where there are significant potential risks, to ensure that
| there is at least some period of time where weights are not
| provided openly, in order to learn a bit about the potential
| implications of new models.
|
| Longer-term, there may be a line where models are too risky to
| share openly, and it may be unclear what that line is. In that
| case, it's important that we have governance systems for such
| decisions that are not just profit-driven, and which can help us
| continue to get the best of all worlds. (Plug: my organization,
| the AI & Democracy Foundation; https://ai-dem.org/; is working to
| develop such systems and hiring.)
| whimsicalism wrote:
| making food that people want to buy is good actually
|
| i am not down with this concept of the chattering class
| deciding what are good markets and what are bad, unless it is
| due to broad-based and obvious moral judgements.
| endorphine wrote:
| Except for 90% of the food on the supermarket shelves out
| there, which is packed with sugar and preservatives.
| tpurves wrote:
| 405 sounds like a lot of B's! What do you need to practically run
| or host that yourself?
| sumedh wrote:
| You cannot run it locally
| tpurves wrote:
| 405 is a lot of B's. What does it take to run or host that?
| danielmarkbruce wrote:
| quantize to 0 bit. Run on a potato.
|
| Jokes aside: ~405B params x 2 bytes each (FP16), so say 810 GB,
| maybe 1000 GB or so required in reality; need maybe 2 AWS p5
| instances?
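| As a rough back-of-the-envelope check (weights only, ignoring KV
| cache and activations; a sketch, not a sizing guide):
|
|     def weight_gb(params_billions, bytes_per_param):
|         # params_billions * 1e9 params * bytes / 1e9 bytes-per-GB
|         return params_billions * bytes_per_param
|
|     for fmt, bpp in [("FP16", 2), ("INT8", 1), ("4-bit", 0.5)]:
|         print(fmt, "~", weight_gb(405, bpp), "GB")
|     # FP16 ~ 810 GB, INT8 ~ 405 GB, 4-bit ~ 202.5 GB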
| dang wrote:
| Related ongoing thread:
|
| _Llama 3.1_ - https://news.ycombinator.com/item?id=41046540 -
| July 2024 (114 comments)
| littlestymaar wrote:
| I love how Zuck decided to play a new game called "commoditize
| some other billionaire's business to piss him off', I can't wait
| until this becomes a trend and we get plenty of open source cool
| stuff.
|
| If he really wants to replicate Linux's success against
| proprietary Unices, he needs to release Llama with some kind of
| GPL equivalent, that forces everyone to play the open source
| game.
| Dwedit wrote:
| Without the raw data that trained the model, how is it open
| source?
| suyash wrote:
| Open source is a welcome step but what we really need is complete
| decentralisation so people can run their own private AI Models
| that keep all the data private to them. We need this to happen
| locally on laptops, mobile phones, smart devices etc. Waiting for
| when that will become ubiquitous.
| frabcus wrote:
| It is open weights not open source. If you can't train it and
| don't know the training data and can't use it to train your own
| models, it is a closed model aa a whole. Even if you have the
| binary weights.
| zoogeny wrote:
| Totally tangential thought, probably doomed to be lost in the
| flood of comments on this very interesting announcement.
|
| I was thinking today about Musk, Zuckerberg and Altman. Each
| claims that the next version of their big LLMs will be the best.
|
| For some reason it reminded me of one apocryphal cause of WW1,
| which was that the kings of Europe were locked in a kind of ego
| driven contest. It made me think about the Nation State as a
| technology. In some sense, the kings were employing the new
| technology which was clearly going to be the basis for the future
| political order. And they were pitting their own implementation
| of this new technology against the other kings.
|
| I feel we are seeing a similar clash of kings playing out. The
| claims that this is all just business or some larger claim about
| the good of humanity seem secondary to the ego stakes of the
| major players. And when it was about who built the biggest
| rocket, it felt less dangerous.
|
| It breaks my heart just a little bit. I feel sympathy in some
| sense for the AIs we will create, especially if they do reach the
| level of AGI. As another tortured analogy, it is like a bunch of
| competitive parents forcing their children into adversarial
| relationships to satisfy the parent's ego.
| light_triad wrote:
| They are positioning themselves as champions of AI open source
| mostly because they were blindsided by OpenAI, are not in the
| infra game, and want to commoditize their complements as much as
| possible.
|
| This is not altruism although it's still great for devs and
| startups. All of FB's GPU investment is primarily for new AI
| products: "friends", recommendations, and selling ads.
|
| https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
| baby wrote:
| Meta does a good thing
|
| HN spends a day figuring out how it's actually bad
| shnock wrote:
| It's not actually bad, OP's point is that it is not motivated
| by altruism. An action can be beneficial to the people
| without that effect being the incentive
| j_maffe wrote:
| Of course, it's not altruism; it's a publicly traded
| corporation. No one should ever believe in any such claims
| by these organizations. Non-altruistic organizations can
| still make positive-impact actions when they align with
| their goals.
| satvikpendem wrote:
| No one said it was bad. It's just self interested (as
| companies generally are) and are using that to have a PR spin
| on the topic. But again, this is what all companies do and
| nothing about it is bad per se.
| MrScruff wrote:
| Nothing they're doing is bad, and sometimes we benefit when
| large companies interests align with our own. All the spiel
| about believing in open systems because of being prevented
| from making their best products by Apple is a bit much
| considering we're talking about Facebook which is hardly an
| 'open platform', and the main thing Apple blocked them on
| was profiling their users to target ads.
| m3kw9 wrote:
| So you think FB did this with zero benefit to themselves?
| They did open source so people could improve their models and
| eventually have a paid tier later either from hosting
| services or other strategies
| WithinReason wrote:
| The linked article already spelled out the benefits
| sensanaty wrote:
| By virtue of it being Meta, it's automatically bad.
|
| If we lived in a sensible world we'd have nuked Meta into a
| trillion tiny little pieces some time around the Cambridge
| Analytica bullshit.
| war321 wrote:
| They've been working on AI for a good bit now. Open source
| especially is something they've championed since the mid 2010s
| at least with things like PyTorch, GraphQL, and React. It's not
| something they've suddenly pivoted to since ChatGPT came in
| 2022.
| kertoip_1 wrote:
| They are giving it "for free" because:
|
| * they need LLMs that they can control for features on their
| platforms (Fb/Instagram, but I can see many use cases on VR
| too)
|
| * they cannot sell it. They have no cloud services to offer.
|
| So they would spend this money anyway, but to compensate for
| some of the losses they just decided to use it to fix their PR
| by keeping developers content.
| sterlind wrote:
| They also reap the benefits of AI researchers across the
| world using Llama as a base. All their research is
| immediately applicable to their models. It's also likely a
| strategic decision to reduce the moat OpenAI is building
| around itself.
|
| I also think LeCun opposes OpenAI's gatekeeping at a
| philosophical/political level. He's using his position to
| strengthen open-source AI. Sure, there's strategic business
| considerations, but I wouldn't rule out principled
| motivations too.
| TaylorAlexander wrote:
| Yes LeCun has said he thinks AI should be open like
| journalism should be - that openness is inherently valuable
| in such things.
|
| Add to the list of benefits to Meta that it keeps LeCun
| happy.
| sebastiennight wrote:
| I think people massively underestimate how much time/attention
| span (and ad revenue) will be up for grabs once a platform
| really nails the "AI friend" concept. And it makes sense for
| Meta to position themselves for it.
| zmmmmm wrote:
| yes ... I remember when online dating was absolutely cringe /
| weird thing to do. Ten years later and it's the primary way a
| whole generation seeks a partner.
|
| It seems incredibly weird today to have an imaginary
| friend that you treat as a genuine relationship, but I
| genuinely expect this will happen and become a commonplace
| thing within the next two decades.
| Havoc wrote:
| > they were blindsided by OpenAI
|
| Given the mountain of GPUs they bought at precisely the right
| moment I don't think that's entirely accurate
| sumedh wrote:
| > Given the mountain of GPUs they bought at precisely the
| right moment I don't think that's entirely accurate
|
| If I remember correctly, FB didnt buy those GPUs because of
| Open AI, they were going to buy it anyway but Mark said
| whatever we are buying let's double it.
| Havoc wrote:
| Yeah, still not entirely clear what exactly they're doing
| with all of it...but they certainly saw the GPU supply
| crunch earlier than the rest
| brigadier132 wrote:
| Intentions are overrated. Given how many people with good
| intentions fuck up everything, I'd rather have actual results,
| even if the intention is self-serving.
| istjohn wrote:
| AI is not a "complement" of a social network in the way Spolsky
| defines the term.
|
| > A complement is a product that you usually buy together with
| another product. Gas and cars are complements. Computer
| hardware is a classic complement of computer operating systems.
| And babysitters are a complement of dinner at fine restaurants.
| In a small town, when the local five star restaurant has a two-
| for-one Valentine's day special, the local babysitters double
| their rates. (Actually, the nine-year-olds get roped into early
| service.)
|
| > All else being equal, demand for a product increases when the
| prices of its complements decrease.
|
| Smartphones are a complement of Instagram. VR headsets are a
| complement of the metaverse. AI could be a component of a
| social network, but it's not a complement.
| anthomtb wrote:
| > My framework for understanding safety is that we need to
| protect against two categories of harm: unintentional and
| intentional. Unintentional harm is when an AI system may cause
| harm even when it was not the intent of those running it to do
| so. For example, modern AI models may inadvertently give bad
| health advice. Or, in more futuristic scenarios, some worry that
| models may unintentionally self-replicate or hyper-optimize goals
| to the detriment of humanity. Intentional harm is when a bad
| actor uses an AI model with the goal of causing harm.
|
| Okay then Mark. Replace "modern AI models" with "social media"
| and repeat this statement with a straight face.
| j_m_b wrote:
| > We need to protect our data.
|
| This is a very important concern in Health Care because of HIPAA
| compliance. You can't just send your data over the wire to
| someone's proprietary API. You would at least need to de-identify
| your data. This can be a tricky task, especially with
| unstructured text.
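| As a toy illustration of the kind of pre-processing step this
| implies (a naive regex pass; real de-identification of clinical
| text requires far more than this, and the patterns below are
| only examples):
|
|     import re
|
|     PATTERNS = {
|         # illustrative identifier patterns, not a complete set
|         "SSN": r"\b\d{3}-\d{2}-\d{4}\b",
|         "PHONE": r"\b\d{3}[-.]\d{3}[-.]\d{4}\b",
|         "MRN": r"\bMRN[:#]?\s*\d+\b",
|     }
|
|     def redact(text):
|         # Strip obvious identifiers before the text ever leaves
|         # your own infrastructure.
|         for label, pattern in PATTERNS.items():
|             text = re.sub(pattern, f"[{label}]", text,
|                           flags=re.IGNORECASE)
|         return text
|
|     print(redact("Pt MRN: 445212, call 555-867-5309 re: labs"))
|     # -> "Pt [MRN], call [PHONE] re: labs"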
| xpe wrote:
| Zuck needs to get real. They are Open Weights not Open Source.
| Sparkyte wrote:
| The real path forward is recognizing what AI is good at and what
| it is bad at. Focus on making what it is good at even better and
| faster. Open AI will definitely give us that option but it isn't
| a miracle worker.
|
| My impression is that AI if done correctly will be the new way to
| build APIs with large data sets and information. It can't write
| code unless you want to dump billions of dollars into a solution
| with millions of dollars of operational costs. As it stands it
| loses context too quickly to do advanced human tasks. BUT this is
| where it is great at assembling data and information. You know
| what is great at assembling data and information? APIs.
|
| Think of it this way: if we can make it faster and it trains on
| a data lake for a company, it could be used to return
| information faster than a nested micro-service architecture that
| is just a spiderweb of dependencies.
|
| Because AI loses context simple API requests could actually be
| more efficient.
| Bluescreenbuddy wrote:
| >This is how we've managed security on our social networks - our
| more robust AI systems identify and stop threats from less
| sophisticated actors who often use smaller scale AI systems.
|
| So about all the bots and sock puppets on social media..
| pjkundert wrote:
| Deployment of PKI-signed distributed software systems to use
| community-provisioned compute, bandwidth and storage at scale is,
| now quite literally, the future.
|
| We mostly don't all want or need the hardware to run these AIs
| ourselves, all the time. But, when we do, we need lots of it for
| a little while.
|
| This is what Holochain was born to do. We can rent massive
| capacity when we need it, or earn money renting ours when we
| don't.
|
| All running cryptographically trusted software at Internet scale,
| without the knowledge or authorization of commercial or
| government "do-gooders".
|
| Exciting times!
| ayakang31415 wrote:
| Massive props to AI teams at Meta that released this model open
| source
| ceva wrote:
| They have earned so much money on all of their users; this is
| the least they can do to give back to the community, if this can
| be considered that ;)
| animanoir wrote:
| "Says the Meta Inc".
| seydor wrote:
| That assumes LLMs are the path to AI, which is increasingly
| becoming an unpopular opinion
| tmsh wrote:
| Software 2.0 is about open licensing.
|
| I.e., the more important thing - the more "free" thing - is the
| licensing now.
|
| E.g., I play around with different image diffusion models like
| Stable Diffusion and specific fine-tuned variations for
| ControlNet or LoRA that I plug into ComfyUI.
|
| But I can't use it at work because of the licensing. I have to
| use InvokeAI instead of ComfyUI if I want to be careful, and use
| only very specific image diffusion models without the latest and
| greatest fine-tuning. As others have said - the weights
| themselves are rather inscrutable. So we're building on more
| abstract shapes now.
|
| But the key open thing is making sure (1) the tools to modify the
| weights are open and permissive (ComfyUI, related scripts or
| parts of both the training and deployment) and (2) the underlying
| weights of the base models and the tools to recreate them have
| MIT or other generous licensing. As well as the fine-tuned
| variants for specific tasks.
|
| It's not going to be the naive construction in the future where
| you take a base model and as company A you produce company A's
| fine tuned model and you're done.
|
| It's going to be a tree of fine-tuned models as a node-based
| editor like ComfyUI already shows and that whole tree has to be
| open if we're to keep the same hacker spirit where anyone can
| tinker with it and also at some point make money off of it. Or go
| free software the whole way (i.e., LGPL or equivalent the whole
| tree of tools).
|
| In that sense unfortunately Llama has a ways to go to be truly
| open: https://news.ycombinator.com/item?id=36816395
| Palmik wrote:
| In the LLM world there are many open source solutions for fine-
| tuning, maybe the best one being from Meta:
| https://github.com/pytorch/torchtune
|
| In terms of inference and interface (since you mentioned comfy)
| there are many truly open source options such as vLLM (though
| there isn't a single really performant open source solution for
| inference yet).
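| For example, a minimal vLLM sketch for running one of the open-
| weights models locally (the model id and sampling settings are
| placeholders):
|
|     from vllm import LLM, SamplingParams
|
|     # placeholder model id; swap in whichever checkpoint you use
|     llm = LLM(model="meta-llama/Meta-Llama-3.1-8B-Instruct")
|     params = SamplingParams(temperature=0.7, max_tokens=256)
|     out = llm.generate(["Summarize open weights vs open source"],
|                        params)
|     print(out[0].outputs[0].text)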
| jameson wrote:
| It's hard to say Llama is an "open source" when their license
| states Meta has full control under certain circumstances
|
| https://raw.githubusercontent.com/meta-llama/llama-models/ma...
|
| > 2. Additional Commercial Terms. If, on the Llama 3.1 version
| release date, the monthly active users of the products or
| services made available by or for Licensee, or Licensee's
| affiliates, is greater than 700 million monthly active users in
| the preceding calendar month, you must request a license from
| Meta, which Meta may grant to you in its sole discretion, and you
| are not authorized to exercise any of the rights under this
| Agreement unless or until Meta otherwise expressly grants you
| such rights.
| __loam wrote:
| It should be transparently clear that this move was taken by
| Meta to drive their competitors out of business in a capital
| intensive space.
| apwell23 wrote:
| Not sure how it drives competitors out of business. OpenAI is
| losing money on queries, not on model creation. This open
| source model has no impact on their business model of
| charging users money to run queries.
|
| on a side note OpenAI is losing users on its own. It doesn't
| need meta to put it out of business.
| systemvoltage wrote:
| Tbh, it's incredibly generous.
| nailer wrote:
| Llama isn't open source. The license is at
| https://llama.meta.com/llama3/license/ and includes various
| restrictions on use, which means it falls outside the rules
| created by the https://opensource.org/osd
| war321 wrote:
| Even if it's just open weights and not "true" open source, I'll
| still give Meta the appreciation of being one of the few big AI
| companies actually committed to open models. In an ecosystem
| where groups like Anthropic and OpenAI keep hemming and hawing
| about safety and the necessity of closed AI systems "for our
| sake", they stand out among the rest.
| meowtimemania wrote:
| Why would openai/anthropic's approach be more safe? Are people
| able to remove all the guard rails on the llama models?
| alfalfasprout wrote:
| They're not safer. The claim is that OpenAI will enforce
| guard rails and take steps to ensure model outputs and
| prompts are responsible... but only a fool would take them at
| their word.
| seoulmetro wrote:
| Yeah.. and Facebook said they would enforce censorship on
| their platforms to ensure content safety.. that didn't turn
| out so well. Now it just censors anything remotely
| controversial, such as World War 2 historical facts or even
| just slightly offensive wording.
| Spivak wrote:
| You're really just arguing about the tuning. I get that
| it's annoying as a user but as a moderator going into it
| with the mentality that any post is expendable and
| bringing down the banhammer on everything near the line
| keeps things civil. HN does that too with the no flame-
| bait rule.
| koolala wrote:
| Censorship isn't moderation.
| seoulmetro wrote:
| HN moderation is quite poor and very subjective. The
| guidelines are not the site rules, the rules are made up
| on the spot.
|
| HN censors too. Facebook just does it automatically on a
| huge scale with no reasoning behind each censor.
|
| Censorship is just tuning people or things you don't want
| out. Censorship of your own content as a user is
| extremely annoying and Facebook's censorship is quite
| unethical. It doesn't help safety of the users, it helps
| safety of the business.
|
| Also Facebook censors things that are not objectively
| offensive in lots of instances. YouTube too. Safety for
| their brand.
| zelphirkalt wrote:
| The banhammer can quickly become a tool of net negative
| though, when actual facts are being repressed/censored.
| Der_Einzige wrote:
| Yes, it's trivial to remove guardrails from any open access
| model:
|
| https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-
| in...
|
| https://huggingface.co/failspy/Llama-3-70B-Instruct-
| ablitera...
| Zuiii wrote:
| Humanity is so fortunate this "guardrails" mentality didn't
| catch on when we started publishing books. While too close
| for comfort, we got twice lucky that computing wasn't
| hampered by this mentality either.
|
| This time, humanity narrowly averted complete disaster
| thanks to the huge efforts and resources of a small number
| of people.
|
| I wonder if we are witnessing the end of humanity's open
| knowledge and compute (at least until we pass through a neo
| dark age and reach the next age of enlightenment).
|
| Whether it'll be due to profit or control, it looks like
| humanity is poised to get fucked.
| xvector wrote:
| > Humanity is so fortunate this "guardrails" mentality
| didn't catch on when we started publishing books.
|
| Ah, but it almost did[1]:
|
| > novels [...] were accused of corrupting the youth, of
| planting dangerous ideas into the heads of housewives
|
| The pessimist playbook is familiar when it comes to
| technological/human progress. Today, the EU has made it
| so hard to release AI models in the region that most
| companies simply don't bother. Case in point: Meta and
| others have decided to make their models unavailable in
| the EU for the forseeable future [2]. I can only imagine
| how hard it is for a startup like Mistral.
|
| [1]: https://pessimistsarchive.org/list/novels/clippings
|
| [2]: https://www.theguardian.com/technology/article/2024/
| jul/18/m...
| shkkmo wrote:
| The EU hasn't made it hard to release models (yet). The
| EU has made it hard to train models on EU data. Meta has
| responded by blocking access to the models trained on
| non-EU data as a form of leverage/retribution. This is
| explained by your own reference.
| loceng wrote:
| To me it will be most interesting to see who attempts to
| manipulate the models by stuffing them with content,
| essentially adding "duplicate" content (such as via tautology)
| in order to give it extra, misallocated weight; I don't think
| an AI model will automatically be able to detect that unless it
| were truly intelligent; instead it would need to be trained by
| competent humans.
|
| And so the models that have mechanisms for curating and
| preventing such misapplied weighting, and the organizations and
| individuals who accurately make adjustments to the models, will
| in the end be the winners - the ones where truth has been most
| carefully honed.
| rednafi wrote:
| How is only sharing the binary artifact open source? There's
| the data aspect of things that they can't share because of
| licensing, and the code itself isn't accessible.
| Palmik wrote:
| It's much better than sharing a binary artifact of regular
| software, since the weights can be and are easily and
| frequently modified by fine tuning the model. This means you
| can modify the "binary artifact" to your needs, similar to how
| you might change the code of open source software to add
| features etc.
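| A minimal sketch of what that modification might look like in
| practice, assuming Hugging Face transformers and peft with LoRA
| adapters (the model id and hyperparameters are illustrative):
|
|     from transformers import AutoModelForCausalLM
|     from peft import LoraConfig, get_peft_model
|
|     # placeholder model id
|     base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
|     model = AutoModelForCausalLM.from_pretrained(base)
|
|     # Attach small trainable low-rank adapters; the original
|     # weights stay frozen, so the "binary artifact" is extended
|     # rather than rebuilt from scratch.
|     cfg = LoraConfig(r=16, lora_alpha=32,
|                      target_modules=["q_proj", "v_proj"])
|     model = get_peft_model(model, cfg)
|     model.print_trainable_parameters()  # usually well under 1%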
| AMICABoard wrote:
| Okay if anyone wants to try Llama 3.1 inference on CPU, try this:
| https://github.com/trholding/llama2.c (L2E)
|
| It's a bit buggy but it is fun.
|
| Disclaimer: I am the author of L2E
| sebastiennight wrote:
| I've summarized this entire thread in 4 lines (didn't even use AI
| for it!)
|
| Step 1. Chick-Fil-A releases a grass-fed beef burger to spite
| other fast-food joints, calls it "the vegan burger"
|
| Step 2. A couple of outraged vegans show up in the comments,
| pointing out that beef, even grass-fed beef, isn't vegan
|
| Step 3. Fast food enthusiasts push back: it's unreasonable to
| want companies to abide by this restrictive definition of
| "vegan". Clearly this burger is a gamechanger and the definition
| needs to adapt to the times.
|
| Step 4. Goto Step 2 in an infinite loop
| nathansherburn wrote:
| Open source software is one of our best and most passionately
| loved inventions. It'd be much easier to have a nuanced
| discussion about "open weights" but I don't think that's in
| Facebook's interest.
| llm_trw wrote:
| More like vegetarians show up claiming to be vegans, then
| vegans show up and explain why eating animal products is still
| wrong.
|
| That's the difference between open source and free software.
| dogcomplex wrote:
| Yeah the moral step up from the status quo is still laudable.
| Open weights are still much improved over the closed creepy
| spy agency clusterfucks that OpenAI/Microsoft/Google/Apple
| are bringing to the table.
| dilliwal wrote:
| On point, and pretty good analogy
| dev1ycan wrote:
| Reality: they've realized GPT-4 is a wall; they can't keep
| pouring trillions of dollars into it for little or no
| improvement, so now they want to put it out to open source until
| someone figures out the next step, then they'll take it behind
| closed doors again.
|
| I hate how the moment it's too late will be, by design, closed
| doors.
| Ukv wrote:
| > Reality: they've realize gpt 4 is a wall, they can't keep
| pouring trillions of dollars into it for no improvement or
| little at all, so now they want to put it to the open source
| [...]
|
| This is Meta (LLaMA, which has had available weights for a
| while), not OpenAI (GPT).
| dev1ycan wrote:
| How does that change anything about my comment? What does
| "available weights" change about the system being closed
| source? Additionally, they have the developers; the second
| they figure out a way to achieve AGI or something close to it,
| they'll take it closed source. This is just outsourcing
| maintenance and small tweaks.
| smashah wrote:
| Actually open source Whatsapp is the way forward.
| petetnt wrote:
| Great, now release the datasets used for training your AI so
| everyone can get compensated accordingly and ask that your
| competition follow suit.
| twelve40 wrote:
| It'll be interesting to come back here in a couple of years and
| see what's left. What do they even do anymore? They have
| Facebook, which hasn't visibly changed in a decade. They have
| Instagram, which feels a bit sleeker but also remained more or
| less the same. and Whatsapp. Ad network that runs on top of those
| services and floods them with trash. Bunch of stuff that doesn't
| seem to exist anymore - Libra, the grandiose multibillion dollar
| Legless VR, etc.
|
| But they still have 70 thousand people (a small country) doing
| _something_. What are they doing? Updating Facebook UI? Not
| really, the UI hasn't been updated, and you don't need 70
| thousand people to do that. Stuff like React and Llama? Good, I
| guess, we'll see how they make use of Llama in a couple of years.
| Spellcheck for posts maybe?
| therealdrag0 wrote:
| And still making 135B dollars in revenue, or 2M per employee. I
| don't know what they do either lol, but I don't mind that
| revenue supporting jobs.
| benreesman wrote:
| In general I look back on my time at FB with mixed feelings, I'm
| pretty skeptical that modern social media is a force for good and
| I was there early enough to have moved the needle.
|
| But this is really positive stuff and it's nice to view my time
| there through the lens of such a change for the better.
|
| Keep up the good work on this folks.
|
| Time to start thinking about opening up a little on the training
| data.
| bentice wrote:
| Ironically, this benefits Apple so much.
| netsec_burn wrote:
| How? They are prohibited from using it in the license.
| ysofunny wrote:
| or else is not even scientific
| elecush wrote:
| Ok, one notable difference: did the Linux researchers of yore
| warn about adversarial giants getting this tech? Or is this
| unique to the current moment? That, for me, is the largest
| question when considering the logical progression from "Linux
| open is better" to "AI open is better".
| Spivak wrote:
| We can't open source Linux because bad people might run
| servers?
|
| Can you imagine the disinformation they could spread with
| those? With enough of them you could have a massively global
| site made entirely for spreading it. God what if such a thing
| got into the hands of an egocentric billionaire?
| endorphine wrote:
| An operating system is not "generative", hence it's not a
| force multiplier.
| Spivak wrote:
| Are we on the same forum? Our _entire_ field is building
| force multipliers to extend ourselves well beyond of what
| we 're capable as individuals and the OS is tool that lets
| you get it done. Scale is like.. our entire thing. I feel
| like we're just so used to the world with computers that we
| forget how much power they allow people to wield. Which,
| honestly, is maybe a good sign for this next wave of tools,
| because AI isn't going to be more impactful than computers, and
| we all survived those.
| Zuiii wrote:
| I don't understand. A bad actor can use a linux server to
| automatically run botnets and exploit new devices. How is
| that not a force multiplier?
| turingbook wrote:
| Open-weights models are not really open source.
| scotty79 wrote:
| It's more like freeware than open source. You can launch it on
| your hardware and use it but how it was created is mostly not
| published.
|
| Still huge props to them for doing what they do.
| casebash wrote:
| I expect this to end up having been one of the worst timed blog
| posts in history. Open source AI has mostly been good for the
| world up until now, but we're getting to the point where we're
| about to find out why open-sourcing sufficiently bad models is a
| terrible idea.
| tananaev wrote:
| I don't think the weights are the source. The data is the source.
| But
| still better than nothing.
| cratermoon wrote:
| You first, Zuck.
| ofou wrote:
| Llama 3 Training System (Total: 19.2 exaFLOPS)
|   Cluster 1: 9.6 exaFLOPS (2 x 12K GPUs, 400+ TFLOPS/GPU)
|   Cluster 2: 9.6 exaFLOPS (2 x 12K GPUs, 400+ TFLOPS/GPU)
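| A quick sanity check of those figures (a sketch; the per-GPU
| number is the "400+ TFLOPS" quoted above):
|
|     gpus_per_cluster = 2 * 12_000        # two 12K GPU blocks
|     tflops_per_gpu = 400
|     # 1 exaFLOPS = 1e6 TFLOPS
|     cluster_exaflops = gpus_per_cluster * tflops_per_gpu / 1e6
|     print(cluster_exaflops)              # 9.6 per cluster
|     print(2 * cluster_exaflops)          # 19.2 total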
| tomjen3 wrote:
| Re safety: Just release two models, one that has been tuned and
| one that hasn't.
|
| Claude is supposed to be better, but it is also even more locked
| down than ChatGPT.
|
| Word will let me write a manifesto for a new Nazi party, but
| Claude is so locked down that it won't find a cartoon in a
| picture and Gemini... well.
|
| If AIs are not to harm society, they need to enable us to think
| in new ways.
| slowhadoken wrote:
| First you're going to have to write some laws that prevent
| openwashing and keep legitimate open source projects from
| becoming proprietary.
| thntk wrote:
| When Zuck said spies can easily steal models, I wonder how much
| of that comes from experience. I remember they struggled to
| train OPT not long ago.
|
| On a more serious note, I don't really buy his arguments about
| safety. First, widespread AI does not reduce unintentional harm
| but increases it, because the _rate of accidents_ compounds
| (rough illustration below). Second, the chance of success for
| threat actors will increase, because of the _asymmetric
| advantage_ of gaining access to all open information while
| hiding their own. But there is no reversing it at this point, so
| I'll enjoy it while it lasts; AGI will come sooner or later
| anyway.
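| (A minimal sketch of the compounding claim, assuming each
| deployment independently causes an accident with probability p:
| over n deployments, P(at least one accident) = 1 - (1 - p)^n,
| which climbs toward 1 as n grows, so wider access means more
| total accidents even if each individual system is no less safe.)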
| Simon_ORourke wrote:
| I thoroughly support Meta's open-sourcing of these AI models
| going forward. However, for a company that absolutely closed down
| discussions about providing API access to their platform, I'm
| left wondering what's in it (monetarily) for them by doing this?
| Is it to simply undercut competition in the space, like some
| grocery store selling below cost?
| yard2010 wrote:
| Rest assured, these guys are working day and night to make our
| world a darker place.
| GreenWatermelon wrote:
| Meta open sources its tools: React, GraphQL, PyTorch, and now
| these new models. Meta seems to be about open sourcing tools,
| not providing open access to their platforms.
|
| The AI model complements the platform, and their platform is
| the money maker. They believe that open sourcing their tools
| benefits their platform in the long run, which is why they're
| doing it. And in doing so, they avoid being under the control
| of any competitor.
|
| I would say it's more like a grocery store providing free
| parking, a bus stop, self-checkout, online menu, and free
| delivery.
| msnkarthik wrote:
| Interesting discussion! While I agree with Zuckerberg's vision,
| the comments raise valid concerns. The point about GPU
| accessibility and cost is crucial. Public clusters are great, but
| sustainable funding and equitable access are essential to avoid
| exacerbating existing inequalities. I also resonate with the call
| for CUDA alternatives. Breaking the dependence on proprietary
| technology is key for a truly open AI ecosystem. While existing
| research clusters offer some access, their scope and resources
| often pale in comparison to what companies like Meta are
| proposing. We need a multi-pronged approach: open-sourcing models
| AND investing in accessible infrastructure, diverse hardware
| options, and sustainable funding models for a truly democratic AI
| future.
| fnordpiglet wrote:
| I suspect we are still early in the optimization evolution. The
| weights are what matter. The ability to run them anywhere might
| come.
| troupo wrote:
| The training datasets and methodology are what matters. None
| of that is disclosed by anyone
| narrator wrote:
| 3nm chip fabs take years to build. You don't just go to AWS and
| spin one up. This is the very hard part about AI that breaks a
| lot of the usual tech assumptions. We have entered a world
| where suddenly there isn't enough compute, because it's just
| too damn hard to build capacity and that's different from the
| past 40 years.
| ohthehugemanate wrote:
| Has anyone taken apart the Llama community license and compared
| it to validated open source licenses? Red Hat is making a big
| deal about the Granite LLM being released under Apache. Is
| there a real difference between that and what Llama does?
|
| https://www.redhat.com/en/topics/ai/open-source-llm
| rightbyte wrote:
| What would the speed of a query be when running this model from
| disk on an ordinary PC?
|
| Has anyone tried that?
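| A rough way to try it, assuming the llama-cpp-python bindings and
| a quantized GGUF file (the file name below is hypothetical; actual
| speed depends heavily on the quantization, disk, and RAM, and with
| memory-mapping weights that don't fit in RAM are paged from disk,
| so for the larger models expect well below one token per second):
|
|     # pip install llama-cpp-python  (CPU build)
|     from llama_cpp import Llama
|
|     # hypothetical local file; mmap (the default) pages weights
|     # from disk when they don't fit in RAM, which is very slow
|     llm = Llama(model_path="./Meta-Llama-3.1-70B-Instruct-Q4_K_M.gguf",
|                 n_ctx=512)
|     print(llm("Hello, world", max_tokens=32)["choices"][0]["text"])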
| v3ss0n wrote:
| We welcome Mark Zuckerberg's redemption arc! Open source AI,
| here we go!
| rldjbpin wrote:
| this is very cool indeed that meta has made available more than
| they need to _in terms of model weights_.
|
| however, the "open-source" narrative is being pushed a bit too
| much like descriptive ML models were called "AI", or applied
| statistics "data science". with reinforced examples such as this,
| we start to lose the original meaning of the term.
|
| the current approach of startups or small players "open-sourcing"
| their platforms and tools as a means to promote network effect
| works but is harmful in the long run.
|
| you will find examples of terraform and red hat happening, and a
| very segmented market. if you want the true spirit of open-
| source, there must be a way to replicate the weights through
| access to training data and code. whether one could afford
| millions of GPU hours or not, real innovation would come from
| remixing the internals, and not just fine-tuning existing stuff.
|
| i understand that this is not realistically going to ever happen,
| but don't perform deceptive marketing at the same time.
| avereveard wrote:
| they said, while not releasing the video part of the Chameleon
| model
| chx wrote:
| A total ban on generative AI is the way forward. If the industry
| refuses to make it safe by self regulating then the regulator
| must step in and ban it until better, more fine tuned regulation
| can be made. It is needed to protect our environment, our
| democracy, our very lives.
| yard2010 wrote:
| Isn't it like banning Christianity? I don't think it can be
| done
| yard2010 wrote:
| I don't believe a word coming out of this lizard's mouth. He is
| the most evil villain I know, and I live in the Middle East, can
| you imagine.
| pandaswo wrote:
| Way to go!
| tarruda wrote:
| From the "Why Open Source AI Is Good for Meta" section, none of
| the four given reasons seem to justify spending so much money to
| train these powerful models and give them away for free.
|
| > Third, a key difference between Meta and closed model providers
| is that selling access to AI models isn't our business model.
| That means openly releasing Llama doesn't undercut our revenue,
| sustainability, or ability to invest in research like it does for
| closed providers. (This is one reason several closed providers
| consistently lobby governments against open source.)
|
| Maybe this is a strategic play to hurt other AI companies that
| depend on this business model?
| sensanaty wrote:
| Meanwhile Facebook is flooded with AI-generated slop with
| hundreds of thousands of other bots interacting with it to boost
| it to whoever is insane enough to still use that putrid hellhole
| of a mass-data-harvesting platform.
|
| Dead internet theory is very much happening in real time, and I
| dread what's about to come since the world has collectively
| decided to lose their minds with this AI crap. And people on this
| site are unironically excited about this garbage that is
| indistinguishable from spam getting more and more popular. What a
| fucking joke
| rocgf wrote:
| I agree with the overall sentiment, but it is not necessarily
| the case that "the world has collectively decided to lose their
| minds with this AI crap". You only need a relatively small
| number of bad actors for this to be the case.
| consf wrote:
| I think the situation highlights the need for better
| regulation
| throwaway7ahgb wrote:
| No thank you. Use existing laws that cover wide nets and
| actually enforce them.
| TheAceOfHearts wrote:
| I think there's room for a middle-ground. I agree that there's
| a lot of slop being generated and shared around. Part of it is
| due to the initial excitement over these AI tools. Part of it
| is that most AI tools still kinda suck at the moment. In the
| long term I expect tools to get way better, which I'm hopeful
| could help enable smaller teams of people than ever before to
| execute on their creative vision.
|
| Personally I have tons of creative ideas which I think would be
| interesting and engaging but for which I lack the resources to
| bring into this world, so I'm hoping that in the long term AI
| tools can help bridge this gap. I'm really hopeful for a world
| where people from all over the world can share their creative
| stories, rather than being mostly limited to a few rich people
| in Hollywood.
|
| Unfortunately I do expect this to end up being the minority of
| content, especially as we continue being flooded by increasing
| amounts of trash. But maybe that's just opening up the
| opportunity for someone to develop new content curation tools.
| If anything, even before the rise of AI stuff there were
| mountains of content, and we saw with the rise of TikTok that a
| good recommendation algorithm still leaves room for new
| platforms.
| have_faith wrote:
| The feed certainly is, but I suspect most activity left on
| Facebook is happening in group pages. Groups are the only thing
| I still log in for, as some of them, particularly the local
| ones, have no other way of taking part. They are also commonly
| membership-by-request and actively moderated. If I had the time
| (and energy) I might put some effort into advocating for moving
| to something else, but it would be an uphill battle.
| consf wrote:
| The challenges of moving to alternative solutions
| have_faith wrote:
| The irony of a bot account sliding into a convo about
| internet slop is not lost.
| sgu999 wrote:
| How do you know?
| justneedaname wrote:
| The comment history does read much like you'd expect from
| a bot, lots of short, generic statements that vaguely tie
| to the subject of the post
| jrnx wrote:
| I'd assume that any platform which gets sufficiently popular
| will become a bot and AI content target...
| baq wrote:
| I stopped going on facebook a few years ago and don't miss
| it; I don't even need messenger as everyone migrated to
| whatsapp (yes I know, normal people don't want to move to
| signal, but got quite a few techy friends to migrate). The
| FB-only groups are indeed a problem, I'm delegating them to
| my wife.
|
| _IF_ I ever had to go to FB for anything, I'd probably
| install a wall-removing browser extension. The mobile app is of
| course out of the question.
| makingstuffs wrote:
| > IF I ever had to go to FB for anything, I'd probably
| install a wall-removing browser extension. Mobile app is of
| course out of question.
|
| You'll probably find you can no longer make an account. I'm
| in the same boat as you (haven't used it and haven't missed
| it in over a decade); however, my partner needed an account
| to manage an ad campaign for a client and neither of us was
| able to make one. Both of us tried a load of different
| things and, ultimately, gave up. We had to tell the client
| what they needed over a video call.
| ohlookcake wrote:
| I just tried making one after reading your comment, and
| it was... pretty straightforward? I'm curious what
| blockers you encountered
| throwawayfour wrote:
| For me it's the Marketplace. Left FB many years ago only to
| come back to keep an eye out for used Lego for the kiddos. At
| least in my region, and for my purposes, Marketplace is miles
| better than any other competing sites/apps.
| throwaway7ahgb wrote:
| Same here, Groups + Marketplace is actually a wealth of
| information. There are still a few dark patterns, but mostly
| manageable for a "free" platform.
|
| OP's comments read like we're describing something the SS
| built (Godwin says hi).
| giancarlostoro wrote:
| It is probably a mix of people who have nowhere else to
| interact with people, and people using Groups. Facebook was
| where you'd go to talk to all your friends and family, but
| most of my friends have been getting shadowbanned since 2012
| or so, which made me use it less. I got an automatic strike
| on my account for making a meme joke about burning a house
| down due to a giant spider in a video. I appealed, and it got
| denied. I'm not using a platform that will inadvertently ban
| me by AI. But the people actually posting to kill others, and
| actually burn shit down, and the bots stay just fine?
|
| Plus I didn't want to risk my employer's Facebook app being in
| limbo if I got banned, so I left Facebook alone, never to
| return.
|
| Facebook trying to police the world is the only thing keeping
| me away, if I can use the platform and post meme comments
| again, maybe I might reconsider, but I doubt it. Reddit is in
| a similar boat. You can get banned, but all the creepy
| pedophile comments, old and recent, are still up no
| problem.
| kalsnab wrote:
| > But the people actually posting to kill others, and
| actually burn shit down ...
|
| That kind of burning down is classified as "mostly
| peaceful" by mainstream and AI.
| sgu999 wrote:
| > If I had the time (and energy) I might put some effort into
| advocating to moving to something else, but it will be an
| uphill battle.
|
| What are the alternatives for local groups? I've recently
| seen an increase in the amount of Discourse forums available,
| which is nice, but I don't think it'd be very appealing to
| the average cycling or hiking group.
| consf wrote:
| Indeed, this perspective is understandable, given the rapid and
| often disruptive changes brought by AI, but it is also
| important to consider the potential benefits which are quite
| promising
| kashyapc wrote:
| I get your frustration about a scorched internet. But I don't
| think it's all that gloomy. Whether we like it or not, LLMs and
| some kind of "down-to-earth AI" are here to stay, once the
| dust settles. Right now, it feels like everything is burning
| because we're in the thick of an evolving situation, and the
| Silicon Valley tech-bros are in a hurry to ride the wave and
| make a quick buck with their ChatGPT wrappers. (I can't speak to
| social networks; I haven't had any accounts for 10+ years,
| except for HN.)
|
| * * *
|
| On "collective losing of minds", you might appreciate this
| quote from 1841 (!) by Charles MacKay. I quoted it in the
| past[1] here, but it is worth re-posting:
|
| _" In reading the history of nations, we find that, like
| individuals, they have their whims and their peculiarities;
| their seasons of excitement and recklessness, when they care
| not what they do. We find that whole communities suddenly fix
| their minds upon one object, and go mad in its pursuit; that
| millions of people become simultaneously impressed with one
| delusion, and run after it, till their attention is caught by
| some new folly more captivating than the first [...]_
|
| _" Men, it has been well said, think in herds; it will be seen
| that they go mad in herds, while they only recover their senses
| slowly, and one by one."_
|
| -- from MacKay's book, 'Extraordinary Popular Delusions and the
| Madness of Crowds'
|
| [1] https://news.ycombinator.com/item?id=25767454
| sgu999 wrote:
| What a nice quote. What were "millions of people"
| simultaneously obsessed with around 1841?
| kashyapc wrote:
| I didn't read the book, I'm afraid. I don't know if he
| actually mentions it anywhere.
| LoveMortuus wrote:
| In regards to the dead internet hypothesis, the content that
| you're enjoying today will still be there tomorrow. What I
| mean is that if you, for example, like MrBeast, AI is not going
| to replace him and the content that he produces. Now, he might
| use AI to boost the productivity of his company, but the end
| result is still "making the best video ever", as he's often
| said.
| LauraMedia wrote:
| The big problem with this is that content is harder and
| harder to find. Try to find a non-AI generated reply to a
| viral post on Twitter, you're looking at having to scroll
| down 5-6 1080p screens to finally get to some actual stuff
| people wrote.
|
| The content you're enjoying today still exists, but it's a
| needle in a haystack of AI spam
| Shinchy wrote:
| This is the exact thing I keep telling people. It's all
| well and good saying human made content will still be
| around, but it will be covered in a tidal wave of cheaply
| generated AI hogwash.
| throwawayfour wrote:
| Reminds me of shopping on <enter your favorite large
| ecommerce site>
| shinycode wrote:
| We need a law or something that requires platforms to label
| any text that is purely AI-generated, as well as text reworked
| by AI. And the ability to filter both (we did this with
| industrial products). Then let humanity decide what it wants
| to feed itself with. I would rather give up the internet
| completely than have it filled only with generated content. I
| gladly leave that to people who enjoy it. Maybe a platform
| that labels this and allows strict filtering (if possible)
| would be a success.
| cpursley wrote:
| Can we do that for the mountains of ghost-written content
| and books as well?
| shinycode wrote:
| My take is to explicitly mark the difference between human-
| generated content and AI-generated content. Not to label
| one as superior to the other; it's just to let people choose
| what they prefer. Like the chat bots of some companies that
| let you know you aren't talking to a human. Would you
| blindly accept a medical prescription generated by an AI?
| Some people might even prefer the prescription made by
| the AI. All I'm saying is: inform people, and then they
| can make their choice.
| lukas099 wrote:
| The signal:noise ratio is decreasing because it's easier to
| generate noise. I think paying for content (or content
| curation) is probably the way to curate high-signal
| information feeds.
| elorant wrote:
| All content that exists on the web could now be rewritten and
| repurposed by LLMs. This could lead to an explosion of web
| sites, with the web easily doubling in size every few years.
| Good luck indexing all that crap and deciding what is a
| duplicate and what is not.
| sbeaks wrote:
| Are we going to end up with re-re-recaptcha, where to post you
| need something on your phone/laptop measuring typing speed and
| scanning your face with an IR cam? Or a checkmark showing it was
| typed out by hand? It wouldn't get rid of AI content but might
| slow down bots posting it.
| kristopolous wrote:
| Their trajectory is so close to AOL's it's almost
| implausible. Their cash-cow flagship product is widely panned
| by tech insiders as abusive, manipulative, and toxic, but they
| also put significant financial resources into high quality open
| source projects out of what can only be described as
| benevolence and some commitment to the common good.
| aranke wrote:
| Which open-source projects is AOL known for? A quick Google
| search isn't returning much.
| jonathanwallace wrote:
| https://www.google.com/search?hl=en&q=aol%20tcl
|
| https://wiki.tcl-lang.org/page/AOLserver
| kristopolous wrote:
| Dropping cash for Netscape/Mozilla is the big one.
| csomar wrote:
| That's actually the preferred outcome. The open internet's noise
| ratio will be so high that it turns into pure randomness. The
| traditional old venues (reputed blogs, small community forums,
| paying for valued information, paying for your search, etc.)
| will resurface again. The Popular Web has been in a slow
| decline; time to kill it.
| wildrhythms wrote:
| My concern is that these platforms will soon sell Human
| Created (tm) content back to us.
| thierrydamiba wrote:
| Is it really a decline? If people are looking for and
| consuming the slop, where is the issue?
|
| There is still plenty of high quality stuff too, if that is what
| you're looking for. If you want to roll with the pigs in the
| shit, who am I to tell you no?
| dspillett wrote:
| _> The traditional old venues [...] will resurface again._
|
| ... to be subsequently drowned out by AI "copies" of
| themselves, which in turn are used to train more AIs, until
| we don't have a Dead Internet [1] but a Habsburg Internet.
|
| --
|
| [1] https://en.wikipedia.org/wiki/Dead_Internet_theory
| eightysixfour wrote:
| I am one of those people unironically excited - the social
| parts of the internet have been dead and filled with bots for a
| long time. Now people just see it more.
|
| Maybe they'll go outside.
| zwnow wrote:
| Bots were easy to detect, now they're almost
| indistinguishable from human interaction. The death of the
| internet would be an incredible loss for humanity. You will
| not be able to trust anything you find online. Nothing is
| safe from this.
| eightysixfour wrote:
| I am using the broad definition of bots to include large
| numbers of accounts controlled by small groups of people to
| influence online discourse.
|
| Between those bots (for nefarious, mundane, or marketing
| reasons) and previous attempts at automated bots, "broad"
| internet discourse was already ruined. Now people recognize
| it. This will have the effect of pushing communities back
| to smaller sizes, I think this is a good thing.
|
| People shouldn't have trusted all the things they read
| online from untrusted sources in the first place.
| Kiro wrote:
| And I don't understand why people lump all AI together as if a
| coding assistant is the same thing as AI generated spam and
| other garbage. I'm pretty sure no-one here is excited about
| that.
|
| I'm excited about the former since AI has massively improved my
| productivity as a programmer to a point where I can't imagine
| going back. Everything is not black or white and people can be
| excited about one part of something and hate another at the
| same time.
| chr15m wrote:
| fear is why
| sensanaty wrote:
| Seeing some of the code my colleagues are shitting out with
| the help of coding "assistants", I would definitely
| categorize their contributions as spam, and it has had nothing
| but an awful effect on my own time and energy, having to sift
| through the unfiltered crap. The problem being, of course,
| that the idiotic C-suite in their infinite wisdom decided to
| push "use AI assistants" as a KPI, so people are even
| encouraged to spam PRs with terrible code.
|
| If this is what productivity looks like then I'm proud to be
| unproductive.
| Kiro wrote:
| I'm sorry that you work at a dysfunctional company.
| testfrequency wrote:
| Hear, hear!
| codedokode wrote:
| AI should not be open source because it can be used in military
| applications. It doesn't make sense to give away a technology
| others might use against you.
| ChanderG wrote:
| I think all this discussion around Open-source AI is a total
| distraction from the elephants in the room. Let's list what you
| need to run/play around with something like Llama:
|
| 1. Software: this is all PyTorch/HF, so completely open-source.
| There is total parity between what corporations have and what
| the public has.
|
| 2. Model weights: Meta and a few other orgs release open models -
| as opposed to OpenAI's closed models. So, ok, we have something
| to work with.
|
| 3. Data: to actually do anything useful you need tons of data.
| This is beyond the reach of the ordinary man, setting aside the
| legality issues.
|
| 4. Hardware: GPUs, which are extremely expensive. Not just that,
| even if you have the top dollars, you have to go stand in a queue
| and wait for O(months), since mega-corporates have gotten there
| before you.
|
| For inference, you need 1, 2 and 4 (rough sketch below). For
| training (or fine-tuning), you need all of these. With newer and
| larger models like the latest Llama, 4 is truly beyond the reach
| of ordinary entities.
|
| This is NOTHING like open source, where a random guy can
| edit/recompile/deploy software on a commodity computer. With
| LLMs, once Data and Hardware enter the equation, the playing
| field is completely stacked. This thread has a bunch of people
| discussing nuances of 1 and 2, but this bike-shedding only hides
| the basic point: control of LLMs is for mega-corps, not for
| individuals.
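| To make 1, 2 and 4 concrete, here is a minimal inference sketch
| using the open-source HF stack; the model id is illustrative
| (the weights are gated behind a license acceptance on the Hub),
| and the larger variants need far more GPU memory than a
| commodity machine has:
|
|     # pip install transformers torch accelerate
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # assumed id
|     tok = AutoTokenizer.from_pretrained(model_id)
|     model = AutoModelForCausalLM.from_pretrained(
|         model_id, torch_dtype="auto", device_map="auto")  # item 4
|
|     inputs = tok("Open weights are not open source because",
|                  return_tensors="pt").to(model.device)
|     out = model.generate(**inputs, max_new_tokens=50)
|     print(tok.decode(out[0], skip_special_tokens=True))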
| fishermanbill wrote:
| But there is an insidiousness to Meta calling their software
| 'open source'. It feels as if they are riding on the coattails
| of the term, as if they are being altruistic, when in fact they
| are being no more altruistic than any large corporation that
| wants to capture market share via their financial muscle -
| which I suppose touches on your last point.
| fishermanbill wrote:
| It's not open source.
|
| We don't get the data or training code. The small runtime
| framework is open source, but that's of little use as it's
| largely fixed in implementation due to the weights. Yes, we can
| fine tune, but that is akin to modifying video games - we can do
| it, but there's only so much you can do within reasonable effort,
| and no one would call most video games 'open source'*.
|
| _It's freeware, and Meta's strategy is much more akin to the
| strategy Microsoft used with Internet Explorer to capture the web
| browser market. No one was saying god bless Microsoft for trying
| to capture the browser market with I.E. Nothing wrong with Meta's
| strategy, just don't call it open source._
|
| *Weights are data, and so is the video/audio output of a video
| game. If a developer gave that video game output away for free,
| we wouldn't call the video game open source - which is
| essentially what the myriad freeware games do.
| Palmik wrote:
| I don't think these analogies work.
|
| Meta provides open source code to modify the weights (fine-
| tune the model). In this context, fine-tuning the model is
| better compared to being able to modify the code of the game.
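| To make "modify the weights" concrete, a minimal fine-tuning
| sketch using the open-source PEFT library; the model id is
| illustrative, and the training data and loop are omitted:
|
|     # pip install transformers peft torch
|     from transformers import AutoModelForCausalLM
|     from peft import LoraConfig, get_peft_model
|
|     base = AutoModelForCausalLM.from_pretrained(
|         "meta-llama/Meta-Llama-3.1-8B")  # assumed model id
|
|     # LoRA adds small trainable matrices to the attention layers
|     # while the original weights stay frozen.
|     cfg = LoraConfig(r=8, lora_alpha=16,
|                      target_modules=["q_proj", "v_proj"],
|                      task_type="CAUSAL_LM")
|     model = get_peft_model(base, cfg)
|     model.print_trainable_parameters()  # only a tiny fraction trains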
| fishermanbill wrote:
| So do video game developers (provide source code to modify
| their games); the analogy absolutely works. I can list a huge
| amount of actually open source software whose source code and
| data I can see, which is very different from Llama etc.
| OriginalMrPink wrote:
| Open Source AI is the path forward, but I have a hard time
| believing that Meta should be affiliated with it.
| pera wrote:
| I wish Meta would stop using the "open source" misnomer for
| free-of-charge weights. In the US the FTC already uses the term
| _Open-Weights_ [1], and it seems the industry is also adopting
| this term (e.g. Mistral).
|
| Someone can correct me here, but AFAIK we don't even know which
| datasets were used to train these models, so why should we even
| use "open" to describe Llama? This is more similar to freeware
| than to an open-source project.
|
| [1] https://www.ftc.gov/policy/advocacy-research/tech-at-
| ftc/202...
| benrutter wrote:
| This is such a good point. The industry is really putting the
| term "open source" through the wringer at the moment, but I
| don't see any justification for considering the final weight
| output a "source" any more than releasing a compiled binary
| would be open source.
|
| In fairness to Llama, the source code itself (though not the
| training data) _is_ available to access, although not really
| under a license that many would consider open source.
| zelphirkalt wrote:
| Facebook is one of the greats when it comes to twisting words
| and appropriating terms in ways that benefit Facebook.
| largbae wrote:
| The All-In podcast predicted this exact strategy for keeping
| OpenAI and other upstarts from disrupting the existing big tech
| firms.
|
| By giving away higher and higher quality models, they undermine
| the potential return on investment for startups who seek money to
| train their own. Thus investment in foundation model building
| stops and they control the ecosystem.
| giancarlostoro wrote:
| I enjoy that podcast, only really listened to it a few times,
| but they definitely bring up some interesting topics, the kind
| I come on HN for.
| thierrydamiba wrote:
| I predicted this 7 months ago. Can I get a podcast?
|
| https://news.ycombinator.com/item?id=38556771#38559118
| zelphirkalt wrote:
| Open data for open algo for open AI is the path forward.
| dcist wrote:
| So commoditize the complement.
| ssahoo wrote:
| Additional Commercial Terms. If, on the Llama 3.1 version release
| date, the monthly active users of the products or services made
| available by or for Licensee, or Licensee's affiliates, is
| greater than 700 million monthly active users in the preceding
| calendar month, you must request a license from Meta, which Meta
| may grant to you in its sole discretion, and you are not
| authorized to exercise any of the rights under this Agreement
| unless or until Meta otherwise expressly grants you such rights.
|
| Which open-source license has such restrictions and clauses?
| rmbyrro wrote:
| Which open source project cost the dedicated usage of 16
| thousand H100s over several months?
|
| C'mon folks, they're opening up for free to 99.99% of potential
| users what cost hundreds of millions of dollars, if not in the
| ballpark of a billion.
|
| Let's appreciate that for a while, instead of focusing on
| semantics.
| arrosenberg wrote:
| I don't think the largest tech companies in the world have
| earned that view of benevolence. It's really hard to take
| altruism seriously when it is coming from Zuckerberg.
| birdalbrocum wrote:
| Licensing is not a simple semantic problem. It is a legal
| problem with strong ramifications, especially as things are on
| their way to being standardized. What Facebook is trying to do
| with its "open source" models is to exhaust the possibility of
| fully open source models becoming industry standards, and to
| create an alternative monopoly to Microsoft/OpenAI. Think of it
| this way: if an entity held the rights to ISO standards, it
| would be extremely rich. Eventually researchers will release
| fairly advanced ML models that are fully open source (from
| dataset to training code), and Facebook is trying to block them
| before they even start, to head off the possibility of those
| models becoming the standard. This tactic is complementary to
| the closed-source rivals' approach and should not be understood
| as a challenge to them.
|
| A good wording for this is "open-washing" as described in
| this paper:
| https://dl.acm.org/doi/fullHtml/10.1145/3630106.3659005
| maxdo wrote:
| I'm really unsure if it's a good idea given the current
| geopolitics.
|
| Open source code in the past was fantastic because the West had a
| monopoly on CPUs and computers. Sharing and contributing was
| amazing while ensuring that tyrants couldn't use this tech to
| harm people, simply because they didn't have the hardware to run
| it.
|
| But now, things are different. China is advancing in chip
| technology, and Russia is using open-source AI to harm people at
| scale today, with auto-targeting drones being just the start.
| The Red Sea conflict, etc.
|
| And somehow, Zuckerberg keeps finding ways to mess up people's
| lives, despite having the best intentions.
|
| Right now you can build a semi-autonomous drone with AI to kill
| people for ~$500-700. The western world will still use safe and
| secure commercial models, while the new axis of evil will use
| models based on Meta's or any other open source to do whatever
| harm they can imagine, with not a hint of control.
|
| Take this particular model and fine-tune it, at scale, on all
| the research that a government at that level can obtain, to help
| develop a nuclear bomb, killer drone swarms, etc. Once the
| knowledge is public, these models can serve as a base to give
| expert-level knowledge to anyone who wants it, uncensored.
| Especially if you are a government that wants to destroy a
| peaceful order for whatever reason.
| rmbyrro wrote:
| You think Russia and China wouldn't be able to steal any closed
| model for a couple million dollars?
| that_guy_iain wrote:
| Wouldn't even need to pay to steal it. FAANG have been shown
| to be hacked by state actors. China has some of the best
| hackers in the world.
| maxdo wrote:
| Stealing a model and building an entire AI community around it
| are very, very different things:
|
| Being able to fine-tune, update, and run a model without very
| deep domain knowledge is what we receive as an outcome here.
|
| If you are a software engineer and you steal a model in some
| closed OpenAI format, you will not get much benefit even if you
| understand the format of that model; it's a complex beast by any
| measure.
|
| This release is a playbook for how anyone can run it.
|
| So yeah, big corp is evil on one side, but oh well, think
| of North Korea, Russia etc. levels of evilness and what they
| can do with that.
| talldayo wrote:
| > think of North Korea, Russia etc level of evilness and
| what they can do whit that.
|
| To date, I have not seen any "evil" applications of AI, let
| alone dangerous or even useful ones. If Russia or North
| Korea get their hands on a modern AI model, the CIA will
| get their "Red Mercury" call:
| https://en.wikipedia.org/wiki/Red_mercury
| adhamsalama wrote:
| So, only the west should be able to use AI to kill people
| because they're the good guys?
| AlexandrB wrote:
| This argument reminds me a lot of restrictions on exporting
| encryption in the 90s.
| wavemode wrote:
| You're vastly overestimating the capability of LLMs to create
| new knowledge not already contained in their training material.
| keepswag wrote:
| It was not amazing that the West had monopolies, because they
| are the ones using AI and advancing AI tech to harm people. I'm
| not sure what you're getting at here with that comment.
|
| https://www.vox.com/future-perfect/24151437/ai-israel-gaza-w...
|
| https://www.972mag.com/mass-assassination-factory-israel-cal...
|
| https://www.theguardian.com/world/2024/apr/03/israel-gaza-ai...
| Gravityloss wrote:
| Can't it be divided into multiple parts to have a more meaningful
| discussion? For example, the terminology could identify four key
| areas:
| - Open training data (this is very big)
| - Open training algorithms (does it include infrastructure code?)
| - Open weights (the result of the previous two)
| - Open runtime algorithm
| jll29 wrote:
| The question is what is "open source" in the case of a matrix of
| numbers, as opposed to code.
|
| Also, are there any "IP" rights attached at all to a bunch of
| numbers coming out of a formula that someone else calculated for
| you? (edit: after all, a "model" is just a matrix of numbers
| coming out of running a training algorithm that is not owned by
| Meta over training data that is not owned by Meta.)
|
| Meta imposes a notification duty AND a request for another
| license (no mention of the details of these) for applications of
| their model with a large number of users. This is against the
| spirit of open source. (In practical terms it is not a show
| stopper since you can easily switch models, although they all
| have subtly different behaviours and quality levels.)
| abss wrote:
| Interesting, but we have to consider this information with
| skepticism since it comes from Meta. Additionally, merely open-
| sourcing models is insufficient; the training data must also be
| accessible to verify the outcomes. Furthermore, tools and
| applications must be freely deployable and capable of storing and
| sharing data under our personal control. Self-promotion: We have
| initiated experiments for an AI-based operating system, check
| AssistOS.org. We recently received a European research grant to
| support the improvement of AssistOS components. Contact us if you
| find our work interesting, wish to contribute, conduct research
| with us, or want to build an application for AssistOS.
| Purplehermann wrote:
| Well that's it then, we're gonna die
| bzmrgonz wrote:
| I see it as a new race to build the personal computer (PC) all
| over again. I hope we can apply the lessons learned and can jump
| into open source to speed up development and democratize AI for
| all. We know how Microsoft played dirty in the early days of the
| PC revolution.
| arisAlexis wrote:
| Zuckerberg and LeCun put humans at great risk
___________________________________________________________________
(page generated 2024-07-24 23:14 UTC)