[HN Gopher] Building Meta's GenAI infrastructure
___________________________________________________________________
Building Meta's GenAI infrastructure
Author : mootpt
Score : 383 points
Date : 2024-03-12 15:52 UTC (7 hours ago)
(HTM) web link (engineering.fb.com)
(TXT) w3m dump (engineering.fb.com)
| alexsereno wrote:
| Honestly, Meta is consistently one of the better companies
| at releasing tech stack info or just open sourcing; these
| kinds of articles are super fun.
| adamnemecek wrote:
| Do you find this informative?
| alexsereno wrote:
| Yes of course - it depends on what lens though. If you mean
| "I'm learning to build better from this" then no, but it's
| very informative on Meta's own goals and mindset, as well as
| real numbers that allow comparison to investment in other
| areas, etc. Also the point was mostly that Meta does publish
| a lot in the open - including actual open source tech stacks
| etc. They're reasonably good actors in this specific domain.
| rshm wrote:
| I think some elements of this stack might flow into the Open
| Compute Project.
| CuriouslyC wrote:
| Yann wants to be open and Mark seems happy to salt the earth.
| bananabrick wrote:
| What do you mean?
| CuriouslyC wrote:
| In pretty much every interview, Yann has talked about how
| important it is that AI infrastructure be open and
| distributed for the good of humanity, and how he wouldn't
| work for a company that wasn't open. Since Mark doesn't have
| an AI product to cannibalize, it's in his interest to
| devalue the AI products of others ("salting the earth").
| Legend2440 wrote:
| I don't see how they're devaluing other people's AI
| products.
| CuriouslyC wrote:
| The Llama models have played a large part in fostering
| the development of the open source LLM ecosystem, and I
| expect Llama3 to outperform Mistral Medium and Anthropic's
| Haiku while being fully open and able to run on consumer
| hardware.
| crakenzak wrote:
| The angle is that by releasing cutting edge AI research
| to the public openly, the relative difference between
| open source models/tech and closed source tech shrinks.
|
| Whether you think the "value" of AI products is
| proportional to their performance gap vs the next closest
| thing is up to you. A very interesting PG essay I read
| recently argues the opposite (superlinear returns): if
| you're half as good as the next competitor, you don't get
| half the customers, you get 0.
|
| Essay: https://paulgraham.com/superlinear.html
| nova22033 wrote:
| New Linux versions don't "salt the earth" for Windows.
| nemothekid wrote:
| Linux is not competitive as a desktop platform for
| regular users, but Linux did "salt the earth" for the
| server market.
| conradev wrote:
| Mistral makes comparable models to Facebook. Mistral
| charges money, Facebook does not. This negatively
| affects Mistral's pricing power, because a customer can
| get 70% of the performance they need for 0% of the cost.
|
| The "0% of the cost" part is unique to software
| businesses because you can copy software so cheaply
| jedberg wrote:
| It's called "commoditize the complement."
|
| If they make AI models free to use, it makes OpenAI nearly
| valueless, which means OpenAI can't survive to sell
| Meta's competitors a better GenAI product than Meta can
| make itself.
|
| So basically since they don't make money directly on
| GenAI, it makes sense for them to release it for free so
| no one else can have something better, so they don't have
| to compete on GenAI abilities with their competitors.
| chasd00 wrote:
| is "salting the earth", in the biblical sense of destroying
| your enemy and their land to the point where not even
| plants grow again, an SV term used for companies that
| promote open source?
| torginus wrote:
| I genuinely think one of the most plausible short-term dangers
| of AI is the creation of lifelike bots which will be absolutely
| indistinguishable from real humans in short-form online
| interaction.
|
| Since people don't want to talk to algorithms, this would
| result in them shunning all social media, which is a huge
| danger to companies in the space.
| marmaduke wrote:
| Just for comparison, Swiss CSCS's new Alps system will get
| 5k GH200 nodes (each with an H100).
| mjburgess wrote:
| It'd be great if they could invest in an alternative to
| Nvidia -- then, in one fell swoop, destroy the moats of
| everyone in the industry.
| math_dandy wrote:
| A company moving away from Nvidia/CUDA while the field is
| developing so rapidly would result in that company falling
| behind. When (if) the rate of progress in the AI space slows,
| then perhaps the big players will have the breathing room to
| consider rethinking foundational components of their
| infrastructure. But even at that point, their massive
| investment in Nvidia will likely render this impractical.
| Nvidia decisively won the AI hardware lottery, and that's why
| it's worth trillions.
| mjburgess wrote:
| I'm more concerned with avoiding Nvidia's (et al.) market
| domination than with chasing the top edge of the GenAI
| benefits sigmoid. Domination there would prevent much
| broad-based innovation.
| hx8 wrote:
| This space is so competitive that even if Nvidia is asleep
| at the wheel, a competitor will come and push them before
| too long. AMD has a history of noticing when their
| competitors are going soft and rapidly becoming competitive.
| whiplash451 wrote:
| People said the same thing when tensorflow was all the rage
| and pytorch was a side project.
|
| Granted, HW is much harder than SW, but I would not discount
| Meta's ability to displace NVIDIA entirely.
| Cthulhu_ wrote:
| I don't think they could; Nvidia has tons of talent, and
| Meta would have to poach it. Meta also doesn't do anything
| in either consumer or datacenter hardware that isn't for
| themselves.
|
| Meta is a services company, their hardware is secondary and
| for their own usage.
| Wazako wrote:
| Meta has the Quest. It's not so far-fetched that they're
| looking to create an LPU for their headset to offer local
| inference.
| John23832 wrote:
| https://engineering.fb.com/2023/10/18/ml-applications/meta-a...
| paxys wrote:
| Except that "one fell swoop" would realistically be 20+ years
| of research and development from the top minds in the
| semiconductor industry.
| logicchains wrote:
| It's not the hardware keeping NVidia ahead, it's the
| software. Hardware-wise AMD is competitive with NVidia, but
| their lack of a competitive CUDA alternative is hurting
| adoption.
| aeyes wrote:
| Isn't Google trying to do this with their TPUs?
| crakenzak wrote:
| I still, for the life of me, can't understand why Google
| doesn't just start selling their TPUs to everyone. Nvidia
| wouldn't be anywhere near their current size if they only
| made H100s available through their DGX cloud -- yet that is
| exactly what Google is doing by only making TPUs available
| through Google Cloud.
|
| Good hardware, good software support, and market is starving
| for performant competitors to the H100s (and soon B100s).
| Would sell like hotcakes.
| qiine wrote:
| Maybe selling hardware to customers worldwide, plus support,
| like Nvidia does is actually not trivial?
| ajcp wrote:
| And undercut what they'd like to use as a huge motivator
| for people moving to GCP? Not likely. Even if they wanted
| to, they can't keep up with their own internal demand.
|
| Beyond that they might not be as stable or resilient
| outside of the closely curated confines of their own data-
| centers. In that case selling them would be more of an
| embarrassment.
| htrp wrote:
| >Beyond that they might not be as stable or resilient
| outside of the closely curated confines of their own
| data-centers. In that case selling them would be more of
| an embarrassment.
|
| Once you go out of your heavily curated hardware stack,
| the headaches multiply exponentially.
| aseipp wrote:
| It is an absolutely massive amount of work to turn
| something designed for your custom software stack and data
| centers (custom rack designs, water cooling, etc) into a
| COTS product that is plug-and-play; not just technically
| but also things like sales, support, etc. You are
| introducing a massive amount of new problems to solve and
| pay for. And the in-house designs like TPUs (or Meta's
| accelerators) are cost effective in part because they
| _don't_ do that stuff at all. They would not be as cheap
| per unit of work if they had to also pay off all that other
| stuff. They also have very strong internal demand for TPUs,
| which takes priority over GCP.
| neuronexmachina wrote:
| The impression I got from this thread yesterday is that
| Google's having difficulty keeping up with the heavy
| internal demand for TPUs:
| https://news.ycombinator.com/item?id=39670121
| dekhn wrote:
| Do you mean, sell TPU hardware to other companies that
| would run it in their data centers? I can't imagine that
| would ever really work. The only reason TPUs work at Google
| is because they have huge teams across many different areas
| to keep them running (SRE, hardware repair, SWE, hardware
| infra) and it's coupled to the design of the data centers.
| To vend and externalize the software would require Google
| to set up similar teams for external customers (well beyond
| what Google Cloud provides for TPUs today) just to eke out
| some margin of profit. Plus, there is a whole proprietary
| stack running under the hood that google wouldn't want to
| share with potential competitors.
|
| Google used to sell a search appliance-in-a-box and
| eventually lost interest because hardware is so high-touch.
| aeyes wrote:
| > Google used to sell a search appliance-in-a-box and
| eventually lost interest because hardware is so high-
| touch.
|
| We had a GSA for intranet search and other than the paint
| this was a standard Dell server. I remember not being
| impressed by what the GSA could do.
|
| We also had Google Urchin for web analytics, it wasn't a
| hardware appliance but the product wasn't very impressive
| either. They then killed that and tried to get you onto
| Google Analytics.
|
| They just didn't commit to these on premise enterprise
| products.
| brucethemoose2 wrote:
| Facebook very specifically bought and customized Intel SKUs
| tailored for AI workloads for some time.
| islewis wrote:
| I know we won't get this from FB, but I'd be really
| interested to see how the relationship of compute power to
| engineering hours scales.
|
| They mention custom building as much as they can. If FB magically
| has the option to 10x the compute power, would they need to re-
| engineer the whole stack? What about 100x? Is each of these re-
| writes just a re-write, or is it a whole order of magnitude more
| complex?
|
| My technical understanding of what's under the hood of these
| clusters is pretty surface level- super curious if anyone with
| relevant experience has thoughts?
| bilekas wrote:
| I'm not 100% sure, but I would make an educated guess that
| the cluster in the first image, for example, is a sample of
| scalable clusters, so throwing more hardware at it could
| bring improvements, but sooner or later the cost-to-
| improvement ratio will call for an optimization or rewrite
| as you call it -- so a bit of both, usually. It seems a bit
| of a balancing act really!
| tintor wrote:
| "just a re-write"
| mirekrusin wrote:
| ...the idea is that at some point it "just re-writes" itself.
| lvl102 wrote:
| This reads more like a flex for the investment community.
| choppaface wrote:
| They say the total cluster will reach 350k H100s, which at
| a $30k street price is about $10b.
|
| In contrast, Microsoft is spending over $10b per quarter capex on
| cloud.
|
| That makes Zuck look conservative after his big loss on the
| metaverse.
|
| https://www.datacenterdynamics.com/en/news/q3-2023-cloud-res...
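The arithmetic in the comment above can be sketched quickly; the 350k GPU count is from the article, while the $30k street price is the commenter's assumption:

```python
# Back-of-envelope GPU capex from the thread's figures.
num_h100 = 350_000       # H100-equivalents Meta says it will reach (from the article)
street_price = 30_000    # assumed per-unit street price in USD (commenter's figure)

gpu_capex = num_h100 * street_price
print(f"GPU-only capex: ${gpu_capex / 1e9:.1f}B")  # -> GPU-only capex: $10.5B
```

As the replies note, GPUs are only part of the capex; servers, networking, and buildings add substantially on top.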
| baby wrote:
| What loss lol. Stop the fud
| Legend2440 wrote:
| Has literally anyone spent money on the metaverse? Maybe
| it'll still take off in the future, but it's a $40b loss so
| far.
| artninja1988 wrote:
| >Has literally anyone spent money on the metaverse?
|
| I guess people buy their vr headsets, if that counts. I'm
| not too familiar with what the "metaverse" entails
| though...
| yuliyp wrote:
| That's a weird comparison. The GPU is only a part of the capex:
| there's the rest of the servers and racks, the networking, as
| well as the buildings/cooling systems to support that.
| KaiserPro wrote:
| The biggest cost at Meta is infra.
|
| > In contrast, Microsoft is spending over $10b per quarter
| capex on cloud.
|
| to service other people's workloads. It's a different business.
| DEDLINE wrote:
| I wonder if Meta would ever try to compete with AWS / MSFT / GOOG
| for AI workloads
| lifeisstillgood wrote:
| FB does not have the flywheel of running datacentres for
| others - all three of those mentioned run hyperscale
| datacentres that they can then juice by "investing" billions
| in AI companies, who then turn around and hand those
| billions back as revenue to the investors.
|
| OpenAI takes money from MSFT and buys Azure services
|
| Anthropic takes Amazon money and buys AWS services (as do many
| robotics etc)
|
| I am fairly sure it's not illegal but it's definitely low
| quality revenue
| woah wrote:
| Sounds like it's free equity at the very least
| lotsofpulp wrote:
| How is it free equity? Spending money to invest it
| somewhere involves risks. You might recover some of it if
| the investment is valued by others, but there is no
| guarantee.
| miohtama wrote:
| You do not need cash in hand to invest. Instead, you
| print your own money (AWS credit) and use that to drive
| up the valuation, because this money costs you nothing
| today.
|
| It might cost tomorrow, though, when the company starts to
| use your services. However, depending on the deal structure,
| they might not use all the credit, might go belly up before
| the credit is used, or might be bought up by someone with
| real cash.
| virtuallynathan wrote:
| Facebook has more datacenter space and power than Amazon,
| Google, and Microsoft -- possibly more than Amazon and
| Microsoft combined...
| dsp wrote:
| [citation needed]
| jedberg wrote:
| Unless you've worked at Amazon, Microsoft, Google, and
| Facebook, or a whole bunch of datacenter providers, I'm not
| sure how you could make that claim. They don't really share
| that information freely, even in their stock reports.
|
| Heck, I worked at Amazon, and even then I couldn't tell you
| the total datacenter space; they don't even share it
| internally.
| virtuallynathan wrote:
| You can just map them all... I have. I also worked at AWS
| :)
| karmasimida wrote:
| I don't think so. AWS hasn't disclosed these numbers, like
| datacenter space occupied, so how do you know?
| virtuallynathan wrote:
| I have mapped every AWS data center globally, and I
| worked at AWS.
|
| Facebook publishes this data.
| pgwhalen wrote:
| I have zero evidence, but this seems extremely unlikely. Do
| you have more than zero evidence?
| meiraleal wrote:
| Meta can use all their datacenter space while Amazon,
| Google, and Microsoft datacenter space is mostly rented.
| virtuallynathan wrote:
| To date, Facebook has built, or is building, 47,100,000 sq
| ft of space, totaling nearly $24bn in investment. Based on
| available/disclosed power numbers and extrapolating per
| sq ft, I get something like 4770MW.
|
| Last I updated my spreadsheet in 2019, Google had $17bn in
| investments across their datacenters, totaling 13,260,000
| sq ft of datacenter space. Additional buildings have been
| built since then, but not to the scale of an additional
| 30mil sq ft.
|
| Amazon operates ~80 datacenter buildings in Northern
| Virginia, each ~200,000 sq ft -- about 16,000,000sq ft
| total in that region, the other regions are much much
| smaller, perhaps another 4 mil sq ft. When I'm bored I'll
| go update all my maps and spreadsheets.
| vineyardmike wrote:
| NVidia also invests in their AI customers.
| itslennysfault wrote:
| Neither did AWS when they started. They were just building
| out data centers to run their little book website and decided
| to start selling the excess capacity. Meta could absolutely
| do the same, but in the short term, I think they find using
| that capacity more valuable than selling it.
| otterley wrote:
| > Neither did AWS when they started. They were just
| building out data centers to run their little book website
| and decided to start selling the excess capacity.
|
| This is a myth. It simply isn't true. AWS was conceived as
| a greenfield business by its first CEO. Besides, S3 and SQS
| were the first AWS services; EC2 didn't appear till a few
| years later. And it wasn't built from excess Amazon server
| capacity; it was totally separate.
| miohtama wrote:
| Such barter deals were also popular during the 00s Internet
| Bubble.
|
| Here more on the deals (2003):
|
| https://www.cnet.com/tech/services-and-software/aol-saga-
| ope...
|
| Popular names included AOL, Cisco, Yahoo, etc.
|
| I wonder whether Amazon's term sheets driving high
| valuations are nothing but AWS credits (Amazon's own
| license to print money).
| rthnbgrredf wrote:
| Meta could build their own cloud offering. But it would take
| years to match the current existing offerings of AWS, Azure and
| GCP in terms of scale and wide range of cloud solutions.
| bionhoward wrote:
| Aww, those existing offerings are overcomplicated as hell.
| A fresh look could yield a substantially simpler cloud
| developer experience, and this would compete well against
| those other cloud offerings on simplicity alone.
| oblio wrote:
| The real question is: why aren't they? They had the
| infrastructure needed to seed a cloud offering 10 years ago.
| Heck, if Oracle managed to be in 5th (6th? 7th?) place,
| Facebook for sure could have been a top 5 contender, at
| least.
| KaiserPro wrote:
| Because Meta sucks at software, documentation, and making
| sure end user products work in a supported way.
|
| Offering reliable IaaS is super hard and capital intensive.
| It's also not profitable if you are perceived as shit.
| logicchains wrote:
| >because meta sucks at software
|
| Google started a cloud and their user-facing software is
| atrocious. Compare e.g. Angular to React, or TensorFlow to
| PyTorch.
| Cthulhu_ wrote:
| And then there's sales. All of those three - and more you
| haven't considered, like the Chinese mega-IT companies -
| spend huge amounts on training, partnerships, consultancy,
| etc to get companies to use their services instead of their
| competitors. My current employer seems all-in on Azure,
| previous one was AWS.
|
| There was one manager who worked at two large Dutch companies
| and sold AWS to them, as in, moving their entire IT,
| workloads and servers over to AWS. I wouldn't be surprised if
| there was a deal made there somewhere.
| crowcroft wrote:
| I think Meta have avoided doing this because it would
| complicate their business priorities. They don't really do B2B.
| carlossouza wrote:
| What do you mean by "they don't do B2B"? They sell ads to
| companies, don't they?
| redleader55 wrote:
| For consumers, AI could just be a stateless "microservice".
| Meta already has enough surfaces where customers can
| interact with AI.
| hendersoon wrote:
| 350k H100 cards, around ten _billion_ dollars just for the GPUs.
| Less if Nvidia gives a volume discount, which I imagine they do
| not.
| renegade-otter wrote:
| It will be ironic if Meta sinks all this money into the new
| trend and finds out later that it has been a huge boondoggle,
| just as publishers followed Facebook's "guidance" on video
| being the future, subsequently gutting the talent pool and
| investing into video production and staff - only to find out it
| was all a total waste.
| motoxpro wrote:
| It already paid off, when the world moved from deterministic
| to probabilistic ad modeling. That's why their numbers are
| so good right now compared to every other advertiser.
| blitzar wrote:
| It already paid off. FB stonk price is up lots.
| echelon wrote:
| As a practitioner in the field, I can assure you this is not
| a boondoggle.
|
| Those GPUs are going to subsume the entire music, film, and
| gaming industries. And that's just to start.
| __loam wrote:
| "My paycheck depends on this technology destroying every
| field producing cultural artifacts"
| echelon wrote:
| Said the butter churner, cotton ginner, and petrol
| pumper.
|
| I work in film. I've shot dozens of them the old
| fashioned way. I've always hated how labor, time, and
| cost intensive they are to make.
|
| Despite instructions from the luminaries to "just pick up
| a camera", the entire process is stone age. The field is
| extremely inequitable, full of nepotism and "who you
| know". Almost every starry-eyed film student winds up
| doing drudge work for the rest of their lives. Most will
| never make a feature to match their ambition.
|
| If the whole task was to simply convey my thoughts and
| dreams to others, why am I scrambling around to sign
| location rights, capture photons on expensive glass, and
| then smear and splice things together for months on end?
| This is ceremonial and soon to be anachronistic. I'm glad
| that whole mess is going to be replaced. It's a farce.
|
| To phrase it another way - would you like to be hand-
| writing assembly on punch cards? To only gain entrance
| into the field with your mathematics PhD?
|
| To speak of the liberty and the economics, why should I
| have to sell the rights to my idea to a studio so I can
| get it off the ground? Why should I have to obey the
| studio's rules and mind their interference?
|
| This whole Gen AI thing is going to be the biggest
| liberating moment for filmmaking creatives. I know,
| because I am one.
|
| And if you think any Jack or Jill can just come in and
| text prompt a whole movie, you're crazy. It's still hard
| work and a metric ton of good taste.
|
| Art will never die. It's the human soul. It'll take more
| than some tech bros with GPUs to kill it.
|
| AI is just another tool for the artist. A "bicycle for
| the mind" to quote Jobs, and a rocket ship for the
| imagination to convey my own direct experience.
| sangnoir wrote:
| > And if you think any Jack or Jill can just come in and
| text prompt a whole movie, you're crazy. It's still hard
| work and a metric ton of good taste.
|
| Yeah, I can't wait for ChuChuTV to get the best film Oscar
| /s.
| wizzwizz4 wrote:
| > _And if you think any Jack or Jill can just come in and
| text prompt a whole movie, you're crazy. It's still hard
| work and a metric ton of good taste._
|
| If you want anything _good_, yes. If you just want
| _something_... I reckon it'd take a week to assemble an
| incomprehensible-nonsense-film pipeline, after which it's
| just a matter of feeding the computer electricity.
|
| Short-term, this is going to funnel resources _away_ from
| the people with good taste. Long-term, it might help
| collapse the entire "creative industry", after which we
| might get some of that artist liberation stuff you're
| talking about - but we might just end up with new
| gatekeeping strategies from the wealthy and connected,
| and business as usual.
| echelon wrote:
| > If you want anything good, yes. If you just want
| something ...
|
| You don't even need AI for that.
|
| https://en.wikipedia.org/wiki/YouTube_poop
|
| https://en.wikipedia.org/wiki/Skibidi_Toilet
|
| The idea that AI isn't going to be used as a creative
| tool too and that it won't lead to more and better art is
| a defeatist, Luddite attitude.
|
| Similarly shaped people thought that digital cameras
| would ruin cinema and photography.
|
| > Short-term, this is going to funnel resources away from
| the people with good taste.
|
| On the contrary - every budding film student will soon
| [1] be able to execute on their entire visions straight
| out of the gates. No decades of clawing their way to a
| very limited, almost impossible to reach peak.
|
| > it might help collapse the entire "creative industry"
|
| The studio system. Not the industry.
|
| > new gatekeeping strategies from the wealthy and
| connected, and business as usual.
|
| Creatives have more ways of building brands and
| followings for themselves than ever before. It's one of
| the largest growing sectors of the economy, and lots of
| people are earning livings off of it.
|
| You'll be able to follow that steampunk vampire creator
| that's been missing from the world until now. Every long
| tail interest will be catered to. Even the most obscure
| and wild tastes, ideas, and designs. Stuff that would
| never get studio funding.
|
| As a creative, I'm overjoyed by this. My friends and I
| are getting to create things we never could make before
| [2].
|
| [1] This and next year.
|
| [2] Just an inspiration / aesthetic sample, but we're
| making a full film: https://imgur.com/a/JNVnJIn
| renegade-otter wrote:
| > Similarly shaped people thought that digital cameras
| would ruin cinema and photography.
|
| Obviously, but you seem to be arguing that AI is just
| another evolution of productivity tools. You still need
| to have a photographer's eye while using this technology.
|
| If you couldn't make a good composition on film, a
| digicam will not save you, and it definitely did not
| _replace_ photographers. Perhaps lowered the barrier of
| entry for prosumers.
|
| https://www.nytimes.com/2023/12/26/opinion/ai-future-
| photogr...
| echelon wrote:
| We're arguing the same point. :)
| crmd wrote:
| >You'll be able to follow that steampunk vampire creator
| that's been missing from the world until now. Every long
| tail interest will be catered to. Even the most obscure
| and wild tastes, ideas, and designs. Stuff that would
| never get studio funding.
|
| Your optimism reminds me of the optimism I had around the
| early internet. Power to the people, long tail, rise of
| the creative class, the fall of gatekeeping corporations,
| etc.
|
| It was like that for a couple of years in the late 90s
| before power and control got vastly more centralized than
| before. Maybe this time it'll be different.
| munificent wrote:
| The big difference is that back then, anyone with a
| consumer-level computer in their bedroom could turn it
| into a server and be a first-class citizen on the
| Internet.
|
| With generative AI, models will be controlled by a
| handful of giant corporations who have the enormous
| corpuses (of dubious provenance) and compute ability to
| train them.
|
| So it will be like last time, but even worse.
| echelon wrote:
| You can run ComfyUI and AnimateDiff on your PC. If you
| haven't checked them out, please do.
|
| And there are other angles to consider. Apple, for one,
| is expressly interested in not becoming a thin client to
| cloud AI. They're baking a lot of inference power into
| their chips. If the creative class don't need their
| devices, that doesn't bode well for them...
| munificent wrote:
| Running local models isn't the same as being able to
| train them from scratch yourself on a corpus of your own
| choosing.
| echelon wrote:
| There are so many ways to do exactly this too!
|
| FakeYou, CivitAi, WeightsGg, Comflowy, ... -- there are
| tons of vibrant communities to teach you everything you
| need to know. The tools are open source, free to use, and
| accessible.
|
| This isn't hard at all once you dive in.
| wizzwizz4 wrote:
| Many YouTube Poops are artistic expression (e.g.
| https://redirect.invidious.io/watch?v=dO4eIEvHjSw).
| Skibidi Toilet is _definitely_ artistic expression: it's
| a full-on _epic_. (Reactions from one ~50-year-old:
| "baffling", "how did they do that?", "why would anyone make
| this?")
|
| If you think the Luddites were defeatist, you don't know
| much about the Luddites.
|
| > _On the contrary - every budding film student will soon
| [1] be able to execute on their entire visions straight
| out of the gates._ [...] _Creatives have more ways of
| building brands and followings for themselves than ever
| before._
|
| Yet, we have no shortage of starving artists. Will AI
| provide them food and shelter?
|
| This is unequivocally a win for creative expression _for
| hobbyists_ , but it stands to harm professionals - at
| least in the short term, perhaps longer-term. It's not
| happening in a vacuum: the greedy are revoking
| livelihoods because they think AI can do it faster and
| cheaper (laundering appropriated hobbyist and
| increasingly-cheap professional labour).
|
| > _The studio system. Not the industry._
|
| Huh, the word 'industry' has a specialised meaning in
| economics. Didn't know that.
| munificent wrote:
| Call me crazy, but I don't think churning butter and
| writing a novel are in the same category of human
| endeavor at all.
| dist-epoch wrote:
| > The field is extremely inequitable, full of nepotism
| and "who you know"
|
| Maybe, but it's never been cheaper to make a movie.
|
| I know someone with no connections and (almost) no money
| who in 4 years made multiple no. 1 box-office films
| (obviously not in the US, in a smaller country) and then
| got picked up by Netflix.
| tayo42 wrote:
| What does video not be in the future mean? In social media
| tiktok and reels are everywhere?
| michaelt wrote:
| There are reports [1] that a bunch of companies like
| "College Humor" were convinced to switch to producing
| native video for facebook (instead of directing users to
| their own sites) on the basis of bullshit metrics from
| facebook, and had an extremely bad time as a result, with
| some companies going bankrupt.
|
| Something like counting an autoplaying video that ran for 3
| seconds as a 'view' IIRC
|
| [1]
| https://twitter.com/adamconover/status/1183209875859333120
| scubbo wrote:
| Thankfully, Dropout (a spin-off of College Humor) is
| alive and well, and producing some of the best D&D Actual
| Play series as well as other non-D&D comedy shows. One of
| the entertainment services that I happily pay for because
| I want to support what they're doing.
| neon_electro wrote:
| They are referring to Facebook/Meta's 2015 "pivot to
| video", speculating there may be a similar thing happening
| more recently with AI.
|
| https://en.wikipedia.org/wiki/Pivot_to_video
| tayo42 wrote:
| Interesting thanks!
|
| Feels like, in hindsight, maybe they were just too early
| to it.
| neuronexmachina wrote:
| TIL. Reading up on it a little, I'm surprised the class-
| action settlement was just $40M:
| https://www.videoadvertisingsettlement.com/
| foobarian wrote:
| There is still hope then for cheap gaming GPUs some day soon!
| I have pretty much the last 10 years of flagship releases to
| catch up on...
| gingergoat wrote:
| The article doesn't mention MTIA, Meta's custom ASIC for training
| & inference acceleration. https://ai.meta.com/blog/meta-training-
| inference-accelerator...
|
| I wonder if they will use it in RSC.
| dazhbog wrote:
| Searched H100 and an Amazon link popped up. Good reviews.
|
| https://www.amazon.com/Tesla-NVIDIA-Learning-Compute-Graphic...
| mejutoco wrote:
| Those reviews are hilarious
| zerop wrote:
| > At Meta, we handle hundreds of trillions of AI model executions
| per day
|
| Such a large number -- does it make sense?
| pants2 wrote:
| Perhaps there's some combinatorics where every time an ad or
| post is displayed to the user, it runs through some
| hundreds/thousands of candidates and computes their relevance.
| GeneralMayhem wrote:
| Sure. 100T/day * 1day/86400sec ~= 1B/sec. They're probably
| considering at least a few hundred candidates per impression,
| and every impression is going to go through _at least_ two
| models (relevance and pCTR/revenue), so you could get there
| just with online serving at 5Mqps, which is plausible. But
| they're also going to be doing a lot of stuff in batch - spam
| predictions, ad budget forecasts, etc - so that every candidate
| actually runs through four or five different models, and every
| actual impression could do more than that.
| dakiol wrote:
| What's an "AI model execution"? When I ask something to ChatGPT
| and it answers to me, does that count as 1 "AI model execution"
| for OpenAI?
| sangnoir wrote:
| How many ads does Meta serve a day, and how many AI model
| executions are done for each one? Repeat the same for stories,
| post and comment recommendations on Facebook and Instagram, and
| you have _very_ big numbers. To that, add VR, internal modeling
| and other backoffice / offline analyses over billions of users
| and you'll easily get into the trillions.
| danielhanchen wrote:
| float8 got a mention! x2 more FLOPs! Also xformers has 2:4
| sparsity support now so another x2? Is Llama3 gonna use like
| float8 + 2:4 sparsity for the MLP, so 4x H100 float16 FLOPs?
| PyTorch has experimental fp8 support, whilst attention is still
| complex to do in float8 due to precision issues, so maybe
| attention is in float16, and RoPE / layernorms in float16 /
| float32, whilst everything else is float8?
| GamerAlias wrote:
| I was thinking why is this one guy on HN so deeply interested
| and discussing technical details from a minor remark. Then I
| clocked the name. Great work on Gemma bugs
| danielhanchen wrote:
| Oh thanks :) I always like small details :)
| andy99 wrote:
| Is there float8 support in any common CPU intrinsics? It sounds
| interesting but curious what will be the impact if any on CPU
| inference.
| ashvardanian wrote:
| Nope. Moreover, simulating it even with AVX-512 is quite an
| experience. Been postponing it for 2 years now... But first
| of all, you need to choose the version of float8 you want to
| implement, as the standards differ between GPU vendors.
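The two common FP8 variants (OCP E4M3 and E5M2) split the same 8 bits differently, which is why the vendor-standard choice matters; a simplified pure-Python decoder illustrates it (special values like NaN/inf are ignored, so this is illustrative, not spec-complete):

```python
def decode_fp8(byte, exp_bits, man_bits):
    """Decode an 8-bit float with the given exponent/mantissa split.
    Simplified: NaN/inf special encodings are not handled."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    bias = (1 << (exp_bits - 1)) - 1
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    if exp == 0:  # subnormal numbers
        return sign * (man / (1 << man_bits)) * 2.0 ** (1 - bias)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

# The same byte means different values under the two formats:
b = 0x3C
print(decode_fp8(b, exp_bits=4, man_bits=3))  # E4M3 -> 1.5
print(decode_fp8(b, exp_bits=5, man_bits=2))  # E5M2 -> 1.0
```

E4M3 trades exponent range for mantissa precision, E5M2 the reverse; hardware that only implements one cannot bit-exactly reproduce the other.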
| ipsum2 wrote:
| You're still bounded by memory bandwidth, so adding multiples
| to FLOPs is not going to give you a good representation of
| overall speedup.
| jabl wrote:
| Well, those smaller floats require less BW to transfer back
| and forth as well. Perhaps not a reduction linear in the size
| of the float, as maybe smaller floats require more iterations
| and/or more nodes in the model graph to get an equivalent
| result.
|
| But rest assured there's an improvement, it's not like people
| would be doing it if there wasn't any benefit!
| andy99 wrote:
| The impact on bandwidth is the main reason smaller is
| better, I believe, certainly when it's the bottleneck. I'm
| only really familiar with CPU, but with say FP16 you might
| convert back to FP32 when you're doing the actual
| multiplication (so conversion plus multiplication is
| actually slower), but because you're moving half the data
| on and off you still get a huge speedup.
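The store-narrow / compute-wide pattern described above can be shown with the standard library alone (a toy sketch, not production code; `struct`'s `'e'` format is IEEE half precision):

```python
import struct

weights = [0.5, 1.25, -3.0, 0.1]  # toy "model weights"

# Store in half precision: 2 bytes per value instead of 4 for float32,
# so half the bytes move across the memory bus.
packed16 = struct.pack(f'<{len(weights)}e', *weights)
packed32 = struct.pack(f'<{len(weights)}f', *weights)
print(len(packed16), len(packed32))  # 8 16

# At compute time, unpack back to full precision and do the math there,
# mirroring the load-FP16 / multiply-in-FP32 pattern described above.
loaded = struct.unpack(f'<{len(weights)}e', packed16)
print(loaded[0])         # 0.5 survives the round trip exactly
print(loaded[3] == 0.1)  # False: 0.1 is rounded by the FP16 round trip
```

The last line also shows the cost: values that aren't exactly representable in the narrow format pick up rounding error on the way through.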
| j45 wrote:
| Is it safe to assume this is the same float16 that exists in
| Apple M2 chips but not M1?
| elwell wrote:
| > Meta's long-term vision is to build artificial general
| intelligence (AGI)
| latchkey wrote:
| > we have successfully used both RoCE and InfiniBand clusters for
| large, GenAI workloads (including our ongoing training of Llama 3
| on our RoCE cluster) without any network bottlenecks.
|
| Interesting dig on IB. RoCE is the right solution since it is
| open standards and more importantly, available without a 52+ week
| lead time.
| loeg wrote:
| Yeah, and RoCE isn't single vendor. I'm not sure IB scales to
| the relevant cluster sizes, either.
| anonymousDan wrote:
| Is NVLink just not scalable enough here?
| loeg wrote:
| I don't know. I haven't actually worked with IB in this
| specific space (or since before Nvidia acquired MLNX). My
| experience with RoCE/IB was for storage cluster backend in
| the late 2010s.
| fuddle wrote:
| How much are they paying for H100's? If they are paying $10k:
| 350,000 NVIDIA H100 x $10k = $3.5b
| ZiiS wrote:
| They may have to pay a premium to secure ~1/4 of the output;
| certainly unlikely to be that steep a discount.
| theptip wrote:
| SemiAnalysis posted recently noting that Meta locked in
| these purchases a while ago, something like a year or more,
| so they probably didn't pay today's spot rate.
| YetAnotherNick wrote:
| > $3.5b
|
| Which is a fourth of what they spent on VR/AR in a year. And
| GenAI is something they could easily get more revenue from,
| as it has now become proven technology, and Meta could
| possibly leapfrog others because of the data moat.
| NBJack wrote:
| What moat exactly? Much of the user data they have access to
| is drying up due to new regulations, some of which prohibit
| IIRC direct use on models as well. I'm not even sure they can
| use historical data.
|
| Meta certainly has an edge in engineer count, undoubtedly.
| But I'd say they really, really want the metaverse to succeed
| more to have their own walled garden (i.e. equivalent power of
| Apple and Google stores, etc.). There's a reason they gave a
| hard pass to a Google partnership.
| Dr_Birdbrain wrote:
| I think the raw text inside Facebook groups is at least as
| valuable as Reddit data. Even if demographics data is
| restricted under European law, the raw text of people
| interacting is quite valuable.
| calvinmorrison wrote:
| Facebook's downfall will be their lock-in. Every other
| social media platform lets you view a public profile,
| discussion groups, etc. It's all locked inside Facebook.
| verticalscaler wrote:
| Indeed, my deranged auntie posting on FB is approximately
| as valuable as my ADHD/PTSD quaranteeny nephew redditing.
| YetAnotherNick wrote:
| > Much of the user data they have access to is drying up
| due to new regulations, some of which prohibit IIRC direct
| use on models as well.
|
| Source would be appreciated, because this is the opposite of
| obvious. Regulations against using public first-party data
| would be big news, and I haven't heard of anything like that.
| They use my data for recommending my feed, so why not for
| answering my questions?
| dougb5 wrote:
| Proven technology, maybe, but proven product-market fit for
| the kinds of things Facebook is using it for? Their linked
| blog about AI features gives examples like "AI stickers" and image
| editing... cool, but are these potential multi-billion dollar
| lifts to their existing business? I guess I'm skeptical it's
| worthwhile unless they're able to unseat ChatGPT with a
| market-leading general purpose assistant.
| pests wrote:
| I have a few group chats that just devolve into hours of
| sending stickers or image generation back and forth. Lately
| we've been "writing a book together" with @Meta AI as the
| ghost writer, and while it utterly sucks, it's been a
| hilarious shared experience.
|
| I don't think anyone else has nailed the group chat with
| AI thing like this.
| TaylorAlexander wrote:
| On the podcast TrashFuture, November Kelly recently
| described AI systems as "garbage dispensers" which is
| both a funny image (why would anyone make a garbage
| dispenser??) and an apt description. Certainly these
| tools have some utility, but there are a load of startups
| claiming to "democratize creativity" by allowing anyone
| to publish AI generated slop to major platforms. On the
| podcast this phrase was used during discussion of a
| website which lets you create AI generated music and push
| it to Spotify, a move which Spotify originally pushed
| back on but has now embraced. Garbage dispenser indeed.
| YetAnotherNick wrote:
| > unseat ChatGPT with a market-leading general purpose
| assistant.
|
| It's not impossible. The prediction from many (not that I
| believe it) is that over the long run modelling tricks will
| become common knowledge, and the only thing that matters is
| compute and data, both of which Meta has.
|
| Also there could be a trend of LLMs for ads or feed
| recommendation in the future, as they have a large, completely
| unstructured dataset per user across multiple sites.
| cj wrote:
| Compute, data, and most importantly distribution/users.
|
| IMO standalone AI companies like OpenAI might be
| successful by providing infrastructure to other
| companies, but I can't imagine ChatGPT remaining #1 many
| years from now.
|
| The web is still trending towards being a walled garden.
| Maybe not right now, but long term I think people will
| use whatever AI is most convenient which probably will be
| AI built into a giant company with established user base
| (FB, GOOG, MSFT, and Apple if they ever get around to
| launching - would love Siri 2.0 if it meant not needing
| to open the ChatGPT iOS app)
| trsohmers wrote:
| Significantly more than that; MFN pricing for NVIDIA DGX H100
| (which has been getting priority supply allocation, so many
| have been suckered into buying them in order to get fast
| delivery) is ~$309k, while a basically equivalent HGX H100
| system is ~$250k, coming to a price per GPU at the full server
| level being ~$31.5k. With Meta's custom OCP systems integrating
| the SXM baseboards from NVIDIA, my guess is that their cost per
| GPU would be in the ~$23-$25k range.
| fuddle wrote:
| 350,000 NVIDIA H100 x $23k = $8b :0
| verticalscaler wrote:
| Wait till you find out how much they spent on VR.
|
| It is a real loophole in the economy. If you're a trillion
| dollar company the market will insist you set such sums on
| fire just to be in the race for $current-hype. If they do,
| it drives their market cap higher still, and if they don't,
| they risk being considered un-innovative and therefore
| doomed to irrelevancy, with the market cap spiralling
| downwards.
|
| Sort of reminds me of The Producers.
| oblio wrote:
| The thing is, this could be considered basic research,
| right? Basic research IS setting money on fire until (and
| if) that basic research turns into TCP/IP, Ethernet and
| the Internet.
| verticalscaler wrote:
| I wish.
|
| Funnily enough Arpanet and all that Xerox stuff were like
| <$50 million (inflation adjusted!) total. Some real
| forward thinkers were able to work the system by breaking
| off a tiny pittance of a much larger budget.
|
| Whereas I think this more appropriately can be
| considered the Meta PR budget. They simply can't not
| spend it; it would look bad for Wall Street. Have to
| keep up with the herd.
| lotsofpulp wrote:
| > If you're a trillion dollar company the market will
| insist you set such sums on fire just to be in the race
| for $current-hype. If they do it drives their market cap
| higher still and if they don't they risk being considered
| un-innovative and therefore doomed to irrelevancy and the
| market cap will spiral downwards.
|
| You don't think earning increasing amounts of tens of
| billions of dollars in net income per year at some of the
| highest profit margins in the world at that size for 10+
| years has anything to do with market cap?
| dekhn wrote:
| That sounds like a reasonable budget for 3 years of hardware at
| a major AI company.
| vineyardmike wrote:
| It's often forgotten now, but just a few years ago NVIDIA was
| cancelling production batches and writing down inventory when
| the GPU shortage cleared. No one needed more GPUs. It also
| happens to be when Meta first announced they were going to
| increase CapEx spending on compute.
|
| I'm guessing that Meta got a sweetheart deal to help take a lot
| of inventory for NVidia and make commitments for future
| purchases.
| loeg wrote:
| Yes, billions in GPU cap ex.
| froonly wrote:
| lmfao at the Meta folks not giving any credit whatsoever to the
| company that actually came up with and implemented the
| infrastructure work.
| jfkfif wrote:
| What's the company?
| sangnoir wrote:
| Facebook.
| zone411 wrote:
| Meta is still playing catch-up. Might be hard to believe but
| according to Reuters they've been trying to run AI workloads
| mostly on CPUs until 2022 and they had to pull the plug on the
| first iteration of their AI chip.
|
| https://www.reuters.com/technology/inside-metas-scramble-cat...
| axpy906 wrote:
| Definitely has some PR buzz and flex in the article. Now I see
| why.
| pwb25 wrote:
| So tired of this; not everyone needs to work on AI stuff. Work
| on Facebook, which is a disaster of a page, instead.
| delegate wrote:
| Subtitled 'Here's what you'll never be able to do'.
| ilaksh wrote:
| "Everything You Wanted to Know About GenAI at Meta, Except the
| One Thing You Honestly Care About" (Llama 3).
| wseqyrku wrote:
| > Commitment to open AI innovation
|
| I see what you did there, Meta.
| owenpalmer wrote:
| Haha, I noticed that too xD
| delanyoyoko wrote:
| You've got to read "open" roughly 3x in a paragraph.
| papichulo2023 wrote:
| If they release models I dont care honestly, they can brag
| about that as much as they want.
| dekhn wrote:
| it's really interesting just how similar these systems are to the
| designs adopted for HPC over the past few decades. I'm salty
| because it took a while for the ML community to converge on this
| (20+K GPUs connected by a real fabric with low latency and high
| bandwidth).
| mrkramer wrote:
| "Share this: Hacker News" Noice
| BonoboIO wrote:
| I thought at first, "what are you talking about", then I
| checked my uBlock filters. They were blocking the whole
| "Share this" content section.
|
| Sharing on Hacker News ... they know their audience.
| mrkramer wrote:
| I also use uBlock, but my filters are the default ones, and I
| saw it without any problem. Tbh this is the first time I've
| seen a post on the Web have HN as a share option, or at least
| the first time I was surprised to see it. Maybe it has
| something to do with Google ranking "trusted human
| information and knowledge" higher than "non-human"
| information and knowledge[0], or simply some Meta software
| engineer loves and uses HN, so s/he decided to include HN as
| well, idk.
|
| [0] https://news.ycombinator.com/item?id=39423949
| jvanderbot wrote:
| So, I'd love to work on optimizing pipelines like this. How does
| one "get into" it? It seems a ML scientist with some C/C++ and
| infra knowledge just dips down into the system when required? Or
| is it CUDA/SIMD experts who move "up" into ML?
| KaiserPro wrote:
| A lot of the optimisation at this level is getting data into
| the right place at the right time, without killing the network.
|
| It's also a group effort to provide simple-to-use primitives
| that "normal" ML people can use, even if they've never used
| hyper-scale clusters before.
|
| So you need a good scheduler that understands dependencies (no,
| the k8s scheduler(s) are shit for this, plus it won't scale past
| 1k nodes without eating all of your network bandwidth), then
| you need a dataloader that can provide the dataset access, then
| you need the IPC that allows sharing/joining of GPUs together.
|
| All of that needs to be wrapped up into a python interface
| that's fairly simple to use.
|
| Oh, and it needs to be secure and pass an FTC audit (i.e. you
| need to prove that no user data is being used), and have high
| utilisation efficiency and uptime.
|
| The model stuff is the cherry on top.
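A toy sketch of the dependency-aware part of the scheduler KaiserPro describes (names and structure are made up for illustration; a real cluster scheduler adds priorities, preemption, and gang-scheduling of GPUs):

```python
from collections import deque

def schedule(jobs):
    """Topologically order jobs given a {job: [dependencies]} mapping.
    Only models the dependency-resolution part of a batch scheduler."""
    indegree = {job: len(deps) for job, deps in jobs.items()}
    dependents = {job: [] for job in jobs}
    for job, deps in jobs.items():
        for dep in deps:
            dependents[dep].append(job)

    # Kahn's algorithm: repeatedly run whatever has no unmet dependencies.
    ready = deque(job for job, n in indegree.items() if n == 0)
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for nxt in dependents[job]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(jobs):
        raise ValueError("dependency cycle in job graph")
    return order

jobs = {"preprocess": [], "train": ["preprocess"], "eval": ["train"]}
print(schedule(jobs))  # ['preprocess', 'train', 'eval']
```

In practice the "run" step would submit each ready job to the cluster and feed completions back in, rather than emitting a static order up front.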
| jvanderbot wrote:
| Ok, but back to my main question, how do I get into this?
| willsmith72 wrote:
| It looks more like an infra problem than ML. "Software
| architect"s mixed with devops/infra/sre people
| jvanderbot wrote:
| Well since I'm not a ML engineer of any kind - that's
| good!
| zooq_ai wrote:
| at the end of the day, you are still moving, storing and
| manipulating 1's and 0's, whether you are a front end
| engineer or a backend engineer or a systems engineer or an
| ML engineer or an infra engineer
| seydor wrote:
| This is great news for Nvidia and their stock, but are they sure
| the LLMs and image models will scale indefinitely? Nature and
| biology have a preference for sigmoids. What if we find out that
| AGI requires different kinds of compute capabilities?
| jiggawatts wrote:
| If anything, NVIDIA H100 GPUs are _too_ general purpose! The
| optimal compute for AI training would be more specialised, but
| then would be efficient at only one NN architecture. Until we
| know what the best architecture is, the general purpose
| clusters remain a good strategy.
| pinko wrote:
| The link mentions "our internal job scheduler" and how they had
| to optimize it for this work -- does anyone know what this job
| scheduler is called, or how it works?
| KaiserPro wrote:
| It might be Twine:
| https://www.usenix.org/system/files/osdi20-tang.pdf
|
| but I suspect it's not that, because Twine is optimised for
| services rather than batch processing, and doesn't really have
| the concept of priorities.
| radicality wrote:
| I would think it's probably that. Also, has this been renamed
| to Twine from Tupperware?
| benreesman wrote:
| I think it's always useful to pay attention to the history on
| stuff like this and it's a rare pleasure to be able to give some
| pointers in the literature along with some color to those
| interested from first-hand experience.
|
| I'd point the interested at the DLRM paper [1]: that was just
| after I left and I'm sad I missed it. FB got into disagg racks
| and SDN and stuff fairly early, and we already had half-U dual-
| socket SKUs with the SSD and (increasingly) even DRAM elsewhere
| in the rack in 2018, but we were doing _huge_ NNs for
| recommenders and rankers even then. I don't know if this is
| considered proprietary so I'll play it safe and just say that a
| click-prediction model on IG Stories in 2018 was on the order of
| a modest but real LLM today (at FP32!).
|
| The crazy part is they were HOGWILD trained on Intel AVX-2, which
| is just wild to think about. When I was screwing around with CUDA
| kernels we were time sharing NVIDIA dev boxes, typically 2-4
| people doing CUDA were splitting up a single card as late as
| maybe 2016. I was managing what was called "IGML Infra" when I
| left and was on a first-name basis with the next-gen hardware
| people and any NVIDIA deal was still so closely guarded I didn't
| hear more than rumors about GPUs for training let alone
| inference.
|
| 350k Hopper this year, Jesus. Say what you want about Meta but
| don't say they can't pour concrete and design SKUs on a dime:
| best damned infrastructure folks in the game pound-for-pound to
| this day.
|
| The talk by Thomas "tnb" Bredillet in particular I'd recommend:
| one of the finest hackers, mathematicians, and humans I've ever
| had the pleasure to know.
|
| [1] https://arxiv.org/pdf/1906.00091.pdf
|
| [2] https://arxiv.org/pdf/2108.09373.pdf
|
| [3] https://engineering.fb.com/2022/10/18/open-source/ocp-
| summit...
|
| [4] https://youtu.be/lQlIwWVlPGo?si=rRbRUAXX7aM0UcVO
| junim wrote:
| Suckers, lol. You're all going to see who will be the biggest
| hacker in the world, lol.
___________________________________________________________________
(page generated 2024-03-12 23:00 UTC)