[HN Gopher] Building the heap: racking 30 petabytes of hard driv...
___________________________________________________________________
Building the heap: racking 30 petabytes of hard drives for
pretraining
Author : nee1r
Score : 223 points
Date : 2025-10-01 15:00 UTC (7 hours ago)
(HTM) web link (si.inc)
(TXT) w3m dump (si.inc)
| g413n wrote:
| No mention of disk failure rates? curious how it's holding up
| after a few months
| ClaireBookworm wrote:
| good point
| bayindirh wrote:
| The disk failure rates are very low compared to a decade
| ago. I used to change more than a dozen disks every week back
| then. Now a failure is an eyebrow-raising event, one I seldom
| see.
|
| I think following Backblaze's hard disk stats is enough at this
| point.
| gordonhart wrote:
| Backblaze reports an annual failure rate of 1.36% [0]. Since
| their cluster uses 2,400 drives, they would likely see ~32
| failures a year (an extra ~$4,000 in annual capex, almost
| negligible).
|
| [0] https://www.backblaze.com/cloud-storage/resources/hard-
| drive...
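
To sanity-check the arithmetic above, a quick sketch; the ~$125
per used 12 TB drive is an assumption taken from pricing cited
later in the thread:

    # Back-of-the-envelope replacement cost from the figures above.
    drives = 2400
    afr = 0.0136                 # Backblaze annual failure rate
    price_per_drive = 125        # USD, assumed used-drive price
    failures = drives * afr      # ~32.6 drives/year
    print(f"~{failures:.0f} failures/year, "
          f"~${failures * price_per_drive:,.0f}/year in replacements")
    # -> ~33 failures/year, ~$4,080/year in replacements
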
| joering2 wrote:
| Their rate will probably be higher since they are utilizing
| used drives. From the spec:
|
| 2,400 drives. Mostly 12TB used enterprise drives (3/4 SATA,
| 1/4 SAS). The JBOD DS4246s work for either.
| antisthenes wrote:
| Not necessarily, since disk failures are typically
| U-shaped.
|
| Buying used drives eliminates the high rate of early
| failure (but does get you a bit closer to the 2nd part of
| the U-curve).
|
| Typically most drives would become more obsolete before
| hitting the high failure rate of the right side of the
| U-curve from longevity (7+ years)
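
For intuition, a toy hazard-rate sketch of that U-shaped
(bathtub) curve; the parameters are purely illustrative, not
fitted to any real drive data:

    import math

    def hazard(t_years, infant=0.05, tau=0.5, base=0.01,
               wear=0.02, onset=7.0):
        # decreasing infant-mortality term + constant baseline +
        # wear-out term that ramps up after ~7 years
        early = infant * math.exp(-t_years / tau)
        late = wear * max(0.0, t_years - onset) ** 2
        return early + base + late

    for t in [0.1, 1, 3, 5, 7, 9]:
        print(f"year {t:>4}: annualized failure rate ~{hazard(t):.3f}")

Buying used drives effectively starts you past the `early` term,
on the flat bottom of the curve.
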
| cjaackie wrote:
| They mentioned the cluster uses used enterprise drives. I can
| see the desire to save money, but I agree: that is going to be
| one expensive mistake down the road.
|
| I should also note that for home cluster use, I personally
| learned quickly that used drives didn't seem to make sense.
| Too much performance variability.
| g413n wrote:
| in a datacenter context failure rates are just a remote-hands
| recurring cost so it's not too bad with front-loaders
|
| e.g. have someone show up to the datacenter with a grocery
| list of slot indices and a cart of fresh drives every few
| months.
| guywithahat wrote:
| Used drives make sense if maintaining your home server is a
| hobby. It's fun to diagnose and solve problems in home
| servers, and failing drives give me a reason to work on the
| server. (I'm only half-joking, it's kind of fun)
| jms55 wrote:
| If I remember correctly, most drives either:
|
| 1. Fail in the first X amount of time
|
| 2. Fail towards the end of their rated lifespan
|
| So buying used drives doesn't seem like the worst idea to me.
| You've already filtered out the drives that would fail
| early.
|
| Disclaimer: I have no idea what I'm talking about
| g413n wrote:
| we don't have perfect metrics here but this seems to match
| our experience; a lot of failures happened shortly after
| install before the bulk of the data download onto the heap,
| so actual data loss is lower than hardware failure rates
| frakkingcylons wrote:
| [delayed]
| dboreham wrote:
| Over in hardware-land we call this "the bathtub curve".
| dylan604 wrote:
| I've mentioned this story before, but we had massive drive
| failures when bringing up multiple disk arrays. We got them
| racked on a Friday afternoon, and then I wrote a quick and
| dirty shell script to read/write data back and forth between
| them over the weekend, set to kick in after they finished
| striping the RAID arrays. By quick and dirty I mean there was
| no logging, just a bunch of commands saved as .sh. Came in
| on Monday to find massive failures in all of the arrays, but no
| insight into when they failed: during the stripe, or during
| the stress testing. It was close to a 50% failure rate. Turned
| out to be a bad batch from the factory. Multiple customers of
| our vendor were complaining. All the drives were replaced by
| the manufacturer. It just delayed the storage being available
| to production. After that, not one of them failed in the next
| 12 months before I left for another job.
| jeffrallen wrote:
| > next 12 months before I left for another job
|
| Heh, that's a clever solution to the problem of managing
| storage through the full 10 year disk lifecycle.
| ClaireBookworm wrote:
| great write up, really appreciate the explanations / showing the
| process
| nharada wrote:
| So how do they get this data to the GPUs now...? Just run it over
| the public internet to the datacenter?
| bayindirh wrote:
| They can rent dark fiber for themselves for that distance,
| and it'll be cheap.
|
| However, as they noted, they use 100 Gbps of capacity from
| their ISP.
| nee1r wrote:
| We want to get darkfiber from the datacenter to the office. I
| love 100Gbps
| dylan604 wrote:
| I'm now envisioning a poster with a strand of fiber wearing
| aviators with large font size Impact font reading Dark
| Fiber with literal laser beams coming out of the eyes.
| geor9e wrote:
| Does San Francisco really still have dark fiber? That 90s
| bubble sure did overshoot demand.
| madsushi wrote:
| DWDM tech improvements have outpaced nearly every other
| form of technology growth, so the same single pair of fiber
| that used to carry 10 Mbps can now carry 20 Tbps, which is
| a 2,000,000x multiplier. The same somewhat-fixed supply of
| fiber can go a very long way today, so the price pressure
| for access is less than you might expect.
| dpe82 wrote:
| I think these days folks say "dark fiber" for any kind of
| connection you buy. It bothers me too.
| bayindirh wrote:
| I meant a "single-mode, non-terminated fiber optic cable
| from point to point". In other words, your own cable
| without any other traffic on it.
|
| A shared one will be metro Ethernet in my parlance.
| g413n wrote:
| $7.5k for Zayo 100 gig, so that's like half of the MRC
| (monthly recurring cost)
| nee1r wrote:
| yeah, exactly! we have a 100G uplink, and then we use nginx
| secure links that we then just curl from the machines using
| HTTP. (funnily HTTPS adds overhead so we just pre-sign URLs)
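
For readers unfamiliar with the pattern: nginx's secure_link
module validates an MD5 over the expiry, URI, and a shared
secret, so pre-signing is a few lines on the issuing side. A
minimal sketch, assuming a config like the one in the comments;
the secret, hostname, and path layout are hypothetical:

    # Assumed nginx config on the storage node:
    #     secure_link $arg_md5,$arg_expires;
    #     secure_link_md5 "$secure_link_expires$uri SECRET";
    #     if ($secure_link = "") { return 403; }
    #     if ($secure_link = "0") { return 410; }
    import base64, hashlib, time

    SECRET = "not-the-real-secret"   # hypothetical

    def presign(uri: str, ttl_s: int = 3600) -> str:
        expires = int(time.time()) + ttl_s
        digest = hashlib.md5(f"{expires}{uri} {SECRET}".encode()).digest()
        token = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
        return (f"http://storage-node.example{uri}"
                f"?md5={token}&expires={expires}")

    # The machines then just curl the result over plain HTTP.
    print(presign("/shard-0042/video-000123.tar"))
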
| miniman1337 wrote:
| Used disks, no DR; not exactly a real shootout.
| nee1r wrote:
| True, though this is specifically for pretraining data (S3
| wouldn't sell us used disk + no DR storage).
| p_ing wrote:
| You're in a seismically active part of the world. Will the
| venture last in a total loss scenario?
| nee1r wrote:
| We're currently 1/1 for the recent 4.3 magnitude earthquake
| (though if SF crumbles we might lose data)
| p_ing wrote:
| 4.3 is a baby quake. I'd hope that you'd be 1/1!
| antonkochubey wrote:
| They spent $300,000 on drives, with AWS they would have
| spent 4x that PER MONTH. They're already ahead of the
| cloud.
| p_ing wrote:
| AWS/cloud doesn't factor into my question whatsoever.
| Loss of equipment is one thing. Loss of all data is quite
| a different story.
| Sanzig wrote:
| I do appreciate the scrappiness of your solution. Used drives
| for a storage cluster is like /r/homelab on steroids. And
| since it's pretraining data, I suppose data integrity isn't
| critical.
|
| Most venture-backed startups would have just paid the AWS or
| Cloudflare tax. I certainly hope your VCs appreciate how
| efficient you are being with their capital :)
| g413n wrote:
| worth stressing that we literally could not afford
| pretraining without this, approx our entire seed round
| would go into cloud storage costs
| leejaeho wrote:
| how long do you think it'll be before you fill all of it and have
| to build another cluster LOL
| nee1r wrote:
| Already filled up and looking to possibly copy and paste :)
| giancarlostoro wrote:
| So, others have asked, and I'm curious myself are you
| sourcing the videos yourselves or third parties?
| tomas789 wrote:
| My guess would be they are running some dummy app like
| quote of the day or something and it records the screen at
| 1fps or so.
| not--felix wrote:
| But where do you get 90 million hours worth of video data?
| myflash13 wrote:
| And not just any video data, they specifically mentioned screen
| recordings for agentic computer use. A very specific kind of
| video. My guess is they have a partnership with someone like
| Rewind.ai
| conception wrote:
| Arrr matey
| mschuster91 wrote:
| Shows how crazy cheap on prem can be. _tips hat_
| nee1r wrote:
| _tips hat back_
| stackskipton wrote:
| Not included is the overhead of dealing with maintenance. S3/R2
| generally don't require dedicated ops staff for care and
| feeding. This type of setup will likely require someone to
| spend 5 hours a week dealing with it.
| mschuster91 wrote:
| I once had about three racks full of servers under my
| control, admittedly they weren't a ton of disks, but still
| the hardware maintenance effort was pretty much negligible
| over a few years (until it all went to the cloud).
|
| The majority of server wrangling work I spent dealing with OS
| updates and, most annoyingly, OpenStack. But that's something
| you can't escape even if you run your stuff in the cloud...
| stackskipton wrote:
| With S3/R2 whatever, you do get away from it. You dump a
| bunch of files on them and then retrieve them. OS Updates,
| Disk Failures, OpenStack, additional hardware? Pssh, that's
| S3 company problem, not yours.
|
| $LastJob we ran a ton of Azure Web App Containers; a lot of
| OS work no longer existed, so it's possible with cloud to
| remove a lot of OS toil.
| nee1r wrote:
| True, this is a large reason why we chose to have the
| datacenter a couple blocks away from the office.
| hanikesn wrote:
| Why 5h a week? Just for hardware?
| datadrivenangel wrote:
| 5h a week is basically 3 days a month. So if you have an
| issue that takes a couple of days per month to fix, which
| seems very fair, you're at that point.
| dpe82 wrote:
| a) 5hrs/week is negligible compared to that potential AWS
| bill.
|
| b) They seem tolerant of failures, so it's not going to be
| anything like 5hrs/week of physical maintenance. It will be
| bursty though (eg. box died, time to replace it...) but
| assuming they have spares of everything sitting around /
| already racked it shouldn't be a big deal.
| buckle8017 wrote:
| And this is actually relatively expensive.
| g413n wrote:
| the doodles are great
| nee1r wrote:
| Thanks! Lots of hard work went into them.
| zparky wrote:
| $125/disk, $12k/mo depreciation cost, which I assume means disk
| failures, so ~100 disks/mo or 1,200/yr, which is half of their
| disks a year - seems like a lot.
| devanshp wrote:
| no, we wanted to be conservative by depreciating somewhat more
| aggressively than that. we have much closer to 5% yearly disk
| failure rates.
| AnotherGoodName wrote:
| It's an accounting term. You need to report the value of assets
| of your company each reporting cycle. This lets you report
| company profit more accurately, since the 2,400 drives likely
| aren't worth what the company originally paid. It's often
| stated as a tax write-off, but people get confused by that term
| (they think X written off == X less tax paid). It's better to
| describe it as a way to more accurately report profit (which
| may end up with less company tax paid, but obviously not 1:1,
| since company tax is not 100%).
|
| So anyway, you basically pretend you resold the drives today.
| Here they are assuming that in 3 years' time no one will pay
| anything for the drives. Somewhat reasonable, to be honest,
| since the setup's bespoke and you'd only get a fraction of the
| value of 3-year-old drives if you resold them.
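
A worked version of the schedule, as a sketch; the $300k drive
figure comes from elsewhere in the thread, and the guess about
what the ~$12k/month covers is mine:

    # Straight-line depreciation over the 3-year schedule above.
    drive_capex = 2400 * 125          # $300,000 in drives
    months = 36
    print(f"${drive_capex / months:,.0f}/month")   # -> $8,333/month
    # Drives alone come to ~$8.3k/month; the ~$12k/month in the
    # post presumably also depreciates servers, shelves, and
    # networking gear.
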
| zparky wrote:
| oh i see, thanks! i might be too used to reading backblaze
| reports :p
| ttfvjktesd wrote:
| The biggest part that is always missing in such comparisons is
| employee salaries. In the calculation they give $354k of total
| cost per year. But now add the cost of staff in SF to operate
| that thing.
| g413n wrote:
| someone has to go and power-cycle the machines every couple of
| months; it's chill. that's the point of not using ceph
| paxys wrote:
| So the drives are never going to fail? PSUs are never going
| to burn out? You are never going to need to procure new
| parts? Negotiate with vendors?
| buckle8017 wrote:
| They mention data loss is acceptable, so I'm guessing
| they're only fixing big outages.
|
| Ignoring failed HDDs will likely mean very little
| maintenance.
| theideaofcoffee wrote:
| This concern trolling that everyone trots out when anyone
| brings up running their own gear is just exhausting. The
| hyperscalers have melted people's brains to the point where
| they can't even fathom running shit for themselves.
|
| Yes, drives are going to fail. Yes, power supplies are
| going to burn out. Yes, god, you're going to get new parts.
| Yes, you will have to actually talk to vendors.
|
| Big. Deal. This shit is -not- hard.
|
| For the amount of money you save by doing it like that, you
| should be clamoring to do it yourself. The concern trolling
| doesn't make any sort of argument against it, it just makes
| you look lazy.
| immibis wrote:
| Very good point. There was something on the HN front page
| like this about self-hosted email, too.
|
| I point out to people that AWS is between _ten_ and _one
| hundred_ times more expensive than a normal server. The
| response is "but what if I only need it to handle peak
| load three hours a day?" _Then you still come out ahead
| with your own server._
|
| We have multiple colo cages. We handle enough traffic -
| terabytes per second - that we'll never move those to
| cloud. Yet management always wants more cloud. While
| simultaneously complaining about how we're not making
| enough money.
| ranger_danger wrote:
| I don't think the answer is so black-and-white. IMO this
| only realistically applies to larger companies or ones
| that either push lots of traffic or have a need for large
| amounts of compute/storage/etc.
|
| But for smaller groups that don't have large/sustained
| workloads, I think they can absolutely save money
| compared to colo/dedicated servers using one of multiple
| different kinds of AWS services.
|
| I have several customers that coast along just fine with
| a $50/mo EC2 instance or less, compared to hundreds per
| month for a dedicated server... I wouldn't call that "ten
| times" by any stretch.
| ttfvjktesd wrote:
| You are under the assumption that only Ceph (and similar
| complex software) requires staff, whereas plain 30 PB can be
| operated basically just by rebooting from time to time.
|
| I think that anyone with actual experience of operating
| thousands of physical disks in datacenters would challenge
| this assumption.
| devanshp wrote:
| we have 6 months of experience operating thousands of
| physical disks in datacenters now! it's about a couple
| hours a month of employee time in steady-state.
| ttfvjktesd wrote:
| How about all the other infrastructure? Since you are
| obviously not using the cloud, you must have massive
| amounts of GPUs and operating systems. All of that has
| to keep working together; it's not just watching the
| physical disks and all is set.
|
| Don't get me wrong, I buy the actual numbers regarding
| hardware costs, but presenting the rest as basically a
| one-man show in terms of maintenance hours is the point
| where I'm very sceptical.
| g413n wrote:
| oh we use cloud gpus, infiniband h100s absolutely aren't
| something we want to self-host. not aws tho, they're
| crazy overpriced; mithril and sfcompute!
|
| we also use cloudflare extensively for everything that
| isn't the core heap dataset, the convenience of buckets
| is totally worth it for most day-to-day usage.
|
| the heap is really _just_ the main pretraining corpus and
| nothing else.
| ttfvjktesd wrote:
| How is it going to work when the GPU is in the cloud and
| the storage is miles away in a local colo in SF down the
| street? I was under the impression that the GPUs has to
| go multiple times over the training dataset, which means
| transfer 30 PB multiple times in and out of the clouds.
| Is the data link even fast enough? How much are you
| charged for data transfer fees.
| datadrivenangel wrote:
| Assuming that they end up hiring a full-time ops person at
| $500k annually in total costs ($250k base for a data center
| wizard), that's an extra ~$42k a month, for ~$70k/month
| all-in. Still ~$200k per month lower than their next best
| offering.
| Symbiote wrote:
| It's really not necessary.
|
| I have four racks rather than ten, and less storage but
| more compute. All purchased new from HP with warranties.
|
| Ordering each year takes a couple of days' work. Racking
| that takes one or two.
|
| Initial setup (seeing differences with a new generation of
| server etc and customizing Ubuntu autoinstallation) is done
| in a day.
|
| So that's a week per year for setup.
|
| If we are really unlucky, add another week for a strange
| failure. (This happened once in the 10 years I've been
| doing this, a CPU needed replacement by the HP engineer.)
|
| I replaced a couple of drives in July, and a network fibre
| transceiver in May.
| 827a wrote:
| The biggest part missing from the opposing side is: Their view
| is very much rooted in the pre-Cloud hardware infrastructure
| world, where you'd pay sysadmins a full salary to sit in a dark
| room to monitor these servers.
|
| The reality nowadays is: the on-prem staff is covered in the
| colo fees, which is split between everyone coloing in the
| location and reasonably affordable. The software-level work
| above that has massively simplified over the past 15 years, and
| effectively rivals the volume of work it would take to run
| workloads in the cloud (do you think managing IAM and Terraform
| is free?)
| ttfvjktesd wrote:
| > do you think managing IAM and Terraform is free?
|
| No, but I would argue that a SaaS offering, where the whole
| maintenance of the storage system is maintained for you
| actually requires less maintenance hours than hosting 30 PB
| in a colo.
|
| In terraform you define the S3 bucket and run terraform
| apply. Afterwards the company's credit card is the limit.
| Setting up and operating 30 PB yourself is an entirely
| different story.
| g413n wrote:
| yeah colo help has been great, we had a power blip and
| without any hassle they covered the cost and installation of
| UPSes for every rack, without us needing to think about it
| outside of some email coordination.
| Aurornis wrote:
| Small startup teams can sometimes get away with datacenter
| management being a side task that gets done on an as-needed
| basis at first. It will come with downtime and your stability
| won't be anywhere near as good as Cloudflare or AWS no matter
| how well you plan, though.
|
| Every real-world colocation or self-hosting project I've ever
| been around has underestimated their downtime and rate of
| problems by at least an order of magnitude. The amount of time
| lost to driving to the datacenter, waiting for replacement
| parts to arrive, and scrambling to patch over unexpected
| failure modes is always much higher than expected.
|
| There is a false sense of security that comes in the early days
| of the project when you think you've gotten past the big issues
| and developed a system that's reliable enough. The real test is
| always 1-2 years later when teams have churned, systems have
| grown, and the initial enthusiasm for playing with hardware has
| given way to deep groans whenever the team has to draw straws
| to see who gets to debug the self-hosted server setup this time
| or, worse, drive to the datacenter again.
| calvinmorrison wrote:
| > The amount of time lost to driving to the datacenter,
| waiting for replacement parts to arrive, and scrambling to
| patch over unexpected failure modes is always much higher
| than expected.
|
| I don't have this experience at all. Our colo handled almost
| all the work. The only time I ever went to the server farm was
| to build out whole new racks. Even replacing servers the colo
| handled for us at a good cost.
|
| Our reliability came from software, not hardware, though of
| course we had hundreds of spares sitting by, plus defense in
| depth (multiple datacenters, each datacenter having 2
| 'brains' which could hotswap, each client multiply backed up
| on 3-4 machines)...
|
| Servers going down were fairly commonplace; servers dying
| were commonplace. I think once we had a whole-rack outage
| when the switch died, and we flipped it to the backup.
|
| Yes these things can be done and a lot cheaper than paying
| AWS.
| Aurornis wrote:
| > Our reliability came from software not hardware, though
| of course we had hundreds of spares sitting by, the defense
| in depth (multiple datacenters, each datacenter having 2
| 'brains' which could hotswap, each client multiply backed
| up on 3-4 machines)...
|
| Of course, but building and managing the software stack,
| managing hundreds of spares across locations, spanning
| across datacenters, having a hotswap backup system is not a
| simple engineering endeavor.
|
| The only way to reach this point is to invest a very large
| amount of time into it. It requires additional headcount or
| to put other work on pause.
|
| I was trying to address the type of buildout in this
| article: Small team, single datacenter, gets the job done
| but comes with tradeoffs.
|
| The other type of self buildout that you describe is ideal
| when you have a larger team and extra funds to allocate to
| putting it all together, managing it, and staffing it.
| However, once you do that it's not fair to exclude the cost
| of R&D and the ongoing headcount needs.
|
| It's tempting to sweep it under the rug and call it part of
| the overall engineering R&D budget, but there is no question
| that a large cost is associated with what you described, as
| opposed to spinning up an AWS or Cloudflare account and
| having access to a battle-tested storage system a few
| minutes later.
| g413n wrote:
| not caring about redundancy/reliability is really nice,
| each healthy HDD is just the same +20TB of pretraining
| data and every drive lost is the same marginal cost.
| jeffrallen wrote:
| When you lose 20 TB of video, where do you get 20 TB of
| new video to replace it?
| wongarsu wrote:
| To be fair, what's described here is much more robust
| than what you get with a simple AWS setup. At a minimum
| that's a multi-region setup, but if the DCs have
| different owners I'd even compare it to a multi-cloud
| setup.
| g413n wrote:
| fwiw our first test rack has been up for about a year now and
| the full cluster has been operational for training for the
| past ~6 months. having it right down the block from our
| office has been incredibly helpful, I am a bit worried about
| what e.g. Fremont would look like if we expand there.
|
| I think another big crux here is that there isn't really any
| notion of cluster-wide downtime, aside from e.g. a full
| datacenter power outage (which we've had, admittedly, and now
| have UPSes in each rack kindly provided and installed by our
| datacenter). On the software/network level the storage isn't
| really coordinated in any manner, so failures of one machine
| only reflect as a degradation to the total theoretical
| bandwidth for training. This means that there's generally no
| scrambling and we can just schedule maintenance at our
| leisure. Last time I drew straws for maintenance I clocked a
| 30min round-trip to walk over and plug a crash cart into each
| of the 3 problematic machines to reboot and re-initialize and
| that was it.
|
| Again having it right by the office is super nice, we'll need
| to really trust our kvm setup before considering anything
| offsite.
| kabdib wrote:
| I've built and maintained similar setups (10PB range).
| Honestly, you just shove disks into it, and when they fail you
| replace them. You need folks around to handle things like
| controller / infrastructure failure, but hopefully you're
| paying them to do other stuff, too.
| OutOfHere wrote:
| Is it correct that you have zero data redundancy? This may work
| for you if you're just hoarding videos from YouTube, but not for
| most people who require an assurance that their data is safe.
| Even for you, it may hurt proper benchmarking, reproducibility,
| and multi-iteration training if the parent source disappears.
| nee1r wrote:
| Definitely much less redundancy; this was a tradeoff we made
| for pretraining data and cost.
| Sanzig wrote:
| Did you do any kind of redundancy at least (eg: putting every
| 10 disks in RAID 5 or RAID Z1)? Or I suppose your training
| application doesn't mind if you shed a few terabytes of data
| every so often?
| g413n wrote:
| atm we don't, and we're a bit unsure whether it would be a
| free lunch wrt added complexity. there's a really nice
| property of having isolated hard drives: you can take any
| individual one, `sudo mount` it, and you have a nice chunk
| of training data. that's something anyone can feel
| comfortable touching without any onboarding to some
| software stack
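
A sketch of that property; the device name, mountpoint, and
shard layout here are made up:

    import pathlib, subprocess

    def mount_and_list(device="/dev/sdb1", mountpoint="/mnt/heap0"):
        # Each drive is a plain XFS filesystem: mount it and walk
        # it, with no cluster software involved.
        pathlib.Path(mountpoint).mkdir(parents=True, exist_ok=True)
        subprocess.run(["sudo", "mount", device, mountpoint], check=True)
        return sorted(p.name
                      for p in pathlib.Path(mountpoint).glob("*.tar"))

    print(mount_and_list())
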
| RagnarD wrote:
| I love this story. This is true hacking and startup cost
| awareness.
| nee1r wrote:
| Thanks!! :)
| boulos wrote:
| It's quite cheap to just store data at rest, but I'm pretty
| confused by the training and networking setup here. It sounds
| like from other comments that you're not going to put the GPUs
| in the same location, so you'll be doing all training over X
| 100 Gbps lines between sites? Aren't you going to end up
| totally bottlenecked during pretraining here?
| g413n wrote:
| yeah we just have the 100gig link, atm that's about all the gpu
| clusters can pull but we'll prob expand bandwidth and storage
| as we scale.
|
| I guess worth noting that we do have a bunch of 4090s in the
| colo and it's been super helpful for e.g. calculating
| embeddings and such for data splits.
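
A back-of-the-envelope check on the bottleneck question, ignoring
protocol overhead and any caching on the GPU side:

    corpus_bytes = 30e15                 # 30 PB
    link_Bps = 100e9 / 8                 # 100 Gbps ~= 12.5 GB/s
    seconds = corpus_bytes / link_Bps    # ~2.4 million seconds
    print(f"~{seconds / 86400:.0f} days to stream the corpus once")
    # -> ~28 days per full pass at line rate
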
| mwambua wrote:
| How did you arrive at the decision of not putting the GPU
| machines in the colo? Were the power costs going to be too
| high? Or do you just expect to need more physical access to
| the GPU machines vs the storage ones?
| g413n wrote:
| When I was working at sfcompute prior to this, we saw
| multiple datacenters literally catch on fire because the
| industry was not experienced with the power density of
| H100s. Our training chips just aren't a standard package in
| the way JBODs are.
| Symbiote wrote:
| Isn't the easy option to spread the computers out, i.e.
| not fill the rack, but only half of it?
|
| A GPU cluster next to my servers has done this,
| presumably they couldn't have 64A in one rack so they've
| got 32A in two. (230V 3phase.)
| pixl97 wrote:
| Rack space is typically at a premium at most data centers.
| Symbiote wrote:
| I'm more surprised that a data centre will apparently
| provide more power to a rack than is safe to use.
| lemonlearnings wrote:
| Adding the compute story would be interesting as a follow-up.
|
| Where is that done? How many GPUs do you need to crunch
| all that data? Etc.
|
| Very interesting and refreshing read though. Feels like
| what Silicon Valley is more about than just the usual: tf
| apply then smile and dial.
| huxley_marvit wrote:
| damn this is cool as hell. estimate on the maintenance cost in
| person-hours/month?
| nee1r wrote:
| Around 2-5 hours/month, mostly power-cycling the servers and
| replacing hard drives
| Symbiote wrote:
| You should be able to power cycle the servers from their
| management interfaces.
|
| (But I have the luxury of everything being bought new from
| HP, so the interfaces are similar.)
| jonas21 wrote:
| Nice writeup. All of the technical detail is great!
|
| I'm curious about the process of getting colo space. Did you use
| a broker? Did you negotiate, and if so, how large was the
| difference in price between what you initially were quoted and
| what you ended up paying?
| nee1r wrote:
| We reached out to almost every colocation space in SF/some in
| Fremont to get quotes. There wasn't a difference between the
| quote price and what we ended up paying, though we did
| negotiate terms + one-time costs.
| archmaster wrote:
| Had the pleasure of helping rack drives! Nothing more fun than an
| insane amount of data :P
| nee1r wrote:
| Thanks for helping!!!
| miltonlost wrote:
| And how much did the training data cost?
| jimmytucson wrote:
| Just wanted to say, thanks for doing this! Now the old rant...
|
| I started my career when on-prem was the norm and remember so
| much trouble. When you have long-lived hardware, eventually, no
| matter how hard you try, you just start to treat it as a pet and
| state naturally accumulates. Then, as the hardware starts to be
| not good enough, you need to upgrade. There's an internal team
| that presents the "commodity" interface, so you have to pick out
| your new hardware from their list and get the cost approved (it's
| a lot harder to just spend a little more and get a little more).
| Then your projects are delayed by them racking the new hardware
| and you properly "un-petting" your pets so they can respawn on
| the new devices, etc.
|
| Anyways, when cloud came along, I was like, yeah we're switching
| and never going back. Buuut, come to find out that's part of the
| master plan: it's a no-brainer good deal until you and everyone
| in your org/company/industry forgets HTF to rack their own
| hardware, and then it starts to go from no-brainer to brainer.
| And basically unless you start to pull back and rebuild that
| muscle, it will go from brainer to no-brainer _bad_ deal. So
| thanks for building this muscle!
| theideaofcoffee wrote:
| I'm not op, but thanks for this. Like I mentioned in another
| comment, the wholesale move to the cloud has caused so many
| skills to become atrophied. And it's good that someone is
| starting to exercise that skill again, like you said. The
| hyperscalers are mostly to blame for this, the marketing FUD
| being that you can't possibly do it yourself, there are too
| many things to keep track of, let us do it (while conveniently
| leaving out how eye-wateringly expensive they are in
| comparison).
| tempest_ wrote:
| The other thing the cloud does not let you do is make
| tradeoffs.
|
| Sometimes you can afford not to have a triple-redundant
| 1000GB network, or a simple single machine with RAID may
| have acceptable downtime.
| g413n wrote:
| yeah this
|
| it means that even after negotiating much better terms than
| baseline we run into the fact that cloud providers just
| have a higher cost basis for the more premium/general
| product.
| g413n wrote:
| we're in a pretty unique situation in that _very early on_ we
| fundamentally can't afford the hyperscaler clouds to cover
| operations, so we're forced to develop some expertise. turned
| out to be reasonably chill and we'll prob stick with it for the
| foreseeable future, but we have seen a little bit of the
| state-creep you mention so tbd.
| nodja wrote:
| Yeah from memory on-prem was always cheaper, it just removed a
| lot of logistic obstacles and made everything convenient under
| one bill.
|
| IIRC the wisdom at the time cloud started becoming popular was
| to always be on-prem and use cloud to scale up when demand
| spiked. But over time temporarily scaling up became permanent,
| and devs became reliant on instantly spawning new machines for
| things other than spikes in demand and now everyone defaults to
| cloud and treats it as the baseline. In the process we lost the
| grounding needed to assess the real cost of things and
| predictably the cost difference between cloud and on-prem has
| only widened.
| luhn wrote:
| > IIRC the wisdom of the time cloud started becoming popular
| was to always be on-prem and use cloud to scale up when
| demand spiked.
|
| I've heard that before but was never able to make sense of
| it. Overflowing into the cloud seems like a nightmare to
| manage, wouldn't overbuilding on-prem be cheaper than paying
| your infra team to straddle two environments?
| sgarland wrote:
| As someone with experience with a company that did hybrid,
| I'll say: it only makes sense if your infra team deeply
| understands computers.
|
| The end state is "just some IaC," wherein it doesn't really
| matter to anyone where the application lives, but all of
| the underlying difficulties in getting to that state
| necessitate that your team actually, no-shit knows how
| distributed systems work. They're going to be doing a lot
| of networking configuration, for one, and that's a whole
| speciality.
| ares623 wrote:
| Wanna see us do it again?
| matt-p wrote:
| Docker is _amazing_ for forcing the machines not to be pets.
| Seriously, a racked server is just another k3s or k8s node (or
| whatever) and doesn't get the choice or ability of being
| petted. It's so nice. You could maybe have said the same about
| VMs, but not really; the VM just became the pet. OK, you could
| at least image/snapshot it, but it's not the same.
| doublerabbit wrote:
| I've found Docker is a bit of a monstrous pet.
|
| Docker is a monster that you have to treat as a pet. You've
| still got to pet it through stages of updating, monitoring,
| snapshots and networking. When the internal system breaks
| it's no different to a server collapsing.
|
| Snapshots are a haircut for the monster, useful but can make
| things worse.
| matt-p wrote:
| Not in my experience; it's super easy to set up a K3s cluster
| in a single rack. Certainly less hassle than VMware or Xen
| ever was.
| pronoiac wrote:
| I wonder if they'll go with "toploaders" - like Backblaze Storage
| Pods - later. They have better density and faster setup, as they
| don't have to screw in every drive.
|
| They got used drives. I wonder if they did any testing? I've
| gotten used drives that were DOA, which showed up in tests -
| SMART tests, short and long, then writing pseudorandom data to
| verify capacity.
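
A sketch of that kind of burn-in, assuming smartmontools and
badblocks are available; the device path is hypothetical and the
badblocks pass is destructive:

    import subprocess

    def burn_in(dev="/dev/sdX"):
        # Long SMART self-test (runs in the drive's background;
        # in practice you would poll until it completes).
        subprocess.run(["smartctl", "-t", "long", dev], check=True)
        # Destructive write/read pass over the whole disk with a
        # pseudorandom pattern, which also verifies claimed capacity.
        subprocess.run(["badblocks", "-wsv", "-t", "random", dev],
                       check=True)
        # Overall health verdict after the tests.
        subprocess.run(["smartctl", "-H", dev], check=True)
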
| g413n wrote:
| yeah we're very interested in trying toploaders, we'll do a
| test rack next time we expand and switch to that if it goes
| well.
|
| w.r.t. testing the main thing we did was try to buy a bit from
| each supplier a month or two ahead of time, so by the time we
| were doing the full build that rack was a known variable. We
| did find one drive lot which was super sketchy and just didn't
| include it in the bulk orders later. diversity in suppliers
| helps a lot with tail risk
| joshvm wrote:
| "don't have to screw in every drive" is relative, but at
| least tool-less drive carriers are a thing now.
|
| A lot of older toploaders from vendors like Dell are not
| tool-free. If you bought vendor drives and one fails, you RMA
| it and move on. However if you want to replace failed drives
| in the field, or want to go it alone from the start with
| refurbished drives... you'll be doing a lot of screwing.
| They're quite fragile and the plastic snaps easily. It's
| pretty tedious work.
| tempest_ wrote:
| Used Supermicro machines of this generation are very cheap
| (all things considered):
|
| https://www.theserverstore.com/supermicro-superstorage-ssg-6...
| synack wrote:
| IPMI is great and all, but I still prefer serial ports and remote
| PDUs. Never met a BMC I could trust.
| jeffrallen wrote:
| Try Lenovo. Their BMCs Don't Suck (tm).
| fragmede wrote:
| My question isn't why do it yourself; a quick back-of-the-
| envelope calculation shows AWS being much more expensive. My
| question is why San Francisco? It's one of the most expensive
| real estate markets in the US (#2 residential, #1 commercial),
| and electricity is _expensive_: $0.71/kWh peak residential
| rate! A jaunt down 280 to San Jose's gonna be cheaper, at the
| expense of having to take that drive to get hands-on. But I'm
| sure you can find someone who's capable of running a DC, lives
| in San Jose, and needs a job, so the SF team doesn't have to
| commute down to South Bay. Now obviously there's something to
| be said for having the rack in the office - I know of at least
| two (three, now) in San Francisco - it just seems like a weird
| decision if you're already worrying about money to the point
| of not using AWS.
| hnav wrote:
| The article says their recurring cost is $17.5k/month; they'll
| spend at least that amount in human time tending to their
| cluster if they have to drive to it. It's also a question of
| magnitudes: going from $0.5M/mo to $0.05M/mo (hard costs plus
| the extra headaches of dealing with the cluster) is an order
| of magnitude; even if you could cut another order of magnitude
| it wouldn't be as impactful.
| renewiltord wrote:
| Problem when you self-roll this is that you inevitably make
| mistakes and the cycle time of going down and up ruins
| everything. Access trumps everything.
|
| You can get a DC guy but then he doesn't have much to do post
| setup and if you contract that you're paying mondo dollars
| anyway to get it right and it's a market for lemons (lots of
| bullshitters out there who don't know anything).
|
| Learned this lesson painfully.
| g413n wrote:
| it's not just in SF, it's across the street from our office
|
| this has been incredibly nice for our first hardware project,
| if we ever expand substantially then we'd def care more about
| the colo costs.
| tarasglek wrote:
| I am still confused about what their software stack is. They
| don't use Ceph but bought NetApp, so do they use NFS?
| OliverGuy wrote:
| The NetApps are just disk shelves; you can plug them into a
| SAS controller and use whatever software stack you please.
| tarasglek wrote:
| but they have multiple head nodes, so is it some distributed
| setup or just an active/passive type thing?
| hnav wrote:
| I'm guessing the client software (outside the dc) is
| responsible for enumerating all the nodes which all get
| their own IP.
| trebligdivad wrote:
| The networking stuff seems... odd.
|
| 'Networking was a substantial cost and required experimentation.
| We did not use DHCP as most enterprise switches don't support it
| and we wanted public IPs for the nodes for convenient and
| performant access from our servers. While this is an area where
| we would have saved time with a cloud solution, we had our
| networking up within days and kinks ironed out within ~3 weeks.'
|
| Where does the switch choice come into whether you use DHCP?
| Wth would you want public IPs?
| giancarlostoro wrote:
| > Wth would you want public IPs.
|
| So anyone can download 30 PB of data with ease of course.
| buzer wrote:
| > Wth would you want public IPs.
|
| Possibly to avoid needing NAT (or VPN) gateway that can handle
| 100Gbps.
| bombcar wrote:
| I don't know what they're doing, but Mikrotik can perhaps
| route that -
| https://mikrotik.com/product/ccr2216_1g_12xs_2xq#fndtn-
| testr... and is about the cost of their used thing.
|
| And I think this would be a banger for IPv6 if they really
| "need" public IPs.
| dustywusty wrote:
| Exactly what I came in to say, CCR2216 can do this for <
| $2k, and does it well.
| xp84 wrote:
| No DHCP doesn't mean public IPs nor impact the need for NAT,
| it just means the hosts have to be explicitly configured with
| IP addresses, default gateways if they need egress, and DNS.
|
| Those IPs you end up assigning manually could be private ones
| or routable ones. If private, authorized traffic could be
| bridged onto the network by anything, such as a random
| computer with 2 NICs, one of which is connected eventually to
| the Internet and one of which is on the local network.
|
| If public, a firewall can control access just as well as
| using NAT can.
| buzer wrote:
| I know, I was specifically answering the question of "why
| the hell would you want public IPs".
|
| I don't know why their network setup wouldn't support DHCP,
| that's extremely common especially in "enterprise" switches
| via DHCP forwarding.
| pclmulqdq wrote:
| They didn't seem to want to use a router. Purpose-built 100
| Gbps routers are a bit expensive, but you can also turn a
| computer into one.
| flumpcakes wrote:
| Many switches are L3 capable, making them in effect a router.
| Considering their internet lines appear to be hooked up to
| their 100 Gbps switch, I'd guess this is one of the L3 ones.
| mystifyingpoi wrote:
| It really feels like they wanted 30 PB of storage accessible
| over HTTP and _literally nothing else_. No redundancy, no NAT,
| dead simple nginx config + some code to track where to find
| which file on the filesystem. I like that.
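
One plausible shape of the "code to track where to find which
file" imagined above; purely speculative, with made-up names:

    # Hypothetical index: key -> (storage node, mount, relative path)
    INDEX = {
        "shard-0042/video-000123.tar":
            ("storage-07", "/mnt/heap3", "shard-0042/video-000123.tar"),
    }

    def url_for(key: str) -> str:
        node, mount, rel = INDEX[key]
        return f"http://{node}.heap.internal{mount}/{rel}"

    print(url_for("shard-0042/video-000123.tar"))
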
| matt-p wrote:
| This was not written by a network person, quite clearly.
| Hopefully it's just a misunderstanding, otherwise they do need
| someone with literally any clue about networks.
| g413n wrote:
| yeah, misunderstanding, we'll update the post -- separately,
| it's true that we aren't network specialists and the network
| wrangling was prob disproportionately hard for us / shouldn't
| have taken so long.
| trebligdivad wrote:
| I assume your actual training is being done somewhere else?
| Did you try getting colocation space in the same datacentre
| as somewhere with the compute - it would have reduced your
| internet costs even further.
| g413n wrote:
| yeah the cost calculus is very different for gpus, it
| absolutely makes sense for us to be using cloud there.
| also hardly any datacenters can support the power
| density, esp in downtown sf
| trebligdivad wrote:
| Yeh; one other thing - you list a separate management
| network as optional - it's not optional! Under no
| circumstances should you expose the management IPs of
| switches or the servers to the internet; they are, on
| average, about as secure as a drunk politician. Use a
| separate management net, and make sure it's only securely
| accessed.
| Symbiote wrote:
| I understood that it's optional because they can walk
| down the road to the data center instead.
|
| They mention plugging monitors in several times. I think
| I've only done that once in the last couple of years,
| when a firmware upgrade failed and reset the management
| interface IP.
| matt-p wrote:
| Massive props for getting it done anyway. For others
| reading: in general a switch should never run dhcpd, but
| will normally/often relay it for you. Your Aristas would
| 100% have supported relaying, but in this case it sounds
| like it might even be flat L2. Normally you'd host dhcpd on
| a server.
|
| Some general feedback in case it's helpful:
|
| - $20K on contractors seems insane if we're talking about
| rack-and-stack for 10 racks. Many datacentres can be
| persuaded to do it for free as part of you agreeing to sign
| their contract. Your contractors should at least be using a
| server lift of some kind, again often provided kindly by the
| facility. If this included paying for server configuration
| and so on, then ignore that comment (bargain!).
|
| - I would almost never expect to actually pay a setup fee
| (beyond something nominal like 500 per rack) to the
| datacentre either; certainly if you're going to be paying
| that fee it had better include rack-and-stack.
|
| - A crash cart should not be used for an install of this
| size; the servers should be plugged into the network, and
| then automatically configured by a script/iPXE. It might
| sound intimidating or hard but it's not, and it doesn't even
| require IPMI (though frankly I would strongly, strongly
| recommend that, if you don't already have it). I would use
| managed switches for the management network too, for sure.
|
| - Consider two switches, especially if they are second-hand.
| The cost of the cluster not being usable for a few days
| while you source and install a replacement probably still
| runs to thousands, even here.
|
| - Personally I'm not a big fan of the whole JBOD
| architecture and would have just filled my boots with
| single-socket 4U Supermicro chassis. To each their own, but
| JBOD's main benefit is a very small financial saving at the
| cost of quite a lot of drawbacks IMO. YMMV.
|
| - Depending on who you use for GPUs, getting a private link
| or 'peering' to them might save you some cost and provide
| higher capacity.
|
| - I'm kind of shocked that FMT2 didn't turn out much cheaper
| than your current colo; I would expect less than those
| figures, possibly _with_ the 100G DIA included (normally
| about $3,000/month, no setup).
| XorNot wrote:
| I mean, generally above a certain size of deployment DHCP is
| much more trouble than it's worth.
|
| DHCP is really only worth it when your hosts are truly dynamic
| (i.e. not controlled by you). Otherwise it's a lot easier to
| handle IP allocation as part of the asset lifecycle process.
|
| Heck even my house IoT network is all static IPs because at the
| small scale it's much more robust to not depend on my home
| router for address assignment - replacing a smart bulb is a big
| enough event, so DHCP is solely for bootstrapping in that case.
|
| At the enterprise level unpacking a server and recording the
| asset IDs etc is the time to assign IP addresses.
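
A sketch of what tying addressing to the asset lifecycle can look
like: derive each node's static IP from its rack/slot inventory
record and render a netplan stanza at install time. The subnet,
hostnames, and interface name are all hypothetical:

    def netplan_for(rack: int, slot: int) -> str:
        ip = f"10.40.{rack}.{slot + 10}"
        return (
            "network:\n"
            "  version: 2\n"
            "  ethernets:\n"
            "    eth0:\n"
            f"      addresses: [{ip}/24]\n"
            f"      routes: [{{to: default, via: 10.40.{rack}.1}}]\n"
            "      nameservers: {addresses: [10.40.0.53]}\n"
        )

    INVENTORY = [("heap-r1-s01", 1, 1), ("heap-r1-s02", 1, 2)]
    for host, rack, slot in INVENTORY:
        print(f"# {host}\n{netplan_for(rack, slot)}")
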
| Symbiote wrote:
| I have static, public IPs across 80 or so servers.
|
| It gets set approximately once when the server's automated
| Ubuntu installation runs, and I never think about it.
|
| > Where does the switch choice come into whether you DHCP?
|
| Perhaps from home routers, which include one.
|
| > Wth would you want public IPs.
|
| Why wouldn't you? They have a firewall.
| OliverGuy wrote:
| Aren't those NetApp shelves pretty old at this point? I see a
| lot of people recommending against them even for homelab-type
| uses. You can get those 60-drive Supermicro JBODs pretty cheap
| now, and those aren't too old; they would have been my choice.
|
| Plus, the TCO is already way under the cloud equiv. so might as
| well spend a little more to get something much newer and more
| reliable
| g413n wrote:
| yeah it's on the wishlist to try
| drnick1 wrote:
| Everyone should give AWS the middle finger and start doing this.
| Beyond cost, it's a matter of sovereignty over one's computing
| and data.
| twoodfin wrote:
| If this is a real market, I'd expect AWS to introduce S3
| Junkyard with a similar durability and cost structure.
|
| They probably still won't budge on the egress fees.
| alchemist1e9 wrote:
| Would have been much easier and probably cheaper to buy gear from
| 45drives.
| renewiltord wrote:
| The cost difference is huge. Modern compute is just so much
| bigger than one would think. Hurricane Electric is incredibly
| cheap too. And Digital Realty in the city are pretty good. The
| funny thing is that the Monkeybrains guys will make room for you
| at $75/amp but that isn't competitive when a 9654 based system
| pulls 2+ amps at peak.
|
| Still fun for someone wanting to stick a computer in a DC though.
|
| Networking is surprisingly hard, but we also settled for the
| cheapo QSFP life instead of the new Cisco switches that do
| 800 Gbps, which are coming. Great writeup.
|
| One writeup that would be fun is about the mechanics of layout
| and cabling and that sort of thing. Learning all that manually
| was a pain in the ass. It's not just written down somewhere; I
| should have documented it when I was doing it, but now I'm no
| longer doing it and so can't provide good photos.
| ThinkBeat wrote:
| So now you have all of:
|
| - your storage in one place
|
| - you own all backup
|
| -- off-site backup (hot or cold)
|
| - uptime worries
|
| - maintenance: drives
|
| -- how many can fail before it is a problem
|
| - maintenance: machines
|
| -- how many can fail before it is a problem
|
| - maintenance: misc/datacenter
|
| - what to do if the electricity is cut off suddenly
|
| -- do you have a backup provider?
|
| -- diesel generators?
|
| -- giant batteries?
|
| -- will the backup power also run cooling?
|
| - natural disaster
|
| -- earthquake
|
| -- flooding
|
| -- heatwave
|
| - physical security
|
| - employee training (esp. if many quit)
|
| - backup for networking (and power for it)
|
| - employees on call 24/7
|
| - protection against hacking
|
| +++++
|
| I agree that a lot of cloud providers overcharge by a lot, but
| doing it all yourself gives you a lot of headaches.
|
| Colocation would seem like a valuable partial mitigation.
| pclmulqdq wrote:
| Most of these come from your colo provider (including a good
| backup power and networking story), and you can pay remote
| hands for a lot of the rest.
|
| Things like "protection from hacking" also don't come from AWS.
| yread wrote:
| You could get pretty close to the $1/TB/month cost using
| Hetzner's SX135 with 8x22TB, so 140TB in raidz1, for 240 EUR.
| Maybe you get a better rate if you rent 200 of them. Someone
| else takes care of a lot of risks and you can sleep well at
| night.
| nodja wrote:
| I don't think Hetzner has locations in SF. Those 100 Gbit
| connections don't do much if they need to reach equipment
| outside the city, but maybe peering has gotten better and my
| views are outdated.
| fuzzylightbulb wrote:
| You're good. The speed of light through a glass fiber is
| still just as slow as it ever was.
| g413n wrote:
| yeah it's totally plausible that we go with something like this
| in the future. We have similar offers where we could separate
| out either the financing, the build-out, or both and just do
| the software.
|
| (for Hetzner in particular it was a massive pain when we were
| trying to get CPU quotas with them for other data operations,
| and we prob don't want to have it in Europe, but it's been
| pretty easy to negotiate good quotes on similar deals locally
| now that we've shown we can do it ourselves)
| mx7zysuj4xew wrote:
| You cannot use hetzner for anything serious.
|
| They'd most likely claim abuse and delete your data wholesale
| without notice
| coleca wrote:
| For a workload of that size you would be able to negotiate
| private pricing with AWS or any cloud provider, not just
| Cloudflare. You can get a private pricing deal on S3 with as
| little as half a PB. I'm not saying that your overall expenses
| would be cheaper with a CSP than DIY, but it's not exactly an
| apples-to-apples comparison, taking full retail prices for the
| CSPs against eBayed equipment and free labor (minus the cost
| of the pizza).
| g413n wrote:
| egress costs are the crux for AWS and they didn't budge when
| we tried to negotiate that with them; it's just entirely
| unusable for AI training otherwise. I think the cloudflare
| private quote is pretty representative of the cheaper end of
| managed object-bucket storage.
|
| obv as we took on this project the delta between our cluster
| and the next-best option got smaller, in part because the
| ability to host it ourselves gives us negotiating leverage,
| but managed
| bucket products are fundamentally overspecced for simple
| pretraining dumps. glacier does a nice job fitting the needs of
| archival storage for a good cost, but there's nothing similar
| for ML needs atm.
| landryraccoon wrote:
| Their electricity costs are $10K per month or about $120K per
| year. At an interest rate of 7% that's $1.7M of capital tied up
| in power bills.
|
| At that rate I wonder if it makes sense to do a massive solar
| panel and battery installation. They're already hosting all of
| their compute and storage on prem, so why not bring electricity
| generation on prem as well?
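
The $1.7M reads as a perpetuity capitalization: the lump of
capital whose 7% return would cover the power bill indefinitely,
rather than a cost over the hardware's 3-year accounting life:

    annual_power = 120_000
    rate = 0.07
    print(f"${annual_power / rate:,.0f}")   # -> $1,714,286
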
| moffkalast wrote:
| Let's just say we're not seeing all of these sudden private
| nuclear reactor investments for no reason.
| datadrivenangel wrote:
| At 120K per year over the three year accounting life of the
| hardware, that's 360k... how do you get to 1.7M?
| landryraccoon wrote:
| It seems unlikely to me that they'll never have to retrain
| their model to account for new data. Is the assumption that
| their power usage drastically drops after 3 years?
|
| Unless they go out of business in 3 years that seems unlikely
| to me. Is this a one-off model where they train once and it
| never needs to be updated?
| intalentive wrote:
| "Solve computer use" and previous work is audio conversation
| model. How do these go together? Is the idea to replace keyboard
| and mouse with spoken commands? a la Star Trek
| g413n wrote:
| just general research work. Once the recipes are efficient
| enough the modality is a smaller detail.
|
| On the product side we're trying to orient more towards
| 'productive work assistant' rather than the default pull of
| audio models towards being an 'ai friend'.
| nerpderp82 wrote:
| Make me transparent aluminum!
| Onavo wrote:
| > _We kept this obsessively simple instead of using MinIO or Ceph
| because we didn't need any of the features they provided; it's
| much, much simpler to debug a 200-line program than to debug
| Ceph, and we weren't worried about redundancy or sharding. All
| our drives were formatted with XFS._
|
| What do you plan to do if you start getting corruption and
| bitrot? The complexity of S3 comes with a lot of hard guarantees
| for data integrity.
| g413n wrote:
| our training stack doesn't make strong assumptions about data
| integrity, it's chill
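
If integrity ever did start to matter, a lightweight spot-check
could be bolted on without Ceph or MinIO. A hypothetical sketch;
the post describes no such mechanism:

    import hashlib, pathlib

    def write_checksum(shard: pathlib.Path) -> None:
        # At ingest: store a sha256 digest next to each shard.
        digest = hashlib.sha256(shard.read_bytes()).hexdigest()
        shard.with_suffix(shard.suffix + ".sha256").write_text(digest)

    def verify(shard: pathlib.Path) -> bool:
        # At read: recompute and compare; a failed check just means
        # skipping that shard, given the tolerance described above.
        expected = shard.with_suffix(shard.suffix + ".sha256")
        return (hashlib.sha256(shard.read_bytes()).hexdigest()
                == expected.read_text().strip())
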
| htrp wrote:
| >We threw a hard drive stacking party in downtown SF and got our
| friends to come, offering food and custom-engraved hard drives to
| all who helped. The hard drive stacking started at 6am and
| continued for 36 hours (with a break to sleep), and by the end of
| that time we had 30 PB of functioning hardware racked and wired
| up.
|
| So how many actual man-hours for 2,400 drives?
| g413n wrote:
| around 250
| Havoc wrote:
| Cool write-up.
|
| I do feel sorry for the friends that got suckered into doing a
| bunch of grunt work for free though.
| g413n wrote:
| yeah, that's why we started paying people near the second
| half - not super clearly stated in the blogpost, but the
| novelty definitely wore off with plenty of drives left to
| stack, so we switched strategies to get it done in time.
|
| I think everyone who showed up for a couple hours as part of
| the party had a good time tho, and the engraved hard drives
| we were giving out weren't cheap :p
| pighive wrote:
| HDDs are never one-time costs. Do datacenters also offer
| ordering and replacing HDDs?
| epistasis wrote:
| With 30PB it's likely they will simply let capacity fall as
| drives fail.
|
| They apparently have zero need for redundancy in their use
| case, and the failure rate won't be high enough to take out a
| significant percentage of their capacity.
| Symbiote wrote:
| They offer replacing, yes, but normally expect you to order the
| new one. (Usually covered by a warranty, sent next business
| day.)
| supermatt wrote:
| Where does one get "90 million hours of video data"?
| hmcamp wrote:
| I'm also curious about this. I don't recall seeing that
| mentioned in the article
| neilv wrote:
| As a fan of eBay for homelab gear, I appreciate the can-do
| scrappiness of doing it for a startup.
|
| To adapt the old enterprise information infrastructure saying for
| startups:
|
| "Nobody Ever Got Fired for Buying eBay"
| Scramblejams wrote:
| Fun piece, thanks to the author. But for vicarious thrills like
| this, more pictures are always appreciated!
| echelon wrote:
| If the authors chime in, I'd like to ask what "Standard
| Intelligence PBC" does.
|
| Is it a public benefit corp?
|
| What are y'all building?
| ThrowawayTestr wrote:
| DIY is always cheaper than paying someone else. Great write-up.
| akreal wrote:
| How is/was the data written to disks? Something like
| rsync/netcat?
| lucb1e wrote:
| The linked Discord post is also interesting and fun to read. Most
| of the post is more serious but this is one of the small gems:
|
| > One thing we discovered very quickly was that [world cup] goals
| scored showed up in our monitoring graphs. This was very cool
| because not only is it neat to see real-world events show up in
| your systems, but this gave our team an excuse to watch soccer
| during meetings. We weren't "watching soccer during meetings", we
| were "proactively monitoring our systems' performance."
|
| https://discord.com/blog/how-discord-stores-trillions-of-mes...
|
| It is linked as evidence for Discord using "less than a petabyte"
| of storage for messages. My best guess is that they multiplied
| node size and count from this post, which comes out to 708 TB for
| the old cluster and 648 in the new setup (presumably it also has
| some space to grow)
___________________________________________________________________
(page generated 2025-10-01 23:00 UTC)