[HN Gopher] Fly.io has GPUs now
___________________________________________________________________
Fly.io has GPUs now
Author : andes314
Score : 561 points
Date : 2024-02-13 22:06 UTC (1 day ago)
(HTM) web link (fly.io)
(TXT) w3m dump (fly.io)
| iambateman wrote:
| It's cool to see that they can handle scaling down to zero.
| Especially for working on experimental sites that don't have the
| users to justify even modest server costs.
|
| I would love an example of how much time a request gets billed
| for. Obviously it will vary, but is it 2 seconds or "minimum 60
| seconds per spin-up"?
| mrkurt wrote:
| We charge from the time you boot a machine until it stops.
| There's no enforced minimum, but in general it's difficult to
| get much out of a machine in less than 5 seconds. For GPU
| machines, depending on data size for whatever is going into GPU
| memory, it could need 30s of runtime to be useful.
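|
| A back-of-the-envelope sketch of what boot-to-stop billing
| works out to (the hourly rate below is made up for
| illustration; see the pricing page for real numbers):
|
|     # cost of a short GPU burst under boot-to-stop billing
|     hourly_rate = 2.50            # $/hr, hypothetical
|     runtime_s = 30                # boot -> exit(0)
|     print(f"${hourly_rate * runtime_s / 3600:.4f}")  # ~$0.02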
| andes314 wrote:
| Do you offer some sort of keep_warm parameter that removes
| this latency (for a greater cost)?
| mrkurt wrote:
| You control machine lifecycles. To scale down, you just set
| the appropriate restart policy, then exit(0).
|
| You can also opt to let our proxy stop machines for you,
| but the most granular option is to just do it in code.
|
| So yes, kind of. You just wait before you exit.
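|
| A minimal sketch of that keep-warm pattern (ordinary process
| logic, not a Fly API; the port and idle window are arbitrary):
|
|     import os, threading, time
|     from http.server import BaseHTTPRequestHandler, HTTPServer
|
|     IDLE = 60  # stay warm for 60s after the last request
|     last = time.monotonic()
|
|     class H(BaseHTTPRequestHandler):
|         def do_GET(self):
|             global last
|             last = time.monotonic()
|             self.send_response(200)
|             self.end_headers()
|             self.wfile.write(b"ok\n")
|
|     def reaper():
|         while True:
|             time.sleep(5)
|             if time.monotonic() - last > IDLE:
|                 os._exit(0)  # machine stops; restart policy
|                              # must not restart on exit 0
|
|     threading.Thread(target=reaper, daemon=True).start()
|     HTTPServer(("", 8080), H).serve_forever()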
| Aeolun wrote:
| So just to confirm, for these workloads, it'd start a
| machine when the request comes in, and then shut it down
| immediately after the request is finished (with some
| 30-60s in between I suppose)? Is there some way to keep
| it up if additional requests are in the queue?
|
| Edit: Found my answer elsewhere (yes).
| sodality2 wrote:
| How long does model loading take? Loading 19GB into a machine
| can't be instantaneous (especially if the model is on a
| network share).
| loloquwowndueo wrote:
| There are no "network shares". The typical way to store
| model data would be in a volume, which is basically local
| nvme storage.
| xena wrote:
| Wellllllll, technically there is LSVD which would let you
| store model weights in S3.
|
| God that's a horrible idea. Blog time!
| carl_dr wrote:
| It takes about 7s to load a 9GB model on Beam (they claim,
| and my testing bears that out). I imagine it is similar with
| Fly - I've not had any performance issues with Fly.
| bbkane wrote:
| I see the whisper transcription article. Is there an easy way
| to limit it to, say, $100 worth of transcription a month and
| then stop till next month? I want to transcribe a bunch of
| speeches but I want to spread the cost over time.
| IanCal wrote:
| Probably available elsewhere, but you could set up an account
| with a monthly spend limit with OpenAI and use their API
| until you hit errors.
|
| $100/mo is about 10 days of speeches a month; how much data
| do you have?
|
| edit - if the pricing seems reasonable, you can just limit
| how many minutes you send. AssemblyAI is another provider
| at about the same cost.
| bbkane wrote:
| Thanks! Maybe 50hr of speeches. It's a hobby idea so I'll
| check these out when I get some time.
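|
| For scale, assuming the Whisper API's published $0.006/min
| rate (a rough sanity check, not a quote):
|
|     rate = 0.006                 # $ per audio minute
|     print(100 / rate / 60 / 24)  # ~11.6 days of audio per $100
|     print(50 * 60 * rate)        # 50 hours of speeches ~= $18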
| xena wrote:
| Email xe@fly.io, I'm intrigued.
| IanCal wrote:
| I can probably just run these through whisper locally for
| you if you want and are able to share. Email is in my bio
| (ignore the pricing, I'm obv not charging)
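|
| For reference, running them locally is a few lines with the
| open-source whisper package (pip install openai-whisper;
| the model size and filename here are placeholders):
|
|     import whisper
|
|     model = whisper.load_model("medium.en")
|     print(model.transcribe("speech_01.mp3")["text"])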
| holoduke wrote:
| Anybody have experience with the performance? At first glance
| they seem quite expensive compared to, for example, Hetzner
| (CPU machines).
| impulser_ wrote:
| I'm not sure about others, but you can get A100s with 80GB of
| RAM from DigitalOcean for $1.15 an hour. So about 1/3 the
| price.
|
| You can even get H100s for cheaper than these prices at $2.24
| an hour.
|
| So these do seem a bit expensive, but this might be because
| there is high demand for them from customers and they don't
| have the supply.
| skrtskrt wrote:
| getting supply is super hard right now; DigitalOcean just
| straight up bought Paperspace to get access to those GPUs.
|
| The whole reason CoreWeave is on a fat growth trajectory
| right now is they used their VC money to buy a ton of GPUs at
| the right time.
| treesciencebot wrote:
| Just to correct the record, both $1.15 per A100 and $2.24 per
| H100 require a 3-year commitment. On-demand prices are 2.5x
| that.
| Aeolun wrote:
| > _$2.24/hour pricing is for a 3-year commitment. On-demand
| pricing for H100 is $5.95/hour under our special promo
| price. $1.15/hour pricing is for a 3-year commitment._
|
| Wow, that's some spectacularly false advertising.
| dathinab wrote:
| The company I work for has several times had problems
| allocating any GPUs from some larger cloud providers (with
| the region restrictions we have, which still include all EU
| regions).
|
| (I'm not sure which of them it was; we are currently
| evaluating multiple providers and I'm not really involved in
| that process.)
| andes314 wrote:
| Can anyone who has used Beam.Cloud compare that service to
| this one?
| Havoc wrote:
| How fast is the spin up/down on this scale to zero? If it is
| fast, this could be pretty interesting.
| amanda99 wrote:
| I think the bigger question is how long it takes to load any
| meaningful model onto the GPU.
| fideloper wrote:
| that's exactly right.
|
| gpu-friendly base images tend to be larger (1-3GB+), so it
| takes time (30s - 2m range) to create a new Machine (VM).
|
| Then there's the "spin up time" of your software -
| downloading model files adds however long it takes to pull
| gigabytes of weights.
|
| Models (and pip dependencies!) can generally be "cached" if
| you (re)use volumes.
|
| Attaching volumes to gpu machines dynamically created via the
| API takes a bit of management on your end (in that you'd need
| to keep track of your volumes, what region they're in, and
| what to do if you need more volumes than you have).
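|
| A sketch of that caching idea, assuming a volume mounted at
| /data (the mount point is up to you) and the Hugging Face
| toolchain; the first boot downloads, later boots read from
| local NVMe:
|
|     import os
|
|     # must be set before importing transformers
|     os.environ["HF_HOME"] = "/data/hf-cache"
|
|     from transformers import pipeline
|
|     pipe = pipeline("text-generation", model="gpt2")
|     print(pipe("Hello", max_new_tokens=8)[0]["generated_text"])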
| dathinab wrote:
| I know it's not common in research and often makes little
| sense there.
|
| But at least in theory, for deployments you should generate
| deployment images.
|
| I.e. no pip included in the image(!), all dependencies
| preloaded, unnecessary parts stripped, etc.
|
| Models might also be bundled, but not always.
|
| Still large images, but depending on what they're for, the
| same image might be reused often enough that the provider can
| cache it to some degree.
| nextworddev wrote:
| Somehow cheaper than AWS?
| reactordev wrote:
| AWS isn't the cheapest so how is that a surprise? They are a
| business and know how to turn the right knobs to increase cash
| flow. GPUs for AI is one major knob right now.
| CGamesPlay wrote:
| AWS is one of the most expensive infrastructure providers out
| there (especially anything beyond the "basic" services like
| EC2). And even though AWS still has some globally-notable
| uptime issues, "nobody ever got fired for picking AWS".
| dathinab wrote:
| I mean, from hearsay from people who have had to work with
| AWS, Google Cloud, and Microsoft Azure, it seems to me that
| the other two are in practice worse, to the point that they
| would always pick AWS over them even though they hate the
| AWS UX.
|
| And if it's the best of the big 3 providers, then it can't
| be that bad, right ..... right? /s
| andersa wrote:
| It would be absurd if it wasn't.
| seabrookmx wrote:
| They're a "real" cloud provider (with their own hardware) and
| not a reseller like Vercel and Netlify. So this isn't _that_
| surprising. AWS economies of scale do allow them to make
| certain services cheap, but only if they choose to. A lot of
| the time they choose to make money!
| patmorgan23 wrote:
| They run their own data centers.
| tptacek wrote:
| We run our own hardware, but not our own data centers.
| huydotnet wrote:
| Is there any write up on how Fly.io run your
| infrastructure? The "not data center" fact makes me
| interested a little bit.
| tptacek wrote:
| We should write that up! We lease space in data centers
| like Equinix.
| rxyz wrote:
| It's just renting space in a big server room. Every mid-
| to-large city has companies providing that kind of
| service.
| Sohcahtoa82 wrote:
| Genuine question...why are you surprised?
| dathinab wrote:
| As a person working at a startup which used AWS for a while:
|
| *AWS is expensive, always, except if magic*
|
| Where magic means very clever optimizations (often deeply
| affecting your project architecture/code design) which
| require the right amount of knowledge/insight into a very
| confusing UI/UX and enough time to evaluate all aspects.
| I.e. it might simply not be viable for startups, and is
| expensive in its own way.
|
| Though most cheaper alternatives have their own huge bag of
| issues.
|
| Most importantly, fly.io is its own cloud provider, not just
| an easier way to use AWS. I mean, while I don't know if they
| have their own data centers in every region, they do have
| their own servers.
| nakovet wrote:
| About Fly but not about the GPU announcement: I wish they had
| an S3 replacement. They suggest a GNU Affero-licensed project,
| which is a dealbreaker for any business. Needing to leave Fly
| to store user assets was a dealbreaker for us to use Fly on
| our next project, which is sad because I love the simplicity,
| the value for money, and the built-in VPN.
| benatkin wrote:
| This looks promising https://github.com/seaweedfs/seaweedfs
| candiddevmike wrote:
| Seaweed requires a separate coordination setup which may
| simplify the architecture but complicates the deployment.
| JoshTriplett wrote:
| > I wish they had a S3 replacement, they suggest a GNU Affero
| project that is a dealbreaker for any business
|
| AGPL does not mean you have to share everything you've built
| atop a service, just everything you've linked to it and any
| changes you've made to it. If you're accessing an S3-like
| service using only an HTTPS API, that isn't going to make your
| code subject to the AGPL.
| bradfitz wrote:
| Regardless, some companies have a blanket thou-shalt-not-use-
| AGPL-anything policy.
| trollian wrote:
| Lawyercats are the worst cats.
| hiharryhere wrote:
| Some companies, including Google.
|
| I've sold enterprise SaaS to Google and we had to attest that
| we have no AGPL code servicing them. This was for a CRM-like
| app.
| anonzzzies wrote:
| Yep, our lawyers say not to use it, and we have to check the
| components and libs we use too. People are really shooting
| themselves in the foot with that license.
| aragilar wrote:
| You assume that people want you to use their project. For
| MinIO, the AGPL seems to be a way to get people into
| their ecosystem so they can sell exceptions. Others might
| want you to contribute code back.
| anonzzzies wrote:
| I have no problem with contributing back: we do that all
| the time on MIT / BSD projects even if we don't have to.
| AGPL just restricts the use-cases, and (apparently) there
| is limited legal precedent in my region on whether we'd
| have to give away everything that uses it, even code
| that's not really related, so the lawyers (I am not a
| lawyer, so I cannot provide more details) say to avoid it
| completely. Just to be safe. And I am sure it hurts a lot
| of projects... There are many modern projects that do the
| same thing, but they don't share code because the code is
| AGPL.
| corobo wrote:
| Sounds more like the license is doing its job as
| intended, and businesses that can afford lawyers but not
| bespoke licenses are shooting themselves in the foot with
| that policy
| RcouF1uZ4gsC wrote:
| > AGPL does not mean you have to share everything you've
| built atop a service, just everything you've linked to it and
| any changes you've made to it. If you're accessing an S3-like
| service using only an HTTPS API, that isn't going to make
| your code subject to the AGPL.
|
| I am not so sure about that. Otherwise, you could trivially
| get around the AGPL by using HTTPS services to launder your
| proprietary changes.
|
| There is not enough caselaw to say how a case that used only
| HTTP services provided by AGPL software to run a proprietary
| service would turn out, and it is not worth betting your
| business on it.
| xcdzvyn wrote:
| > you could trivially get around the AGPL by using https
| services to launder your proprietary changes.
|
| This is a very interesting proposition that makes me
| reconsider my opinion of AGPL.
| mbreese wrote:
| Anything "clever" in a legal sense is a red flag for
| me... Computer people tend to think of the law as a black
| and white set of rules, but it is and it isn't. It's
| interpreted by people and "one clever trick" doesn't
| sound like something I'd put a lot of faith in. Intent
| can matter a lot.
|
| (Regardless of how you see the AGPL)
| internetter wrote:
| > Computer people tend to think of the law as a black and
| white set of rules
|
| I've never seen someone put this into words, but it makes
| a lot of sense. I mean, idealistically computers are
| deterministic, whereas the law is not (by design), yet
| there exist many parallels between the two. For
| instance, the lawbook has strong parallels to the
| documentation for software. So it makes sense why
| programmers might assume the law is also mostly
| deterministic, even if this is false.
| ozr wrote:
| I'm an engineer with a passing interest in the law. I've
| frequently had to explain to otherwise smart and capable
| people that their _one weird trick_ will just get them a
| contempt charge.
| Dylan16807 wrote:
| On the other hand the AGPL itself is trying to be one
| clever trick in the first place, so maybe it's
| appropriate here.
| xcdzvyn wrote:
| Even if that wasn't directly targeted at me, I'll
| elaborate on my concern:
|
| That it's possible to interpret the AGPL both ways (that
| the prior hack is legal, and that it is not), and that
| the project author could very well believe either one,
| suggests to me that the AGPL's terms aren't rigidly
| binding, but ultimately a kind of "don't do what the
| author thinks the license says, whatever that is".
| c0balt wrote:
| > > AGPL does not mean you have to share everything you've
| built atop a service, just everything you've linked to it
| and any changes you've made to it. If you're accessing an
| S3-like service using only an HTTPS API, that isn't going
| to make your code subject to the AGPL.
|
| Correct; this is a known caveat that's also covered a bit
| more in the GNU article about the AGPL when discussing
| Service as a Software Substitute, ref:
| https://www.gnu.org/licenses/why-affero-gpl.html.en
| benhoyt wrote:
| They're about to get an S3 replacement, called Tigris (it's a
| separate company but integrated into flyctl and runs on Fly.io
| infra): https://benhoyt.com/writings/flyio-and-tigris/
| benbjohnson wrote:
| We have a region-aware S3 replacement that's in beta right
| now: https://community.fly.io/t/global-caching-object-storage-
| on-...
| simonw wrote:
| Sounds like you might be interested in the Tigris preview:
|
| - https://www.tigrisdata.com/
|
| - https://benhoyt.com/writings/flyio-and-tigris/ (discussed
| here: https://news.ycombinator.com/item?id=39360870)
|
| - https://fly.io/docs/reference/tigris/
| martylamb wrote:
| Funny you should mention that:
| https://news.ycombinator.com/item?id=39360870
| tptacek wrote:
| Give us a minute.
| itake wrote:
| The dealbreaker should be their uptime and support. They
| deleted my database and have many uptime issues.
| riquito wrote:
| Is there any configuration to keep the machine alive for X
| seconds after a request has been served, instead of scaling
| down to zero immediately? I couldn't find it skimming the
| docs.
| mrkurt wrote:
| Machines are both dumber and more powerful than you'd think.
| Scaling down means just exit(0) if you have the right restart
| policy set. So you can implement any kind of keep-warm logic
| you want.
| Aeolun wrote:
| Oh! I hadn't thought if it like that. That makes sense.
| kylemclaren wrote:
| you might also be looking for `kill_signal` and `kill_timeout`:
| https://fly.io/docs/reference/configuration/#runtime-options
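|
| A tiny sketch of draining on the configured signal (this
| assumes kill_signal is set to SIGINT; the handler just
| stands in for real cleanup):
|
|     import signal
|     import sys
|
|     def drain(signum, frame):
|         # finish in-flight work within kill_timeout,
|         # then exit cleanly so the machine stops
|         sys.exit(0)
|
|     signal.signal(signal.SIGINT, drain)
|     signal.pause()  # stand-in for the real serving loop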
| xena wrote:
| Hi, author of the post and Fly.io devrel here in case anyone has
| any questions. GPUs went GA yesterday, you can experiment with
| them to your heart's content should the fraud algorithm machine
| god smile upon you. I'm mostly surprised my signal post about
| what the "GPUs" are didn't land well here:
| https://fly.io/blog/what-are-these-gpus-really/
|
| If anyone has any questions, fire away!
| qeternity wrote:
| I posted further down before seeing your comment. First,
| congrats on the launch!
|
| But who is the target user of this service? Is this mostly just
| for existing fly.io customers who want to keep within the
| fly.io sandbox?
| subarctic wrote:
| Commenters like this, for one thing:
| https://news.ycombinator.com/item?id=34242767
| xena wrote:
| Part of it is for people that want to do GPU things on their
| fly.io networks. One of the big things I do personally is I
| made Arsene (https://arsene.fly.dev) a while back as an
| exploration of the "dead internet" theory. Every 12 hours it
| pokes two GPUs on Fly.io to generate article prose and key
| art with Mixtral (via Ollama) and an anime-tuned Stable
| Diffusion XL model named Kohaku-XL.
|
| Frankly, I also see the other part of it as a way to ride the
| AI hype train to victory. Having powerful GPUs available to
| everyone makes it easy to experiment, which would open Fly.io
| as an option for more developers. I think "bring your own
| weights" is going to be a compelling story as things advance.
| gooseyman wrote:
| https://en.m.wikipedia.org/wiki/Dead_Internet_theory
|
| What have you learned from the exploration?
| xena wrote:
| Enough that I'd probably need to write a blogpost about
| it and answer some questions that I have about it. The
| biggest one I want to do is a sentiment analysis of these
| horoscopes vs market results to see if they are
| "correct".
| cosmojg wrote:
| Interesting setup! What's the monthly cost of running
| Arsene on fly.io?
| xena wrote:
| Because I have secret magical powers that you probably
| don't, it's basically free for me. Here's the breakdown
| though:
|
| The application server uses Deno and Fresh
| (https://fresh.deno.dev) and requires a shared-1x CPU at
| 512 MB of ram. That's $3.19 per month as-is. It also uses
| 2GB of disk volume, which would cost $0.30 per month.
|
| As far as post generation goes: when I first set it up it
| used GPT-3.5 Turbo to generate prose. That cost me
| rounding error per month (maybe like $0.05?). At some
| point I upgraded it to GPT-4 Turbo for free-because-I-
| got-OpenAI-credits-on-the-drama-day reasons. The prose
| level increase wasn't significant.
|
| With the GPU it has now, a cold load of the model plus a
| prose generation run takes about 1.5 minutes. If I didn't
| have reasons to keep that machine pinned to a GPU
| (involving other ridiculous ventures), it would need about
| 5 minutes per day of GPU time (I rounded the time up to
| make the math easier) and a 40 GB volume (I now use Nous
| Hermes Mixtral at Q5_K_M precision, so about 32 GB of
| weights). That's something like $6 per month for the
| volume, plus 2.5 hours of GPU time per month at about
| $6.25, on an L40s.
|
| In total it's probably something like $15.75 per month.
| That's a fair bit on paper, but I have certain
| arrangements that make it cost significantly less for
| me. I could re-architect Arsene to not have to be online
| 24/7, but it's frankly not worth it when the big cost is
| the GPU time and weights volume. I don't know of a way to
| make that better without sacrificing model quality more
| than I have to.
|
| For a shitpost though, I think it's totally worth it to
| pay that much. It's kinda hilarious and I feel like it
| makes for a decent display of how bad things could get if
| we go full "AI replaces writers" like some people seem to
| want for some reason I can't even begin to understand.
|
| I still think it's funny that I have to explicitly tell
| people to not take financial advice from it, because if I
| didn't then they would.
| tptacek wrote:
| This isn't the target user, but the boy's been using it at
| the soil bacteria lab he works in to do basecalling on
| FAST5 data from a nanopore sequencer.
| yard2010 wrote:
| Can you please elaborate?
| tptacek wrote:
| I am nowhere within a million miles smart enough to
| elaborate on this one.
| bl4kers wrote:
| How difficult would it be to set up Folding@home on these?
| https://foldingathome.org
| xena wrote:
| I'm not sure; the more it uses CUDA, the easier, I'd bet. I
| don't know if it would be fiscally worth it though.
| yla92 wrote:
| Not a question but the link "Lovelace L40s are coming soon
| (pricing TBD)" is 404.
| xena wrote:
| Uhhhh that's not ideal. I'll go edit that after dinner.
| Thanks!
| thangngoc89 wrote:
| If it's a link to nvidia.com then it's expected to be broken.
| Seriously, I've never seen a valid link to nvidia.com
| Nevin1901 wrote:
| How fast are cold starts, and how do you compare against
| other GPU providers (RunPod, Modal, etc.)?
| xena wrote:
| The slowest part is loading weights into vram in my
| experience. I haven't done benchmarking on that. What kind of
| benchmark would you like to see?
| ipsum2 wrote:
| I would like to see time to first inference for typical
| models (llama-7b first token, SDXL 1 step, etc)
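|
| A minimal version of that benchmark (llama-cpp-python with
| a GGUF file on a volume; the model path is a placeholder):
|
|     import time
|     from llama_cpp import Llama
|
|     t0 = time.monotonic()
|     llm = Llama(model_path="/data/llama-7b.Q4_K_M.gguf")
|     print(f"load: {time.monotonic() - t0:.1f}s")
|
|     t0 = time.monotonic()
|     for _ in llm("Hello", max_tokens=1, stream=True):
|         break  # stop at the first token
|     print(f"first token: {time.monotonic() - t0:.2f}s")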
| thangngoc89 wrote:
| This is right on time. I'm evaluating "serverless" GPU
| services for my upcoming project. I see in the announcement
| that pricing is per hour. Is scaling to zero priced by the
| minute/second? For my workflow, medical image segmentation,
| one file takes about 5 minutes.
| benreesman wrote:
| I'd be fascinated to hear your thoughts on Apple hardware for
| inference in particular. I spend a lot of time tuning up
| inference to run locally for people with Apple Silicon on-prem
| or even on-desk, and I estimate a lot of headroom left even
| with all the work that's gone into e.g. GGUF.
|
| Do you think the process node advantage and SoC/HBM-first will
| hold up long enough for the software to catch up? High-end
| Metal gear looks expensive until you compare it to NVIDIA
| with 64GB+ of reasonably high-bandwidth memory attached to
| dedicated FP vector units :)
|
| One imagines that being able to move inference workloads on and
| off device with a platform like `fly.io` would represent a lot
| of degrees of freedom for edge-heavy applications.
| xena wrote:
| Well, let me put it this way. I have a MacBook with 64 GB of
| vram so I can experiment with making an old-fashioned x.ai
| clone (the meeting scheduling one, not the "woke chatgpt"
| one) amongst other things now. I love how Apple Silicon makes
| things vroomy on my laptop.
|
| I do know that getting those working in a cloud provider
| setup is a "pain in the ass" (according to ex-AWS friends) so
| I don't personally have hope in seeing that happen in
| production.
|
| However, the premise makes me laugh so much, so who knows? :)
| niz4ts wrote:
| As far as I know, Fly uses Firecracker for their VMs. I've been
| following Firecracker for a while now (even using it in a
| project), and they don't support GPUs out of the box (and have no
| plan to support it [1]).
|
| I'm curious to know how Fly figured out their own GPU support
| with Firecracker. In the past they had some very detailed
| technical posts on how they achieved certain things, so I'm
| hoping we'll see one on their GPU support in the future!
|
| [1]: https://github.com/firecracker-
| microvm/firecracker/issues/11...
| mrkurt wrote:
| The simple spoiler is that the GPU machines use Cloud
| Hypervisor, not Firecracker.
| niz4ts wrote:
| Way simpler than what I was expecting! Any notes to share
| about Cloud Hypervisor vs Firecracker operationally? I'm
| assuming the bulkier Cloud Hypervisor doesn't matter much
| compared to the latency of most GPU workloads.
| tptacek wrote:
| They are operationally pretty much identical. In both
| cases, we drive them through a wrapper API server that's
| part of our orchestrator. Building the cloud-hypervisor
| wrapper took me all of about 2 hours.
| qeternity wrote:
| Who is the target market for this? Small/unproven apps that need
| to run some AI model, but won't/can't use hosted offerings by the
| literally dozens of race-to-zero startups offering OSS models?
|
| We run plenty of our own models and hardware, so I get wanting to
| have control over the metal. I'm just trying to figure out who
| _this_ is targeted at.
| mrkurt wrote:
| We have some ideas but there's no clear answer yet. Probably
| people building hosting platforms. Maybe not obvious hosting
| platforms, but hosting platforms.
| KTibow wrote:
| Fly is an edge network - in theory, if your GPUs are next to
| your servers and your servers are next to your users, your app
| will be very fast, as highlighted in the article. In practice
| this might not matter much since inference takes a long time
| anyway.
| tptacek wrote:
| We're really a couple of things; the edge stuff was where we
| got started in 2020, but "fast booting VMs" is just as
| important to us now, and that's something that's useful
| whether or not you're doing edge stuff.
| joshxyz wrote:
| this is crazy, this move alone cements fly as an edge player
| for the next 3 / 5 / 10 years.
| dathinab wrote:
| TL;DR: (skip to last paragraph)
|
| - having the GPU compute in the same data center or at least
| from the same cloud provider can be a huge plus
|
| - it's not that rare for various providers we have tried out to
| run out of available A100 GPUs, even with large providers we
| had issues like that multiple times (less an issue if you
| aren't locked to specific regions)
|
| - not all providers provide a usable scale down to zero "on
| demand" model, idk. how well it works with fly long term but
| that could be another point
|
| - race-to-zero startups have a tendency not to last; it's
| kind of by design that out of 100 of them only a very few
| survive
|
| - if you are already on fly and write a non-public tech demo
| which just gets evaluated a few times, their GPU offering can
| act as a default, don't-think-much-about-it solution (though
| using e.g. Huggingface services would often be more likely)
|
| - A lot of companies can't run their own hardware for various
| reasons; at best they can rent a rack in another data center,
| but for small use-cases this isn't always worth it. Similarly,
| there are use cases which do need A100s but only run them
| rarely (e.g. on weekly analytics data) - potentially less than
| 1h/week, in which case race-to-zero pricing might not look
| interesting at all
|
| To sum up, I think there are many small reasons why some
| companies, not just startups, might have an interest in fly
| GPUs, especially if they are already on fly. But there is no
| single "that's why" argument, especially if you are already
| deploying to another cloud.
| qeternity wrote:
| It's not like Fly has GPUs in every PoP...so there goes all
| the same datacenter stuff (unless you just want to be in the
| PoP with GPUs in which case...)
|
| But none of this answers my question.
|
| I'm trying to understand the intersection of things like
| "people who need GPU compute" and "people who need to scale
| down to zero".
|
| This can't be a very big market.
| DreamGen wrote:
| I am not seeing any race-to-zero in the hosted offering space.
| Most charge multiples of what you would pay on GCP, and the
| public prices on GCP are already several times what you would
| pay as an enterprise customer.
| qeternity wrote:
| I don't know what you think I'm talking about, or who is
| charging multiples of GCP? But I'm talking about hosted
| inference, where many startups are offering Mistral models
| cheaper than Mistral are.
| dcsan wrote:
| Can fly run cog files like Replicate uses? Would be nice to
| take those pre-packaged models and run them here with the
| same prediction API.
|
| Maybe because it's Replicate's they might be hesitant to
| adopt it, but it does seem to make things a lot smoother.
| Even with Lambda Labs' Lambda Stack I still hit CUDA hell.
| https://github.com/replicate/cog
| UncleOxidant wrote:
| I don't want to deploy an app, I just want to play around
| with LLMs and don't want to go out and buy an expensive PC
| with a high-end GPU just now. Is Fly.io a good way to go?
| What about alternatives?
| leourbina wrote:
| Paperspace is a great way to go for this. You can start by
| just using their notebook product (similar to Colab), and you
| get to pick which type of machine/GPU it runs on. Once you
| have the code you want to run, you can rent machines on
| demand:
|
| https://www.paperspace.com/notebooks
| janalsncm wrote:
| I used paperspace for a while. Pretty cheap for mid tier gpu
| access (A6000 for example). There were a few things that
| annoyed me though. For one, I couldn't access free GPUs with
| my team account. So I ended up quitting and buying a 4090
| lol.
| mrkurt wrote:
| You might actually be better off building a gaming rig and
| using that. The datacenter GPUs are silly expensive, because
| this is how NVIDIA price discriminates. The consumer gaming
| GPUs work really well, and you can buy them for almost as
| little as you can lease datacenter ones for.
| mrcwinn wrote:
| https://ollama.com/ - Easy setup, run locally, free.
| UncleOxidant wrote:
| Yeah, but I've got an RTX1070 in my circa 2017 PC. How well
| is that going to work?
| thangngoc89 wrote:
| It's slow but still decent since it has 8GB of VRAM.
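|
| Rough rule of thumb for what fits in 8GB (weights only,
| ignoring KV cache and overhead):
|
|     params_b = 7  # a 7B-parameter model
|     for bits in (16, 8, 4):
|         print(f"{bits}-bit: ~{params_b * bits / 8:.1f} GB")
|     # 16-bit: 14 GB (no), 8-bit: 7 GB (tight),
|     # 4-bit: 3.5 GB (fits comfortably)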
| jeswin wrote:
| You mean GTX 1070. There's no RTX 1070.
| nojs wrote:
| I can recommend runpod.io after a few months of usage - very
| easy to spin up different GPU configurations for testing and
| the pricing is simple and transparent. Using TheBloke docker
| images you can get most local models up and running in a few
| minutes.
| ignoramous wrote:
| > _What about alternatives?_
|
| Custom models? Apart from the Big 3 (in no particular order):
|
| - https://together.ai/
|
| - https://replicate.com/
|
| - https://anyscale.com/
|
| - https://baseten.co/
|
| - https://modal.com/
|
| - https://banana.dev/
|
| - https://runpod.io/
|
| - https://bentoml.com/
|
| - https://brev.dev/
|
| - https://octo.ai/
|
| - https://cerebrium.ai/
|
| ...
| ayewo wrote:
| > Apart from the Big 3 ...
|
| Who are the big 3 in this context?
| gk1 wrote:
| OpenAI, Anthropic, Cohere
| mrb wrote:
| Use https://vast.ai and rent a machine for as long as you need
| (minutes, hours, days). You pick the OS image, and you get a
| root shell to play with. An RTX 4090 currently costs $0.50 per
| hour. It literally took me less than 15 minutes to sign up for
| the first time a few weeks ago.
|
| For comparison, the first-time experience on Amazon EC2 is
| much worse. I had tried to get a GPU instance on EC2 but
| couldn't reserve it (cryptic error message). Then I realized
| that as a first-time EC2 user my default quota simply doesn't
| allow any GPU instances. After contacting support and waiting
| 4-5 days I eventually got a response saying my quota was
| increased, but I still can't launch a GPU instance...
| apparently my quota is still zero. At this point I gave up
| and found vast.ai. I don't know if Amazon realizes how
| FRUSTRATING their useless default quotas are for first-time
| EC2 users.
| janalsncm wrote:
| Pretty much had the same experience with EC2 GPUs. No
| permission, had to contact support. Got permission a day
| later. I wanted to run on A100 ($30/hour, 8GPU minimum) but
| they were out of them that night. I tried again next day,
| same thing. So I gave up and used RunPod.io.
| dathinab wrote:
| Main question: do you need an A100?
|
| Some use cases do.
|
| If not, there are much cheaper consumer-GPU-based choices.
|
| But then again, maybe you only use it for 1-2 hours in total,
| in which case the price difference might just not matter.
| k8svet wrote:
| Does the basic other stuff function? I am _shocked_ at how
| our production usage of Fly has gone. Even basic stuff, like
| support not being able to just... look up internal platform
| issues. Cryptic/non-existent error messages. I'm not
| impressed. It feels like it's compelling to those scared of
| or ignorant of Kubernetes. I thought I was over Kubernetes,
| but Fly makes me miss it.
| chachra wrote:
| Been on it 7 months, 0 issues. Feel like you're alone on this
| potentially.
| weird-eye-issue wrote:
| Alone? _Every_ thread about Fly has complaints about
| reliability and people complain about it on Twitter too
| nixgeek wrote:
| That hasn't been my experience with Fly, but I'm sorry to
| hear it seems to have been for others :(
| chachra wrote:
| ok, possibly not alone; maybe the issues happened before I
| started using them extensively. I've had ~no downtime that
| affected me in 7 months.
|
| I do wish they had some features I need, but their support
| and responses are top notch. And I've lost much less hair
| and time than I would going full-blown AWS or another cloud
| provider.
| jokethrowaway wrote:
| To be fair, most hosting providers come with plenty of
| public complaints about downtime. The big ones do way
| better: the best is AWS, then GCP, and last Azure. They
| cost stupid money though.
|
| DigitalOcean has been terrible for me; some regions just
| go down every month and I lose thousands of requests,
| increasing my churn rate.
|
| Fly.io had tons of weird issues but it has gotten better
| in the last few months. It's still very incomplete in
| terms of functionality, and figuring out how to deploy
| the first time is a massive pain.
|
| My plan is to add Hetzner and load balance with bunnycdn
| across DO and H.
| loloquwowndueo wrote:
| Every thread on the Internet about any product or service
| has complaints.
| weird-eye-issue wrote:
| Not to this extent; it has always stood out to me in
| particular.
| weird-eye-issue wrote:
| Actually here is a good example: Cloudflare. Sure, people
| complain a ton about privacy, but I haven't seen a single
| complaint about the reliability of Cloudflare Workers or a
| similar product in the dozens of threads I've seen on HN.
| jrockway wrote:
| It's hard to tell how meaningful the reviews are. I have
| used AWS, GCP, DigitalOcean, and Linode throughout my
| career. Every single one of these, through no fault of
| mine or my team's, messed up and caused downtime. Like,
| you can get most SRE types in a room to laugh if you blurt
| out "us-east-1", because it's known to be so unreliable.
| And yet, it's where every Fortune 500 puts every service;
| we laugh about the reliability and it's literally powering
| the economy just fine.
|
| So yes, a lot of people on HN complain about fly's
| reliability. fly posts to HN a lot and gives them the
| opportunity. Is it actually meaningful compared to the
| alternatives? It's very hard to tell.
| tptacek wrote:
| Hoo boy.
|
| First: this is 100% a "live by the sword, die by the
| sword" situation for us. We're as aware as anybody about
| our weird HN darling status (this is a post from two
| months ago, about an announcement from many months ago,
| that spent like 12 hours plastered to the front page; we
| have no idea why it hit today, and it actually stepped on
| another thing we wanted to post today so don't think we
| secretly orchestrated any of this!). We've allowed
| ourselves to be ultra-visible here, and threads like this
| are a natural consequence.
|
| Moreover: a lot of this criticism is well warranted! I
| can cough up a litany of mitigating factors (the guy who
| stored his database in ephemeral instance storage instead
| of a volume, for instance), but I mean, come on. The
| single most highly upvoted and trafficked thing we've
| ever written was a post a year ago owning up to
| reliability issues on the platform. People have
| definitely had issues!
|
| A fun cop-out answer here is to note all the times people
| compare us to AWS or Cloudflare, as if we were a
| hyperscaler public cloud. More fun still is to search HN
| for stories about us-east-1. We certainly do that to
| self-soothe internally! And: also? If your only
| consideration for picking a place to host an application
| is platform reliability? You're hosting on AWS anyways.
| But it's still a cop-out.
|
| So I guess I'd sum all this up as: we've picked a hard
| problem to work on. Things are mathematically guaranteed
| to go wrong even if we're perfect, and we are not that.
| People should take criticisms of us on these threads
| seriously. We do. This is a tough crowd (the threads, if
| not the vote scores on our blog post) and there's value
| in that. Over the last year, and through this upcoming
| year, staffing for infra reliability has been the single
| biggest driver of hiring at Fly.io, I think that's the
| right call, and I think the fact that we occasionally get
| mauled on threads is part of what enabled us to make that
| call.
|
| (Ordinarily I'd shut up about this stuff and let the
| thread die out itself, but some dearly loved user of ours
| took a stand and said they'd never had any problems on
| us, which: you can imagine the "ohhhhh nooooooo" montage
| that took place in my brain when I read that someone had
| essentially dared the thread to come up with times when
| we'd sucked for some user, so I guess all bets are off.
| Go easy on Xe, though: they really are just an ultra-
| helpful uncynical person, and kind of walked into a
| buzzsaw here).
| jrockway wrote:
| I also don't know why HN is so upset about people willing
| to help out in the threads. The way I see it is, if you
| talk about your product on HN, inevitably someone will
| remember they have a support inquiry while HN is open,
| and ask it there instead of over email. Since employees
| are probably reading HN, they are naturally going to want
| to answer or say they escalated there. I don't think it's
| some sort of scam, just what any reasonable person would
| do.
| tptacek wrote:
| It's become a YC cliche, that the way to get support for
| any issue is to get a complaint upvoted to the top of a
| thread. People used to talk about "Collison installs",
| which are real-use product demos that are so slick your
| company founder (in this case Stripe's 'pc) can just
| wander around installing your product for people to
| evangelize it; there should be another Collison term for
| decisively resolving customer support issues by having
| the founder drop into a thread, and I think that's the
| vibe people are reacting to here.
| uo21tp5hoyg wrote:
| https://community.fly.io/t/reliability-its-not-great/11253
| heeton wrote:
| Not alone. I've been part of two teams who evaluated fly,
| hit weird reliability or stability issues, and deemed it
| not ready yet.
| yawnxyz wrote:
| this is what I thought, until I once spent two days trying to
| publish a new, trivial code change to my Fly.io hosted API --
| it just wouldn't update! And every time I tried to re-publish
| it'd give me a slightly different error.
|
| When it works, it's brilliant. The problem is that it hasn't
| worked too well in the last few months.
| xena wrote:
| Can you email the first two letters of my username at fly.io
| with more details? I'd love to find out what you've been having
| trouble with so I can help make the situation better any way I
| can. Thanks!
| bongobingo1 wrote:
| Another support.flycombinator.com classic.
| azinman2 wrote:
| Would you rather they be unresponsive?
| lostemptations5 wrote:
| It's HN -- if the company proved responsive it might
| invalidate his OP and everyone who bandwagons on it.
| zmgsabst wrote:
| Why would you care about customer problems if they don't
| embarrass you in public?
|
| /s
| keeganpoppen wrote:
| the only thing easier than them responding in this thread
| is someone making this comment in this thread...
| throwaway220033 wrote:
| ...as if it's one person who had issues! I thought it was
| just incompetence, but now it looks like theatre: they're
| just pretending now.
| ignoramous wrote:
| I've been a paying Fly.io customer for 3 years now, and for
| the past 18 months, I've had no real issue with any of my
| apps. In fact, I don't even monitor our Fly.io servers any
| more than I monitor S3 buckets; the kind of _zero devops_ I
| expect from it is already a reality.
|
| > _it's one person who had issues_
|
| Issues specific to an application or one particular account
| _have_ to be addressed as special cases (like any
| _NewCloud_ platform, Fly.io has its own idiosyncrasies).
| The first step anyway is figuring out just what you're
| dealing with (special vs. common failure).
|
| > _looks like a theatre_
|
| I have had the Fly.io CEO do customer service. Some may
| call it theatre, but this isn't uncommon for smaller
| upstarts, and indicative of their commitment, if anything.
| throwaway220033 wrote:
| You're right, we have been quite unfair to Fly.io. All
| these people who are talking badly about Fly.io - those
| who lost their database, those who spent weekends trying
| to get their product up and running while Fly.io wasn't
| even communicating but was busy with its public image -
| we're just some bad people talking badly about Fly.io.
| Your one personal experience invalidates all the data.
|
| Maybe we should all switch to using your personal
| account, since everything works great for you.
| pech0rin wrote:
| Yep they have terrible reliability and support. Couldn't deploy
| for 2 days once and they actually told me to use another
| company. Unmanaged dbs masquerading as managed. Random
| downtime. I could go on but it's not a production ready service
| and I moved off of it months ago.
| biorach wrote:
| > Unmanaged dbs masquerading as managed
|
| Are you talking about fly postgres? Because I use it and feel
| they've been pretty clear that it's unmanaged.
| andy_ppp wrote:
| Seriously! That's crazy. I need to setup terraform and move
| to AWS before launching I guess.
| biorach wrote:
| > Seriously! That's crazy
|
| huh? it does what it says on the tin. nothing crazy about
| it.
|
| They spell out for you in detail what they offer:
| https://fly.io/docs/postgres/getting-started/what-you-
| should...
|
| And suggest external providers if you need managed
| postgres: https://fly.io/docs/postgres/getting-
| started/what-you-should...
| andy_ppp wrote:
| I was shocked because I didn't realise it wasn't managed.
| Even DigitalOcean offers managed Postgres.
|
| Personally, if you are offering a service like Fly I
| think the database should be managed; the whole point of
| Fly.io is to provide abstractions that make production
| simpler.
|
| Do you think the type of user who is using fly.io is
| interested in, or capable of, managing their own Postgres
| database? I'd rather just trust RDS or another provider.
| corobo wrote:
| > Do you think the type of user who is using fly.io is
| interested in or capable of managing their own Postgres
| database?
|
| Honestly.. kinda, yeah
|
| At least I'm projecting my weird "I want to love you for
| some reason, Fly" plus my skillset onto anyone else that
| wants to love Fly too haha
|
| They feel very developer/nerd/HN/tinkerer targeted
| benzible wrote:
| The header at the top of their Getting Started is "This Is
| Not Managed Postgres" [1]
|
| and they have a managed offering [2] in private beta now...
|
| > Supabase now offers their excellent managed Postgres
| service on Fly.io infrastructure. Provisioning Supabase via
| flyctl ensures secure, low-latency database access from
| applications hosted on Fly.io.
|
| [1] https://fly.io/docs/postgres/getting-started/what-you-
| should...
|
| [2] https://fly.io/docs/reference/supabase/
| awestroke wrote:
| I have run several services on Fly for almost a year now, have
| not had any issues.
| parhamn wrote:
| I was hoping to migrate to Fly.io, and during my testing I
| found that simple deploys would drop connections for a few
| seconds during a deploy switchover. Try a `watch -n 2 curl
| <serviceipv4>` during a deploy to see for yourself (try any
| one of the strategies documented, including blue-green). I
| wonder how many people know this?
|
| When I tested it I was hoping for at worst early termination of
| old connections with no dropped new connections and at best I
| expected them to gracefully wait for old connections to finish.
| But nope, just a full-downtime switchover every time. But
| then when you think about the network topology described in
| their blog posts, you realize there's no way it could've been
| done correctly to begin with.
|
| It's very rare for me to comment negatively on a service, but
| the fact that this was the case, paired with the way support
| acted like we were crazy when we sent video evidence of it,
| definitely irked me by infrastructure-company standards.
| Wouldn't recommend it outside of toy applications now.
|
| > It feels like it's compelling to those scared of or ignorant
| of Kubernetes
|
| I've written pretty large deployment systems for Kubernetes.
| This isn't it. There's a real space for Heroku-like deploys
| done properly, and no one is really doing it well (or at
| least not without ridiculously thin or expensive compute
| resources).
| asaddhamani wrote:
| Yeah, I had a similar experience where my builds were frozen
| for a couple of days, such that I was not able to release any
| updates. When I emailed their support, I got an auto-response
| asking me to post in the forum. Pretty much all hosts are
| expected to offer a ticket system, even for their unmanaged
| services, if it's a problem on their side. I just moved all
| my stuff over to Render.com; it's more expensive, but it's
| been reliable so far.
| loloquwowndueo wrote:
| The first (pinned) post in the fly.io forum explains it:
|
| https://community.fly.io/t/fly-io-support-community-vs-
| email...
| malfist wrote:
| That forum post just says what OP said: that they will
| ignore all tickets from unmanaged customers. Which is a
| pretty shitty thing to do to your customers.
| sofixa wrote:
| > I've written pretty large deployment systems for
| kubernetes. This isn't it. Theres a real space for heroku-
| like deploys done properly and no one is really doing it well
| (or at least without ridiculously thin or expensive compute
| resources)
|
| Have you tried Google Cloud Run (based on Knative)? I've
| never used it in production, but on paper it seems to fit
| the bill.
| parhamn wrote:
| Yeah, we're mostly hosted there now. The CPU/virtualization
| feels slow but I haven't had time to confirm (we had to
| offload super small ffmpeg operations).
|
| It's in a weird place between Heroku and Lambda. If your
| container has a bad startup time, like one of our Python
| services, autoscaling can't be used as latency becomes a
| pain. It's also common to deploy services there that need
| things like health checks (unlike functions, which you
| assume are alive); this implies at least 1 instance of
| sustained use as well, assuming you do per-minute health
| checks. Their domain mapping service is also really really
| bad and can take hours to issue a cert for a domain, so you
| have to be very careful about putting an lb in front of it
| for hostname migrations.
|
| I don't care right now, but the fact that we're paying 5x
| for compute is starting to bother me a bit. An 8-core 16GB
| 'node' is ~$500/month ($100 on DO) assuming you don't scale
| to zero (which you probably won't). Plus I'm pretty sure the
| 8 cores reported aren't a meaty 8 cores.
|
| But it's been pretty stable and nice to use otherwise!
| jetbalsa wrote:
| A 6c/12t dedicated server with 32GB of RAM is $65 a
| month with OVH.
|
| I do get that it is a bare server, but if you deploy
| even just bare containers to it, you would save a good
| bit of money and get better performance from it.
| doctorpangloss wrote:
| Another interpretation is the so-called dedicated servers
| are too good to be true.
| jrockway wrote:
| It depends on what the 6 cores are. Like I have a 8C/8T
| dedicated server sitting in my closet that costs $65 per
| the number of times you buy it. (Usually once.) The cores
| are not as fast as the highest-end Epyc cores, however ;)
| ac29 wrote:
| At the $65/month level for an OVH dedicated server, you
| get a 6-core CPU from 2018 and a 500Mbps public network
| limit. Doesn't even seem like that good a deal.
|
| There is also a $63/month option that is significantly
| worse.
| dig1 wrote:
| I have yet to have a positive experience with Cloud Run. I
| have one project on it, and Cloud Run is very
| unpredictable with autoscaling. Sometimes it starts
| spinning containers up/down without any apparent reason,
| and after hounding Google support for months, they said it
| is "expected behavior". Good luck trying to debug this
| independently, because you don't have access to Knative
| logs.
|
| Starting containers on Cloud Run is weirdly slow, and oh
| boy, how expensive that thing is. I'm getting the
| impression that pure VMs + Nomad would be a way better
| option.
| parhamn wrote:
| > Starting containers on Cloud Run is weirdly slow
|
| What is this about? I assumed a highly throttled CPU or
| terrible disk performance. A Python process that would
| start in 4 seconds locally could easily take 30 seconds
| there.
| JoshTriplett wrote:
| Last I checked, Cloud Run isn't actually running real
| Linux, it's emulating Linux syscalls.
| sofixa wrote:
| > I'm getting the impression that pure VMs + Nomad would
| be a way better option
|
| As a long time Nomad fan (disclaimer: now I work at
| HashiCorp), I would certainly agree. You lose some on the
| maintenance side because there's stuff for you to deal
| with that Google could abstract for you, but the added
| flexibility is _probably_ worth it.
| jonatron wrote:
| I just use AWS EC2, load balancer, auto scaling groups.
| The user_data pulls and runs a docker image. To deploy I
| do an instance refresh which has no downtime. Obvious
| downside is more configuration than more managed
| services.
| giovannibonetti wrote:
| I have been using Google Cloud Run in production for a few
| years and have had a very good experience. It has the
| fastest auto scaler I have ever seen, except only for FaaS,
| which are not a good option for client-facing web services.
| davidspiess wrote:
| Same experience here, using it for years in production
| for our critical api services without issues.
| rollcat wrote:
| > Try a `watch -n 2 curl <serviceipv4>` during a deploy
|
| You need blackbox HTTP monitoring right now, don't _ever_
| wait for your customer to tell you that your service is down.
|
| I use Prometheus (&Grafana), but you can also get a hosted
| service like Pingdom or whatever.
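|
| A blackbox probe is only a few lines if you want to roll
| one (the URL and interval are placeholders; a real setup
| alerts instead of printing):
|
|     import time
|     import requests
|
|     URL = "https://example.com/healthz"
|
|     while True:
|         try:
|             up = requests.get(URL, timeout=5).status_code == 200
|         except requests.RequestException:
|             up = False
|         if not up:
|             print("DOWN", time.strftime("%F %T"))
|         time.sleep(30)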
| morgante wrote:
| Unfortunately this is a pretty common story. Half the people I
| know who adopted Fly migrated off it.
|
| I was very excited about Fly originally, and built an entire
| orchestrator on top of Fly machines--until they had a multi-day
| outage where it took days to even get a response.
|
| Kubernetes can be complex, but at least that complexity is (a)
| controllable and (b) fairly well-trodden.
| loloquwowndueo wrote:
| Fly.io is not comparable to Kubernetes. It's a bit like
| comparing AWS to Terraform.
|
| Or to clarify your comment, Kubernetes on which cloud?
| Amazon? google? Linode?
| jrockway wrote:
| Kubernetes on AWS, GCP, and Linode are all controllable and
| well-trodden.
|
| I definitely understand the comparison between Kubernetes
| and fly. You have a couple of apps that are totally
| unrelated, managed by separate teams, and you want to
| figure out how you can avoid the two teams duplicating
| effort. One option is to use something like fly.io, where
| you get a command line you run to build your project and
| push the binary to a server. Another option is to self-host
| infrastructure like Kubernetes, and eventually get that
| down to one command to build and push (or have your CI
| system do it).
|
| The end result that organizations are aiming for are
| similar; developers code the code and then the code runs in
| production. Frankly, a lot of toil and human effort is
| spent on this task, and everyone is aiming to get it to
| take less effort. fly.io is an approach. Kubernetes is an
| approach. Terraform on AWS is an approach.
| loloquwowndueo wrote:
| Maybe you're comparing flyctl with Kubernetes?
|
| That'd be a slightly more valid comparison albeit flyctl
| is much less ambitious by choice and design. That said,
| using flyctl to orchestrate your deployments is not the
| only way to Fly. Example:
|
| https://fly.io/blog/fks/
| morgante wrote:
| > Fly.io is not comparable to Kubernetes.
|
| The Fly team has worked on solving similar problems to
| Kubernetes. E.g. https://fly.io/blog/carving-the-scheduler-
| out-of-our-orchestrator/
|
| Of course, Fly _also_ provides the underlying
| infrastructure stack. If you want to be pedantic, you
| can compare it to GKE/AKS/EKS.
|
| Kubernetes on any major cloud platform is more mature,
| controllable, and reliable than Fly.
| throwaway220033 wrote:
| I switched to Kamal and Hetzner. It's the sweet spot.
| rmbyrro wrote:
| I find it amazing how many bad vibes fly.io gets here.
|
| It looks worse than AWS or Azure to me.
|
| Never used the service, but based on what I hear, I'll never
| try it...
| m3kw9 wrote:
| Having GPUs is news now?
| Mikejames wrote:
| anyone know if this is PCI passthrough for a full A100? or
| some fancy clever vGPU thing?
| mrkurt wrote:
| Passthrough, yes.
| tptacek wrote:
| Do not get me started on the fancy vGPU stuff.
| mgliwka wrote:
| I'll bite :-) What are your experiences with that?
| tptacek wrote:
| Bad.
| dvrp wrote:
| too expensive
| ec109685 wrote:
| The recipe example, or any LLM use case, seems like a very
| poor way of highlighting "inference at the edge" given that
| the extra few hundred ms of round trip won't matter.
| manishsharan wrote:
| This. I cannot think of a business case for running LLMs on the
| edge. Is this a Pets.com moment for the AI industry?
| unraveller wrote:
| The better use case is obviously a voice assistant at the
| edge, as in voice-to-text to search/GPT to voice-generated
| response. That is where ms matter, but it is also a high-
| abuse angle no one wants to be associated with just yet. My
| guess is they are going to do this in another post, and if
| so they should make their own Perplexity-style online GPT.
| For now they just wanted to see what else people can think
| up by making the introduction of it boring.
| ec109685 wrote:
| There are three options for inference: 1) on-device
| inference, 2) inference "on the edge", 3) inference in a
| data center.
|
| Given fly is deployed in Equinix data centers just like
| everyone else, fundamentally there isn't much difference
| between #2 and #3.
| bugbuddy wrote:
| This is amazing and it shows that Nvidia should be the most
| valuable stock in the world. Every company, country, city, town,
| village, large enterprise, medium and small business, AI bro,
| Crypto bro, gamer bro, big tech, small tech, old tech, new tech,
| and startup wants Nvidia GPUs. Nvidia GPUs will become the new
| green oil of the 21st century. I am all in and nothing short of a
| margin call will change my mind.
| isoprophlex wrote:
| Almost half the price of Modal! Very nice!
| pgt wrote:
| I was an early adopter of Fly.io. It is not production-ready.
| They should fix their basic features before adding new ones.
| urduntupu wrote:
| Unfortunately true. We also jumped the fly.io ship after
| initial high excitement for their offering and moved back to
| DigitalOcean's App Platform. A bit more config effort,
| significantly pricier, but we need stability in production.
| Can't have my customers calling me b/c of service
| interruptions.
| throwaway220033 wrote:
| +1 - It's the most unreliable hosting service I've ever used
| in my life, with "nice looking" packaging. There were
| frequently multiple things broken at the same time; the
| status page would always be green while my meetings and
| weekends were ruined. Software can be broken, but Fly handles
| incidents with an unprofessional, immature attitude.
| Basically you pay 10x more money for an unreliable service
| that just looks "nice". I'm paying 4x less for much better
| hardware with Hetzner + Kamal; it works reliably, pricing is
| predictable, and I don't pay 25% more for the same usage next
| month.
|
| https://news.ycombinator.com/item?id=36808296
| ecmascript wrote:
| Comments like these are just sad to see on HN. It is not
| constructive. What are these basic features that need fixing
| you're speaking of, and what are the fixes required?
| cschmatzler wrote:
| Reliability and support. Having even "the entire node went
| down" tickets get an auto-response to "please go fuck off
| into the community forum" is insane. What is the community
| forum gonna do about your reliability issues? I can get a
| 4EUR/mo server at Hetzner and have actual people in the
| datacenter respond to my technical inquiries within minutes.
| DreamGen wrote:
| Great, more competition for the price-gouging platforms like
| Replicate and Modal is needed. As always with these, I would be
| curious about the cold-start time -- are you doing anything smart
| about being able to start (load models into VRAM) quickly? Most
| platforms that I tested are completely naive in their
| implementation, often downloading the docker image just-in-time
| instead of having it ready to be deployed on multiple machines.
| wslh wrote:
| Interesting. We have been discussing this kind of service
| (offloading training) over the last several days [1] [2] [3],
| thinking about the opportunity to compete with top cloud
| services such as Google Cloud, AWS, and Azure.
|
| [1] https://news.ycombinator.com/item?id=39353663
|
| [2] https://news.ycombinator.com/item?id=39329764
|
| [3] https://news.ycombinator.com/item?id=39263422
| unixhero wrote:
| I use the Fly.io free tier to run uptime monitoring with
| Uptime Kuma. It works insanely well, and I'm a really happy
| camper.
| rozenmd wrote:
| What do you use to let you know uptime kuma went down?
| unixhero wrote:
| It doesn't
| faust201 wrote:
| > The speed of light is only so fast
|
| This is the title of one of the sections. Why? I think the IT
| sector needs to stop using such titles.
| jimnotgym wrote:
| It is a bit of an odd thing that we still call GPUs GPUs when
| the main use for them seems to have little to do with graphics!
___________________________________________________________________
(page generated 2024-02-14 23:01 UTC)