[HN Gopher] Webflow Down for >31 Hours
___________________________________________________________________
Webflow Down for >31 Hours
Author : philip1209
Score : 91 points
Date : 2025-07-29 21:38 UTC (1 hour ago)
(HTM) web link (status.webflow.com)
(TXT) w3m dump (status.webflow.com)
| dangoodmanUT wrote:
| Hugs for their SREs sweating bullets rn
| wavemode wrote:
| CEO's statement:
| https://www.reddit.com/r/webflow/comments/1mcmxco/from_webfl...
| edoceo wrote:
| Interesting the phrase "I'm sorry" was in there. Almost feels
| like someone in the Big Chair taking a bit of responsibility.
| Cheers to that.
| progbits wrote:
| > 99.99%+ uptime is the standard we need to meet, and lately,
| we haven't.
|
| Four nines is not what I would be citing at this point. (That's
| less than an hour per year, so they burned that for the next three
| decades)
|
| Maybe aim for 99% first.
|
| Otherwise a pretty honest and solid response, kudos for that!
| Spivak wrote:
| I strive for one 9, thank you. No need to overcomplicate. We
| use Lambda on top of Glacier.
| jeeyoungk wrote:
| why go for 9's when you can go for 8s? you can aim for
| 88.8888888!
| hnlmorg wrote:
| Hit that and you also master time travel.
| hinkley wrote:
| There's an old rant I cannot find at the moment that
| argued that most systems that believe they are 5 9's are
| really more like 5 8's.
| theideaofcoffee wrote:
| Lots get starry-eyed and aim for five nines right out of the
| gate where they should have been targeting nine fives and
| learning from that. Walk before you run.
| zamadatix wrote:
| One could have nearly 3 such incidents per year and still
| have hit 99%.
|
| I always strive for 7 9s myself, just not necessarily
| consecutive digits.
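
The downtime arithmetic in this subthread can be checked with a short sketch. It assumes a 365-day year and takes the 31-hour figure from the headline as the outage length; the function names are made up for illustration and do not come from any Webflow tooling.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (an assumption)


def downtime_budget_hours(availability: float) -> float:
    """Allowed downtime per year, in hours, for an availability target such as 0.9999."""
    return (1.0 - availability) * HOURS_PER_YEAR


def years_of_budget_consumed(outage_hours: float, availability: float) -> float:
    """How many years of downtime budget a single outage uses up."""
    return outage_hours / downtime_budget_hours(availability)


def outages_per_year_within(outage_hours: float, availability: float) -> float:
    """How many outages of this length fit inside one year's budget."""
    return downtime_budget_hours(availability) / outage_hours


if __name__ == "__main__":
    outage = 31.0  # hours, from the thread title
    print(f"99.99% budget: {downtime_budget_hours(0.9999) * 60:.1f} min/year")
    print(f"Years of 99.99% budget used: {years_of_budget_consumed(outage, 0.9999):.1f}")
    print(f"31h outages per year at 99%: {outages_per_year_within(outage, 0.99):.2f}")
```

Under these assumptions, four nines allows a little under 53 minutes of downtime per year, so a 31-hour outage uses roughly 35 years of budget, and a 99% target leaves room for just under three such outages per year.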
| thih9 wrote:
| > Change controls are tighter, and we're investing in long-term
| performance improvements, especially in the CMS.
|
| This reads as if overall performance was an afterthought and
| this doesn't seem practical; it should be a business metric, it
| is important to the users after all.
|
| Then again, it's easy to comment like this in hindsight. We'll
| see what happens long term.
| stackskipton wrote:
| I mean, if customers don't leave them over this, higher ups
| likely won't care after dust settles.
| newZWhoDis wrote:
| As a former webflow customer I can assure you performance was
| always an afterthought.
| nusl wrote:
| My sympathy for those in the mud dealing with this. Never a fun
| place to be. Hope y'all figure it out and manage to de-stress :)
| pton_xd wrote:
| Claude, here is the bug, fix it. This is the new log output, fix
| the error. Fix the bug. Try a different approach. Reimplement the
| tests you modified. The bug is still happening, fix it. Fix the
| error.
|
| We're out of credits, create a new account. We've been API rate
| limited? When did that start happening? When are we going to get
| access again?
|
| Good luck engineers of the future!
| ed_mercer wrote:
| You forgot to add "think hard!" :)
| esafak wrote:
| And a subtle threat: "... or else".
| lgl wrote:
| Comment of the year 2025! Thanks for that :D
| zvmaz wrote:
| How do you know?
| troyvit wrote:
| More like "Good luck users of the future" that have to wade
| through failing infrastructure and tools that were vibe coded
| to begin with, rate limits notwithstanding.
| bravesoul2 wrote:
| Wow >31h I am surprised they couldn't rebuild their entire systems
| in parallel on new infra in that time. Can be hard if data loss
| is involved tho (a guess). Would love to see the post mortem so
| we all can learn.
| stackskipton wrote:
| I doubt it's infra failure but software failure. Their bad
| design has caught up and they can't toss more hardware for some
| reason. Most companies have this https://xkcd.com/2347/ in
| their stack and it's fallen over.
| sangeeth96 wrote:
| Hugs to the ones dealing with this and the users of Webflow who
| invested in them for their clientele. Hoping they'll release a
| full postmortem once the sky clears up.
| ChrisArchitect wrote:
| Incident link: https://status.webflow.com/incidents/0xg8xq3l0h0q
| stackskipton wrote:
| My SRE brain reading between the lines is they have been a feature
| factory and tech debt finally caught up to them.
|
| My guess is the reason they've been down so long is they don't have
| good rollback so they're attempting to fix forward with limited
| success.
| xyst wrote:
| I have no clue about the purpose of "webflow" based on its
| marketing/buzzword filled landing page, but seems it's just a "no
| code" abstraction on top of HTML/CSS?
|
| yet another SaaS that really does not need to be online 24/7. It
| could have been a simple app where you could "no code" on local
| machine and async state with webflow servers.
| dylan604 wrote:
| if you have a web based SaaS, everyone gets the updates. if you
| have a "simple app", then you are dependent on all of the users
| being up to date which you just cannot guarantee. also, what is
| a "simple app" that does not care about differences among
| various OSes found in the wild? how large of a team do you need
| for each of those OSes to support as wide of a user base as a
| web only app?
| mattbillenstein wrote:
| We're sorry https://www.youtube.com/watch?v=9u0EL_u4nvw
|
| Edit: an outage of this length smells of bad systems
| architecture...
| hinkley wrote:
| Prediction: Someone confidently broke something, then
| confidently 'fixed' it, with the consequence of breaking more
| things instead. And now they have either been pulled off of the
| cleanup work or they wish they had been.
| betaby wrote:
| I'm more surprised that WordPress-like platforms are profitable
| businesses in 2025.
| bogzz wrote:
| Why? Genuinely asking. Did you mean because there are free
| alternatives to self-host? I don't think that it would be so
| easy for someone in the market for a WYSIWYG blog builder to
| set everything up themselves.
| betaby wrote:
| Exactly. Because of the abundance of the one-click deploy
| WordPress offerings from value providers like OVH / Hetzner I
| would think margins are very low for WYSIWYG site builders.
| newZWhoDis wrote:
| We moved away from webflow because it was slow (got the
| nickname web-slow internally).
|
| Plus, despite marketing begging for the WYSIWYG interface they
| actually weren't creative enough to generate new content at a
| pace that required it.
|
| We massively increased conversion rates by going full native
| and having 1 Engineer churn out parts kits/kitbash LPs from
| said kits.
|
| Scale for reference: ~$10M/month
| wewewedxfgdf wrote:
| Companies get very good at handling disasters - after the
| disaster has happened.
| dylan604 wrote:
| The problem is they get good at that specific disaster. They
| can only plug a hole in the dike after the hole exists, then
| they look at the hole and make a plug the exact shape of that
| hole. The next hole starts the process over for it
| specifically. Each time. There's no generic plug that can be
| used each time. So sure, they get very good at making specific
| plugs. They never get to the point of making a better dike that
| doesn't spring so many leaks.
| wewewedxfgdf wrote:
| It is the job of the CTO to ensure the company has
| anticipated as many such situations as possible.
|
| It's not a very interesting thing to do however.
| dylan604 wrote:
| okay. and? the CTO isn't the last word in anything. if they
| are overruled to keep releasing new features, acquiring new
| users/clients, sales forward dev cycles, then the whole
| thing has potential to collapse under the weight of itself.
|
| It's actually the job of the CEO to keep all of the c-suite
| people doing jobs. Doesn't seem to stop the CEO salary
| explosions.
| wewewedxfgdf wrote:
| I think we are agreed.
|
| Companies, after a disaster, focus lots of effort on that
| particular disaster, leaving all the other potential
| disasters unplanned for.
|
| If you work at Webflow, you can anticipate LOTS of work
| in disaster recovery in the next 12 months. This has
| magically become a high priority for the CEO, who
| previously wanted features more than disaster recovery
| planning.
|
| They will wait to focus massive resources on their
| security until after they get hacked.
| willejs wrote:
| Hugops to the people working on this for the last 31+ hours.
| Running incidents of this significance is hard, draining and
| requires a lot of effort, this going on for so long must be very
| difficult for all involved.
| acedTrex wrote:
| An outage of this magnitude is almost ALWAYS the direct and
| immediate fault of senior leadership's priorities and focus.
| Pushing too hard in some areas, not listening to engineers on
| needed maintenance tasks etc.
| plutaniano wrote:
| Will the company survive long enough to produce a postmortem?
___________________________________________________________________
(page generated 2025-07-29 23:00 UTC)