[HN Gopher] Almost every infrastructure decision I endorse or regret
___________________________________________________________________
Almost every infrastructure decision I endorse or regret
Author : slyall
Score : 1007 points
Date : 2024-02-09 11:05 UTC (1 day ago)
(HTM) web link (cep.dev)
(TXT) w3m dump (cep.dev)
| shrubble wrote:
| "Since the database is used by everyone, it becomes cared for by
| no one. Startups don't have the luxury of a DBA, and everything
| owned by no one is owned by infrastructure eventually"
|
| I think adding a DBA or hiring one to help you lay out your
| database should not be considered a 'luxury'...
| winrid wrote:
| Yeah I mean, hiring one person to own that for 5-10 teams is
| pretty cheap... Cheaper than each team constantly solving the
| same problems and relearning the same gotchas/operational stuff
| that doesn't add much value when writing your application code.
| steveBK123 wrote:
| There are even consultants you can hire by the day instead of
| a full-time DBA.
|
| Maybe you need help with setup for a few weeks/months, and then
| some routine billable hours per month for maintenance / change
| advice.
| Scubabear68 wrote:
| The kitchen sink database used by everybody is such a common
| problem, yet it is repeated over and over again. If you grow,
| it becomes significant tech debt and a performance bottleneck.
|
| Fortunately, with managed DBs like RDS it is really easy to run
| individual DB clusters per major app.
| sgarland wrote:
| The downside is then you have many, many DBs to fight with, to
| monitor, to tune, etc.
|
| This is rarely a problem when things are small, but as they
| grow, the bad schema decisions made by empowering DBA-less
| teams to run their own infra become glaringly obvious.
| Scubabear68 wrote:
| Not a downside to me. Each team maintains their own DB and
| pays for their own choices.
|
| In the kitchen sink model all teams are tied together for
| performance and scalability, and some bad apple applications
| can ruin the party for everyone.
|
| Seen this countless times doing due diligence on startups.
| The universal kitchen sink DB is almost always one of the
| major tech debt items.
| sgarland wrote:
| I'm a DBRE, which means it's somehow always my fault until
| proven otherwise. And even then, it's usually on me to work
| around the insane schema dreamt up by the devs.
|
| Multi-tenant DBs can work fine as long as every app has its
| own users, everyone goes through a connection pooler / load
| balancer, and every user has rate limits. You want to write
| shitty queries that time out? Not my problem. Your GraphQL
| BFF bullshit is trying to make 10,000 QPS? Nope, sorry, try
| again later.
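|
| A minimal sketch of that setup in Postgres (role, schema, and
| limit values are invented for illustration): one login role
| per app, with a connection cap and a statement timeout, so one
| tenant's bad queries can't starve the shared instance.
|
|   -- one role per application, never a shared "app" user
|   CREATE ROLE app_checkout LOGIN CONNECTION LIMIT 50;
|   -- kill anything that runs longer than 2 seconds
|   ALTER ROLE app_checkout SET statement_timeout = '2s';
|   -- the app only sees its own schema
|   GRANT USAGE ON SCHEMA checkout TO app_checkout;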
|
| EDIT: I say "not my problem," but as mentioned, it
| inevitably becomes my problem. Because "just unblock them
| so the site is functional" is far more attractive to the
| C-Suite than "slow down velocity to ensure the dev teams
| are doing things right."
| CoolCold wrote:
| You forgot the modern mantra - dev team is always right!
| Scubabear68 wrote:
| I agree. My gripe was everybody in the same schema with a
| global "app" user.
| dalyons wrote:
| Or, you just avoid doing multi-tenant from the start and
| none of those become your problem to unblock. What's the
| downside?
| sgarland wrote:
| Done that as well; it still becomes my problem because
| teams without RDBMS knowledge eventually break it, and...
| then I get paged.
|
| Full Stack is a lie, and the sooner companies accept that
| and allow people to specialize again, and to pay for the
| extra headcount, the better off everyone will be.
| dalyons wrote:
| I disagree I guess. Multiple companies I've worked at
| have broken up their shared DB into many DBs that
| individual teams own the operations of, and it works just
| fine. At significant scale in traffic and # of eng. No
| central DBAs needed - smaller databases require much less
| skill to manage. The teams that own them learn enough.
| maccard wrote:
| > Not a downside to me. Each team maintains their own DB
| and pays for their own choices.
|
| This is how you end up with the infamous "Jira and
| Confluence have two different markdown flavors" issue.
| Sankozi wrote:
| I don't think Jira and Confluence's different markdown
| setups are due to them not sharing their databases. It is
| just poor product management from Atlassian.
| maccard wrote:
| My point is that forcing these arbitrary decisions is
| poor product management.
| vrosas wrote:
| Bad schema decisions are made regardless of whether you're
| one database or 50. At least with many databases the problems
| are localized.
| sgarland wrote:
| But then the DB Team - if you have one - is responsible for
| 50 databases, each full of their own unique problems.
|
| This will undoubtedly go over poorly, but honestly I think
| every data decision should be gated through the DB Team
| (again, if you have them). Your proposed schema isn't
| normalized? Straight to jail. You don't want to learn SQL?
| Also straight to jail. You want to use a UUIDv4 as a
| primary key? Believe it or not, jail.
|
| The most performant and referentially sound app in the
| world, because of jail.
| inquist wrote:
| What's wrong with uuidv4 as PK?
| marcosdumay wrote:
| Serial integers always work better than any uuid as PKs,
| but the thing with uuid4 is that it disrupts any kind of
| index or physical ordering you decide to put on your
| data.
|
| Uuids are really for external communication, not in-
| system organization.
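|
| A hedged sketch of that split (Postgres, names invented): a
| bigint identity PK for in-system joins and index locality,
| plus a UUID used only for external communication.
|
|   CREATE TABLE orders (
|       id         BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
|       public_id  UUID NOT NULL DEFAULT gen_random_uuid() UNIQUE,
|       created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|   );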
| dalyons wrote:
| FWIW this isn't true anymore with newer uuid schemes like
| v7 that are roughly time sortable.
| ildjarn wrote:
| A serial key forces a synchronisation point on every
| entity that can create records. If this is only ever a
| single database that's fine, but plenty of apps can't
| scale this way.
| marcosdumay wrote:
| They don't. Clustered databases deal with parallel
| generation of them just fine.
|
| They require periodic synchronization, which isn't a big
| deal at all and is required by many other database
| features.
| sgarland wrote:
| If you have a sharded DB, each instance can get its own
| range of ints, which are periodically refreshed.
|
| PlanetScale uses int PKs [0], and they seem to have
| scaled just fine.
|
| [0]: https://github.com/planetscale/discussion/discussions/366
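|
| A rough sketch of the range-per-shard idea (Postgres syntax,
| numbers invented; not how PlanetScale implements it): each
| shard draws from its own window, and the allocator hands out
| a fresh one when it runs dry.
|
|   -- shard 3 owns the window [3000000000, 3000999999]
|   CREATE SEQUENCE orders_id_seq
|       START WITH 3000000000
|       MINVALUE 3000000000
|       MAXVALUE 3000999999;
|
|   -- later, on exhaustion, the allocator re-points it:
|   ALTER SEQUENCE orders_id_seq
|       MINVALUE 7000000000 MAXVALUE 7000999999
|       RESTART WITH 7000000000;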
| sgarland wrote:
| Anything non-k-sortable in a B[+,-]tree will cause a ton
| of page splits. This is a more noticeable performance
| impact in RDBMS with a clustered index (MySQL's InnoDB,
| MS SQL Server) [0], but it also impacts Postgres [1] in
| multiple [2] ways.
|
| [0]: https://www.percona.com/blog/uuids-are-popular-but-bad-for-p...
|
| [1]: https://www.cybertec-postgresql.com/en/unexpected-downsides-...
|
| [2]: https://www.2ndquadrant.com/en/blog/on-the-impact-of-full-pa...
| Glyptodon wrote:
| What's the best non-serial option for PKs in your view?
| Or do you prefer a dual-PK approach?
| Sankozi wrote:
| No single team should be responsible for all databases.
| If such a team exists, they will either become a
| bottleneck for every other team (by carefully auditing
| each schema change), or become bloated and sit unused 90%
| of the time, or (most commonly) become nearly useless or
| even harmful: not really responsible, acting as a dumb
| proxy that adds latency to schema updates without
| bothering to check them very well (why would they? they
| are not responsible for the whole product, just for the
| database). Some DB refactorings/migrations will be
| abandoned entirely because the DB team makes them too
| painful.
|
| The DB team could act as an auditor and expert support,
| but they should never be fully responsible for the DB
| layer.
| sgarland wrote:
| > If such a team exists, they will either become a
| bottleneck for every other team (by carefully auditing
| each schema change)
|
| That's the point. Would you send a backend code review to
| a frontend team? Why do DBs not deserve domain expertise,
| especially when the entire company depends on them?
|
| > they are not responsible for the whole product, just
| for the database
|
| I assure you, that's a lot to be responsible for at
| scale.
|
| > The DB team could act as an auditor and expert support,
| but they should never be fully responsible for the DB layer.
|
| Again, the issue here is when the DB gets borked enough
| that an SME is required to fix it, they effectively do
| become responsible, because no CTO is going to accept,
| "sorry, we'll be down for a couple of days because our
| team doesn't really know how this thing works."
|
| And if your answer is, "AWS Premium Support," they'll
| just tell you to upsize the instance. Every time. That is
| not a long-term strategy.
| calvinmorrison wrote:
| It's because I hate databases and programming separately. I
| would rather write slow code than have to dig into some
| database procedure. It's just another level of separation
| that's too mentally hard to manage. It's like... my queries
| go into a VM and now I have to worry about how the VM is
| performing.
|
| I wish there were (and maybe there is) a programming
| language with first-class database support. I mean really
| first class: not just letting me run queries, but embedded
| into the language in a primal way, where I can deal with my
| database programming fanciness and my general development
| together.
|
| Sincerely someone who inherited a project from a DBA.
| sgarland wrote:
| > I mean really first class: not just letting me run
| queries, but embedded into the language
|
| Not quite embedded into the language, but Django is a damn
| good ORM. I say that as a DBRE, and someone obsessed with
| performance (inherent issues with interpreted languages
| aside).
| leetharris wrote:
| The closest thing to what you're describing is Prisma in
| Node. It generates a TypeScript file from your schema so
| you get code completion on your data. And it exists
| somewhere between a query builder and a traditional ORM.
|
| I have worked in many languages with many ORMs and this has
| been my personal favorite.
| sgarland wrote:
| Until Prisma can manage JOINs [0] there is no way I can
| recommend it.
|
| [0]: https://github.com/prisma/prisma/discussions/12715
| kkarimi wrote:
| The support for JOINs is coming, currently under a
| feature flag [0]
|
| [0]: https://github.com/prisma/prisma/issues/5184#issuecomment-18...
| mkesper wrote:
| But the migration stuff is a horrible joke. There is no way
| to just roll back a broken migration.
| https://www.prisma.io/docs/orm/prisma-migrate/workflows/gene...
| chasd00 wrote:
| The language you're talking about is APEX. I believe it
| comes from Oracle and is the backend language for
| Salesforce development. You'll like the first class
| database support but that's about it.
| el_benhameen wrote:
| Lots of interesting comments on this one. Anyone have any good
| resources for learning how not to fuck up schema/db design for
| those of us who will probably never have a DBA on the team?
| magicalhippo wrote:
| Good question. We don't have a DBA either. I've learned SQL
| as needed and while I'm not terrible, it's still daunting
| when making the schema for a new module that might require
| 10-20 tables or more.
|
| One thing that has worked well for us is to always include
| the top-most parent key in all child tables down the
| hierarchy. This way we can load all the data for, say, an
| order without joins/exists.
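|
| Roughly what that looks like (a sketch, table names invented,
| assuming an orders table with a bigint id): the order_id from
| the top of the hierarchy is repeated on the grandchild table,
| so one indexed predicate fetches everything for an order.
|
|   CREATE TABLE order_lines (
|       id       BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
|       order_id BIGINT NOT NULL REFERENCES orders (id)
|   );
|   CREATE TABLE order_line_notes (
|       id       BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
|       line_id  BIGINT NOT NULL REFERENCES order_lines (id),
|       -- the denormalized top-most parent key
|       order_id BIGINT NOT NULL REFERENCES orders (id)
|   );
|   CREATE INDEX ON order_line_notes (order_id);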
|
| Oh, and never use natural keys. Each time I thought I'd
| finally found a good use case, it has bitten me in some way.
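|
| The canonical example (a sketch, not from the comment above):
| an email address looks like a perfect natural key until a
| user changes it; a surrogate key sidesteps the cascade.
|
|   CREATE TABLE users (
|       id    BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
|       email TEXT NOT NULL UNIQUE  -- natural attribute, not the PK
|   );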
|
| Apart from that we just try to think about the required data
| access and the queries needed. Main thing is that all queries
| should go against indexes in our case, so we make sure the
| schema supports that easily. Requires some educated guesses
| at times but mostly it's predictable IME.
|
| Anyway would love to see a proper resource. We've made some
| mistakes but I'm sure there's more to learn.
| AznHisoka wrote:
| Not to pick on you, but is SQL not basic knowledge for
| every software engineer these days? Or have times changed?
| rswail wrote:
| Times have changed. If you have C# programmers and they
| can't do it in Entity Framework/LINQ, then they can't do
| it.
| neonsunset wrote:
| This seems like a stereotype from 2010s and disconnected
| from reality today.
| mordae wrote:
| Nope. None of my under-30 colleagues know SQL. They use an
| ORM in a REPL or visual tools.
| neonsunset wrote:
| LINQPad is awesome and EF Core is just _this_ good so I
| can see why some would just choose not to deal with SQL.
|
| With that said, this still sounds like a strange
| situation - most colleagues, acquaintances and people I
| consulted know their way around SQL and dropping down to
| 'dbset.FromSql($"SELECT {...' is very commonplace out of
| the need to use sprocs, views or have tighter control
| over the query.
| deskamess wrote:
| I had not updated LINQPad in a while and just saw the
| price this year. Eeesh. I now live in a .NET Interactive
| (Jupyter like) environment.
| magicalhippo wrote:
| Perhaps I undersold myself a little. By the time I got my
| first job I was fairly well versed in SQL querying, and
| these days I feel comfortable writing what I'd consider
| complex queries. That is with various window functions,
| nested queries, recursion (though I try to avoid that)
| etc, and I have a good handle on what the query optimizer
| likes and doesn't like.
|
| But schema design is something else. I still take my time
| doing that.
|
| Especially since our application is written with
| backwards compatibility in mind, so changing schema after
| it's deployed is something we try very hard to avoid.
|
| But yeah, when hiring we require they are comfortable
| writing "normal" SQL queries (multiple joins, aggregation
| etc).
| marcosdumay wrote:
| > not to fuck up schema/db design
|
| The neat thing is, you don't. Nobody ever avoids fucking up
| db design.
|
| The best you can do is decide what is really important to get
| right, and not fuck that part up.
| gregw2 wrote:
| Wow, what an astute comment! Thank you!
|
| P.S. to the original person concerned about this though...
| for your own sake and your successors, please keep trying.
| marcosdumay wrote:
| Assuming that was sarcastic, you are free to try, I guess
| everyone needs to try it once.
|
| Just do the exercise of deciding what is really important
| first, so you can make sure you succeed for that stuff.
| nitwit005 wrote:
| The moment you have two databases is the moment you need to
| deal with data consistency problems.
|
| If you can't do something like determine if you can delete
| data, as the article mentions, you won't be able to produce an
| answer to how to deal with those problems.
| eduction wrote:
| Management problem masquerading as a tech problem.
|
| Being shared between applications is literally what databases
| were invented to do. That's why you learn a special DSL to
| query and update them instead of just doing it in the same
| language as your application.
|
| The problem is that data is a shared resource. The database is
| where multiple groups in an organization come together to get
| something they all need. So it needs to be managed. It could be
| a dictator DBA or a set of rules designed in meetings and
| administered by ops, or whatever.
|
| But imagine it was money. Different divisions produce and
| consume money just like data. Would anyone imagine suggesting
| either every team has their own bank account or total
| unfettered access to the corporate treasury? Of course not. You
| would make a system. Everyone would at least mildly hate it.
| That's how databases should generally be managed once the
| company is any real size.
| dalyons wrote:
| Why would you make it a shared resource if you don't have to?
|
| Decades of experience have shown us the massive costs of
| doing so - the crippled velocity and soul-crushing agony of
| DBA change-control teams, the overhead salary of database
| priests, the arcane performance nightmares, the nuclear blast
| radius, the fundamental organizational counter-incentives of
| a shared resource.
|
| Why on earth would we choose to pay those terrible prices in
| this day and age, when infrastructure is code, managed
| databases are everywhere and every team can have their own
| thing. You didn't have a choice previously, now you do.
| eduction wrote:
| You wouldn't, but in any decent-sized organization you will
| have to. If it is an organization that needs to exist, there
| will be some common set of critical data.
| webo wrote:
| In my experience, isolated (repeated) data storage
| paradigm is even more common at large organizations. They
| share data via services, ETLs, event buses, etc.
| dalyons wrote:
| That's just not true though; I've worked at decent-sized
| companies without shared RDBMSes, so you don't have to.
|
| You DO have to share data in other ways, usually via a
| data warehouse or services, but that is not the same
| thing.
| eduction wrote:
| To me this is semantics. So it's a data warehouse rather
| than a database. Ok. Or we share data from a common
| source via "services" - ok but that's another word for a
| database and a client (using http to do the talking
| doesn't really change anything).
|
| I'm not saying literally every source of data has to be
| shared and centrally managed. I'm also not saying "rdbms
| accessed via traditional client and queried via sql" when
| I say database. I'm just saying a shared database of some
| shape is inevitable.
| dalyons wrote:
| Ok, but the OP and the article are talking specifically
| about a directly shared RDBMS scenario, not some nebulous
| concept of shared data.
|
| Also, operationally it's not "semantics" at all. You
| don't get into (many) operational problems with analysts
| sharing a data warehouse. You absolutely do with online
| apps sharing an RDBMS; they aren't the same thing.
| IggleSniggle wrote:
| ...I worked at a large software organization where larger
| teams had their own bank account, and there was a lot of
| internal billing, etc, mixed with plenty of funny-money to go
| along with it. That's not a contradiction, though, it
| perfectly illustrated your point for me.
| hayst4ck wrote:
| I would love to see this type of thing from multiple sources.
| This reflects a lot of my own experience.
|
| I think the format of this is great. I suppose it would take a
| motivated individual to go around and ask people to essentially
| fill out a form like this to get that.
| kaycebasques wrote:
| I also think it's a great format.
|
| One suggestion if we're gonna standardize around this format:
| avoid the double negatives. In some cases the author says
| "avoided XYZ" and then the judgment was "no regrets". Too many
| layers for me to parse there. Instead, I suggest making each
| section the product that was used. If you regret that product,
| the details are where you mention the product you should have
| used. Or you have another section for product ABC and you
| provide the context by saying "we adopted ABC after we
| abandoned XYZ".
|
| I don't recommend trying to categorize into general areas like
| logging, postmortems, etc. Just do a top-level section for each
| product.
| electroly wrote:
| > The markup cost of using RDS (or any managed database) is worth
| it.
|
| Every so often I price out RDS to replace our colocated SQL
| Server cluster and it's so unrealistically expensive that I just
| have to laugh. It's absurdly far beyond what I'd be willing to
| pay. The markup is enough to pay for the colocation rack, the AWS
| Direct Connects, the servers, the SAN, the SQL Server licenses,
| the maintenance contracts, _and a full-time in-house DBA_.
|
| https://calculator.aws/#/estimate?id=48b0bab00fe90c5e6de68d0...
|
| Total 12 months cost: 547,441.85 USD
|
| Once you get past the point where the markup can pay for one or
| more full-time employees, I think you should consider doing that
| instead of blindly paying more and more to scale RDS up. You're
| REALLY paying for it with RDS. At least re-evaluate the choices
| you made as a fledgling startup once you reach the scale where
| you're paying AWS "full time engineer" amounts of money.
| vasco wrote:
| That's a huge instance with an enterprise license on top. Most
| large SaaS companies can run off of $5k / m or cheaper RDS
| deployments, which isn't enough to pay someone. The number of
| people running half a million a year RDS bills might not be
| that large. For most people RDS is worth it as soon as you have
| backup requirements and would have to implement them yourself.
| electroly wrote:
| Definitely--I recommend this _after_ you've reached the
| point where you're writing huge checks to AWS. Maybe this is
| just assumed but I've never seen anyone else add that nuance
| to the "just use RDS" advice. It's always just "RDS is worth
| it" full stop, as in this article.
| Aeolun wrote:
| To some extent that is probably true, because when you've
| built a business that needs a 500k/year database fully on
| RDS it's already priced into your profits, and switching to
| a self-hosted database will seem unacceptably risky for
| something that works just fine.
| groestl wrote:
| > it's already priced into your profits
|
| Assuming you have any. You might not, because of AWS.
| sroussey wrote:
| I mean, just use Supabase instead. So much easier than RDS.
| Why even deal with AWS directly? Might as well have a colo
| if you need AWS.
| sgarland wrote:
| > Most large SaaS companies can run off of $5k / m or cheaper
| RDS
|
| Hard disagree. An r6i.12xl Multi-AZ with 7500 IOPS / 500 GiB
| io1 books at $10K/month on its own. Add a read replica, even
| Single-AZ at a smaller size, and you're half that again. And
| this is without the infra required to run a load balancer /
| connection pooler.
|
| I don't know what your definition of "large" is, but the
| setup described would be adequate at best at the ~100K QPS level.
|
| RDS is expensive as hell, because they know most people don't
| want to take the time to read docs and understand how to
| implement a solid backup strategy. That, and they've somehow
| convinced everyone that you don't have to tune RDS.
| rswail wrote:
| If you're not using GP3 storage that provides 12K minimum
| IOPS without requiring provisioned IOPS for >400GB storage,
| as well as 4 volume striping, then you're overpaying.
|
| If you don't have a reserved instance, then you're giving
| up potentially a 50% discount on on-demand pricing.
|
| An r6i.12xl is a huge instance.
|
| There are other equivalents in the range of instances
| available (and you can change them as required, with
| downtime).
| sgarland wrote:
| > GP3... as well as 4 volume striping
|
| For MySQL and Postgres, RDS stripes across four volumes
| once you hit 400 GiB. Doesn't matter the type.
|
| The latency variation on gp3 is abysmal [0], and the
| average [1] isn't great either. It's probably fine if you
| have low demands, or if your working set fits into memory
| and you can risk the performance hit when you get an
| uncached query.
|
| 12K IOPS sounds nice until you add latency into it. If
| you have 2 msec latency, then (ignoring various other
| overheads, and kernel or EBS command merging) the maximum
| a single thread can accomplish in one second is (1000
| msec / 2 msec per I/O) = 500 I/Os. Depending on your needs
| that may be fine, of course.
|
| > If you don't have a reserved instance, then you're
| giving up potentially a 50% discount on on-demand
| pricing.
|
| True, of course. Large customers also don't pay retail.
|
| > An r6i.12xl is a huge instance.
|
| I mean, it goes well past that to .32xl, so I wouldn't
| say it's huge. I work with DBs with 1 TiB of RAM, and I'm
| positive there are people here who think those are toys.
| The original comment I replied to said, "large SaaS," and
| a .12xl, as I said, would be roughly adequate for ~100K
| QPS, assuming no absurdly bad queries.
|
| [0]: https://www.percona.com/blog/performance-of-various-ebs-stor...
|
| [1]: https://silashansen.medium.com/looking-into-the-new-ebs-gp3-...
| dzikimarian wrote:
| >Most large SaaS companies can run off of $5k / m or cheaper
| RDS deployments which isn't enough to pay someone.
|
| After the initial setup, managing the equivalent of a $5k/m
| RDS deployment is not a full-time job. If you add to this
| that wages differ a lot around the world, $5k can take you
| very, very far in terms of paying someone.
| renewiltord wrote:
| You don't get the higher-end machines on AWS unless you're a
| big guy. We have EPYC 9684X on-prem. Cannot match that at the
| price on AWS. That's just about making the choices. Most
| companies are not DB-primary.
| sgarland wrote:
| I think most people who've never experienced native NVMe for
| a DB are also unaware of just how blindingly fast it is. Even
| io2 Block Express isn't the same.
| renewiltord wrote:
| Yes. We have it 4x striped on those same machines. Burns
| like lightning.
| sgarland wrote:
| The only problem is it hides all of the horrible queries.
| Ah well, can't have it all.
| Cacti wrote:
| I have one of those. It's so fast I don't even know what
| to do with it.
| icelancer wrote:
| Ha, I did just the same thing - and also optimized for an
| extremely fast per-thread CPU (which you never get from
| managed service providers).
|
| The query times are incredible.
| sroussey wrote:
| Most databases expressly say don't run storage over a
| network.
| amluto wrote:
| To be fair, most networked filesystems are nowhere near
| as good as EBS. That's one AWS service that takes real
| work to replicate on-prem.
|
| OTOH, as noted, EBS does not perform as well as native
| NVMe and is hilariously expensive if you try. And quite a
| few use cases are just fine on plain old NVMe.
| tpetry wrote:
| That's because EBS is a network block device and not a
| network filesystem - that would be EFS. And with network
| block devices you can get the same perf as EBS or better.
| ndriscoll wrote:
| Funny enough, the easiest way to experience this is
| probably to do some performance experimentation on the
| machine you code on. If it's a laptop made in the last few
| years, the performance you can get out of it knowing that
| it's sipping on a 45W power brick with probably not great
| cooling will make you very skeptical when people talk
| about "scale".
| steveBK123 wrote:
| RDS pricing is deranged at the scales I've seen too. $60k/year
| for something I could run on just a slice of one of my on-prem
| $20k servers. This is something we would have run 10s of.
| $600k/year operational against sub-$100k capital cost pays
| DBAs, backups, etc with money to spare.
|
| Sure, maybe if you are some sort of SaaS with a need for a
| small single DB, that also needs to be resilient, backed up,
| rock solid bulletproof.. it makes sense? But how many cases are
| there of this? If it's so fundamental to your product and needs
| such uptime & redundancy, what are the odds it's also reasonably
| small?
| macNchz wrote:
| > Sure, maybe if you are some sort of SaaS with a need for a
| small single DB, that also needs to be resilient, backed up,
| rock solid bulletproof.. it makes sense? But how many cases
| are there of this?
|
| Most software startups these days? The blog post is about
| work done at a startup after all. By the time your db is big
| enough to cost an unreasonable amount on RDS, you're likely a
| big enough team to have options. If you're a small startup,
| saving a couple hundred bucks a month by self-managing your
| database is rarely a good choice. There are more valuable
| things to work on.
| tw04 wrote:
| >By the time your db is big enough to cost an unreasonable
| amount on RDS, you're likely a big enough team to have
| options.
|
| By the time your db is big enough to cost an unreasonable
| amount on RDS, you've likely got so much momentum that
| getting off is nearly impossible as you bleed cash.
|
| You can buy a used server and find colocation space and
| still be pennies on the dollar for even the smallest
| database. If you're doing more than prototyping, you're
| probably wasting money.
| theptip wrote:
| That's just another way of saying the opportunity cost
| isn't worth paying to do the migration.
|
| Optionality and flexibility are extremely valuable, and
| that is why cloud compute continues to be popular,
| especially for rapidly/burstily growing businesses like
| startups.
| latch wrote:
| I don't mean to pick on your specific comments, but I
| find these analyses almost always lack a crucial
| perspective: level of knowledge. This is the single
| biggest factor, and it's the hardest one to be honest
| about. No one wants to say "RDS is a good choice . . .
| because I don't know how nor have I ever self managed a
| database."
|
| If you want a different opportunity cost, get people with
| different experience. If RDS is objectively expensive,
| objectively slow, but subjectively easy, change the
| subject.
| pcl wrote:
| _> No one wants to say "RDS is a good choice . . .
| because I don't know how nor have I ever self managed a
| database."_
|
| I don't think that's accurate. I've self-managed
| databases, and I still think that RDS is compelling for
| small engineering teams.
|
| There's a lot to get right when managing a database, and
| it's easy to screw something up. Perhaps none of the
| individual parts are super-complicated, but the cost of
| failure is high. Outsourcing that cost to AWS is pretty
| compelling.
|
| At a certain team size, you'll end up with a section of
| the team that's dedicated to these sorts of careful
| processes. But the first place these issues come up is
| with the database, and if you can put off that bit of
| organizational scaling until later, then that's a great
| path to choose.
| maccard wrote:
| I disagree here. This falls apart when you zoom out one
| step. I'm perfectly capable of managing a database. I'm
| also capable of maintaining load balancers, redis,
| container orchestrators, Jenkins, perforce, grafana,
| Loki, Oncall, individually. But each of those has the
| high chance of being a distraction from what our software
| actually does.
|
| It's about tradeoffs, and some tradeoffs are often more
| applicable than others - getting a ping at 7am on a
| Sunday because your EC2 instance filled its drive up
| with logs and your log rotation script failed because it
| didn't have a long enough retry is a problem I'm happy to
| outsource when I should be focusing on the actual app.
| graemep wrote:
| People do not really understand the value of the former.
| Even with financial options (buy/sell and underlying),
| which are a pure form of it, people either do not
| understand the value, or do so in a very abstract way they
| do not intuit.
| matwood wrote:
| Good point. And, since you brought up financials, you
| also see this when people use a majority of their savings
| to lump-sum pay off a mortgage. They take an overweighted
| view of saving on interest and, IMO, underweight the
| flexibility of liquidity.
| graemep wrote:
| On the other hand cloud platforms can be hard to migrate
| off, which is very much taking away options.
| macNchz wrote:
| In the small SaaS startup case, I'd say the production
| database is typically the most critical single piece of
| infra, so self hosting is just not a compelling
| proposition unless you have a strong technical reason
| where having super powerful database hardware is
| important, or a team with multiple people who have
| sysadmin or DBA experience. I think both of those cases
| are unusual.
|
| I've been the guy managing a critical self-hosted
| database in a small team, and it's such a distraction
| from focusing on the actual core product.
|
| To me, the cost of RDS covers tons of risks and time
| sinks: having to document the db server setup so I'm not
| the only one on the team who actually knows how to
| operate it, setting up monitoring, foolproof backups so I
| don't need to worry that they're silently failing because
| a volume is full and I misconfigured the monitoring, PITR
| for when someone ships a bad migration, one click HA so
| the database itself is very unlikely to wake me at 3am,
| blue/green deploys to make major version upgrades totally
| painless, never having to think about hardware failures
| or borked dist-upgrades, and so on.
|
| Each of those is ultimately either undifferentiated work
| to develop in-house RDS features that could have been
| better spent on product, or a risk of significant data
| loss, downtime, or firefighting. RDS looks like a pretty
| good deal, up to a point.
| remus wrote:
| I like fiddling with databases, but I totally agree with
| this. Unless you really need a big database and are going
| to save 100k+ per year by going self-managed, RDS or
| similar just saves you so much stress.
| it for the best part of 10 years and uptime and latency
| have consistently been excellent, and functionality is
| all rock solid. I never have to think about it, which is
| just what I want from something so core to the business.
| matwood wrote:
| I _am_ good at databases (have been a DBA in the past),
| and 100% agree with this. RDS is easy to stand up and get
| all the things you mentioned, and not have to think about
| again. If we grow to the point where the overhead is more
| than a FT DBA, awesome. It means we are successful, and
| are fortunate to have options.
| rnts08 wrote:
| Unfortunately there are so many people and teams who
| think that simply running their databases on RDS means
| that they're backed up, highly available and can be
| easily load balanced, upgraded, partitioned, migrated and
| so on, which is simply not the case with the basic
| configuration.
|
| RDS is a great choice for prototyping, and for production
| only if you know what you're doing when setting it up.
|
| FWIW, this is common in all cloud deployments; people
| assume that running something "serverless" is a magical
| silver bullet.
| macNchz wrote:
| Well...just using the defaults when creating an RDS
| Postgres in the console gives you an HA cluster with two
| read replicas, 7 days of backups restorable to any point
| in time, automatic minor version upgrades, and very easy
| major upgrades. So unless you start actively unchecking
| stuff those are not entirely invalid assumptions.
| optymizer wrote:
| I agree, but I also classify some of these as "learn them
| once and you're all set".
|
| Maybe it takes you a month the first time around and a
| week the 10th time around. First product suffers, the
| other products not so much. Now it just takes a week of
| your time and does not require you to pay large AWS fees,
| which means you are not bleeding money.
|
| I like to set up scrappy products that do not rack up
| large monthly fees. This means I can let them run
| unprofitable for longer and I don't have to seek an
| investor early, which would light up a large fire under
| everyone's butts and start influencing timelines because
| now they have the money and want a return ASAP.
|
| I'll launch a week later - no biggie usually. I could
| have come up with the idea a month later, so I'm still 3
| weeks early ;)
|
| It doesn't work for all projects, obviously, but I've
| seen plenty of SaaS start out with a shopping spree, then
| pay monthly fees and purchase licenses for stuff that
| they could have set up for free if they put some (usually
| not a lot) effort into it. When times get rough, the
| shorter runway becomes a hard fact of life. Maybe they
| wouldn't have needed a VC and could have bootstrapped and
| also survived for longer.
| macNchz wrote:
| Learning it all is what gave me an appreciation for RDS!
| I've self managed a number of Postgres and MySQL
| databases, including a 10TB Postgres cluster with all of
| the HA and backup niceties.
|
| While I generally agree as far as initial setup time
| goes, I favor RDS because I can forget about it, whereas
| the hand-rolled version demands ongoing maintenance, and
| incurs a nonzero chance of simple mistakes that, if made,
| could result in a 100% data-loss, unrecoverable scenario.
|
| I'm also mostly talking about typical, funded startups
| here, as opposed to indie/solo devs. If you're flying
| solo launching a tiny proof of concept that may only ever
| have a few users, by all means run it yourself if you'd
| like, but if you've raised money to grow faster and are
| paying employees to iterate rapidly searching for
| PMF...just pay for RDS and make sure as much time as
| possible is spent on product features that provide actual
| business value. It starts at like $15/month. The cost of
| simply not being laser-focused on product is far greater.
| crazygringo wrote:
| > _you've likely got so much momentum that getting off
| is nearly impossible as you bleed cash._
|
| Databases are not particularly difficult to migrate
| between machines. Of all the cloud services to migrate,
| they might actually be the easiest, since the databases
| don't have different APIs that need to be rewritten for,
| and database replication is a well-established thing.
|
| Getting off is quite the _opposite_ of nearly impossible.
| viraptor wrote:
| Lots of cases. It doesn't even have to be a tiny database.
| Within <1TB range there's a huge number of online companies
| that don't need to do more than hundreds of queries per
| second, but need the reliability and quick failover that RDS
| gives them. The $600k cost is absurd indeed, but it's not the
| range of what those companies spend.
|
| Also, Aurora gives you a block-level cluster that you can't
| deploy on your own - it's way easier to work with than the
| usual replication.
| steveBK123 wrote:
| Once you commit to the more deeply Amazon-flavored parts of AWS
| like Aurora, aren't you now fairly committed to hoping your
| scale never exceeds the cost-benefit tradeoff?
| viraptor wrote:
| Or you're realistic about what you're doing. Will you
| _ever_ need to scale more than 10x? And on the timescales
| where you do grow over 10x, would it be better to
| reconsider/re-architect everything anyway?
|
| I mean, I'm looking after a 4 instance Aurora cluster
| which is great feature wise, is slightly overprovisioned
| for special events, and is more likely to shrink than
| grow 2x in the next decade. If we start experiencing any
| issues, there's lots of optimisations that can be still
| gained from better caching and that work will be cheaper
| than the instance size upgrade.
| zmgsabst wrote:
| ...no?
|
| There's still a defined cost to swapping your DB code
| over to a different backend. At the point where it
| becomes uneconomical, you're also at a scale you can
| afford rewriting a module.
|
| That's why we have things like "hexagonal architecture",
| which focus on isolating the storage protocol from the
| code. There's an art to designing such that your
| prototype can scale with only minor rework -- but that's
| why we have senior engineers.
| callalex wrote:
| If you're paying list price at scale you are doing it
| very wrong.
| tw04 wrote:
| Sure, but if you're paying anywhere near list price for
| your on-prem hardware at scale you're also doing it
| wrong. I've never seen a scenario where Amazon discounts
| exceed what you would get from a hardware or software
| vendor at the same scale.
| osigurdson wrote:
| Interesting how cloud services are sold like used cars.
| rswail wrote:
| It's more interesting how cloud services are sold like
| any other consumables or corporate services.
|
| No one runs their own electricity supply (well until
| recently with renewables/storage), they buy it as a
| service, up to a pretty high scale before it becomes more
| economic to invest the capex and opex to run your own.
| nemothekid wrote:
| If my scale exceeds the cost benefit tradeoff, then I
| will thank God/Allah/Buddha/Spaghetti Monster.
|
| These questions always sound flawed to me. It's like
| asking won't I regret moving to California and paying
| high taxes once I start making millions of dollars?
| Maybe? But that's an amazing problem to have and one that
| I may be much better equipped to solve.
|
| If you are small, RDS is much cheaper, and many company-
| killing events, such as not testing your backups, are
| solved. If you are big and you can afford a 60K/yr RDS
| bill, then you can make changes to move on-prem. Or you
| can open up Excel and do the math if your margins are
| meaningfully affected by moving on-prem.
| pclmulqdq wrote:
| I assume that you do that math on all your new features
| too, right? The calculation of how much extra money they
| will bring in?
|
| On some level, AWS/GCP/California relies on you doing
| this calculation for the things that you can do it on
| easily (the savings of moving away), while not doing this
| calculation on things where it's hard to do (new
| development). That way, you can pretend that your new
| features are a lot more valuable than the $Xk/year you
| will save by moving your infra.
| nemothekid wrote:
| > _The calculation of how much extra money they will
| bring in?_
|
| Yes, I've done the math. The piece you are missing is,
| saving money on infra will bring in $0 new dollars. There
| is a floor to how much money I can save. There is no
| ceiling to how much money the right feature can bring in.
| Penny-pinching on infra, especially when the amount of
| money saved is less than the cost of an engineer, is
| almost always a waste of time while you are growing a
| company. If you are at the point where you are wasting
| 1x, 2x, 3x of an engineer's salary on superfluous
| infrastructure, then congratulations: you have survived
| the great filter for 99% of startups.
|
| > _That way, you can pretend that your new features are a
| lot more valuable than the $Xk /year you will save by
| moving your infra._
|
| Finding product market fit is 1000x harder than moving
| from RDS to On-prem. If you haven't solved PMF, then no
| amount of $Xk/year in savings will save you from having
| to shut down your company.
| pclmulqdq wrote:
| I am well aware of the math on that. Also, switching to
| faster infra can be a surprising benefit to your revenue,
| by the way, if it makes your app feel nicer.
|
| The thing is, most features, particularly later in the
| life of a company, don't have an easy-to-measure revenue
| impact, and I suspect that many features are actually
| worth $0 of revenue. However, they cost money to
| implement (both in engineering time and infra), making
| them very much net negative value propositions. This is
| why Facebook and Google can cut tons of staff and lose
| nothing off their revenue number.
|
| Also, there's a bit of a gambling mentality here which is
| that a feature could be worth effectively infinite
| revenue (i.e. it could be the thing that gives you PMF), so
| it's always worth doing over things with known, bounded
| impact on your bottom line. However, improving your
| efficiency gives you more cracks at finding good features
| before you run out of money.
| matwood wrote:
| Agree. "What if you're wildly successful and get huge?"
| Awesome, we'll solve the problem then. The other part is
| what if AWS was a part of becoming successful? I.e., it
| freed my small team from having to worry all that much
| about a database and instead focused on features.
| rswail wrote:
| Aurora supports standard Postgres clients.
|
| So moving to/from Aurora/RDS/own EC2/on-prem _should_ be
| a matter of networking and changing connection strings in
| the clients.
|
| Your operational requirements and processes
| (backup/restore, failover, DR etc) will change, but
| that's because you're making a deliberate decision
| weighing up those costs vs benefits.
| gregw2 wrote:
| Pro tip side note:
|
| You can use DNS to mitigate the pain of changing those
| connection strings, decoupling client change management
| from backend change process, or if you had foresight, not
| having to change client connection strings at all.
| DiggyJohnson wrote:
| The US DoD for sure.
| amluto wrote:
| I have a small MySQL database that's rather important, and
| RDS was a complete failure.
|
| It would have cost a negligible amount. But the sheer amount
| of time I wasted before I gave up was honestly quite
| surprising. Let's see:
|
| - I wanted one simple extension. I could have compromised on
| this, but getting it to work on RDS was a nonstarter.
|
| - I wanted RDS to _import the data_. Nope, RDS isn't "SUPER,"
| so it rejects a bunch of stuff that mysqldump emits (see the
| sketch after this list). Hacking around it with sed was not
| confidence-inspiring.
|
| - The database uses GTIDs and needed to maintain replication
| to a non-AWS system. RDS nominally supports GTID, but the
| documented way to enable it at import time strongly suggests
| that whoever wrote the docs doesn't actually understand the
| purpose of GTID, and it wasn't clear that RDS could do it
| right. At least Azure's docs suggested that I could have
| written code to target some strange APIs to program the thing
| correctly.
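|
| (A sketch of the usual offenders near the top of a dump -
| from memory, values invented, your dump may differ:
|
|   SET @@GLOBAL.GTID_PURGED='3e11fa47-71ca-11e1-9e33-c80aa9429562:1-5';
|   /*!50017 DEFINER=`root`@`localhost` */
|
| both of which a non-SUPER RDS user can't replay as-is.)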
|
| Time wasted: a surprising number of hours. I'd rather give
| someone a bit of money to manage the thing, but it's still on
| a combination of plain cloud servers and bare metal. Oh well.
| blantonl wrote:
| Replication to non-AWS systems, "simple" extension
| problems, importing data into RDS because of your custom
| stuff lurking in a mysqldump...
|
| Sounds like you are walking a massive edge case.
| ehnto wrote:
| > Sure, maybe if you are some sort of SaaS with a need for a
| small single DB, that also needs to be resilient, backed up,
| rock solid bulletproof.. it makes sense? But how many cases
| are there of this?
|
| Very small businesses with phone apps or web apps are often
| using it. There are cheaper options of course, but when there
| is no "prem" and there are 1-5 employees then it doesn't make
| much sense to hire for infra. You outsource all digital work
| to an agency who sets you up a cloud account so you have
| ownership, but they do all software dev and infra work.
|
| > If its so fundamental to your product and needs such uptime
| & redundancy, what are the odds its also reasonably small?
|
| Small businesses again, some of my clients could probably run
| off a Pentium 4 from 2008, but due to the nature of the org and
| agency engagement it often needs to live in the cloud
| somewhere.
|
| I am constantly beating the drum to reduce costs and use as
| little infra as needed though, so in a sense I agree, but the
| engagement is what it is.
|
| Additionally, everyone wants to believe they will need to
| hyperscale, so even medium-scale businesses over-provision,
| and some agencies are happy to do that for them as they
| profit off the margin.
| graemep wrote:
| A lot of my clients are small businesses in that range or
| bigger.
|
| AWS and the like are rarely a cost effective option, but it
| is something a lot of agencies like, largely because they
| are not paying the bills. The clients do not usually care
| because they are comfortable with a known brand and the
| costs are a small proportion of the overall costs.
|
| A real small business will be fine just using a VPS
| provider or a rented server. This solves the problem of not
| having on premise hardware. They can then run everything on
| a single server, which is a lot simpler to set up, and a
| lot simpler to secure. That means the cost of paying
| someone to run it is a lot lower too as they are needed
| only occasionally.
|
| They rarely need very resilient systems as the amount of
| money lost to downtime is relatively small - so even on AWS
| they are not going to be running in multiple availability
| zones etc.
| neeleshs wrote:
| Out of curiosity, who is your on-prem provider?
| kunley wrote:
| RDS is not as bulletproof as advertised, and the support is
| first arrogant then (maybe) helpful.
|
| People pay for RDS because they want to believe in a fairy
| tale that it will keep potential problems away and that it
| worked well for other customers. But those mythical other
| customers also paid based on such belief. Plus, no one wants
| to admit that they pay money in such irrational way. It's a
| bubble
| AtlasBarfed wrote:
| Plus, AWS outright lies to us about zero-downtime upgrades.
|
| Come time for a forced major upgrade shoved down our throat?
| Downtime, surprise, surprise.
| thelastparadise wrote:
| > $600k/year operational against sub-$100k capital cost pays
| DBAs, backups, etc with money to spare.
|
| One of these is not like the others (DBAs are not capex.)
|
| Have you ever considered that if a company can get the same
| result for the same price ($100K opex for RDS vs same for
| human DBA), it actually makes much more sense to go the route
| that takes the human out of the loop?
|
| The human shows up hungover, goes crazy, gropes Stacy from
| HR, etc.
|
| RDS just hums along without all the liabilities.
| AaronM wrote:
| Not only that, you can't just have one DBA. You need a team
| of them, otherwise that person is going to be on call 24/7,
| can never take a vacation, etc. You're probably looking at a
| minimum of 3.
| tpetry wrote:
| And when you have performance issues you still need a DBA.
| Because RDS only runs your database. It is up to you to
| make it fast.
| icedchai wrote:
| You'll need an engineer with database skills, not a
| dedicated DBA. I haven't seen a small company with a full
| time DBA in well over a decade. If you can learn a
| programming language, you can learn about indexes and
| basic tuning parameters (buffer pool, cache, etc.)
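|
| For a flavor of what that entails (a MySQL-flavored sketch;
| values and table names are invented for illustration):
|
|   -- size the buffer pool to most of available RAM
|   SET GLOBAL innodb_buffer_pool_size = 8 * 1024 * 1024 * 1024;
|   -- add indexes that match your actual query predicates
|   CREATE INDEX idx_orders_customer ON orders (customer_id);
|   -- verify the plan actually uses them
|   EXPLAIN SELECT * FROM orders WHERE customer_id = 42;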
| infecto wrote:
| The problem you have here is by the time you reach the size of
| this DB, you are on a special discount rate within AWS.
| jacurtis wrote:
| Discount rates are actually much better too on the bigger
| instances. Therefore the "sticker price" that people compare
| on the public site is nowhere close to a fair comparison.
|
| We technically aren't supposed to talk about pricing
| publicly, but I'm just going to say that we run a few 8XL
| and 12XL RDS instances and we pay ~40% off the sticker price.
|
| If you switch to the Aurora engine, the pricing is absurdly
| complex (it's basically impossible to determine without a
| simulation calculator), but AWS is even more aggressive with
| discounting on Aurora, not to mention there are some legit
| amazing feature benefits by switching.
|
| I'm still in agreement that you could do it cheaper yourself
| at a data center. But there are some serious tradeoffs made
| by doing it that way. One is complexity, and it certainly
| requires several new hiring decisions. Those have their own
| tangible costs, but there are a huge amount of intangible
| costs as well, like pure inconvenience, more people
| management, more hiring, split expertise, complexity to
| network systems, reduced elasticity of decisions, longer
| commitments, etc. It's harder to put a price on that.
|
| When you account for the discounts at this scale, I think the
| cost gap between the two solutions is much smaller and these
| inconveniences and complexities by rolling it yourself are
| sometimes worth bridging that smaller gap in cost in order to
| gain those efficiencies.
| jq-r wrote:
| > but I'm just going to say that we run a few 8XL and 12XL
| RDS instances and we pay ~40% off the sticker price.
|
| Genuinely curious, how do you do that?
|
| We pay a couple of million dollars per year and the biggest
| spend is RDS. The bulk of those are 8xl and 12xl as you
| mention and we have a lot of these. We do have savings
| plans, but those are nowhere near 40%.
| hardolaf wrote:
| Yeah 40% seems like a pipedream. I was at a Fortune 500
| defense firm and we couldn't get any cloud provider to
| even offer us anything close to that discount if we
| agreed to move to them for 3-4 years minimum. That org
| ended up not migrating because it was significantly
| cheaper to buy land and build datacenters from scratch
| than to rent in the cloud.
| overstay8930 wrote:
| There are basically no discounts in GovCloud.
| CubsFan1060 wrote:
| At least according to: https://instances.vantage.sh/rds/?selected=db.r6g.16xlarge,d...
|
| It looks like a reserved instance is 35% off sticker
| price? Add a discount on top and you'd probably be around
| 40% off.
| CubsFan1060 wrote:
| The new Aurora pricing model helps, and is honestly the
| only reason we're able to use it. It caps costs:
| https://aws.amazon.com/blogs/aws/new-amazon-aurora-i-o-optim...
| nyc_data_geek wrote:
| Some orgs are looking at moving back to on-prem because
| they're figuring this out. For a while it was in vogue to go
| from capex to opex costs, and C-suite people were incentivized
| to do that via comp structures, hence "digital transformation",
| i.e. migration to public cloud infrastructure. Now, those same
| orgs are realizing that renting computers actually costs more
| than owning them, when you're utilizing them to a significant
| degree.
|
| Just like any other asset.
| nextos wrote:
| Same experience here. As a small organization, the quotes we
| got from cloud providers have always been prohibitively
| expensive compared to running things locally, even when we
| accounted for geographical redundancy, generous labor costs,
| etc. Plus, we get to keep _know-how_ and avoid lock-in, which
| are extremely important things in the long term.
|
| Besides, running things locally can be refreshingly simple if
| you are just starting something and you don't need tons of
| extra stuff, which becomes accidental complexity between you,
| the problem, and a solution. This old post described that
| point quite well by comparing Unix to Taco Bell:
| http://widgetsandshit.com/teddziuba/2010/10/taco-bell-progra....
| See HN discussion:
| https://news.ycombinator.com/item?id=10829512.
|
| I am sure for some use-cases cloud services might be worth
| it, especially if you are a large organization and you get
| huge discounts. But I see lots of business types blindly
| advocating for clouds, without understanding costs and
| technical tradeoffs. Fortunately, the trend seems to be
| plateauing. I see an increasing demand for people with HPC,
| DB administration, and sysadmin skills.
| layoric wrote:
| > Plus, we get to keep know-how and avoid lock-in, which
| are extremely important things in the long term.
|
| So much this. The "keep know-how" part has been so widely
| neglected over the past 10 years. I hope people with these
| skills start getting paid more as more companies realize
| the cost difference.
| lanstin wrote:
| When I started working in the 1980s (as a teenager but
| getting paid) there was a sort of battle between the
| (genuinely cool and impressive) closed technology of IBM
| and the open world of open standards/interop like TCP/IP
| and Unix, SMTP, PCs, even Novell sort of, etc. There was
| a species of expert that knew the whole product offering
| of IBM, all the model numbers and recommended solution
| packages and so on. And the technology was good - I had
| an opportunity to program a 3093K(?) CM/VMS monster with
| APL and rexx and so on. Later on I had a job working with
| AS/400 and SNADS and token ring and all that, and it was
| interesting; thing is they couldn't keep up and the more
| open, less greedy, hobbyists and experts working on Linux
| and NFS and DNS etc. completely won the field. For
| decades, open source, open standards, and
| interoperability dominated and one could pick the best
| thing for each part of the technology stack, and be
| pretty sure that the resultant systems would be good. Now
| however, the Amazon cloud stacks are like IBM in the
| 1980s - amazingly high quality, but not open; the cloud
| architects master the arcane set of product offerings and
| can design a bespoke AWS "solution" to any problems. But
| where is the openness? Is this a pendulum that goes back
| and forth (and many IBM folks left IBM in the 1990s and
| built great open technologies on the internet) or was it
| a brief dawn of freedom that will be put down by the
| capital requirements of modern compute and networking
| stacks?
|
| My money is on openness continuing to grow and more and
| more pieces of the stack being completely owned by
| openness (kernels anyone?) but one doesn't know.
| nyc_data_geek wrote:
| Even without owning the infrastructure, running in the
| cloud without know-how is very dangerous.
|
| I hear tell of a shop that was running on ephemeral
| instance based compute fleets (EC2 spot instances, iirc),
| with all their prod data in-memory. Guess what happened
| to their data when spot instance availability cratered
| due to an unusual demand spike? No more data, no more
| shop.
|
| Don't even get me started on the number of privacy
| breaches because people don't know not to put customer
| information in public cloud storage buckets.
| hardolaf wrote:
| I was part of a relatively small org that wanted us to move
| to cloud dev machines. As soon as they saw the size of our
| existing development docker images that were 99.9% vendor
| tools in terms of disk space, they ran the numbers and told
| us that we were staying on-prem. I'm fairly sure just
| loading the dev images daily or weekly would be more
| expensive than just buying a server per employee.
| nicbou wrote:
| Is there a bit of risk involved since the know-how has a
| will of its own and sometimes gets sick?
|
| If I had a small business with very clever people I'd be
| very afraid of what happens if they're not available for a
| while.
| stingraycharles wrote:
| That's made possible because of all the orchestration
| platforms such as Kubernetes being standardized, and as such
| you can get pretty close to a cloud experience while having
| all your infrastructure on-premise.
| nyc_data_geek wrote:
| Yes, virtualization, overprovisioning and containerization
| have all played a role in allowing for efficient enough
| utilization of owned assets that the economics of cloud are
| perhaps no longer as attractive as they once were.
| oooyay wrote:
| Context: I build internal tools and platforms. Traffic on
| them varies, but some of them are quite active.
|
| My nasty little secret is for single server databases I have
| zero fear of over provisioning disk iops and running it on
| SQLite or making a single RDBMS server in a container. I've
| never actually run into an issue with this. It surprises me
| the number of internal tools I see that depend on large RDS
| installations that have piddly requirements.
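|
| For illustration, a minimal sketch of the SQLite variant in
| Python (the path and schema are made up, not anyone's actual
| setup):
|
|     import sqlite3
|
|     conn = sqlite3.connect("/var/lib/mytool/app.db")
|     conn.execute("PRAGMA journal_mode=WAL")  # concurrent readers
|     conn.execute("""CREATE TABLE IF NOT EXISTS events (
|         id INTEGER PRIMARY KEY,
|         payload TEXT NOT NULL,
|         created_at TEXT DEFAULT CURRENT_TIMESTAMP)""")
|     conn.execute("INSERT INTO events (payload) VALUES (?)",
|                  ('{"kind": "ping"}',))
|     conn.commit()
|
| For an internal tool with piddly requirements, this plus a
| cron'd file copy covers much of what a managed cluster would.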
| DeathArrow wrote:
| >making a single RDBMS server in a container
|
| On what disk is the actual data written? How do you do
| backups, if you do?
| BirAdam wrote:
| In most setups like this, it's going to be spinning rust
| with mdadm, and MySQL dumps that get created via cron and
| sent to another location.
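|
| As a sketch of that dump-and-ship pattern in Python (the
| offsite host and paths are hypothetical, and credentials are
| assumed to live in ~/.my.cnf rather than on the command line):
|
|     #!/usr/bin/env python3
|     # Nightly: dump MySQL, compress, ship the file offsite.
|     import gzip
|     import subprocess
|     from datetime import date
|
|     dump_path = f"/var/backups/mysql-{date.today()}.sql.gz"
|
|     dump = subprocess.run(
|         ["mysqldump", "--single-transaction", "--all-databases"],
|         check=True, capture_output=True)
|     with gzip.open(dump_path, "wb") as f:
|         f.write(dump.stdout)  # fine for small DBs; stream if big
|
|     subprocess.run(
|         ["rsync", "-a", dump_path, "backup@offsite:/backups/"],
|         check=True)
|
| Run from cron and paired with an occasional restore test, this
| is the entire backup story at plenty of small shops.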
| dvfjsdhgfv wrote:
| The problem with a single instance is that while performance-
| wise it's best (at least on bare metal), there comes a moment
| when you simply have too much data for one machine to handle.
| In your scenario it may never come up, but many organizations
| face this problem sooner or later.
| oooyay wrote:
| I agree, my point is that clusters are overused. Most
| applications simply don't need them and it results in a
| lot of waste. _Much_ of this has to do with engineers
| being tasked with an assortment of roles these days, so
| they obviously opt for the solution where a database and
| upgrades are managed for them. I've just found that
| managing a single container's upgrades isn't that big of
| an issue.
| chii wrote:
| i would imagine that cloud infrastructure has the ability for
| fast scale up, unlike self-owned infrastructure.
|
| For example, how long does it take to rent another rack that
| you didn't plan for?
|
| Not to mention that the cloud management platforms that you
| have to deploy to manage these owned assets are not free.
|
| I mean, how come even large consumers of electricity do not
| buy and own their own infrastructure to generate it?
| pinkgolem wrote:
| >I mean, how come even large consumers of electricity do
| not buy and own their own infrastructure to generate it?
|
| They sure do? BASF has 3 power plants in Hamburg, Disney
| operate Reedy Creek Energy with at least 1 power plant and
| I could list a fair bit more...
|
| >For example, how long does it take to rent another rack
| that you didn't plan for?
|
| I mean, you can also rent hardware a lot cheaper than on
| AWS. There certainly are providers where you can rent out a
| rack for a month within minutes.
| sseagull wrote:
| Some universities also have their own power plants. It's
| also becoming more common to at least supplement power on
| campus with solar arrays.
| tpetry wrote:
| Ordering that number of servers takes about one hour with
| hetzner. If you truly want a complete rack of your own,
| maybe a few days, as they have to do it manually.
|
| Most companies don't need to scale up full racks in
| seconds. Heck, even weeks would be ok for most of them to
| get new hardware delivered. The cloud planted the lie in
| everyone's head that most companies don't have predictable
| and stable load.
| rajamaka wrote:
| What would be the cost/time of scaling down a rack on
| Hetzner?
| pinkgolem wrote:
| The rental period is a month. You can also use Hetzner
| Cloud, which is still roughly 10x less expensive than AWS,
| and that does not take into account the vastly cheaper
| traffic.
| hardolaf wrote:
| Most businesses could probably forecast their server needs
| 6-12 months out. There's a small number of businesses in the
| world that actually need dynamic scaling.
| gorm wrote:
| One other appealing alternative for smaller startups is to
| run Docker on one burstable vm. This is a simple setup and
| allows you to go beyond the cpu limits and also scale up
| the vm.
|
| There might be alternatives to using Docker, so if anyone
| has tips for something simpler or easier to maintain, I'd
| appreciate a comment.
| pinkgolem wrote:
| Keep in mind, there is an in-between..
|
| I would have a hard time running servers as cheaply as
| hetzner does, for example, once you include the routing and
| everything.
| jwr wrote:
| I do that. In fact I've been doing it for years, because
| every time I do the math, AWS is unreasonably expensive and
| my solo-founder SaaS would much rather keep the extra
| money.
|
| I think there is an unreasonable fear of "doing the routing
| and everything". I run vpncloud, my server clusters are
| managed using ansible, and can be set up from either a list
| of static IPs or from a terraform-prepared configuration.
| The same code can be used to set up a cluster on bare-metal
| hetzner servers or on cloud VMs from DigitalOcean (for
| example).
|
| I regularly compare this to AWS costs and it's not even
| close. Don't forget that the performance of those bare-
| metal machines is _way_ higher than of overbooked VMs.
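|
| The static-IP path can be as simple as rendering a list of
| addresses into an inventory file and pointing ansible at it; a
| sketch in Python with made-up IPs:
|
|     # Render a static IP list into an Ansible inventory file.
|     CLUSTER_IPS = ["203.0.113.10", "203.0.113.11",
|                    "203.0.113.12"]
|
|     with open("inventory.ini", "w") as f:
|         f.write("[dbcluster]\n")
|         for i, ip in enumerate(CLUSTER_IPS):
|             f.write(f"node{i} ansible_host={ip} "
|                     f"ansible_user=root\n")
|
| After that, `ansible-playbook -i inventory.ini site.yml` works
| the same whether the IPs came from a text file or from
| terraform output.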
| pinkgolem wrote:
| I was more talking about the physical backbone connection,
| which hetzner does for you.
|
| We are using hetzner cloud.. but we are also scaling up
| and down a lot right now
| swores wrote:
| Could you please explain what you mean by "physical
| backbone connection", as I can't think of a meaning that
| fits the context.
|
| If you mean dealing with the physical dedicated servers
| that can be rented from Hetzner, that's what the person
| you replied to was talking about being not so difficult.
|
| If you mean everything else at the data centre that makes
| having a server there worthwhile (networking, power,
| cooling, etc.) I don't think people were suggesting doing
| that themselves (unless you're a big enough company to
| actually be in the data centre business), but were
| talking about having direct control of physical servers
| in a data centre managed by someone like Hetzner.
|
| (edit: and oops sorry I just realised I accidentally
| downvoted your comment instead of up, undone and
| rectified now)
| pinkgolem wrote:
| With "routing" I meant the backbone connection, which is
| included in the hetzner price.
|
| Aka if I add up power (including backup) + backbone
| connection rental + server depreciation, I cannot do it
| for the hetzner price..
|
| That was quite imprecise, sorry about that.
| DeathArrow wrote:
| I think no one talked about having physical servers on
| their own premises, but rather about colocating or renting
| servers in a data center.
| swores wrote:
| No worries, easy to not foresee every possible way in
| which strangers could interpret a comment!
|
| But I think that people (at least jwr, and probably even
| nyc_data_geek saying "on prem") are talking about cloud
| (like AWS) vs. renting (or buying) servers that live in a
| data centre run by a company like Hetzner, which can be
| considered "on prem" if you're the kind of data centre
| client who has building access to send your own staff
| there to manage your servers (while still leaving
| everything else, possibly even legal ownership and
| therefore deprecation etc. to the data centre owner).
|
| What you're thinking of - literally taking responsibility
| for running your own mini data centre - I think is hardly
| ever considered (at least in my experience), except by
| companies at the extremes of size. If you're as big as
| Facebook (not sure where the line is but obviously
| including some companies not AS big as Meta but still
| huge) then it makes sense to run your own data centres.
| If you're a tiny business getting less than thousands of
| website visits a day and where the website (or whatever
| is being hosted) isn't so important that a day of
| downtime every now and then is a big deal, then it's
| not uncommon to host from the company's office itself
| (just using a spare old PC or second hand cheap 1U
| server, maybe a cheap UPS, and just connected to the main
| internet connection that people in the office use, and
| probably managed by a single employee, or company owner,
| who happens to be geeky enough to think it's one or both
| of simple or fun to set up a basic LAMP server, or even a
| Windows server for its oh-so-lovely GUI).
| fgonzag wrote:
| You usually just do colocation. The data center will give
| you a rack (or space for one), an upstream gateway to
| your ISP, and redundant power. You still have to manage a
| firewall and your internal network equipment, but it's not
| really that bad. I've used PFsense firewalls, configured
| by them for like $1500, with roaming vpn, high
| availability, point to point vpn, and as secure as
| reasonably possible. After that it's the same thing as
| the cloud except it's physical servers.
| pinkgolem wrote:
| I mean, yes.. but you pay for that, and colocation +
| server depreciation in the case I calculated was higher
| than just renting the servers.
| DeathArrow wrote:
| 100% agree. People still think that maintaining
| infrastructure is very hard and requires a lot of people.
| What they disregard is that using cloud infrastructure
| also requires people.
| tormeh wrote:
| When talking about Hetzner pricing, please don't change
| the subject to AWS pricing. The two have nothing in
| common, and intuition derived from one does not transfer
| to the other.
| dvfjsdhgfv wrote:
| > please don't change the subject to AWS pricing
|
| Why? The only reason I'm using Hetzner and not AWS for
| several of my own projects (even though I know AWS much
| better since this is what I use at work) is an enormous
| price difference in each aspect (compute, storage,
| traffic).
| KronisLV wrote:
| > The two have nothing in common
|
| If all you need are some cloud servers, or a basic load
| balancer, they _are_ pretty much the same.
|
| If you need a plethora of managed services and don't want
| to risk getting fired over your choice or specifics of
| how that service is actually rendered, they are nothing
| alike and you should go for AWS, or one of the other
| large alternatives (GCP, Azure etc.).
|
| On the flip side, if you are using AWS or one of those
| large platforms as a glorified VPS host and you aren't
| doing this in an enterprise environment, outside of
| learning scenarios, you are probably doing something
| wrong and you should look at Hetzner, Contabo, or one of
| those other providers, though some can still be a bit
| pricey - DigitalOcean, Vultr, Scaleway etc.
| jumploops wrote:
| Funny story time.
|
| I was once part of an acquisition from a much larger
| corporate entity. The new parent company was in the middle of
| a huge cloud migration, and as part of our integration into
| their org, we were required to migrate our services to the
| cloud.
|
| Our calculations said it would cost 3x as much to run our
| infra on the cloud.
|
| We pushed back, and were greenlit on creating a hybrid
| architecture that allowed us to launch machines both on-prem
| and in the cloud (via a direct link to the cloud datacenter).
| This gave us the benefit of autoscaling our volatile
| services, while maintaining our predictable services on the
| cheap.
|
| After I left, apparently my former team was strong-armed into
| migrating everything to the cloud.
|
| A few years go by, and guess who reaches out on LinkedIn?
|
| The parent org was curious how we built the hybrid infra, and
| wanted us to come back to do it again.
|
| I didn't go back.
| smitty1e wrote:
| My funny story is built on the idea that AWS is Hotel
| California for your data.
|
| A customer had an interest in merging the data from an
| older account into a new one, just to simplify matters.
| Enterprise data. Going back years. Not even leaving the
| region.
|
| The AWS rep in the meeting kinda pauses, says: "We'll get
| back to you on the cost to do that."
|
| The sticker shock was enough that the customer simply
| inherited the old account, rather than making things tidy.
| hhsectech wrote:
| Eh? I've never had a problem moving data out of AWS.
|
| Have people lost the ability to write export and backup
| scripts?
| Draiken wrote:
| The ingress/egress cost is ridiculously high. Some
| companies don't care, but it is there and I've seen it
| catch people off guard multiple times.
| varjag wrote:
| Oh come on, from the description both accounts could be
| sitting on the same datacenter LAN.
| LadyCailin wrote:
| It's the cost of data egress, which isn't free.
| mciancia wrote:
| But there is no paid egress when we are moving data
| between accounts within one region, right?
| storyinmemo wrote:
| There is. You pay a price for any cross-VPC traffic.
| CubsFan1060 wrote:
| This isn't true, at least not anymore.
|
| You can peer two vpc's and as long as you are
| transferring within the same (real) AZ, it's free:
| https://aws.amazon.com/about-aws/whats-
| new/2021/05/amazon-vp...
|
| Even peered VPC's only pay "normal" prices:
| https://aws.amazon.com/ec2/pricing/on-
| demand/#Data_Transfer
|
| "Data transferred "in" to and "out" from Amazon EC2,
| Amazon RDS, Amazon Redshift, Amazon DynamoDB Accelerator
| (DAX), and Amazon ElastiCache instances, Elastic Network
| Interfaces or VPC Peering connections across Availability
| Zones in the same AWS Region is charged at $0.01/GB in
| each direction."
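|
| At that quoted rate the back-of-envelope math is simple; a
| quick Python check with a hypothetical dataset size:
|
|     # $0.01/GB in each direction for cross-AZ traffic, per
|     # the pricing page quoted above.
|     tb = 50                  # example dataset size
|     gb = tb * 1024
|     cost = gb * 0.01 * 2     # charged on send and receive
|     print(f"{tb} TB cross-AZ: ${cost:,.0f}")  # -> $1,024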
| interroboink wrote:
| My (peripheral) experience is that it is much cheaper to
| get data in than to get data out. When you have the
| amount of data being discussed -- "Enterprise data. Going
| back years." -- that can get very costly.
|
| It's the amount of data where it makes more sense to put
| hard drives on a truck and drive across the country
| rather than send it over a network, where this becomes an
| issue (actually, probably a bit _before_ then).
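|
| The arithmetic behind that rule of thumb is worth writing out
| once (link speed and data size here are just examples):
|
|     # Time to push 1 PB through a dedicated 1 Gbps link.
|     petabytes = 1
|     bits = petabytes * 8 * 10**15
|     seconds = bits / 1e9                  # 1 Gbps
|     print(f"{seconds / 86400:.0f} days")  # ~93 days
|
| A truck full of drives moves the same data in days, which is
| why the crossover point arrives earlier than intuition
| suggests.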
| fcarraldo wrote:
| AWS actually has a service for this - Snowmobile, a
| storage datacenter inside of a shipping container, which
| is driven to you on a semi truck.
| https://aws.amazon.com/snowmobile/
| xmcqdpt2 wrote:
| They do not!
|
| > Q: Can I export data from AWS with Snowmobile?
|
| > Snowmobile does not support data export. It is designed
| to let you quickly, easily, and more securely migrate
| exabytes of data to AWS. When you need to export data
| from AWS, you can use AWS Snowball Edge to quickly export
| up to 100TB per appliance and run multiple export jobs in
| parallel as necessary. Visit the Snowball Edge FAQs to
| learn more.
|
| https://aws.amazon.com/snowmobile/faqs/?nc2=h_mo-lang
|
| Why would they make it convenient to leave?
| fcarraldo wrote:
| Oh, TIL! Thanks for correcting me.
| brickteacup wrote:
| That's only for data into AWS though, not data out
| Shorel wrote:
| Just in network costs, there's a huge asymmetry.
| Uploading data to AWS is free. Downloading data from
| them, you have to pay.
|
| When you have enough data, that cost is quite
| significant.
| mijoharas wrote:
| There's a cost for data egress (but not ingress)
| banku_brougham wrote:
| Is R2 a sensible option for hosting data? I understand
| egress is cheap.
| stickfigure wrote:
| R2 is great. Our GCS bill (almost all egress) jumped from
| a few hundred dollars a month to a couple thousand
| dollars a month last year due to a usage spike. We rush-
| migrated to R2 and now that part of the bill is $0.
|
| I've heard some people here on HN say that it's slow, but
| I haven't noticed a difference. We're mainly dealing with
| multi-megabyte image files, so YMMV if you have a
| different workload.
| hhsectech wrote:
| There are two possible scenarios here. Firstly, they can't
| find the talent to support what you implemented...or more
| likely, your docs suck!
|
| I've made a career out of inheriting other people's wacky
| setups and supporting them (as well as fixing them), and
| almost always it's documentation that has prevented the
| client getting anywhere.
|
| I personally don't care if the docs are crap because usually
| the first thing I do is update / actually write the docs to
| make them usable.
|
| For a lot of techs though crap documentation is a deal
| breaker.
|
| Crap docs aren't always the fault of the guys implementing
| though, sometimes there are time constraints that prevent
| proper docs being written. Quite frequently though it's
| outsourced development agencies that refuse to write it
| because it's "out of scope" and a "billable extra". Which I
| think is an egregious stance...docs should be part and
| parcel of the project. Mandatory.
| smokel wrote:
| I agree that bad documentation is a serious problem in
| many cases. So much so that your suggestion to write the
| documentation after the fact can become quite impossible.
|
| If there is only one thing that juniors should learn
| about writing documentation (be it comments or design
| documents), it is this: document _why_ something is
| there. If resources are limited, you can safely skip
| comments that describe _how_ something works, because
| that information is also available in code.
|
| (It might help to describe _what_ is available,
| especially if code is spread out over multiple
| repositories, libraries, teams, etc.)
|
| (Also, I suppose the comment I'm responding to could've
| been slightly more forgiving to GP, but that's another
| story.)
| lazyasciiart wrote:
| Unfortunately it's also possible that e.g the company
| switched from SharePoint to Confluence and lost half the
| entire knowledge base because it wasn't labeled the way
| they thought it was. Or that the docs were all purged
| because they were part of an abandoned project.
| adrianmsmith wrote:
| > Quite frequently though its outsourced development
| agencies that refuse to write it
|
| It's also completely against their interest to write docs
| as it makes their replacement easier.
|
| That's why you need someone competent on the buying side
| to insist on the docs.
|
| A lot of companies outsource because they _don't_ have
| this competency themselves. So it's inevitable that this
| sort of thing happens and companies get locked in and
| can't replace their contractors, because they don't have
| any docs.
| thelastparadise wrote:
| > the first thing I do is update / actually write the
| docs to make them usable.
|
| OK so the docs are in sync for a single point of time
| when you finish. Plus you get to have the context in your
| head (bus factor of 1, job security for you, bad for the
| org.)
|
| How about if we just write clean infra configs/code,
| stick to well known systems like docker, ansible, k8s,
| etc.
|
| Then we can make this infra code available to an on prem
| LLM and ask it questions as needed without it drifting
| out of sync over time as your docs surely will.
|
| Wrong documentation is worse than no documentation.
| maxrecursion wrote:
| "Crap docs aren't always the fault of the guys
| implementing though, sometimes there are time constraints
| that prevent proper docs being written."
|
| I can always guarantee a stream-of-consciousness OneNote
| that should have most of the important data, and a few
| docs about the most important parts. It's up to
| management if they want me to spend time turning that
| OneNote into actual robust documentation that is easily
| read.
| ZoomerCretin wrote:
| Documentation? What for? It's self-documenting (to me,
| because I wrote it)!
| nyc_data_geek wrote:
| Yes, I do believe autoscaling is actually a good use case
| for public cloud. If you have bursty load that requires a
| lot of resources at peak which would sit idle most of the
| time, probably doesn't make sense to own what you need for
| those peaks.
| throwawaaarrgh wrote:
| It's not an either/or. Many businesses both own and rent
| things.
|
| If price is the only factor, your business model (or
| executives' decision-making) is questionable. Buy only the
| cheapest shit, spend your time building your own office chair
| rather than talking to a customer, you aren't making a
| premium product, and that means you're not differentiated.
| Scubabear68 wrote:
| Elsewhere today I recommended RDS, but was thinking of small
| startup cases that may lack infrastructure chops.
|
| But you are totally right it can be expensive. I worked with a
| startup that had some inefficient queries; normally it
| wouldn't matter, but with RDS it cost $3,000 a month for a
| tiny user base and not that much data (millions of rows at
| most).
| rswail wrote:
| That sounds like the app needs some serious surgery.
| j16sdiz wrote:
| In another section, they mentioned they don't have a DBA, no
| app team owns the database, and the infra team is overwhelmed.
|
| RDS makes perfect sense for them.
| osigurdson wrote:
| Cloud was supposed to be a commodity. Instead it is priced
| like a burger at the ski hill.
| Sparkyte wrote:
| Data isn't cheap, never was. Paying the licensing fees on top
| makes it more expensive. It really depends on the
| circumstance: a managed database usually has extended support
| from the company providing it. You have to weigh a team's
| expertise to manage a solution on your own and ensure you
| spend ample time making it resilient. The other half is the
| cost of upgrading hardware; sometimes it is better to just
| outright pay a cloud provider if your business does not have
| enough income to outright buy hardware. There is always an
| upfront cost.
|
| For small databases or test environment databases you can
| also leverage Kubernetes to host an operator for that tiny
| DB. When it comes to serious data that needs a beeline
| recovery strategy: RDS.
|
| Really it should be a mix: self-hosted for things you aren't
| afraid to break, hosted for the things you put at high risk.
| silisili wrote:
| Even for small workloads it's a difficult choice. I ran a small
| but vital db, and RDS was costing us like 60 bucks a month per
| env. That's 240/month/app.
|
| DynamoDB as a replacement, pay per request, was essentially
| free.
|
| I found Dynamo foreign and rather ugly to code for initially,
| but am happy with the performance and especially price at the
| end.
| dfgdfg34545456 wrote:
| For big companies such as banks, this cost comparison is not
| as straightforward. They have whole data centres just sitting
| there for disaster recovery. They periodically do switchovers
| to test DR. All of this expense goes away when they migrate to
| cloud.
| nightfly wrote:
| > All of this expense goes away when they migrate to cloud.
|
| Just to pay someone else enough money to provide the same
| service and make a profit while doing it.
| dfgdfg34545456 wrote:
| Well corporations pay printers to do their printing because
| they don't want to be in the business of printing. It's the
| same with infrastructure, a lot of corporations simply
| don't want to be in the data centre business.
| jabradoodle wrote:
| That's how nearly every aspect of every business works;
| would you you start a bakery by learning construction and
| building it yourself?
| AtlasBarfed wrote:
| Construction is a one-time cost. IT infrastructure is in
| constant use.
|
| It's like accounting and finance. Yeah a lot of companies
| use tax firms, but they all have finance and accounting
| in-house.
| graemep wrote:
| > All of this expense goes away when they migrate to cloud.
|
| They need to replicate everything in multiple availability
| zones, which is going to be more expensive than replicating
| data centres.
|
| They still need to test that their cloud infrastructure works.
| AtNightWeCode wrote:
| In your case it sounds more viable to move to VMs instead of
| RDS, which some cloud providers also recommend.
| prisenco wrote:
| From what I've read, a common model for mmorpg companies is to
| use on-prem or colocated as their primary and then provision a
| cloud service for backup or overage.
|
| Seems like a solid cost effective approach for when a company
| reaches a certain scale.
| hardolaf wrote:
| Lots of companies, like Grinding Gear Games and Square Enix,
| just rent whole servers for a tiny fraction of the price
| compared to what the price gouging cloud providers would
| charge for the same resources. They get the best of both
| worlds. They can scale up their infrastructure in hours or
| even minutes and they can move to any other commodity
| hardware in any other datacenter at the drop of a hat if they
| get screwed on pricing. Migrating from one server provider
| (such as IBM) to another (such as Hetzner) can take an
| experienced team 1-2 weeks at most. Given that pricing
| updates are usually given 1-3 quarters ahead at a minimum,
| they have massive leverage over their providers because they
| can so easily switch. Meanwhile, if AWS decides to jack up
| their prices, well you're pretty much screwed in the short-
| term if you designed around their cloud services.
| fulafel wrote:
| I'd add another criticism to the whole quote:
|
| > Data is the most critical part of your infrastructure. You
| lose your network: that's downtime. You lose your data: that's
| a company ending event. The markup cost of using RDS (or any
| managed database) is worth it.
|
| You need well-run, regularly tested, air gapped or otherwise
| immutable backups of your DB (and other critical biz data).
| Even if RDS was perfect, it still doesn't protect you from the
| things that backups protect you from.
|
| After you have backups, the idea of paying enormous amounts for
| RDS in order to keep your company from ending is more far-
| fetched.
| afpx wrote:
| That's the cost of two people.
| raffraffraff wrote:
| I agree that RDS is stupidly expensive and not worth it
| provided that the company actually hires at least 2x full-time
| database owners who monitor, configure, scale and back up
| databases. Most startups will just save the money and let
| developers "own" their own databases or "be responsible for"
| uptime and backups.
| rr808 wrote:
| For a couple hundred grand you can get a team of 20 fully
| trained people working full time in most parts of the world.
| CSMastermind wrote:
| So by and large I agree with the things in this article. It's
| interesting that the points I disagree with the author on are all
| SaaS products:
|
| > Moving off JIRA onto linear
|
| I don't get the hype. Linear is fine and all but I constantly
| find things I either can't or don't know how to do. How do I make
| different ticket types with different sets of fields? No clue.
|
| > Not using Terraform Cloud No Regrets
|
| I generally recommend Terraform Cloud - without it, it's easy
| to grow your own in-house system that works fine for a few
| years and gradually ends up costing you in the long run.
|
| > GitHub actions for CI/CD Endorse-ish
|
| Use Gitlab
|
| > Datadog Regret
|
| Strong disagree - it's easily the best monitoring/observability
| tool on the market by a wide margin.
|
| Cost is the most common complaint and it's almost always from
| people who don't have it configured correctly (which to be fair
| Datadog makes it far too easy to misconfigure things and blow up
| costs).
|
| > Pagerduty Endorse
|
| Pagerduty charges like 10x what Opsgenie does and offers no
| better functionality.
|
| When I had a contract renewal with Pagerduty I asked the sales
| rep what features they had that Opsgenie didn't.
|
| He told me they're positioning themselves as the high end brand
| in the market.
|
| Cool so I'm okay going generic brand for my incident reporting.
|
| Every CFO should use this as a litmus test to understand if their
| CTO is financially prudent IMO.
| crabmusket wrote:
| We moved from Trello to Linear and it's been fantastic. I hope
| to never work at an organisation large enough for JIRA to be a
| good idea.
| CSMastermind wrote:
| To be fair Linear does strike me as everything everyone
| always hoped Trello would be.
|
| So if that's the upgrade path you're going down I'd expect it
| to be fantastic.
| cqqxo4zV46cp wrote:
| Newer (aka next gen aka Team-managed) Jira projects are
| pretty solid.
| FridgeSeal wrote:
| Do jira pages still take 30 seconds to load, and have all
| the interaction speed of cold molasses? Does it have nice
| keyboard shortcuts yet? Do I still need to perform an
| arcane ritual of setup to get the ticket statuses to be
| what I want?
|
| Linear has been such a breath of fresh air, with such a
| solid desktop app (on Mac OS) that I don't ever want to go
| back. Stuff happens _instantly_, the layout and semantics
| are an excellent "90% good enough" that I would happily
| relegate jira to only the most enterprise of enterprise
| projects.
| coffeebeqn wrote:
| At one of the bigger companies I was at we had an on-prem
| JIRA in the same office building and it was still so slow
| that I would often forget why I was loading that page
| Cacti wrote:
| trigger warning please on the Jira stuff
| crabmusket wrote:
| Linear is making (fairly) good on the promises of local-
| first software. As opposed to "every click is a round
| trip to the server" software.
| mjfisher wrote:
| No, Jira loading is relatively OK and on par with other
| SPAs. It's got a CTRL+SHIFT+P style actions menu for
| tickets which helps cut down on point and click pain
| (especially for linking issues etc). Setting up statuses
| and workflows and how they map to a board is relatively
| straightforward.
|
| There are lots of things where Jira falls short, but the
| pain points on an under-resourced self hosted instance of
| ten years ago are nothing like the ones you'll find on
| Jira cloud today.
| aniforprez wrote:
| Does Jira still have multiple flavours of markdown for
| different fields and editors? Last I used it, it used a
| different flavour for creating and editing a ticket. Also
| another flavour for bitbucket. None of these were
| compatible and it would convert between them in the
| backend but I was left confused every time when I would
| have to switch formatting styles
| mjfisher wrote:
| I remember that from a while back, and getting annoyed -
| it doesn't appear to be something that annoys me at the
| moment so it might have been fixed, but on reflection I
| tend to just use the default rich text editor now.
|
| It takes markdownish input but converts it to rich text
| as you type - so asterisk-space starts a bullet point
| list, etc.
|
| I actually can't remember if it has a dedicated markdown
| mode anymore; the rich text editing supports the usual
| shortcuts that mean I tend to stick with it.
| tootie wrote:
| Interesting. Atlassian also just launched an integration with
| OpsGenie. I have the same opinion of JIRA. I've tried many
| competitors (not Linear so far) and regretted it every time.
| Jedd wrote:
| > Atlassian also just launched an integration with OpsGenie.
|
| Given Atlassian bought OpsGenie in 2018, this is somewhere
| between _quite late_ and _unsurprising_.
| rswail wrote:
| Two different measurements (time and Atlassian development
| processes) that are orthogonal.
|
| Anything Atlassian does is mostly quite late and its
| integration story is so pathetic that it's unsurprising.
|
| Try to have a bitbucket pipeline that pushes to confluence.
| Seems like a basic integration to have, after all,
| Confluence has an API (well, actually it has 3 different
| ones) so surely Atlassian would make a basic thing like
| "publish a wiki page" a thing you get out of the box.
|
| Nope.
| Jedd wrote:
| Oh, I am no great fan. Plus I have a nascent blog post on
| the subject of 'can you believe ...?' items around this
| subject.
|
| I suppose it comes back to the comparative priorities (as
| evaluated by recurrent revenue) of ticking rfq boxes vs
| solving actual problems.
| jacurtis wrote:
| I'm not sure they just launched anything. OpsGenie has been
| an Atlassian product for 5 or more years now. I've been using
| it for 3-4 myself and its been integrated with Jira the whole
| time.
|
| In fact, OpsGenie has mostly been on Auto-pilot for a few
| years now.
| steveBK123 wrote:
| Agreed on PagerDuty. It doesn't really do a lot, administering
| it is fairly finicky, and most shops barely use half the
| functionality it has anyway.
|
| To me its whole schedule interface is atrocious for its price,
| given from an SRE/dev perspective, that's literally its purpose
| - scheduled escalations.
| colechristensen wrote:
| PagerDuty's cheapest plan is $21 per user month
|
| OpsGenie's cheapest is $9 per user month but arbitrarily
| crippled, the plan anybody would want to use is $19 per user
| month
|
| So instead of a factor of ten it's ten percent cheaper. And I
| just kind of expect Atlassian to suck.
|
| Datadog is ridiculously expensive and on several occasions I've
| run into problems where an obvious cause for an incident was
| hidden by bad behavior of datadog.
| compumike wrote:
| Heii On-Call is $32 per month total for your team -- not per
| user. https://heiioncall.com/ (Full disclosure: part of the
| team building it)
| avemg wrote:
| How do you pronounce that?
| revscat wrote:
| "Hey".
| solatic wrote:
| Looks super interesting, and that $3/month for hobbyists is
| just low enough to meet my budget for hobby services, but
| please, for on-call stuff, you gotta have alerts that make
| phone calls. Nothing else is going to wake me in the middle
| of the night. This is the #1 feature I expect from an on-
| call service - you're on-call because you will be _called_.
| compumike wrote:
| Thanks for the feedback!
|
| We use iOS "Critical Alerts" and similar on Android that
| breaks through any Do-Not-Disturb settings.
| https://heiioncall.com/blog/better-alerting-for-heii-on-
| call... Would you be willing to give that a shot? It
| wakes me every time :)
|
| (It's configurable too; we have vibrate-only or silenced
| modes. Think old-school beeper.)
|
| In the rare case that it doesn't wake you, we have
| configurable escalation strategies to alert someone else
| on your team after a configurable number of minutes.
| mads_quist wrote:
| We are building a great and affordable incident
| escalation tool as well:
|
| https://allquiet.app
|
| With SMS, Phone Calls and Critical Alerts / DnD override.
|
| We're 5 USD/user.
|
| We try to build as close to our users as possible. Happy
| for any new try outs! :)
|
| (I am co founder)
| jpb0104 wrote:
| I just started building out on-call rotation scheduling to
| fit teams that already have an alerting solution and need
| simple automated scheduling. I'd love to get some feedback:
| https://majorpager.com
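|
| For anyone curious, the core of a rotation scheduler is tiny;
| a toy Python version (names are placeholders, and a real tool
| also needs overrides, handoffs, and so on):
|
|     from datetime import date
|
|     ENGINEERS = ["alice", "bob", "carol"]
|
|     def on_call(day: date) -> str:
|         # Rotate weekly by ISO week number; note the order
|         # can repeat across a year boundary.
|         week = day.isocalendar()[1]
|         return ENGINEERS[week % len(ENGINEERS)]
|
|     print(on_call(date.today()))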
| skrtskrt wrote:
| Grafana OnCall can be self hosted for free or you can pay $20
| a month, and still always have the option to migrate to self
| hosting if you want to save money
| macNchz wrote:
| > Cost is the most common complaint and it's almost always from
| people who don't have it configured correctly (which to be fair
| Datadog makes it far too easy to misconfigure things and blow
| up costs).
|
| I loved Datadog 10 years ago when I joined a company that
| already used it where I never once had to think about pricing.
| It was at the top of my list when evaluating monitoring tools
| for my company last year, until I got to the costs. The pricing
| page itself made my head swim. I just couldn't get behind
| subscribing to something with pricing that felt designed to be
| impossible to reason about, even if the software is best in
| class.
| gen220 wrote:
| I'm a big fan of Datadog from multiple angles.
|
| Their pricing setup is evil. Breaking out by SKUs and having
| 10+ SKUs is fine, trialing services with "spot" prices before
| committing to reserved capacity is also fine.
|
| But (for some SKUs, at least) they make it really difficult
| to be confident that the reserved capacity you're purchasing
| will cover your spot use cases. Then, they make you contact a
| sales rep to lower your reserved capacity.
|
| It all feels designed to get you to pay the "spot" rate for
| as long as possible, and it's not a good look.
|
| I understand the pressures on their billing and sales teams
| that lead to these patterns, but they don't align with their
| customers in the long term. I hope they clean up their act,
| because I agree they're losing some set of customers over it.
| viraptor wrote:
| Another annoying thing is that the billing dashboards do
| not map clearly to what's on the pricing pages / in the
| contract. Good luck figuring out the extras for RUM when
| you have multiple orgs.
|
| Then they have things that I wanted to try for a long time,
| but... support doesn't care? Repeated "would you like to
| use this? / very likely, can we try it out? / (silence)". I
| love their product, but they are so annoying to deal with
| at the billing level.
| iaresee wrote:
| > Another annoying thing is that the billing dashboards
| do not map clearly to what's on the pricing pages / in
| the contract. Good luck figuring out the extras for RUM
| when you have multiple orgs.
|
| I, quite literally, was griping to my Datadog CSM about
| this exact thing last week. They'll email me and be, "Oh,
| you know your logging volume this month put you into
| on-demand indexing rates, right?" and my answer is
| always, "No, because your monitoring platform makes it
| nearly impossible for me to monitor it correctly."
|
| You can't reference your contracted volume rates when
| building monitors out and the units for the metrics you
| need to watch don't match the units you contract with
| them on the SKU.
|
| Maddening.
| Solvency wrote:
| And why do you continue to deal with scum like this?
| You're ultimately going to pay it and business will carry
| on as usual for them.
| kevinslin wrote:
| > You can't reference your contracted volume rates when
| building monitors out and the units for the metrics you
| need to watch don't match the units you contract with
| them on the SKU.
|
| Are you referring to the
| `datadog.estimated_usage.logs.ingested_events` metric? It
| includes excluded events by default but you can get to
| your indexed volume by filtering out excluded logs:
| `sum:datadog.estimated_usage.logs.ingested_events
| {datadog_index:*,datadog_is_excluded:false}.as_count()`
| jacurtis wrote:
| > Datadog makes it far too easy to misconfigure things and
| blow up costs
|
| I'll give you a fun example. It's fresh in my mind because I
| just got reamed out about it this week.
|
| In our last contract with DataDog, they convinced us to try
| out the CloudSIEM product, we put in a small $600/mo
| commitment to it to try it out. Well, we never really set it
| up and it sat on autopilot for many months. We fell under our
| contract rate for it for almost a year.
|
| Then last month we had some crazy stuff happen and we were
| spamming logs into DataDog for a variety of reasons. I knew I
| didn't want to pay for these billions of logs to be indexed,
| so I made an exclusion filter to keep them out of our log
| indexes so we didn't have a crazy bill for log indexing.
|
| So our rep emailed me last week and said "Hey just a heads up
| you have $6,500 in on-demand costs for CloudSIEM, I hope that
| was expected". No, it was NOT expected. Turns out excluding
| logs from indexing does not exclude them from CloudSIEM. Fun
| fact, you will not find any documented way to exclude logs
| from CloudSIEM ingestion. It is technically possible, but
| only through their API and it isn't documented. Anyway, I
| didn't do or know this, so now I had $6,500 of on-demand
| costs plus $400-500 misc on-demand costs that I had to
| explain to the CTO.
|
| I should mention my annual review/pay raise is also next week
| (I report to the CTO), so this will now be fresh in their
| mind for that experience.
| macNchz wrote:
| That's just the sort of hypothetical scenario that kept
| running through my head as I tried to find a way for us to
| use Datadog. I even particularly wanted to use the
| CloudSIEM product. Bummer.
| xtracto wrote:
| Datadog is a freaking beast. My wife works at Workday (a huge
| employee management system) and they have a very large number
| of tutorials, videos, "working hours" and other tools to
| ensure their customers are making the best use of it.
|
| Datadog on the other side... their "DD University" is a shame
| and we as paying customers are overwhelmed and with no real
| guidance. DD should assign some time for integration for new
| customers, even if it is proportional to what you pay
| annually. (I think I pay around 6000 USD annually.)
| jacurtis wrote:
| I mostly agreed with OP's article, but you basically nailed all
| of the points of disagreement I did have.
|
| Jira: It's overhyped and overpriced. Most people HATE Jira. I
| guess I don't care enough. I've never met a ticket system that
| I loved. Jira is fine. It's overly complex, sure. But once you
| set it up, you don't need to change it very often. I don't
| love it, I don't hate it. No one ever got fired for choosing
| Jira, so it gets chosen. Welcome to the tech industry.
|
| Terraform Cloud: The gains for Terraform Cloud are minimal. We
| just use Gitlab for running Terraform pipelines and have a
| super nice custom solution that we enjoy. It wasn't that hard
| to do either. We maintain state files remotely in S3 with
| versioning for the rare cases when we need to restore a
| foobar'd statefile. Honestly I like having Terraform pipelines
| in the same place as the code and pipelines for other things.
|
| GitHub Actions: Yeah switch to GitLab. I used to like Github
| Actions until I moved to a company with Gitlab and it is best
| in class, full stop. I could rave about Gitlab for hours. I
| will evangelize for Gitlab anywhere I go that is using anything
| else.
|
| DataDog: As mentioned, DataDog is the best monitoring and
| observability solution out there. The only reason NOT to use it
| is the cost. It is absurdly expensive. Yes, truly expensive. I
| really hate how expensive it is. But luckily I work somewhere
| that lets us have it and its amazing.
|
| Pagerduty: Agree, switch to OpsGenie. Opsgenie is considerably
| cheaper and does all the pager stuff of PagerDuty. All the
| stuff that PagerDuty tries to tack on top to justify its cost
| is stuff you don't need. OpsGenie does all the stuff you need.
| It's fine. Similar to Jira, it's not something anyone wants
| anyway. No one's going to love it, no one loves being on call.
| So just save money with OpsGenie. If you're going to fight for
| the "brand name" of something, fight for DataDog instead, not a
| cooler pager system.
| bigstrat2003 wrote:
| I'm right there with you on Jira. The haters are wrong - it's
| a decent enough ticket system, no worse than anything else
| I've used. You can definitely torture Jira into something
| horrible, but that's not Jira's fault. Bad managers will ruin
| _any_ ticket system if they have the customization tools to
| do so.
| Cacti wrote:
| Using Jira feels like using IBM enterprise web software
| from 2005, and I am simply not going to make my teams put
| up with that amount of inanity.
| rswail wrote:
| We switched to JIRA around 2005 _away_ from IBM
| enterprise web software, because it was a breath of fresh
| air.
|
| So on the standard tech hype cycle, that sounds about
| right.
| mixmastamyk wrote:
| Found the person who never used _Lotus Notes_ haha.
| Cacti wrote:
| I was blown away when I found out a couple years ago that
| there were major corporations still using that as their
| primary communication platform.
| mixmastamyk wrote:
| Surely has improved in the last 20+ years? :hope:
| matwood wrote:
| Yeah, usually Jira hate is really convoluted company
| process hate. Of course the Jira software isn't perfect,
| but it's fine. Jira's strength and weakness is its
| flexibility.
| benced wrote:
| After their ridiculous outage, I wouldn't touch OpsGenie with a
| 10ft pole.
| mardifoufs wrote:
| Why gitlab? GitHub actions are a mess but gitlab online's ci cd
| is not much better at all, and for self hosted it opens a whole
| different can of worms. At least with GitHub actions you have a
| plugin ecosystem that makes the super janky underlying platform
| a bit more bearable.
| YoshiRulz wrote:
| I've found GitLab CI's "DAG of jobs" model has made
| maintenance and, crucially for us, optimisation relatively
| easy. Then I look into GitHub Actions and... where are the
| abstraction tools? How do I cache just part of my "workflow"?
| Plugins be damned. GitLab CI is so good that I'm willing to
| overlook vendor lock-in and YAML, and use it for our GitHub
| project even without proper integration. (Frankly the rest of
| GitLab seems to always be a couple features ahead, but no-
| one's willing to migrate.)
| mardifoufs wrote:
| Mhmm that's actually a good point!! I didn't realize that I
| couldn't do that with GitHub, I never really used partial
| caching. I just had a lot (a looot) of issues with our
| kubernetes runner (which I even made sure to be as close to
| the vanilla docs example as possible). I guess the grass is
| always greener on the other side :)
| bilalq wrote:
| Linear has a lot going for it. It doesn't support custom
| fields, so if that's a critical feature for you, I can see it
| falling short. In my experience though, custom fields just end
| up being a mess anytime a manager changes and decides to do
| things differently, things get moved around teams, etc.
|
| - It's fast. It's wild that this is a selling point, but it's
| actually a huge deal. JIRA and so many other tools like it are
| as slow as molasses. Speed is honestly the biggest feature.
|
| - It looks pretty. If your team is going to spend time there,
| this will end up affecting productivity.
|
| - It has a decent degree of customization and an API. We've
| automated tickets moving across columns whenever something gets
| started, a PR is up for review, when a change is merged, when
| it's deployed to beta, and when it's deployed to prod. We've
| even built our own CLI tools for being able to action on Linear
| without leaving your shell. (A sketch of that kind of
| automation follows the list.)
|
| - It has a lot of keyboard shortcuts for power users.
|
| - It's well featured. You get teams, triaging, sprints
| (cycles), backlog, project management, custom views that are
| shareable, roadmaps, etc...
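|
| As a sketch of that kind of automation, assuming Linear's
| GraphQL issueUpdate mutation (the API key, issue ID, and
| state ID below are placeholders):
|
|     import requests
|
|     MUTATION = """
|     mutation($id: String!, $stateId: String!) {
|       issueUpdate(id: $id, input: { stateId: $stateId }) {
|         success
|       }
|     }
|     """
|
|     resp = requests.post(
|         "https://api.linear.app/graphql",
|         headers={"Authorization": "lin_api_XXXX"},
|         json={"query": MUTATION,
|               "variables": {"id": "<issue-uuid>",
|                             "stateId": "<state-uuid>"}})
|     resp.raise_for_status()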
| marcinzm wrote:
| > Cost is the most common complaint and it's almost always from
| people who don't have it configured correctly (which to be fair
| Datadog makes it far too easy to misconfigure things and blow
| up costs).
|
| Datadog's cheapest pricing is $15/host/month. I believe that is
| based on the largest sustained peak usage you have.
|
| We run spot instances on AWS for machine learning workflows. A
| lot of them if we're training and none otherwise. Usually we're
| using zero. Using DataDog at its lowest price would basically
| double the cost of those instances.
| data_maan wrote:
| This may be a noob question - but why not use Github Projects
| instead of Linear or Jira?
|
| You're staying within an ecosystem you know and it seems to
| offer almost all of the necessary functionality
| kevinslin wrote:
| In terms of Datadog - the per host pricing on infrastructure in
| a k8/microservices world is perhaps the most egregious of
| pricing models across all datadog services. Triply true if you
| use spot instances for short lived workloads.
|
| For folks running k8 at any sort of scale, I generally
| recommend aggregating metrics BEFORE sending them to datadog,
| either on a per deployment or per cluster level. Individual
| host metrics tend to also matter less once you have a large
| fleet.
|
| You can use opensource tools like veneur
| (https://github.com/stripe/veneur) to do this. And if you don't
| want to set this up yourself, third party services like Nimbus
| (https://nimbus.dev/) can do this for you automatically (note
| that this is currently a preview feature). Disclaimer also that
| I'm the founder of Nimbus (we help companies cut datadog costs
| by over 60%) and have a dog in this fight.
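|
| The pre-aggregation idea itself is simple; a toy Python
| illustration (pod names and counts are made up) that
| collapses per-pod counters into one per-deployment series
| before anything is shipped to the vendor:
|
|     from collections import defaultdict
|
|     # (deployment, pod) -> requests served this interval
|     raw = {("api", "api-7f9c"): 120,
|            ("api", "api-x2x1"): 95,
|            ("web", "web-a1b2"): 40}
|
|     per_deployment = defaultdict(int)
|     for (deployment, _pod), count in raw.items():
|         per_deployment[deployment] += count
|
|     print(dict(per_deployment))  # {'api': 215, 'web': 40}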
| lijok wrote:
| > I generally recommend Terraform Cloud
|
| I'll be dead in the ground before I use TFC. 10 cents per
| resource per month my ass. We have around 100k~ resources at an
| early-stage startup I'm at, our AWS bill is $50~/mo and TFC
| wants to charge me $10k/mo for that? We can hire a senior dev
| to maintain an in-house tool full time for that much.
| 005 wrote:
| Interesting read, I agree with adopting an identity platform but
| this can definitely be contentious if you want to own your data.
|
| I wonder how much one should pay attention to future problems at
| the start of a startup versus "move fast and break things." Some
| of this stuff might just put you off finishing.
| sakopov wrote:
| Who's using Pulumi here and how mature is it in comparison to
| terraform?
| jryan49 wrote:
| I think currently under the hood it's actually still terraform.
| I know they are working on their own native providers.
| dmattia wrote:
| I'm using Pulumi in production pretty heavily for a bunch of
| different app types (ECS, EKS, CloudFront, CloudFlare, Vault,
| Datadog monitors, Lambdas of all types, EC2s with ASGs, etc.),
| it's reasonably mature enough.
|
| As mentioned in the other comment, the most commonly used
| providers for terraform are "bridged" to pulumi, so the
| maturity is nearly identical to Terraform. I don't really use
| Pulumi's pre-built modules (Crosswalk), but I don't find I've
| ever missed them.
|
| I really like both Pulumi and Terraform (which I also used in
| production for hundreds of modules for a few years), which it
| seems like isn't always a popular opinion on HN, but I have and
| you absolutely can run either tool in production just fine.
|
| My slight preference is for Pulumi because I get slightly more
| willing assistance from devs on our team to reach in and change
| something in infra-land if they need to while working on app
| code.
|
| We do still use some Pulumi and some Terraform, and they play
| really nicely together: https://transcend.io/blog/use-
| terraform-pulumi-together-migr...
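|
| For a sense of scale, a complete Pulumi program in Python is
| only a few lines (the resource name is illustrative):
|
|     import pulumi
|     import pulumi_aws as aws
|
|     # One S3 bucket; `pulumi up` previews and applies it.
|     bucket = aws.s3.Bucket("app-assets")
|
|     pulumi.export("bucket_name", bucket.id)
|
| That it reads like ordinary application code is, in my
| experience, exactly why app devs are more willing to touch it.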
| rswail wrote:
| IaaC is one of the worst acronyms ever.
|
| Infrastructure should be _declared_ , not coded.
|
| Say what you want. The tool then builds that, or changes
| what's there to match.
|
| I've tried Pulumi and understanding the bit that runs before it
| tries to do stuff and the bit that runs after it tries to do
| stuff and working out where the bugs are is a PITA. It lulls
| you into a false sense of security that you can refer to your
| own variables in code, but that doesn't get carried over to
| when it is actually running the plan on the cloud service (ie
| actually creating the infrastructure) because you can only
| refer to the outputs of other infrastructure.
|
| CFN is too far in the other direction, primarily because it's
| completely invisible and hard to debug.
|
| Terraform has _enough_ programmability (eg for_each, for-
| expressions etc) that you can write "here is what I want and
| how the things link together" and terraform will work out how
| to do it.
|
| The language is... sometimes painful, but it works.
|
| The provider support is unmatched and the modules are of
| reasonable quality.
| breckenedge wrote:
| > There are no great FaaS options for running GPU workloads,
| which is why we could never go fully FaaS.
|
| I keep wondering when this is going to show up. We have a lot of
| service providers, but even more frameworks, and every vendor
| seems to have their own bespoke API.
| z3ugma wrote:
| Check out beam.cloud. They're impressing me with how they
| offer GPU runtimes as a FaaS.
| gfodor wrote:
| I just started playing with modal.com and so far it seems good.
| I haven't run anything in production yet, so YMMV.
| gen220 wrote:
| I don't think anybody should go "fully FaaS", it's like saying
| screwdrivers are useless, all you need is a hammer.
|
| That being said, Cloudflare is on the path to offering a great
| GPU FaaS system for inference.
|
| I believe it's still in beta, but it's the most promising
| option at the moment.
| breckenedge wrote:
| Right, I still find it faster to manually provision a
| specific instance type, install PyTorch on it, and deploy a
| little flask app for an inference server.
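|
| That pattern really is only a few lines; a toy Python sketch
| (the model here is a stand-in for real PyTorch code):
|
|     from flask import Flask, jsonify, request
|
|     app = Flask(__name__)
|
|     def model(xs):
|         # Placeholder for a loaded PyTorch model.
|         return [x * 2 for x in xs]
|
|     @app.route("/predict", methods=["POST"])
|     def predict():
|         xs = request.get_json()["inputs"]
|         return jsonify({"outputs": model(xs)})
|
|     if __name__ == "__main__":
|         app.run(host="0.0.0.0", port=8000)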
| hermanradtke wrote:
| Without some sort of background on cost or scale it is hard to
| judge any of these decisions.
| cratermoon wrote:
| Even if others disagree with your endorsements or regrets, this
| record shows you're actually aware of the important decisions you
| made over the past four years and tracked outcomes. Did you
| record the decisions when you made them and revisit later?
| guhcampos wrote:
| Well it's a bit unfortunate this post was published on Feb
| 1st; it got really outdated really fast around the "choose
| flux for gitops" part.
| CoolCold wrote:
| Mind sharing a bit more of the details?
| medina wrote:
| > engineers at Weaveworks built the first version of Flux
|
| > Weaveworks donated Flux and Flagger to the CNCF
|
| https://fluxcd.io/blog/2022/11/flux-is-a-cncf-graduated-
| proj...
|
| > Weaveworks will be closing its doors and shutting down
| commercial operations
|
| > Alexis Richardson, 5 Feb 2024
|
| https://www.linkedin.com/posts/richardsonalexis_hi-
| everyone-...
|
| If the project has legs, it's now under CNCF.
| plagiarist wrote:
| What's the news there? I was just about to try it out this
| weekend.
| alexjurkiewicz wrote:
| Context
| https://www.silverliningsinfo.com/automation/weaveworks-unra...
| zeeZ wrote:
| So far it seems fine, and the maintainers seem to be doing OK
| too.
|
| Is the project future at risk?
| https://github.com/fluxcd/flux2/discussions/4544
| sroussey wrote:
| If you are a startup that can't afford a DBA, then why, why,
| why are you using Kubernetes?
| jrockway wrote:
| Why wouldn't you use Kubernetes? There are basically 3 classes
| of deployments:
|
| 1) We don't have any software, so we don't have a prod
| environment.
|
| 2) We have 1 team that makes 1 thing, so we just launch it out
| of systemd.
|
| 3) We have between 2 and 1000 teams that make things and want
| to self-manage when stuff gets rolled out.
|
| Kubernetes is case 3. Like it or not, teams that don't
| coordinate with each other is how startups scale, just like big
| companies. You will never find a director of engineering that
| says "nah, let's just have one giant team and one giant
| codebase".
| otterley wrote:
| On AWS, at least, there are alternatives such as ECS and even
| plain old EC2 auto scaling groups. Teams can have the
| autonomy to run their infrastructure however they like
| (subject to whatever corporate policy and compliance regime
| requirements they might have to adhere to).
|
| Kubernetes is appealing to many, but it is not 100%
| frictionless. There are upgrades to manage, control plane
| limits, leaky abstractions, different APIs from your cloud
| provider, different RBAC, and other things you might prefer
| to avoid. It's its own little world on top of whatever world
| you happen to be running your foundational infrastructure on.
|
| Or, as someone has artistically expressed it:
| https://blog.palark.com/wp-
| content/uploads/2022/05/kubernete...
| ezrast wrote:
| The alternatives aren't frictionless either; many items
| from that image are not specific to Kubernetes. I
| personally find AWS API's frustrating to use, so even if I
| were running a one-person shop (and was bound to AWS for
| some reason - maybe a warlock has cursed me?) I'd lean
| towards managing things from EKS to get an interface that
| fits my brain better. It's just preference, though - EC2
| auto-scaling is perfectly viable if that's your jam.
| jrockway wrote:
| The iceberg is fine, but using ECS doesn't absolve you from
| needing to care about monitoring, affinity, audit logging,
| OS upgrades, authentication/IAM, etc. That's generally why
| organizations choose to have infrastructure teams, or to
| not have infrastructure at all.
|
| I have seen people rewrite Kubernetes in CloudFormation.
| You can do it! But it certainly isn't problem-free.
| otterley wrote:
| ECS Fargate does manage the security of the node up to
| and including the container runtime. Patches are often
| applied behind the scenes, without many folks even
| knowing, and for those that require interruption, a
| restart of the task will land it on a patched node.
|
| You're right that if you use a cloud provider, IAM is
| something that has to be reckoned with. But the question
| is, how many implementations of IAM and policy mechanisms
| do I want to deal with?
| klooney wrote:
| K8S has a credible local development and testing story, ECS
| and ASGs do not. The fact that there's a generic interface
| for load-balancer like things, and then you can have a
| different implementation on your laptop, in the datacenter,
| and in AWS, and everything ports, is huge.
|
| Also, you can bundle your load balancer config and
| application config together. No more handing a written
| description of the load balancer config plus an RPM file to a
| disinterested team.
| kccqzy wrote:
| One giant codebase is fine. Monorepo is better than lots of
| scattered repos linked together with git hashes. And it
| doesn't really get in the way of each team managing when
| stuff gets rolled out.
| jrockway wrote:
| I'm a big monorepo fan, but you run into that ownership
| problem. "It's slow to clone"; which team fixes that?
| Yasuraka wrote:
| some bored guy at $trillion_dollar_company
|
| https://github.com/martinvonz/jj
| https://github.com/facebook/sapling
| vander_elst wrote:
| Google has one _giant_ codebase. I am pretty sure they aren't
| the only ones.
| ezrast wrote:
| Because it works, the infra folks you hired already know how to
| use it, the API is slightly less awful than working with AWS
| directly, and your manifests are kinda sorta portable in case
| you need to switch hosting providers for some reason.
| tomas789 wrote:
| This is my case. I'm a one-man show ATM, so no DBA. I'm still
| using Kubernetes. Many things can be automated as simply as a
| helm apply. Plus you get the benefit of not having a hot mess
| of systemd services, ad hoc tools which you don't remember how
| you configured, a plethora of bash scripts to do common tasks,
| and so on.
|
| I see Kubernetes as a one-time (mental and time) investment that
| buys me somewhat smoother sailing plus some other benefits.
|
| Of course it is not all rainbows and unicorns. Having a single
| nginx server for a single /static directory would be my dream
| instead of MinIO and such.
| sroussey wrote:
| I wouldn't push to implement Kubernetes until I had 100
| engineers and a reason to use it.
| maccard wrote:
| Because I can go from main.go to a load balanced, autoscaling
| app with rolling deploys, segregated environments, logging &
| monitoring in about 30 minutes, and never need to touch _any_
| of that again. Plus, if I leave, the guy who comes after me can
| look at a helm chart, terraform module + pipeline.yml and
| figure out how it works. Meanwhile, our janky shell-script-based
| task scheduler craps out on something new every month. What
| started as 15 lines of "docker run X, sleep 30, docker kill X"
| is now a polyglot monster to handle all sorts of edge cases.
|
| I have spent vanishingly close to 0 hours on maintaining our
| (managed) kubernetes clusters in work over the past 3 years,
| and if I didn't show up tomorrow my replacement would be fine.
| sroussey wrote:
| I spent zero hours on a MySQL server on bare hardware for
| seven years.
|
| Admittedly, I was afraid of ever restarting as I wasn't sure
| it would reboot. But still...
| viraptor wrote:
| You better invest some time in migrating away from your 5.7
| (or earlier) in that case, because it's EOL already ;)
| maccard wrote:
| You still need to get mysql installed and configured
| though. On AWS, it's 30 lines of terraform for RDS on an
| internal subnet with a security group only allowing access
| from your cluster.
|
| For that, you get automated backups, very simple read
| proxies, and managed updates if you ever need them. You can
| vertically scale down, or up to the point of "it's cheaper
| to hire a DBA to fix this".
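|
| A rough sketch of what those ~30 lines can look like -- a hedged
| example, not the poster's actual config; names, sizes, and the
| MySQL engine are my assumptions:
|
|   # Only the app cluster's security group may reach the database.
|   resource "aws_security_group" "db" {
|     name   = "db"
|     vpc_id = var.vpc_id
|
|     ingress {
|       from_port       = 3306
|       to_port         = 3306
|       protocol        = "tcp"
|       security_groups = [var.cluster_sg_id]
|     }
|   }
|
|   # Keep the instance on internal subnets only.
|   resource "aws_db_subnet_group" "db" {
|     name       = "db"
|     subnet_ids = var.private_subnet_ids
|   }
|
|   resource "aws_db_instance" "db" {
|     identifier                  = "app-db"
|     engine                      = "mysql"
|     instance_class              = "db.t4g.medium"
|     allocated_storage           = 50
|     db_subnet_group_name        = aws_db_subnet_group.db.name
|     vpc_security_group_ids      = [aws_security_group.db.id]
|     backup_retention_period     = 7     # the automated backups
|     publicly_accessible         = false
|     username                    = "admin"
|     manage_master_user_password = true  # let RDS keep the secret
|   }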
| yellow_lead wrote:
| If you can do all that in 30 minutes (or even a few hours), I
| would love to read an article/post about your setup, or any
| resources you might recommend.
| maccard wrote:
| I've just done it a dozen times at this point. Hello world
| from gin-gonic [0], terraform file with a DO K8s cluster
| [1] and load balancer, and CI/CD [2] on deploy. There's
| even time to make a cuppa when you run terraform.
|
| We use this for our internal services at work, and the last
| time I touched the infra was in 2022 according to git
|
| [0] https://github.com/gin-gonic/gin
|
| [1] https://gist.github.com/donalmacc/0efbb0b377533232da3f7
| 76c60....
|
| [2] https://docs.digitalocean.com/products/kubernetes/how-
| to/dep...
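|
| For reference, the application side of that stack ([0]) really
| is tiny. A minimal sketch -- the health-check route is my own
| addition, something for the load balancer to probe:
|
|   package main
|
|   import "github.com/gin-gonic/gin"
|
|   func main() {
|       r := gin.Default()
|       // Cheap endpoint for load balancer health checks.
|       r.GET("/healthz", func(c *gin.Context) {
|           c.JSON(200, gin.H{"status": "ok"})
|       })
|       r.Run(":8080") // listens on 0.0.0.0:8080
|   }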
| yellow_lead wrote:
| Thanks! Does DO K8s come with sufficient monitoring /
| logging or do you add anything?
| yolo3000 wrote:
| You can just deploy other applications to Kubernetes, for
| example you can deploy this operator https://prometheus-
| operator.dev/ and you get Prometheus and Grafana running
| with a bunch of dashboards already created. Then you
| annotate your pods to tell Prometheus what to scrape, and
| you got monitoring. It also comes with AlertManager for
| alerting. Same for logging, you deploy Elasticsearch and
| Kibana and you're good to go.
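|
| With prometheus-operator specifically, scrape targets are usually
| declared via a ServiceMonitor rather than bare pod annotations --
| a minimal sketch with made-up names:
|
|   apiVersion: monitoring.coreos.com/v1
|   kind: ServiceMonitor
|   metadata:
|     name: my-app
|     labels:
|       release: prometheus    # must match the operator's selector
|   spec:
|     selector:
|       matchLabels:
|         app: my-app          # the Service carrying this label
|     endpoints:
|       - port: http-metrics   # a named port on that Service
|         path: /metrics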
| maccard wrote:
| As the other commenter said, you can deploy
| Prometheus/grafana into the k8s cluster and it pretty
| much does what you want it to do.
| flemhans wrote:
| You'll need to touch it again. These paid services tend to
| change all the time.
|
| You also need to pay them, which is itself an event.
| kwillets wrote:
| To make up for having a better schema in Terraform than in the
| database.
| paulgb wrote:
| I think a lot of startups have a set of requirements that is
| something like:
|
| - I want to spin up multiple redundant instances of some set of
| services
|
| - I want to load balance over those services
|
| - I want some form of rolling deploy so that I don't have
| downtime when I deploy
|
| - I want some form of declarative infrastructure, not click-ops
|
| Given these requirements, I can't think of an alternative to
| managed k8s that isn't more complex.
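|
| For what it's worth, on managed k8s the first three requirements
| collapse into one Deployment plus one Service, and the manifest
| itself is the declarative part. A minimal sketch (names and
| image are invented):
|
|   apiVersion: apps/v1
|   kind: Deployment
|   metadata:
|     name: web
|   spec:
|     replicas: 3                # redundant instances
|     strategy:
|       type: RollingUpdate      # no-downtime deploys
|     selector:
|       matchLabels:
|         app: web
|     template:
|       metadata:
|         labels:
|           app: web
|       spec:
|         containers:
|           - name: web
|             image: registry.example.com/web:1.0.0
|             ports:
|               - containerPort: 8080
|   ---
|   apiVersion: v1
|   kind: Service
|   metadata:
|     name: web
|   spec:
|     type: LoadBalancer         # load balance over the replicas
|     selector:
|       app: web
|     ports:
|       - port: 80
|         targetPort: 8080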
| sroussey wrote:
| A startup with no DBA does not need redundant anything. Too
| small.
| mardifoufs wrote:
| Uh? Even some larger startups don't have DBAs anymore. For
| better or for worse. Hell, even the place I currently work
| in, which is not a startup at all, has basically no DBA role
| to speak of.
| slyall wrote:
| Places get pretty big with no dedicated DBA resources these
| days. Last place I was at was a Fintech SaaS with 50
| engineers and half a million paying customers.
|
| Running off a couple of medium ( $3k/month each range ) RDS
| databases with failover setup. ECS for apps.
|
| Databases looked after themselves. The senior people
| probably spent 20% of a FTE on stuff like optimizing it
| when load crept up.
|
| Place before that was a similar size and no DBA either.
| People just muddled through.
| paulgb wrote:
| This is a sweeping generalization to make, and I think you
| underestimate how easy it is to achieve redundancy with
| modern tools these days.
|
| My company uses redundant services because we like to
| deploy frequently, and our customers notice if our API
| breaks while the service is restarted. Running the service
| redundantly allows us to do rolling deploys while
| continuing to serve our API. It's also saved us from
| downtime when a service encounters a weird code path and
| crashes.
| fulafel wrote:
| AWS Copilot (if you're on AWS). It's a bit like the older
| Elastic Beanstalk for EC2.
| klooney wrote:
| Helm is the only infrastructure package manager I've ever used
| where you could reliably get random third party things running
| without a ton of hassle. It's a huge advantage.
| lysecret wrote:
| Because they are on AWS and can't use Cloud Run.
| endisneigh wrote:
| Great post. I do wonder - what are the simplest K8s alternatives?
|
| Many say in the database world, "use Postgres" or "use sqlite."
| Similarly, there are robust databases that no one has heard of
| but that are very limited, like FoundationDB. Or things that
| are specialized and generally respected, like Clickhouse.
|
| What are the equivalents of above for Kubernetes?
| tomas789 wrote:
| You can always use old boring AWS EC2 and such. And sprinkle in
| some Terraform if you feel fancy. That would be my "use sqlite".
|
| Kubernetes is probably "use postgres".
| marcosdumay wrote:
| Kubernetes isn't like that.
|
| It's just that you should start with a handful of backed-up
| pet servers. Then automate their deployment yourself when you
| need it. And only then go for a tool that abstracts the
| automated deployment, when you need that.
|
| But I fear the simplest option on the Kubernetes area is
| Kubernetes.
| doctor_eval wrote:
| I don't know that this is good advice.
|
| I shunned k8s for a long time because of the complexity, but
| the managed options are so much easier to use and deploy than
| pet servers that I can't justify it any more. For anything
| other than truly trivial cases, IMO kubernetes (or
| similar, like nomad) is easier than any alternative.
|
| The stack I use is hosted Postgres and VKS from Vultr. It's
| been rock solid for me, and the entire infrastructure can be
| stored in code.
| lucw wrote:
| This is good advice, if you haven't experienced the pain of
| doing it yourself, you won't know what the framework does for
| you. There are limits to this reasoning of course, we don't
| reimplement everything on the stack just for the learning
| experience. But starting with just docker might be a good
| idea.
| busterarm wrote:
| The simplest k8s alternative (that is an actual alternative) is
| Nomad.
| Too wrote:
| It's mainly running your own control plane that is complex.
| Managed k8s (EKS, AKS, GKE) is not difficult at all. Don't
| listen to all the haters. It's the same crowd who think they
| can replace systemd with self-hacked init scripts written in
| bash, because they don't trust abstractions and need to see
| everything the computer does step-by-step.
|
| I also stayed away for a long time due to all the fear spread
| here; after taking the leap, I'm not looking back.
|
| The lightweight "simpler" alternative is docker-compose. I put
| simpler in quotes because once you factor in all the auxiliary
| software needed to operate the compose files in a professional
| way (IaC, Ansible, monitoring, auth, VM provisioning, ...), you
| will accumulate the same complexity yourself; the only
| difference is that you are doing it with tools that may be
| more familiar to you. Kubernetes gives you a single control
| plane for all this. Does it come with a learning curve?
| Yes, but once you get over it there is nothing inherent about
| it that makes it unnecessarily complex. You don't _need_ the
| autoscaler, replicasets and those more advanced features just
| because you are on k8s.
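|
| The compose file itself is certainly compact -- something like
| the sketch below, with invented names -- but notice that none of
| the auxiliary concerns above appear anywhere in it:
|
|   services:
|     web:
|       image: registry.example.com/web:1.0.0
|       ports:
|         - "80:8080"
|       restart: unless-stopped
|     db:
|       image: postgres:16
|       volumes:
|         - db-data:/var/lib/postgresql/data
|   volumes:
|     db-data: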
|
| If you want to go even simpler, the clouds have offerings to
| just run a container, serverless, no fuss. I have to
| warn everyone though that using ACI on Azure was the biggest
| mistake of my career. Conceptually it sounds like a good idea,
| but Azure's execution of it is just a joke: updating a very
| small container image takes upwards of 20-30 minutes, there are
| no logs on startup crashes, it randomly stops serving traffic,
| and the integration with storage is bad.
| rmccue wrote:
| > Not using Terraform Cloud
|
| We adopted TFC at the start of 2023 and it was problematic right
| from the start; stability issues, unforeseen limitations, and
| general jankiness. I have no regrets about moving us away from
| local execution, but Terraform Cloud was a terrible provider.
|
| When they announced their pricing changes, the bill for our team
| of 5 engineers would have been roughly 20x, and more than hiring
| an engineer to literally sit there all day just running it
| manually. No idea what they're thinking, apart from hoping their
| move away from open source would lock people in?
|
| We ended up moving to Scalr, and although it hasn't been a long
| time, I can't speak highly enough of them so far. Support was
| amazing throughout our evaluation and migration, and where we've
| hit limits or blockers, they've worked with us to clear them very
| quickly.
| LispSporks22 wrote:
| Can any of your engineers run the product locally and iterate
| fast?
| cissmayazz wrote:
| Yeah, typically we run a single go service or use devspace to
| combine multiple services using published containers.
| hintymad wrote:
| > EKS
|
| My contrarian view is that EC2 + ASG is so pleasant to use. It's
| just conceptually simple: I launch an image into an ASG, and
| configure my autoscale policies. There are very few things to
| worry about. On the other hand, using k8s has always been a big
| deal. We built a whole team to manage k8s. We introduce dozens of
| concepts of k8s or spend person-years on "platform engineering"
| to hide k8s concepts. We publish guidelines and sdks and all
| kinds of validators so people can use k8s "properly". And we
| still write tens of thousands of lines of YAML plus tens of
| thousands of lines of code to implement an operator. Sometimes
| I wonder if k8s is too intrusive.
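|
| The EC2 + ASG setup really is conceptually small. A hedged
| Terraform sketch -- names, sizes, and the CPU target are
| invented:
|
|   resource "aws_launch_template" "app" {
|     name_prefix   = "app-"
|     image_id      = var.ami_id   # the baked app image
|     instance_type = "t3.small"
|   }
|
|   resource "aws_autoscaling_group" "app" {
|     desired_capacity    = 3
|     min_size            = 2
|     max_size            = 10
|     vpc_zone_identifier = var.subnet_ids
|
|     launch_template {
|       id      = aws_launch_template.app.id
|       version = "$Latest"
|     }
|   }
|
|   # Scale on average CPU; everything else is handled for you.
|   resource "aws_autoscaling_policy" "cpu" {
|     name                   = "target-cpu"
|     autoscaling_group_name = aws_autoscaling_group.app.name
|     policy_type            = "TargetTrackingScaling"
|
|     target_tracking_configuration {
|       predefined_metric_specification {
|         predefined_metric_type = "ASGAverageCPUUtilization"
|       }
|       target_value = 60
|     }
|   }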
| xyzzy_plugh wrote:
| I tend to agree that for most things on AWS, EC2 + ASG is
| superior. It's very polished. EKS is very bare bones. I would
| probably go so far as to just run Kubernetes on EC2 if I had to
| go that route.
|
| But in general k8s provides incredibly solid abstractions for
| building portable, rigorously available services. Nothing quite
| compares. It's felt very stable over the past few years.
|
| Sure, EC2 is incredibly stable, but I don't always do business
| on Amazon.
| Noumenon72 wrote:
| At first I thought your "in general" statement was
| contradicting your preference for EC2 + ASG. I guess AWS is
| such a large part of my world that "in general" includes AWS
| instead of meaning everything but AWS.
| cedws wrote:
| K8S is a disastrous complexity bomb. You need millions upon
| millions of lines of code just to build a usable platform.
| Securing Kubernetes is a nightmare. And lock-in never really
| went away because it's all coupled with cloud specific stuff
| anyway.
|
| Many of the core concepts of Kubernetes should be taken to
| build a new alternative without all the footguns. Security
| should be baked in, not an afterthought when you need
| ISO/PCI/whatever.
| xyzzy_plugh wrote:
| This isn't my experience at all. Maybe three or four years
| ago?
|
| Who exactly needs millions of lines of code?
| Spivak wrote:
| I think they're getting more at k8s requiring a whole mess
| of 3rd-party code to actually be useful when bringing it to
| prod. For EKS you end up having coredns, fluentbit, secrets
| store, external dns, aws ebs csi controller, aws k8s cni,
| etc.
|
| And in the end it's hard to say if you've actually gained
| anything except now this different code manages your AWS
| resources like you were doing with CF or terraform.
| mschuster91 wrote:
| We have all of that neatly extracted into a Terraform
| module. Write it once and now EKS clusters are
| essentially disposable.
| Solvency wrote:
| You just added yet another Thing to that huge pile of
| things representing millions of lines of code. That's the
| point.
| dvfjsdhgfv wrote:
| Everything we run our workloads on is based on millions
| of LoC, whether it's in the OS or in K8S, built-in
| or external. If you decide to run K8S in AWS,
| you'll be better off using Karpenter, external-secrets and
| all these things, as they will make your life easier in
| various ways.
| lijok wrote:
| Why is that inherently a problem?
|
| How many LOCs in the linux kernel again?
| woleium wrote:
| kinda like openshift?
| mardifoufs wrote:
| Millions upon millions of lines of code?! What? Can you
| specify what you were trying to do with it?
| cedws wrote:
| Argo CD, Argo Rollouts, Vault, External Secrets, Cert
| Manager, Envoy, Velero, plus countless operators, plus a
| service mesh if you need it, the list goes on. If you're
| providing Kubernetes as a platform at any sort of scale
| you're going to need most of this stuff or some
| alternatives. This sums up to at least multiple million
| LOC. Then you have Kubernetes itself, containerd, etcd...
| arccy wrote:
| that's not much different from using the cloud PaaS
| offerings besides who runs that million lines and who
| gets the freedom/control for customization.
| foofie wrote:
| > K8S is a disastrous complexity bomb. You need millions upon
| millions of lines of code just to build a usable platform.
|
| I don't know what you have been doing with Kubernetes, but I
| run a few web apps out of my own Kubernetes cluster and the
| full extent of my lines of code are the two dozen or so LoC
| kustomize scripts I use to run each app.
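|
| For scale, a whole app release in kustomize can be as small as
| this sketch (invented names; the referenced manifests live next
| to it):
|
|   # kustomization.yaml
|   apiVersion: kustomize.config.k8s.io/v1beta1
|   kind: Kustomization
|   namespace: myapp
|   resources:
|     - deployment.yaml
|     - service.yaml
|     - ingress.yaml
|   images:
|     - name: myapp        # rewrite the image tag at deploy time
|       newTag: v1.2.3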
| WildGreenLeave wrote:
| I run my own cluster too, it is managed by one terraform
| file which is maintained on GitHub [0]. Along with that I
| deploy everything on here with 1 shell script and a bunch
| of yaml manifests for my services. It's perfect for
| projects that are managed by one person (me). Everything is
| in git and reproducible. The only unconventional thing I am
| doing is that I didn't want to use github actions,
| so I use Kaniko to build my Docker containers inside my
| cluster.
|
| 0 https://github.com/kube-hetzner/terraform-hcloud-kube-
| hetzne...
| cedws wrote:
| If you're using a K8S cluster just to deploy a few web apps
| then it's not really a platform that you could provide to
| an engineering team within a medium-large company. You
| could probably run your stuff on ECS.
| foofie wrote:
| > If you're using a K8S cluster just to deploy a few web
| apps (...)
|
| It's really not about what I do and do not do with
| Kubernetes. It's on you to justify your "millions upon
| millions of lines of code" claim, because it is so outlandish
| and detached from reality that it says more about your
| work than about Kubernetes.
|
| I repeat: I only need a few dozen lines of kustomize
| scripts to release whole web apps. Simple code. Easy
| peasy. What mess are you doing to require "millions upon
| millions" lines of code?
| cedws wrote:
| You are missing the point. I recommend you look into
| Platform Engineering and what it involves.
| avbanks wrote:
| While I love ECS, you're not giving k8s enough credit.
| Nearly every COTS (commercial off-the-shelf) app has a helm
| chart; hardly any provide direct ECS support. If I want a
| simple kafka cluster or zookeeper cluster there's a
| supported helm chart for that; nothing is provided for
| ECS, you have to make that yourself.
| gtirloni wrote:
| You're both using hyperboles that don't match the reality
| of the average-sized company using Kubernetes. It's neither
| "millions upon millions of lines of code" nor "just a few
| dozen lines of kustomize scripts".
| mise_en_place wrote:
| kubeadm + fabric + helm got me 99% of the way there. My
| direct report, a junior engineer, wrote the entire helm chart
| from our docker-compose. It will not entirely replace our
| remote environment but it is nice to have something in
| between our SDK and remote deployed infra. Not sure what you
| meant by security; could you elaborate? I just needed to
| expose one port to the public internet.
| foofie wrote:
| > My contrarian view is that EC2 + ASG is so pleasant to use.
|
| Sometimes I think that managed kubernetes services like EKS are
| the epitome of "give the customers what they want", even when
| it makes absolutely no sense at all.
|
| Kubernetes is about stitching together COTS hardware to turn it
| into a cluster where you can deploy applications. If you do not
| need to stitch together COTS hardware, you have already far
| better tools available to get your app running. You don't need
| to know or care in which node your app is suppose to run and
| not run, what's your ingress control, if you need to evict
| nodes, etc. You have container images, you want to run
| containers out of them, you want them to scale a certain way,
| etc.
| mr_moose wrote:
| To me, it sounds like your company went through a complex re-
| architecting exercise at the same time you moved to
| Kubernetes, and your problems have more to do with your
| (probably flawed) migration strategy than with the tool.
|
| Lifting and shifting an "EC2 + ASG" set-up to Kubernetes is a
| straightforward process unless your app is doing something very
| non-standard. It maps to a Deployment in most cases.
|
| The fact that you even implemented an operator (a very advanced
| use-case in Kubernetes) strongly suggests to me that you're
| doing way more than just lifting and shifting your existing
| set-up. Is it a surprise then that you're seeing so much more
| complexity?
| krab wrote:
| Not familiar with the OP but this may have been the pitch for
| migration: "K8S will allow us better automation".
| LispSporks22 wrote:
| I feel like this is overkill for a startup.
|
| Why not dump your application server and dependencies into a
| rented data center (or EC2 if you must) and set up a coarse DR?
| Maybe start with a monolith in PHP or Rails.
|
| None of that word salad sounds like startup to me, but then again
| everyone loves to refer to themselves as a startup (must be a
| recruiting tool?), so perhaps muh dude is spot on.
| icameron wrote:
| I would like to know what you're being downvoted for. It's not
| bad advice, necessarily... this was the way 20 years ago. I
| mean isn't hacker news running kind of like this as a monolith
| on a single server? People might be surprised how far you can
| get with a simple setup.
| charred_patina wrote:
| I don't want to be negative, but this post reads like a list of
| things that I want to avoid in my career. I did a brief stint
| in cloud stuff at a FAANG and I don't care to go back to it.
|
| Right now I'm engineer No. 1 at a current startup just doing
| DDD with a Django monolith. I'm still pretty Jr. and I'm
| wondering if there's a way to scale without needing to get into
| all of the things the author of this article mentions. Is it
| possible to get to a $100M valuation without needing all of
| this extra stuff? I realize it varies from business to
| business, but if anyone has examples of successes where people
| just used simple architectures I'd appreciate it.
| krmboya wrote:
| I bet you can get pretty far with just ec2 and autoscaling,
| or comparable tech in other cloud platforms. With a managed
| database service.
| charred_patina wrote:
| That I'd be comfortable with.
| kevinqi wrote:
| I work at a startup and most of the stuff in the article
| covers things we use and solve real world problems.
|
| If you're looking for successful businesses, indie hackers
| like levelsio show you how far you can get with very simple
| architectures. But that's solo dev work - once you have a
| team and are dealing with larger-scale data, things like
| infrastructure as code, orchestration, and observability
| become important. Kubernetes may or may not be essential
| depending on what you're building; it seems good for AI
| companies, though.
| charred_patina wrote:
| How many people if I may ask? And how many TPS for your
| services? I am hoping I can get away with a simple monolith
| for a very long time.
| kevinqi wrote:
| 30-40 people; not much TPS but we're not primarily
| building a web app; we have event-driven data pipelines
| and microservices for ML data.
|
| If you're primarily building a web app, a monolith is
| fine for quite a while, I think. But a lot of the stuff
| in the post is still relevant even for monoliths - RDS,
| Redis, ECR, terraform, pagerduty,
| monitoring/observability.
| AznHisoka wrote:
| I bet Craigslist runs on much simpler infrastructure. Not
| sure how much they're worth though
| mixmastamyk wrote:
| Stackoverflow famously grew huge for a long time on a
| single Windows box. I don't recommend that, but yeah, KISS
| rule definitely. FLOSS version: supabase, open telemetry,
| etc.
| singron wrote:
| You don't need this many tools, especially really early. It
| also depends on the particulars of your business. E.g. if you
| are B2B SaaS, then you need a ton of stuff automatically to
| get SOC2 and generally appease the security requirements of
| your customers.
|
| That said, anything that's set-and-forget is great to start
| with. Anything that requires its own care and feeding can
| wait unless it's really critical. I think we have a project
| each quarter to optimize our datadog costs and renegotiate
| our contract.
|
| Also if you make microservices, you are going to need a ton
| of tools.
| segfaltnh wrote:
| Also don't make microservices if you don't have teams that
| will independently own them.
| extr wrote:
| You can scale to any valuation with any architecture. Whether
| or not you need sophisticated scaling solutions depends on
| the characteristics of your product, mostly how pure of a
| software play it is. Pure software means you will run into
| scaling challenges quicker, since likely part of your value
| add is in fact managing the complexity of scaling.
|
| If you are running a marketplace app and collect fees you're
| going to be able go much further on simpler architectures
| than if you're trying to generate 10,000 AI images per
| second.
| movpasd wrote:
| I'm currently early in my career and "the software guy" in a
| non-software team and role, but I'm looking to move into a
| more engineering direction. You've pretty much got my dream
| next job at the moment -- if you don't mind me asking, how
| did you manage to find your role, especially being "still
| pretty Jr."?
| charred_patina wrote:
| What a coincidence! I've got my dream job too!
|
| The things I did to get here are honestly kind of stupid. I
| started out at a defense contractor after graduating and
| left in the first six months because all the software devs
| were jumping ship. Went to a small business defense
| contractor (yep that's a thing) and learned to build web
| apps with React and Django. Then the pace of business
| slowed so after about 18 months I got on the Leetcode grind
| and got into a FAANG. Realized I hated it, so I quit after
| about 9 months with no job lined up.
|
| While unemployed I convinced myself I was going to get a
| job in robotics (I actually got pretty close, I had 3 final
| level interviews with robotics companies), but the job
| market went to shit pretty much the exact day I quit my job
| lol. I spent about 6 months just learning ROS, Inverse
| Kinematics, math for robotics, gradient descent and
| optimization, localization, path planning, mapping etc. I
| taught at a game development summer camp for a month and a
| half, that was awesome. Working with kids is always a
| blast. Also learned Rust and built a prototype for a
| multiplayer browser-based coding game I had been thinking
| about for a while. It was an excuse to make a full stack
| application with some fun infrastructure stuff.
|
| https://ai-arena.com/#/multiplayer
|
| The backend is no longer running, but originally users
| could see their territory on the galaxy grow as their code
| won battles for them.
|
| For the current role, I really just got lucky. The previous
| engineer was on his way out for non-job related reasons. He
| had read a lot of the books I had (Code Complete, Domain
| Driven Design) and I think we just connected over shared
| interests and intellectual curiosity.
|
| I think that in the modern day, so many people are really
| just in this space for the paycheck-- and that's okay!
| Everyone needs to make a living. But I think that if you
| have that intellectual curiosity and like making stuff,
| people will see that and get excited. It ends up being a
| blessing and a curse.
|
| I have failed interviews because of honesty "I would Google
| the names of books and read up on that subject" or "I think
| if I was doing CSS then I would be in the wrong role" (I
| realize how douchey that sounds but I just was not meant to
| design things, I have tried). But I have also gone further
| in interviews than I should have because I was really
| engrossed in a particular problem like path planning or
| inverse kinematics and I was able to talk about things in
| plain terms.
|
| I think it's easier to learn things quickly if they are
| something you're actually interested in, it becomes
| effortless. Basically I just try to do that so I can learn
| optimally, then I try to get lucky.
|
| EDIT: Oh I just thought of more good advice. Find senior
| devs to learn from. They can be kind of grumpy in their
| online presence, but they help you avoid so many tar pits.
| I am in a Discord channel with a handful of senior
| engineers. The best way to get feedback is to naively say
| "I'm going to do X", they will immediately let you know why
| X is a bad idea. A lot of their advice boils down to KISS
| and use languages with strong typing.
| daxfohl wrote:
| I did this myself for a good 15 years or so, but
| eventually with a family, money became a bit more of a
| priority, and it's hard to get a good job if all you've
| worked at is small shops. Any next role in a larger tech
| company will likely be a downgrade until you can prove
| yourself out, which of course you may not be able to
| because things are so different, and motivation will run
| low because you're being tasked with all the stuff that
| caused you to leave big tech in the first place. It can
| be quite miserable to be grouped with a bunch of kids
| with 3-5 YOE that have no idea how to build something
| from scratch, and they're outperforming you because they
| know the system.
|
| In my case it took a good five years and a couple job
| hops to rebalance. But eventually you get back to a
| reasonable tech leadership role and back to making some
| of the bigger decisions to help make the junior devs'
| lives less miserable.
|
| No regrets, but the five years it takes to rebalance can
| be pretty hard.
| Arbortheus wrote:
| Currently working at a $100M valuation tech company that
| fundamentally is built on a Django monolith with some other
| fluffy stuff lying around it. You can go far with a Django
| monolith and some load balancing.
| daxfohl wrote:
| Don't need any of it. Start simple. Some may be useful
| though. The list makes good points. Keep it around and if you
| find yourself suffering from the lack of something, look
| through the list and see if anything there would be good ROI.
| But don't adopt something just because this list says you
| should.
|
| One thing though: I'd start with go. It's no more complex
| than python, more efficient, and most importantly, IMO, since
| it compiles down to a binary it's easier to build, deploy,
| share, etc. And there's less divergence in the ecosystem;
| generally one simple way to do things like building and
| packaging, etc. I've not had to deal with versions or tooling
| or environmental stuff nearly as much since switching.
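|
| The "compiles down to a binary" point in practice, as a sketch
| (the flag choices are mine; drop CGO_ENABLED=0 if you depend on
| cgo):
|
|   CGO_ENABLED=0 go build -ldflags="-s -w" -o app .
|   scp app server:/usr/local/bin/   # the whole "deploy"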
| krmboya wrote:
| Key term here: 'cloud native'. Which is supposedly the future
| jasoneckert wrote:
| After reading through this entire post, I'm pleasantly surprised
| that there isn't one item where I don't mirror the same
| endorse/regret as the author. I'm not sure if this is coincidence
| or popular opinion.
| ndjshe3838 wrote:
| I'm imagining a developer in the 90s/00s reading this list and
| being baffled by the complexity/terminology
| SoftTalker wrote:
| I am in 2024.
| kypro wrote:
| I thought the same reading it - is it really this hard to build
| an app these days?
|
| Things were far more manual and much less secure, scalable
| and reliable in the past, but they were also far, far simpler.
| xcrunner529 wrote:
| Agreed. It's just ridiculous. Some just love to spend money
| and make things more complex.
| LispSporks22 wrote:
| I agree. I'm afraid I'm one of those 00s developers and can
| relate. Back then many startups were being launched on super
| simple stacks.
|
| With all of that complexity/word salad from TFA, where's the
| value delivered? Presumably there's a product somewhere under
| all that infrastructure, but damn, what's left to spend on it
| after all the infrastructure variable costs?
|
| I get it's a list of preferences, but still once you've got
| your selection that's still a ton of crap to pay for and deal
| with.
|
| Do we ever seek simplicity in software engineering products?
| bigstrat2003 wrote:
| I think that far too many companies get sold on the vision of
| "it just works, you don't need to hire ops people to run the
| tools you need for your business". And that is true! And
| while you're starting, it may be that you can't afford to
| hire an ops guy and can't take the time to do it yourself.
| But it doesn't take _that_ much scale before you get to the
| point it would be cheaper to just manage your own tools.
|
| Cloud and SaaS tools are very seductive, but I think they're
| ultimately a trap. Keep your tools simple and just run them
| yourselves, it's _not_ that hard.
| TeMPOraL wrote:
| > _Do we ever seek simplicity in software engineering
| products?_
|
| Doubtfully. Simplicity of work breakdown structure - maybe.
| Legibility for management layers, possibly. Structural
| integrity of your CYA armor? 100%.
|
| The half-life of a software project is what now, a few years
| at most these days? Months, in webdev? Why build something
| that is robust, durable, efficient, make all the correct
| engineering choices, where you can instead race ahead with a
| series of "nobody ever got fired for using ${current hot
| cloud thing}" choices, not worrying at all about the rapidly
| expanding pile of tech and organizational debt? If you push
| the repayment time far back enough, your project will likely
| be dead by then anyway (win), or acquired by a greater fool
| (BIG WIN) - either way, you're not cleaning up anything.
|
| Nobody wants to stay attached to a project these days anyway.
|
| /s
|
| Maybe.
| dogcomplex wrote:
| Don't worry, AI will wash all that away. Nothing says
| simplicity like an incomprehensible black box!
| izacus wrote:
| Look, the thing is - most infra decisions are made by
| devops/devs that have a vested interest in this.
|
| Either because they only know how to manage AWS instances (it
| was the hotness and that's what all the blogs and YT videos
| were about) and are now terrified of losing their jobs if
| the companies switch stacks. Or because they needed to put
| the new thing on their CV so they remain employable. Also
| maybe because they had to get that promotion and bonus for
| doing hard things and migrating things. Or because they were
| pressured into by bean counters which were pressured by the
| geniuses of Wall Street to move capex to opex.
|
| In any case, this isn't by necessity these days. This is
| because, for a massive amount of engineers, that's the only
| way they know how to do things and after the gold rush of
| high pay, there's not many engineers around that are in it to
| learn or do things better. It's for the paycheck.
|
| It is what it is. The actual reality of engineering the
| products well doesn't come close to the work being done by
| the people carrying that fancy superstar engineer title.
| habinero wrote:
| That's for slower projects.
|
| You know the old adage "fast, cheap, good: pick two"? With
| startups, you're forced to pick fast. You're still probably
| not gonna make it, but if you don't build fast, you
| definitely won't.
| geraldhh wrote:
| "That's what they want you to think"
| DannyBee wrote:
| Yeah, I read the "My general infrastructure advice is 'less is
| better'" line, and was like "when did this list of stuff become
| the definition of 'less'?"
| segfaltnh wrote:
| My reaction exactly. I don't know their footprint but this is
| a long list of stuff.
| occams_chainsaw wrote:
| There's _a lot_ in the article that existed in the 00s. Now
| imagine a programmer from the 70s...
| smallnix wrote:
| I think engineers in the 1920s who were putting out quality
| Enigmas would be stunned by all the marketing lingo.
| davedx wrote:
| I've used most of these technologies and the sum value add over
| a way simpler monolith on a single server setup is negligible.
| It's pure insanity
| _kb wrote:
| It's a hedge.
|
| There's an easy bent towards designing everything for scale.
| It's optimistic. It feels good. It's safe, defendable, and
| sound to argue that this complexity, cost, and deep
| dependency is warranted when your product is surely on the
| verge of changing the course of humanity.
|
| The reality is your SaaS platform for ethically sourced,
| vegan dog food is below inconsequential, and the few users
| that you do have (and may positively affect) absolutely do
| not need this tower of abstraction to run.
| timc3 wrote:
| Couldn't agree more. What a huge amount of tech and complexity
| just to get something off the ground
| LightFog wrote:
| The more complex you make it the better your job security eh?
| Maybe they'll even give you a whole team to look after it all.
| Absolute madness.
| lawgimenez wrote:
| My last web development project was in the FTP upload era.
| Reading this, I'm kinda glad I'm not in web dev.
| esskay wrote:
| The funny thing is a lot of smaller startups are seeing just
| how absurdly expensive these services are, and are just
| switching back to basic bare metal server hosting.
|
| For 99% of businesses it's a wasteful, massive overkill
| expense. You dont NEED all the shiny tools they offer, they
| don't add anything to your business but cost. Unless you're a
| Netflix or an Apple that needs massive global content
| distribution and processing services, there's a good chance
| you're throwing money away.
| benreesman wrote:
| We had FB up to 6 figures in servers and a billion MAUs
| (conservatively) before even tinkering with containers.
|
| The "control plane" was ZooKeeper. Everything had bindings to
| it, Thrift/Protobuf goes in a znode fine. List of servers for
| FooService? znode.
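|
| For anyone who hasn't seen the pattern: each live instance
| registers an ephemeral child znode under a well-known path, and
| consumers just list the children. A sketch using an open-source
| client (github.com/go-zookeeper/zk -- not what FB used
| internally; hosts and paths invented):
|
|   package main
|
|   import (
|       "fmt"
|       "time"
|
|       "github.com/go-zookeeper/zk"
|   )
|
|   func main() {
|       conn, _, err := zk.Connect([]string{"zk1:2181"}, 10*time.Second)
|       if err != nil {
|           panic(err)
|       }
|       defer conn.Close()
|
|       // Each FooService instance owns an ephemeral znode here;
|       // when its session dies, the znode disappears with it.
|       hosts, _, err := conn.Children("/services/fooservice")
|       if err != nil {
|           panic(err)
|       }
|       fmt.Println("FooService instances:", hosts)
|   }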
|
| The packaging system was a little more complicated than a
| tarball, but it was spiritually a tarball.
|
| Static link everything. Dependency hell: gone. Docker:
| redundant.
|
| The deployment pipeline used hypershell to drop the packages
| and kick the processes over.
|
| There were hundreds of services and dozens of clusters of them,
| but every single one was a service because it needed a
| different SKU (read: instance type), or needed to be in Java or
| C++, or some engineering reason. If it didn't have a real
| reason, it goes in the monolith.
|
| This was dramatically less painful than any of the two dozen
| server type shops I've consulted for using kube and shit. It's
| not that I can't use Kubernetes, I know the k9s shortcuts
| blindfolded. But it's no fun. And pros built these deployments
| and did it well, serious Kubernetes people can do everything
| right and it's _complicated_.
|
| After 4 years of hundreds of elite SWEs and PEs (SRE) building
| a Borg-alike, we'd hit _parity_ with the bash and ZK stuff. And
| it ultimately got to be a clear win.
|
| But we had an _engineering_ reason to use containers: we were
| on bare metal, containers can make a lot of sense on bare
| metal.
|
| In a hyperscaler that has a zillion SKUs on-demand?
| Kubernetes/Docker/OCI/runc/blah is the friggin Bezos tax.
| You're already virtualized!
|
| Some of the new stuff is hot shit, I'm glad I don't ssh into
| prod boxes anymore, let alone run a command on 10k at the same
| time. I'm glad there are good UIs for fleet management in the
| browser and TUI/CLI, and stuff like TailScale where mortals can
| do some network stuff without a guaranteed zero day. I'm glad
| there are layers on top of lock servers for service discovery
| now. There's a lot to keep from the last ten years.
|
| But this yo dawg I heard you like virtual containers in your
| virtual machines so you can virtualize while you virtualize
| shit is overdue for its CORBA/XML/microservice/many-many-many
| repos moment.
|
| You want reproducibility. Statically link. Save Docker for a
| CI/CD SaaS or something.
|
| You want pros handling the datacenter because pets are for
| petting: pay the EC2 markup.
|
| You can't take risks with customer data: RDS is a very sane
| place to splurge.
|
| Half this stuff is awesome, let's keep it. The other half is
| job security and AWS profits.
| geraldhh wrote:
| > We had FB up to 6 figures in servers and a billion MAUs
| (conservatively) before even tinkering with containers.
|
| that would have been around the time when containers entered
| the public/developer consciousness, no?
| annoyingnoob wrote:
| No, not at all. Maybe baffled by the use of expensive cloud
| services instead of running on your own bare metal where the
| cost is in datacenter space and bandwidth. The loss of control
| coupled with the cost is baffling.
| ehPReth wrote:
| Okta... after everything that's happened recently with them?
| deskamess wrote:
| Yeah... this stood out! Do you have any good alternatives? I
| wish CloudFlare would do it (IDP).
| __turbobrew__ wrote:
| It is a shame karpenter is AWS only. I was thinking about how our
| k8s autoscaler could be better and landed on the same kind of
| design as karpenter where you work from unschedulable pods
| backwards. Right now we have an autoscaler which looks at
| resource utilization of a node pool but that doesn't take into
| account things like topology spread constraints and resource
| fragmentation.
| acedTrex wrote:
| https://github.com/Azure/karpenter-provider-azure there is this
| in the works for karpenter on aks
| redrove wrote:
| It's actually released in preview, they called it Node Auto
| Provisioning. Doesn't work with Azure Linux unfortunately.
| sreeramvenkat wrote:
| Ironic that the article begins with an image of a server chassis
| with wires running around it while the description is entirely
| about cloud infra.
| ChuckMcM wrote:
| This is fabulous. I keep lists like this in my notebook(s). The
| critical thing here is that you shouldn't dwell on your "wrong"
| choices, instead document the choice, what you thought you were
| getting, what you got, and what information would have been
| helpful to know at the time of decision (or which information you
| should have given more weight at the time of the decision.) If
| you do this, you will consistently get better and better.
|
| And by far "automate all the things" is probably my number one
| suggestion for DevOps folks. Something that saves you 10 minutes
| a day pays for itself in a month when you have a couple of hours
| available to diagnose and fix a bug that just showed up. (5 days
| a week X 4 weeks X 10 minutes = 200 minutes) The exponential
| effect of not having to do something is much larger than most
| people internalize (they will say, "This just takes me a couple
| of minutes to do," when in fact it takes 20 to 30 minutes to do,
| and they have to do it repeatedly).
| janfoeh wrote:
| Almost every time I read someone's insights who works in an
| environment with IaaS buy-in, my takeaway is the same: oh boy,
| what an alphabet soup.
|
| The initial promise of "we'll take care of this for you, no in-
| house knowledge needed" has not materialized. For any non-trivial
| use case, all you do is replace transferable, tailored knowledge
| with vendor-specific voodoo.
|
| People who are serious about selling software-based services
| should do their own infrastructure.
| cyounkins wrote:
| I've climbed the mountain of learning the basics of kubernetes /
| EKS, and I'm thinking we're going to switch to ECS. Kubernetes is
| way too complicated for our needs. It wants to be in control and
| is hard to direct with eg CloudFormation. Load balancers are
| provisioned from the add-on, making it hard to reference them
| outside kubernetes. Logging on EKS Fargate to Cloudwatch appears
| broken, despite following the docs. CPU/Memory metrics don't work
| like they do on EKS EC2, it appears to require ADOT.
|
| I recreated the environment in ECS in 1/10th the time and
| everything just worked.
| jacurtis wrote:
| I've been running ECS for about 5 years now. It has come a long
| way from a "lightweight" orchestration tool into something
| that's actually pretty impressive. The recent changes to the
| GUI are also helpful for people that don't have a ton of
| experience with orchestration.
|
| We have moved off of it though; you eventually need more
| features than it provides. Of course that journey always ends
| up in Kubernetes land, so you will eventually find your way
| back there.
|
| Logging to Cloudwatch from kubernetes is good for one thing...
| audit logs. Cloudwatch in general is a shit product compared to
| even open source alternatives. For logging you really need to
| look at Fluentd or Kibana or DataDog or something along those
| lines. Trying to use Cloudwatch for logs is only going to end
| in sadness and pain.
| busterarm wrote:
| GKE is a much better product to me still than EKS but at
| least in the last two years or so EKS has become a usable
| product. Back in like 2018 though? Hell no, avoid avoid
| avoid.
| amluto wrote:
| > Zero Trust VPN
|
| VPNs can be wonderful, and you can use Tailscale or AWS VPN
| or OpenVPN or IPSEC, and you can authenticate using Okta or
| GSuite or Auth0 or Keycloak or Authelia.
|
| But since when is this Zero Trust? It takes a somewhat unusual
| firewall scheme to make a VPN do anything that I would seriously
| construe as Zero Trust, and getting authz on top of that is a
| real PITA.
| morsecodist wrote:
| > Picking AWS over Google Cloud
|
| I know this is an unpopular opinion but I think google cloud is
| amazing compared to AWS. I use google cloud run and it works like
| a dream. I have never found an easier way to get a docker
| container running in the cloud. The services all have sensible
| names, there are fewer more important services compared to the
| mess of AWS services, and the UI is more intuitive. The only
| downside I have found is the lack of community resulting in fewer
| tutorials, difficulty finding experienced hires, and fewer third
| party tools. I recommend trying it. I'd love to get the user base
| to an even dozen.
|
| The reasoning the author cites is that AWS has more responsive
| customer service, and maybe I am missing out, but it would never
| even occur to me to speak to someone from a cloud provider. They
| mention having "regular cadence meetings with our AWS account
| manager" and I am not sure what could be discussed. I must be
| doing simpler stuff.
| iimblack wrote:
| I don't have as much experience with aws but I do hate gcp. The
| ui is slow and buggy. The way they want things to authenticate
| is half baked and only implemented in some libraries and it
| isn't always clear which library supports it. The gcloud command
| line tool regularly just doesn't work; it hangs and never times
| out, forcing you to kill it manually, wondering if it did
| anything and whether you'll mess something up running it again.
| The way
| they update client libraries by running code generation means
| there's tons of commits that aren't relevant to the library
| you're actually using. Features are not available across all
| client libraries. Documentation contradicts itself or
| contradicts support recommendations. Core services like
| bigquery lack any emulator or Docker image to facilitate CI or
| testing without having to setup a separate project you have to
| pay for.
| arccy wrote:
| aws is even worse yet somehow people love them, maybe because
| they get to talk to a support "human" to hand-hold them
| through all the badness
| mdaniel wrote:
| Oh, friend, you have not known UI pain until you've used
| portal.azure.com. That piece of junk requires actual page
| reloads to make any changes show up. That Refresh button is
| just like the close-door elevator button: it's there for you
| to blow off steam, but it for damn sure does not DO anything.
| I have boundless screenshots showing when their own UI
| actually pops up a dialog saying "ok, I did what you asked
| but it's not going to show up in the console for 10 minutes
| so check back later". If you forget to always reload the
| page, and accidentally click on something that it says exists
| but doesn't, you get the world's ugliest error message and
| only by squinting at it do you realize it's just the 404 page
| rendered as if the world has fallen over
|
| I suspect the team that manages it was OKR-ed into using AJAX
| but come from a classic ASP background, so don't understand
| what all this "single page app" fad is all about and hope it
| blows over one day
| fshbbdssbbgdd wrote:
| I have had the experience of an AWS account manager helping me
| by getting something fixed (working at a big client). But more
| commonly, I think the account manager's job at AWS or any cloud
| or SAAS is to create a reality distortion field and distract
| you from how much they are charging you.
| tester457 wrote:
| > I think the account manager's job at AWS or any cloud or
| SAAS is to create a reality distortion field and distract you
| from how much they are charging you.
|
| How do they do this jedi mind trick?
| viraptor wrote:
| Maybe your TAM is different, but ours regularly does
| presentations about cost breakdown, future planning and
| possible reservations. There's nothing distracting there.
| piotrkaminski wrote:
| Heartily seconded. Also don't forget the docs: Google Cloud
| docs are generally fairly sane and often even useful, whereas
| my stomach churns whenever I have to dive into AWS's labyrinth
| of semi-outdated, nigh-unreadable crap.
| andreif wrote:
| To be fair there are lots of GCP docs, but I cannot say they
| are as good as AWS's. Everything is CLI-based, and some things
| are broken or hello-world-useless. It takes time to go through
| multiple duplicate articles to find anything decent. I have
| never had this issue with AWS.
|
| GCP SDK docs must be mentioned separately, as they are bizarre
| auto-generated nonsense. Have you seen them? How can you even
| say that GCP docs are good after that?
| arccy wrote:
| very few things are cli only, most have multiple ways to do
| things. and they have separate guide reference sections
| that can easily be found. compared to aws where your best
| bet is to hope google indexed the right page for them.
| andreif wrote:
| > few things are cli only
|
| wdym? As far as I see, it's either CLI or Terraform. GCP
| SDK is complete garbage, at least for Python compared to
| AWS boto3. I have personally made web UI for AWS CLI man
| pages as a fun project and can index everything myself if
| needed. Googling works fine. If you are not happy with it
| then ChatGPT is to the rescue. I honestly do not see any
| problem at all.
| kbar13 wrote:
| AWS enterprise support (basically first line support that you
| paid for) is actually really really good. they will look at
| your metrics/logs and share with you solid insights. anything
| more you can talk to a TAM who can then reach out to relevant
| engineering teams
| halfcat wrote:
| > I have never found an easier way to get a docker container
| running in the cloud
|
| We started using Azure Container Apps (ACA) and it seems simple
| enough.
|
| Create ACA, point to GitHub repo, it runs.
|
| Push an update to GitHub and it redeploys.
| rickette wrote:
| Azure Container Apps (ACA) and AWS AppRunner are also heavily
| "inspired" by Google Cloud Run.
| marcinzm wrote:
| So?
| darknavi wrote:
| > I have never found an easier way to get a docker container
| running in the cloud
|
| I don't have a ton of Azure or cloud experience but I run an
| Unraid server locally which has a decent Docker gui.
|
| Getting a docker container running in Azure is so complicated.
| I gave up after an hour of poking around.
| andreif wrote:
| Azure is a complete disaster, deserves its own garbage
| category, and gives people PTSD. I don't think AWS/GCP should
| ever be compared to it at all.
| jiggawatts wrote:
| Funnily enough, I have the opposite opinion.
|
| AWS has "fun" features like the ability to just lose track
| of some resource and still be billed for it. It's in
| here... somewhere. Not sure which region or account. I'll
| find it one day.
|
| GCP is made by Google, also known as children that forgot
| to take their ADHD medication. Any minute now they'll just
| casually announce that they're cancelling the cloud because
| they're bored of it.
|
| Azure is the only one I've seen with a sane management
| interface, where you can actually see everything everywhere
| all at once. Search, filter, query-across-resources, etc...
| all work reasonably well.
| andreif wrote:
| I am yet to meet an IRL person who believes Azure has
| "sane management interface". In my experience it was
| horribly inconvenient, filled with weird anti-UX
| solutions that were completely unnecessary. It maybe
| shows you all at once, or at least tries to, but it's
| such a horrible idea for a complex system. Unsurprisingly,
| it never worked properly, with various widgets hanging or
| erroring out. It was impossible to see
| anything about it. Azure will always be an example of a
| web UI done horribly wrong. This does actually not
| surprise me at all since Microsoft products are known for
| this. Every time I need to extend my kids Xbox
| subscriptions I have to pull my hair out to figure out
| how to do it in their web mess.
|
| How you can even compare it to AWS is a mystery to me.
| There are pages showing all your resources; not sure why
| you think it's a problem. Could it be a problem from a long
| time ago?
| arccy wrote:
| you're lucky if azure works without errors half the
| time...
| maccard wrote:
| Oh I disagree - we migrated from azure to AWS, and running a
| container on Fargate is significantly more work than Azure
| Container Apps [0]. Container Apps was basically "here's a
| container, now go".
|
| [0] https://azure.microsoft.com/en-gb/products/container-apps
| mdaniel wrote:
| Heh, your comment almost echoes the positive thing I was
| going to say, as well as highlighting half of why I loathe
| Azure with every fiber of my being
|
| https://learn.microsoft.com/en-us/azure/container-
| instances/... is the one I was going to plug, because
| coming from a kubernetes background it seems to damn near
| be the PodSpec and thus both expresses a lot of my needs
| and also is very familiar https://learn.microsoft.com/en-
| us/azure/templates/microsoft....
|
| Your link does seem to be a lot more "container, plus all
| the surrounding stuff" in line with the "apps" part,
| whereas mine more closely matches my actual experience of
| what you said: container, go
|
| The "what the fucking hell is wrong with you people?" part
| is that their naming is just all over the place, and
| changes constantly, and is almost designed to be misleading
| in any sane conversation. I quite literally couldn't have
| guessed whether Container Apps was a prior name of
| Container Instances, a super set of it, subset, other? And
| one will observe that while I said Container Instances, and
| the URL says Container Instances, the ARM is Container
| Groups. Are they the same? different? old? who fucking
| knows. It's horrific
| maccard wrote:
| Oh yeah. This and resource groups are the only two things
| that azure did well. Everything else is a disaster.
| madduci wrote:
| I share your thoughts. It honestly looks like an entire article
| endorsing AWS.
| wodenokoto wrote:
| If you are big enough to have regular meetings with AWS you are
| big enough to have meetings with GCP.
|
| I've had technicians at both GCP and Azure debug code and spend
| hours on developing services.
| marcinzm wrote:
| > I've had technicians at both GCP and Azure debug code and
| spend hours on developing services.
|
| Almost every time Google pulled in a specialist engineer
| working on a service/product we had issues with it was very
| very clear the engineer had no desire to be on that call or
| to help us. In other words they'd get no benefit from helping
| us and it was taking away from things that would help their
| career at Google. Sometimes they didn't even show up to the
| first call and only did to the second after an escalation up
| the management chain.
| rswail wrote:
| We are a reasonably large AWS customer and our account manager
| sends out regular emails with NDA information on what's coming
| up, we have regular meetings with them about things as wide
| ranging as database tuning and code development/deployment
| governance.
|
| They often provide that consulting for free, and we know their
| biases. There's nothing hidden about the fact that they will
| push us to use AWS services.
|
| On the other hand, they will also help us optimize those
| services and save money that is directly measurable.
|
| GCP might have a better API and better "naming" of their
| services, but the breadth of AWS services, the incorporation
| of IAM across their services, and the governance and
| automation all make it worthwhile.
|
| Cloud has come a long way from "it's so easy to spin up a
| VM/container/lambda".
| politelemon wrote:
| > There's nothing hidden about the fact that they will push
| us to use AWS services.
|
| Our account team don't even do that. We use a lot of AWS
| anyway and they know it, so they're happy to help with
| competitor offerings and integrating with our existing stack.
| Their main push on us has been to not waste money.
| bakchodi wrote:
| When I was at AWS, I watched SAs get promoted for saving
| customers money all the time.
|
| AWS wants happy customers to stick around for a long time,
| not one month of goosed income
| deskamess wrote:
| Yep. Pay us less every month and stick around for a long
| time. Getting low prices makes it really difficult to
| move away.
|
| If you do decide to move away, and want to take your data
| with you, yeah... there is a cost. Heck, there is a cost to
| delete the data you have with them (like S3 content).
|
| It's a good way to do business.
| danpalmer wrote:
| In a previous role I got all of these things from GCP - they
| ran training for us, gave us early access to some alpha/beta
| stage products (under NDA), we got direct onboarding from
| engineers on those, they gave us consulting level support on
| some things and offered much more of it than we took up.
| simonbarker87 wrote:
| Totally agree, GCP is far easier to work with and get things up
| and running for how my brain works compared to AWS. Also, GCP
| names stuff in a way that tells me what it does; AWS names
| things like a teenage boy trying to be cool.
| andreif wrote:
| That's completely opposite to my experience. Do you have any
| examples of AWS naming that you think is "teenage boy trying
| to be cool"? I am genuinely curious.
| alentred wrote:
| BigQuery - Athena
|
| Pub/Sub - Kinesis
|
| Cloud CDN - CloudFront
|
| Cloud Domains - Route 53
|
| ...
| andreif wrote:
| Pub/sub is more like SNS or EventBridge Bus to me
| andreif wrote:
| I thought you meant API and parameters. Blaming them for
| product names is weird to me.
| geraldhh wrote:
| why is that?
| andreif wrote:
| Why is it weird to blame them for product names? Because
| their purpose is slightly different. I can see where the
| negativity comes from and I understand it, but a product name
| is a lot less important than a consistent API experience. AWS
| is the best among the big players by far; hats off and
| well done to their teams and leadership. I hope the others
| will finally learn and follow.
| morsecodist wrote:
| My issue isn't just with the names themselves but they
| are emblematic of AWS's overall mentality. They want to
| have the AWS(TM) solution to X business case while other
| cloud providers feel more like utilities that give you
| building blocks. This obviously works for them and many
| of their customers I just personally don't care for it.
| It probably has to do with the level of complexity I am
| working at (which is not very complex).
|
| Also, I don't think trying to emulate AWS's support and
| consistent API makes sense as a strategy for other cloud
| providers. They will never beat AWS at their own game; it
| is light years ahead. If cloud providers want to survive,
| they need to fill a different niche and try different
| things.
| jgalt212 wrote:
| It's nice when things do what they say on the tin. That
| being said, it's hard to build a "brand" when you start
| out with a generic name.
| andreif wrote:
| How many popular products have you named and launched? Naming
| a product to meet both usability and marketing objectives is
| hard. This has never been as big of a problem for me as GCP's
| APIs, for example. Those are the true evil. Product names I
| care little about.
| jgalt212 wrote:
| > How many popular products have you named and launched?
|
| One, and oftentimes you only need one.
| arccy wrote:
| aws api and param names are stupidly long, CamelCased, and
| not even consistent half the time, like a leaky abstraction
| over their underlying implementation
| andreif wrote:
| Do you remember any examples? I don't call the API directly
| and usually use the CLI/SDK/CDK, which work a lot better than
| gcloud. I did see some inconsistencies between services (e.g.
| updating params for SQS and SNS) and that could definitely be
| improved. But honestly, compared to the GCP mess, AWS is ten
| times better.
| simonbarker87 wrote:
| Perfect list, also:
|
| Google Cloud Run - Lambda
|
| Sure, I get the reference to lambda calculus, but come on,
| Lambda tells us nothing about what it does.
|
| Products (not brands, products) should be named in a way
| that means something to the customer afaic.
| Hasu wrote:
| > Perfect list, also:
|
| > Google Cloud Run - Lambda
|
| ECS is the AWS equivalent of Cloud Run. GCP Cloud
| Functions are the equivalent of AWS Lambda.
|
| ECS / Cloud Run = managed container service that
| autoscales
|
| Lambda / Cloud Functions = serverless functions as a
| service
| simonbarker87 wrote:
| Thanks for the clarification, I hadn't appreciated the
| difference. It also somewhat reiterates my point, which is
| nice as well.
| andreif wrote:
| Have you named any successful product?
| simonbarker87 wrote:
| Yes, named a product and sold over 100,000 units of them.
| Naming products is hard but not that hard.
| andreif wrote:
| GCP's SDK and documentation are a mess compared to AWS's. And
| looking at the source code, I don't see how it can get better
| any time soon. AWS seems to have a proper design in mind and
| uses fewer abstractions, giving you the freedom to build what
| you need. AWS CDK is great for IaC.
|
| The only weird part I experienced with AWS is their SNS API.
| Maybe due to legacy reasons, but what a bizarre mess when you
| try doing it cross-account. This one is odd.
|
| I have been trying GCP for a while and the DevX was horrible.
| The only part that more-or-less works is the CLI, but the
| naming there is inconsistent and not as well done as in AWS.
| But it's relative and subjective, so I guess someone likes
| it. I have come across official GCP guides that are broken,
| untested, or utterly braindead hello-world-useless. They are
| also numerous and scattered, so it takes time to find
| anything decent.
|
| No dark mode is an extra punch. Seriously. I tried to make it
| myself with an extension, but their page is an Angular hell
| of millions of embedded divs. No thank you.
|
| And since you mentioned Cloud Run -- it takes 3 seconds to
| deploy a Lambda version in AWS and a minute or more for a GCP
| Cloud Function.
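|
| For reference, publishing a new Lambda version is a single
| API call. A rough boto3 sketch; the function name and zip
| path here are made up:
|
|     import boto3
|
|     lam = boto3.client("lambda")
|     with open("build/function.zip", "rb") as f:
|         # Upload the new code and publish it as a new version
|         lam.update_function_code(
|             FunctionName="my-func",
|             ZipFile=f.read(),
|             Publish=True,
|         )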
| ratherbefuddled wrote:
| We're relatively small GCP users (low six figures) and have
| monthly cadence meetings with our Google account manager.
| They're very accommodating, and will help with contacts, events
| and marketing.
| lysecret wrote:
| Also much prefer GCP but gotta say their support is hot
| steaming **. I wasted so much time for absolutely nothing with
| them.
| marcinzm wrote:
| GCP support is atrocious. I've worked at one of their largest
| clients and we literally had to get executives into the loop
| (on both sides) to get things done sometimes. Multiple times
| they broke some functionality we depended on (one time they
| fixed it weeks later except it was still broken) or gave us bad
| advice that cost a lot of money (which they at least refunded
| if we did all the paperwork to document it). It was so bad that
| my team viewed even contacting GCP as an impediment and
| distraction to actually solving a problem they caused.
|
| I also worked at a smaller company using GCP. GCP refused to do
| a small quota increase (which AWS just does via a web form)
| unless I got on a call with my sales representative and
| listened to a 30-minute upsell pitch.
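|
| For comparison, quota bumps on AWS are even exposed through
| an API. A boto3 sketch; the service and quota codes here are
| illustrative:
|
|     import boto3
|
|     sq = boto3.client("service-quotas")
|     # e.g. the EC2 On-Demand vCPU limit; codes vary by service
|     sq.request_service_quota_increase(
|         ServiceCode="ec2",
|         QuotaCode="L-1216C47A",
|         DesiredValue=512.0,
|     )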
| jq-r wrote:
| > "regular cadence meetings with our AWS account manager" and I
| am not sure what could be discusse.
|
| Having been on a number of those calls, it's just a bunch of
| crap where they talk like a scripted bot reading from a
| corporate buzzword bingo card over a slideshow. Their real
| intention is twofold: to sell you even more AWS
| complexity/services, and to provide "value" to their point of
| contact (which is a person working in your company).
|
| We're paying north of 500K per year in AWS support (which is
| highway robbery), and in return you get a "team" of people
| supposedly dedicated to you. That sounds good in theory, but
| in reality you get a labyrinth of irresponsibility, stalling
| and frustration.
|
| So even when you want to reach out to that team, you first
| have to go through L1 support, which I'm sure will be
| replaced by bots soon (and no value will be lost) and which
| is useful in 1 out of 10 cases. Then, if you're not satisfied
| with L1's answer(s), you try to escalate to your "dedicated"
| support team, and they schedule a call in three days' time,
| or if that lands around Friday, that means Monday, etc.
|
| Their goal is to stall so that you figure out and fix stuff
| on your own, shielding their own better-quality teams. No
| wonder our top engineers just abandoned all AWS
| communication; where it's unavoidable, they delegate it to
| junior people who still think they are getting something in
| return.
| Grimm665 wrote:
| This rings so true from experience it hurts.
| awskinda wrote:
| > We're paying north of 500K per year in AWS support (which
| is a highway robbery), and in return you get a "team" of
| people supposedly dedicated to you, which sounds good in
| theory but you get a labirinth of irresponsiblity, stalling
| and frustration in reality.
|
| I've found a lot of the time the issues we run into are self-
| inflicted. When we call support for these, they have to
| reverse-engineer everything which takes time.
|
| However when we can pinpoint the issue to AWS services, it
| has been really helpful to have them on the horn to confirm &
| help us come up with a fix/workaround. These issues come up
| more rarely, but are _extremely_ frustrating. Support is
| almost mandated in these cases.
|
| It's worth mentioning that we operate at a scale where the
| support cost is a non-issue compared to overall engineering
| costs. There's a balance, and we have an internal structure
| that catches most of the first type of issue nowadays.
| AtlasBarfed wrote:
| This. This is the reality.
|
| I am so tired of the support team having all the real
| metrics, especially on I/O and throttling, and not surfacing
| them to us somehow.
|
| And the cadence meetings are really an opportunity for them
| to sell to you; the parent is completely right.
| jiggawatts wrote:
| Something I've noticed with PaaS services like RDS or Azure
| SQL is that people arguing against them assume that the
| alternative is "competence".
|
| Even in a startup, it's difficult to hire an expert in every
| platform who can maintain a robust, secure system. It's
| possible, but not guaranteed, and may require high pay to
| retain the right staff.
|
| Many government agencies on the other hand are legally banned
| from offering a competitive wage, so they can literally never
| hire anyone that competent.
|
| This cap on skill level means that if they do need reliable
| platforms, the only way they can get them is by paying 10x
| the real market rate for an overpriced cloud service.
|
| These are the "whales" that are keeping the cloud vendors fat and
| happy.
| IamLoading wrote:
| > Go is for services that are non-GPU bound.
|
| What are they using for GPU-bound services? Python?
| cissmayazz wrote:
| Python indeed
| itpragmatik wrote:
| Not sure what the fascination with Go is about - one can
| write a fully scalable, functional, readable, maintainable,
| upgradable REST API service with Java 17 and above.
| MarkMarine wrote:
| I struggle with the type system in both, but today I was
| going through obscure Go code and wishing interfaces were
| explicitly implemented. The lack of sum types is making me
| sad.
| nickzelei wrote:
| What are startups using for a logging tool that isn't datadog?
| podoman wrote:
| https://highlight.io
| Too wrote:
| Loki
| ndr wrote:
| https://axiom.co/
| bilalq wrote:
| I love this write-up and the way it's presented. I disagree with
| some of the decisions and recommendations, but it's great to read
| through the reasoning even in those cases.
|
| It'd be amazing if more people published similar articles and
| there was a way to cross-compare them. At the very least, I'm
| inspired to write a similar article.
| roughly wrote:
| The Bazel one made me chuckle - I worked at a company with an
| SCM & build setup clearly inspired by Google's. As a non-ex-
| Googler, I found it obviously insane, but there was just no
| way to get traction on that argument. I love that the rest of
| this list is pretty cut and dry, but Bazel is the one thing
| the author can't bring themselves to say "don't regret" about
| even though they clearly don't regret not using it.
| busterarm wrote:
| I've seen Bazel reduce competent engineers to tears. There
| was a famous blog post a half-decade ago called something
| like "Bazel is the worst build system, except for all the
| others", and it still seems to ring true today.
|
| There are some teams I work with that we'll never ask to use
| Bazel, because we know in advance that it would cripple them.
| ali_piccioni wrote:
| Having led a successful Bazel migration, I'd still recommend
| that many projects stick to the native or standard supported
| toolchain until there's a good reason to migrate to a real
| build system (and I don't consider GitHub Actions to be a
| build system).
| dieortin wrote:
| I'm curious, what do you find insane about Bazel? In my
| experience it makes plenty of sense. And after using it for
| some months, I find it more insane how build systems like
| CMake depend on you having stuff preinstalled on your system
| and produce a different result depending on the environment
| they're run in.
| bayareabadboy wrote:
| Interesting enough read, but I'm not sure he's regretful
| enough for the blog to merit the title.
| fswd wrote:
| stuff like this makes me want to experiment with going back to
| just one huge $100k server and running it all on one box in a
| server rack.
| sseagull wrote:
| I am doing that. I am part of a research group, and don't have
| the $$ or ability to pay so much for all these services.
|
| So we got a $90k server with 184TB of raw storage (SAS SSD), 64
| cores, and 1TB of memory. Put it on a 10Gbit line at our
| university and it is rock solid. We probably have less
| downtime than GitHub, even with reboots every few months.
|
| Have some large (multi-TB) databases on it and web APIs for
| accessing the data. It would be hugely expensive in the
| cloud, especially with egress costs.
|
| You have to be comfortable sys-admining though. Fortunately I
| am.
| hi_hi wrote:
| I was hoping there would be a section for search engines.
| It's one of those things you tend to get locked into, and
| it's hard to clearly know your requirements well enough early
| on.
|
| Any references to something like this with a Search slant would
| be greatly appreciated.
| brycelarkin wrote:
| Awesome writeup! Just had a couple comments/questions.
|
| > Not adopting an identity platform early on
|
| The reason for not adopting an IDP early is because almost every
| vendor price gouges for SAML SSO integration. Would you say it's
| worth the cost even when you're a 3-5 person startup?
|
| > Datadog
|
| What would you recommend as an alternative? Cloudwatch? I love
| everything about Datadog, except for their pricing....
|
| > Nginx load balancer for EKS ingress
|
| Any reason for doing this instead of an Application Load
| Balancer? Or even HA Proxy?
| kevinslin wrote:
| For Datadog, unfortunately there's no obvious alternative,
| despite many companies trying to take market share. That is
| to say, Datadog has both second-to-none DX and a wide breadth
| of services.
|
| Grafana Labs comes closest in terms of breadth, but their DX
| is abysmal (I say this as a heavy Grafana/Prometheus user).
| Same comments about New Relic, though they have better DX
| than Grafana. Chronosphere has some nice DX around
| Prometheus-based metrics but lacks the full product suite. I
| could go on, but essentially all vendors lack breadth, DX, or
| both.
| hitekker wrote:
| Props to the author for writing up the results from his exercise.
| But I think he should have focused on a few controversial
| ones, and not the rote ones.
|
| Many of the decisions presented are not disagreeable (choosing
| Slack) and some lack framing that clarifies the associated
| loss (not adopting an identity platform early on). I think
| they're all good choices worth mentioning; I would have
| preferred a deeper look into the few that seemed easy and
| turned out to be hard, or the ones that were hard and got
| even harder.
| 8organicbits wrote:
| > not the rote ones
|
| It helps to hear the validation, although I think almost every
| decision has a dissenting voice in the HN comments.
| isoprophlex wrote:
| > There are no great FaaS options for running GPU workloads
|
| This hits hard. Someone please take my (client's) money and
| provide a sane GPU FaaS. Banana.dev is cool but not really
| enterprise-ready. I wish there were an AWS/GCP/Azure analogue
| that the penny pinchers and MBAs in charge of procurement
| could get behind.
| karbon0x wrote:
| I am confused. Doesn't Modal Labs solve this?
| isoprophlex wrote:
| Definitely. But the sad reality is that in some corporate
| environments (incumbent finance, government), if it's not a
| button click away in portal.azure.com, you can spend 6-12
| months in meetings with low energy gloomboys to get your
| access approved.
| karbon0x wrote:
| Ah, I see. Yeah, I've been a victim of that bureaucracy as well.
| Rainymood wrote:
| As a machine learning platform engineer, these sound like
| _technology choices_ as opposed to _infrastructure
| decisions_. I would love to read this post again, but with
| the actual infrastructure trade-offs that were made. Thanks
| for the post regardless.
|
| Side note: There is a small typo repeated twice "Kuberentes"
| davedx wrote:
| Utter insanity. So much cost and complexity, and for what?
| Startups don't think about costs or runway anymore; all they
| care about is "modern infrastructure".
|
| The argument for RDS seems to be "we can't automate backups".
| What on earth?
| isbvhodnvemrwvn wrote:
| Is spending time to make it reliable worth it vs working on
| your actual product? Databases are THE most critical thing
| your company has.
| davedx wrote:
| All that infra doesn't integrate itself. Everywhere I've
| worked that had this kind of stack employed at least one, if
| not a team of, DevOps people to maintain it all, full time,
| year round. Automating a database backup and testing that it
| works takes half a day unless you're doing something weird.
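|
| Roughly this kind of thing, as a sketch -- the bucket name,
| paths and connection string are placeholders:
|
|     import datetime
|     import subprocess
|     import boto3
|
|     stamp = datetime.datetime.utcnow().strftime("%Y%m%dT%H%M%S")
|     dump = f"/tmp/app-{stamp}.dump"
|
|     # Dump in pg_dump's compressed custom format
|     subprocess.run(
|         ["pg_dump", "-Fc", "-f", dump,
|          "postgresql://backup@db.internal/app"],
|         check=True,
|     )
|
|     # Smoke-test the artifact: pg_restore -l parses its TOC
|     subprocess.run(["pg_restore", "-l", dump], check=True,
|                    capture_output=True)
|
|     # Ship it to S3; bucket lifecycle rules handle retention
|     boto3.client("s3").upload_file(
|         dump, "acme-db-backups", f"pg/{stamp}.dump")
|
| Run it from cron, alert when it exits non-zero, and restore
| into a scratch instance now and then to prove the dumps are
| usable.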
| isbvhodnvemrwvn wrote:
| Setting up a multi-AZ DB with automatic failover,
| incremental backups and PITR, automated runbooks, and
| monitoring for all of that doesn't take half a day, not even
| with RDS.
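|
| To be clear, the RDS slice of that list is roughly one API
| call (a sketch; identifiers and sizes are made up). It's the
| runbooks and monitoring around it that eat the time:
|
|     import boto3
|
|     rds = boto3.client("rds")
|     # Multi-AZ standby, automated backups and PITR in one go
|     rds.create_db_instance(
|         DBInstanceIdentifier="app-db",
|         Engine="postgres",
|         DBInstanceClass="db.m6g.large",
|         AllocatedStorage=200,
|         MasterUsername="app",
|         MasterUserPassword="...",  # from a secret store
|         MultiAZ=True,
|         BackupRetentionPeriod=7,   # days of PITR
|     )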
| davedx wrote:
| No, but again, that sounds like a lot of complexity your
| average startup does not need. Multi-AZ? Why?
| marcinzm wrote:
| Because their Enterprise client requires it on their due
| diligence paperwork.
| dvfjsdhgfv wrote:
| Which makes little sense anyway as in practice the real
| problems you have are from region/connectivity issues,
| not AZ failures.
| fullstackchris wrote:
| A startup-sized company using this many tools? They're for
| sure doing something weird (and that's not a compliment :) )
|
| Totally on your side with this one - but alas, people
| associate value with complexity.
| ffsm8 wrote:
| > _Automating a database backup and testing it works takes
| half a day unless you're doing something weird_
|
| True story bro
|
| I'm sure that's possible if you're storing the backup on the
| same server you're restoring to and everything is on
| top-of-the-line NVMe storage. Otherwise your backup has just
| started to run and will need another few days to finish. And
| that's only if you're running a single master.
|
| You're massively underestimating the challenge of getting
| that kind of automation done in a stable manner - and the
| maintenance required to keep it working over the years.
| davedx wrote:
| I've implemented such a process for companies multiple
| times, bro. I know what I'm talking about.
| marcinzm wrote:
| And that's the problem. "It's easy for me because I've
| done it a dozen times so it's easy for everyone" is a
| very common fallacy.
| layer8 wrote:
| What happened to having people trained by external
| trainers for what you need? That's much cheaper than
| having everything externally "managed" and still having
| to integrate all of it. The number of services listed in
| TFA is just ridiculous.
| ffsm8 wrote:
| I've done it before, too. For a toy project it's easy, as you
| said. It's not once you're at scale. It's hilarious that
| people are downvoting my comment. I guess there are a lot of
| juniors suffering from the Dunning-Kruger effect around right
| now.
| icedchai wrote:
| I worked at a place with its own colo where they ran
| several multi-TB MySQL database servers. We did weekly
| backups, and they could take days: our backups were stored
| on external USB disks, and the I/O performance was abysmal.
| Taking a filesystem snapshot and copying it to USB could
| take days. The disks would occasionally lock up and
| someone would have to power cycle them. Total clown show.
|
| I would rather pay for RDS. Databases are the one thing
| you don't want to screw up.
| Draiken wrote:
| I see this argument a lot. Then most startups use that time
| to create rushed half-assed features instead of spending a
| week on their db that'll end up saving hundreds of thousands
| of dollars. Forever.
|
| For me that's short-sighted.
| eptcyka wrote:
| So investing in a critical part of my business is the bad
| thing to do?
| brightball wrote:
| There are other providers with better value for service within
| AWS or GCP, like Crunchy.
| viraptor wrote:
| > The argument for RDS seems to be "we can't automate backups".
| What on earth?
|
| I can automate backups, and I'm extremely happy that, for
| some extra cost in RDS, I don't have to do that.
|
| Also, at some size automating the database backup becomes non-
| trivial. I mean, I can manage a replica (which needs to be
| updated at specific times after the writer), then regularly
| stop replication for a snapshot, which is then encrypted,
| shipped to storage, then manage the lifecycle of that storage,
| then set up monitoring for all of that, then... Or I can set one
| parameter on the Aurora cluster and have all of that happen
| automatically.
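|
| The "one parameter" bit really is about that small. A boto3
| sketch, with a made-up cluster name:
|
|     import boto3
|
|     # Continuous backups + point-in-time restore, one call
|     boto3.client("rds").modify_db_cluster(
|         DBClusterIdentifier="app-cluster",
|         BackupRetentionPeriod=14,  # days of PITR
|         ApplyImmediately=True,
|     )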
| bowsamic wrote:
| I agree but also I'm not entirely sure how much of this is
| avoidable. Even the most simple web applications are full of
| what feels like needless complexity, but I think actually a lot
| of it is surprisingly essential. That said, there is definitely
| a huge amount of "I'm using this because I'm told that we
| should" over "I'm using this because we actually need it"
| jstummbillig wrote:
| The argument for RDS (and other services along those lines) is
| "we can't do it as good, for less".
|
| And, when factoring in _all_ costs and considering all things
| the service takes care of, it seems like a reasonable
| assumption that in a free market a team that specializes in
| optimizing this entire operation will sell you a db service at
| a better net rate than you would be able to achieve on your
| own.
|
| Which might still turn out to be false, but I don't think it's
| obvious why.
| overstay8930 wrote:
| Everyone who says they can run a database better than Amazon
| is probably lying, or has a story about how they had to miss
| a family event because of an outage.
|
| The point isn't that you can't do it; the point is that it's
| less work for extremely high standards. It is not easy to
| configure multi-region failover without an entire network
| team and database team, unless you don't give a shit about it
| actually working. Oh yeah, and wait until you see how much
| SOC 2 costs if you roll your own database.
| tofflos wrote:
| > Using cert-manager to manage SSL certificates
|
| > Very intuitive to configure and has worked well with no issues.
| Highly recommend using it to create your Let's Encrypt
| certificates for Kubernetes.
|
| > The only downside is we sometimes have ANCIENT (SaaS problems
| am I right?) tech stack customers that don't trust Let's Encrypt,
| and you need to go get a paid cert for those.
|
| Cert-manager allows you to use any CA you like, including
| paid ones, even without automation.
| kunley wrote:
| The fallacy of a "choice" between GCP and AWS never stops
| entertaining me.
| danielovichdk wrote:
| I would have liked some data on why these technologies were
| chosen, preferably based on actual customer load.
|
| Seems like YAGNI to me, but please prove me wrong.
| corentin88 wrote:
| Curious about the mention of buying IPs. Anyone else can share
| feedback/thoughts on this?
| cissmayazz wrote:
| This was done for multiple reasons, but mainly security, and
| to allow customers to whitelist a certain IP range.
| throwawaaarrgh wrote:
| This guy gets it, I agree with it all. The exception being, use
| Fargate without K8s and lean on Terraform and AWS services rather
| than the K8s alternatives. When you have no choice left and you
| have to use K8s, then I would pick it up. No sense going down
| into the mines if you don't have to.
| michidk wrote:
| > Code is of course powerful, but I've found the restrictive
| nature of Terraform's HCL to be a benefit with reduced
| complexity.
|
| No way. We used Terraform before, and the code just got
| unreadable. Simple things like looping can get so complex.
| Abstraction via modules is really tedious and decreases
| visibility. CDKTF allowed us to reduce complexity drastically
| while keeping all the abstracted parts really visible. Best
| choice we ever made!
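|
| For a flavor of the difference: a loop in CDKTF is just a
| Python loop. A sketch, assuming the prebuilt
| cdktf-cdktf-provider-aws package; all names are made up:
|
|     from constructs import Construct
|     from cdktf import App, TerraformStack
|     from cdktf_cdktf_provider_aws.provider import AwsProvider
|     from cdktf_cdktf_provider_aws.s3_bucket import S3Bucket
|
|     class Buckets(TerraformStack):
|         def __init__(self, scope: Construct, id: str):
|             super().__init__(scope, id)
|             AwsProvider(self, "aws", region="eu-west-1")
|             # A plain loop instead of count/for_each gymnastics
|             for env in ("dev", "staging", "prod"):
|                 S3Bucket(self, f"artifacts-{env}",
|                          bucket=f"acme-artifacts-{env}")
|
|     app = App()
|     Buckets(app, "buckets")
|     app.synth()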
| opentokix wrote:
| After working with infrastructure for 20 years, I fully endorse
| this post.
| kosolam wrote:
| What is the cost? With 1/10th of the sum, one capable
| engineer can set up a way better infra on premise. The days
| of free money are over, guys. Wake up!
| politelemon wrote:
| > Ubuntu for dev servers
|
| I didn't understand this section. Ubuntu servers as dev
| environment, what do you mean? As in an environment to deploy
| things onto, or a way for developers to write code like with
| VSCode Remote?
| hahnchen wrote:
| seems like the latter given "Originally I tried making the dev
| servers the same base OS that our Kubernetes nodes ran on,
| thinking this would make the development environment closer to
| prod"
| runiq wrote:
| But I thought the whole point of the container ecosystem was
| to abstract away the OS layer. Given that the kernel is
| backwards compatible to a fault, shouldn't it be enough to
| have a kernel that is at least as recent as the one on your
| k8s platform (provided that you're running with the default
| kernel or something close to it)?
| brainzap wrote:
| My take from this was more that being uniform reduces the
| overhead of maintenance.
|
| Being able to write a bash script that runs on every machine
| is nice.
| politelemon wrote:
| > homebrew for Linux
|
| No, just no. I see this cropping up now and then. Homebrew is
| unsafe for Linux, and is only recommended by Mac users who
| don't want to bother learning the existing package managers.
| erostrate wrote:
| The author leads infrastructure at Cresta. Cresta is a customer
| service automation company. His first point is about how happy he
| is to have picked AWS and their human-based customer service,
| versus Google's robot-based customer service.
|
| I'm not saying there's anything wrong, and I'm oversimplifying a
| bit, but I still find this amusing.
| lysecret wrote:
| Haha very good catch. I prefer GCP but I will admit any day of
| the week that their support is bad. Makes sense that they would
| value good support highly.
| danpalmer wrote:
| We used to use AWS and GCP at my previous company. GCP
| support was fine, and I never saw anything from AWS support
| that GCP didn't also do. I've heard horror stories about
| both, including some security support horror stories from AWS
| that are quite troubling.
| yread wrote:
| > Ubuntu
|
| we have a dotnet webapp deployed on Ubuntu and it leaves a
| lot to be desired. The package for .NET 6 from the default
| repo didn't recognise other dotnet components installed, and
| .NET 8 isn't even coming to 22.04 - you have to install it
| from the MS repo. But that is not compatible with the default
| repo's package for .NET 6, so you have to remove that first
| and faff around with exact versions to get them installed
| side by side...
|
| At least I don't have to deal with RHEL. Why is renewing a
| dev subscription so clunky?!
| f549abd0 wrote:
| Disagree on the point and reasoning about the single database.
|
| Sounds like they experienced a badly managed and badly
| constrained database. The described FKs and relations:
| that's what key constraints, cascades, and other guard rails
| are for - so that you are able to manage a schema. That's
| exactly how you do it: add in new tables that reference old
| data.
|
| I think the regret is actually about not managing the
| database, and not so much about having a single database.
|
| "database is used by everyone, it becomes cared for by no one".
| How about "database is used by everyone, it becomes cared for by
| everyone".
| f549abd0 wrote:
| Reading further
|
| > Endorse-ish: Schema migration by Diff
|
| Well that explains it... What a terrible approach to migrations
| for data integrity.
| cloogshicer wrote:
| Genuinely curious (I don't have much experiences with DBs),
| how is schema migration done 'properly' these days?
| jspdown wrote:
| Incremental, forward-only migrations (non-state-based).
| Then, for the how and when, it mostly depends on your
| constraints and size. There's no silver bullet: it's hard,
| it requires constant thinking, and it's a slow, often
| multi-step process.
|
| I never saw a successful, fully automated one-way-of-doing-it
| process.
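|
| The mechanics themselves can stay small. A sketch of a
| forward-only runner (sqlite3 standing in for your real
| driver; file names are made up):
|
|     import pathlib
|     import sqlite3
|
|     def migrate(db):
|         db.execute("CREATE TABLE IF NOT EXISTS"
|                    " schema_migrations (version TEXT PRIMARY KEY)")
|         applied = {v for (v,) in
|                    db.execute("SELECT version FROM schema_migrations")}
|         # Files like 0001_create_users.sql, applied in
|         # lexical order and never edited after they ship
|         for path in sorted(pathlib.Path("migrations").glob("*.sql")):
|             if path.stem not in applied:
|                 db.executescript(path.read_text())
|                 db.execute("INSERT INTO schema_migrations VALUES (?)",
|                            (path.stem,))
|                 db.commit()
|
|     migrate(sqlite3.connect("app.db"))
|
| The hard part isn't the runner; it's writing each step so old
| and new code can run against the schema during the rollout.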
| from-nibly wrote:
| Are you talking about the mechanics? Like, more than just
| running a migration script on boot?
| jamescontrol wrote:
| Can you explain? Having a tool to detect changes and create a
| migration doesn't sound bad. In a nutshell that's how Django
| migrations work as well, and they work really well.
| Sankozi wrote:
| > How about "database is used by everyone, it becomes cared for
| by everyone".
|
| So everyone needs to know every use case of that database?
| Seems very unlikely if there are multiple teams using the
| same DB.
|
| FKs? Unique constraints? Not-null columns? If not added at
| the creation of the table, they will never be added - the
| moment the DB is part of a public API, you cannot do a lot of
| things safely.
|
| The only moment when you want to share a DB is when you
| really need to squeeze out every last bit of performance -
| and even then, you want to have one owner and severely
| limited user accounts (with a whitelist of accessible views
| and stored procedures).
| layer8 wrote:
| The database should never ever become part of a public API.
|
| You don't share a DB for performance reasons (rather the
| opposite), you do it to ensure data integrity and
| consistency.
|
| And no, not everyone needs to know every use case. But every
| team needs to have _someone_ who coordinates any overlapping
| schema concerns with the other teams. This needs to be
| managed, but it's also not rocket science.
| Sankozi wrote:
| If a database is shared, it is part of an API. If it is
| shared between teams, then it is a public API.
|
| If the DB is shared, then data from different users is
| entered/updated through multiple transactions. So you cannot
| get anything better regarding consistency and integrity
| compared to multiple DBs and distributed TXs.
|
| By introducing schema change coordination you will introduce
| enormous delays into almost any DB change. This is more
| realistic than everyone knowing each use case, but less
| practical. A shared DB is an antipattern either way.
| eadmund wrote:
| > Startups don't have the luxury of a DBA ...
|
| I understand, _but_ I think they don't have the luxury of _not_
| having a DBA. Data is important; it's arguably more important
| than code. Someone needs to own thinking about data, whether it
| is stored in a hierarchical, navigation-based database such as a
| filesystem, a key-value store like S3 (which, sure, can emulate
| a filesystem), or in a relational database. Or, for that matter,
| in vendor systems such as Google Workspace email accounts or
| Office365 OneDrive.
| Draiken wrote:
| Early on, depending on what you're building, you don't need a
| full-fledged DBA and can get away with at least one person
| who knows DB fundamentals.
|
| But if you only want to hire React developers (or swap in the
| framework of the week), then you'll likely end up with zero
| understanding of the DB. Down the line you have a mess of
| inconsistent or corrupted data that'll come back with a
| vengeance.
|
| It's short-sighted for serious endeavors.
| lysecret wrote:
| Half the stuff is K8s related... Makes me very happy to use Cloud
| Run.
| gokhan wrote:
| "Multiple applications sharing a database" and Kubernetes sound
| really funny together:)
| shp0ngle wrote:
| I should _really_ learn AWS huh
| knowsuchagency wrote:
| Using k8s over ECS and raw-dogging Terraform instead of using the
| CDK? It's no wonder you end up needing to hire entire teams of
| people just to manage infra
| BrickTamblan wrote:
| What's the right way to manage npm installs and deploy it to an
| AWS EC2 instance from GitHub? Kubernetes? GitOps? EKS? I roll my
| own solution now with cron and bash because everything seems so
| bloated.
| ildjarn wrote:
| Reading this I couldn't help but think: yeah all of these points
| make sense in isolation, but if you look at the big picture, this
| is an absurd level of complexity.
|
| Why do we need entire teams making 1000s of micro decisions to
| deploy our app?
|
| I'm hungry for a simpler way, and I doubt I'm alone in this.
| klabb3 wrote:
| You're not alone. There is a constant undercurrent of pushback
| against this craziness. You see it all the time here on hacker
| news and with people I talk to irl.
|
| That does not mean each of these things doesn't solve
| problems. The issue, as always, is the complexity-utility
| tradeoff. Some of these things have too much complexity for
| too little utility. I'm not qualified to judge here, but if
| the suspects have Turing-complete YAML templates on their
| hands, it probably ties them to the crime scene.
| Sammi wrote:
| It smells like ZIRP is not over yet. VCs are still burning
| money in the AWS fire pit.
| kibwen wrote:
| ZIRP was never the root problem.
|
| The problem was: _too much money, too few consequences for
| burning it_.
|
| The existence of the uber-wealthy means that markets can no
| longer function efficiently. _Every_ market remains
| irrational longer than anyone who's not uber-wealthy can
| remain solvent.
|
| Welcome to the new normal.
| daxfohl wrote:
| Now it's "fix it with AI". (And pay lip service to green
| tech.)
| jgalt212 wrote:
| > We use Okta to manage our VPN access and it's been a great
| experience.
|
| I have no first-hand experience with Okta, but everything I
| read about it makes me scared to use it, i.e. its stability
| and security.
| rexreed wrote:
| Sounds like a whole lot of stuff for a startup. Maybe start with
| a simple stack until there's market fit. Even Amazon didn't start
| this way.
| iandanforth wrote:
| For people who enjoyed this post but want to see the other
| side of the spectrum, where self-hosted is the norm, I'll
| point to the now-classic series of posts on how Stack
| Overflow runs its infra:
| https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...
|
| If anyone has newer posts like the above, please reply with links
| as _I_ would love to read them.
| alecthomas wrote:
| https://world.hey.com/dhh/why-we-re-leaving-the-cloud-654b47...
| is another good one. There are a few different posts on it
| scattered around:
|
| https://world.hey.com/dhh/we-stand-to-save-7m-over-five-year...
|
| https://world.hey.com/dhh/our-cloud-exit-has-already-yielded...
|
| Related, looks like X is doing similar:
| https://twitter.com/XEng/status/1717754398410240018
| Shorel wrote:
| I see more 'Endorse' items than 'Regret' items.
|
| Anyway, amazing write up.
|
| Learning about alternatives to Jira is always good.
| maccard wrote:
| I see homebrew in here as a way to distribute <stuff> internally.
|
| We have non-developers (artists, designers) on our team, and
| asking them to manage homebrew is a non-starter. We're also
| on Windows.
|
| We currently just shove everything (and I mean everything)
| into Perforce. Are there any better ways of distributing this
| for a small team?
| pavel_lishin wrote:
| > _Discourage private messages and encourage public channels._
|
| I wish my current company did this. It's infuriating. The other
| day, I asked a question about how to set something up, and a
| manager linked me to a channel where they'd discussed that very
| topic - but it was private, and apparently I don't warrant an
| invite, so instead I have to go bother some other engineers (one
| of whom is on vacation.)
|
| Private channels should be for sensitive topics (legal, finance,
| etc) or for "cozy spaces" - a team should have a private channel
| that feels like their own area, but for things like projects and
| anything that should be searchable, please keep things public.
| foxhop wrote:
| I think Kubernetes was a mistake and they should have gone
| with AWS ECS (using Fargate or backed by autoscaling EC2);
| with that single change he wouldn't even need to think about
| a bunch of other topics on his list. Something to think
| about: AWS Lambda first, then fall back to AWS ECS for
| everything else that really needs to be on 100% of the time.
| pigcat wrote:
| > My general infrastructure advice is "less is better".
|
| I found this slightly ironic given there are ~50 headers in the
| article :)
|
| I liked the format of the writeup
| thesurlydev wrote:
| I've seen a lot of comments about how bad DataDog is because of
| cost but surprisingly I haven't seen open-source alternatives
| like OpenTelemetry/Prometheus/Grafana/Tempo mentioned.
|
| Is it because most people are willing to pay someone else to
| manage monitoring infrastructure, or are there other reasons?
| kevinslin wrote:
| the way I think of datadog is that it provides second-to-none
| DX combined with a wide suite of product offerings that is
| good enough for most companies most of the time. does it have
| opaque pricing that can be 100x more expensive than
| alternatives? absolutely! will people continue to use it?
| yes!
|
| something to keep in mind is that most companies are not like
| the folks in this thread. they might not have the expertise,
| time, or bandwidth to invest in observability.
|
| the vast majority of companies just want something that
| basically works and doesn't take a lot of training to use. I
| think of Datadog as the Apple of observability vendors - it
| doesn't offer everything and there are real limitations (and
| price tags) for more precise use cases, but in the general
| case it just works (especially if you stay within its
| ecosystem)
| data_maan wrote:
| Noob here - all these are great... but why can't I just use
| Heroku and radically avoid having to deal with a large part
| of these things?
___________________________________________________________________
(page generated 2024-02-10 23:01 UTC)