[HN Gopher] Ask HN: How much traffic do you serve and with which...
___________________________________________________________________
Ask HN: How much traffic do you serve and with which database
engine?
It's common to see here that Postgres hosted in RDS can handle 99%
of workloads up to millions of users. I'm building an IoT app with
a plan to ingest the IoT traffic into Dynamo partitioned on user id
(I'm quite familiar with the tradeoffs) and everything else in
Postgres. A few services but not microservices (basically: core
service, identity service, IoT data service, notification service).
Ingesting and monitoring about 1,000,000 IoT devices daily (1
packet per device per day) and about 1,000,000 users with only
5,000 active users per day (basically we monitor user IoT devices
24/7 but only some 5,000 users will have anomalous results and log
in). In the database posts & discussions here I sometimes find
that the opinions are strong but the numbers are missing. Obviously
applications have wide variation in traffic and query complexity so
apples to apples comparisons are hard. Still, I would greatly
benefit from hearing some real world experiences with numbers.
Rough approximation database questions for current or prior
applications: 1. How many customers do you have? 2. What's
expected daily traffic? Peak traffic? 3. What database engine or
engines do you use? 4. How many rows or how much storage does your
db have? 5. What else about your application is relevant for
database load? 6. Microservice, Service, or monolith. Happy with
it?
Author : ajhool
Score : 40 points
Date : 2025-03-14 18:41 UTC (4 hours ago)
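A minimal sketch of the ingest plan described in the post: one DynamoDB item per device packet, partitioned on user id. The table name, key names, and payload shape here are all hypothetical, and the actual `put_item` call is commented out since it needs AWS credentials and a provisioned table.

```python
# Hypothetical item builder for the Dynamo ingest path described above.
from datetime import datetime, timezone
from decimal import Decimal  # boto3 requires Decimal, not float, for numbers


def build_packet_item(user_id: str, device_id: str, payload: dict) -> dict:
    """Key the item so all packets for one user land in one partition,
    sorted by device and timestamp."""
    ts = datetime.now(timezone.utc).isoformat()
    return {
        "pk": f"USER#{user_id}",           # partition key: user id
        "sk": f"DEVICE#{device_id}#{ts}",  # sort key: device + time
        "payload": payload,
    }


item = build_packet_item("u-123", "d-456", {"temp_c": Decimal("21.5")})
# boto3.resource("dynamodb").Table("iot_packets").put_item(Item=item)
```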
| jedberg wrote:
| I think you might be asking the wrong questions. The key
| questions are queries per second and the median response size of
| the query.
|
| For example at reddit (15 years ago) we had 10x more vote traffic
| than comment traffic, but we only needed two databases to handle
| votes (technically only one; the other was just for redundancy).
|
| But we needed nine comment databases, mainly because the median
| query response was so much bigger for comments.
| jascination wrote:
| Unrelated: I love HN; a random database question and fkn
| _jedberg_ is one of the first responders
| cogman10 wrote:
| I'm sure latency matters a lot as well.
|
| The users of our apps are pretty tolerant of 5 to 10 minute
| request times for some of our pages, which means we've been
| able to get away with just a few servers for several TBs of
| data stored and served. (100+mb responses are not unusual for
| us).
|
| If we had to rethink and redesign the system to cut down those
| times, we'd need a lot more databases and a much cleverer
| storage strategy than we currently have.
|
| While I'm sure response time for Reddit is really important, I
| could imagine that an IoT serving system needs almost nothing
| to hit something like a 10 to 20 second response time for 5k
| devices.
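The queries-per-second-times-response-size point above, in back-of-envelope form. All numbers here are invented for illustration, not Reddit's real figures:

```python
# Two workloads with the same "importance" can need very different
# capacity if median response sizes differ. Numbers are made up.
def egress_mb_per_sec(qps: float, median_response_bytes: float) -> float:
    return qps * median_response_bytes / 1e6


votes = egress_mb_per_sec(qps=10_000, median_response_bytes=100)       # tiny rows
comments = egress_mb_per_sec(qps=1_000, median_response_bytes=50_000)  # big trees
# votes: 1.0 MB/s; comments: 50.0 MB/s despite 10x fewer queries
```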
| dharmab wrote:
| Asking about customers is the wrong question.
|
| 1. It's often information that cannot be casually shared for
| legal reasons (MNPI)
|
| 2. A single customer might generate many queries. There have been
| times where a single one of my employer's customers generates
| more traffic than most companies will ever reach at peak.
| ajhool wrote:
| Fair. Please interpret as queries rather than customers.
| mattmanser wrote:
| 5,000 users PER DAY is trivial sauce, you're totally worrying
| about something ridiculous. Even a crap server with crap code
| should handle that.
|
| BTW most databases on a decent server could totally handle that 1
| million IoT updates per day too. 1 packet per day is nothing.
| Unless they all come at once. That is also a fairly trivial load,
| if it's spread out. A small VM could handle that.
|
| You are way off on your understanding of what is a heavy load.
|
| You could load test with something like k6 if you want to find
| out. Try 'emulating' the requests and average users.
|
| I often test with 5,000 requests per second; 5,000 users per day
| with 20-30 requests each is several orders of magnitude less
| load.
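The "it's nothing, if spread out" claim above checks out arithmetically against the numbers in the post:

```python
# 1,000,000 packets per day from the post, spread evenly, is ~12
# writes/sec; burstiness is the real question.
packets_per_day = 1_000_000
seconds_per_day = 24 * 60 * 60                          # 86,400
avg_writes_per_sec = packets_per_day / seconds_per_day  # ~11.6/sec

# Worst-case sketch: every device reports within the same hour.
peak_if_one_hour = packets_per_day / 3600               # ~278/sec
```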
| cullenking wrote:
| I'll bite, just so you get a real answer instead of the very
| correct but annoying "don't worry about it right now" answers
| everyone else is going to provide!
|
| We have a rails monolith that sends our master database instance
| between 2,000 and 10,000 queries per second depending on the time
| of year. We have a seasonal bike business with more traffic in
| the summer. 5% of queries are insert/update/delete, the rest
| read.
|
| mariadb (mysql flavor), all reads and writes sent just to master.
| Two slaves, one for live failover, the other sitting on a ZFS
| volume for backup snapshotting sending snapshots off to rsync.net
| (they are awesome BTW).
|
| We run all our own hardware. The database machines have 512gb of
| ram and dual EPYC 74F3 24 core processors, backed by a 4 drive
| raid10 nvme linux software raid volume on top of micron 9300
| drives. These machines also house a legacy mongodb cluster
| (actually a really really nice and easy to maintain key/value
| store, which is how we use it) on a separate raid volume, an
| elastic search cluster, and a redis cluster. The redis cluster
| often is doing 10,000 commands a second on a 20gb db, and the
| elastic search cluster is a 3tb full text search + geo search
| database that does about 150 queries a second.
|
| In other words, mysql isn't single tenant here, though it is
| single tenant on the drives that back our mysql database.
|
| We don't have any caching as it pertains to database queries. Yes,
| we shove some expensive to compute data in redis and use that as
| a cache, but it wouldn't be hitting our database on a cache miss,
| it would instead recalculate it on the fly from GPS data. I would
| expect to 3-5x our current traffic before considering caching
| more seriously, but I'll probably once again just upgrade
| machines instead. I've been saying this for 15 years....
|
| At the end of 2024 I went on a really fun quest to cut our DB
| size from 1.4tb down to about 500gb, along with a bunch of query
| performance improvements (remove unnecessary writes with small
| refactors, better indexes, dropping unneeded indices, changing
| from strings to enums in places, etc). I spent about 1 week of
| very enjoyable and fast paced work to accomplish this while
| everyone was out for Christmas break (my day job is now mostly
| management), and would probably need another 2 weeks to go after the
| other 30% performance improvements I have in mind.
|
| All this is to serve a daily average of 200-300 http requests per
| second to our backend, with a mix of website visitors and users
| of our mobile apps. I've seen 1000rps steady-state peak last
| year and wasn't worried about anything. I wouldn't be surprised
| if we could get up to 5,000rps to our API with this current setup
| and a little tuning.
|
| The biggest table by storage and by row count has 300 million
| rows and I think 150gb including indexes, though I've had a few
| tables eclipse a billion rows before rearchitecting things.
| Basically, if you use the DB for analytics things get silly, but
| you can go a long way before thinking "maybe this should go in
| its own datastore like clickhouse".
|
| Also, it's not just queries per second, but also row operations
| per second. mysql is really really fast. Fixing some hidden
| performance issues let me go from 10,000,000 row ops per second
| down to 200,000 row ops per second. This didn't really change
| any noticeable query performance; mysql was perfectly happy
| doing a ton of full table scans all over the place....
| ajhool wrote:
| wonderful, thank you. Some translations to AWS RDS...
|
| "512gb of ram and dual EPYC 74F3 24 core processors, backed by
| a 4 drive raid10 nvme linux software raid volume on top of
| micron 9300 drives"
|
| roughly translates to a db.r8g.16xlarge (64 vCPUs, 512gb
| ram), $4,949 / month on-demand for compute
|
| I'm not familiar enough with hardware to determine IOPS for the
| raid config but I believe it is greater than the maximum for
| io2 block express storage on aws (256k IOPS):
|
| $0.10 per provisioned IOPS-month = 256,000 * $0.10 = $25,600 /
| month for IOPS -- which feels high so I might be way off on the
| raid setup's IOPS
|
| $0.125 per GB-month storage = 500gb * $0.125 = $62.50
|
| That's about $30,600 / month without any reserved discounts for
| an estimated capacity of 5,000 rps, sound about right? Would
| you say your total hardware cost is less than one or two months
| of comparable compute on AWS if the above is true?
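Summing the line items in the estimate above as a sanity check (prices are as quoted in the comment, not independently verified):

```python
# Line items from the RDS estimate above; prices may be out of date.
compute = 4_949.00       # db.r8g.16xlarge on-demand, per month
iops = 256_000 * 0.10    # io2 provisioned IOPS -> $25,600
storage = 500 * 0.125    # 500 GB at $0.125/GB-month -> $62.50
total = compute + iops + storage
# total comes to $30,611.50 / month before reserved discounts
```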
| Delomomonl wrote:
| This sounds just wrong.
|
| Why would you use microservices? You don't have multiple teams.
|
| Btw, your data structure / bytes per row is missing.
| tmountain wrote:
| You're asking the wrong questions. Query complexity and patterns
| matter. Nobody can answer this for you. You have to do the
| analysis based on your workload.
___________________________________________________________________
(page generated 2025-03-14 23:01 UTC)