[HN Gopher] The Winamp Skin Museum is powered by a SQLite3 datab...
       ___________________________________________________________________
        
       The Winamp Skin Museum is powered by a SQLite3 database with 1.2GB
       of metadata
        
       Author : tosh
       Score  : 141 points
       Date   : 2022-06-11 12:06 UTC (10 hours ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | keewee7 wrote:
       | I remember reading about a company that upgraded their production
       | database from a JSON file to SQLite.
       | 
       | We really need this counter culture in software development that
       | emphasis simplicity over the "Start with Kafka-on-k8s" madness.
        
         | bazmattaz wrote:
         | I thought that SQLlite databases are not suitable for large
         | multi user websites. Something got to do with only being able
         | to handles one transaction at a time. Isn't that right?
        
           | ww520 wrote:
           | With the amount of memory these days, the whole SQLite db
           | would be cached in memory. Read heavy application is not a
           | problem at all.
        
           | chasil wrote:
           | SQLite only allows one writer at a time. Attempts at multiple
           | DML logins will not block until successful, but simply fail.
           | 
           | The allowed writer can also impact readers, depending on the
           | WAL mode setting.
           | 
           | For DSS uses, where a data store is published once a day,
           | SQLite is wonderful. For OLTP, look elsewhere.
        
           | teirce wrote:
           | I think this is kind of their point. If you're starting out,
           | you may not need to meet the same considerations as a "large
           | multi user website". So you don't get stuck trying to do that
           | - you start with what you need (appropriately) and adjust
           | down the line.
           | 
           | It is hard to do right.
        
           | dundarious wrote:
           | The Single Writer Principle is probably the most important
           | principle for concurrent systems design. Often it's best to
           | design wholly around it.
           | 
           | In this case, I doubt there are any writers at all.
        
           | hedora wrote:
           | Define large. I'd guess sqlite can do 10-100K TPS, no
           | problem.
           | 
           | You'll want a read cache to avoid repeatedly rendering the
           | same HTML, etc in the frontend, where most CPU is spent.
           | That'll accidentally reduce database load.
           | 
           | Assuming 10% writes (which is really, really write heavy),
           | that'll get you to 1M page views per second. After that,
           | you'll need to rearchitect.
           | 
           | (MySQL and Postgres are probably better choices, but I'd
           | evaluate all three if I was setting this up for real.)
        
           | themerone wrote:
           | SQLlite can do handle concurrent reads, writes are a problem.
           | 
           | A site like this which is mostly a read-only archive is a
           | perfect use case for SQLite.
        
           | colechristensen wrote:
           | Sure, but you can have raised millions with large numbers of
           | customers before this is an issue. Having multiple customers
           | using your platform at the same time to the extent that their
           | transactions actually end up having to wait a meaningful
           | amount of time requires a lot of usage that you often don't
           | reach for a long long time.
        
         | vsroy wrote:
         | Context: I see a bunch of people here recommending SQLite. I
         | have a suggestion, try out LMDB. It's kind of like noSQL
         | SQLite. It's a simple in-process K/V store with a few features
         | like compound keys that allow you to model relational data well
         | enough.
         | 
         | I recently used lmdb for webhighlighter.com (specifically the
         | wrapper: https://www.npmjs.com/package/node-lmdb), and it was a
         | fantastic decision.
         | 
         | A lot of people here say "use SQLite for small projects". But
         | even using SQLite can be _significant_ over-engineering.
         | Running migrations? Writing SQL? That 's too much effort for
         | me.
         | 
         | For example: In my application, people can leave comments on a
         | (what is effectively) a post. A SQL-native solution might have
         | a table for comments, with foreign keys to post IDs. That's 10x
         | more engineering then I want to do for an MVP. I just store all
         | the comments as an array on a post. This means reads read all
         | the comments and writes require reading all the comments,
         | appending, and then re-writing. That's totally fine, and will
         | probably scale me to 100x my current traffic.
         | 
         | LMDB-JS is great. It allows you to serialize arbitrary JS
         | objects to LMDB using the message pack encoding system. This
         | makes for some super concise code.
         | 
         | Here's my entire data layer: - Interface:
         | https://github.com/vedantroy/grape-juice/blob/main/site/app/...
         | - Implementation: https://github.com/vedantroy/grape-
         | juice/blob/main/site/app/...
         | 
         | TL;DR -- I won't use SQLite, for, I don't know, my first 10K
         | users?
        
           | pfraze wrote:
           | Flashback to when leveldb was hot in the node ecosystem
           | 
           | KV stores are fun but you end up writing a lot of queries and
           | indexes manually
        
         | inportb wrote:
         | Tailscale https://tailscale.com/blog/database-for-2022/
        
           | Xeoncross wrote:
           | But tailscale is a distributed, multi-region SaaS provider.
           | It's easy for them to use SQLite.
           | 
           | What about a real-world workload? For example, I have 10
           | users a day on my new next.js app so I clearly need a RDS
           | cluster for burst traffic.
        
             | bigdubs wrote:
             | Tailscale has some server components (account management
             | etc.) that are powered with SQLite.
             | 
             | Can read more here: https://tailscale.com/blog/database-
             | for-2022/
        
       | sedatk wrote:
       | I love the webamp website. But, in the world of multigig RAM and
       | multigbps I/O speeds, 1.2GB isn't _that_ much. Not sure if that
       | 's something novel for SQLite3 though.
        
         | walrus01 wrote:
         | unlike a webserver setup maybe 20 years ago now it's trivially
         | easy to hold even a 30GB database entirely in RAM and never
         | touch disk, for a vast performance increase.
         | 
         | I think I saw a used x86-64 server with 512GB of RAM for sale
         | for $1700 recently.
        
           | metadat wrote:
           | > I think I saw a used x86-64 server with 512GB of RAM for
           | sale for $1700 recently.
           | 
           | Where? Sounds like a terrific bargain, my old supermicro only
           | has 384GB.
        
             | walrus01 wrote:
             | ebay, was a dell r630 with two somewhat older 14-core
             | xeons, 28 cores
        
             | tiernano wrote:
             | You could pick up an R620 with 768GB Ram and 2 8 core Xeons
             | for around 1400GBP (https://www.bargainhardware.co.uk/dell-
             | poweredge-r620-10-sff...) without disks... DDR4 is a bit
             | more expensive... but DDR3 is still fast for a lot of
             | jobs...
        
           | midislack wrote:
           | Since the 1970's Unix has held frequently used pages in
           | cache. Still does!
        
             | [deleted]
        
             | walrus01 wrote:
             | not the same thing as tuning the db settings to
             | specifically cache the whole thing in RAM and maybe even
             | making its disk storage live on a RAM drive that's created
             | at boot.
        
               | midislack wrote:
               | Time to revive the sticky bit!
        
             | hedora wrote:
             | If only this were true for S3 and friends.
             | 
             | (It is true for clients of scale out NFS servers, for what
             | it's worth.)
        
         | apetresc wrote:
         | Not novel at all. SQLite3 can handle 100GB+ files on fast
         | storage without breaking a sweat.
        
         | pornel wrote:
         | Yeah, I have a project with about this much data, and literally
         | read the whole file into memory, and rewrite the whole file on
         | disk every time I need to modify it. It's hilarious that it
         | works.
        
           | iforgotpassword wrote:
           | At that point, why not just mmap the whole file?
        
             | elteto wrote:
             | Probably because file truncation/expansion with mmap is
             | tricky.
        
           | solardev wrote:
           | Wait, is that just how SQlite works? It can't update an entry
           | on disk, it just has to rewrite the whole file...?
        
             | arein3 wrote:
             | It writes in a file, but only the part needed, not the
             | whole file (like any other DB with persistence)
        
               | adzm wrote:
               | It can also use a WAL log file and periodically
               | checkpoint to the actual db
        
             | tinco wrote:
             | No, they're saying they've implemented something that's
             | even more simple than an embedded database and it still
             | works fine for them.
             | 
             | If you can stand several 100s of milliseconds of latency on
             | every write then writing a 1.2gb flat file is not a
             | problem.
        
       | [deleted]
        
       | tuxie_ wrote:
       | I don't have a Twitter account, so I can only read so much until
       | Twitter hides the content behind a login/signup screen. Maybe
       | it's just me but I find it ironic that a writeup about preserving
       | content is given away to a corporation that blocks it in this
       | manner.
        
         | captbaritone wrote:
         | Tweet author here (not op). I'll repost here:
         | 
         | The Winamp Skin Museum is powered by a sqlite3 database
         | containing 1.2gb of metadata about 86,000 Winamp skins.
         | 
         | It's all exposed in this explorable GraphQL endpoint
         | 
         | https://api.webamp.org/graphql
         | 
         | A bit about the data...
         | 
         | It includes:
         | 
         | * Original filenames and md5 hashes of each skins
         | 
         | * Names/metadata of all files compressed WITHIN the skins (file
         | size, date, filename)
         | 
         | * Text content of all text files found within the skins
         | 
         | * URL/likes/retweets if the skin was share by @winampskins (or
         | on Instagram)
         | 
         | * Full metadata/info about each skin's @internetarchive page
         | 
         | * Info about manual reviews (good to tweet? NSFW?)
         | 
         | * URLs to download skin files or screenshots
         | 
         | Kind [of] fun data to comb though (if you're like me).
         | 
         | If anyone is interested in getting the raw DB to play with, or
         | has ideas for extra stuff to expose in the graph, get in touch.
        
           | abofh wrote:
           | I want to know more about the bowling pin man - I mean,
           | there's probably not much more to be said about him, but he
           | raises so many questions!
        
           | [deleted]
        
           | tuxie_ wrote:
           | Thanks! That's really nice of you :)
        
         | exikyut wrote:
         | For reference / FWIW:
         | https://nitter.net/captbaritone/status/1535471373191028737
        
         | not1ofU wrote:
         | The zapper function in ublock origin can be used to get around
         | this limitation. Some people on HN alos talk about nitter, I
         | havent used it, but maybe that would work too.
        
         | [deleted]
        
         | pvg wrote:
         | _Please don 't complain about tangential annoyances--things
         | like article or website formats, name collisions, or back-
         | button breakage. They're too common to be interesting._
         | 
         | https://news.ycombinator.com/newsguidelines.html
        
           | tuxie_ wrote:
           | Is it tangential to mention that the content is not
           | accessible?
        
             | pvg wrote:
             | Yep, lots of moderator comments about it, including
             | specifically twitter. There are workarounds that get posted
             | in thread, use those or post one yourself.
        
         | justsomehnguy wrote:
         | https://nitter.net/captbaritone/status/1535471373191028737
         | 
         | Using a medium with 140^W 280 symbols limit to convey things
         | larger than that is...
        
         | tiffanyh wrote:
         | Tip: replace "Twitter.com" in the url with "nitter.net".
         | 
         | It's what Twitter web should be.
        
       | johtso wrote:
       | Seems like a great use case for static sqlite hosting
       | https://github.com/phiresky/sql.js-httpvfs
        
       | captn3m0 wrote:
       | I have a lot of smaller datasets I am hoping to expose similarly
       | once CloudFlare D1 becomes available.
       | 
       | Fly.io does a decent job, but hoping it gets cheaper/simpler with
       | D1.
        
         | simonw wrote:
         | Fly.io will give you an always-on 256MB of RAM instance for
         | $1.94/month - and they offer 2,340 hours per month in their
         | free tier, which is enough to run three of those for the whole
         | month without spending any money at all.
        
       | Vladimof wrote:
       | I should go back to Winamp if its still the same (best music
       | player I ever used)... I used to to make it small and always on
       | top and drag it on my current window's title bar next to the
       | minimize button...
        
       | DustinBrett wrote:
       | Webamp is quite amazing. I use it on my desktop environment
       | website. I recently added playlist support (m3u/pls/asx) and
       | integrated lazy(er) Butterchurn so it can run as the wallpaper.
       | Winamp/Milkdrop forever!
       | 
       | https://dustinbrett.com/
        
         | nighthawk454 wrote:
         | this is insanely cool! it's hard to imagine wasm stuff has come
         | this far
        
           | captbaritone wrote:
           | Speaking specifically about Webamp, it is a manual
           | reimplementation in JavaScript. But we did use Wasm as a tool
           | for the music visualizer:
           | https://jordaneldredge.com/blog/speeding-up-winamps-music-
           | vi...
        
             | nighthawk454 wrote:
             | that's awesome, I'll check it out. hopefully the
             | WebAudioAPI has gotten better to work with over the years
        
         | sergiomattei wrote:
         | Wow, your website is insane. Great work.
        
       | duxup wrote:
       | Any reason / use for all that metadata?
        
       | TheChaplain wrote:
       | Cool, so much good memories.
       | 
       | There's a bug in webamp though, choosing "Options -> Double Size"
       | only enlarges the player and equalizer. The playlist remains the
       | same size.
        
         | ChickeNES wrote:
         | I don't believe that's a bug, that's how Winamp behaved
        
       | midislack wrote:
       | Be so much better if it was just normal files on disk.
        
       | walrus01 wrote:
       | how many llamas and whips can you fit in 1.2GB?
        
         | 8n4vidtmkvmk wrote:
         | all of them apparently
        
       ___________________________________________________________________
       (page generated 2022-06-11 23:00 UTC)