[HN Gopher] Show HN: I wrote a RDBMS (SQLite clone) from scratch...
       ___________________________________________________________________
        
       Show HN: I wrote a RDBMS (SQLite clone) from scratch in pure Python
        
       I wrote a relational database management system (RDBMS), a SQLite
       clone, from scratch in pure Python.
        
       Author : spanspan
       Score  : 91 points
       Date   : 2023-08-13 20:32 UTC (2 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | _tom_ wrote:
       | How much of the SQLite test suite will pass?
        
       | keithalewis wrote:
       | In what way is this a "SQLite clone"?
        
         | tredre3 wrote:
         | In the way that it's an embedded RDBMS.
        
         | spanspan wrote:
         | A single-file, embedded database with a similar logical
         | organization.
        
       | samsquire wrote:
       | Thank you for sharing
       | 
       | My perspective is that writing this kind of system in a language
       | such as Python is actually a great thing because, to me, Python
       | is more readable and approachable than C++ or C, which is what
       | databases are often written in. If someone wants to be serious,
       | they can port it to a low-level language. As it stands, it's
       | educational and useful for studying.
       | 
       | I wrote a distributed, pseudo-multi-model (SQL, Cypher graph,
       | document, and DynamoDB-style) database in Python,
       | https://GitHub.com/samsquire/hash-db, for the same goal of
       | learning how database engines could work in a distributed way.
        
         | spanspan wrote:
         | Cool stuff. I had similar intuitions: Python allowed me to
         | focus on the high-level concepts. That said, there were times
         | when I wished I had gone with a statically typed, compiled
         | language.
        
         | p4bl0 wrote:
         | I totally agree with that. I loved the ugit [1] "build Git from
         | scratch in Python" series for that.
         | 
         | [1] https://www.leshenko.net/p/ugit/
        
         | devman0 wrote:
         | This is why you see a community of pure-Java RDBMSs
         | (Hypersonic, H2, Derby, etc.): if you don't need big-iron
         | scale, it's easier to ship and use the database, or even embed
         | it in memory if needed.
        
           | fuzztester wrote:
           | Yes. It's an interesting area, irrespective of implementation
           | language.
           | 
           | I had tried out one or two Java-based RDBMSes back in the
           | day, via programs written in Java, for fun.
           | 
           | I think one was HSQLDB.
           | 
           | https://en.m.wikipedia.org/wiki/HSQLDB
           | 
           | There was also another interesting one called PointBase,
           | which was developed by Bruce Scott, an Oracle founder, and
           | others.
           | 
           | https://en.m.wikipedia.org/wiki/PointBase
        
       | xjsjxjsjsh wrote:
       | The documentation is excellent.
        
       | kindawinda wrote:
       | Okay, this seems like the perfect project to test out the various
       | AI repo update solutions. Anyone want to do this for fun? You can
       | have your own agenda; I'm just bored!
       | 
       | https://github.com/paul-gauthier/aider
       | https://www.mentat.codes/
       | https://www.gitwit.dev/
       | https://www.second.dev/
        
         | kindawinda wrote:
         | Why would you downvote this? I've already turned it into a
         | viable SQLite competitor. Shame on you.
        
       | simonw wrote:
       | Thanks to this post I learned about Lark, which looks like a
       | really nice parser library for Python.
       | 
       | The JSON tutorial on their site is excellent - shows how to build
       | a basic parser for JSON, then goes into some great detail about
       | how to improve its performance:
       | https://lark-parser.readthedocs.io/en/latest/json_tutorial.h...
       | 
       | Here's the grammar used for the RDBMS project:
       | https://github.com/spandanb/learndb-py/blob/master/learndb/l...
        
         | OJFord wrote:
         | DSL in a string? Is that 'really nice'? I haven't used or
         | needed this in Python that I can think of, but surely we can do
         | better than that?
         | 
         | Even a dict with expected keys and construction via the bitwise
         | or operator (which would roughly match the form of a lot of the
         | grammar) would be better, wouldn't it? Imports could be imports,
         | just mixed in somehow.
         | 
         | This is just first thoughts at a glance, maybe I'm missing
         | something.
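
One way to picture the operator-based alternative raised above is a tiny
parser-combinator sketch in which `|` means "try this rule, else that one"
and `+` means "this rule followed by that one". This is not Lark's API and
not the project's grammar; `Parser`, `lit`, `keyword`, and `statement` are
hypothetical names chosen only for illustration.

```python
class Parser:
    """Wraps fn(text, pos) -> (value, new_pos), or None when there is no match."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Ordered choice: use the first alternative that matches.
        def alt(text, pos):
            result = self.fn(text, pos)
            return result if result is not None else other.fn(text, pos)
        return Parser(alt)

    def __add__(self, other):
        # Sequence: both parts must match, one after the other.
        def seq(text, pos):
            first = self.fn(text, pos)
            if first is None:
                return None
            second = other.fn(text, first[1])
            if second is None:
                return None
            return (first[0], second[0]), second[1]
        return Parser(seq)

    def parse(self, text):
        # Require the rule to consume the whole input.
        result = self.fn(text, 0)
        if result is None or result[1] != len(text):
            raise ValueError(f"cannot parse {text!r}")
        return result[0]


def lit(word):
    """Match one literal string at the current position."""
    def match(text, pos):
        if text.startswith(word, pos):
            return word, pos + len(word)
        return None
    return Parser(match)


# Hypothetical mini-grammar, built from plain Python expressions.
keyword = lit("select") | lit("insert") | lit("delete")
statement = keyword + lit(" ") + lit("x")
```

The trade-off is roughly the one under discussion: rules become ordinary
Python objects that can be imported and composed, at the cost of a grammar
that is less compact than a string DSL.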
        
       | KeplerBoy wrote:
       | Kudos, I bet that was a fun and rewarding experience.
       | 
       | I know this was never meant to be fast, but could you produce
       | some benchmarks for shits and giggles?
        
         | spanspan wrote:
         | It would be a fun exercise to implement something like TPC-C
         | for learndb and see how it does.
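
A full TPC-C implementation is a large undertaking, but a first timing
harness can be much smaller. The sketch below is only an assumption about
what a starting point might look like: the thread does not show learndb's
API, so Python's standard-library `sqlite3` stands in as the engine, and
`time_inserts` and the `items` table are hypothetical names.

```python
import sqlite3
import time


def time_inserts(conn, n_rows):
    """Time n_rows single-row inserts in one transaction; return seconds elapsed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)"
    )
    start = time.perf_counter()
    with conn:  # commit once, when the block exits without error
        for i in range(n_rows):
            conn.execute(
                "INSERT INTO items (id, name) VALUES (?, ?)", (i, f"item-{i}")
            )
    return time.perf_counter() - start


# Stand-in engine: an in-memory SQLite database (an assumption, not learndb).
conn = sqlite3.connect(":memory:")
elapsed = time_inserts(conn, 1000)
row_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(f"{row_count} rows in {elapsed:.4f}s")
```

Swapping the connection for another engine's equivalent and running the
same workload would give a rough, like-for-like number, though nothing as
representative as a TPC-C-style transaction mix.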
        
         | bryancoxwell wrote:
         | Off topic but do you know of any good resources/talks/blog
         | posts that cover how to create useful benchmarks?
        
           | mburns wrote:
           | Not OP, but Brendan Gregg has written a lot about performance
           | and benchmarking.
           | 
           | * https://www.brendangregg.com/activebenchmarking.html
           | 
           | * https://www.brendangregg.com/blog/2018-06-30/benchmarking-ch...
        
             | bryancoxwell wrote:
             | Great, thank you!
        
       | sgarland wrote:
       | This is fantastic, and seems like a great way for someone (me) to
       | learn DS&A better. I can explain how a B+tree works, but if you
       | asked me to code it, I'd freeze up.
       | 
       | I love databases and Python, so this was really interesting to
       | walk through. Thanks for the post.
        
         | spanspan wrote:
         | Most definitely. The b-tree implementation was the first
         | motivation for starting the project, especially all the
         | details around node rebalancing and splitting. The fact that
         | it was an on-disk structure added another wrinkle to thinking
         | about the implementation.
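
For readers who, like the parent commenter, can explain a B+tree but would
hesitate to code one, here is a minimal in-memory sketch of the leaf-split
step. It is not learndb's implementation: `Leaf`, `insert_into_leaf`, and
`MAX_KEYS` are hypothetical, and a real on-disk node would be sized by page
bytes and serialized rather than held in Python lists.

```python
import bisect
from dataclasses import dataclass, field

MAX_KEYS = 4  # hypothetical leaf capacity, chosen small to force a split


@dataclass
class Leaf:
    keys: list = field(default_factory=list)
    values: list = field(default_factory=list)


def insert_into_leaf(leaf, key, value):
    """Insert key/value in sorted order, splitting the leaf on overflow.

    Returns None if no split happened, else (separator_key, right_leaf).
    The caller would then insert separator_key into the parent node.
    """
    pos = bisect.bisect_left(leaf.keys, key)
    if pos < len(leaf.keys) and leaf.keys[pos] == key:
        leaf.values[pos] = value  # key already present: overwrite
        return None
    leaf.keys.insert(pos, key)
    leaf.values.insert(pos, value)
    if len(leaf.keys) <= MAX_KEYS:
        return None
    # Overflow: move the upper half of the entries into a new right sibling.
    mid = len(leaf.keys) // 2
    right = Leaf(leaf.keys[mid:], leaf.values[mid:])
    del leaf.keys[mid:]
    del leaf.values[mid:]
    # In a B+tree the separator is a copy of the right leaf's first key.
    return right.keys[0], right


leaf = Leaf()
split = None
for k in [10, 20, 30, 40, 50]:
    split = insert_into_leaf(leaf, k, f"row-{k}")
```

The fifth insert overflows the leaf, so `split` holds the separator key and
the new right sibling. Rebalancing on delete, and propagating splits up
through internal nodes, are the harder parts this sketch leaves out.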
        
       ___________________________________________________________________
       (page generated 2023-08-13 23:00 UTC)