hngopher.com

       [HN Gopher] Ask HN: Codebases with great, easy to read code?
       ___________________________________________________________________
        
       Ask HN: Codebases with great, easy to read code?
        
       A colleague told me the best way to level up coding skills is to
       read excellent code.  Do you have favorite repos that highlight
       this?  I have an irrational fear of unknown codebases since it
       feels most of the code is either boilerplate or tied to some
       framework.  Do you have tips and tricks you use to read codebases?
        
       Author : impjohn
       Score  : 167 points
       Date   : 2022-03-21 10:57 UTC (12 hours ago)
        
       | tristor wrote:
       | Honestly, the SQLite codebase is a fantastic read.
        
       | heavyset_go wrote:
       | I was always impressed with Near's emulators, RIP.
        
       | xixixao wrote:
       | Codemirror 6
        
       | srvmshr wrote:
       | Depending on your interest, I could vouch for OpenBSD having a
       | very clean readable codebase. Often it has some of the best
       | practices coded in with useful commentary.
        
       | blackbeard334 wrote:
       | https://9p.io/sources/plan9/sys/src/
        
       | makk wrote:
       | It's been years since I've looked but I remember being impressed
       | by the NGINX codebase. https://github.com/nginx/nginx
        
       | happy-dude wrote:
       | Stockfish is well written, commented, and documented C++ code:
       | 
       | https://github.com/official-stockfish/Stockfish
        
       | eatonphil wrote:
       | I've been thinking a lot recently how to get devs to read more
       | code and it's a very interesting reason you give that you don't
       | want to wade through boilerplate. I never thought of that before.
       | 
       | I don't know boilerplate-heavy systems like Rails or Django too
       | well. But I just wouldn't suggest starting with reading web app
       | code (though maybe I've ignored reading too much web app code
       | over time).
       | 
       | The easiest code to start thinking about is libraries and things
       | you use today already like the nginx code base or the CPython
       | code base or your logging library or your web server library
       | code.
       | 
       | In these cases maybe you download the repo, build it, see how you
       | could make a small tweak and run it. And soon you're looking
       | through its code to understand how it works.
       | 
       | Another maybe easier technique to start reading more is when you
       | are programming and have an error in a 3rd party library, use
       | grep to find that error in 3rd library code and just start poking
       | around when you do. Maybe add some print statements to it so you
       | can see more of what goes wrong. Try to solve the problem just
       | looking at the code and modifying it instead of using google.
       | 
       | If you ever get into it I'd love to hear from you. Email is on my
       | site and Discord is in my HN profile.
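The grep step above can also be done from Python. This is a minimal sketch, not a replacement for grep: the function name `find_message` and the default `.py` suffix filter are assumptions made for illustration.

```python
from pathlib import Path


def find_message(root, message, suffixes=(".py",)):
    """Yield (path, line_number, line) for every line under root containing message."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the search
        for number, line in enumerate(text.splitlines(), start=1):
            if message in line:
                yield path, number, line.strip()
```

Point `root` at the installed library's directory and pass the text of the error you saw; each hit is a place to start reading or to add a print statement.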
        
         | Beltiras wrote:
         | Django and DRF are really well written systems with good
         | documentation.
        
           | fernandotakai wrote:
           | absolutely +1 for django's source. and after they shipped
           | their black integration, everything is properly formatted,
           | which makes reading the source even easier.
        
           | bityard wrote:
           | I must be in the minority, I have tried to "learn Django" on
           | more than one occasion and gave up every time after
           | struggling with how hilariously over-engineered it is.
           | 
           | My best guess is that Django must be great for writing big
           | important relational-database-backed apps, or rolling your
           | own CMS, or something else people get paid a lot of money
           | to do. But personally, for my small projects I get more
           | mileage out of starting with a micro-framework and just
           | choosing and bolting on the bits I need.
        
       | tanaygahlot wrote:
       | The book The Programmer's Brain contains a lot of tips on improving
       | code reading skills - https://www.manning.com/books/the-programmers-brain
        
       | NWoodsman wrote:
       | This guy made a HN mobile reader and put all the code on Github
       | for his NDC Oslo presentation, it was good and shows off very
       | readable asynchronous code in C#:
       | 
       | https://github.com/brminnick/AsyncAwaitBestPractices
        
       | shrikant wrote:
       | I see that you're primarily looking into Python work, so I'd
       | recommend `smart_open` as a nice, compact way to get started.
       | 
       | https://github.com/RaRe-Technologies/smart_open
        
       | jedisct1 wrote:
       | Anything written in Zig or Go.
       | 
       | Both languages are extremely readable, even when looking at
       | unfamiliar code.
       | 
       | The Zig standard library is small, yet covers a lot of common
       | tools and structures. Every file contains implementations of one
       | particular thing, so you can casually browse random files and
       | understand what's going on without having to understand the
       | entire context.
        
         | Nullabillity wrote:
         | Couldn't disagree more in the Go case. Folder-level namespaces
         | (rather than file-level) makes Go exceptionally annoying to
         | navigate, and interfaces heavily obscure the.. interface
         | between abstractions and implementations.
        
           | [deleted]
        
           | KptMarchewa wrote:
           | Java basically does the same thing and I've never heard
           | anyone complaining about that.
           | 
           | Python does the file level thing and it's a source of constant
           | annoyance with cyclical imports. Ugh, wasted so many hours
           | on fixing it.
        
             | Nullabillity wrote:
             | > Java basically does the same thing and I've never heard
             | anyone complaining about that.
             | 
             | I don't love dealing with Java, but it has none of these
             | particular issues. It's easy to find the definition of a
             | class, because it enforces that the file path must match
             | the fully qualified class name for all public classes (and
             | this tends to be followed for privates as well in
             | practice). Finding the implementation isn't quite trivial,
             | but if you know a class in the tree then you can easily
             | find the whole parent tree by following the "extends Foo"
             | clauses, or find subclasses by grepping for it instead.
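The two lookups described above (class name to file path, and grepping for "extends Foo") can be sketched as follows. The helper names `class_to_path` and `find_subclasses` are hypothetical. The regex only catches direct `extends` clauses, and it will also match the word inside comments or strings.

```python
import re
from pathlib import Path


def class_to_path(source_root, qualified_name):
    """Map a fully qualified public class name to its expected source file."""
    return Path(source_root).joinpath(*qualified_name.split(".")).with_suffix(".java")


def find_subclasses(source_root, simple_name):
    """Return source files containing an `extends <simple_name>` clause."""
    pattern = re.compile(r"\bextends\s+" + re.escape(simple_name) + r"\b")
    return sorted(
        path
        for path in Path(source_root).rglob("*.java")
        if pattern.search(path.read_text(encoding="utf-8", errors="replace"))
    )
```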
        
           | rubiquity wrote:
           | I'll +1 Zig code, but I came here to say this about Go as
           | well. Once Go codebases creep past medium sized (25-50k LOC
           | perhaps) into large I find them to be inscrutable. I've
           | noticed this even in projects that people in the community
           | point to as the gold standard such as the various HashiCorp
           | tools.
           | 
           | I don't write much Go anymore but my hunch is that it's a
           | combination of the package layout and auto method delegation
           | for embedded structs. Even Java does a much better job of
           | helping the developer by making the interfaces between
           | different subsystems obvious.
        
           | psanford wrote:
           | It seems like gopls solves your problems. It will do the
           | correct thing for navigating to definitions and listing
           | implementations of interfaces.
        
             | Nullabillity wrote:
             | In my experience gopls hasn't been useful for much more
             | than crashing.
             | 
             | But that aside, Go's incessant insistence on structural
             | typing makes it impossible for any tool to generate a list
             | that is both complete and free from false positives.
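A rough Python analogue of the structural-typing point, using `typing.Protocol` rather than Go itself. The class names are invented for illustration. Any class with a matching method counts as an implementation, whether or not its author intended that.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Closer(Protocol):
    """Structural interface: anything with a close() method satisfies it."""

    def close(self) -> None: ...


class LogFile:
    def close(self) -> None:
        """Release the underlying file (the intended kind of Closer)."""


class Door:
    def close(self) -> None:
        """Shut a door (unrelated to resources, yet structurally a Closer)."""


def implementers(candidates):
    """List the names of candidate classes that structurally satisfy Closer."""
    return [cls.__name__ for cls in candidates if issubclass(cls, Closer)]
```

Here `implementers([LogFile, Door, int])` includes `Door`, which is the kind of false positive a tool listing implementations cannot rule out from method signatures alone.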
        
         | oxplot wrote:
         | Absolutely this. Especially Go's standard library is a pleasure
         | to read. Lots of idioms and good practices to be learnt.
        
         | opportune wrote:
         | Go basically codifies a lot of best practices for writing C++
         | at Google, but I wouldn't say it always teaches good code. Good
         | Go would definitely help you learn how to write good systems
         | code in any systems programming language though, and to be
         | explicit/clear instead of terse.
         | 
         | But you can easily code yourself into a corner with Go too. If
         | someone doesn't know how to use concurrency well they can do
         | bad things like overcreating goroutines or making a mess with
         | channels. And some of the common patterns (particularly
         | excessively overriding things) can be considered an anti
         | pattern in terms of understanding (IMO)
        
         | pxeger1 wrote:
         | This is a massive generalisation. It is very easy to write bad
         | code in any language.
         | 
         | Well, bad on a large scale. Go in particular has some nice
         | tools to ensure code at a small scale is always good (enforcing
         | syntax style), but no language can stop you from having a bad
         | project architecture.
        
       | ramesh31 wrote:
       | Doom 3 is a perennial favorite for "most beautiful C++ codebase"
       | lists [0]
       | 
       | [0] https://github.com/id-Software/DOOM-3-BFG
        
         | dafelst wrote:
         | I was going to comment the same thing, all the Carmack era id
         | software open source code (Quake, Doom) is very nicely
         | structured and quite easy to grok.
        
       | e9 wrote:
       | I had to modify FFmpeg for a job and I found it surprisingly
       | accessible and easy to read/modify:
       | https://github.com/FFmpeg/FFmpeg
        
       | iamricks wrote:
       | How do you guys approach the "start" of reading a code base? I
       | never know where to start looking, especially if it's a language
       | I am not too familiar with, and sometimes I have no idea where
       | the program execution starts.
        
         | yakubin wrote:
         | I've watched an interview on MSDN with one of the developers of
         | .NET (I think she was responsible for the GC), who also used to
         | work on Windows, making those famous workarounds making games
         | work on newer Windows releases, even when they relied on old
         | kernel bugs. I think she said that the best way to get familiar
         | with a complex new codebase is to step through it with a
         | debugger, going through several scenarios. I think it's a great
         | idea. I only wish I had a working visual debugger in my day to
         | day work.
         | 
         | EDIT: Found the interview:
         | <https://docs.microsoft.com/en-us/shows/Careers-Behind-the-Co...>
        
         | Jach wrote:
         | Besides the good methods others have posted, and a really nice
         | method (if you have it) of having someone else familiar with
         | the code give you a tour, you can also do pretty well just by
         | brute forcing it.
         | 
         | Get a list of all the files, sorted however (`find . -name '*.foo'`
         | works) and start going through them top to bottom, or bottom to
         | top if that's a more clear convention of the language. Maybe
         | shuffle order a bit if you discover unit tests (nearby or
         | asking a tool to cross-reference a call) to read the code and
         | the test around the same time, but resist the urge to jump
         | around too much or too deeply. Jot down short notes about what
         | seems to be the main purpose(s) of the file, and move on. Keep
         | going, keep track of what you've seen, your first goal is to do
         | a complete survey of all the files and not get too distracted
         | by fully understanding new syntax (Java annotations and Python
         | decorators can both be understood as high level declarative
         | tags even though under the hood they're quite different) or
         | endless note revisions from new insights as you progress and
         | start seeing connections or just finally understanding
         | terminology ("wtf is a 'hero'?").
         | 
         | You'd be surprised how fast you can do a single (high level,
         | shallow, skimming in places) pass even for larger code bases,
         | by the end of it you'll also have found the/an entry point, and
         | are in a better place for followup study or producing materials
         | that can help the next person (like an architecture diagram
         | that lists the files involved in each element, at least at that
         | moment, or just some important cross references you've noted
         | that a tool isn't necessarily going to make clear). And for
         | easy code, a single pass may be all you ever need, even if you
         | read it in a strange order. A completed puzzle is perfectly
         | clear regardless of the order you put the pieces down.
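The survey described above could start from a generated checklist, one line per file, with room for a short note after each. This is only a sketch: the name `reading_checklist`, the `*.py` default, and the `[ ]` line format are all assumptions.

```python
from pathlib import Path


def reading_checklist(root, pattern="*.py"):
    """Return one unchecked checklist line per matching file, in sorted order."""
    root = Path(root)
    files = sorted(p for p in root.rglob(pattern) if p.is_file())
    # Paths are shown relative to root so the notes stay readable.
    return [f"[ ] {p.relative_to(root).as_posix()} : " for p in files]
```

Write the result to a text file, then fill in the main purpose of each file after the colon and tick it off as you go.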
        
         | miguendes wrote:
         | Mitchell Hashimoto has published a blog post describing how he
         | approaches complex codebases. That might give you an idea where
         | to start.
         | 
         | https://mitchellh.com/writing/contributing-to-complex-projec...
        
           | yodsanklai wrote:
           | Good advice! I especially like the first piece of advice:
           | 
           | > The first step to understanding the internals of any
           | project is to become a user of the project.
           | 
           | It's normally easier to figure out complex behaviour from the
           | spec/doc/interaction than from the code.
        
           | labrador wrote:
           | Great guide! I would add fixing bugs. I often learn most
           | about a code base by fixing my bugs. A good debugger can be a
           | blessing. Profiling is part of debugging to me. Questions can
           | come up about why something is taking a long time that lead
           | to more debugging and thinking about what is going on.
        
             | trynewideas wrote:
             | The best runs I've had working on others' codebases is to
             | jump into documenting it. Many projects love having someone
             | read, ask questions about, and document code, even (or
             | especially) from a naive standpoint since that's who'll
             | benefit the most from it, and in the process you learn how
             | the code's structured, track where references lead to, and
             | more often than not kick over some bugs worth fixing in the
             | process.
        
         | maerF0x0 wrote:
         | Short answer -
         | 
         | git clone <repo> ;
         | 
         | open project in editor/IDE
         | 
         | Read the readme.md to get an idea of the author's opinion
         | 
         | Start at `func main(){}` and find what I find.
         | 
         | Longer answer can be taught by taking the patterns out of
         | 
         | https://www.goodreads.com/book/show/567610.How_to_Read_a_Boo...
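`func main(){}` is the Go convention. In a Python codebase the closest equivalent is usually a top-level `if __name__ == "__main__":` block. Here is a small sketch for detecting one with the standard `ast` module. The helper name `has_main_guard` is hypothetical, and it will not find packages that start from a console-script entry point instead.

```python
import ast


def has_main_guard(source):
    """Return True if source has a top-level `if __name__ == "__main__":` block."""
    for node in ast.parse(source).body:
        if not isinstance(node, ast.If):
            continue
        test = node.test
        if (
            isinstance(test, ast.Compare)
            and isinstance(test.left, ast.Name)
            and test.left.id == "__name__"
            and len(test.ops) == 1
            and isinstance(test.ops[0], ast.Eq)
            and isinstance(test.comparators[0], ast.Constant)
            and test.comparators[0].value == "__main__"
        ):
            return True
    return False
```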
        
         | gushogg-blake wrote:
         | I came up with the idea of ENTRYPOINT comments to solve this
         | problem:
         | https://gist.github.com/gushogg-blake/247b1bf2ed46b035d1c8a2...
        
       | everyone wrote:
       | Box2D https://github.com/erincatto/box2d I went over every file
       | of this writing a Unity plugin for it in work once. I was really
       | impressed, learned a lot.
        
       | khalladay wrote:
       | I think my favourite open source project to poke around in
       | recently is [Reshade](https://github.com/crosire/reshade). The
       | code is pretty readable and is doing a lot of interesting stuff.
       | Every time I've taken a look at it I've learned something new.
       | Definitely super light on boilerplate, given that it's solving a
       | bit of a unique problem.
       | 
       | In terms of tips and tricks, I often start looking at new code by
       | trying to write out in plain english prose, a bit of a story of
       | how the code works. Almost like I'm writing a blog post
       | explaining how things work to someone else. Often this process
       | uncovers rabbit holes that I need to go down to understand
       | isolated bits of logic before I can return to building this big
       | picture view, which is sort of the point.
        
       | 65 wrote:
       | Wordpress is pretty great.
       | 
       | https://github.com/WordPress/WordPress
        
       | amichal wrote:
       | https://github.com/seattlerb/minitest really removed the FUD for
       | me when I started learning Ruby and Rails. It's full of
       | metaprogramming and fancy tricks but is also quite small,
       | practical and informal in its style.
       | 
       | e.g. "assert_equal" is really just "expected == actual" at its
       | core but it uses both a block param (a kind of closure) for
       | composing a default message and calls "diff" which is a dumb
       | wrapper around the system "diff" utility (horrors!). There is
       | even some evolved nastiness in there for an API change that uses
       | the existing assert/refute logic to raise an informative message.
       | This is handled with a simple if and not some sort of complex
       | hard-to-follow factory pattern or dependency injection misuse.
       | 
       | https://github.com/seattlerb/minitest/blob/master/lib/minite...
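To make the shape of that concrete, here is a loose Python sketch of the same idea. It is not minitest's code. The failure message is built lazily by a closure, and Python's `difflib` stands in for the call out to the system "diff" utility.

```python
import difflib


def assert_equal(expected, actual, message=None):
    """Raise AssertionError with a lazily built diff message when values differ."""
    if expected == actual:
        return True

    def default_message():
        # Only runs on failure, like the block param described above.
        diff = difflib.unified_diff(
            str(expected).splitlines(),
            str(actual).splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
        return "\n".join(diff)

    # A caller-supplied message may itself be a callable, evaluated only here.
    prefix = message() if callable(message) else message
    body = default_message()
    raise AssertionError(f"{prefix}\n{body}" if prefix else body)
```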
        
         | cfcosta wrote:
         | That's an oldie, but the experience was the same for me. I was
         | reading Metaprogramming Ruby at the time, and going over the
         | implementation made everything much clearer to understand.
        
       | munk-a wrote:
       | I have a very different suggestion. This codebase (RPI
       | Engine [1]) is what I initially cut my teeth on and I learned a lot
       | about good program design just by viewing what works and what
       | didn't work. Reading and understanding code that's stood the test
       | of time can also be quite valuable because you can see which
       | patterns can survive lots of people touching it and which
       | patterns start to fall apart when the original designer isn't
       | available to onboard new people - MUDs develop through time with
       | a few concurrent developers at most, and generally have stretches
       | where there are no active developers, or the people executing
       | code changes are learning it as they go.
       | 
       | I'd suggest this codebase as an excellent lesson in how bloat and
       | complexity enter into the picture over time - I wish the actual
       | commit history was available, but unfortunately the open source
       | release was just a snapshot in time.
       | 
       | 1. https://github.com/webbj74/RPI-Engine
        
       | fieryskiff11 wrote:
        
       | cryptonector wrote:
       | > Do you have tips and tricks you use to read codebases?
       | 
       | #1: If the codebase is huge, you can't read all of it. So you'd
       | best know how to _navigate_ it.
       | 
       | #2: You need an IDE or cscope-like tool to navigate a codebase.
       | The codebase is like a web of, say, wikipedia articles, and
       | you're going to have to browse it a lot like how you'd browse
       | wikipedia. Symbols are links!
       | 
       | #3: It helps to understand the big picture. What does this
       | codebase implement? Where are the "entry points" -- where to
       | start reading? What's the architecture? (E.g., Java is a byte-
       | compiled language with a bytecode interpreter known as a JVM.)
       | What's the design look like?
       | 
       | #4: If it's just for fun, well, just browse till you find
       | something interesting, then read it carefully, and go spelunking
       | like it's a wikipedia article.
       | 
       | #5: If you're reading it to debug something, you need to first
       | find the relevant entry points.
       | 
       | #6: If you're reading it to add features, you really need to read
       | the developer docs (if they exist), the internals docs (if they
       | exist), and figure out a lot of things like APIs exported,
       | internal utilities libraries, portability layers, external
       | dependencies, protocols, etc. This will take time, and that's ok.
       | Start with small features, and work your way up. You'll build a
       | deeper understanding as you go.
       | 
       | #7: You don't have to understand all that much about the codebase
       | in question, and it might not be possible to if we're talking
       | about a codebase that's in the hundreds of millions of lines of
       | code. You'll have to specialize as you dive deep, and generalize
       | as you wade "near the top".
       | 
       | #8: It can take time to pick up these skills to the point where
       | you can do this quickly. And even then, it can take time to
       | understand a large codebase well enough. There's just a ton of
       | detail that you have to digest into a mental picture that's
       | sufficiently high-level that you can use it productively. So be
       | patient, and keep on going. Just because it's a lot to learn, you
       | shouldn't be discouraged.
       | 
       | To really deal with huge codebases, you have to be a bit like a
       | generalist who can specialize as needed.
       | 
       | For example, if you're reading the OpenJDK, you'll want to
       | understand what Java is, what the JVM is, and so on, though you
       | won't have to understand all of that if you just want to read the
       | OpenJDK implementation of, say, TLS, but you will have to be able
       | to navigate outside that particular bit of the OpenJDK sometimes,
       | but if you tease out code threads far enough, you probably will
       | learn a thing or three about seemingly unrelated things like the
       | GC.
       | 
       | Get comfortable doing these things, and you'll be able to deal
       | with codebases in the millions of lines of code.
        
       | sparker72678 wrote:
       | Sidekiq: https://github.com/mperham/sidekiq
        
       | traviscj wrote:
       | I've learned a TON from the
       | [okhttp3](https://square.github.io/okhttp/) codebase, highly
       | recommend studying it.
        
       | malkosta wrote:
       | I really like DWM: https://git.suckless.org/dwm/
       | 
       | If you have a Linux machine, you can compile and install manually
       | by just following the instructions on the README.
       | 
       | Then you can customize the window manager by copying and pasting
       | the patches into your version and recompiling. That forces you to
       | learn how to build and extend your own window manager in pure C.
       | And it isn't hard at all, even to a beginner.
       | 
       | That inspired the creation of many tiling window managers,
       | because people understood the code and decided to build their
       | own, like i3 or xmonad.
       | 
       | The project also features other easy to read C apps, like ST
       | terminal and the surf web browser.
        
       | dzuc wrote:
       | A bit old now of course but both Underscore [1] and Backbone [2]
       | have annotated sources and are a pleasure to read.
       | 
       | 1. https://underscorejs.org/docs/underscore-esm.html
       | 
       | 2. https://backbonejs.org/docs/backbone.html
        
       | thom wrote:
       | SerenityOS, especially the userland, has always seemed very
       | elegant to me:
       | 
       | https://github.com/SerenityOS/serenity
        
         | typon wrote:
         | one of the best C++ codebases in existence.
        
       | renewiltord wrote:
       | I think `xsv` is easy to read. I have a fork of it for personal
       | use and it was easy to add features to it even though I'm not a
       | rust daily user.
        
       | gorjusborg wrote:
       | I have found using github's language search to be helpful for
       | this sort of thing.
       | 
       | If you are using ruby, for instance, just search for
       | https://github.com/search?q=language%3Aruby and look for popular
       | codebases. You can decide which are beautiful for yourself.
        
       | bluedino wrote:
       | DEU (Doom editing utilities)
       | 
       | https://www.doomworld.com/idgames/utils/level_edit/deu/deu52...
        
       | pogopaule wrote:
       | You might want to join the https://codereading.club/
        
       | SteveMoody73 wrote:
       | I think it can be hard to recommend a particular codebase,
       | well written code can be good to read but if you want to become
       | better at a language or problem domain then sometimes reading
       | badly written code may be a better way to learn.
       | 
       | Working through some badly written code that actually performs
       | well can be a real eye opener. I mainly work in C and reading
       | some legacy code (sometimes even my own) can be a challenge to
       | work out exactly what's going on.
       | 
       | If you want to learn how an algorithm works, then a good clean
       | codebase with lots of comments is a good way to go. If you want
       | to learn the details of a particular language, then just read a
       | lot of code in that language whether it's good or bad.
        
       | oso2k wrote:
       | Almost anything from suckless.org.
       | 
       | Here's a window manager (dwm) and its docs and build system in
       | 13 files and just around 3000 lines of code.
       | 
       | https://git.suckless.org/dwm/files.html
       | 
       | And sbase, a sort of "busybox-like" set of common *NIX base utils
       | written to be small and portable. Some of the commands are just a
       | few dozen lines.
       | 
       | https://git.suckless.org/sbase/files.html
        
       | sharikous wrote:
       | I hate to be the "it's complicated" guy but "excellent" is too
       | broad.
       | 
       | I see every day code that is elegant but has bugs, ugly code that
       | is foolproof, optimized code that performs abysmally because of
       | some architecture change that happened in between, and a lot of
       | abominations that make the code bad for guy A and good for guy B
       | (e.g. a neat typechecked, object-oriented, very elegant, Pythonic
       | numerical code that is 100 times more confusing for your research
       | level numerical analyst than an uglier but functional Matlab
       | script).
       | 
       | What I agree on is "the best way to improve X in my code" is
       | "read code that has quality X".
       | 
       | Given the broadness of your question I suspect you are still
       | finding your way around programming in general. If that's the
       | case my method is to be driven by curiosity.
       | 
       | - Why does macOS behave this way? Let's look up xnu's code
       | 
       | - I wonder about list implementation... Let's look at CPython code
       |   for appending items to a list
       | 
       | And so on... There is a lot of open code for stuff we are using
       | everyday. It is interesting to get into it.
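For the list question, a quick experiment before opening the C source can tell you what to look for. This sketch assumes CPython, where `sys.getsizeof` reports a list's allocated size. It records the lengths at which an append changed that size, and the function name is made up.

```python
import sys


def allocation_steps(n):
    """Return the list lengths at which appending changed the reported size."""
    items, steps = [], []
    last = sys.getsizeof(items)
    for _ in range(n):
        items.append(None)
        size = sys.getsizeof(items)
        if size != last:
            steps.append(len(items))
            last = size
    return steps
```

If the list is over-allocating, the result is much shorter than `n`, and the obvious next question for the source is how the new capacity is chosen.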
        
       | ramboldio wrote:
       | GRBL, the CNC firmware for Arduinos:
       | 
       | https://github.com/grbl/grbl/
       | 
       | It feels like it has more comments than code. The comments are
       | written in a very nice, understandable language that even
       | actively teaches about concepts that are only adjacent to the
       | code at hand.
       | 
       | E.g. https://github.com/grbl/grbl/blob/master/grbl/stepper.c#L142
       | or https://github.com/grbl/grbl/blob/master/grbl/stepper.c#L233
        
         | bcrl wrote:
         | That's over-commented code written by a junior developer from
         | my quick look. The first random thing I looked at in
         | grbl/eeprom.c:
         | 
         | ...
         | 
         |     char old_value; // Old EEPROM value.
         |     char diff_mask; // Difference mask, i.e. old value XOR new value.
         | 
         |     cli(); // Ensure atomic operation for the write operation.
         | 
         | ...
         | 
         | You can remove the need for the first comment by calling the
         | variable old_eeprom_value. Boom, simple and obvious. Commenting
         | cli() is similarly ridiculous: call the function
         | disable_interrupts() and it's completely obvious what it's
         | doing. Later on:
         | 
         | sei(); // Restore interrupt flag state.
         | 
         | This is incorrect. It's enabling interrupts, not restoring
         | them. If the intent was actually to restore the interrupt
         | disable flag to its original state then this function is buggy
         | and will unintentionally enable them. It would be far better to
         | document the expected semantics in the documentation for the
         | function above, but instead of documenting the expected
         | semantics of the eeprom_put_char() function, you have to read
         | the code to figure out what the semantics are. What would be
         | better is to have a comment in the function description saying
         | "this function can only be called with interrupts enabled" or
         | "this function is atomic and can be safely called from an
         | interrupt handler or with interrupts enabled". Then it's
         | obvious when reading the code which semantics are guaranteed /
         | expected.
         | 
         | So, sure, overly commented code makes it easy to figure things
         | out, but this is a sign of a junior developer that is focused
         | too much on the code and not enough on the overall system. This
         | isn't something I'd like to see a developer pointed at that is
         | looking to learn good habits. Good habits are telling other
         | developers what they can expect from a function. Bad habits are
         | making them read the code to figure that out.
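The enable-versus-restore distinction can be shown with a toy model. This is Python standing in for the AVR code, not GRBL's implementation. `InterruptFlag` and both function names are invented, and the flag is just a boolean.

```python
class InterruptFlag:
    """Toy stand-in for a CPU's global interrupt-enable flag."""

    def __init__(self, enabled=True):
        self.enabled = enabled


def write_then_enable(flag):
    """Critical section that unconditionally enables interrupts on exit."""
    flag.enabled = False  # plays the role of cli()
    # ... critical work would go here ...
    flag.enabled = True   # plays the role of sei(): enables, whatever the prior state


def write_then_restore(flag):
    """Critical section that puts back whatever state the caller had."""
    saved = flag.enabled
    flag.enabled = False
    # ... critical work would go here ...
    flag.enabled = saved
```

Called with interrupts already disabled, the first version returns with them enabled and the second does not. That difference is exactly what the function-level documentation should state.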
        
         | jovial_cavalier wrote:
         | It's funny this got mentioned, because I recently got a 3D
         | printer that runs Marlin, which embeds GRBL. So, I decided I
         | would take a look at it. I thought a lot of places were really
         | garbled. Especially motion_control.c, which has a _ton_ of
         | #ifdef logic
        
       | simonw wrote:
       | Something I find really helpful is to start with a question that
       | I want to answer.
       | 
       | Often this will be along the lines of "How does it do X?" - where
       | X is something I either didn't know was possible or that I
       | suspect to be really difficult.
       | 
       | Then I can dive in to the codebase (usually starting with GitHub
       | code search) and try to figure out how they do it.
       | 
       | This helps me skip straight past the boilerplate and means I
       | often get to a satisfying conclusion - where I've learned
       | something new - in a very small amount of time.
       | 
       | And along the way I pick up knowledge about how their code is
       | organized and often a few other tricks too.
        
         | simonw wrote:
         | One recent example: I wanted to know if the SQLite package in
         | Python took any steps to avoid calling "interrupt" on a closed
         | connection, which the SQLite C documentation warns against.
         | 
         | A couple of searches against https://github.com/python/cpython
         | led me to this code here:
         | https://github.com/python/cpython/blob/4674fd4e938eb4a29ccd5...
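
A minimal sketch of how that guard shows up from the Python side. It assumes
a CPython whose sqlite3 module checks the connection state before it reaches
the C-level interrupt call; the helper name and the in-memory database are
illustrative and do not come from the linked code.

```python
import sqlite3


def interrupt_after_close():
    """Return the exception raised by interrupt() on a closed connection, if any."""
    conn = sqlite3.connect(":memory:")
    conn.close()
    try:
        # Assumption: the module refuses to operate on a closed connection
        # instead of passing a stale handle down to the SQLite C library.
        conn.interrupt()
    except sqlite3.ProgrammingError as exc:
        return exc
    return None


result = interrupt_after_close()
print(type(result).__name__ if result else "no error raised")
```

This only exercises the single-threaded case; it says nothing about a close
racing with an interrupt from another thread.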
        
           | gerhardhaering wrote:
           | It's nice that code that I wrote more than a decade ago is
           | mentioned here.
        
             | edelans wrote:
             | And it's even nicer that you get the chance to see it !
        
             | simonw wrote:
             | Thank you very much for building this, I benefit from it
             | every day!
             | 
             | Since you're here... there was actually a question raised
             | on the SQLite forum about that code and whether it is
             | genuinely safe against a specific race condition... and I
             | don't have nearly enough Python C knowledge to know the
             | answer!
             | 
             | Does this look like it could be a problem to you?
             | https://sqlite.org/forum/forumpost/f37ae374cc
        
       | spullara wrote:
       | I really enjoyed working with the Redis codebase. Great, easy to
       | understand C code.
        
       | aitoehigie wrote:
       | This is a very interesting question.
       | 
       | Are you interested in any particular languages?
       | 
       | For Python, take a look at: https://github.com/psf/requests
        
         | impjohn wrote:
         | I initially had Python in the title but I removed it to give
         | way to a broader discussion. Definitely checking this one out
        
           | andthat wrote:
           | Kenneth Reitz actually wrote a book called The Hitchhiker's
           | Guide to Python! which includes a chapter on Reading Great
           | Code.
           | 
           | https://docs.python-guide.org/writing/reading/
        
             | impjohn wrote:
             | That looks like exactly what I was looking for, thanks for
             | the resource
        
       | bitigchi wrote:
       | Haiku: https://git.haiku-os.org/haiku/tree/
        
       | foobarbed wrote:
       | I've always enjoyed lichess's chess API:
       | https://github.com/lichess-org/scalachess/tree/master/src/ma...
       | 
       | It's funny because I remember comparing it to mine that I had
       | tried to write during college, and appreciating how much better
       | it is.
       | 
       | Pay attention to how there's a bunch of different types of chess
       | in there too, and how that's factored.
        
       | winrid wrote:
       | You don't get good at a language by just listening to it all the
       | time. You get good by _engaging_. Same goes for programming.
       | 
       | Also, a lot of "clean code" stuff can be confusing dogma.
       | 
       | You should try building things you find interesting, and try to
       | build them in a way that "feels correct", and try to empathize -
       | what if someone else was reading this? What if someone else dived
       | into this codebase to add this feature? Could they?
        
       | exyi wrote:
       | For anyone looking for a (nontrivial) C# project, I can only
       | recommend going through ILSpy decompiler.
       | https://github.com/icsharpcode/ilspy
        
       | afry1 wrote:
       | I find Ramda very easy to read! It's a functional JavaScript
       | library based on currying and composition.
       | https://github.com/ramda/ramda/
       | 
       | I find a lot of code fairly alienating to read. Lots of codebases
       | require you to get into the "mindset" of the person who wrote the
       | code: their idioms, assumptions, patterns they lean on, etc. So
       | unless you've got the time to get deep into it, the insights you
       | can draw from reading it are minimal.
       | 
       | Ramda, by comparison, is just a library of utility functions, and
       | all of those utilities perform very simple operations: merging,
       | plucking, appending, equality checking, etc.
       | 
       | There's a lot of intention in the Ramda API as well. All
       | functions are "data last," meaning that the actual piece of data
       | you're operating on is the final argument to every function. This
       | enables you to write Ramda code that is very structurally
       | consistent: function parameters first, data last, every time.
       | 
       | It gives me a sense of empowerment, reading the code. It's like
       | "This doesn't have to be rocket science. If you just start from
       | these basic operations, and write those basic operations with a
       | simple but strict ideology of 'data last' every time, and stick
       | them together like lego blocks using compose, then you can
       | achieve some very cool stuff with very little code."
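
A rough sketch of the "data last" idea, written in Python as a stand-in
rather than in Ramda's own JavaScript. The helper names pluck, reject, append
and compose are borrowed from Ramda, but these implementations and the sample
data are illustrative assumptions, not Ramda's source.

```python
from functools import reduce


def compose(*funcs):
    """Compose right to left: compose(f, g)(x) == f(g(x))."""
    return lambda data: reduce(lambda acc, fn: fn(acc), reversed(funcs), data)


# Each helper takes its parameters first and returns a function that is
# still waiting for the data, which always comes last.
def pluck(key):
    return lambda rows: [row[key] for row in rows]


def reject(predicate):
    return lambda items: [item for item in items if not predicate(item)]


def append(item):
    return lambda items: [*items, item]


# Hypothetical pipeline; with compose, read the steps from bottom to top.
names_plus_guest = compose(
    append("guest"),
    reject(lambda name: name.startswith("_")),
    pluck("name"),
)

users = [{"name": "ada"}, {"name": "_system"}, {"name": "grace"}]
print(names_plus_guest(users))
```

Because every helper leaves the data for last, the pipeline is assembled
before any data exists and can be reused on any list of records.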
        
       | todotask wrote:
       | My trick in Go projects is to use sqlc to generate Go code from
       | SQL, which is a great time saver and minimises errors. I'm glad
       | to avoid an ORM for as long as possible and to keep frameworks
       | minimal. It gets my job done and lets me spend more time on
       | business logic.
       | 
       | Add Tailwind on top and nothing locks you in.
        
       | lazyweb wrote:
       | Pihole [1] is mostly written in bash, which reads rather well, as
       | far as I am concerned.
       | 
       | [1] https://github.com/pi-hole/pi-hole
        
       | cperciva wrote:
       | In past threads, people have mentioned enjoying my Tarsnap
       | (https://github.com/Tarsnap/tarsnap) code. I personally think
       | that the spiped (https://github.com/Tarsnap/spiped) code is even
       | better.
        
       | tbrock wrote:
       | Redis. Read the redis source code if you want to see nice C.
       | 
       | The reason it always impresses me is that C can look like
       | gobbledygook, yet this codebase is clean and understandable.
        
         | thorin wrote:
         | I was impressed with the Redis codebase too. I think it
         | benefits from being relatively new in C terms so it doesn't
         | have too much baggage (2009 is really new in terms of C
         | projects!). It must also take a lot of discipline on behalf of
         | the maintainer.
         | 
         | I seem to remember Postgres and SQLite were relatively
         | accessible to a low intermediate C programmer. When I've had to
         | look at Android code (more C++ admittedly) I've started to get
         | lost very quickly.
        
           | jjice wrote:
           | Postgres's Yacc definition helped me a ton when I was using
           | Yacc. The documentation out there for Yacc/bison isn't great,
           | but Postgres served as a decent set of examples.
        
           | dkarl wrote:
           | I second Postgres. Not only is the source code a pleasure to
           | read, there's also an unusual amount of well-presented
           | material about its internals available online.
        
             | atonse wrote:
             | this probably correlates to why there is such a rich
             | ecosystem of pg extensions and forks.
        
               | spacemanmatt wrote:
               | while true, the architecture rather than code hygiene
               | deserves credit for facilitating the ecosystem of
               | extensions and forks. in my husky opinion.
        
               | datavirtue wrote:
               | Also explains how Pivotal was able to refactor it into a
               | MPP (Greenplum).
        
         | bear8642 wrote:
         | The Plan 9 operating system is a good C codebase to explore too
        
         | zvr wrote:
         | If you're looking for code in C, the implementation of Tcl is a
         | wonderful code base. You can even focus on specific parts
         | instead of the complete scripting language: how to create a
         | hash table, for example.
        
       | numtel wrote:
       | Postgres
        
         | mattashii wrote:
         | Most sections of the codebase that are actively developed are
         | very readable, but I still got quite lost in the core parts of
         | xact/multixact recently. I feel that is more of an exception,
         | though.
        
       | anonymoushn wrote:
       | The zig stdlib has been good reading so far. You also basically
       | have to read it if you want to use it.
        
       | HeckFeck wrote:
       | I've had a look at NetBSD's codebase before. It was fairly easy
       | to follow.
       | 
       | I've also heard good things said for OpenBSD's readability.
        
       | qiskit wrote:
       | > A colleague told me the best way to level up coding skills is
       | to read excellent code.
       | 
       | The best way to level up is to code. Reading code can be a
       | complementary activity that can bring insights but it's not a way
       | to level up. Active > passive.
       | 
       | > Do you have favorite repos that highlight this?
       | 
       | For what language? Desktop, mobile? Systems programming or web
       | development? Linux/BSD/etc all have source code available. I
       | believe Microsoft has open sourced the .NET Framework or parts of
       | it.
       | 
       | It's like you are learning a foreign language and want us to
       | recommend good books? Can't really help you if you don't tell us
       | the foreign language and your goals for the language ( casual
       | conversation, business, translation, etc ).
        
         | devnonymous wrote:
         | > Reading code can be a complementary activity that can bring
         | insights but it's not a way to level up.
         | 
         | Counter-point: As a professional developer one might spend far
         | more time reading code than writing code. In my experience, all
         | the good developers I've worked with have the ability to skim
         | through large code bases and quickly zero in on the parts that
         | interest them. It is a very deliberate skill to cultivate.
         | 
         | I once put down my thoughts on this :
         | http://lonetwin.net/20090829/hacks-you-can-live-without/on-r...
        
         | bityard wrote:
         | >> A colleague told me the best way to level up coding skills
         | is to read excellent code.
         | 
         | > The best way to level up is to code.
         | 
         | I think it's much more subtle than either of these.
         | 
         | First of all, "excellent code" is an extremely subjective
         | thing. I once worked with this one developer. He could cook up
         | solutions to complex problems very quickly. But he didn't
         | comment or docstring any of his code, he favored writing his
         | own libraries and frameworks rather than pull in dependencies,
         | and every single thing he wrote was grossly over-engineered
         | once you managed to figure out what it was doing.
         | 
         | Which is a long way of saying, he was a brilliant programmer
         | who wrote very shitty code. And unfortunately, there are a
         | large number of open source projects and maintainers like this,
         | so picking some at random to study may not get you very far.
        
         | makk wrote:
         | > The best way to level up is to code.
         | 
         | Up to a point, yes. But beyond that point, in my experience, a
         | deliberate study of software architecture is required to move
         | forward. That and mentorship/code reviews by people who have a
         | deeper appreciation of software architecture.
         | 
         | You start by wanting to learn how to code, then you write a lot
         | of code, then you progress by learning how to write less code
         | and less complex code.
        
         | nouveaux wrote:
         | "The best way to level up is to code. Reading code can be a
         | complementary activity that can bring insights but it's not a
         | way to level up. Active > passive."
         | 
         | Both writing and reading code are important. It's just that most
         | people, in my experience, do not actively search out code to
         | read and spend more time writing code.
        
         | bckr wrote:
         | The post originally asked about Python programming. OP made it
         | more general on purpose.
         | 
         | I'll (tongue-in-cheekly) prompt you with the following:
         | 
         | Language: Whatever qiskit is most familiar with OR has a
         | favorite recommendation for (based on qiskit's interests).
         | Domain: Whatever qiskit is most familiar with OR has a favorite
         | recommendation for (based on qiskit's interests).
        
         | datavirtue wrote:
         | Unfortunately, a lot of developers run into clever, over-
         | abstracted code they can't understand (repeatedly) and then
         | eventually grasp it and think it's OK or even preferred to
         | write clever code like that themselves. It's like a virus.
        
         | klyrs wrote:
         | "Practice makes perfect" is a common refrain, and there is some
         | truth to it... but more accurately, _perfect practice makes
         | perfect._ If you practice bad form, you will execute bad form.
         | Merely writing code is not necessarily good practice. Writing
         | _good_ code is good practice. Ergo, alternating between reading
         | good code and writing code is an effective means of leveling
         | up.
        
       | elcapitan wrote:
       | I remember back in the day reading parts of the Python standard
       | library. I don't know if that's generally good advice or still
       | viable, but that's what I did, and I found it helpful. It was
       | directly available, and usually connected to things I used with
       | Python.
       | 
       | One upside of this might also be that it's not, as you said,
       | boilerplate, because it's very foundational and not heavily using
       | other stuff. It also is well documented, so you'll find good
       | explanations why things are the way they are.
        
       | jorangreef wrote:
       | Are we allowed to share repos we've written? :)
       | 
       | If so, then here's distributed consensus in Zig:
       | 
       | https://github.com/coilhq/tigerbeetle/blob/main/src/vsr/repl...
       | 
       | Something that differentiates this from many consensus
       | implementations is that there's no boilerplate
       | networking/multithreading code leaking through, it's all message
       | passing, so that it can be deterministically fuzz tested.
       | 
       | I learned so much, and had so much fun writing this, that I also
       | hope it's an enjoyable read--or please let me know what can be
       | improved!
        
       | twothumbsup wrote:
       | I've found the Chef project (https://github.com/chef/chef) to be
       | high quality and easily readable but I've been working with Chef
       | for like 8 years at this point which might be influencing how I
       | view it.
       | 
       | HashiCorp projects also seem very well done, especially given
       | how extensible they are.
        
       | johntdaly wrote:
       | To be honest, I don't know any code bases I would call "great" or
       | "easy to read" but I can tell you what I do when I need to work
       | in codebases I don't know.
       | 
       | I've got three main strategies:
       | 
       | 1) I look at the part of the app I want to modify when I use the
       | app and search for that part in the code. Once I've found that
       | code I roughly try to find out how that code works by adding
       | exploratory code (you can also use a debugger). Once I "think" I
       | know what is going on I try to modify the code. This is where you
       | usually find some exceptions or misunderstandings on your part if
       | you haven't touched the code before. If you are lucky and work in
       | a team somebody can tell you in a code review what you didn't
       | understand. If you are alone you will have to see things blow up,
       | debug and fix the problem.
       | 
       | 2) You can try to figure out from the main entry point how the
       | app works. This works better for some apps than for others. If
       | you have an event based app this is most likely just a supplement
       | to method 1, if you have a cli app or some type of data munching
       | app this can replace method 1.
       | 
       | 3) You can try looking at early versions of a code base in Git to
       | get an understanding of its architecture before the app became
       | "more complex".
       | 
       | You will always be a bit overwhelmed by any code base and many
       | code bases are just too large for a single person so get
       | comfortable working on "parts" of an app first rather than
       | working on or understanding "the whole thing". Also, code reading
       | is not like reading books, code is way way denser than any book
       | you can read (and that includes Heidegger) so you will not just
       | "read" it, you will need to work with it. Zed Shaw's "Learn X the
       | Hard Way" series relies on you working with the code to
       | understand it. The same holds true for code you "read", you will
       | at least need to try to "run" the code in your mind if you can't
       | run it for real.
       | 
       | You might also want to get over your thing about frameworks. Qt,
       | GTK, Ruby on Rails, React, ncurses, frameworks and libs are in
       | just about any app and many apps that get larger might extract
       | significant parts of their functionality into libs or frameworks.
       | A lot of boilerplate is usually a good indication that an app
       | could benefit from a framework. I never understood the "I want to
       | be free from the constraints of frameworks" people. Their code
       | bases usually have the start of multiple architectures and a lot
       | of boilerplate code. I think they always search for some
       | "perfect" solution and just can't find it. The truth is, libs and
       | frameworks are great, they give you an easy in on a new app and
       | they give you documentation that probably wouldn't exist on fully
       | home grown code. In other words, they make "reading" code easier.
        
       | ppg677 wrote:
       | LevelDB
        
         | danielmarkbruce wrote:
         | +1000
         | 
         | https://github.com/google/leveldb
         | 
         | Jeff Dean and Sanjay Ghemawat are amazing engineers and this
         | code is (/was?) nice.
        
       | aidos wrote:
       | Look through the stack you're familiar with. For me that means
       | nginx, uwsgi, flask, sqlalchemy, alembic - but I'll look at
       | anything I have a question about.
       | 
       | My trick is to dig in when something doesn't work the way I
       | expect. Or someone says "I don't think there's a way to do X with
       | blah". My immediate reaction is to clone the code and take a
       | look. I have a "tools" folder on my local machine that contains
       | many of the tools / libraries I use.
       | 
       | Orientation is easier than you expect. The easiest scenarios are
       | around "why did I get that error" situations. Grep for the error
       | and away you go. But having a question to answer will definitely
       | give you a direction to investigate.
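
The "grep for the error" step can be sketched in a few lines of Python. The
function name, the suffix list, and the throwaway demo file below are all
hypothetical; on a real checkout, `grep -rn "message" .` does the same job.

```python
import os
import tempfile


def find_message(root, needle, suffixes=(".py", ".c", ".h")):
    """Yield (path, line_number, line) for each source line containing needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        yield path, number, line.strip()


# Demo on a throwaway tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "db.py"), "w", encoding="utf-8") as handle:
        handle.write("def check(conn):\n    raise ValueError('connection is closed')\n")
    hits = list(find_message(root, "connection is closed"))
    for path, number, line in hits:
        print(f"{os.path.basename(path)}:{number}: {line}")
```

Searching for the literal message usually lands on the raise site, and the
surrounding function is then a natural place to start reading.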
        
         | gmassman wrote:
         | I agree about having an error to investigate or a question to
         | answer. Some of the python libraries with great code I'd
         | recommend reading are Flask and Werkzeug. Both have very clean
         | interfaces and excellent documentation. SQLAlchemy is somewhere
         | in the middle. As an ORM it has a high minimum level of
         | complexity. But the codebase is still reasonably well
         | organized. Looking for an answer to something like "how does it
         | emit a JOIN" may be a lot harder to answer than you expect
         | though.
         | 
         | For packages to avoid, stay away from Celery. It's just...
         | icky.
        
           | aidos wrote:
           | Yup, sqla is super meta and, thus, complicated (just due to
           | the problem space). Better is alembic / alembicutils where
           | you can dig into the autogenerate system they're layering
           | over sqla.
           | 
           | Weirdly enough, I enjoy digging through C and Java libs more,
           | mostly because they're more unfamiliar to me. I'd spend more
           | time in Postgres / fontforge / mupdf / pdfbox / nginx / uwsgi
           | on the whole.
        
       | sgc wrote:
       | Along with the other recommendations, I was introduced to _The
       | Architecture of Open Source Applications_ from an HN post some
       | time back, and have found it quite interesting. You can use it
       | together with a more detailed walk through the respective
       | projects' source code, to get a great idea of what some big
       | names are doing.
       | 
       | http://aosabook.org/en/index.html
        
       | markstos wrote:
       | For TypeScript, Ghost: https://github.com/TryGhost/Ghost
        
         | davidjfelix wrote:
         | Maybe I'm missing something but this repo looks like it's
         | exclusively JavaScript with no TypeScript.
        
       | asimpletune wrote:
       | I'm surprised no one has mentioned reading tests as a good
       | starting point. Anyway, besides main, tests are usually good too.
        
       | sirodoht wrote:
       | Every time that I can't figure out how to do something with
       | Django, I just read the code [1] and then everything is easy and
       | clear.
       | 
       | [1]: https://github.com/django/django
        
       | hoten wrote:
       | cs.chromium.org is an example of how tooling can drastically help
       | with readability. It's incredibly easy to navigate the codebase.
        
       | samoit wrote:
       | Anything written in Lisp/Scheme
        
         | bear8642 wrote:
         | Any explicit examples?
         | 
         | Starting to explore scheme more and would be interested in some
         | good pointers
        
       | bikingbismuth wrote:
       | When people ask this question about Python codebases, I always
       | recommend the Shodan Python client -
       | https://github.com/achillean/shodan-python
       | 
       | It is easy to read and has taught me some neat Python-isms.
        
       | artpar wrote:
       | https://medium.com/@012parth/what-source-code-is-worth-study...
        
       | morelandjs wrote:
       | Prefect workflow orchestrator:
       | https://github.com/PrefectHQ/prefect
        
       | maxehmookau wrote:
       | GitLab is an excellent example of a large, complex Rails
       | codebase: https://gitlab.com/gitlab-org/gitlab/
        
       | jmkni wrote:
       | Noda Time is very clean/well written IMO ->
       | https://github.com/nodatime/nodatime
        
       ___________________________________________________________________
       (page generated 2022-03-21 23:02 UTC)