[HN Gopher] What we learn from twitch source code leak
       ___________________________________________________________________
        
       What we learn from twitch source code leak
        
       Author : everlastingbits
       Score  : 32 points
       Date   : 2021-10-10 19:47 UTC (3 hours ago)
        
 (HTM) web link (everlastingbits.com)
 (TXT) w3m dump (everlastingbits.com)
        
       | thdc wrote:
       | At least for me, I never expected production code to be this bad
       | until I spent a few years working professionally.
       | 
       | One company I worked at had poor code quality, but I chalked it
       | up to poor engineering culture - a lot of outsourced crap code
       | and no time dedicated to go back to clean it up.
       | 
       | Another company also had a code base with terrible quality but I
       | figured it was because it was a startup and they were rushing to
       | get features out.
       | 
       | None of the companies I've worked for are big ones, and I've
       | always wondered if these levels of quality were the norm, and
       | leaks like this really give me insight into what would otherwise
       | be private code.
       | 
       | Of course, Twitch was also a startup so maybe this was a hack
       | that made it into production as the author notes, although I
       | don't imagine a better solution would've taken much more effort
       | to implement.
       | 
       | Well you could also argue that I'm the common denominator here
       | and the code isn't poor per se, it's just that my expectations
       | were too high when coming into the professional scene - which is
       | what I'm leaning towards currently - I don't really know what to
       | think.
        
         | everlastingbits wrote:
         | I'm also super interested in what code looks like in elite
         | organizations! At my previous workplace I felt exactly the same
         | as you. Luckily for me, there are many seniors with years of
         | experience under their belts where I'm currently working.
        
           | mftb wrote:
           | I think finding mentors is a great idea, but one thing you
           | and the parent comment are missing is: an "elite
           | organization" in what arena? Of what sort? For a truly
           | different software development process, with totally
           | different requirements, check out stuff like this -
           | https://www.fastcompany.com/28121/they-write-right-stuff.
           | There are other similar write-ups from different kinds of
           | organizations out there as well.
        
         | jandrewrogers wrote:
         | Beautifully crafted production code exists but the necessary
         | conditions are almost never consistently available -- it
         | requires a lot of time and energy to fight code entropy. Even
         | most people that write software don't care that much; like the
         | business, anything that meets the acceptance criteria is
         | usually considered "good enough". And most professional,
         | production code is the accumulated cruft of that reality.
         | 
         | In the rare cases where you do see elegant, clean, high-quality
         | production code, it is usually because 1) only a very small
         | number of people are writing that code, 2) there is a strong
         | cultural norm of rigorous code hygiene among those people, 3)
         | they do not have prohibitive time pressure to ship, and 4) the
         | requirement to maintain backward compatibility across versions
         | of the code is weak. In practice, these conditions are very
         | rare. I've mostly seen it in what is essentially hobby code,
         | where the craft was a large part of the objective (and hobby
         | code that becomes production code tends to quickly take on the
         | characteristics of other production code).
         | 
         | There simply aren't enough people that care about code quality
         | for its own sake, and the economics of maintaining very high-
         | quality code rarely makes sense in practice.
        
       | thinkingkong wrote:
       | What we can really learn (or remind ourselves) is that code
       | "quality" is rarely representative of company value.
        
         | Sebb767 wrote:
         | People pay you for your product, not what's running under the
         | hood. As long as you can sustain feature development and/or
         | enough pull, code quality is absolutely irrelevant.
         | 
         | Also, any large codebase will develop some warts due to
         | circumstances you can't see when simply reading the code.
        
           | bserge wrote:
           | Huh, maybe the thinking that everything has to be perfect is
           | rooted in products where people _do_ pay for what's under
           | the hood? I.e. literally cars.
        
             | TeMPOraL wrote:
             | Cars are just a ruthlessly optimized collection of parts
             | that, individually, few people understand. Not unlike
             | software. Unless you're a specialist, you wouldn't know
             | "bad insides" of a car from "good insides" if you took a
             | look under your car's hood.
             | 
             | The way I see it, programmer perfectionism comes from
             | learning - the projects you do for yourself, early on, are
             | small enough that you can hit perfect trade-offs on your
             | limited needs, and you aren't under time pressure. You
             | quickly learn you _can_ achieve near-perfection - a lesson
             | which stops applying when working with any nontrivial
             | codebase at work.
        
             | burnt_toast wrote:
             | There's plenty of cars out there that mechanics think are
             | designed like junk yet drivers (users) still buy / use
             | them.
             | 
             | (Not sure where I was going with this)
        
               | bserge wrote:
               | Even then, buyers care about performance.
        
         | ferdowsi wrote:
         | "Customers aren't paying you for beautiful code" was how this
         | was phrased to me by a manager.
         | 
         | If the Twitch codebase hadn't leaked, how many people would
         | have known that their naughty-word-blocker was implemented in a
         | less than perfect fashion?
        
         | dijit wrote:
         | I hate that you're being downvoted for this point.
         | 
         | This snippet is not the biggest problem with the Twitch leak. I
         | spooled through the terraform and a lot of it is awful. But it
         | obviously works for them.
         | 
         | I had an overwhelming realisation that code quality really
         | doesn't matter. I spend a lot of time making my code clean and
         | easy to understand, and that might improve _my_ ergonomics. But
         | realistically my company is not bigger than Twitch, and if they
         | can operate without major problems with this code then
         | obviously it doesn't matter.
        
       | i_like_apis wrote:
       | Sometimes what works is the right choice. Duct tape doesn't look
       | nice but it gets the job done and fast.
        
       | planb wrote:
       | Instead of arguing if this implementation is good or bad, can we
       | please talk about the elephant in the room? What exactly is this
       | snippet supposed to do? Filter out "bad" usernames by exact
       | match? How did they even come up with this list? There are infinite
       | possibilities to spell profanity or insulting phrases, so aren't
       | they fighting windmills here?
        
       | [deleted]
        
       | platz wrote:
       | > Is that all there is to it? A txt to database solution?
       | 
       | > This whole check should not even be in SQL.
       | 
       | > Similarly, for our original problem we should have a function
       | or class which purpose is to check whether a word matches against
       | many regex phrases.
       | 
       | Interesting conclusion. A very confusingly worded article.
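
The quoted suggestion (a function or class that checks a word against many regex phrases) could look roughly like the sketch below. This is only an illustration: the pattern list and the name `matches_any` are hypothetical and are not taken from the article or the leaked code.

```python
import re

# Hypothetical patterns, for illustration only; not from the leaked code.
BLOCKED_PATTERNS = [
    r"^admin\d*$",   # "admin", "admin1", "admin42", ...
    r"sp[a4]m",      # "spam" or "sp4m" anywhere in the word
]

# Compile once at import time so each check does not recompile the patterns.
_COMPILED = [re.compile(p, re.IGNORECASE) for p in BLOCKED_PATTERNS]


def matches_any(word: str) -> bool:
    """Return True if any blocked pattern is found in `word`."""
    return any(rx.search(word) for rx in _COMPILED)
```

Keeping the patterns in one list and the check in one function means a new rule is a one-line change, wherever the list itself ends up being stored.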
        
         | everlastingbits wrote:
         | Thanks for pointing that out! English is not my first language.
         | I changed all of the above into something better, I hope. :D
        
       | politelemon wrote:
       | > It's stored as a list in a txt file.
       | 
       | It is not. You've misunderstood that code snippet.
        
       | 29athrowaway wrote:
       | What you should learn about that leak is that:
       | 
       | - Bad code lives in the dark. Code that is visible by many others
       | never looks like this.
       | 
       | - Shame is good to some extent. It keeps people accountable.
       | 
       | - When closing tickets is more important than actually making
       | real improvements, code looks like this.
        
       | trjordan wrote:
       | A database is the wrong solution to this problem. The problem is
       | spam detection and blocking.
       | 
       | This solution is simple and good. A developer on the receiving
       | end of an "add an item to this list" bug report has a clear,
       | well-trodden way to do it: update the text file. They have a modern
       | editor that handles big files just fine. They frequently deploy
       | new code to production, including testing it.
       | 
       | Moving to a DB means that it needs an interface in production,
       | because presumably Twitch doesn't let their devs run random
       | commands on production boxes. It means ACLs and a sync between
       | dev and prod. It means another moving piece when Twitch spins up
       | a disaster recovery test.
       | 
       | The full-powered, industry-standard solution is an AI-based spam
       | detector with some sort of lag on blocking (e.g. shadow banning).
       | This requires inputs to train on, such as reports from Twitch
       | moderators, which might need oversight. It requires ML engineers
       | and quality heuristics. This stuff is getting cheaper, but it's a
       | hell of a lot more expensive than an ugly text file.
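
A minimal sketch of the "ugly text file" approach described in this comment, assuming one blocked name per line and an exact, case-insensitive match. The function names, the `#` comment convention, and the file layout are all assumptions for illustration; nothing here is taken from the Twitch code.

```python
from pathlib import Path
from typing import FrozenSet, Iterable


def parse_blocklist(lines: Iterable[str]) -> FrozenSet[str]:
    """Build a lowercase set of entries, skipping blank lines and # comments."""
    entries = set()
    for line in lines:
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            entries.add(entry)
    return frozenset(entries)


def load_blocklist(path: str) -> FrozenSet[str]:
    """Read the blocklist from a text file (hypothetical path and format)."""
    with Path(path).open(encoding="utf-8") as f:
        return parse_blocklist(f)


def is_blocked(username: str, blocklist: FrozenSet[str]) -> bool:
    """Exact match only: spelling variants of a blocked name are not caught."""
    return username.strip().lower() in blocklist
```

Adding an entry is then a one-line edit to the text file that ships with the next deploy. The trade-off is that exact matching misses every spelling variant that has not been listed.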
        
       | outime wrote:
       | This snippet has been constantly criticized and somehow it makes
       | me think that the people who loudly complain about it have never
       | worked in any kind of big company - which is fine but it shows a
       | lack of empathy, experience or both. You can go to any big
       | company around the world and you'll find duct tape because you
       | know what, if it works, it isn't terrible to use/update and
       | doesn't leak millions in lost revenue then there's no valid
       | reason to spend dev time on that.
       | 
       | This most likely was a temporary solution that grew over time and
       | was left like that; they just kept adding new combinations. It
       | works. If you want to add a new combination you just add it and
       | run a migration (or similar, I've not worked at Twitch but I
       | assume it's not the most difficult thing in the world if it's
       | still like this).
       | 
       | I understand that this can feel like an itch that you really need
       | to scratch (I also feel it) but if you literally see the source
       | code of any big company you'll find stuff that anyone can
       | criticize from their chair with little effort. Let's now go build
       | a tremendously successful platform like Twitch without cutting a
       | single corner ever and see how that goes.
        
         | PragmaticPulp wrote:
         | > because you know what, if it works, it isn't terrible to
         | use/update and doesn't leak millions in lost revenue then
         | there's no valid reason to spend dev time on that.
         | 
         | Really bad code grinds progress to a halt when developers have
         | to spend all of their time fixing tech debt or working around
         | fragile codebases to get anything done. However, I don't see
         | anything in this code that would match that description. In
         | fact, anyone can take one look at this and know exactly what it
         | does and exactly how to modify it.
         | 
         | Counterintuitively, perfectly good code also grinds progress to
         | a halt if developers become too focused on doing things the
         | "right" way instead of shipping reasonable code that works.
         | Developers who get lost pursuing a platonic ideal of the
         | perfect code will be perpetually disappointed inside of real-
         | world constraints. Perfect is the enemy of good.
         | 
         | It's definitely not fair to criticize a single snippet
         | extracted from an entire company's code base. It's turning into
         | a cheap way to dunk on a company from the sidelines while
         | ignoring that the company has done a good job of scaling a
         | video delivery platform and community to a massive number of
         | users.
        
           | UncleMeat wrote:
           | This really depends. There is tech debt and there is tech
           | debt. Sometimes you've got bad code that is still well
           | isolated so the pain and suffering it causes is understood
           | and contained. Sometimes you've got bad code that is poorly
           | isolated and it causes unexpected and strange problems all
           | the time.
           | 
           | This code doesn't feel like the latter.
        
         | AmericanChopper wrote:
         | I thought the funny bit was that we've all done something like
         | that before...
        
         | jalino23 wrote:
         | This is true; the same people who complained about this are
         | probably the ones with the least experience.
        
         | vinay427 wrote:
         | To be fair, at least as of this writing, TFA does end with this
         | which seems to align with your view:
         | 
         | > The programmers who made that probably know all of that, or
         | at least that their implementation isn't ideal. They made that
         | not because they are stupid or unprofessional, rather because
         | they are employees working under pressure coming from their
         | managers, bosses, and their deadlines. What was suggested as a
         | "temporary" solution, turned into the permanent.
        
       | IshKebab wrote:
       | Uhm wasn't this code part of a script to gather training data for
       | AI? If so it would be run offline and not as part of the actual
       | site. I think it's perfectly reasonable for a one-off script to
       | be a bit hacky.
        
       | rad_gruchalski wrote:
       | NOTHING. There, I answered it for you.
       | 
       | Wow, people have problems. Does it work? It does. How often is
       | this executed? On sign up and maybe nick change. Is it
       | encapsulated? Yes, it is. Move along, nothing to see here.
        
       ___________________________________________________________________
       (page generated 2021-10-10 23:01 UTC)