[HN Gopher] Addition with flamethrowers - why game devs don't un...
___________________________________________________________________
Addition with flamethrowers - why game devs don't unit test
Author : jgalecki
Score : 37 points
Date : 2024-05-24 11:25 UTC (1 days ago)
(HTM) web link (www.pixelatedplaygrounds.com)
(TXT) w3m dump (www.pixelatedplaygrounds.com)
| shaftway wrote:
| I don't buy this argument. Most game developers I know have said
| that unit tests are a waste of time so they never use them, but
| they're struggling with making changes to utility code and making
| sure that it doesn't do the wrong thing. Y'know, what unit tests
| are for.
|
| I think the key here is that the perceived cost / benefit ratio
| is too high. It's the perception that drives their behavior
| though. I'm in a company now that has zero unit tests, because
| they just don't see the value in it (and in their case they may
| be right for a whole slew of reasons).
|
| Also, remember that games are not very long-lived pieces of
| software. You build it, release it, maybe patch it, and move on.
| If the game moves to version 2 then you're probably going to re-
| write most of the game from scratch. When you support software
| for a decade then the code is what's valuable, and unit tests
| keep institutional knowledge about the code. But with disposable
| software like games, the mechanics of the game and IP are what's
| valuable.
|
| Why would you write a unit test for something you know you're
| going to throw away in 6 months?
| Atotalnoob wrote:
| I am curious as to why your current company does not have unit
| tests. Do you mind sharing?
| shaftway wrote:
| We produce a library that gets included in software made by
| our clients, and we have several thousand clients. The uptake
| on new releases is low (most of the clients believe in "if it
| ain't broke, don't fix it"). So every release has the
| potential to live in the wild and need support for a long
| time.
|
| We're also in an industry with a ton of competitors.
|
| On top of that, the company was founded by some very junior
| engineers. For most of them this was their first or second
| job out of college. Literally every anti-pattern is in our
| codebase, and a lot of them are considered best practices by
| them. Unit tests were perceived as a cost with little
| benefit, so none were written. New engineers were almost
| always new grads to save on money.
|
| These facts combined make for an interesting environment.
|
| For starters, leadership is afraid to ship new code, or even
| refactor existing code. Partially because nobody knows how it
| works, partially because they don't have unit tests to verify
| that things are going well. All new code has to be gated by
| feature flags (there's an experiment right now to switch from
| try-finally to try-with-resources). If there isn't a business
| reason to add code, it gets rejected (I had a rejected PR
| that removed a "synchronized" block from around "return
| boolValue;"). And it's hard to say they're wrong. If we push
| out a bad release, there's a very real chance that our
| customers will pack up and migrate to one of our competitors.
| Why risk it?
|
| And the team's experience level plays a role too. With so
| many junior engineers and so much coding skill in-breeding,
| "best practices" have become pretty painful. Code is written
| without an eye towards future maintainability, and the
| classes are a gas factory mixed with a god object. It's not
| uncommon to trace a series of calls through a dozen classes,
| looping back to classes that you've already looked at. And
| trying to isolate chunks of the code is difficult. I recently
| tried to isolate 6 classes and I ended up with an interface
| that used 67 methods from the god object, ranging from
| logging, to thread management, to http calls, to state
| manipulation.
|
| And because nobody else on the team has significant
| experience elsewhere, nobody else really sees the value of
| unit tests. They've all been brought up in this environment
| where unit tests are not mentioned, and so it has ingrained
| this idea that they're useless.
|
| So the question is how do you fix this and move forward?
|
| Ideally we'd start by refactoring a couple of these classes
| so that they could be isolated and tested. While management
| doesn't see significant value in unit tests, they're not
| strictly against them, but they are against refactoring code.
| So we can't really add unit tests on the risky code. The only
| places that you can really add them without pushback would be
| in the simplest utility classes, which would benefit from
| them the least, and in doing so prove to management that unit
| tests aren't really valuable. And I mean the SIMPLEST utility
| classes. Most of our utility classes require the god object
| so that we can log and get feature flags.
|
| I say we take off and nuke the entire site from orbit (start
| over from scratch with stronger principles). It's the only
| way to be sure. But there's no way I'm convincing management
| to let the entire dev team have the year they'd need to do
| that with feature parity, and leadership would only see it as
| a massive number of bugs to fix.
|
| In the meantime developer velocity is slowing, but management
| seems to see that as a good thing. Slower development
| translates into more stable code in their minds. And the
| company makes enough that it pays well and can't figure out
| what to do with the excess money. So nobody really sees a
| problem. Our recruiters actually make this a selling point,
| making fun of other companies that say their code is "well
| organized".
| Atotalnoob wrote:
| Thank you for the write up.
|
| That seems like a bad scenario with bad technical
| management. I am wondering if you have considered setting
| aside unit tests and thinking about end-to-end tests
| instead. This might be easier for anti-testing people to buy
| into because it's directly ensuring your end users get the
| desired outcomes.
|
| It doesn't matter what terrible practices you have
| inside your library if the output is correct...
|
| If you input 1+1, and it outputs 5, it will be obvious how
| this can be an issue.
|
| What this will enable you to do is get some quick wins and
| make refactoring safer.
|
| If management still says no, I see 3 major choices.
|
| 1. Quit
|
| 2. Write your tests and keep them to yourself
|
| 3. Mind control
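Atotalnoob's "verify the output" framing fits in a few lines (illustrative Python; `add` is a hypothetical stand-in for the library's real public entry point, which isn't named in the thread):

```python
# End-to-end style check: exercise only the public surface and pin its
# output; the internals can be as terrible as they like.
def add(a, b):
    # Stand-in for the library's actual public entry point.
    return a + b

def test_public_contract():
    # "If you input 1+1, and it outputs 5, it will be obvious."
    assert add(1, 1) == 2
    assert add(-3, 3) == 0

test_public_contract()
```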
| other_herbert wrote:
| As a corollary to 2, management tends to love graphs...
| whatever you're using to build should have a plugin that
| could show unit test success counts and generate even a
| simple line graph... that alone might be enough incentive
| to add more testing
| Atotalnoob wrote:
| I wouldn't use the term "unit test" if they are negative
| on the concept.
|
| Edit: in fact, don't say test at all. Talk about
| verification of the output
| shaftway wrote:
| We do have an integration test that runs just before
| releases. I've never seen it fail, even when something
| was obviously broken, so I question the utility of it.
| There's a specific person in charge of maintaining it.
|
| I've opted for option 4: continue to write code the way
| they want it written and keep cashing my paychecks. In
| the meantime there are tons of other improvements that
| I'm working on, some of which have a more direct impact
| on business revenue (which has a direct impact on my
| personal revenue).
| phito wrote:
| How do you stay sane working with clowns?
| andybak wrote:
| Many of the things he just described are a rational
| response to historical circumstance. It's fine to say
| "we're in a bad place" but that's not the same as saying
| "we're currently making bad decisions".
| HideousKojima wrote:
| I get paid good money to work with clowns
| feoren wrote:
| > Also, remember that games are not very long-lived pieces of
| software. You build it, release it, maybe patch it, and move
| on.
|
| This was true a couple decades ago. Nowadays many games are
| cash cows for decades. Path of Exile was released in 2013,
| Minecraft in 2011, and World of Warcraft in 2004, and all of
| those continue to receive regular updates (and have over the
| course of their lives) and still make plenty of money today.
| Dwarf Fortress has been in continual development since 2002!
| (Although probably not your ideal cash-flow model.)
|
| Or you have the EA Sports model where you use the same "engine"
| and just re-skin some things and re-release the same game over
| and over. There has been a new "Football Manager" game every
| year since 2005 -- do you really think they throw out all their
| code and start over every year?
| shaftway wrote:
| I maintain that the majority of games are still disposable,
| despite the occasional subscription model or long-lived hit
| that pops up. Remember that most games aren't made by AAA
| studios.
|
| Wasn't Minecraft completely rewritten from scratch in Java
| after a few years?
|
| And the EA one, like you said, it's just model updates. Very
| few gameplay mechanics get more than a simple tweak. Just
| recompile with the new models. You don't need unit tests if
| the code never changes.
| mark38848 wrote:
| I think Minecraft was originally written in Java and
| rewritten in a good programming language (i.e. not Java).
| grumpyprole wrote:
| Whether or not one thinks C++ is a "good" language, I
| always thought that (original) Minecraft busted the myth
| that blockbuster games had to be written in C++.
| CuriousSkeptic wrote:
| Being written in Java was probably instrumental in
| enabling the huge modding community around Minecraft.
| Which in turn was probably in large part responsible for
| its success.
| cobalt wrote:
| the original minecraft is in java, it's probably gone
| through a lot of code transformation. The version you're
| thinking of is the microsoft version, rewritten in c++
| oefrha wrote:
| You can add rigor to your decade-plus cash cow later, once
| it's clear that you've hit the jackpot.
| 2muchcoffeeman wrote:
| I still play games that came out a couple decades ago...
| tmtvl wrote:
| Let me guess... Super Metroid? Chrono Trigger? Final
| Fantasy VI? Ultima Underworld? Symphony of the Night?
|
| There were a few decent games released in the '80s and
| '90s.
| abdullahkhalids wrote:
| Valve became serious about software quality in Dota 2 around
| 2017 - about 7 years after launch. Before that, game updates
| were accompanied by lots of bugs that would take weeks to
| fix. These days there are still tons of bugs, but it's much better
| than before. They just released one of the biggest updates in
| the game's history this week, and there are hardly any bugs
| being reported.
|
| I am pretty sure there is some sort of automated testing
| happening that is catching these bugs before release.
| zubspace wrote:
| Reminds me of an article about the testing infrastructure of
| League of Legends [1] back in 2016: 5500 tests per build in
| 1 to 2 hours.
|
| Games are extremely hard to test. For me they fall into the
| same category as GUI testing frameworks, which imho are
| extremely annoying and brittle. Except that games are
| comparable to a user interface consisting of many buttons
| which you can short and long press and drag around while at
| the same time other bots are pressing the same buttons,
| sharing the same state influenced by a physics engine.
|
| How do you test such a ball of mud, which is also constantly
| changed by devs trying to follow the fun? Yes, you can
| unit-test individual, reusable parts. But integration tests,
| which require large, time sensitive modules, all strapped
| together and running at the same time? It's mind-bogglingly
| hard.
|
| Moreover, if you're in the conceptual phase of development,
| prototyping an idea, tests make no sense. The requirements
| change all the time and complex tests hold you back. The
| funny thing is that game development stays in that phase
| most of the time. And when the game is done, you start a new
| one with a completely different set of requirements.
|
| There are exceptions, like League of Legends. The game left
| the conceptual phase many years ago and its rules are set in
| stone. And a game which runs successfully for that long is
| super rare.
|
| [1] https://technology.riotgames.com/news/automated-testing-
| leag...
| abdullahkhalids wrote:
| I doubt Dota 2 devs are writing code like this to test. The
| game is far too complicated, even more so than League, and
| changes too much over the years, for this to be viable.
|
| Dota 2 and OpenAI had a collaboration around 2018, and
| during this time the Dota 2 bots system was reworked
| completely. They already can generate videos of every spell
| in action [1], and I would assume this is done by asking AI
| bots to demonstrate the spell. My guess is that before
| pushing out an update, a human looks at these videos and
| other more complex interaction videos for every major
| change, along with relevant numbers (damage, healing,
| movement speed), and sees if everything makes sense.
|
| I think this because, a lot of times recently, changes to
| one hero have broken an un-updated hero that had some
| backend similarity, and the patch was released with the bug.
|
| Then again, there is no public info, so all the above are
| wild speculations.
|
| [1] example https://www.dota2.com/hero/treantprotector
| duskwuff wrote:
| > They already can generate videos of every spell in
| action [1]
|
| I'm fairly certain those videos are all handmade. (Yes,
| all 500+ of them.) Notice that the videos for each hero
| are recorded in different locations on the map, and the
| "victim" hero isn't always the same.
| rhdunn wrote:
| I recall some Minecraft tests being saved worlds with
| redstone logic that will light a beacon green if it is
| working or red if not. That's useful for games like that.
|
| For games like Starcraft 2 with replay functionality, you
| could probably record/use several matches and test that the
| behaviour matches the recorded behaviour. If you can make
| your game have a replay feature you can make use of this,
| even if you don't ship that replay code.
|
| For things like CYOA type games or decision trees, you
| could have a logging mechanism that prints out the choices,
| player stats, hidden stats, etc. and then have a way to run
| through the decisions, then check the actual log output
| against the expected output. -- I've done something similar
| when writing parsers by printing out the parse tree (for
| AST parser APIs) or the parse events (for reader/SAX parser
| APIs).
|
| I'm sure there are other techniques for testing other parts
| of the system. For example, you could test the rendering by
| saving the render to an image and comparing it against an
| expected image. IIRC, Firefox does something similar for
| some systems like the SVG renderer and the HTML paint code.
|
| Several of these features (replay, screenshots) are useful
| to have in the main game.
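The decision-tree logging idea can be sketched as a golden-log test (illustrative Python; the story engine and its rules are invented for the example):

```python
# Golden-log testing: replay a fixed list of choices through the game
# logic, log every state change, and diff against a saved expected log.

def run_story(choices):
    """Hypothetical CYOA engine: each choice adjusts the player's stats."""
    stats = {"hp": 10, "gold": 0}
    log = []
    for choice in choices:
        if choice == "fight":
            stats["hp"] -= 3
            stats["gold"] += 5
        elif choice == "flee":
            stats["hp"] -= 1
        log.append(f"{choice} -> hp={stats['hp']} gold={stats['gold']}")
    return log

# The "expected output" a human blessed once; any rule change that
# alters the trace shows up as a readable diff against this.
EXPECTED_LOG = [
    "fight -> hp=7 gold=5",
    "flee -> hp=6 gold=5",
    "fight -> hp=3 gold=10",
]

assert run_story(["fight", "flee", "fight"]) == EXPECTED_LOG
```

The same shape works for the parser case mentioned: print the parse tree or event stream, compare to a blessed copy.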
| zubspace wrote:
| You're right about the parts which are mostly state
| machines. They have a defined input and output. Tests are
| straightforward to implement and adjust.
|
| But recording and replaying matches? Taking screenshots
| and comparing the output? Just think about it: If you
| have recorded a match and change the hitpoints of a
| single creature, the test could possibly fail. And then?
| Re-record the match?
|
| The same applies to screenshots: What happens if models,
| sprites or colors change?
|
| In my experience, tests like this are annoying, because:
|
| 1) They take a long time to create and adjust/recreate.
|
| 2) They fail for minor reasons.
|
| 3) It takes time to understand what such tests even
| measure, if someone else made them.
|
| 4) You need a large, self-made framework to support such
| tests.
|
| 5) They take a long time to run, because they are
| time-dependent.
|
| 6) They hinder you from making large changes.
|
| 7) It's cheaper to have some low-wage game testers play
| your game. Or better, make the game early access and let
| 1000s of players test your game for free, while even
| making money off them.
| vlovich123 wrote:
| Yes, when you are trying to intentionally change the
| output, you simply regenerate the gold file to be used as
| reference (and yes, it should be easy). It's brittle for
| sure but it does catch unintentional changes and should
| be used where relevant (if sparingly). There are
| definitely existing frameworks that do this (eg Jest
| calls this snapshot testing and has tooling to make it
| easy).
|
| I'm sorry your experiences with this kind of stuff have
| been bad. I've generally had good experiences in the
| machine learning space where we used it judiciously where
| appropriate but didn't overdo it.
|
| I don't see how it can ever hinder you though - you can
| always choose to say "I don't care that the output has
| changed dramatically - it's the new ground truth" as long
| as you communicate that that's what's happening in your
| commit. What it doesn't let you do is have output that's
| different every time you run it, but that's generally a
| positive (randomness should be intentionally injected
| deterministically).
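A gold-file harness of the kind described is small to build even without Jest (illustrative Python; the snapshot path and damage data are made up):

```python
# Minimal snapshot/gold-file check: compare a value against a stored
# reference, and rewrite the reference only when the change is intentional.
import json
import os
import tempfile

def check_snapshot(path, value, update=False):
    """True if value matches the gold file; update=True blesses a new one."""
    if update or not os.path.exists(path):
        with open(path, "w") as f:
            json.dump(value, f, indent=2)
        return True  # new ground truth recorded
    with open(path) as f:
        return json.load(f) == value

snap = os.path.join(tempfile.mkdtemp(), "damage_table.json")
assert check_snapshot(snap, {"sword": 12, "bow": 8})       # first run records
assert check_snapshot(snap, {"sword": 12, "bow": 8})       # unchanged: passes
assert not check_snapshot(snap, {"sword": 13, "bow": 8})   # drift: caught
assert check_snapshot(snap, {"sword": 13, "bow": 8}, update=True)  # re-bless
```

The `update` flag is the whole ergonomic argument: regenerating the reference after an intentional change should be one switch, not a re-recording session.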
| treflop wrote:
| I've seen people slog through untested code where they fear to
| make a change but I've also seen people slog through code with
| too much test coverage where the tests go through constant
| churn.
|
| I don't understand why people don't just add one test even if
| the codebase otherwise has zero tests if they're so scared of
| one area and I don't get why people keep adding excessive
| coverage if it's wasting their time.
|
| It's like people pick a stance and then stick with it forever
| when I couldn't care less how I've been doing something for 10
| years if today you showed me a better way.
| smrq wrote:
| This is the way. My work codebase has probably 5% unit test
| coverage -- it's frontend and a lot of it isn't sensible to
| unit test -- but I'm quite happy to have the tests we do. If
| it's nontrivial logic, just test it. If it isn't (it's
| trivial, it's aesthetic, whatever your reason)... just don't.
| wesselbindt wrote:
| >too much test coverage where the tests go through constant
| churn
|
| This doesn't sound so much as too much coverage but rather
| like having your automated tests be coupled to implementation
| details. This has a multitude of possible causes, for
| example the tests being too granular (prefer testing at the
| boundary of your system). I've worked in codebases where
| test-implementation detail coupling was taken seriously, and
| in those I've rarely had to write a commit message like "fix
| tests", and all that without losing coverage.
| kbolino wrote:
| It feels like there are two levels of test writing
| proficiency. The first is writing the tests that have high
| benefit and low cost: e.g. pure functions with
| comprehensive tabular tests, simple method chains that have
| well defined sequential behavior and few dependencies, high
| value regression tests against detailed bug reports, etc.
| IMO it's harder to argue against writing these tests than
| to argue for writing them.
|
| Then there's the second level of proficiency, related to
| what you're discussing with "test-implementation detail
| coupling". This is the domain of high test coverage,
| repeatable end-to-end tests, automated QA, etc. I've always
| struggled with this next level and I've yet to work in any
| environment where it was done effectively (if at all). It's
| also harder to argue for this kind of testing because the
| tests often end up brittle and false negatives drown out
| the benefits.
|
| Moreover, most of the discourse centers around the first
| level of proficiency only and it's much harder to find
| digestible advice for achieving the second.
| bluefirebrand wrote:
| > This doesn't sound so much as too much coverage but
| rather like having your automated tests be coupled to
| implementation details
|
| Depending on how high coverage you are aiming for, I find
| it hard to imagine a way to achieve it without inevitably
| tying the tests to implementation details
| armchairhacker wrote:
| Even if the tests aren't coupled to implementation details,
| in most projects the specification itself goes through many
| changes. Furthermore, as the implementation is being
| changed, it stops depending on some lower-level helper code
| and requires new code with a different purpose; the tests
| in the old code turn out to be largely (albeit not
| entirely) a waste of effort.
|
| Changing specifications and code which turns out to be
| unnecessary aren't ideal, but I believe they're inevitable
| to some extent (unless the project is a narrow re-
| implementation of something that already exists). There are
| questions like "how will people use this product?" and
| "what will they like/dislike about it?" that are crucial to
| the specification yet can't be answered or even predicted
| very well until there's already an MVP. And you can't know
| exactly what helper classes and functions you will use to
| implement something until you have the working
| implementation.
|
| Of course, that doesn't mean all tests are wasted effort;
| development will be slower if the developers have to spend
| more time debugging, due to not knowing where bugs
| originate from, due to not having tests. There's a middle
| ground, where you have tests to catch probable and/or
| tricky bugs, and tests for code unlikely to be made
| redundant, but don't spend too long on unnecessary tests
| for unnecessary code.
| jrockway wrote:
| Testing is a continuum. I don't write a test for every change.
| Sometimes I spend a week writing tests for a simple change.
|
| I will say that I've never said "I wish I didn't write a test
| for that". I have also never said, "your PR is fine, but please
| delete that test, it's useless".
|
| I throw away a lot of code. I still test stuff I expect to
| throw away. That's because it probably needs to run once before
| I throw it away, and I can't start throwing it away until it
| works :/
|
| What it comes down to is what else you have to spend your time
| on. Sometimes you need to experiment with a feature; get it out
| to customers, and if it's buggy and rough around the edges,
| it's OK, because you were just trying out the idea. But
| sometimes that's not what you want; whatever time you spend on
| support back and forth finding a bug would have been better
| spent not doing that. The customer needed something rock solid,
| not an experiment. Test that so they don't have to.
|
| There are no rules. "Write a test for every change" is just as
| invalid and unworkable as "Never write any tests". It's a
| spectrum, and each change is going to land somewhere different.
| If you're unsure, ask a coworker. I have been testing stuff for
| 20+ years, and I usually guess OK (that is when I take a
| shortcut and don't test as much as I should, it's rarely the
| thing that caused the production outage), but a guess is just
| that, a guess. Solicit opinions.
| withinboredom wrote:
| Also, non-testable code is often faster (as in CPU time).
| whatasaas wrote:
| Seems like an excuse that might be fine for small indie teams for
| a while. The blog certainly blurs the lines between unit and
| functional tests. In the end, even modest code coverage can pay
| off. Tests help with code review, understanding the codebase, and
| can provide an easy map for debugging. But if you're in an
| environment where everyone is constantly demanding changes and
| only testing the happy path, then good luck.
| fwlr wrote:
| Arguments against testing tend to fall prey to the von Neumann
| Objection: they insist there is something tests can't catch, and
| then they tell you precisely what it is that tests can't catch...
| so you can always imagine writing tests for that specific thing.
|
| E.g. this article uses an example of removing the number 5,
| causing the developer to have to implement a base-9 numbering
| system. Unit tests that confirm this custom base number system is
| working as expected would be extremely reassuring to have.
| Alternatively, you could keep the base-10 system everyone is
| familiar with, and just have logic to eliminate or transform any
| 5s. This would normally be far too risky, but high coverage
| testing could provide strong enough assurance to trust that your
| "patched base-10" isn't letting any 5s through.
|
| The same is true for the other examples - unit testing feels like
| the first thing I'd reach for when told about flaming numbers.
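For the concrete example, the "remove the number 5" rule is small enough to pin down with unit tests (a sketch, assuming the article's rule amounts to base-9 counting rendered without the digit 5; all names invented):

```python
# "No fives" numbering: count in base 9 but render with the digit set
# 0-4,6-9, so 5 never appears anywhere: 1,2,3,4,6,7,8,9,10,11,...

DIGITS = "012346789"  # base-9 digits with '5' removed

def no_five(n):
    """Render a non-negative integer in the skip-5 numbering."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 9)
        out.append(DIGITS[r])
    return "".join(reversed(out))

# The reassuring tests: the sequence looks right, and the property
# "no 5 ever appears" holds far past the interesting boundaries.
assert [no_five(i) for i in range(1, 11)] == \
    ["1", "2", "3", "4", "6", "7", "8", "9", "10", "11"]
assert all("5" not in no_five(i) for i in range(10_000))
```

The property-style assertion at the end is the one that makes the "patched base-10" alternative trustworthy too: it checks the invariant, not the implementation.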
| magoghm wrote:
| Tests can't catch race conditions in multithreaded code. Now
| that I told you what the tests can't catch, can you imagine
| writing tests for that specific thing?
| a_t48 wrote:
| I've written tests around multithreaded code, but they
| typically catch them in a statistical manner - either running
| a bit of code many times over to try and catch an edge
| condition, or by overloading the system to persuade rarer
| orderings to occur.
|
| There's also
| https://clang.llvm.org/docs/ThreadSafetyAnalysis.html which
| can statically catch some threading issues, though I've not
| used it much myself.
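The statistical approach looks roughly like this (illustrative Python; whether a given race actually trips depends on the interpreter and scheduler, so the sketch reports failures rather than asserting one must occur):

```python
# Statistical race hunting: run a small unsynchronized scenario many
# times under aggressive thread switching and count runs that lost an
# update. One quiet pass proves little; repeated passes build confidence.
import sys
import threading

def racy_sum(iters=20_000):
    """Two threads bump a shared counter with no lock."""
    total = {"n": 0}
    def work():
        for _ in range(iters):
            total["n"] += 1  # read-modify-write: not atomic across threads
    ts = [threading.Thread(target=work) for _ in range(2)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return total["n"]

def hammer(runs=10):
    """Repeat the scenario; return how many runs lost at least one update."""
    sys.setswitchinterval(1e-6)  # force very frequent thread switches
    return sum(1 for _ in range(runs) if racy_sum() != 40_000)

failures = hammer()  # nonzero means the race was caught in the act
```

Shrinking the switch interval is the "overloading the system" trick from the comment: it makes rare orderings common enough to observe.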
| jonex wrote:
| tsan will catch a bunch of potential race conditions for you,
| under the condition that you run it somehow. How to make sure
| it's run? Well, add a test for the relevant code and add it
| to your tsan run in your CI and you'll certainly catch a
| bunch of race conditions over time.
|
| This has saved me a bunch of times when I've been doing work
| in code prone to that kind of issue. Sometimes it
| will just lead to a flaky test, but the investigation of the
| flake will usually find the root cause in the end.
| mistercow wrote:
| I've written tests to do exactly that, by adding carefully
| placed locks that allow the test to control the pace at which
| each thread advances. It's not _fun_ but you can do it.
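The lock-controlled pacing described above can be sketched with a barrier and an event (illustrative Python; the account example is invented, but the pattern of test-owned pause points is the point):

```python
# Deterministic interleaving: the code under test exposes pause points
# the test controls, so the "both read before either writes" lost
# update happens on every run instead of once in a thousand.
import threading

class Account:
    def __init__(self):
        self.balance = 100
        self.both_read = threading.Barrier(2)  # both threads must read first
        self.go = threading.Event()            # test releases the writes

    def withdraw(self, amount):
        read = self.balance           # 1) read
        self.both_read.wait()         # pause until the other thread read too
        self.go.wait()                # pause until the test says "write"
        self.balance = read - amount  # 2) write back a stale value

def test_lost_update_is_reproducible():
    acct = Account()
    t1 = threading.Thread(target=acct.withdraw, args=(30,))
    t2 = threading.Thread(target=acct.withdraw, args=(50,))
    t1.start(); t2.start()  # both read balance=100, then block
    acct.go.set()           # release the writes: one update is lost
    t1.join(); t2.join()
    assert acct.balance in (70, 50)  # never 20: the classic lost update

test_lost_update_is_reproducible()
```

In production code the pause points would be injected no-ops, which is the not-fun part the comment alludes to.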
| magoghm wrote:
| Doesn't inserting locks affect the memory hierarchy
| consistency mechanisms and therefore interfere with
| possible race conditions?
| mistercow wrote:
| That's not a situation I've encountered but "race
| condition" is an extremely broad category.
| distortionfield wrote:
| > Tests can't catch race conditions in multithreaded code.
|
| Citation needed.
|
| > can you imagine
|
| Yes I can, because several languages have tooling built
| specifically for finding those race conditions.
|
| If you built it, you can test it. If you can't test it, you
| don't understand what you built.
| jayd16 wrote:
| The lesson is more about the degree of churn and how game rules
| are not hard rules. A valid base 9 number system is NOT a
| design goal and doing that work can be a waste.
|
| It's like testing that the website landing page is blue. Sure
| you can but breaking that rule is certainly valid and you'll
| end up ripping out a lot of tests that way.
|
| Now, instead of calcifying the designer's whims, testing should
| be focused around things that actually need to make sense, ie
| abstract systems, data structures etc etc.
| fwlr wrote:
| Tests that "calcify the designer's whims" - great way to put
| it - can be quite useful if your job description happens to
| be "carrying out the whims of the designer" (and for many of
| us, it is!)
|
| With high coverage and dry-ish tests, changing the tests
| _first_ and seeing which files start failing can function as
| a substitute for find+replace - by altering the tests to
| reflect the whims, it'll tell you all the places you need to
| change your code to express said whims.
| HideousKojima wrote:
| Nah, my objection to unit testing is that too often it devolves
| into what I call "Testing that the code does what the code
| does." If you find yourself often writing code that also
| requires updating or rewriting unit tests, your tests are
| mostly worthless. Unit tests are best for when you have a
| predefined spec, or you have encountered a specific bug
| previously and make a test to ensure it doesn't reoccur, or you
| want to make sure certain weird edge cases are handled
| correctly. But the obsession with things like 100% unit test
| coverage is a counterproductive waste of time.
| fwlr wrote:
| I partially agree - I would say more specifically "those
| situations are the easiest to write good tests for", ie
| having a predefined spec will strongly guide you towards
| writing good and useful tests.
|
| "Testing that the code does what it does" is of course a
| terrible waste of both the time spent writing those tests,
| and of future time spent writing code under those tests. With
| skill and practice at writing tests, you make that mistake
| less often. Perhaps there's a bit of a self-fulfilling
| prophecy for game developers: due to industry convention,
| they're unfamiliar with writing tests, they try writing
| tests, they end up with a superfluous-yet-restrictive test
| suite, thus proving the wisdom of the industry convention
| against testing.
| kelseydh wrote:
| Most video game bugs are subtle and not things that are easy to
| catch with unit testing because they are dynamic systems with
| many interacting parts. The interaction is where the bugs come
| from.
|
| QA processes do a good job catching the rest.
| lionkor wrote:
| I would bet that a lot of those bugs come from utility code
| that is testable
| Aerroon wrote:
| Perhaps in development, but the stuff that tends to make it
| into the release of games seems to be gameplay-related: NPC
| behaviors not lining up, the developer literally not
| implementing certain stats in the game (looking at you,
| Diablo 4), graphical bugs caused by something not loading or
| loading too slowly, performance issues from something loading
| 1000 copies of itself etc.
| riffraff wrote:
| I am not convinced of the argument that games change a lot.
|
| I do buy the argument that the trade off between effort and value
| is different, but that's because it's harder to unit test user
| interactions than it is to unit test a physics engine.
|
| It's more or less the reason in the early life of the web few did
| end to end testing involving browsers, or unit tested iOS apps in
| the first releases of the iPhone.
| mistercow wrote:
| I've found that the one thing you can always count on engineers
| to do is to dismiss sensible tools from adjacent domains using
| flimsy, post hoc justifications.
|
| _All_ product development involves poorly defined boundaries
| where the product meets the user, where requirements shift
| frequently, and where the burdens of test maintenance have to be
| weighed against the benefits.
|
| You don't throw out all of unit testing because it doesn't work
| well for a subset of your code. You throw out all of unit testing
| because writing tests is annoying, none of your coworkers have
| set it up, and the rest of your industry doesn't do it, so you
| feel justified in not doing it either.
| stouset wrote:
| Right. And _because_ the rest of the industry isn't doing it,
| there's no institutional knowledge of how to do it well. So
| someone tries it, they do a crap job of it out of
| understandable ignorance, and rather than taking forward any
| lessons learned the effort is discarded as a waste of time.
| jayd16 wrote:
| So wait, who's being dismissive of whose practices here?
| mistercow wrote:
| I didn't say engineers are dismissive of other engineers'
| practices. The general pattern is "that makes sense for your
| field, but we can't use it because..." followed by silly
| reasons.
|
| I was guilty of this myself back when I was an indie dev. It
| took me an embarrassingly long time, for example, to admit
| that git wasn't just something teams needed to coordinate,
| and that I should be using it as the sole developer of a
| project.
| eviks wrote:
| Interesting, was the impetus for change some big issue
| you'd run into? Or did a gradual accumulation of knowledge
| about other people's experiences make you reconsider? Or
| something else?
| mistercow wrote:
| It was a long time ago, but it was probably having to
| switch from major feature work to emergency bug fixes
| that finally became painful enough for me to acknowledge
| that manual backups weren't going to cut it.
| jrockway wrote:
| Bugs are kind of the fun part of games. If every subroutine
| worked perfectly, you wouldn't have the chaos of real life. Some
| of players' favorite mechanics are just bugs. (Overwatch example:
| Mercy's super jump, now a legitimate predictable mechanic that
| everyone can do, not just people that read the forums and watch
| YouTube videos about bugs. It started out as a bug, and it was so
| cool and skill-ceiling increasing that now it's just part of the
| game.)
|
| Having said that, sometimes you need unit tests. Overwatch had
| this bug where there is an ultimate ability called "amplification
| matrix" that is a window that you shoot through and the bullets
| do twice as much damage. One patch, that stopped working. This
| kind of issue is pretty easy to miss in play testing; if you're
| hitting headshots, then the bullets are doing the 2x damage they
| would if they were body shots that got properly amplified. It is
| very hard to tell damage numbers while play testing (as evidenced
| by how many patches are "we made character X do 1 more damage per
| bullet", and it smooths things out over the scale of millions of
| matches, but isn't really that noticeable to players unless
| breakpoints change). So for this reason, write an integration
| test where you set up this window thingie, put an enemy behind
| it, fire a bullet at a known point, and require.Equals(damage,
| 200). Ya just do it, so you don't ship the bug, make real people
| lose real MMR, and then have to "git stash" that cool thing
| you're working on today, check out the release branch, and
| uncomment the code that makes the amp matrix actually work. Games
| are just software engineering. Fun software engineering. But it's
| the same shit that your business logic brothers and sisters are
| working on.
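|
| A minimal sketch of that integration test might look like the
| following (all names here - AmpMatrix, fire_through, and the
| damage constants - are invented for illustration, not Blizzard's
| actual code or test API):

```python
# Hypothetical sketch of the integration test described above: set up the
# amplification window, fire a bullet through it, and assert the doubled
# damage. All names and numbers are illustrative.

BASE_DAMAGE = 100
AMP_MULTIPLIER = 2


class Bullet:
    def __init__(self, damage):
        self.damage = damage


class AmpMatrix:
    """A window that doubles the damage of bullets shot through it."""

    def amplify(self, bullet):
        return Bullet(bullet.damage * AMP_MULTIPLIER)


def fire_through(matrix, bullet):
    # The real engine would involve raycasts and collision checks; this
    # models only the damage arithmetic the test needs to pin down.
    return matrix.amplify(bullet)


def test_amp_matrix_doubles_body_shot_damage():
    hit = fire_through(AmpMatrix(), Bullet(BASE_DAMAGE))
    assert hit.damage == 200  # amplified body shot, not a lucky headshot


test_amp_matrix_doubles_body_shot_damage()
```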
|
| (Overwatch also had a really neat bug, that the community
| believes was due to an x == 0 check instead of an x < 0 check. If
| you pressed the right buttons while using Bastion's ultimate, you
| had infinite ammo. Normally it fires 3 air strikes, but if you
| got that counter to decrement twice and skip the == 0 check, then
| you got to do it an infinite number of times. (Well, actually
| 2^32 or 2^64 times. Eventually you'd integer overflow and have
| the chance to hit 0 again.) Anyway, this was absolutely hilarious
| whenever it happened in game. The entire map would turn into a
| bunch of targets for artillery shells, the noise to alert you of
| incoming missiles would play 100 times more than normal, and it
| was total chaos as everyone on your team died. And not even that
| gamebreaking; both teams have the option to run the same
| characters, so you could just do it back to your opponent. Very
| fun, but they fixed the bug quickly.
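|
| The suspected bug class is easy to sketch (this is a toy model of
| the community's theory, not Overwatch's actual code): terminate an
| ability with an exact equality check, and any code path that
| decrements the counter twice in one tick jumps past zero and the
| check never fires.

```python
# Toy model of the suspected bug: an == 0 termination check that a
# double decrement can skip, versus a robust <= 0 check.

def strikes_remaining_buggy(counter, decrement):
    counter -= decrement          # a double decrement makes counter -1
    if counter == 0:              # skipped when counter went negative
        return 0
    return counter

def strikes_remaining_fixed(counter, decrement):
    counter -= decrement
    if counter <= 0:              # robust against overshooting zero
        return 0
    return counter

# Normal path: both behave the same.
assert strikes_remaining_buggy(1, 1) == 0
assert strikes_remaining_fixed(1, 1) == 0

# Double decrement: the buggy check is skipped, and the counter keeps
# the ability alive (until the counter eventually wraps back to zero).
assert strikes_remaining_buggy(1, 2) == -1
assert strikes_remaining_fixed(1, 2) == 0
```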
|
| Follow up follow up: all of these silly bugs are in ultimates,
| which come up the least often of all abilities in the games.
| That's what happens with playtesting. You don't get test coverage
| where you need it. A test you write covers the stuff you're most
| scared about. A careful engineer that likes testing would have
| never shipped these.)
| AndyPa32 wrote:
| > Games are just software engineering. Fun software
| engineering.
|
| I do question the "fun" part. Midnight crunches, unpaid
| overtime and - as far as I have read - some of the worst
| working conditions in all of software engineering. I pass.
| jrockway wrote:
| That is probably true, but you know you suffer for your art
| and all that. People don't really like software, but they
| love games. We know that games are just software, but it's so
| fun that people forget that. It's pretty cool. Though to me,
| I kind of like getting 8 hours of sleep a night and playing
| other people's games. While getting paid more :/
| rkachowski wrote:
| It's an interesting idea, but here you have the game designer
| taking the place of the product manager stereotype - coming up
| with bizarre, unfeasible ideas that the programmer is expected to
| make happen.
|
| In any games company I've worked for, the designer is responsible
| for mapping and balancing the rules and mechanics of the game.
| They would provide a specification of what "red vs blue numbers"
| would look like and a balanced idea of how to remove the number 5
| from the game (balancing and changing the rules like this being
| entirely within the domain of game design). Incidentally, every
| game company I've worked at has had an extensive set of test
| suites.
| epgui wrote:
| What nonsense that is... The idea that intentional mathematical
| design / correct, well-specified behaviour doesn't apply to games
| is absurd.
| SillyUsername wrote:
| Ok I know which AAA game studio this might be because I
| interviewed with them and had to sign an NDA.
|
| In their case, their flagship game is full of bugs, and they had
| to ship their product ASAP pre-acquisition when they were a
| startup.
|
| Because of the mentality of the managers, and weak-minded devs,
| they don't write unit tests, and instead spend the vast majority
| of their days fighting bugs - so much so that they have to hire
| dedicated staff for their (single game) backlog, as they were
| struggling to keep up "with its success".
|
| This is BS of course; I saw their backlog and it was a shit show,
| with devs expected to work overtime free of charge to get actual
| features out (funny how this works, isn't it - it never affects
| the time/life of the business execs who make the demands of no
| tests).
|
| I was asked what I would bring to the company to help them
| support their now AAA game, and I stated up front "more unit
| tests" and indirectly criticised their lack of them. I got a call
| later that day that (the manager thought) "I would not be a good
| fit".
|
| I got a lead job elsewhere that has the company's highest
| performing team, literally because of testing practices being
| well balanced between time and effectiveness (i.e. don't bother
| with low value tests, add tests if you find a bug etc, if an
| integration test takes too long leave it and use unit tests).
|
| I think back to that interview every time I interview at games
| studios now, and wonder if I shouldn't push unit tests if they're
| missing. I'd still do it. The managers at that job were assholes
| to their developers, and I now recognise the trait in a company.
| Joel_Mckay wrote:
| 1. Most game engines have the horrible compatibility layers
| abstracted away, already fully tested under previous mass
| deployments.
|
| 2. Anything primarily visual, audio, or control-input based is
| extremely hard to test reliably in an automated way. Thus, if the
| clipping glitches are improbable and hardly noticeable... no one
| cares.
|
| Some people did get around the finer game-play issues by simply
| allowing their AI character to cheat. Mortal Kombat II was famous
| for the impossible moves and combos the AI would inflict on
| players... yet the release was still super popular, as people
| just assumed they needed more practice with the game.
|
| Have fun out there, =)
| throwaway115 wrote:
| Having written a 3d game engine from scratch, I had automated
| tests, but they were more comparable to "golden" tests, which are
| popular in the UI test world. Basically, my renderer needed to
| produce a pixel-perfect frame. If a pixel didn't match, an image
| diff was produced. This saved my butt numerous times when I broke
| subtle parts of the renderer.
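|
| A golden test of this kind can be sketched roughly as below (an
| assumed workflow, not the commenter's actual code; frames are
| modeled as nested lists of RGB tuples to keep the example
| dependency-free):

```python
# Minimal golden-test sketch: render a frame, compare it pixel-for-pixel
# against a stored reference, and report a diff on mismatch.

def diff_frames(golden, actual):
    """Return a list of (x, y, expected, got) for every mismatched pixel."""
    mismatches = []
    for y, (golden_row, actual_row) in enumerate(zip(golden, actual)):
        for x, (expected, got) in enumerate(zip(golden_row, actual_row)):
            if expected != got:
                mismatches.append((x, y, expected, got))
    return mismatches

def assert_pixel_perfect(golden, actual):
    mismatches = diff_frames(golden, actual)
    if mismatches:
        # A real harness would write a diff image to disk here.
        raise AssertionError(f"{len(mismatches)} pixel(s) differ; "
                             f"first at {mismatches[0][:2]}")

# A 1x2 reference frame: one black pixel, one white pixel.
golden = [[(0, 0, 0), (255, 255, 255)]]
assert_pixel_perfect(golden, [[(0, 0, 0), (255, 255, 255)]])  # passes
```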
| vmaurin wrote:
| > why game devs don't unit test
|
| Sources?
| PlunderBunny wrote:
| I wrote a game using a 'bottom up' design (i.e. IO and business
| logic first), and I wrote unit tests for the business logic as I
| went. With no UI, I effectively tested and stepped through my
| code with unit tests. I had the luxury of working by myself at my
| own pace.
|
| I have a reasonably clean separation between the UI and the rest
| of the code, but I don't have any unit tests for the UI (I think
| - correct me if I'm wrong here - that would require integration
| tests rather than unit tests?) What I'm trying to say is that, if
| you don't do it this way around, and/or you have multiple
| programmers writing the game at once, and/or you _really_
| optimise for performance, I can imagine that would make it much
| harder to write unit tests.
| smokel wrote:
| "It's _never_ good to be dogmatic."
|
| In some situations unit tests can be very effective and useful,
| such as in testing complex algorithms, or in code bases where
| some serious refactoring is required, and where one doesn't want
| to
| break existing behavior. In backend development, where user
| facing output is limited, there is typically no other practical
| way to check that things are working properly.
|
| However, in games, and typical front-end development, especially
| in its early stages, it can be beneficial to be as flexible as
| possible. And however you put it, unit tests simply make your
| code more rigid.
|
| In the latter situation, some people prefer guard rails and find
| that they are more flexible with unit tests in place. Others
| prefer not to care about unit tests and attain higher
| productivity without them.
|
| Only when an application grows to a size where a developer no
| longer naturally inspects typical behavior all day, and when
| quality matters, does it start to make sense to put in automated
| testing, because it is simply more cost-effective.
|
| Similar reasoning goes for dynamic vs static typing.
|
| It seems that some people think that everyone should _always_ use
| the same approach for any kind of software development, because
| it worked for them at some point in time. Over time I have grown
| a preference to avoid working with such people.
| follower wrote:
| For an alternative perspective on testing & game development,
| here's a video I've seen from a few years ago:
|
| * "Automated Testing of Gameplay Features in 'Sea of Thieves'":
| https://www.youtube.com/embed/X673tOi8pU8?si=uj_lcMEC9nvMpa6...
|
| ~via https://www.gdcvault.com/play/1026366/Automated-Testing-
| of-G... :
|
| "Automated testing of gameplay features has traditionally not
| been embraced by the industry, due to the perceived time required
| and difficulty in creating reliable tests. Sea of Thieves however
| was created with automated testing for gameplay from the start.
| This session explains why automated testing was the right choice
| for Sea of Thieves and how it could benefit your game. It shows
| the framework that was built by Rare to let team members create
| automated tests quickly and easily, the different test types we
| built, and what level of test coverage was found to be
| appropriate. The session also contains best practices for making
| tests work reliably and efficiently, using clear worked through
| examples."
|
| Looks like there are also related talks in later years (which may
| or may not be currently available as free-to-view--I've not
| watched these ones):
|
| * "Lessons Learned in Adapting the 'Sea of Thieves' Automated
| Testing Methodology to 'Minecraft'":
| https://www.gdcvault.com/play/1027345/Lessons-Learned-in-Ada...
|
| * "Automated Testing of Shader Code" (GDC 2024):
| https://schedule.gdconf.com/session/automated-testing-of-sha...
| frou_dh wrote:
| This makes me think of the claim you sometimes see that memory-
| safety is not that relevant for game development because in many
| cases games aren't security-sensitive software. But even putting
| security vulnerabilities aside completely, plain old memory
| corruption can be a major drag when it rears its head (and can
| even kill projects if the game can't be wrangled into being
| crash-free by the deadline). This particularly applies to games
| with huge codebases and numbers of programmers.
| larsrc wrote:
| Disclaimer: never was a game dev
|
| You're conflating unit tests and functional/integration tests
| there. A unit test should test that a single
| function/method/class does what it's expected to do. The game
| design changes should change how they are put together, but not
| often what they do. If your setThingOnFire() method suddenly also
| flips things upside down, you're going to have a bad day. Instead
| your callers should add calls to flipThingUpsideDown().
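|
| A toy illustration of that point (using larsrc's invented names,
| translated to snake_case): the unit test pins down the one thing
| setThingOnFire() does, so a refactor that sneaks in an unrelated
| side effect fails fast.

```python
# The unit test asserts both the intended effect and the absence of
# unrelated side effects; callers compose behaviors explicitly.

class Thing:
    def __init__(self):
        self.on_fire = False
        self.upside_down = False

def set_thing_on_fire(thing):
    thing.on_fire = True          # the one job this function has

def flip_thing_upside_down(thing):
    thing.upside_down = True      # a separate, explicitly called behavior

def test_set_thing_on_fire_has_no_other_side_effects():
    thing = Thing()
    set_thing_on_fire(thing)
    assert thing.on_fire
    assert not thing.upside_down  # fails if the function grows extra behavior

test_set_thing_on_fire_has_no_other_side_effects()
```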
| bentt wrote:
| I don't see why design needs to have so much impact on whether
| there are unit tests. Most unit tests should operate at a much
| lower level than anything that would be balanced or user-tested.
| You want stuff like "can the player character jump and clear the
| sets of obstacles we are building all of our levels with?", and
| then a nice script that feeds in prerecorded input and checks
| that the outcome is deterministic. That way, if someone
| inadvertently changes gravity, or friction, or whatever in the
| basic systems that determine locomotion, you'll catch it early.
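|
| That prerecorded-input idea can be sketched as follows (all names
| and constants are illustrative, not any engine's actual values):
| step a tiny deterministic jump simulation and assert the character
| clears the obstacle height the levels are built from.

```python
# Sketch of a locomotion regression test: a fixed-timestep jump
# simulation whose peak height must clear a known obstacle. A balance
# change to GRAVITY or JUMP_VELOCITY fails the test immediately.

GRAVITY = -30.0        # units/s^2 (illustrative tuning value)
JUMP_VELOCITY = 12.0   # initial vertical speed on jump
DT = 1.0 / 60.0        # fixed 60 Hz simulation step

def replay_jump(frames):
    """Simulate `frames` ticks of a jump and return the peak height."""
    y, vy, peak = 0.0, JUMP_VELOCITY, 0.0
    for _ in range(frames):
        vy += GRAVITY * DT          # semi-implicit Euler integration
        y = max(0.0, y + vy * DT)   # clamp at the ground
        peak = max(peak, y)
    return peak

def test_player_clears_standard_obstacle():
    OBSTACLE_HEIGHT = 2.0  # the block height all levels are built with
    assert replay_jump(frames=60) > OBSTACLE_HEIGHT

test_player_clears_standard_obstacle()
```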
|
| Now, do most game makers take the time to do this? No, because
| they will likely have a lot else to do and make an excuse not to
| do it. However, for the most vital tech foundations, it is a good
| idea.
|
| What gamedev does tend to do more often is smoke testing. Just
| load up each level and see if it runs out of memory or not.
| Automated testing on build to see if something broke. It's less
| granular than unit testing, but when you're building over and
| over in the heat of a closeout on a project, this type of thing
| can tease out a breaking bug early as well.
|
| Overall, I like the title of the OP article, but not much that's
| said within.
___________________________________________________________________
(page generated 2024-05-25 23:02 UTC)