[HN Gopher] Object-Oriented Entity-Component-System Design
       ___________________________________________________________________
        
       Object-Oriented Entity-Component-System Design
        
       Author : agluszak
       Score  : 105 points
       Date   : 2021-08-16 15:09 UTC (7 hours ago)
        
 (HTM) web link (voxely.net)
 (TXT) w3m dump (voxely.net)
        
       | royjacobs wrote:
       | Hmm, I don't get it. Isn't one of the major reasons to use ECS to
       | exploit mechanical sympathy, i.e. you make sure that cache lines
       | are always full, you can use SIMD for calculations, etc.
       | 
       | All of this is pretty much out of the window for this design, so
       | what is the benefit of the ECS here?
        
         | ScoobleDoodle wrote:
         | I think the component instances per type are in the cache
         | coherent memory location. So all physics components are cache
         | coherent with each other. And the rendering instances with each
         | other.
         | 
         | At the individual object level the different components are not
         | cache coherent. So the rendering and physics instances of one
         | object are not in any memory coherent location.
         | 
         | Because the physics will do its SIMD to resolve the mutual
         | state. And then rendering will do the SIMD for their aggregate.
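A minimal sketch of the layout ScoobleDoodle describes, assuming one contiguous array per component type indexed by entity id. The names `pos_x`, `vel_x`, `sprite_id`, `physics_step` and `render_batches` are hypothetical and do not come from the article. Python will not vectorise these loops; the sketch only shows which data each system touches.

```python
from array import array

# Hypothetical layout: one contiguous array per component field,
# indexed by entity id. array("d") stores raw doubles back to back.
pos_x = array("d", [0.0, 10.0, 20.0])
vel_x = array("d", [1.0, -2.0, 0.5])
sprite_id = array("i", [7, 7, 3])  # rendering data, stored separately


def physics_step(pos_x, vel_x, dt):
    """Walk the physics arrays front to back; rendering data is never read."""
    for i in range(len(pos_x)):
        pos_x[i] += vel_x[i] * dt


def render_batches(sprite_id):
    """Group entity ids by sprite; only the rendering array is read."""
    batches = {}
    for entity, sprite in enumerate(sprite_id):
        batches.setdefault(sprite, []).append(entity)
    return batches
```

Each system scans only its own arrays, so data belonging to one entity is spread across arrays while data of one component type stays adjacent.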
        
           | Narishma wrote:
           | What do you mean by cache coherent?
        
           | royjacobs wrote:
           | But in the system proposed in the article there are virtual
           | function calls everywhere, which will surely thrash any cache?
           | Not to mention these function calls cannot even be inlined.
        
       | kvark wrote:
       | What I found in practice is that many people start using ECS
       | for speed, and then paint themselves into a corner of the design
       | space. Now they have to weigh every step against "how would it
       | work in ECS?", and dedicate effort to fighting the ECS paradigm.
       | 
       | For example, a mesh may contain multiple materials. Is each
       | material chunk a separate entity? Or maybe each bone in a
       | skeleton is a separate entity with its own "parent" and
       | "transform" plus other components.
       | 
       | One of the different approaches is component graph systems [1].
       | It lacks the ability to mix and match components, but provides a
       | more natural (and simpler) model to program for.
       | 
       | [1] https://github.com/kvark/froggy/wiki/Component-Graph-System
        
         | [deleted]
        
         | meheleventyone wrote:
         | Interesting, this is the same approach as Godot (a tree of
         | nodes) by the looks of things?
        
         | serverholic wrote:
         | I saw this project a while ago and think it's really interesting.
         | I hope people explore this more in the future.
        
       | nodivbyzero wrote:
       | Check this out. https://github.com/skypjack/entt
       | 
       | EnTT is a header-only, tiny and easy to use library for game
       | programming and much more written in modern C++. Among others,
       | it's used in Minecraft by Mojang, the ArcGIS Runtime SDKs by Esri
       | and the amazing Ragdoll.
        
         | elteto wrote:
         | Tiny? In what world is this tiny?
         | 
         | entt.hpp:
         | 
         |     #include "config/version.h"
         |     #include "core/algorithm.hpp"
         |     #include "core/any.hpp"
         |     #include "core/attribute.h"
         |     #include "core/family.hpp"
         |     #include "core/hashed_string.hpp"
         |     #include "core/ident.hpp"
         |     #include "core/monostate.hpp"
         |     #include "core/type_info.hpp"
         |     #include "core/type_traits.hpp"
         |     #include "core/utility.hpp"
         |     #include "entity/component.hpp"
         |     #include "entity/entity.hpp"
         |     #include "entity/group.hpp"
         |     #include "entity/handle.hpp"
         |     #include "entity/helper.hpp"
         |     #include "entity/observer.hpp"
         |     #include "entity/organizer.hpp"
         |     #include "entity/poly_storage.hpp"
         |     #include "entity/registry.hpp"
         |     #include "entity/runtime_view.hpp"
         |     #include "entity/snapshot.hpp"
         |     #include "entity/sparse_set.hpp"
         |     #include "entity/storage.hpp"
         |     #include "entity/utility.hpp"
         |     #include "entity/view.hpp"
         |     #include "locator/locator.hpp"
         |     #include "meta/adl_pointer.hpp"
         |     #include "meta/container.hpp"
         |     #include "meta/ctx.hpp"
         |     #include "meta/factory.hpp"
         |     #include "meta/meta.hpp"
         |     #include "meta/node.hpp"
         |     #include "meta/pointer.hpp"
         |     #include "meta/policy.hpp"
         |     #include "meta/range.hpp"
         |     #include "meta/resolve.hpp"
         |     #include "meta/template.hpp"
         |     #include "meta/type_traits.hpp"
         |     #include "meta/utility.hpp"
         |     #include "platform/android-ndk-r17.hpp"
         |     #include "poly/poly.hpp"
         |     #include "process/process.hpp"
         |     #include "process/scheduler.hpp"
         |     #include "resource/cache.hpp"
         |     #include "resource/handle.hpp"
         |     #include "resource/loader.hpp"
         |     #include "signal/delegate.hpp"
         |     #include "signal/dispatcher.hpp"
         |     #include "signal/emitter.hpp"
         |     #include "signal/sigh.hpp"
        
           | linkdd wrote:
           | Since it's a header-only library, with a lot of templates,
           | only the used code will be compiled.
           | 
           | So yes, this is tiny. Tinier than Unity, CryEngine, Unreal
           | Engine, or other huge frameworks of that kind.
        
             | meheleventyone wrote:
             | A comparison to those engines is a bit much isn't it? It's
             | not like it provides comparable features.
        
       | gh123man wrote:
       | Possibly related - this is the same person who is working on a
       | very impressive voxel game/engine. I didn't see an explicit
       | mention of it in the blog, but the Youtube videos speak for
       | themselves:
       | 
       | https://www.youtube.com/channel/UCM2RhfMLoLqG24e_DYgTQeA
        
       | nikki93 wrote:
       | The performance / cache-orientation point gets talked about a lot
       | re: ECS, but IME there are also (and maybe more so) other
       | important benefits:
       | 
       | - Combined with an inspector UI, you can explore gameplay by
       | adding and removing components from entities and arrive at design
       | emergently. One way to look at this is also that you write
       | components and systems to handle the main gameplay path you start
       | out thinking about, but your queries encode many other codepaths
       | than just that (a combinatorial explosion of component membership
       | in entity is possible). This lets you get a kind of "knob crawl"
       | that you see in eg. sound design when tweaking parameters live
       | with synths too. It lets artists and designers using the editor
       | explore many gameplay possibilities.
       | 
       | - The way I see the component data is it's kind of an interface /
       | source of truth, but some subsystems may end up storing transient
       | data elsewhere at runtime (eg. a spatial octree or contact graph
       | for physics). However as components are added or removed or
       | component properties updated, the caches should be updated
       | accordingly. You get a single focal point for scene state. Once
       | some state is expressed as a component you get undo and redo,
       | saving to scene files, saving an individual (or group of) entity as
       | a blueprint to reuse, ...
       | 
       | The cache thing feels like a minor point to me, inside a larger
       | category of allowing you to massage your data based on access
       | patterns by decoupling the logic acting on it. With performance
       | being one of the goals of said massaging along with many others.
       | 
       | I also find myself not really focusing on the "system" aspect as
       | much as the entity / component; esp. re: embedding constructs for
       | that into a library. I've found you can get far just having one
       | large "updateGame()" function that does queries and then performs
       | the relevant mutations in the query bodies, and you can then
       | separate code into more functions (usually just simple free
       | functions without parameters) from there that become your
       | systems. There's a bit of a rabbit hole designing reusable
       | scheduling and event systems and whatnot but I feel like just
       | simple calls to top level / free functions like this on a per-
       | game basis seems a lot clearer and ultimately more flexible (it's
       | just regular procedural code and you're in control of what
       | happens when). I like seeing the backtrace / callstack and being
       | the owner of things and then being explicit all the way up vs.
       | entering from some emergently scheduled event dispatch system.
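The "one large update function plus free functions" style nikki93 describes could be sketched as below. The `world` tables, the `query` helper and the two systems are hypothetical names chosen for illustration; a real ECS would use a denser storage than nested dictionaries.

```python
# Hypothetical component tables: component name -> {entity id -> data}.
world = {
    "position": {1: [0.0, 0.0], 2: [5.0, 5.0]},
    "velocity": {1: [1.0, 2.0]},
    "health": {1: 3, 2: 0},
}


def query(*names):
    """Yield (entity, values...) for entities that have every named component."""
    tables = [world[name] for name in names]
    for entity in sorted(set.intersection(*(set(t) for t in tables))):
        yield (entity, *(t[entity] for t in tables))


def update_movement(dt):
    for _, pos, vel in query("position", "velocity"):
        pos[0] += vel[0] * dt
        pos[1] += vel[1] * dt


def update_deaths():
    # Collect first, then remove, so no table is mutated mid-query.
    dead = [entity for entity, hp in query("health") if hp <= 0]
    for entity in dead:
        for table in world.values():
            table.pop(entity, None)


def update_game(dt):
    """Plain procedural entry point: the call order is the schedule."""
    update_movement(dt)
    update_deaths()
```

Because `update_game` is ordinary code, a debugger backtrace leads straight from a system to the top-level loop, which is the property the comment values.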
        
       | echohack5 wrote:
       | Personal anecdote: Habitat was developed as a sort of application
       | deployment / configuration management tool in Rust, and the
       | architecture there is roughly equivalent to an ECS. I found it a
       | joy to work with and work on. Not sure if it's fundamentally a
       | better software pattern, but it at least meshes with my brain
       | better than how most OO-style software is laid out.
       | 
       | https://github.com/habitat-sh/habitat
        
       | adamnemecek wrote:
       | I was really into ECS a while back and tried to implement a GUI
       | framework in it. It was not a good match since your single
       | widgets contain wildly different types of data.
        
       | reidjs wrote:
       | I'm curious if anyone has used ECS in a business application? Our
       | application has a lot of business rules between our different
       | entities. Currently we use something like MVC and the rules
       | mainly live in the models themselves. This can be messy when
       | describing rules between objects.
        
         | NovaX wrote:
         | entity = instance id
         | 
         | component = metadata + data
         | 
         | system = metadata processors
         | 
         | This way you can decouple business rules from the model by
         | using metadata to instruct how the entity should be processed.
         | At work we use json schema as our type system to describe our
         | entities, where every instance includes the schemas that it
         | implements. The metadata allows us to render an entity, process
         | it via server-side triggers, store and search, use generic crud
         | routes, etc. In a language like Java, this is the same as using
         | reflection to inspect interfaces and annotations for processing
         | an object.
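A rough sketch of the mapping NovaX outlines, assuming each record carries the list of schema names it implements and each processor is registered against one schema name. The record fields, the `processes` decorator and both processors are hypothetical; the JSON-schema validation step from the comment is left out.

```python
# Hypothetical records: each instance lists the schemas it implements.
entities = {
    "inv-1": {"schemas": ["invoice", "auditable"], "total": 120, "paid": False},
    "usr-7": {"schemas": ["auditable"], "name": "Ada"},
}

# Systems are keyed by schema name, not by a concrete model class.
processors = {}


def processes(schema):
    """Decorator that registers a function as a processor for one schema."""
    def register(fn):
        processors.setdefault(schema, []).append(fn)
        return fn
    return register


@processes("invoice")
def flag_unpaid(entity_id, data, log):
    if not data["paid"]:
        log.append(f"{entity_id}: unpaid, total {data['total']}")


@processes("auditable")
def audit(entity_id, data, log):
    log.append(f"{entity_id}: audited")


def run_all():
    """Route every entity to the processors of each schema it declares."""
    log = []
    for entity_id, data in entities.items():
        for schema in data["schemas"]:
            for fn in processors.get(schema, []):
                fn(entity_id, data, log)
    return log
```

Adding a rule for a new kind of record then means declaring a schema name and registering a processor, without touching a model class.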
        
           | reidjs wrote:
           | Do you have any resources (books, websites, etc) that discuss
           | this pattern in depth?
        
             | NovaX wrote:
             | Unfortunately not. I simply realized later that my
             | application design was conceptually similar to ECS. I could
             | walk you through it over Zoom, though.
        
         | codr7 wrote:
         | This is pretty much a solved problem in languages with multiple
         | dispatch (Common Lisp, Julia, Raku etc), the behavior lives in
         | the methods which don't belong to a specific class.
         | 
         | You can sort of, kind of get there with free functions and
         | overloading as long as you're not doing anything fancy with the
         | methods.
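Python has no built-in multiple dispatch, so the sketch below approximates the idea codr7 mentions with a small hand-rolled table keyed on the types of both arguments. The classes, the `rule` decorator and `apply_rule` are hypothetical, and the lookup matches exact types only (no inheritance), which is far less than what Common Lisp or Julia provide.

```python
class Customer:
    def __init__(self, credit):
        self.credit = credit


class Order:
    def __init__(self, total):
        self.total = total


class Coupon:
    def __init__(self, amount):
        self.amount = amount


_rules = {}


def rule(type_a, type_b):
    """Register a free function for one ordered pair of argument types."""
    def register(fn):
        _rules[(type_a, type_b)] = fn
        return fn
    return register


def apply_rule(a, b):
    """Pick the rule from the runtime types of both arguments."""
    fn = _rules.get((type(a), type(b)))
    if fn is None:
        raise TypeError(f"no rule for {type(a).__name__}, {type(b).__name__}")
    return fn(a, b)


@rule(Customer, Order)
def _can_afford(customer, order):
    return customer.credit >= order.total


@rule(Coupon, Order)
def _discounted_total(coupon, order):
    return max(0, order.total - coupon.amount)
```

The rules live beside the classes rather than inside any one of them, so a rule between two kinds of object does not have to pick an owner.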
        
       | jeremycw wrote:
       | I think the focus on ECS when talking about data-oriented design
       | largely misses the point of what data-oriented design is all
       | about. Focusing on a _code_ design pattern is the antithesis of
       | _data_-oriented design. Data-oriented is about breaking away from
       | the obsession with taxonomy, abstraction and world-modeling and
       | moving towards the understanding that all software problems are
       | data transformation problems.
       | 
       | It's that all games essentially (and most software in general)
       | boil down to: transform(input, current_state) -> output,
       | new_state
       | 
       | Then, for some finite set of platforms and hardware there will be
       | an optimal transform to accomplish this and it is our job as
       | engineers to make "the code" approach this optimal transform.
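The shape `transform(input, current_state) -> output, new_state` from the comment above can be written down directly. The toy `State` fields and the scoring rule are invented for illustration; only the function signature reflects the comment.

```python
from typing import NamedTuple


class State(NamedTuple):
    x: int
    score: int


def transform(inp, state):
    """Pure step function: (input, current_state) -> (output, new_state)."""
    dx = {"left": -1, "right": 1}.get(inp, 0)
    new_x = max(0, min(9, state.x + dx))          # clamp to a 10-cell track
    gained = 1 if new_x == 9 and state.x != 9 else 0  # score on reaching the end
    new_state = State(new_x, state.score + gained)
    output = f"x={new_state.x} score={new_state.score}"
    return output, new_state


def run(inputs, state):
    """Fold transform over a sequence of inputs."""
    outputs = []
    for inp in inputs:
        out, state = transform(inp, state)
        outputs.append(out)
    return outputs, state
```

Nothing here says how the state should be laid out in memory; in the data-oriented view that layout is chosen afterwards to suit the transform and the hardware.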
        
         | WorldMaker wrote:
         | This fits my gut feeling every time I see an ECS that
         | videogame design has gotten stuck in a local maximum
         | abstraction/pattern. Often what they really want is a Monadic
         | abstraction of a data/state transformation process, but they
         | are often stuck in languages ("for performance reasons") that
         | make it hard to impossible to get good Monadic abstractions. So
         | instead they use the hammers they have to make nails of the
         | abstractions that they can get. ECS feels to me like a strange
         | attempt to build Self-like OO dynamic prototypes in a class-
         | based OO language, and that's almost exactly what you would
         | expect for an industry only just now taking baby steps outside
         | of C/C++.
         | 
         | C# has some good tools to head towards that direction
         | (async/await is a powerful Monadic transformer, for instance;
         | it's not a generic enough transformer on its own of course, but
         | an interesting start), but as this article points out most
         | videogame work in C# today still has to keep its back foot in
         | C/C++ land at all times and C/C++ mentalities are still going
         | to clip the wings of abstraction work.
         | 
         | (ETA: Local maxima are still useful of course! Just that I'd
         | like to point out that they can also be a trap.)
        
           | learc83 wrote:
           | >"for performance reasons"
           | 
           | The quotes imply that this is a bad reason, but in soft
           | realtime systems you often want complete control of memory
           | allocation.
           | 
           | Even in the case of something like Unity--in order to give
           | developers the performance they want--they've designed a subset
           | of C# they call high performance C# where memory is manually
           | allocated.
           | 
           | In most cases if you're using an ECS, it's because you care
           | so much about performance that you want to organize most of
           | your data around cache locality. If you don't care about
           | performance, something like the classic Unity Game Object
           | component architecture is a lot easier to work with.
        
             | unknownOrigin wrote:
             | Yea, you're right, I think the previous poster seriously
             | underestimates videogames as performance critical (and
             | performance consistent!) apps. In the modern days of
             | desktop Java and C# (and even more in web dev) the vast
             | majority of coders just don't come across the need to "do
             | everything you need to do" in 33ms _or less_,
             | _consistently_.
        
             | WorldMaker wrote:
             | I'm not implying it is a bad reason with the quotes, I'm
             | trying to imply that it is a misguided reason (even if it
             | has good intentions).
             | 
             | The "rule" that C/C++ is always "more performant" is just
             | _wrong_. It's a bit of a sunk cost fallacy that because
             | the games industry has a lot of (constantly reinvented)
             | experience in performance optimizing C/C++ that they can't
             | get the same or better benefits if they used better
             | languages and higher abstractions. (It's the exact same
             | sunk cost fallacy that a previous games industry generation
             | said C/C++ would never beat hand-tuned Assembly and it
             | wasn't worth trying.)
             | 
             | In Enterprise day jobs I've seen a ton of "high
             | performance" C# with regular garbage collection.
             | Performance optimizing C# and garbage collection is a
             | different art than performance optimizing manually
             | allocated memory code, but it is an art/science that
             | exists. I've even seen some very high performance games
             | written entirely in C# and not "high performance C#" but
             | the real thing with honest garbage collection.
             | 
             | (It's a different art to performance optimize C# code but
             | it isn't even that different, at a high level a lot of the
             | techniques are very similar like knowing when to use shared
             | pools or deciding when you can entirely stack allocate a
             | structure instead of pushing it elsewhere in memory, etc.)
             | 
             | The implication in the discussion above is that a possible
             | huge sweet spot for a lot of game development would
             | actually be a language a lot more like Haskell, if not just
             | Haskell. A lot of the "ECS" abstraction boils away into the
             | ether if you have proper Monads and a nice do-notation for
             | working with them. You'd get something of the best of both
             | worlds that you could write what looks like the usual
             | imperative code games have "always" been written in, but
             | with the additional power of a higher abstraction and more
             | complex combinators than what are often written by hand
             | (many, many times over) in ECS systems.
             | 
             | So far I've not seen any production videogame even flirt
             | with a language like Haskell. It clearly doesn't look
             | anything like C/C++ so there's no imagination for how
             | performant it might actually be to write a game in it
             | (outside of hobbyist toys). But there are High Frequency
             | Trading companies out there using Haskell in production. It
             | can clearly hit some strong performance numbers. The art to
             | doing so is even more different from C/C++ than C#'s is,
             | but it exists and there are experts out there doing it.
             | 
             | Performance _is_ a good reason to do things, but I think
             | the videogames industry tends to especially lean on
             | "performance" as a crutch to avoid learning new things. I
             | think as an industry there's a lot of reason to avoid
             | engaging more experts and expertise in programming
             | languages and their performance optimization methodologies
             | when it is far easier to train "passionate" teens on extremely
             | over-simplified (and generally wrong) maxims like "C++ will
             | always be more performant than C#" than to keep up with the
             | _actual_ state of the art. I think the games industry is
             | happiest, for a number of reasons, not exploring better
             | options outside of local maxima and "performance" is an
             | easily available excuse.
        
         | typon wrote:
         | It's really annoying how many people misunderstand the term
         | 'data oriented design'. Usually to mean something like 'not
         | object oriented programming'. If your data was inherently
         | hierarchical and talking about animals that meow or moo, go
         | ahead and implement the textbook OO modeling.
         | 
         | This Mike Acton post describes it accurately:
         | http://www.macton.ninja/home/onwhydodisntamodellingapproacha...
        
         | bob1029 wrote:
         | > Data-oriented is about breaking away from the obsession with
         | taxonomy, abstraction and world-modeling
         | 
         | Something about this does not sit well with me.
         | 
         | Data is absolutely worthless if it is generated on top of a
         | garbage schema. Having poor modeling is catastrophic to any
         | complex software project, and will be the root of all evil
         | downstream.
         | 
         | In my view, the _principal_ reason people hate SQL is because
         | no one took the time to "build the world" and consult with the
         | business experts to verify if their model was well-aligned with
         | reality (i.e. the schema is a dumpster fire). As a consequence,
         | recursive queries and other abominations are required to obtain
         | meaningful business insights. If you took the time to listen to
         | the business explain the complex journey that - for instance -
         | user email addresses went down, you may have decided to model
         | them in their own table rather than as a dumb string fact on
         | the Customers table with zero historization potential.
         | 
         | Imagine if you could go back in time and undo all those little
         | fuck ups in your schemas. With the power of experience and
         | planning ahead, you can do the 2nd best thing.
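The email-address example bob1029 gives can be made concrete with a small in-memory SQLite schema. The table names, column names and helper functions below are assumptions made for illustration; the point is only that addresses get their own table with validity dates instead of a single string column on the customers table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE customer_emails (
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        email       TEXT NOT NULL,
        valid_from  TEXT NOT NULL,  -- ISO date
        valid_to    TEXT            -- NULL means "current"
    );
""")


def change_email(conn, customer_id, email, on_date):
    """Close the current address (if any) and open a new one."""
    conn.execute(
        "UPDATE customer_emails SET valid_to = ? "
        "WHERE customer_id = ? AND valid_to IS NULL",
        (on_date, customer_id),
    )
    conn.execute(
        "INSERT INTO customer_emails (customer_id, email, valid_from) "
        "VALUES (?, ?, ?)",
        (customer_id, email, on_date),
    )


def email_on(conn, customer_id, on_date):
    """Return the address that was valid on a given ISO date, or None."""
    row = conn.execute(
        "SELECT email FROM customer_emails "
        "WHERE customer_id = ? AND valid_from <= ? "
        "AND (valid_to IS NULL OR valid_to > ?)",
        (customer_id, on_date, on_date),
    ).fetchone()
    return row[0] if row else None
```

With this shape, "what was this customer's address last year?" is a plain lookup rather than something that has to be reconstructed from logs.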
        
           | typon wrote:
           | Schema design is THE problem data oriented programming is
           | focused on. It's saying, let's design our data structures in
           | memory and on disk such that they exist to solve the problem
           | at hand. I think you're talking about the same thing.
        
             | Keyframe wrote:
             | Or to circle back again to Fred Brooks, time and time
             | again:
             | 
             | "Show me your flowcharts and conceal your tables, and I
             | shall continue to be mystified. Show me your tables, and I
             | won't usually need your flowcharts; they'll be obvious."
             | 
             | - Fred Brooks
        
           | jeremycw wrote:
           | You're right, when I mentioned "taxonomy, abstraction and
           | world-modeling" I meant as it pertains to code organization
           | in the traditional OOP/OOD sense where it's generally about
           | naming classes, creating inheritance hierarchies, etc. Data-
           | oriented design is _absolutely_ concerned with the data
           | schema. I would, however, disagree that the focus should be
           | on "building the world" with your schema. To me this means
           | creating the schema based off of some gut/fuzzy feeling you
           | get when the names of things all end up being real world
           | nouns. To me creating a good schema is less about world
           | building than it is about having the exact data that you
           | need, well normalized and in a format that works well with
           | the algorithm you want to apply to it.
        
         | BobbyJo wrote:
         | > Focusing on a _code_ design pattern is the antithesis of
         | _data_-oriented design
         | 
         | Doesn't the former enable the latter? Ideally, language (both
         | human and machine) would have the semantics needed to represent
         | all transforms, but that's not the case. Code you rely on,
         | since none of it is written in isolation, needs to enable you
         | to implement data-oriented design should you so choose.
         | 
         | Also, I don't think pointing out that 'all games are
         | essentially...' is particularly useful. It's true, no question,
         | but that doesn't mean it's the most useful mental model for
         | people to use when developing software. Our job as engineers is
         | to make software that functions according to some set of
         | desires, and those desires may directly conflict with
         | approaching an optimal transform.
        
           | jeremycw wrote:
           | > Doesn't the former enable the latter?
           | 
           | Not necessarily. ECS is a local maximum when developing a
           | general purpose game engine. Since it's general purpose it
           | can do nothing more than provide a lowest common denominator
           | interface that can be used to make any game. If you are
           | building a game from scratch why would you limit yourself to
           | a lowest common denominator interface when there's no need?
           | Just write the exact concrete code that needs to be there to
           | solve the problem.
           | 
           | > Our job as engineers is to make software that functions
           | according to some set of desires, and those desires may
           | directly conflict with approaching an optimal transform.
           | 
           | All runtime desires of the software must be encoded in the
           | transform. So no software functionality should get in the way
           | of approaching the optimal transform. What does get in the
           | way of approaching the optimal transform is code
           | organization, architecture and abstraction that is non-
           | essential to performing the transform.
        
             | BobbyJo wrote:
             | > If you are building a game from scratch why would you
             | limit yourself to a lowest common denominator interface
             | when there's no need?
             | 
             | There is a need: the limits of the human mind. Nobody can
             | model an entire (worthwhile) game in their head, so unless
             | you plan on recursively rewriting the entire program as
             | each new oversight pops up, you aren't going to get
             | anywhere near optimal anyway.
        
             | debaserab2 wrote:
             | > If you are building a game from scratch why would you
             | limit yourself to a lowest common denominator interface
             | when there's no need? Just write the exact concrete code
             | that needs to be there to solve the problem.
             | 
             | Coming from the realm of someone who has mostly swum in the
             | OO pool for their career, I struggle to understand how a
             | concrete implementation of something like a video game
             | wouldn't spiral out of control quickly without significant
             | organization and some amount of abstraction overhead. That
             | said, I have found ECS-type systems to be so general purpose
             | that you end up doing more to please the ECS design itself
             | than you do focusing on the implementation.
             | 
             | Do you have any examples of games and/or code that are
             | written in more of a data oriented way? I'd really love to
             | learn more about this approach.
        
               | throwaway17_17 wrote:
               | The archetypal 'talk' on Data Oriented Design (in the way
               | GP is talking about it) is Mike Acton's 2014 CppCon
               | keynote.
               | 
               | [1] https://www.youtube.com/watch?v=rX0ItVEVjHc
        
               | debaserab2 wrote:
               | Thanks!
        
               | jeremycw wrote:
               | While stylistically I don't necessarily agree with him
               | all the time, Casey Muratori's Handmade Hero
               | (https://handmadehero.org/) is probably the most complete
               | resource in terms of videos and access to source code as
               | far as an 'example' goes.
        
               | debaserab2 wrote:
               | Thank you!
        
             | BrS96bVxXBLzf5B wrote:
             | > Just write the exact concrete code that needs to be there
             | to solve the problem.
             | 
             | Good luck with that when the exact code to solve the
             | problem is not the exact code the next week, because the
             | problem has changed or evolved.
             | 
             | Not to suggest an ECS is _the answer_, but this line of
             | thinking is reductive to the realities of creating a piece
             | of art. It's not a spec you can draw a diagram for and
             | trust will be basically the same. It's a creature you
             | discover, revealing more of itself over time. The
             | popularity of the ECS is because it provides accessible
             | composition. It's not the _only_ way of composing data but
             | being able to say "AddX", "RemoveX" without the
             | implementation details of what struct holds what data and
             | what groupings might matter is what makes it appealing.
        
       | peterthehacker wrote:
       | That chart with "Weak Foundation" vs "Solid Foundation" reminds
       | me of Martin Fowler's chart in his Refactoring book that compares
       | "good design" vs "no design"[0].
       | 
       | [0] https://martinfowler.com/bliki/DesignStaminaHypothesis.html
        
       | jayd16 wrote:
       | Sure ok. Callbacks violate what makes ECS performant but they
       | like the pattern anyway, so why not.
        
       | de_keyboard wrote:
       | Where ECS gets muddy for me is when you have systems working on
       | entities with multiple types of components at once.
       | 
       | * physics - position component + physics component
       | 
       | * rendering - position component + animation component
       | 
       | * etc..
       | 
       | How do we now store these components? How do we create / access
       | aggregates efficiently?
       | 
       | If we have two arrays then there is lots of hopping around:
       | 
       |     PositionComponent[]
       |     PhysicsComponent[]
       | 
       | Maybe we need some kind of grouping class?
       | 
       |     struct EntityData {
       |       PositionComponent position; // Null if not a position entity
       |       PhysicsComponent physics; // Null if not a physics entity
       |     }
       | 
       | Interfaces have their issues too, but at least it's fairly clear
       | what to do:
       | 
       |     class BouncyBall : IHasPosition, IHasPhysics {
       |       IPosition getPosition() {
       |         // ...
       |       }
       | 
       |       IPhysics getPhysics() {
       |         // ...
       |       }
       |     }
       | 
       | Anyone solved this before?
        
         | dkersten wrote:
         | > Anyone solved this before?
         | 
         | Sure. Take a look at EnTT[1], a popular C++ ECS library. It
         | comes with two main tools to deal with this: Sorting[2] and
         | groups[3]. EnTT gives you a large spectrum of tools with
         | different trade-offs so that you can tune your code based on
         | usage patterns. Obviously different bits of code will have
         | conflicting access patterns, so there's no one-size-fits-all
         | solution, but EnTT lets you optimise the patterns that are most
         | important to you (based on profiling, hopefully).
         | 
         | [1] https://github.com/skypjack/entt
         | 
         | [2] Sort one component to be in the same order as another
         | component, so that they can be efficiently accessed together:
         | https://github.com/skypjack/entt/wiki/Crash-Course:-entity-c...
         | 
         | [3] https://github.com/skypjack/entt/wiki/Crash-Course:-entity-c...
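The sorting idea in [2] can be illustrated with a toy dense pool. This is not EnTT's API: `Pool`, `sort_as` and `visited` are hypothetical names, and the sketch only shows why reordering one pool to follow another turns scattered lookups into a front-to-back walk.

```python
class Pool:
    """Hypothetical dense component pool (not EnTT's API)."""

    def __init__(self):
        self.entities = []  # dense, in iteration order
        self.values = []    # values[i] belongs to entities[i]
        self.index = {}     # entity -> position in the dense lists

    def add(self, entity, value):
        self.index[entity] = len(self.entities)
        self.entities.append(entity)
        self.values.append(value)

    def get(self, entity):
        return self.values[self.index[entity]]

    def sort_as(self, other):
        """Put entities shared with `other` first, in `other`'s order."""
        shared = [e for e in other.entities if e in self.index]
        shared_set = set(shared)
        rest = [e for e in self.entities if e not in shared_set]
        order = shared + rest
        # Built from the old index and values before either is replaced.
        self.values = [self.get(e) for e in order]
        self.entities = order
        self.index = {e: i for i, e in enumerate(order)}


def visited(driver, other):
    """Positions in `other` touched while iterating `driver` front to back."""
    return [other.index[e] for e in driver.entities if e in other.index]
```

Before sorting, iterating the physics pool jumps around the position pool; after `positions.sort_as(physics)` the same iteration touches positions 0, 1, 2 in order. The cost is that only one ordering can be favoured per pool, which is the trade-off dkersten mentions.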
        
         | royjacobs wrote:
         | What's wrong with iterating across multiple components? You
         | seem to imply that this is "hopping around" and therefore bad,
         | but it's perfectly acceptable to do so.
         | 
         | Of course, if your system needs to iterate across 20 components
         | to do its job then maybe you need to check if you've factored
         | your components correctly.
        
           | munificent wrote:
           | _> What's wrong with iterating across multiple components?_
           | 
           | It's bad for spatial locality. You end up with many more CPU
           | cache misses, which significantly slows down execution. Using
           | the CPU cache effectively is one of the primary reasons to
           | use ECS.
        
             | Twisol wrote:
             | Recognizing your username, I'd probably best not argue, but
             | wouldn't iterating across (say) two component arrays only
             | cost you two pages in the cache at any given time, since
             | you're doing a sequential scan? You should have the same
             | number of cache misses overall, unless you're doing
             | something very complicated for each entity to cause the
             | cache to vacate one of the pages.
             | 
             | Of course, if you access N components you need N pages in
             | the cache concurrently, which is going to fall over for a
             | not-too-large N. But N=2 or N=3 seems unlikely to kill
             | spatial locality.
             | 
             | I can imagine it gets a little more complicated with
             | prefetch, but you're still using the prefetched pages --
             | you just need to prefetch pages for two separate arrays
             | (potentially at different rates based on component size)
             | rather than one. Do these details end up snowballing in a
             | way I'm not seeing, or are there details I'm just missing
             | outright?
        
               | munificent wrote:
               | _> two component arrays only cost you two pages in the
               | cache at any given time_
               | 
               | It sounds like you're thinking of virtual memory (i.e.
               | pages of memory either being in RAM or on disk). But CPU
               | caching is about the much smaller L1, L2, and L3 caches
               | directly on the chip itself.
               | 
               | Let's say you have two kinds of components, A and B. You
               | have those stored in two contiguous arrays:
               | AAAAAAAAAAAAAAAAAAA...
               | BBBBBBBBBBBBBBBBBBB...
               | 
               | Each entity has an A and B and your system needs to
               | access both of those to do its work. The code will look
               | like:
               |     for each entity:
               |       access some data component A
               |       access some data component B
               |       do some computation
               | 
               | On the first access of A, the A component for that entity
               | and a bunch of subsequent As for other entities get
               | loaded into the cache line. On the next access of B, the
               | B for that entity along with a bunch of subsequent Bs
               | gets loaded into a cache line. If you are lucky the A and
               | B arrays will be at addresses such that the chip is able
               | to put them in _different_ cache lines. At that point, I
               | think you'll mostly be OK.
               | 
               | But if you're unlucky and they are fighting for the same
               | cache line, then each access can end up evicting the
               | previous one and forcing a main memory lookup for every
               | single component.
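
A rough way to see when two arrays fight for the same cache slots is to compute which cache set each address maps to. The sketch below assumes a hypothetical direct-mapped cache with 64-byte lines and 64 sets; real chips are usually set-associative, which softens the problem. The names `cache_set` and `conflicts` and the base addresses are made up for illustration.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_SETS = 64    # number of sets in the hypothetical cache (assumed)


def cache_set(address: int) -> int:
    """Return the set an address maps to: drop the offset bits, wrap by set count."""
    return (address // LINE_SIZE) % NUM_SETS


def conflicts(base_a: int, base_b: int, elem_size: int, count: int) -> int:
    """Count iterations where element i of A and element i of B map to the same set."""
    return sum(
        cache_set(base_a + i * elem_size) == cache_set(base_b + i * elem_size)
        for i in range(count)
    )


CACHE_BYTES = LINE_SIZE * NUM_SETS  # 4096 bytes in this toy cache

# Bases exactly one cache-size apart: every A/B pair lands in the same set.
unlucky = conflicts(0x10000, 0x10000 + CACHE_BYTES, elem_size=16, count=1000)

# Shifting B by one extra line puts each pair in neighbouring sets instead.
lucky = conflicts(0x10000, 0x10000 + CACHE_BYTES + LINE_SIZE, elem_size=16, count=1000)
```

In a direct-mapped cache, every one of the `unlucky` collisions would evict the line that was just loaded, which is the worst case described above.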
        
               | quotemstr wrote:
               | You should be able to use some kind of coloring approach
               | to avoid that kind of false sharing, right?
        
               | munificent wrote:
               | I'm not an expert at the details of hardware
               | architecture, but my understanding is that you're
               | basically stuck with whatever associativity and cache
               | policy the chip supports. That, and the addresses that
               | your components happen to be at, will determine whether
               | they end up fighting over cache lines or not.
               | 
               | https://en.wikipedia.org/wiki/Cache_placement_policies
        
               | gugagore wrote:
               | I don't think this is false sharing since the issue can
               | occur without any writes.
        
             | royjacobs wrote:
             | Yes, that was the point I was trying to make (also in my
             | other comment). It's certainly bad, but if you iterate
             | across, say, two components it shouldn't be too bad?
             | 
             | It's also an option to have your component data
             | interleaved, if you know the iteration usage upfront, I
             | suppose.
        
             | jayd16 wrote:
             | It's bad compared to what? If you have one system that
             | needs to iterate through A and B and another that needs to
             | iterate through B and C, what is a more ideal system?
        
             | nikki93 wrote:
             | It depends on the storage pattern: archetype storages may
             | actually keep those components together / interspersed
             | anyway, or hybrid things like the "groups" in entt. It
             | does, generally speaking, just seem to give you an
             | opportunity to see this issue re: cache misses arise in
             | practice and rearrange your storage accordingly, by
             | decoupling your processing logic (body of a query) from the
             | actual layout. Esp. if the ECS provides a switch like
             | "store A and B interspersed in a single array" that you can
             | enable or disable at any point and profile both ways (part
             | of the data-oriented design idea: orient your data for how
             | it's used in practice).
        
         | resonantjacket5 wrote:
         | Usually you'd have the "physics" system hold onto the array
         | while the EntityData would hold a pointer to the
         | PhysicsComponent for that Entity.
         | 
         | That way each entity can access its data quickly, and if there
         | is some heavy physics computation it can easily iterate over it
         | in a list (e.g. checking for collisions).
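
A minimal sketch of that arrangement, assuming a list index stands in for the pointer. `PhysicsSystem`, `create`, and `step` are hypothetical names for illustration, not any engine's API.

```python
from dataclasses import dataclass


@dataclass
class PhysicsComponent:
    x: float
    vx: float


class PhysicsSystem:
    """Owns all physics components in one list so it can sweep them in order."""

    def __init__(self) -> None:
        self.components: list[PhysicsComponent] = []

    def create(self, x: float, vx: float) -> int:
        """Store a new component and return its index, which the entity keeps."""
        self.components.append(PhysicsComponent(x, vx))
        return len(self.components) - 1

    def step(self, dt: float) -> None:
        # The heavy per-frame work walks the list front to back.
        for c in self.components:
            c.x += c.vx * dt


@dataclass
class EntityData:
    physics: int  # index into PhysicsSystem.components (stands in for a pointer)


system = PhysicsSystem()
ball = EntityData(physics=system.create(x=0.0, vx=2.0))
system.step(0.5)
ball_x = system.components[ball.physics].x
```

The entity reaches its own data with one indexed lookup, while the system never has to go through entities to do its bulk update.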
        
         | [deleted]
        
         | throwaway13337 wrote:
         | Position is usually an entity level property.
         | 
         | But, assuming it isn't, you would make that portion of things
         | its own component/trait of the entity. Components that rely on
         | it could be declared to require it (in Unity, there is a
         | RequireComponent annotation). So you can be sure if that
         | component exists, that its required component also exists on
         | the entity. I think this is a reasonably satisfying solution.
        
           | jcelerier wrote:
           | > Position is usually an entity level property.
           | 
           | that's a very very video game centric point of view. If a
           | pattern only works in a couple fields of application, it's
           | not a very good pattern..
        
             | adamrezich wrote:
             | ...for anything other than the field where the pattern
             | makes sense, of course
        
             | void_mint wrote:
             | This is a dogmatic take. From wikipedia
             | 
             | > Entity-component-system (ECS) is a software architectural
             | pattern that is mostly used in video game development.
        
             | munificent wrote:
             | _> that's a very very video game centric point of view._
             | 
             | ECS was invented for and is primarily used by videogames.
             | 
             | _> If a pattern only works in a couple fields of
             | application, it's not a very good pattern._
             | 
             | I completely and totally disagree. How would you even
             | _define_ a "field of application" without there being
             | patterns and practices that are unique to it? If every
             | domain uses the same techniques, what is the difference?
             | 
             | Off the top of my head, here are some patterns that I
             | rarely see outside of their primary domain:
             | 
             | Programming languages and compilers:
             | 
             | * Recursive descent
             | 
             | * Top-down operator parsing
             | 
             | * The visitor pattern
             | 
             | * Mark-sweep garbage collection and the tri-color
             | abstraction
             | 
             | Game and simulation programming:
             | 
             | * ECS
             | 
             | * Per-frame arena allocators
             | 
             | * Game loops
             | 
             | * Double buffering
             | 
             | * Spatial partitioning
        
               | jcelerier wrote:
               | Except the visitor, which can be useful in pretty much
               | any system with polymorphism, and maybe "game loops", I
               | definitely would not call any of those patterns; methods
               | and techniques maybe. These words don't mean the same
               | thing: a pattern is much more general than a technique!
        
           | learc83 wrote:
           | ECS in this case isn't referring to the general method of
           | using entities that have much of their logic defined in
           | components, as you'd have in classic Unity with
           | RequireComponent.
           | 
           | ECS is a specific software architecture where (among other
           | things) there is no entity level property because an entity
           | is just an identifier--all properties are stored in
           | logicless, data only components. Unity DOTS has an
           | implementation of this.
        
         | michannne wrote:
         | Typically, a component is just some POCO class, and systems
         | iterate over combinations of these components.
         | 
         | > if we have two arrays then there is lots of hopping around
         | 
         | Why? You can just create a custom allocation scheme assigning
         | one giant chunk of memory to all components and give each
         | system a custom iterator that iterates accordingly for the
         | alignment of components it cares about
        
         | meheleventyone wrote:
         | This is why the ECS pattern isn't actually as performant as
         | people make out. At least by default. I wrote this explanation
         | of the archetype approach to making an ECS fast a while ago:
         | 
         | This is partly why a lot of ECS demos have a lot of homogeneous
         | elements (they share all components in common). For example
         | particle systems have long been written in a data oriented
         | manner when running on the CPU. So if you implement it in the
         | ECS style you can just run through the arrays in order and it's
         | all good. Or Unity's city sim example. But games tend to have
         | much more heterogeneous entities (they share fewer or few
         | components in common).
         | 
         | The most obvious example I can think of to dispel the myth of
         | ECS's inherent DoDness is an ECS wherein each component storage
         | is a linked list with each element individually allocated. Even
         | iterating through the homogeneous entity example is likely to
         | be extremely slow in comparison to flat arrays. So there is
         | nothing about the pattern that demands it be implemented in a
         | data-oriented manner.
         | 
         | But back to a more heterogeneous example. I'm going to try to
         | explain it generally because I think a worked version would be
         | enormous and maybe cloud things more? Typically component
         | storage is indexed by the entity ID. You want to look up the
         | component in the storage associated with a particular ID. If
         | all your storages are flat arrays where the entity ID is just
         | an index into the array the more heterogeneous your entities
         | the more gaps you will have to iterate over and correspondingly
         | more memory your game will take up. This isn't great for cache
         | locality or memory usage and we have to iterate over every
         | entity for all systems to find the valid ones.
         | 
         | So the next step uses a dense array and a secondary backing
         | array that is indexed by the entity id. So we can keep our
         | components packed nicely but still look them up easily.
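
A minimal sketch of that dense-plus-backing-array idea (often called a sparse set). The class and method names are illustrative assumptions, not taken from any particular library, and `add` assumes the entity does not already have the component.

```python
class SparseSet:
    """Dense component array plus a sparse array indexed by entity ID."""

    def __init__(self, max_entities: int) -> None:
        self.sparse = [-1] * max_entities  # entity ID -> dense slot, -1 if absent
        self.entities: list[int] = []      # dense: entity ID stored in each slot
        self.values: list[object] = []     # dense: component stored in each slot

    def add(self, entity: int, value: object) -> None:
        self.sparse[entity] = len(self.entities)
        self.entities.append(entity)
        self.values.append(value)

    def get(self, entity: int):
        slot = self.sparse[entity]
        return self.values[slot] if slot != -1 else None

    def remove(self, entity: int) -> None:
        slot = self.sparse[entity]
        if slot == -1:
            return
        # Swap the last element into the freed slot so the dense arrays stay packed.
        last_entity = self.entities[-1]
        self.entities[slot] = last_entity
        self.values[slot] = self.values[-1]
        self.sparse[last_entity] = slot
        self.entities.pop()
        self.values.pop()
        self.sparse[entity] = -1


positions = SparseSet(max_entities=100)
positions.add(7, (1.0, 2.0))
positions.add(42, (3.0, 4.0))
positions.add(3, (5.0, 6.0))
positions.remove(7)  # entity 3 is swapped into the hole left by entity 7
```

Iteration walks `values` with no gaps, and lookup by entity ID is still a single index into `sparse`.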
         | 
         | Instead of iterating over all the entities for every system we
         | can find the shortest component storage for the set of
         | components the system uses and iterate directly over that and
         | lookup the other components in their storages by the current
         | entity ID. Now we iterate over potentially many fewer entities
         | but essentially do a random lookup into the other component
         | storages for each one. So we're introducing cache misses for
         | the benefit of less things to iterate over.
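
As a sketch of that trade-off, here plain dicts keyed by entity ID stand in for the component storages (an assumption made to keep the example short). The hypothetical `query` helper drives the loop from the smallest storage and does a keyed lookup into the others.

```python
def query(*storages: dict):
    """Yield (entity, components...) for entities present in every storage."""
    smallest = min(storages, key=len)  # iterate the shortest storage only
    for entity in smallest:
        if all(entity in s for s in storages):
            # Each lookup into the other storages is effectively a random access.
            yield (entity, *(s[entity] for s in storages))


positions = {e: (float(e), 0.0) for e in range(1000)}  # nearly every entity has one
healths = {5: 80, 17: 45, 2000: 10}                     # only a few entities have one

results = list(query(positions, healths))
```

The loop body runs three times instead of a thousand, but each `positions[entity]` access jumps to an unrelated spot in the larger storage.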
         | 
         | So what we want is the benefits of blazing through arrays
         | without the downsides of them being pretty sparse and ideally
         | minimizing cache misses. Which is why the concept of an
         | Archetype was invented. We keep our components in flat
         | arrays but crucially change our storage: instead of keeping
         | one flat array for every component, we keep separate component
         | storages for each archetype of entity we have right now.
         | 
         | Going from:
         | 
         | AAAAAAAAAA
         | 
         | BBBBBBBBBB
         | 
         | CCCCCCCCCC
         | 
         | To:
         | 
         | (ABC) A B C
         | 
         | (AB) AAA BBB
         | 
         | (AC) AAAAA CCCCC
         | 
         | (C) CCCCC
         | 
         | If we have a system that just iterates C's it can find all the
         | archetype storages and iterate straight through the C array for
         | them one by one. So ideally we only pay a cache miss when we
         | change archetype, have good cache locality and are iterating
         | the minimum set. Similarly a system that uses components A and
         | C will only iterate the archetype storage of ABC and AC and
         | blaze straight through the A and C arrays of each. Same deal.
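
A toy version of archetype storage, to make the layout above concrete. `World`, `spawn`, and `query` are hypothetical names, and Python lists stand in for the flat arrays; this is a sketch of the idea, not how Unity or any other engine implements it.

```python
class World:
    """Groups entities by exact component set; each group keeps one column per component."""

    def __init__(self) -> None:
        # frozenset of component names -> {component name: column of values}
        self.archetypes: dict[frozenset, dict[str, list]] = {}

    def spawn(self, **components) -> None:
        key = frozenset(components)
        table = self.archetypes.setdefault(key, {name: [] for name in components})
        for name, value in components.items():
            table[name].append(value)

    def query(self, *names: str):
        """Yield rows from every archetype that has all the requested components."""
        wanted = frozenset(names)
        for key, table in self.archetypes.items():
            if wanted <= key:
                # Within one archetype the columns are walked in lockstep, in order.
                yield from zip(*(table[name] for name in names))


world = World()
world.spawn(A=1, B=10, C=100)  # archetype (ABC)
world.spawn(A=2, B=20)         # archetype (AB)
world.spawn(A=3, C=300)        # archetype (AC)
world.spawn(C=400)             # archetype (C)

a_and_c = list(world.query("A", "C"))
only_c = list(world.query("C"))
```

The A-and-C query never touches the (AB) or (C) tables, and the only jump between memory regions happens when it moves from one matching archetype to the next.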
         | 
         | This comes at a cost of making adding and removing components
         | from an entity more expensive.
         | 
         | We're also ignoring interacting with other components or the
         | world and how that might work. For example we might want to do
         | damage to another entity entirely. Or we might want to look up
         | the properties of the piece of ground we're stood on. So there
         | is a whole other layer of places we can ruin all this good work
         | by wanting to access stuff pretty randomly. Relationships in
         | games tend to be spatial and stuff tends to move around so it's
         | hard to see a general case solution to the problem.
         | 
         | Then there are other axes to think about, like ease of creating
         | the game, how flexible it is to change the game, iteration speed,
         | designer friendliness and so on. Rarely IME has the gameplay
         | code itself been the bottleneck outside of stupid mistakes.
         | 
         | In games this level of optimization is really great when you do
         | have a big mostly homogeneous set of things. Then it's well
         | worth the time to structure your data for efficient memory
         | access. City Sims, games like Factorio and so on are classic
         | examples.
        
           | hypertele-Xii wrote:
           | Can modern cache hierarchies not maintain two parallel linear
           | array iterations on the same core?
           | 
           | Or is CPU cache really so slow it can literally only look at
           | one stride of memory at a time?
           | 
           | I'm skeptical this kind of optimization is necessary.
        
             | meheleventyone wrote:
             | I think you skimmed part of the original post because I
             | mention using sparse arrays (linear arrays with 'empty'
             | slots) and the benefits/trade offs.
             | 
             | This archetype based approach is used in quite a few big
             | ECS projects. Unity's ECS and Bevy amongst them.
             | 
             | As with anything performance related though, particularly
             | when considering the underlying principles of data oriented
             | design you should be analysing the performance of your
             | approach on the target hardware.
        
           | de_keyboard wrote:
           | Great explanation, thank you.
           | 
           | > This comes at a cost of making adding and removing
           | components from an entity more expensive.
           | 
           | I think you could write a design-time tool that takes a
           | simple description file (with hints) and outputs code that
           | stores your entities and components efficiently.
           | 
           | Description file:
           |     {
           |       "archetypes": [
           |         {
           |           "components": [ "A", "B", "C" ]
           |         },
           |         {
           |           "components": [ "B" ]
           |         },
           |         {
           |           "components": [ "B", "C" ]
           |         }
           |       ]
           |     }
           | 
           | Output:
           |     class ArchetypeABC {
           |       A a;
           |       B b;
           |       C c;
           |     }
           | 
           |     class ArchetypeB {
           |       B b;
           |     }
           | 
           |     class ArchetypeBC {
           |       B b;
           |       C c;
           |     }
           | 
           |     class EntityStore {
           |       ArchetypeABC[] entitiesABC;
           |       ArchetypeB[] entitiesB;
           |       ArchetypeBC[] entitiesBC;
           |     }
        
             | meheleventyone wrote:
             | This static analysis might not be sufficient as most of
             | these designs allow runtime manipulation of the components
             | on an Entity. Usually the implementation of the component
             | storage does much the same at runtime though so archetype
             | storages are created and removed as needed. The static
             | analysis could be used to pre-warm that if it was
             | particularly slow.
        
             | codetrotter wrote:
             | From the way that the parent commenter illustrated the
             | memory layout I thought they were talking about SoA style
             | but your comment is using AoS style.
             | 
             | So I would expect the code corresponding to their comment
             | to look like this instead of what you wrote:
             |     class ArchetypeABC {
             |       A[] as;
             |       B[] bs;
             |       C[] cs;
             |     }
             | 
             |     class ArchetypeB {
             |       B[] bs;
             |     }
             | 
             |     class ArchetypeBC {
             |       B[] bs;
             |       C[] cs;
             |     }
             | 
             | But maybe I misunderstood?
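
For readers unfamiliar with the two layouts, here is a small side-by-side sketch. It assumes a made-up `Particle` type with two fields. Note that a Python list of objects is really an array of references, so the AoS half only approximates a true array of structs; the SoA half uses the standard `array` module, which does store values contiguously.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:  # AoS: one record per entity, fields kept together
    x: float
    vx: float


aos = [Particle(x=float(i), vx=1.0) for i in range(4)]


class Particles:  # SoA: one contiguous array per field
    def __init__(self, n: int) -> None:
        self.x = array("d", (float(i) for i in range(n)))
        self.vx = array("d", [1.0] * n)


soa = Particles(4)


def step_aos(ps: list, dt: float) -> None:
    for p in ps:
        p.x += p.vx * dt


def step_soa(ps: Particles, dt: float) -> None:
    # Touches only the x and vx columns; any other fields would stay out of the way.
    for i in range(len(ps.x)):
        ps.x[i] += ps.vx[i] * dt


step_aos(aos, 0.5)
step_soa(soa, 0.5)
```

Both versions compute the same result; the difference is only in which bytes sit next to each other in memory.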
        
               | meheleventyone wrote:
               | You're correct.
        
         | sparkie wrote:
         | There's a fairly recent work called SHAPES[1] which attempts to
         | address this kind of customized memory layout without having to
         | give up the OOP abstraction. You can try out different memory
         | layouts without having to modify the types themselves.
         | 
         | [1]:https://www.doc.ic.ac.uk/%7Escd/ShapesOnwards.pdf; A more
         | recent revision of the work here:
         | https://www.researchgate.net/publication/341693673_Reshape_y...
        
           | quotemstr wrote:
           | Flexibility with object layout is one of the big potential
           | unexploited advantages of managed code systems. Automatically
           | "column-izing" large collections of objects ought to be in
           | the wheelhouse of sufficiently clever JVM and CLR
           | implementations, but this is a very under-explored line of
           | research.
        
         | jayd16 wrote:
         | Unity seems to store things by unique combination of
         | components. In your case they would have an array for entities
         | with a position component and a physics component, then an
         | array for entities with a position component and an animation
         | component, and possibly an array for entities that have all
         | three components.
         | 
         | Unity then schedules work for your system by passing all the
         | relevant arrays.
         | 
         | Described in detail here:
         | https://docs.unity3d.com/Packages/com.unity.entities@0.17/ma...
        
         | danbolt wrote:
         | EnTT's registry might be an interesting read if you haven't
         | read it before. [1] The specs crate also provides a variety of
         | storage implementations for varying types of components. [2]
         | 
         | I didn't work on the game, but I spoke with some of the
         | developers of _Homeworld: Deserts of Kharak_. Since there was a
         | straightforward quantity of entities and components (a bunch of
         | vehicles in a closed desert space), the space for all data was
         | preallocated at initialization time. I can't speak further on
         | the specifics though.
         | 
         | [1]
         | https://github.com/skypjack/entt/blob/master/docs/md/entity....
         | 
         | [2] https://docs.rs/specs/0.17.0/specs/struct.VecStorage.html
        
         | BulgarianIdiot wrote:
         | This is not specific to ECS, but comes down to the "single
         | controller" (or single writer, single owner etc. many names)
         | problem.
         | 
         | Ideally you want to have one modifier/controller, but you can
         | have as many readers as you want.
         | 
         | When you can't have a single controller, you have several
         | options:
         | 
         | 1. Pass ownership. Animated components control position only by
         | animation. Physics components control position only by physics.
         | You can pass this control in time from physics to animation and
         | back.
         | 
         | 2. Express one through the other. In this case, express
         | animation as acting on physics constraints, and let the physics
         | engine compute the final position. This way animation becomes
         | just another "physical force" in your game. It can be hard to
         | do sophisticated animation this way though.
         | 
         | 3. Have physics-specific position and animation-specific
         | position and have the final position be computed as a formula
         | of both. Maybe you sum them. So either one, moving from a
         | base offset, affects the position. This depends on what the
         | position is of.
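
Option 3 could look roughly like the sketch below, assuming the final position is simply the sum of the two parts. `Vec2`, `Transform`, and the field names are hypothetical; each system writes only its own field, and readers get a derived value.

```python
from dataclasses import dataclass


@dataclass
class Vec2:
    x: float
    y: float

    def __add__(self, other: "Vec2") -> "Vec2":
        return Vec2(self.x + other.x, self.y + other.y)


@dataclass
class Transform:
    physics_position: Vec2   # written only by the physics system
    animation_offset: Vec2   # written only by the animation system

    @property
    def position(self) -> Vec2:
        # Derived on read; nobody writes the final position directly.
        return self.physics_position + self.animation_offset


t = Transform(physics_position=Vec2(10.0, 0.0), animation_offset=Vec2(0.0, 0.0))
t.physics_position = Vec2(12.0, 0.0)  # physics step moved the body
t.animation_offset = Vec2(0.0, 1.5)   # animation adds a bob on top
final = t.position
```

Because each field has exactly one writer, the single-controller rule holds even though two systems influence where the entity ends up.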
        
         | beiller wrote:
         | I solved it by duplicating the data, because a physics object
         | needs a starting position when it is created, and sometimes you
         | need to 'reset' the position; having just one position variable
         | won't allow that. The rendering logic checks whether the entity
         | has a physics state and, if not, uses the other position.
        
         | throw149102 wrote:
         | What you're looking for is "Arrays of Structs of Arrays".
         | 
         | See: https://en.wikipedia.org/wiki/AoS_and_SoA
         | 
         | Jonathan Blow has a good talk about it here:
         | https://www.youtube.com/watch?v=YGTZr6bmNmk
        
         | viktorcode wrote:
         | I didn't find a perfect solution, but it goes like this: it
         | doesn't matter how many components an entity has, as long as
         | all components are stored in corresponding arrays. So, you have
         | a positions array, a velocity array, etc. Those arrays contain
         | the data ideal for consumption by corresponding systems.
         | 
         | The problem here lies in linking those separate components
         | (i.e. indices in arrays) to entities.
        
       ___________________________________________________________________
       (page generated 2021-08-16 23:00 UTC)