[HN Gopher] Genesis - a generative physics engine for general-pu...
       ___________________________________________________________________
        
       Genesis - a generative physics engine for general-purpose robotics
        
       Author : tomp
       Score  : 161 points
       Date   : 2024-12-19 00:54 UTC (22 hours ago)
        
 (HTM) web link (genesis-world.readthedocs.io)
 (TXT) w3m dump (genesis-world.readthedocs.io)
        
       | tomp wrote:
       | Twitter announcement:
       | https://x.com/zhou_xian_/status/1869511650782658846
       | 
       | GitHub: https://github.com/Genesis-Embodied-AI/Genesis
       | 
       | academic project page: https://genesis-embodied-ai.github.io
        
         | gnabgib wrote:
         | HN: https://news.ycombinator.com/item?id=42456802
        
       | ubj wrote:
       | What method is Genesis using for JIT compilation? What subset of
       | Python syntax / operations will be supported?
       | 
        | The automatic differentiation seems to be intended for
        | compatibility with PyTorch. Will Genesis be able to interface
        | with JAX as well?
       | 
       | The project looks interesting, but the website is somewhat light
       | on details. In any case, all the best to the developers! It's
       | great to hear about various efforts in the space of
       | differentiable simulators.
        
         | sroussey wrote:
         | I believe they use Taichi.
        
         | dragonwriter wrote:
         | > What method is Genesis using for JIT compilation?
         | 
         | Taichi and Numba are both in the pyproject.toml
        
       | sakras wrote:
       | Maybe I missed it, but are there any performance numbers? It
       | being 100% implemented in Python makes me very suspicious that
       | this won't scale to any kind of large robot.
        
         | v9v wrote:
         | There is enough space on large robots to add in beefier compute
         | if needed (at the expense of power consumption). Python is run
         | all the time on robots. Compute usually becomes more of a
         | problem as the robot gets smaller, but it should still be
         | possible to run the intensive parts of a program on the cloud
         | and stream the results back.
        
         | mccoyb wrote:
         | Python is used here to wrap around some sort of kernel compiler
         | (taichi). Not out of the realm of possibility that kernels
         | which are compiled out of Python source code could be placed on
         | device with some sort of minimal runtime (although taichi
         | executes on CPU via LLVM, so maybe not so minimal)
        
         | dragonwriter wrote:
         | It's implemented in Python, but it is using existing Python
         | libraries which themselves are implemented in C, etc.
         | 
         | Notably it uses both Taichi and Numba, which compile code
         | expressed in (distinct restricted subsets of) Python (much
         | broader in Numba's case) to native CPU/GPU code including
         | parallelization.
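
Both libraries hinge on a decorator that intercepts a restricted-Python function and lowers it to native code. A minimal pure-Python sketch of that call shape (the "compiler" here is a stand-in cache, not Taichi's or Numba's actual machinery):

```python
# Stand-in sketch of the decorator-based JIT pattern Taichi and Numba
# use. The real libraries lower the function body to LLVM/CUDA on the
# first call; this toy version just caches the Python function so the
# compile-once, call-many shape is visible.
def jit(fn):
    cache = {}

    def wrapper(*args):
        if "compiled" not in cache:
            cache["compiled"] = fn  # real JITs emit native code here
        return cache["compiled"](*args)

    return wrapper

@jit
def axpy(a, x, y):
    # Element-wise a*x + y: the kind of flat numeric loop these
    # compilers can parallelize across CPU or GPU threads.
    return [a * xi + yi for xi, yi in zip(x, y)]

print(axpy(2.0, [1.0, 2.0], [10.0, 20.0]))  # [12.0, 24.0]
```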
        
       | andrewsiah wrote:
       | Any roboticists here? Is this impressive/what is the impact of
       | this?
        
       | etwigg wrote:
       | In the sizzle reel, the early waterdrop demos are beautiful but
       | seem staged, the later robotics demos look more plausible and
       | very impressive. But referring to all these "4D dynamical worlds"
       | sounds overhyped / scammy - everyone else calls 3D space
       | simulated through time a 3D world.
       | 
       | > Genesis's physics engine is developed in pure Python, while
       | being 10-80x faster than existing GPU-accelerated stacks like
       | Isaac Gym and MJX. ... Nvidia brought GPU acceleration to robotic
       | simulation, speeding up simulation speed by more than one order
       | of magnitude compared to CPU-based simulation. ... Genesis pushes
       | up this speed by another order of magnitude.
       | 
       | I can believe that setting up some kind of compute pipeline in a
       | high level language such as Python could be fast, but the
       | marketing materials aren't explaining any of the "how", if it's
       | real it must be GPU-accelerated, but they almost imply that it
       | isn't. Looks neat, hope it works great!
        
         | erwincoumans wrote:
         | It is a nice physics engine, it uses Taichi
         | (https://github.com/taichi-dev/taichi) to compile Python code
         | to CUDA/GPU (similar to what Warp Sim does,
         | https://github.com/NVIDIA/warp)
        
         | KaiserPro wrote:
         | > "4D dynamical worlds"
         | 
          | It's a feature of that field of science. I'm currently
          | working in a lab that is doing a bunch of things that in
          | papers are described as $adjective-AI. In practice it's just
          | a slightly hyped term (or set of terms), vaguely agreed upon
          | by consensus in weird science-paper English. (In the same way
          | that Gaussian splats are totally just point clouds with
          | efficient alpha blending [only slightly more complex, please
          | don't just take my word for it].)
         | 
          | You probably understand what this term is meant to describe,
          | but spelling it out gives a bit of insight into _why_ it's
          | got such a shite name.
         | 
         | o "4d": because its doing things over time. Normally thats a
         | static scene with a camera flying through it (3D). when you
         | have stuff other than the camera moving, you get an extra
         | dimension, hence 4D.
         | 
         | o "dynamical" (god I hate this) dynamic means that objects in
         | the video are moving around. So you can just used the multiple
         | camera locations to build up a single view of an object or
         | room, you need to account for movement of things in the scene.
         | 
         | o "worlds" to highlight that its not just one room being re-
         | used over and over, its a generator (well its not, but thats
         | for another post) of diverse scenes that can represent many
         | locations around the world.
        
           | ggerules wrote:
            | They could be implying a little bit of computer graphics
            | in the mix. In homogeneous coordinates, the rotation,
            | shear, and translation matrices used in graphics are 4x4.
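
For the unfamiliar, a quick illustration of the homogeneous-coordinate convention (plain Python, no graphics library needed): appending a 1 to each 3D point lets translation, which is not a linear map in 3D, ride along in the same 4x4 matrix multiply as rotation and shear.

```python
def apply_transform(m, p):
    # Multiply a 4x4 row-major matrix by the point (x, y, z, 1) and
    # return the transformed 3D point.
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(3))

# A translation by (1, 2, 3), expressed as a 4x4 matrix.
T = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]

print(apply_transform(T, (0.0, 0.0, 0.0)))  # (1.0, 2.0, 3.0)
```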
        
         | rirarobo wrote:
         | > But referring to all these "4D dynamical worlds" sounds
         | overhyped / scammy - everyone else calls 3D space simulated
         | through time a 3D world.
         | 
         | In the research community, "4D" is a commonly used term to
         | differentiate from work on static 3D objects and environments,
         | especially in recent years since the advent of NeRF.
         | 
         | The term "dynamic" has long been used similarly, but sometimes
         | connotes a narrower scope. For example, reconstruction of cloth
         | dynamics from an RGBD sensor, human body motion from a multi-
         | view camera rig, or a scene from video, but assuming that the
         | scene can be decomposed into rigid objects with their
         | individual dynamics and an otherwise static environment. An
         | even narrower related term in this space would be
         | "articulated", such as reconstruction of humans, animals, or
         | objects with moving parts. However, the representations used in
         | prior works typically did not generalize outside their target
         | domains.
         | 
         | So, "4D" has become more common recently to reflect the
         | development of more general representations that can be used to
         | model dynamic objects and environments.
         | 
         | If you'd like to find related work, I'd recommend searching in
         | conjunction with a conference name to start, e.g. "4D CVPR" or
         | "4D NeurIPS", and then digging into webpages of specific
         | researchers or lab groups. Here are a couple interesting
         | related works I found:
         | 
          | https://shape-of-motion.github.io/
          | https://generative-dynamics.github.io/
          | https://stereo4d.github.io/
          | https://make-a-video3d.github.io/
         | 
         | All that considered, "4D dynamical worlds" does feel like
         | buzzword salad, even if the intended audience is the research
         | community, for two main reasons. First, it's as if some authors
         | with a background in physics simulation wanted to reference
         | "dynamical systems", but none of the prior work in 4D
         | reconstruction/generation uses "dynamical", they use "dynamic".
         | Second, as described above, the whole point of "4D" is that
          | it's more general than "dynamic"; using both is redundant.
          | So, "4D worlds" would be more appropriate IMO.
        
       | fudged71 wrote:
       | What does it mean that gs.generate() is missing in the project?
        
         | AuryGlenz wrote:
         | "Currently, we are open-sourcing the underlying physics engine
         | and the simulation platform. Access to the generative framework
         | will be rolled out gradually in the near future."
        
       | extr wrote:
       | I saw this on twitter and actually came on HN to see if there was
       | a thread with more details. The demo on twitter was frankly
       | unbelievable. Show me a water droplet falling...okay...now add a
       | live force diagram that is perfectly rendered by just asking for
       | it? What? Doesn't seem possible/real. And yet it seems reputable,
        | the docs/tech look legit, they just haven't released the
        | generative part yet.
       | 
       | What is going on here? Is the demo just some researchers getting
       | carried away and overpromising, hiding some major behind the
       | scenes work to make that video?
        
         | vagabund wrote:
         | My understanding is they built a performant suite of simulation
         | tools from the ground up, and then they expose those tools via
         | API to an "agent" that can compose them to accomplish the
         | user's ask. It's probably less general than the prompt
         | interface implies, but still seems incredibly useful.
        
         | upcoming-sesame wrote:
         | The values on the forces diagram can't be real
        
       | baq wrote:
       | I was mildly impressed with the water demo, but that robot thing
       | is kinda crazy, really. Finally looks like a framework for AI
       | which can do my laundry.
        
       | forrestthewoods wrote:
       | The GitHub claims:
       | 
       | > Genesis delivers an unprecedented simulation speed -- over 43
       | million FPS when simulating a Franka robotic arm with a single
       | RTX 4090 (430,000 times faster than real-time).
       | 
       | That math works out to... 23.26 nanoseconds per frame. Uhh... no
       | they don't simulate a robot arm in 23 nanoseconds? That's
       | literally twice as fast as a single cache miss?
       | 
       | They may have an interesting platform. I'm not sure. But some of
       | their claims scream exaggeration which makes me not trust other
       | claims.
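
The arithmetic being questioned, spelled out:

```python
# 43 million frames per second implies ~23 ns of wall time per frame
# if the frames were simulated one after another.
fps = 43_000_000
ns_per_frame = 1e9 / fps
print(round(ns_per_frame, 2))  # 23.26
```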
        
         | reitzensteinm wrote:
         | It's possible they're executing many simulations in parallel,
         | and counting that. 16k robot arms executing at 3k FPS each is
         | much more reasonable on a 4090. If you're effectively fuzzing
         | for edge cases, this would have value.
        
           | forrestthewoods wrote:
           | Yeah it's gotta be something like that. The whole claim comes
           | across as rather dishonest. If you're simulating 16,000 arms
           | at 3000 fps each then say that. Thats great. Be clear and
           | concise with your claims.
        
             | reitzensteinm wrote:
             | Agreed.
        
           | cyber_kinetist wrote:
          | The reason they use the FPS (frames-per-second) term in a
          | different way is that this robotics simulator is primarily
          | going to be used for reinforcement learning, where you run
          | thousands of agents in parallel. In that context, the total
          | "batched" throughput (how many frames you can generate per
          | second across all environments) is what matters for training
          | your policy network quickly, more than the actual latency
          | between frames (which matters more for real-time tasks like
          | gaming).
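
Under that reading, the headline number is aggregate throughput, not per-frame latency. With hypothetical numbers (the Genesis materials don't state this exact split):

```python
# Aggregate FPS across a batch of parallel environments. The per-env
# figures here are illustrative, not taken from Genesis's benchmarks.
n_envs = 16_000
fps_per_env = 3_000
aggregate_fps = n_envs * fps_per_env
print(aggregate_fps)  # 48000000, the same order as "43 million FPS"
```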
        
         | GrantMoyer wrote:
          | The fine print at the bottom of the speed comparison video
          | on the project homepage says "With `hibernation = True`".
          | Based on a search through the code, the hibernation setting
          | appears to skip simulating components which reach steady
          | state.
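
A hedged sketch of what such a hibernation check could look like (field names and thresholds are hypothetical, not Genesis's actual implementation): bodies whose velocity stays below a threshold for several consecutive steps are flagged asleep and skipped until disturbed.

```python
# Hypothetical hibernation logic: skip integrating bodies that have
# settled into a steady state. Thresholds and field names are made up
# for illustration.
SLEEP_SPEED = 1e-4   # below this speed a body is considered at rest
SLEEP_STEPS = 10     # ...for this many consecutive steps

def step(bodies, dt):
    for b in bodies:
        if b["asleep"]:
            continue  # hibernating: no integration work at all
        b["pos"] += b["vel"] * dt
        if abs(b["vel"]) < SLEEP_SPEED:
            b["still_steps"] += 1
            if b["still_steps"] >= SLEEP_STEPS:
                b["asleep"] = True
        else:
            b["still_steps"] = 0

body = {"pos": 0.0, "vel": 0.0, "asleep": False, "still_steps": 0}
for _ in range(SLEEP_STEPS):
    step([body], dt=0.01)
print(body["asleep"])  # True: the body hibernates after 10 still steps
```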
        
       | ilaksh wrote:
       | I suspect that the actual generation and simulation/rendering
       | takes several minutes for each step.
        
         | psb217 wrote:
         | The simulation/rendering is actually pretty fast since it's all
         | done by heavily optimized gpu-based physics and graphics
         | engines. The "generative" part is that they have some LLM stuff
         | that's finetuned for generating configurations/parameters for
          | the physics engine conditioned on some text. I.e., the physics
         | and graphics are classical clockworky simulations, with a
         | generative frontend to make it easier (but less precise) to get
         | a world up and running. The open source release currently
         | provides the clockworky simulator stuff, with the generative
         | frontend to be released some time in the future.
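
In other words, the LLM's output would just be ordinary engine input. A hypothetical example of the kind of configuration such a frontend might emit (structure and keys invented for illustration; not Genesis's actual schema):

```python
# Hypothetical: what "text in, simulator config out" could look like.
# The generative frontend turns a prompt into plain parameters; the
# classical engine never sees the text, only the config.
prompt = "a water droplet falling onto a flat surface"

scene_config = {
    "solver": "mpm",                  # material point method for fluids
    "gravity": (0.0, 0.0, -9.81),     # m/s^2
    "dt": 1e-3,                       # seconds per substep
    "objects": [
        {"type": "droplet", "material": "water", "radius_m": 0.002,
         "position": (0.0, 0.0, 0.1)},
        {"type": "plane", "material": "rigid", "normal": (0.0, 0.0, 1.0)},
    ],
}

print(scene_config["solver"])  # mpm
```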
        
       | a_t48 wrote:
       | This looks neat. Single step available - as far as I can tell
       | though, no LIDAR, no wheels? Very arm/vision focused. There's
       | nothing wrong with that, but robotics encompasses a huge space to
       | simulate, which is why I haven't yet done my own simulator. Would
       | love a generic simulation engine to plug my framework into, but
       | this is missing a few things I need.
        
       | ChrisArchitect wrote:
       | [dupe]
       | 
       | Earlier project page:
       | https://news.ycombinator.com/item?id=42456802
        
       | dr_kretyn wrote:
       | 100% python and fast? Either it isn't 100% python, or it isn't
       | fast.
        
         | zamadatix wrote:
         | Depends where your boundary for "100% Anything" is I suppose.
         | It seems to use GPU accelerated kernels written in Python via
         | the Taichi library for most of the physics calculations. At
         | some point, sure, the OS+GPU driver+GPU firmware you need to
         | run the GPU accelerated kernel are not written in Python (and
         | if you run it on CPU instead it will be slow, but more because
         | you're using the CPU than you're not using C or something).
         | There is a bit of numpy too, which eventually boils down to
         | some non-Python stuff (as any Python code eventually will). I'm
         | not sure that's a useful distinction or that the choice of
         | language in defining the kernels makes a meaningful difference
         | on the overall performance in this case.
        
           | dr_kretyn wrote:
            | The docs emphasize "100% Python" and that the backend is
            | natively Python. I'm reading this as "you don't need
            | anything other than a Python interpreter." Given that a
            | large number of the packages aren't Python under the hood,
            | that's a big, unnecessary hyperbole. It's OK to acknowledge
            | that there's a heavy reliance on non-Python code, e.g.
            | Taichi or NumPy.
            | 
            | I also think the distinction isn't particularly useful.
            | Pedantic claims will just get pedantic feedback.
        
             | dragonwriter wrote:
             | It's particularly useful if it is an open source project
             | and you want to communicate to people who might want to
             | hack on it (either in a fork or the main project) what
             | languages they will need to work directly with to do so.
             | 
             | It's not important to end users, but they aren't the only
             | audience.
        
         | dragonwriter wrote:
          | The Genesis code itself is 100% Python. The underlying Python
          | libraries it uses are not (just as, for that matter, the
          | Python _standard_ library isn't). In particular, it uses
          | Numba, which compiles fairly normal Python to native CPU and
          | optionally GPU code, and Taichi, which compiles very
          | specially-crafted Python to kernels for the GPU.
        
       | aeon-vadu wrote:
        | So we can run AI agents with RL in molecular-level simulations
        | to replace product design, mechanical engineering, electrical
        | engineering, aerospace engineering, and everything else,
        | right!!? If we can combine protein folding too, then we could
        | possibly solve any disease and poverty with full automation.
        
       ___________________________________________________________________
       (page generated 2024-12-19 23:01 UTC)