[HN Gopher] Genesis - a generative physics engine for general-purpose robotics
___________________________________________________________________
Genesis - a generative physics engine for general-purpose robotics
Author : tomp
Score : 161 points
Date : 2024-12-19 00:54 UTC (22 hours ago)
(HTM) web link (genesis-world.readthedocs.io)
(TXT) w3m dump (genesis-world.readthedocs.io)
| tomp wrote:
| Twitter announcement:
| https://x.com/zhou_xian_/status/1869511650782658846
|
| GitHub: https://github.com/Genesis-Embodied-AI/Genesis
|
| academic project page: https://genesis-embodied-ai.github.io
| gnabgib wrote:
| HN: https://news.ycombinator.com/item?id=42456802
| ubj wrote:
| What method is Genesis using for JIT compilation? What subset of
| Python syntax / operations will be supported?
|
| The automatic differentiation seems to be intended for
| compatibility with PyTorch. Will Genesis be able to interface
| with JAX as well?
|
| The project looks interesting, but the website is somewhat light
| on details. In any case, all the best to the developers! It's
| great to hear about various efforts in the space of
| differentiable simulators.
| sroussey wrote:
| I believe they use Taichi.
| dragonwriter wrote:
| > What method is Genesis using for JIT compilation?
|
| Taichi and Numba are both in the pyproject.toml
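|
| For a flavor of what Taichi looks like, here's a minimal kernel
| sketch (not from the Genesis codebase): ordinary-looking Python
| that Taichi JIT-compiles to parallel CPU/GPU code.
|
|     import taichi as ti
|     ti.init(arch=ti.gpu)  # falls back to CPU if no GPU is found
|
|     n = 1024
|     x = ti.field(dtype=ti.f32, shape=n)
|
|     @ti.kernel
|     def scale(k: ti.f32):
|         for i in x:  # outermost loop is parallelized automatically
|             x[i] *= k
|
|     scale(2.0)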
| sakras wrote:
| Maybe I missed it, but are there any performance numbers? It
| being 100% implemented in Python makes me suspect it won't
| scale to any kind of large robot.
| v9v wrote:
| There is enough space on large robots to add in beefier compute
| if needed (at the expense of power consumption). Python is run
| all the time on robots. Compute usually becomes more of a
| problem as the robot gets smaller, but it should still be
| possible to run the intensive parts of a program on the cloud
| and stream the results back.
| mccoyb wrote:
| Python is used here as a wrapper around a kernel compiler
| (Taichi). It's not out of the realm of possibility that kernels
| compiled from Python source code could be placed on device
| with some sort of minimal runtime (although Taichi executes on
| CPU via LLVM, so maybe not so minimal).
| dragonwriter wrote:
| It's implemented in Python, but it is using existing Python
| libraries which themselves are implemented in C, etc.
|
| Notably it uses both Taichi and Numba, which compile code
| expressed in (distinct restricted subsets of) Python (much
| broader in Numba's case) to native CPU/GPU code including
| parallelization.
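|
| For comparison, a minimal Numba sketch (again, not Genesis
| code): fairly ordinary numerical Python, JIT-compiled and
| parallelized with a decorator.
|
|     import numpy as np
|     from numba import njit, prange
|
|     @njit(parallel=True)
|     def scale(x, k):
|         for i in prange(x.shape[0]):  # parallel loop
|             x[i] *= k
|
|     a = np.ones(1024)
|     scale(a, 2.0)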
| andrewsiah wrote:
| Any roboticists here? Is this impressive/what is the impact of
| this?
| etwigg wrote:
| In the sizzle reel, the early waterdrop demos are beautiful but
| seem staged, the later robotics demos look more plausible and
| very impressive. But referring to all these "4D dynamical worlds"
| sounds overhyped / scammy - everyone else calls 3D space
| simulated through time a 3D world.
|
| > Genesis's physics engine is developed in pure Python, while
| being 10-80x faster than existing GPU-accelerated stacks like
| Isaac Gym and MJX. ... Nvidia brought GPU acceleration to robotic
| simulation, speeding up simulation speed by more than one order
| of magnitude compared to CPU-based simulation. ... Genesis pushes
| up this speed by another order of magnitude.
|
| I can believe that setting up some kind of compute pipeline in a
| high-level language such as Python could be fast, but the
| marketing materials don't explain any of the "how". If it's
| real it must be GPU-accelerated, but they almost imply that it
| isn't. Looks neat, hope it works great!
| erwincoumans wrote:
| It is a nice physics engine. It uses Taichi
| (https://github.com/taichi-dev/taichi) to compile Python code
| to CUDA/GPU (similar to what Warp Sim does,
| https://github.com/NVIDIA/warp).
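|
| The equivalent idea in Warp looks roughly like this (a minimal
| sketch, not Genesis or Warp Sim code):
|
|     import warp as wp
|     wp.init()
|
|     @wp.kernel
|     def scale(x: wp.array(dtype=float), k: float):
|         i = wp.tid()  # one thread per array element
|         x[i] = x[i] * k
|
|     x = wp.zeros(1024, dtype=float, device="cpu")
|     wp.launch(scale, dim=1024, inputs=[x, 2.0])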
| KaiserPro wrote:
| > "4D dynamical worlds"
|
| It's a feature of that field of science. I'm currently working
| in a lab that is doing a bunch of things that are described in
| papers as $adjective-AI. In practice it's just a slightly hyped
| term, or set of terms, vaguely agreed upon by consensus in the
| weird English of science papers. (In the same way that Gaussian
| splats are totally just point clouds with efficient alpha
| blending [only slightly more complex, please don't just take my
| word for it].)
|
| You probably understand what this term is meant to describe,
| but spelling it out gives a bit of insight into _why_ it's got
| such a shite name.
|
| o "4D": because it's doing things over time. Normally that's a
| static scene with a camera flying through it (3D). When you
| have stuff other than the camera moving, you get an extra
| dimension, hence 4D.
|
| o "dynamical" (god I hate this): dynamic means that objects in
| the video are moving around. So you can't just use the multiple
| camera locations to build up a single view of an object or
| room; you need to account for movement of things in the scene.
|
| o "worlds": to highlight that it's not just one room being re-
| used over and over; it's a generator (well it's not, but that's
| for another post) of diverse scenes that can represent many
| locations around the world.
| ggerules wrote:
| They could be implying a little bit of computer graphics in
| the mix. Rotation, shear, and translation transforms in
| graphics are conventionally represented as 4x4 homogeneous
| matrices.
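|
| As a concrete illustration (a quick NumPy sketch): a rigid
| transform acts on homogeneous points, so rotation and
| translation compose into a single 4x4 matrix.
|
|     import numpy as np
|
|     c, s = np.cos(np.pi / 2), np.sin(np.pi / 2)
|     T = np.array([[c, -s, 0, 1.0],   # 3x3 rotation block plus
|                   [s,  c, 0, 2.0],   # a translation column
|                   [0,  0, 1, 0.0],
|                   [0,  0, 0, 1.0]])
|
|     p = np.array([1.0, 0.0, 0.0, 1.0])  # point (1,0,0), homogeneous
|     print(np.round(T @ p, 3))           # -> [1. 3. 0. 1.]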
| rirarobo wrote:
| > But referring to all these "4D dynamical worlds" sounds
| overhyped / scammy - everyone else calls 3D space simulated
| through time a 3D world.
|
| In the research community, "4D" is a commonly used term to
| differentiate from work on static 3D objects and environments,
| especially in recent years since the advent of NeRF.
|
| The term "dynamic" has long been used similarly, but sometimes
| connotes a narrower scope. For example, reconstruction of cloth
| dynamics from an RGBD sensor, human body motion from a multi-
| view camera rig, or a scene from video, but assuming that the
| scene can be decomposed into rigid objects with their
| individual dynamics and an otherwise static environment. An
| even narrower related term in this space would be
| "articulated", such as reconstruction of humans, animals, or
| objects with moving parts. However, the representations used in
| prior works typically did not generalize outside their target
| domains.
|
| So, "4D" has become more common recently to reflect the
| development of more general representations that can be used to
| model dynamic objects and environments.
|
| If you'd like to find related work, I'd recommend searching in
| conjunction with a conference name to start, e.g. "4D CVPR" or
| "4D NeurIPS", and then digging into webpages of specific
| researchers or lab groups. Here are a couple interesting
| related works I found:
|
| https://shape-of-motion.github.io/
| https://generative-dynamics.github.io/
| https://stereo4d.github.io/
| https://make-a-video3d.github.io/
|
| All that considered, "4D dynamical worlds" does feel like
| buzzword salad, even if the intended audience is the research
| community, for two main reasons. First, it's as if some authors
| with a background in physics simulation wanted to reference
| "dynamical systems", but none of the prior work in 4D
| reconstruction/generation uses "dynamical", they use "dynamic".
| Second, as described above, the whole point of "4D" is that
| it's more general than "dynamic", using both is redundant. So,
| "4D worlds" would be more appropriate IMO.
| fudged71 wrote:
| What does it mean that gs.generate() is missing in the project?
| AuryGlenz wrote:
| "Currently, we are open-sourcing the underlying physics engine
| and the simulation platform. Access to the generative framework
| will be rolled out gradually in the near future."
| extr wrote:
| I saw this on twitter and actually came on HN to see if there was
| a thread with more details. The demo on twitter was frankly
| unbelievable. Show me a water droplet falling...okay...now add a
| live force diagram that is perfectly rendered by just asking for
| it? What? Doesn't seem possible/real. And yet it seems reputable,
| the docs/tech look legit, they just "haven't released the
| generative part yet".
|
| What is going on here? Is the demo just some researchers getting
| carried away and overpromising, hiding some major behind the
| scenes work to make that video?
| vagabund wrote:
| My understanding is they built a performant suite of simulation
| tools from the ground up, and then they expose those tools via
| API to an "agent" that can compose them to accomplish the
| user's ask. It's probably less general than the prompt
| interface implies, but still seems incredibly useful.
| upcoming-sesame wrote:
| The values on the forces diagram can't be real
| baq wrote:
| I was mildly impressed with the water demo, but that robot thing
| is kinda crazy, really. Finally looks like a framework for AI
| which can do my laundry.
| forrestthewoods wrote:
| The GitHub claims:
|
| > Genesis delivers an unprecedented simulation speed -- over 43
| million FPS when simulating a Franka robotic arm with a single
| RTX 4090 (430,000 times faster than real-time).
|
| That math works out to... 23.26 nanoseconds per frame. Uhh... no
| they don't simulate a robot arm in 23 nanoseconds? That's
| literally twice as fast as a single cache miss?
|
| They may have an interesting platform. I'm not sure. But some of
| their claims scream exaggeration which makes me not trust other
| claims.
| reitzensteinm wrote:
| It's possible they're executing many simulations in parallel,
| and counting that. 16k robot arms executing at 3k FPS each is
| much more reasonable on a 4090. If you're effectively fuzzing
| for edge cases, this would have value.
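|
| A quick sanity check of that reading (batch size is my guess):
|
|     total_fps = 43_000_000        # claimed aggregate throughput
|     num_envs = 16_384             # hypothetical parallel envs
|     print(total_fps / num_envs)   # ~2625 steps/s per environment
|     print(num_envs / total_fps)   # ~381 us per batched step
|
| So each individual simulation would run at a few thousand steps
| per second, which is plausible; the 23 ns figure only appears
| when you divide wall time by the whole batch.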
| forrestthewoods wrote:
| Yeah it's gotta be something like that. The whole claim comes
| across as rather dishonest. If you're simulating 16,000 arms
| at 3000 fps each, then say that. That's great. Be clear and
| concise with your claims.
| reitzensteinm wrote:
| Agreed.
| cyber_kinetist wrote:
| The reason they use the FPS (frames-per-second) term in a
| different way is that this robotics simulator is primarily
| going to be used for reinforcement learning, where you run
| thousands of agents in parallel. In that context, the total
| "batched" throughput (how many frames you can generate per
| second across all environments) is crucial for training your
| policy network quickly, and matters more than the actual
| latency between frames (which is more important for real-time
| tasks like gaming).
| GrantMoyer wrote:
| The fine print at the bottom of the speed comparison video on
| the project homepage says "With `hibernation = True`". Based on
| a search through the code, the hibernation setting appears to
| skip simulating components which have reached steady state.
| ilaksh wrote:
| I suspect that the actual generation and simulation/rendering
| takes several minutes for each step.
| psb217 wrote:
| The simulation/rendering is actually pretty fast since it's all
| done by heavily optimized gpu-based physics and graphics
| engines. The "generative" part is that they have some LLM stuff
| that's finetuned for generating configurations/parameters for
| the physics engine conditioned on some text. I.e., the physics
| and graphics are classical clockworky simulations, with a
| generative frontend to make it easier (but less precise) to get
| a world up and running. The open source release currently
| provides the clockworky simulator stuff, with the generative
| frontend to be released some time in the future.
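|
| In toy form (all names invented, just to illustrate the split):
|
|     def fake_llm_frontend(prompt: str) -> dict:
|         # stand-in for a finetuned LLM emitting simulator params
|         return {"gravity": -9.81, "dt": 0.01, "z0": 1.0}
|
|     def simulate(cfg: dict, steps: int = 3) -> None:
|         z, v = cfg["z0"], 0.0
|         for _ in range(steps):   # classical integration, no ML here
|             v += cfg["gravity"] * cfg["dt"]
|             z += v * cfg["dt"]
|             print(f"droplet height: {z:.3f} m")
|
|     simulate(fake_llm_frontend("a water droplet falling"))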
| a_t48 wrote:
| This looks neat. Single step available - as far as I can tell
| though, no LIDAR, no wheels? Very arm/vision focused. There's
| nothing wrong with that, but robotics encompasses a huge space to
| simulate, which is why I haven't yet done my own simulator. Would
| love a generic simulation engine to plug my framework into, but
| this is missing a few things I need.
| ChrisArchitect wrote:
| [dupe]
|
| Earlier project page:
| https://news.ycombinator.com/item?id=42456802
| dr_kretyn wrote:
| 100% python and fast? Either it isn't 100% python, or it isn't
| fast.
| zamadatix wrote:
| Depends where your boundary for "100% Anything" is I suppose.
| It seems to use GPU accelerated kernels written in Python via
| the Taichi library for most of the physics calculations. At
| some point, sure, the OS+GPU driver+GPU firmware you need to
| run the GPU accelerated kernel are not written in Python (and
| if you run it on CPU instead it will be slow, but more because
| you're using the CPU than you're not using C or something).
| There is a bit of numpy too, which eventually boils down to
| some non-Python stuff (as any Python code eventually will). I'm
| not sure that's a useful distinction or that the choice of
| language in defining the kernels makes a meaningful difference
| on the overall performance in this case.
| dr_kretyn wrote:
| The doc emphasizes "100% Python" and that the backend is
| natively in Python. I'm reading this as "you don't need
| anything other than a Python interpreter." Given that a large
| number of the packages aren't Python under the hood, that's a
| big, unnecessary hyperbole. It's OK to acknowledge that there's
| a heavy reliance on non-Python code, e.g. Taichi or NumPy.
|
| I also think that the distinction isn't particularly useful.
| Pedantic claims will just get pedantic feedback.
| dragonwriter wrote:
| It's particularly useful if it is an open source project
| and you want to communicate to people who might want to
| hack on it (either in a fork or the main project) what
| languages they will need to work directly with to do so.
|
| It's not important to end users, but they aren't the only
| audience.
| dragonwriter wrote:
| The Genesis code itself is 100% Python. The underlying Python
| libraries it uses are not (just as, for that matter, the Python
| _standard_ library isn't). In particular, it is using Numba -
| which compiles fairly normal Python to CPU and optionally GPU-
| native code - and Taichi, which compiles very specially-crafted
| Python to kernels for the GPU.
| aeon-vadu wrote:
| So we can run AI agents with RL in molecular-level simulations
| to replace product design, mechanical engineering, electrical
| engineering, aerospace engineering, and everything else,
| right!!? If we can combine protein folding too, then we could
| possibly solve any disease and poverty with full automation.
___________________________________________________________________
(page generated 2024-12-19 23:01 UTC)