[HN Gopher] Rendering protein structures inside cells at the ato...
___________________________________________________________________
Rendering protein structures inside cells at the atomic level with
Unreal Engine
Author : Michelangelo11
Score : 146 points
Date : 2024-02-29 14:38 UTC (8 hours ago)
(HTM) web link (www.biorxiv.org)
(TXT) w3m dump (www.biorxiv.org)
| canadiantim wrote:
| Wow, Unreal!
| eurekin wrote:
| One of the videos:
|
| https://twitter.com/manorlaboratory/status/17630941356392943...
|
| Has anybody come across the code?
| gilleain wrote:
| There's a link in the paper to a tutorial:
|
| https://blake.bcm.edu/emanwiki/EMAN2/unreal_render
|
| which uses ChimeraX:
|
| https://www.cgl.ucsf.edu/chimerax/
| dekhn wrote:
| so it's using ChimeraX to turn a PDB file (protein or DNA
| structure) into an isosurface and then triangulating the
| surface into a mesh which is then rendered in unreal.
| gpnt wrote:
| videos:
| https://www.biorxiv.org/content/10.1101/2023.12.08.570879v1....
| COGlory wrote:
| https://github.com/cryoem/eman2/blob/master/programs/e2spt_m...
| gilleain wrote:
| Reminds me of the drawings of David Goodsell:
|
| https://ccsb.scripps.edu/goodsell/
|
| which are similarly about the 'packed' nature of cells.
| timtimmy wrote:
| I just released a biology education app very much like the
| preprint for the Vision Pro launch (and soon for iPad/iPhone).
| I worked with David Goodsell's group to integrate their whole-
| cell bacteria model and David wrote the content. It looks like
| this: https://twitter.com/timd_ca/status/1753250624677007492
| Our first bit of content is a tour through a 300 million atom
| bacteria cell for Apple Vision Pro (>60 fps, stereoscopic,
| atomic resolution).
|
| We developed the tech for iPhone, iPad and AVP mobile GPUs (UE5
| doesn't support this on the devices we're targeting). iPad:
| https://twitter.com/timd_ca/status/1592948101144547328
|
| The linked preprint is beautiful, and I love the pipeline. I
| wonder if it's possible to export to other tools like Blender?
| The linked preprint is part of a pretty cool field of research
| into mesoscale modeling and visualization. For me these are a
| few of the standout papers, projects and works in the area (and
| there are many more):
|
| - le Muzic et al. "Multi-Scale Rendering of Large Biomolecular
| Datasets" 2015 [1]
|
| - - Ivan Viola's group helped pioneer large scale molecular
| visualization. This reference should be in the preprint, IMO.
|
| - Maritan et al. "3D Whole Cell Model of a Mycoplasma
| Bacterium" [2]
|
| - - This is out of David Goodsell's lab and the model I'm
| using.
|
| - Stevens et al. "Molecular dynamics simulation of an entire
| cell" [3]
|
| - Brady Johnston's Molecular Nodes addon for Blender [4]
|
| - YASARA PetWorld [5]
|
| [1] :
| https://www.cg.tuwien.ac.at/research/publications/2015/cellV...
|
| [2] : https://ccsb.scripps.edu/gallery/mycoplasma_model/
|
| [3] :
| https://twitter.com/JanAdStevens/status/1615693906137473030 and
| https://www.frontiersin.org/articles/10.3389/fchem.2023.1106...
|
| [4] : https://bradyajohnston.github.io/MolecularNodes/
|
| [5] : http://download.yasara.org/petworld/index.html
| COGlory wrote:
| Muyuan Chen is one of the (maybe the?) primary developers of the
| subtomogram averaging portion of the EMAN2 software package (linked
| below in another comment). Typically what you do is take a 3D
| tomogram (think like a scan) using a microscope, but it's
| extremely noisy. Then you go through and extract all the
| particles that are identical, but in different orientations, in
| the tomogram. So if the same protein is there multiple times, you
| can align them to each other, and average them together to
| increase the signal. Then you clone back in the higher signal
| averaged volume at the position and orientation that you found
| them in originally.
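|
| A minimal sketch of why averaging helps, assuming the copies are
| already aligned and the noise is independent between copies (real
| subtomogram averaging also has to find each particle's 3D
| orientation first). The toy 1D signal, noise level and function
| names are illustrative assumptions, not EMAN2's API. Averaging N
| copies should shrink the noise by roughly sqrt(N):

```python
import random

def noisy_copy(signal, sigma, rng):
    """Return the signal with independent Gaussian noise added to every sample."""
    return [s + rng.gauss(0.0, sigma) for s in signal]

def average(copies):
    """Average equal-length, already-aligned copies sample by sample."""
    n = len(copies)
    return [sum(values) / n for values in zip(*copies)]

def rms_error(estimate, truth):
    """Root-mean-square difference between an estimate and the true signal."""
    return (sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
truth = [1.0 if 20 <= i < 40 else 0.0 for i in range(64)]  # toy 1D "particle"
sigma = 2.0  # assumed noise level, much larger than the signal

single = noisy_copy(truth, sigma, rng)
averaged = average([noisy_copy(truth, sigma, rng) for _ in range(400)])

err_single = rms_error(single, truth)
err_avg = rms_error(averaged, truth)
print(f"RMS error, one copy:       {err_single:.2f}")
print(f"RMS error, average of 400: {err_avg:.2f}")
```

| With 400 copies the error should drop by about a factor of 20,
| which is why finding many instances of the same protein matters.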
|
| The one-line command to go from EMAN2 coordinates to Unreal
| Engine 5 is kind of crazy.
|
| As usual on these (rare) threads, I'm happy to answer any
| questions about structural biology or cryo-EM.
| frodo8sam wrote:
| Sounds pretty cool, how does EMAN2 deal with dynamic
| structures? I assume you'll get garbage if sufficiently
| different conformations get averaged together. Is there some
| kind of clustering to find similar conformations as is
| sometimes done in cryo-EM?
| COGlory wrote:
| Yes, at multiple levels. You can do a heterogeneous
| refinement that takes the structures and solves for _n_
| number of averages, trying to use something like PCA to
| maximize the distances between the averages. Particles get
| sorted into the average they contribute most constructively
| to. That works well for compositional heterogeneity or large
| conformational differences.
|
| For minor differences, there's something called the Gaussian
| mixture model (lots of software packages have something similar,
| but GMM is EMAN2's version).
|
| https://blake.bcm.edu/emanwiki/EMAN2/e2gmm
|
| What you can get out of the other end is a volume series,
| that shows reconstructed 3D volumes along various axes of
| conformational variability. This quickly turns into a multi-
| dimensional problem, but it has been very successful in, for
| instance, seeing all the different states of an active
| ribosome.
| dalke wrote:
| Do you know what the author means by "Current visualization
| software, such as UCSF ChimeraX6 , can only render one or a few
| protein structures at the atomic level."
|
| I haven't used VMD for about 30 years, but even in the 1990s I
| was using it to visualize the full poliovirus structure (4
| proteins in 2PLV * 60 copies, as I recall).
|
| It took about 6-10 seconds per update on our SGI Onyx, but
| again, that was 25 years ago.
| COGlory wrote:
| I can only guess, but I believe that ChimeraX's rendering
| pipeline is single threaded (just an empirical guess based on
| my CPU usage when using it). Additionally, loading that many
| atom positions requires a huge amount of memory (I routinely
| use > 32 GB memory just loading a few proteins) and things
| start to slow down quite a bit.
|
| Loading a 60-fold icosahedral virus has used > 100 GB memory
| on my workstation, and resulted in a 0fps experience. It
| might render OK from the command line, but now imagine a few
| dozen of those, plus a cell, plus all the proteins in the
| cell...
| dalke wrote:
| Odd. I can't see why. I think we had 128 MB on that IRIX
| box, and I know I loaded a 1 million atom structure with
| copies of 2PLV (full capsid plus a bit more to get to a
| million.)
|
| Each atom record has ~60 bytes (x, y, z, occupancy, bond
| list, resid, segid, atom name, plus higher-level structure
| information about secondary structure, connected fragments,
| etc.) We had our own display list, so another (x, y, z, r,
| color-index) per atom, giving 20 more bytes. We probably
| used a GL/OpenGL display list for the sphere, and immediate
| mode to render that display list for each point, so all-in-
| all about 100 bytes per atom, which just barely fits in 128
| MB.
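|
| The back-of-envelope estimate above can be written out as a few
| lines of arithmetic. The 60- and 20-byte figures are the ones
| quoted; the remaining 20 bytes are an assumed allowance to reach
| the "about 100 bytes per atom" total, and the function name is
| made up for illustration:

```python
ATOM_RECORD_BYTES = 60   # per-atom record, figure quoted in the comment
DISPLAY_LIST_BYTES = 20  # x, y, z, r, color-index, figure quoted in the comment
OVERHEAD_BYTES = 20      # assumed slack to reach the ~100 bytes/atom total

def scene_megabytes(n_atoms, bytes_per_atom):
    """Memory for n_atoms at a fixed per-atom cost, in MB (1 MB = 2**20 bytes)."""
    return n_atoms * bytes_per_atom / 2**20

bytes_per_atom = ATOM_RECORD_BYTES + DISPLAY_LIST_BYTES + OVERHEAD_BYTES
one_million = scene_megabytes(1_000_000, bytes_per_atom)
print(f"{bytes_per_atom} bytes/atom, 1M atoms: {one_million:.1f} MB")
```

| The same function gives a rough figure for larger scenes, for
| example scene_megabytes(300_000_000, 100) for a 300-million-atom
| cell at the same per-atom cost.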
|
| That was also all single-threaded, with a ~0.1 Hz frame
| rate. Again, in the 1990s.
|
| I wanted to see what more recent projects have done. Google
| Scholar found "cellVIEW: a Tool for Illustrative and Multi-
| Scale Rendering of Large Biomolecular Datasets" (2017) at
| https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5747374/ which
| says
|
| > The most widely known visualization softwares are: VMD
| [HDS96], Chimera [PGH*04], Pymol [DeL02], PMV [S*99],
| ePMV [JAG*11]. These tools, however, are not designed to
| render a large number of atoms at interactive frame-rates
| and with full-atomic details (Van der Walls or CPK
| spherical representation). Megamol [GKM*15] is a state-of-
| the-art prototyping and visualization framework designed
| for particle-based data and which currently outperforms any
| other molecular visualisation software or generic
| visualization frameworks such VTK/Paraview [SLM04]. The
| system is able to render up to 100 million atoms at 10 fps
| on commodity hardware, which represents, in terms of size,
| a large virus or a small bacterium.
|
| Following that is a section on Related Work:
|
| > With their new improvement they managed to obtain 3.6 fps
| in full HD resolution for 25 billion atoms on a NVidia GTX
| 580, while Lindow et al. managed to get around 3 fps for 10
| billions atoms in HD resolution on a NVIDIA GTX 285. Le
| Muzic et al. [LMPSV14], introduced another technique for
| fast rendering of large particle-based datasets using the
| GPU rasterization pipeline instead. They were able to
| render up to 30 billions of atoms at 10 fps in full HD
| resolution on a NVidia GTX Titan
|
| Checking up on VMD, in "Atomic detail visualization of
| photosynthetic membranes with GPU-accelerated ray tracing"
| from 2016:
|
| > VMD has achieved direct-to-HMD rendering rates limited by
| the HMD display hardware (75 frames per second on Oculus
| Rift DK2) for moderate complexity scenes containing on the
| order of one million atoms, with direct lighting and a
| small number of ambient occlusion lighting samples.
|
| Those citations are from 6-7 years ago, which makes me scratch
| my head wondering why ChimeraX can't handle a picornavirus.
| dekhn wrote:
| The author of EMAN2 is incorrect, I don't know why they
| claimed that. ChimeraX is probably like Chimera and
| targets 30 fps but can drop below that significantly
| based on dataset size and rendering quality. It should be
| using OpenGL with display lists (or some more modern
| variant on that). The main loop is likely in Python, but
| if you're just moving a molecule around, the rendering
| should touch very little python. On a modern machine with
| an nvidia gaming card it should be fine.
|
| For example, in this case I loaded 2PLV with "open 2PLV"
| and on the right side, there's an option to select one of
| the mmcif assemblies, with select 1 being "complete
| icosahedral assembly, 60 copies of chains 1-4". With the
| default ribbon rendering, rotating is completely smooth;
| with all atoms displayed (wireframe or sphere), it's
| still smooth. Computing a surface for the entire capsid
| takes well under a second(!) and still renders smoothly.
| Rotating shows my GPU (nvidia RTX 3080 Ti) at about 50%
| utilization, and if I exit Chimera, my GPU releases
| ~200MB of memory.
|
| Chimera was never intended to do high quality rendering
| of cellular environments with many hundreds of proteins.
| It was intended for a combination of nice rendering and
| scripting directly in python. VMD definitely handled some
| extremely large scenarios faster. A dedicated small C++ program
| using modern OpenGL would be able to do far, far more
| than Chimera when it comes to simple rendering without
| any scripting control.
| dalke wrote:
| Thanks! That's much more like what I expected.
| COGlory wrote:
| Opening the bio-assembly for 3J31 ate about 8 GB of my
| VRAM, and 32 of my system RAM, in ChimeraX. Which is
| actually less than I remember a few years ago. I wonder
| if the render pipeline has changed a bit. That said, it's
| still very significant if you have 10-20 viruses attached
| to a cell, for instance.
|
| EDIT - that's also for atoms. Going to 3D maps is
| significantly more computationally intensive. A typical
| sub-tomogram average or annotation will be in MRC file
| format, which is horrendously slow with a box size > 1024
| pixels or so.
| timtimmy wrote:
| "Current visualization software, such as UCSF ChimeraX6, can
| only render one or a few protein structures at the atomic
| level."
|
| Lots of current visualization software is focused on
| visualizing a single protein structure (for example,
| ChimeraX). New visualizing and modeling systems are being
| developed to go up in scale to cellular scenes and even whole
| cells. For example, systems like le Muzic et al.'s CellView
| (2015) [1] are capable of rendering atomic resolution whole
| cell datasets like this in realtime:
| https://ccsb.scripps.edu/gallery/mycoplasma_model/
|
| [1] :
| https://www.cg.tuwien.ac.at/research/publications/2015/cellV...
| Vt71fcAqt7 wrote:
| Slightly off topic but what do you think of the potential
| growth for GWAS? How much will it improve in the next 10 years?
| 2x? 10x?
| COGlory wrote:
| Not an expert in that field. I can speculate wildly that I'm
| not super optimistic. I can say I've seen the narrative
| around the field start to give way slightly - shifting from
| "genome is all you need" to "genome is not enough".
|
| The problem is that many of the interesting or urgent
| pathologies have no obvious (or weak) associations. Or maybe
| the noise level is too high. So there's got to be a piece of
| the puzzle we're missing, or something is getting lost in the
| noise. Whether a neural network can pull something out of the
| noise remains to be seen, but if it can't, I'm not super
| optimistic about our chances. Overall I'd say trying to
| tackle the problem at the outcome level is probably more
| promising right now. Even if we can find good associations,
| we're still lacking therapies.
| abraxas wrote:
| this reminds me of the video from years ago that blew my mind:
|
| https://vimeo.com/76306502
| GeoAtreides wrote:
| requires login : (
|
| but it might be this video:
| https://www.youtube.com/watch?v=wJyUtbn0O5Y
| abraxas wrote:
| This is an abbreviated version of the same, yes. I'm surprised
| though that the vimeo link isn't working without login. I
| don't have an account with them and was able to play it
| without issues.
| littlestymaar wrote:
| Same for me, it says something about the video lacking an
| age classification.
| yardshop wrote:
| It didn't require one for me. There was a Google login box
| but I closed it and the video played fine. Maybe a regional
| thing?
| throwaway8877 wrote:
| I have seen this video. The complexity of life is mind blowing.
| Even more so the fact that we know anything about it.
| mchinen wrote:
| This is fascinating to watch.
|
| I understand that they get the proteins from PDB/ChimeraX, but
| how much manual process is involved to map and place the
| individual proteins? The paper says it gets the protein locations
| from CryoET tomograms, but I'd be surprised if these let you
| automatically identify which proteins are where, and exactly how
| to place them, and even less so how the oligomers bind together to
| form larger structures - for example, in the video the membrane
| surfaces are very smooth, and look almost textbook picture
| perfect, which suggests they come from a hardcoded model or are
| smoothed in some way. One part of the paper mentions subtomogram
| averaging from the tomogram, but another mentions:
|
| > From a tomogram (Figure3a), we select the particles and
| determine the orientation of the crown of the spike, as well as
| the stalk that connects the spike to the membrane
|
| Is this a manual process, where the researcher is using his
| mental model of how the proteins are fit and placing/rotating
| individual proteins? Or do the tools they developed let this be
| automated? Both are impressive! If the former, I'm blown away by
| the effort it must take to make these kinds of videos.
| COGlory wrote:
| It's a guided, but automated process. EMAN2 (the software
| used/partially written by the author), for instance, has a
| convolutional neural network particle picker. So you can pick a
| few particles by hand, pick some noise, pick some bad
| particles, train a neural network, and then run that
| network to pick the remaining particles throughout the
| tomogram.
|
| There are a variety of other methods too. There is simple 6
| dimensional real-space cross correlation. You can place points
| by hand, or according to a model. For instance, if you are
| trying to identify viral spike proteins, and the virus is
| spherical, and the spike proteins are always on the surface of
| the sphere, you have a great starting point. So you can say
| "place points at _n_ intervals along the surface of this sphere"
| and then oversample the spherical surface. You then take a
| reference volume (can be generated a number of ways), and check
| each point you placed to see how well it matches the reference
| volume. You can allow for rotations and translations of the
| reference volume at each point, and if you find points that are
| too close together, you can merge them automatically.
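|
| The "place points on a sphere, then merge the ones that end up too
| close" step can be sketched as below. This is a hypothetical
| illustration, not EMAN2 code: it uses a Fibonacci lattice for the
| seed points (an assumption; any reasonably even sampling would do)
| and skips the scoring against a reference volume entirely. The
| radius, point count and merge distance are made-up values.

```python
import math

def fibonacci_sphere(n, radius=1.0):
    """Return n roughly evenly spaced points on a sphere of the given radius."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n   # z runs from near +1 down to near -1
        r = math.sqrt(1.0 - z * z)      # radius of the circle at height z
        theta = golden_angle * i
        points.append((radius * r * math.cos(theta),
                       radius * r * math.sin(theta),
                       radius * z))
    return points

def merge_close(points, min_dist):
    """Greedily keep a point only if it is at least min_dist from every kept point."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= min_dist for q in kept):
            kept.append(p)
    return kept

candidates = fibonacci_sphere(500, radius=50.0)  # oversampled seed positions
merged = merge_close(candidates, min_dist=10.0)  # drop near-duplicate picks
print(len(candidates), "candidates ->", len(merged), "after merging")
```

| In a real pipeline each candidate would first be refined (shifted
| and rotated against the reference) and the merge would keep the
| best-scoring point of each cluster rather than the first one seen.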
|
| If you have a high contrast, relatively static protein (such as
| a ribosome), you can do 2D template matching in the tomogram,
| where you use a central slice (or maybe a collapsed projection)
| to do cross correlation in 3 dimensions instead of six (X
| translation, Y translation, Z rotation). Or you can beef that
| up with more neural network/YOLO type stuff.
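|
| A toy version of 2D template matching by cross correlation,
| assuming a tiny synthetic image and a made-up plus-shaped
| template. It only searches the two translations; a fuller search
| would repeat this for in-plane rotations of the template as well.
| Real packages do this with FFTs rather than the brute-force loop
| shown here.

```python
import random

def cross_correlate(image, template):
    """Score every placement of template inside image by a sum of products.

    Returns a dict mapping the (row, col) of the template's top-left
    corner to the correlation score at that placement.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    scores = {}
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            scores[(r, c)] = sum(
                image[r + dr][c + dc] * template[dr][dc]
                for dr in range(th) for dc in range(tw)
            )
    return scores

# Hypothetical 3x3 "particle" shape; subtract the mean so flat regions score ~0.
plus = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
mean = sum(sum(row) for row in plus) / 9
template = [[v - mean for v in row] for row in plus]

# Noisy 12x12 image with one copy of the shape pasted at a known position.
rng = random.Random(1)
image = [[rng.gauss(0.0, 0.1) for _ in range(12)] for _ in range(12)]
true_pos = (6, 4)
for dr in range(3):
    for dc in range(3):
        image[true_pos[0] + dr][true_pos[1] + dc] += plus[dr][dc]

scores = cross_correlate(image, template)
best = max(scores, key=scores.get)
print("best match at", best, "true position", true_pos)
```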
|
| EDIT: To expand on this, continuous density like membranes can
| be roughly modeled just with techniques like thresholding,
| watershedding, etc. There are some neural nets such as Membrain
| [0] and tomoseg [1] (also by the author of this paper), but
| membranes certainly are trickier. I typically segment membranes
| by hand (and do so rarely).
|
| [0]
| https://www.sciencedirect.com/science/article/pii/S016926072...
|
| [1] https://blake.bcm.edu/emanwiki/EMAN2/Programs/tomoseg
| benjismith wrote:
| This reminds me of the amazing molecular animations of Drew
| Berry, which he showed in this TED talk:
|
| https://youtu.be/WFCvkkDSfIU?si=JNe06VS8TjIrHpqh
|
| Which was 12 years ago! After watching that video, I had a much
| greater appreciation for how our bodies are made up of trillions
| of tiny protein machines. Fascinating stuff!!
| shagie wrote:
| https://xvivo.com/examples/the-inner-life-of-the-cell/
|
| I didn't realize there was also a Powering the Cell:
| Mitochondria video.
|
| The classic one narrated - https://youtu.be/QplXd76lAYQ
|
| One of my pandemic YouTube binges was watching Ron Vale videos
| about kinesin and dynein. https://youtu.be/9RUHJhskW00
| https://youtu.be/lVwKiWSu8XE https://youtu.be/FRtqfpO8THU
|
| And searching for Ron Vale will bring a number of other videos
| about molecular machines.
| benjismith wrote:
| Nice! Thank you for the links!
| dekhn wrote:
| My friend in grad school was in Ron's group. He built a
| microscope that visualized individual kinesin molecules and
| measured their speed using fluorescent labelling. The whole
| thing was held together with a bunch of scripts written in
| LabView. Ron had oodles of money and was able to support
| long-term software development of open source software like
| MicroManager, which gives a common interface to a wide range
| of microscopy software.
|
| The systems he studies are literally little motors that can
| attach to biological surfaces and drive around in specific
| directions, pick up payloads, and then drive to other places.
| They work in very different way from how humans engineer tiny
| motors and understanding/engineering their behavior was a
| major focus in the early 2000s.
| shagie wrote:
| > My friend in grad school was in Ron's group. He built a
| microscope that visualized individual kinesin molecules and
| measured their speed using fluorescent labelling.
|
| In vitro motility of yeast cytoplasmic dynein -
| https://youtu.be/lVwKiWSu8XE?si=Su29neym0wg9DalR&t=627
| dekhn wrote:
| Yeah, exactly that, but with kinesin instead of dynein
| (everybody started with myosin, but lost interest, and
| moved to kinesin and then dynein) and about 10 years
| earlier.
|
| Those little blobs moving along the filaments are ~10-100
| nanometers; you wouldn't normally be able to see them,
| but they managed to tether fluorescent (glowing)
| molecules to them and those act like point sources of
| light, which allows for precise localization because the
| PSF (point spread function) of a point is approximately
| Gaussian, and finding the centroid of a Gaussian is trivial.
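|
| A minimal sketch of the centroid idea, assuming a noise-free,
| background-free Gaussian spot on a pixel grid (real data needs
| background subtraction and usually a proper fit). The spot size,
| position and function names are illustrative. The point is that
| the estimated centre lands between pixels, far finer than the
| width of the blur itself:

```python
import math

def gaussian_spot(size, cx, cy, sigma):
    """Render a size x size image of a 2D Gaussian centred at (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]

def centroid(image):
    """Intensity-weighted mean position (x, y) of an image."""
    total = sum(sum(row) for row in image)
    x = sum(v * col for row in image for col, v in enumerate(row)) / total
    y = sum(v * r for r, row in enumerate(image) for v in row) / total
    return x, y

# Hypothetical spot centred at a sub-pixel position.
spot = gaussian_spot(size=21, cx=10.3, cy=9.6, sigma=2.0)
x, y = centroid(spot)
print(f"estimated centre: ({x:.2f}, {y:.2f})")
```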
| adkaplan wrote:
| Appreciate the share. I've seen a compilation of these clips
| out of context and loved them. Never figured out where they
| came from. They really are amazing in striking the balance
| between organic and mechanistic. The Kinesin in particular are
| cute.
|
| https://www.youtube.com/watch?v=wJyUtbn0O5Y&t=2s
|
| This video shows it quite well too. EDIT: looks like someone
| else shared the same.
| vdm wrote:
| Videos:
| https://www.biorxiv.org/content/10.1101/2023.12.08.570879v1....
| leptons wrote:
| Nice videos but I'm always reminded when watching this type of
| molecular biology video that it's missing all the water
| molecules. These proteins and things aren't floating around in
| empty space.
| amelius wrote:
| What is the meaning of color in these visualizations? Does every
| functional unit have its own color?
| leoncvlt wrote:
| If you get a kick out of 3D renderings of cells and molecules,
| you're gonna have a field day with the work done at
| https://random42.com/. PSA: I started working there as a 3D
| artist but now lead the interactive department. You'd be
| surprised at how much a good art direction really makes a
| difference in scientific visualization. Real-time graphics
| have advanced considerably in the last couple of years, but it's always a
| challenge to transport that nice, smooth pre-rendered look over
| to mobile devices and the web at 60 frames per second (90 on
| virtual reality headsets, to boot...)
| zmmmmm wrote:
| Would love a way to see these in 3D in VR / MR.
| COGlory wrote:
| ChimeraX has VR functionality. Certain modeling programs
| still support Nvidia's Stereo3D (Coot, PyMOL, Chimera,
| ChimeraX, and more) which I still use for modeling.
|
| That relies on X11 unfortunately, so I'm looking for a new way
| to do 3D viewing.
| protoman3000 wrote:
| I would like to ask a question and add before that that I have no
| intent to judge, discredit or diminish the value of this. It's
| merely that I really don't understand and would like to gain
| insight.
|
| The question is: How is this a scientific contribution?
|
| Or, to ask it differently: What makes this a scientific
| contribution?
| bglazer wrote:
| It's more of an engineering accomplishment. I could see this
| being useful for exploratory data analysis of large protein
| (mesoscale) complexes. A surprising amount of science starts
| with a grad student staring at a really complex plot for a
| really long time, then suddenly going "oh shit that's weird".
| That kind of realization is terribly difficult if your
| visualization tools are fighting you the whole time.
___________________________________________________________________
(page generated 2024-02-29 23:01 UTC)