[HN Gopher] Rendering protein structures inside cells at the ato...
       ___________________________________________________________________
        
       Rendering protein structures inside cells at the atomic level with
       Unreal Engine
        
       Author : Michelangelo11
       Score  : 146 points
       Date   : 2024-02-29 14:38 UTC (8 hours ago)
        
 (HTM) web link (www.biorxiv.org)
 (TXT) w3m dump (www.biorxiv.org)
        
       | canadiantim wrote:
       | Wow, Unreal!
        
       | eurekin wrote:
        | One of the videos:
       | 
       | https://twitter.com/manorlaboratory/status/17630941356392943...
       | 
        | Has anybody come across the code?
        
         | gilleain wrote:
          | There's a link in the paper to a tutorial:
         | 
         | https://blake.bcm.edu/emanwiki/EMAN2/unreal_render
         | 
         | which uses ChimeraX:
         | 
         | https://www.cgl.ucsf.edu/chimerax/
        
           | dekhn wrote:
            | So it's using ChimeraX to turn a PDB file (a protein or DNA
            | structure) into an isosurface, triangulating that surface
            | into a mesh, and then rendering the mesh in Unreal.
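            | 
            | A rough sketch of that route as a ChimeraX Python script
            | (command spellings from memory, and the model numbers and
            | contour level are assumptions, so double-check against the
            | ChimeraX docs; this is the general idea, not the paper's
            | actual pipeline):
            | 
            |   # export_mesh.py -- run inside ChimeraX, e.g. with
            |   #   runscript export_mesh.py
            |   # ChimeraX provides 'session' in the script namespace.
            |   from chimerax.core.commands import run
            | 
            |   run(session, "open 2plv")     # fetch a PDB structure
            |   run(session, "molmap #1 8")   # fake 8 A density map
            |   # pick an isosurface contour level (arbitrary value here)
            |   run(session, "volume #2 level 0.1")
            |   # export the surface geometry as glTF, a mesh format
            |   # that game engines such as Unreal can import
            |   run(session, "save capsid.glb")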
        
         | gpnt wrote:
         | videos:
         | https://www.biorxiv.org/content/10.1101/2023.12.08.570879v1....
        
         | COGlory wrote:
         | https://github.com/cryoem/eman2/blob/master/programs/e2spt_m...
        
       | gilleain wrote:
       | Reminds me of the drawings of David Goodsell:
       | 
       | https://ccsb.scripps.edu/goodsell/
       | 
       | which are similarly about the 'packed' nature of cells.
        
         | timtimmy wrote:
          | For the Vision Pro launch (and soon for iPad/iPhone), I just
          | released a biology education app very much like the preprint.
          | I worked with David Goodsell's group to integrate their whole-
          | cell bacterium model, and David wrote the content. It looks
          | like this:
          | https://twitter.com/timd_ca/status/1753250624677007492
          | Our first bit of content is a tour through a 300-million-atom
          | bacterial cell for Apple Vision Pro (>60 fps, stereoscopic,
          | atomic resolution).
         | 
         | We developed the tech for iPhone, iPad and AVP mobile GPUs (UE5
         | doesn't support this on the devices we're targeting). iPad:
         | https://twitter.com/timd_ca/status/1592948101144547328
         | 
         | The linked preprint is beautiful, and I love the pipeline. I
         | wonder if it's possible to export to other tools like Blender?
         | The linked preprint is part of a pretty cool field of research
         | into mesoscale modeling and visualization. For me these are a
         | few of the standout papers, projects and works in the area (and
         | there are many more):
         | 
         | - le Muzic et al. "Multi-Scale Rendering of Large Biomolecular
         | Datasets" 2015 [1]
         | 
         | - - Ivan Viola's group helped pioneer large scale molecular
         | visualization. This reference should be in the preprint, IMO.
         | 
         | - Maritan et al. "3D Whole Cell Model of a Mycoplasma
         | Bacterium" [2]
         | 
         | - - This is out of David Goodsell's lab and the model I'm
         | using.
         | 
         | - Stevens et al. "Molecular dynamics simulation of an entire
         | cell" [3]
         | 
         | - Brady Johnston's Molecular Nodes addon for Blender [4]
         | 
         | - YASARA PetWorld [5]
         | 
         | [1] :
         | https://www.cg.tuwien.ac.at/research/publications/2015/cellV...
         | 
         | [2] : https://ccsb.scripps.edu/gallery/mycoplasma_model/
         | 
         | [3] :
         | https://twitter.com/JanAdStevens/status/1615693906137473030 and
         | https://www.frontiersin.org/articles/10.3389/fchem.2023.1106...
         | 
         | [4] : https://bradyajohnston.github.io/MolecularNodes/
         | 
         | [5] : http://download.yasara.org/petworld/index.html
        
       | COGlory wrote:
        | Muyuan Chen is one of the primary developers (maybe the primary
        | developer?) of the sub-tomogram averaging portion of the EMAN2
        | software package (linked below in another comment). Typically
        | what you do is take a 3D tomogram (think of it like a 3D scan)
        | using a microscope, but it's extremely noisy. Then you go
        | through and extract all the particles in the tomogram that are
        | identical but in different orientations. So if the same protein
        | is there multiple times, you can align the copies to each other
        | and average them together to increase the signal. Then you
        | clone the higher-signal averaged volume back in at the position
        | and orientation where you found each particle originally.
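        | 
        | The "clone back in" step is conceptually simple. A toy
        | numpy/scipy sketch (made-up data layout, not EMAN2's actual
        | code or metadata format; no bounds checking):
        | 
        |   import numpy as np
        |   from scipy.ndimage import affine_transform
        |   from scipy.spatial.transform import Rotation
        | 
        |   def paste_back(tomo_shape, avg, particles):
        |       # particles: list of ((z, y, x) center in voxels,
        |       # zyz Euler angles in degrees) -- an assumed layout
        |       out = np.zeros(tomo_shape, dtype=np.float32)
        |       c = (np.array(avg.shape) - 1) / 2.0
        |       for center, euler in particles:
        |           R = Rotation.from_euler(
        |               "zyz", euler, degrees=True).as_matrix()
        |           # rotate the averaged volume about its own center
        |           rot = affine_transform(
        |               avg, R.T, offset=c - R.T @ c, order=1)
        |           # add it into the big volume at the particle position
        |           start = np.round(np.array(center) - c).astype(int)
        |           z0, y0, x0 = start
        |           dz, dy, dx = avg.shape
        |           out[z0:z0+dz, y0:y0+dy, x0:x0+dx] += rot
        |       return out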
       | 
       | The one-line command to go from EMAN2 coordinates to Unreal
       | Engine 5 is kind of crazy.
       | 
       | As usual on these (rare) threads, I'm happy to answer any
       | questions about structural biology or cryo-EM.
        
         | frodo8sam wrote:
          | Sounds pretty cool. How does EMAN2 deal with dynamic
          | structures? I assume you'll get garbage if sufficiently
          | different conformations get averaged together. Is there some
          | kind of clustering to find similar conformations, as is
          | sometimes done in cryo-EM?
        
           | COGlory wrote:
            | Yes, at multiple levels. You can do a heterogeneous
            | refinement that takes the particles and solves for _n_
            | averages, using something like PCA to maximize the
            | distances between them. Particles get sorted into the
            | average they contribute to most constructively. That works
            | well for compositional heterogeneity or large
            | conformational differences.
           | 
            | For minor differences, there's something called the
            | Gaussian mixture model (lots of software packages have
            | something similar, but the GMM is EMAN2's version).
           | 
           | https://blake.bcm.edu/emanwiki/EMAN2/e2gmm
           | 
            | What you can get out of the other end is a volume series
            | that shows reconstructed 3D volumes along various axes of
            | conformational variability. This quickly turns into a
            | multi-dimensional problem, but it has been very successful
            | in, for instance, seeing all the different states of an
            | active ribosome.
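            | 
            | Stripped of all the cryo-EM machinery, the clustering idea
            | looks something like this (toy sklearn sketch on made-up
            | per-particle feature vectors; this is not EMAN2's GMM):
            | 
            |   import numpy as np
            |   from sklearn.decomposition import PCA
            |   from sklearn.mixture import GaussianMixture
            | 
            |   # stand-in for whatever per-particle embedding the
            |   # refinement produces
            |   feats = np.random.rand(5000, 64)
            | 
            |   # compress, fit a mixture, and assign each particle to
            |   # the class it most likely belongs to
            |   latent = PCA(n_components=8).fit_transform(feats)
            |   gmm = GaussianMixture(n_components=4).fit(latent)
            |   labels = gmm.predict(latent)       # hard class labels
            |   weights = gmm.predict_proba(latent)  # soft assignments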
        
         | dalke wrote:
          | Do you know what the author means by "Current visualization
          | software, such as UCSF ChimeraX [6], can only render one or a
          | few protein structures at the atomic level."
         | 
         | I haven't used VMD for about 30 years, but even in the 1990s I
         | was using it to visualize the full poliovirus structure (4
         | proteins in 2PLV * 60 copies, as I recall).
         | 
         | It took about 6-10 seconds per update on our SGI Onyx, but
         | again, that was 25 years ago.
        
           | COGlory wrote:
            | I can only guess, but I believe ChimeraX's rendering
            | pipeline is single-threaded (just an empirical guess based
            | on my CPU usage when using it). Additionally, loading that
            | many atom positions requires a huge amount of memory (I
            | routinely use > 32 GB just loading a few proteins), and
            | things start to slow down quite a bit.
           | 
            | Loading a 60-fold icosahedral virus has used > 100 GB of
            | memory on my workstation and resulted in a 0 fps
            | experience. It might render OK from the command line, but
            | now imagine a few dozen of those, plus a cell, plus all the
            | proteins in the cell...
        
             | dalke wrote:
             | Odd. I can't see why. I think we had 128 MB on that IRIX
             | box, and I know I loaded a 1 million atom structure with
             | copies of 2PLV (full capsid plus a bit more to get to a
             | million.)
             | 
              | Each atom record has ~60 bytes (x, y, z, occupancy, bond
              | list, resid, segid, atom name, plus higher-level
              | structure information about secondary structure,
              | connected fragments, etc.). We had our own display list,
              | so another (x, y, z, r, color-index) per atom, giving 20
              | more bytes. We probably used a GL/OpenGL display list for
              | the sphere, and immediate mode to render that display
              | list for each point, so all-in-all about 100 bytes per
              | atom, which just barely fits in 128 MB.
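              | 
              | Back-of-the-envelope, if anyone wants to check the
              | arithmetic (numbers from the estimate above):
              | 
              |   atoms = 1_000_000
              |   record = 60   # coords, occupancy, bonds, names, ...
              |   display = 20  # x, y, z, r, color index
              |   misc = 20     # rough allowance for everything else
              |   total = atoms * (record + display + misc)
              |   print(total / 2**20, "MiB")  # ~95 MiB vs 128 MB of RAM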
             | 
             | That was also all single-threaded, with a ~0.1 Hz frame
             | rate. Again, in the 1990s.
             | 
             | I wanted to see what more recent projects have done. Google
             | Scholar found "cellVIEW: a Tool for Illustrative and Multi-
             | Scale Rendering of Large Biomolecular Datasets" (2017) at
             | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5747374/ which
             | says
             | 
              | > The most widely known visualization softwares are: VMD
              | [HDS96], Chimera [PGH*04], Pymol [DeL02], PMV [S*99],
              | ePMV [JAG*11]. These tools, however, are not designed to
              | render a large number of atoms at interactive frame-rates
              | and with full-atomic details (Van der Waals or CPK
              | spherical representation). Megamol [GKM*15] is a state-
              | of-the-art prototyping and visualization framework
              | designed for particle-based data and which currently
              | outperforms any other molecular visualisation software or
              | generic visualization frameworks such as VTK/Paraview
              | [SLM04]. The system is able to render up to 100 million
              | atoms at 10 fps on commodity hardware, which represents,
              | in terms of size, a large virus or a small bacterium.
             | 
             | Following that is a section on Related Work:
             | 
              | > With their new improvement they managed to obtain 3.6
              | fps in full HD resolution for 25 billion atoms on an
              | NVidia GTX 580, while Lindow et al. managed to get around
              | 3 fps for 10 billion atoms in HD resolution on an NVIDIA
              | GTX 285. Le Muzic et al. [LMPSV14] introduced another
              | technique for fast rendering of large particle-based
              | datasets using the GPU rasterization pipeline instead.
              | They were able to render up to 30 billion atoms at 10 fps
              | in full HD resolution on an NVidia GTX Titan
             | 
             | Checking up on VMD, in "Atomic detail visualization of
             | photosynthetic membranes with GPU-accelerated ray tracing"
             | from 2016:
             | 
             | > VMD has achieved direct-to-HMD rendering rates limited by
             | the HMD display hardware (75 frames per second on Oculus
             | Rift DK2) for moderate complexity scenes containing on the
             | order of one million atoms, with direct lighting and a
             | small number of ambient occlusion lighting samples.
             | 
              | Those citations are from 6-7 years ago, which makes me
              | scratch my head wondering why ChimeraX can't handle a
              | picornavirus.
        
               | dekhn wrote:
                | The author of EMAN2 is incorrect; I don't know why they
                | claimed that. ChimeraX is probably like Chimera and
                | targets 30 fps, but it can drop below that
                | significantly based on dataset size and rendering
                | quality. It should be using OpenGL with display lists
                | (or some more modern variant of that). The main loop is
                | likely in Python, but if you're just moving a molecule
                | around, the rendering should touch very little Python.
                | On a modern machine with an Nvidia gaming card it
                | should be fine.
               | 
                | For example, in this case I loaded 2PLV with "open
                | 2PLV", and on the right side there's an option to
                | select one of the mmCIF assemblies, with assembly 1
                | being "complete icosahedral assembly, 60 copies of
                | chains 1-4". With the default ribbon rendering,
                | rotating is completely smooth; with all atoms displayed
                | (wireframe or sphere), it's still smooth. Computing a
                | surface for the entire capsid takes well under a
                | second(!) and still renders smoothly. Rotating shows my
                | GPU (Nvidia RTX 3080 Ti) at about 50% utilization, and
                | if I exit Chimera, my GPU releases ~200 MB of memory.
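                | 
                | For anyone who wants to repeat that quick test, the
                | equivalent ChimeraX Python script is roughly (command
                | names from memory; check the ChimeraX user guide):
                | 
                |   # run inside ChimeraX with: runscript test_2plv.py
                |   from chimerax.core.commands import run
                | 
                |   run(session, "open 2plv")
                |   # expand to the 60-copy icosahedral assembly
                |   run(session, "sym #1 assembly 1")
                |   # surface the whole capsid
                |   run(session, "surface")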
               | 
                | Chimera was never intended to do high-quality rendering
                | of cellular environments with many hundreds of
                | proteins. It was intended for a combination of nice
                | rendering and scripting directly in Python. VMD
                | definitely handled some extremely large scenarios
                | faster. A dedicated small C++ program using modern
                | OpenGL would be able to do far, far more than Chimera
                | when it comes to simple rendering without any scripting
                | control.
        
               | dalke wrote:
               | Thanks! That's much more like what I expected.
        
               | COGlory wrote:
                | Opening the bio-assembly for 3J31 ate about 8 GB of my
                | VRAM, and 32 GB of my system RAM, in ChimeraX. That's
                | actually less than I remember from a few years ago; I
                | wonder if the render pipeline has changed a bit. That
                | said, it's still very significant if you have 10-20
                | viruses attached to a cell, for instance.
                | 
                | EDIT - that's also just for atoms. Going to 3D maps is
                | significantly more computationally intensive. A typical
                | sub-tomogram average or annotation will be in MRC file
                | format, which is horrendously slow with a box size >
                | 1024 pixels or so.
        
           | timtimmy wrote:
           | "Current visualization software, such as UCSF ChimeraX6, can
           | only render one or a few protein structures at the atomic
           | level."
           | 
            | Lots of current visualization software is focused on
            | visualizing a single protein structure (for example,
            | ChimeraX). New visualization and modeling systems are being
            | developed to scale up to cellular scenes and even whole
            | cells. For example, systems like le Muzic et al.'s cellVIEW
            | (2015) [1] are capable of rendering atomic-resolution
            | whole-cell datasets like this in real time:
            | https://ccsb.scripps.edu/gallery/mycoplasma_model/
           | 
           | [1] : https://www.cg.tuwien.ac.at/research/publications/2015/
           | cellV...
        
         | Vt71fcAqt7 wrote:
         | Slightly off topic but what do you think of the potential
         | growth for GWAS? How much will it improve in the next 10 years?
         | 2x? 10x?
        
           | COGlory wrote:
            | I'm not an expert in that field. I can only speculate
            | wildly, and I'm not super optimistic. I can say I've seen
            | the narrative around the field start to give way slightly -
            | shifting from "genome is all you need" to "genome is not
            | enough".
           | 
            | The problem is that many of the interesting or urgent
            | pathologies have no obvious (or only weak) associations, or
            | maybe the noise level is too high. So there's got to be a
            | piece of the puzzle we're missing, or something is getting
            | lost in the noise. Whether a neural network can pull
            | something out of the noise remains to be seen, but if it
            | can't, I'm not super optimistic about our chances. Overall
            | I'd say trying to tackle the problem at the outcome level
            | is probably more promising right now. Even if we can find
            | good associations, we're still lacking therapies.
        
       | abraxas wrote:
        | This reminds me of the video from years ago that blew my mind:
       | 
       | https://vimeo.com/76306502
        
         | GeoAtreides wrote:
          | Requires login :(
         | 
         | but it might be this video:
         | https://www.youtube.com/watch?v=wJyUtbn0O5Y
        
           | abraxas wrote:
            | This is an abbreviated version of the same one, yes. I'm
            | surprised, though, that the Vimeo link isn't working
            | without login. I don't have an account with them and was
            | able to play it without issues.
        
             | littlestymaar wrote:
             | Same for me, it says something about the video lacking an
             | age classification.
        
           | yardshop wrote:
           | It didn't require one for me. There was a Google login box
           | but I closed it and the video played fine. Maybe a regional
           | thing?
        
         | throwaway8877 wrote:
         | I have seen this video. The complexity of life is mind blowing.
         | Even more so the fact that we know anything about it.
        
       | mchinen wrote:
       | This is fascinating to watch.
       | 
        | I understand that they get the proteins from PDB/ChimeraX, but
        | how much manual work is involved in mapping and placing the
        | individual proteins? The paper says it gets the protein
        | locations from cryo-ET tomograms, but I'd be surprised if these
        | let you automatically identify which proteins are where and
        | exactly how to place them, and even less so how the oligomers
        | bind together to form larger structures - for example, in the
        | video the membrane surfaces are very smooth and look almost
        | textbook-picture perfect, which suggests they come from a
        | hardcoded model or are smoothed in some way. One part of the
        | paper mentions sub-tomogram averaging from the tomogram, but
        | another mentions:
       | 
       | > From a tomogram (Figure3a), we select the particles and
       | determine the orientation of the crown of the spike, as well as
       | the stalk that connects the spike to the membrane
       | 
        | Is this a manual process, where the researcher is using his
        | mental model of how the proteins fit and placing/rotating
        | individual proteins? Or do the tools they developed let this be
        | automated? Both are impressive! If the former, I'm blown away
        | by the effort it must take to make these kinds of videos.
        
         | COGlory wrote:
          | It's a guided but automated process. EMAN2 (the software
          | used, and partially written by the author), for instance, has
          | a convolutional neural network particle picker. So you can
          | pick a few particles by hand, pick some noise, pick some bad
          | particles, train a neural network, and then run inference
          | with that network to pick the remaining particles throughout
          | the tomogram.
         | 
          | There are a variety of other methods too. There is simple
          | 6-dimensional real-space cross-correlation. You can place
          | points by hand, or according to a model. For instance, if you
          | are trying to identify viral spike proteins, and the virus is
          | spherical, and the spike proteins are always on the surface
          | of the sphere, you have a great starting point. So you can
          | say "place points at interval _n_ along the surface of this
          | sphere" and oversample the spherical surface. You then take a
          | reference volume (which can be generated a number of ways)
          | and check each point you placed to see how well it matches
          | the reference volume. You can allow for rotations and
          | translations of the reference volume at each point, and if
          | you find points that are too close together, you can merge
          | them automatically.
         | 
          | If you have a high-contrast, relatively static protein (such
          | as a ribosome), you can do 2D template matching in the
          | tomogram, where you use a central slice (or maybe a collapsed
          | projection) to do cross-correlation in 3 dimensions instead
          | of six (X translation, Y translation, Z rotation). Or you can
          | beef that up with more neural network/YOLO-type stuff.
         | 
          | EDIT: To expand on this, continuous density like membranes
          | can be roughly modeled just with techniques like
          | thresholding, watershedding, etc. (a toy sketch is at the end
          | of this comment). There are some neural nets, such as
          | MemBrain [0] and tomoseg [1] (the latter also by the author
          | of this paper), but membranes are certainly trickier. I
          | typically segment membranes by hand (and do so rarely).
         | 
         | [0]
         | https://www.sciencedirect.com/science/article/pii/S016926072...
         | 
         | [1] https://blake.bcm.edu/emanwiki/EMAN2/Programs/tomoseg
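          | 
          | The thresholding/watershed route mentioned above, as a toy
          | scikit-image sketch (random data standing in for a tomogram;
          | not what EMAN2 or the paper actually does):
          | 
          |   import numpy as np
          |   from scipy import ndimage as ndi
          |   from skimage.filters import threshold_otsu
          |   from skimage.segmentation import watershed
          | 
          |   # stand-in 3D volume; a real tomogram would be loaded
          |   # from an MRC file instead
          |   tomo = np.random.rand(64, 64, 64)
          | 
          |   # threshold the density, then split touching blobs with
          |   # a distance-transform watershed
          |   mask = tomo > threshold_otsu(tomo)
          |   dist = ndi.distance_transform_edt(mask)
          |   markers, _ = ndi.label(dist > 0.5 * dist.max())
          |   labels = watershed(-dist, markers, mask=mask)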
        
       | benjismith wrote:
       | This reminds me of the amazing molecular animations of Drew
       | Berry, which he showed in this TED talk:
       | 
       | https://youtu.be/WFCvkkDSfIU?si=JNe06VS8TjIrHpqh
       | 
       | Which was 12 years ago! After watching that video, I had a much
       | greater appreciation for how our bodies are made up of trillions
       | of tiny protein machines. Fascinating stuff!!
        
         | shagie wrote:
         | https://xvivo.com/examples/the-inner-life-of-the-cell/
         | 
         | I didn't realize there was also a Powering the Cell:
         | Mitochondria video.
         | 
         | The classic one narrated - https://youtu.be/QplXd76lAYQ
         | 
         | One of my pandemic YouTube binges was watching Ron Vale videos
         | about kinesin and dynein. https://youtu.be/9RUHJhskW00
         | https://youtu.be/lVwKiWSu8XE https://youtu.be/FRtqfpO8THU
         | 
         | And searching for Ron Vale will bring a number of other videos
         | about molecular machines.
        
           | benjismith wrote:
           | Nice! Thank you for the links!
        
           | dekhn wrote:
            | My friend in grad school was in Ron's group. He built a
            | microscope that visualized individual kinesin molecules and
            | measured their speed using fluorescent labelling. The whole
            | thing was held together with a bunch of scripts written in
            | LabVIEW. Ron had oodles of money and was able to support
            | long-term development of open-source software like
            | MicroManager, which gives a common interface to a wide
            | range of microscopy hardware.
           | 
            | The systems he studies are literally little motors that can
            | attach to biological surfaces and drive around in specific
            | directions, pick up payloads, and then drive to other
            | places. They work in a very different way from how humans
            | engineer tiny motors, and understanding/engineering their
            | behavior was a major focus in the early 2000s.
        
             | shagie wrote:
             | > My friend in grad school was in Ron's group. He built a
             | microscope that visualized individual kinesin molecules and
             | measured their speed using fluorescent labelling.
             | 
             | In vitro motility of yeast cytoplasmic dynein -
             | https://youtu.be/lVwKiWSu8XE?si=Su29neym0wg9DalR&t=627
        
               | dekhn wrote:
                | Yeah, exactly that, but with kinesin instead of dynein
                | (everybody started with myosin, but lost interest and
                | moved to kinesin and then dynein) and about 10 years
                | earlier.
               | 
                | Those little blobs moving along the filaments are
                | ~10-100 nanometers; you wouldn't normally be able to
                | see them, but they managed to tether fluorescent
                | (glowing) molecules to them, and those act like point
                | sources of light, which allows for precise localization
                | because the PSF of a point is approximately Gaussian
                | and finding the centroid of a Gaussian is trivial.
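                | 
                | The centroid bit really is about that simple. A toy
                | sketch with a synthetic spot (made-up numbers; real
                | frames come off the camera):
                | 
                |   import numpy as np
                | 
                |   # one Gaussian spot at a sub-pixel position, plus
                |   # a little noise
                |   yy, xx = np.mgrid[0:32, 0:32]
                |   true_y, true_x, sig = 17.3, 12.8, 2.0
                |   d2 = (yy - true_y)**2 + (xx - true_x)**2
                |   img = np.exp(-d2 / (2 * sig**2))
                |   img += 0.01 * np.random.randn(*img.shape)
                | 
                |   # crude background cut, then intensity-weighted
                |   # centroid ~= spot center, to sub-pixel accuracy
                |   w = np.where(img > 0.2, img, 0.0)
                |   cy = (w * yy).sum() / w.sum()
                |   cx = (w * xx).sum() / w.sum()
                |   print(cy, cx)   # close to (17.3, 12.8)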
        
         | adkaplan wrote:
          | Appreciate the share. I've seen a compilation of these clips
          | out of context and loved them, but never figured out where
          | they came from. They really are amazing in striking the
          | balance between organic and mechanistic. The kinesins in
          | particular are cute.
         | 
         | https://www.youtube.com/watch?v=wJyUtbn0O5Y&t=2s
         | 
         | This video shows it quite well too. EDIT: looks like someone
         | else shared the same.
        
       | vdm wrote:
       | Videos:
       | https://www.biorxiv.org/content/10.1101/2023.12.08.570879v1....
        
         | leptons wrote:
          | Nice videos, but I'm always reminded when watching this type
          | of molecular biology video that it's missing all the water
          | molecules. These proteins and things aren't floating around
          | in empty space.
        
       | amelius wrote:
       | What is the meaning of color in these visualizations? Does every
       | functional unit have its own color?
        
       | leoncvlt wrote:
        | If you get a kick out of 3D renderings of cells and molecules,
        | you're gonna have a field day with the work done at
        | https://random42.com/. Disclosure: I started working there as a
        | 3D artist but now lead the interactive department. You'd be
        | surprised at how much of a difference good art direction makes
        | in scientific visualization. Real-time graphics have advanced
        | considerably in the last couple of years, but it's always a
        | challenge to transport that nice, smooth pre-rendered look over
        | to mobile devices and the web at 60 frames per second (90 on
        | virtual reality headsets, to boot...)
        
       | zmmmmm wrote:
       | Would love a way to see these in 3D in VR / MR.
        
         | COGlory wrote:
          | ChimeraX has VR functionality. Certain modeling programs
          | still support Nvidia's Stereo3D (Coot, PyMOL, Chimera,
          | ChimeraX, and more), which I still use for modeling.
          | 
          | That relies on X11, unfortunately, so I'm looking for a new
          | way to do 3D viewing.
        
       | protoman3000 wrote:
        | I would like to ask a question, and preface it by saying that I
        | have no intent to judge, discredit or diminish the value of
        | this. It's merely that I really don't understand and would like
        | to gain insight.
       | 
       | The question is: How is this a scientific contribution?
       | 
       | Or, to ask it differently: What makes this a scientific
       | contribution?
        
         | bglazer wrote:
          | It's more of an engineering accomplishment. I could see this
          | being useful for exploratory data analysis of large protein
          | (mesoscale) complexes. A surprising amount of science starts
          | with a grad student staring at a really complex plot for a
          | really long time, then suddenly going "oh shit, that's
          | weird". That kind of realization is terribly difficult if
          | your visualization tools are fighting you the whole time.
        
       ___________________________________________________________________
       (page generated 2024-02-29 23:01 UTC)