Viewing the Electron Density from different Unit Cells

Case number: 845833-2006218
Topic: Game: Display
Opened by: jeff101
Status: Open
Type: Suggestion
Opened on: Thursday, November 15, 2018 - 15:06
Last modified: Monday, November 19, 2018 - 03:40

Before I get into my suggestions, I have some questions about Electron Density Puzzles.
When you Tab on a residue and the Segment Information box appears, what does it
mean if the Density subscore for that residue is negative? Is a negative Density
subscore undesirable? I have asked similar questions before at:
https://fold.it/portal/node/2006117#comment-37713
but would appreciate an official answer from someone at Foldit Central.

(Thu, 11/15/2018 - 15:06  |  8 comments)


jeff101
Joined: 04/20/2012
Groups: Go Science

In Electron Density Puzzles like 1598, one can turn on the
Electron Density Cloud to view it. Sometimes this cloud is
near the protein and sometimes it is not. When the visible
cloud is far from the protein, one has to remember that
there are many invisible copies of the cloud extending in
all directions. These invisible clouds let segments of the
protein have nonzero Density subscores even when the
protein does not appear to overlap with the visible cloud.

I think this lattice of clouds can be imagined to contain
many identically-shaped boxes (also called Unit Cells),
and each Unit Cell contains one cloud. This lattice of
clouds / Unit Cells is periodic in all directions. If one
moved through space in a fixed direction, one would
cross from one Unit Cell to another at regular intervals.
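
As a rough illustration of the periodicity described above, here is a minimal Python sketch. It assumes an axis-aligned, box-shaped unit cell (real crystallographic cells can be skewed), and the function names and cell lengths are made up for the example; nothing here is taken from Foldit's actual code.

```python
import math

def cell_index_and_offset(point, cell_lengths):
    """Split a point into (integer cell index, offset inside that cell) per axis."""
    indices, offsets = [], []
    for x, length in zip(point, cell_lengths):
        n = math.floor(x / length)      # which cell along this axis
        indices.append(n)
        offsets.append(x - n * length)  # distance from that cell's origin
    return tuple(indices), tuple(offsets)

def periodic_density(point, cell_lengths, density_in_cell):
    """Look up the density anywhere by wrapping the point into the reference cell.

    density_in_cell is a hypothetical function that is only defined
    inside the one cell whose cloud is stored.
    """
    _, offsets = cell_index_and_offset(point, cell_lengths)
    return density_in_cell(offsets)

# A point two cells to the right and one cell below the reference cell:
print(cell_index_and_offset((25.0, -3.0, 7.0), (10.0, 10.0, 10.0)))
# -> ((2, -1, 0), (5.0, 7.0, 7.0))
```

Every point that wraps to the same offset sees the same density, which is why a segment far from the visible cloud can still pick up a nonzero Density subscore.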

I think it would help if one could display all the clouds
that overlap with the protein. The cloud with the most
overlap could be colored like normal, but the other
clouds could be colored slightly differently, a bit like
copies of the main monomer in a Symmetry Puzzle.
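
One possible way to decide which cloud copies to draw, sketched in Python: count how many atoms fall in each unit cell, treat the most populated cell as the "main" one, and give the rest a different tint. This uses atom counts as a stand-in for true cloud overlap and again assumes a box-shaped cell; the function name and coordinates are hypothetical.

```python
import math
from collections import Counter

def cells_occupied(atoms, cell_lengths):
    """Count atoms per unit cell; keys are integer cell indices (i, j, k)."""
    counts = Counter()
    for atom in atoms:
        idx = tuple(math.floor(x / length) for x, length in zip(atom, cell_lengths))
        counts[idx] += 1
    return counts

# Made-up coordinates: three atoms in one cell, one atom in its right-hand neighbour.
atoms = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (9.0, 1.0, 1.0), (12.0, 2.0, 2.0)]
counts = cells_occupied(atoms, (10.0, 10.0, 10.0))
main_cell = counts.most_common(1)[0][0]        # colored like normal
other_cells = [c for c in counts if c != main_cell]  # colored slightly differently
```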

I think it would also help if one could use certain keyboard
buttons (perhaps the arrows and A Z or PageUp PageDown)
to translate the protein from one Unit Cell to another. The
arrows could translate the protein left right up or down on the
screen, and A Z or PageUp PageDown could translate the
protein into or out of the screen. Since each Unit Cell is
identical, the protein's score should not change for these
translations.
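
To make the "score should not change" claim concrete, here is a toy Python sketch. The density function is invented (a sum of cosines that repeats once per cell), the score is just the summed density at each atom, and the key names map to cell axes rather than true screen directions; all of that is assumption, not how Foldit computes anything.

```python
import math

CELL = (40.0, 50.0, 60.0)  # hypothetical cell edge lengths

# Hypothetical key bindings: each key moves the protein by one whole cell.
KEY_STEPS = {
    "Left": (-1, 0, 0), "Right": (1, 0, 0),
    "Down": (0, -1, 0), "Up": (0, 1, 0),
    "PageDown": (0, 0, -1), "PageUp": (0, 0, 1),
}

def toy_density(point):
    """Invented density that repeats exactly once per cell along each axis."""
    return sum(math.cos(2 * math.pi * x / length) for x, length in zip(point, CELL))

def toy_score(atoms):
    """Toy stand-in for a density score: summed density at every atom."""
    return sum(toy_density(atom) for atom in atoms)

def shift_by_cells(atoms, steps):
    """Translate every atom by a whole number of cells along each axis."""
    return [tuple(x + n * length for x, n, length in zip(atom, steps, CELL))
            for atom in atoms]

atoms = [(1.0, 2.0, 3.0), (12.5, 30.0, 45.0)]
moved = shift_by_cells(atoms, KEY_STEPS["Right"])
# The two scores agree up to floating-point rounding.
same = math.isclose(toy_score(atoms), toy_score(moved), abs_tol=1e-9)
```

Any score built only from a periodic density behaves this way; terms that depend on the protein's internal geometry are also unaffected, since a rigid translation leaves them alone.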

With these tools, if the protein overlaps with more than one
Unit Cell's Electron Density Cloud, one could more easily
identify which segments of the protein overlap with which
parts of the cloud, and this might give insight into how
an extended protein should fold in order to make all of its
contacts with just one Unit Cell's ED Cloud. For example, if
one saw a very large Density subscore for Trp53 when Trp53
was nowhere near the usual visible ED Cloud, one could easily
translate the protein so that Trp53 moved into the Unit Cell
containing the visible cloud. Then one could see which part
of the cloud had such a large density and therefore such a
good overlap with Trp53. Displaying all the clouds in contact
with the protein would also help identify which part of the
cloud overlapped so well with Trp53.

S0ckrates
Joined: 05/19/2017
Groups: None

I understand where you're coming from, and how an extended chain that's longer than one unit cell might stretch some residues into a positive density bonus, but I don't think this unit cell visibility would actually be useful. For one, it's pretty serendipitous to get that positive density subscore in the first place. For another, from what I can tell from experience, it doesn't matter whether the residue in question is in the *correct* part of the cloud (since the whole point is that we don't necessarily know how it fits into the cloud); the Density subscore seems to depend only on whether a given residue is overlapped by *any* part of the cloud. [Dev verification needed.]

In other words, I expect that if we were to shift that hypothetical lucky Trp53 into the main visible unit cell, we would find that the cloud there isn't a good shape for it. It just happens to be buried in a portion of the cloud that's large enough to house it, but that isn't the correct shape or place for it. I think it's a much more effective strategy to match landmark residues within the visible cloud and align the sequence accordingly. Qualitative over quantitative.

I'll try and do some testing tonight if no devs get back to us on this to determine whether or not the Density subscore relies on any given position in the cloud or if there's an implicit bonus for a "correct" cloud position that bolsters the bonus.

bertro
Joined: 05/02/2011
Groups: Beta Folders

Please keep in mind that even if a particular sidechain has a high density score, it does not mean that sidechain belongs there. And the "surrounding" before-and-after sidechains may be the wrong ones for that area of the density cloud. Also you have to consider that the chain may be sideways to the cloud even though that particular sidechain density is high. In other words, high Density Score is not equal to good position in cloud.

Only by visualizing the density cloud AND the sidechains together can you decide if they belong together.

I know it is difficult for many people to see the two together. For my part, I usually manage by reducing the Threshold to the point where I can see most of the shapes of the sidechains, which is also the point where the dust particles (small bits of cloud around the main cloud) mostly disappear. That is with the Threshold slider over the word "density" in the line below the slider (in the 1598 case, for me the slider is over "ensit"). Viewing the chain in "All Loops" also helps me. Playing with "Enable Backface Culling" may also help in Wireframe mode.

Then I look for at least one Trp (or another big sidechain if no Trps are present) in the "Solid" view of the Electron Density panel. Like you said, Trp53 is a good candidate for 1598 and is easily located, I think. "Solid" view is OK for locating Trps because they are usually on the surface of the protein. I then mark the cloud at that place with a dot (tab on the cloud).

Now I go to "Wireframe" view of the Electron Density panel.

I also locate all the Trps in the chain (there are 4 Trps in 1598). The easiest way for me is to cut on each side of each Trp, plus/minus 2 segments (not too many, as it gets cumbersome to handle them; not so few that you cannot decide if they belong there), and I try these sections by moving them into the previously found cloud position(s). Then I use shift-Q to focus on the Trp and I try to adjust it to the Trp cloud I found. Still using the move tool, I try to adjust/rotate the segments before and after the Trp to the cloud I see before and after the Trp's position. It means I bring each section with the move tool inside the cloud with the Trp in position. By looking at the cloud I can tell if they kind of belong there. If they don't, I try another section until I find the right one. That is your starting point, from which you can thread the rest of the chain into the cloud from both ends.
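
For anyone who wants to plan those cuts ahead of time, here is a small Python sketch of the "Trp plus/minus 2 segments" idea. The sequence below is invented for illustration (it is not the 1598 sequence), and the function name is hypothetical.

```python
def landmark_windows(sequence, residues="W", flank=2):
    """Return 1-based (first, last) segment ranges around each landmark residue."""
    windows = []
    for i, aa in enumerate(sequence, start=1):
        if aa in residues:
            first = max(1, i - flank)             # do not run off the N-terminus
            last = min(len(sequence), i + flank)  # or the C-terminus
            windows.append((first, last))
    return windows

# Invented 13-residue sequence with Trp (W) at segments 3 and 10.
print(landmark_windows("GSWKAELVDWTPQ"))
# -> [(1, 5), (8, 12)]
```

Each range is a section to cut out and try in the cloud; passing a longer `residues` string (for example "WFY") would include other bulky sidechains as well.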

Remember that an extended chain can be misleading. The spacing of the segments in the extended chain is usually too big (in the folded protein, segments are usually closer together), and the sidechains may be on the wrong side relative to the cloud (the extended chain has them all alternating; in the cloud they may not). I can usually fit at most 2-3 sidechains in the cloud before I need to cut them up to turn or fit while following the cloud. Most of the time I need to cut between each sidechain to make them fit (especially where the loops are). It is a relatively long process (a few hours), so I do it over a few sessions. All sidechains have a shape that sticks out of the body of the cloud except Glycine, which only looks like the tube/body of the cloud.

All this to say, the one cloud is good enough for me. Just bring a bit of chain into it and the cloud will start to mean something out of the fog. Without a bit of chain in the cloud, it is more difficult to see the outline of the sidechains in 3D. If you cannot see anything at first, bring/move any small part of the chain (I usually use one of the Trp sections) with you as you explore the cloud, even if it is not yet in the right position. It may help you see better what the cloud is telling you.

Sorry for the long post, hope it helps a bit.

bertro

S0ckrates
Joined: 05/19/2017
Groups: None

Seconding this. I use a similar technique that I commented on in my recent post-commentary YouTube video. Instead of just tryptophans, I also incorporate phenylalanines, prolines, tyrosines, lysines, and arginines as "landmark" residues in my alignment process. I prefer to take the time to work my way in order along the backbone placing residues piece by piece and using the landmark residues to check my progress/alignment. I don't start completely from scratch and pick a random blob in the cloud though; Rainbow coloring + Trace Tube viewing + PSIPRED predictions (or starting structure hints, a la 1588) help me determine a rough draft before starting landmark alignment.

jeff101
Joined: 04/20/2012
Groups: Go Science

Below is an example to illustrate how shifting the
protein from unit cell to unit cell might work:

Say your screen shows a 100-residue protein on the left
and its electron density cloud on the right. One could
imagine 4 unit cells A-D in a row from left to right
with the protein straddling cells A & B and the visible
ED cloud in cell C. Cells A B & D each contain their own
invisible copy of the ED cloud. Say the protein's segments
1-60 are in cell A while its segments 61-100 are in cell B.
Say segments 20-30 have high density subscores because they
overlap well with cell A's invisible ED cloud. Say segments
70-90 have high density subscores because they overlap well
with cell B's invisible ED cloud. What should you do?

Ideally you want all 100 segments of the protein to end
in one unit cell, preferably cell C with its visible ED
cloud. You would also like to have as many segments as
possible overlapping well with cell C's visible ED cloud
and giving high density subscores. It would be great if
you could keep the high density subscores from both segment
groups 20-30 & 70-90, but you might have to sacrifice one
group to keep the other. What should you do?

If you had a set of buttons (like A Z PageUp PageDown & the
arrows) that would shift the protein from one unit cell to
another (without changing the protein's overall score), you
could easily shift the protein to the right so that it would
straddle cells B & C or cells C & D. Also, if you wanted to
go back & forth between these positions, you could easily do
so with the appropriate buttons.

When straddling cells B & C, segments 1-60 would be in cell B
while segments 61-100 would be in cell C with its visible ED
cloud. In this case, it would be easy to see which of segments
61-100 overlapped best with the ED cloud & what particular
features led to the high density subscores for segments 70-90.

When straddling cells C & D, segments 1-60 would be in cell C
with its visible ED cloud while segments 61-100 would be in
cell D. In this case, it would be easy to see which of segments
1-60 overlapped best with the ED cloud & what particular
features led to the high density subscores for segments 20-30.

Armed with this information, you could decide which parts of
the protein to leave unchanged in cell C and which parts to
cut & move to let this happen.
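
The A-D example above can be worked through in a few lines of Python. This is a one-dimensional sketch with an arbitrary cell width of 100 units and made-up segment positions, chosen only so that segments 1-60 start in cell A and segments 61-100 start in cell B.

```python
CELL_WIDTH = 100     # arbitrary units
CELL_NAMES = "ABCD"  # four cells in a row, left to right

def cell_of(x):
    """Name of the cell containing integer position x."""
    return CELL_NAMES[x // CELL_WIDTH]

def segments_by_cell(positions, shift_cells=0):
    """Group segments by cell after shifting the protein right by whole cells.

    Returns {cell name: (first segment, last segment)}.
    """
    groups = {}
    for seg, x in positions.items():
        groups.setdefault(cell_of(x + shift_cells * CELL_WIDTH), []).append(seg)
    return {name: (segs[0], segs[-1]) for name, segs in groups.items()}

# Segment 1 sits at x = 40 and segment 100 at x = 139, so the chain straddles A and B.
positions = {seg: 39 + seg for seg in range(1, 101)}

print(segments_by_cell(positions))     # -> {'A': (1, 60), 'B': (61, 100)}
print(segments_by_cell(positions, 1))  # -> {'B': (1, 60), 'C': (61, 100)}
print(segments_by_cell(positions, 2))  # -> {'C': (1, 60), 'D': (61, 100)}
```

With a shift of 1, segments 61-100 land in cell C with the visible cloud; with a shift of 2, segments 1-60 do.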

One could imagine more complicated cases if the protein
straddled more than 2 unit cells. For example, if the protein
was spherical and centered on the corner of a cubic unit cell,
8 adjacent unit cells would each contain 1/8 of the protein's
volume. Being able to shift the protein from one unit cell to
another would help sort out which parts of the protein should
be left alone and which should be cut & moved to get the best
possible overlap with the visible ED cloud.
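
The 1/8 figure can be checked numerically. The Python sketch below samples a sphere centered on a cell corner (placed at the origin) on a symmetric grid and reports what fraction of the samples falls in each cell. It assumes a cubic cell and a sphere radius smaller than the cell edge; the function name and numbers are made up.

```python
import math
from collections import Counter

def sphere_volume_share(radius, cell_length, n=10):
    """Fraction of a corner-centered sphere's sample points that fall in each cell."""
    step = radius / n
    # Symmetric grid of sample coordinates that never lie exactly on a cell face.
    coords = [(i + 0.5) * step for i in range(-n, n)]
    counts = Counter()
    for x in coords:
        for y in coords:
            for z in coords:
                if x * x + y * y + z * z <= radius * radius:
                    cell = tuple(math.floor(c / cell_length) for c in (x, y, z))
                    counts[cell] += 1
    total = sum(counts.values())
    return {cell: count / total for cell, count in counts.items()}

shares = sphere_volume_share(radius=10.0, cell_length=50.0)
print(len(shares))           # -> 8
print(set(shares.values()))  # -> {0.125}
```

Moving the center off the corner makes the shares unequal, which is the more typical situation for a real protein.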

S0ckrates
Joined: 05/19/2017
Groups: None

It is as I suspected. From what I've tested, the subscore is merely determined by cloud overlap, with no bias towards whether or not the given residue is in the proper place in the backbone sequence. I tried placing PHE10 in 1598 in a piece of the cloud that looked to be a suitable recipient for TRP (which, volumetrically, is bigger than PHE) followed by positioning it in a spot where the cloud is suitable for PHE (which would be a perfect fit in terms of volume). I was able to have the cloud completely envelop the PHE in both situations, and the density bonus remained roughly in the 42-point range.

Further testing by translating the residue around and rotating it revealed that the density subscore goes negative if the residue is close to a part of the cloud but none of it is enveloped in the cloud. This holds only up to a certain distance; farther away than that, the density bonus is ignored completely.

Jeff, I understand where you're coming from in terms of what you're trying to do with the unit-cell shifting, I really do, but if we were able to do so, you'd find more often than not 1 of 2 situations:
1. The density subscore for any given residue is scoring too low to actually warrant investigating using the method.
2. The residue in question, when shifted into the visible cloud unit cell, would not fit the cloud AT ALL upon visual inspection and is merely getting a density bonus by happenstance (e.g. overlapping is disjointed, ambiguity in sidechain, etc.)

If you had the ability to shift unit-cells, very rarely would a residue be in the invisible cloud for the bonus, in the right orientation and completely enveloped upon visual inspection, and in the right place in the cloud's backbone to reveal useful information.

Take-home message is that you're better off just visually identifying where residues need to go in the visible cloud > score refining for stability once residues have been placed > hand-fixing script errors bringing sidechains/backbone outside of the cloud to maximize density bonus. There is basically no need to worry about the density bonus in these unit-cell clouds.

If you're still not convinced, let me consider your most complicated example, a hypothetical folded globular protein straddling 8 unit cells at their shared cubic corner.

"One could imagine more complicated cases if the protein
straddled more than 2 unit cells. For example, if the protein
was spherical and centered on the corner of a cubic unit cell,
8 adjacent unit cells would each contain 1/8 of the protein's
volume. Being able to shift the protein from one unit cell to
another would help sort out which parts of the protein should
be left alone and which should be cut & moved to get the best
possible overlap with the visible ED cloud."

In this example, if I'm understanding correctly (and at this point I'm fairly confident I am), you'd find that the residues getting a density bonus in those unit-cell corners belong to the opposite corner of the cloud in each unit-cell, and the position that it's in provides practically no clues as to the correct orientation. If you shifted the unit-cell position as you have described, you'd be left with the same useless information in all 8 corners; overlap granting a density bonus, but visual inspection revealing that it's just touching the cloud but not really correctly placed in the cloud.

jeff101
Joined: 04/20/2012
Groups: Go Science

Letting certain buttons shift the protein from one unit cell
to another is kind of like the "Center Protein on Density"
button, but the "Center Protein on Density" button often
changes the overall score of the protein (generally making
the score worse), while shifting the protein to a different
unit cell should not change the overall score.

frood66
Joined: 09/20/2011
Groups: Marvin's bunch

It has always been the case that the folding area/volume is tiled as far as the cloud is concerned - so far as I know.

Looking at it from a different perspective - perhaps these tiles slow the game.

I see no point in having them - we cannot see them since they are not rendered, yet they still affect solutions placed outside
the rendered cloud.

Perhaps removing them would help?

It would certainly help those that like to work on a solution outside of the rendered cloud.


Developed by: UW Center for Game Science, UW Institute for Protein Design, Northeastern University, Vanderbilt University Meiler Lab, UC Davis
Supported by: DARPA, NSF, NIH, HHMI, Amazon, Microsoft, Adobe, RosettaCommons