"Align Protein to Density" button is confusingly named

Started by alwen

alwen Lv 1

The "Align Protein to Density" button in the Electron Density menu actually flips the protein over inside the density.

The text on the button is confusing.

Especially since the Alignment Tool aligns the protein to the model, and the word "align" does not seem to be used in that sense at all on this button.

Susume Lv 1

I had my protein arranged outside of the ED, hit the Align button, and it flipped my protein so it was 180 degrees off from where I had it. It also translated it into the ED, which I expected, but I had to rotate every piece 180 degrees and move it to the other side of the cloud to correct it.

Rav3n_pl Lv 1

Yes, I found that the button ALWAYS causes a flipover. Also, the score ALWAYS drops like hell.
If this is meant to help us find a better placement, it is a miss.

jflat06 Staff Lv 1

I agree that it is confusing. The tool is designed to actually align, but this is computationally very hard to do. I am considering removing it or renaming it to something like "center protein on density".

Susume Lv 1

Given that it seems to flip the protein in space, I'm wondering if there is an extra or missing minus sign in the spatial calculations somewhere. If so, and that were fixed, the tool might be fine.

brow42 Lv 1

It did work on the first two or so ED puzzles, when I was already close.

PCA/SVD gives an ambiguous result: it gives the optimal axes, which are preserved under reflection. I bet Susume's right. It's not a missing sign; it's a chance (50-50? Maybe, maybe not. I'd have to do some Monte Carlo :-) of getting a negative determinant. That is easily fixed in 3-D. Maybe they just forgot that last step, and it sometimes works and sometimes doesn't for that reason?
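To make the determinant point concrete, here is a minimal sketch of the "last step", assuming the PCA/SVD axes arrive as a 3x3 orthonormal matrix with one axis per column. The names `det3` and `make_proper_rotation` are made up for illustration; nothing here is taken from Foldit's code.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def make_proper_rotation(axes):
    """Return a copy of `axes` whose determinant is positive.

    `axes` is assumed to be orthonormal with one principal axis per column.
    A negative determinant means the matrix contains a reflection, so the
    third column is negated to turn it back into a pure rotation.
    """
    fixed = [row[:] for row in axes]
    if det3(fixed) < 0:
        for row in fixed:
            row[2] = -row[2]
    return fixed


# Hypothetical PCA output whose third axis came back mirrored.
mirrored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
fixed = make_proper_rotation(mirrored)
print(det3(mirrored), det3(fixed))
```

Note that this only removes the mirror-image case. Each principal axis can still point either way, so four proper rotations remain that differ from one another by 180-degree turns, and something else (for example the score) would have to pick among them.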

jflat06 Staff Lv 1

I can check the math again, but I think it has more to do with the nature of the things we are trying to align. Using PCA to align two point clouds has issues when the clouds are roughly rotationally symmetric around one of the basis axes. There aren't enough defining features for it to correctly determine its rotation about this axis.
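The symmetry problem can be seen in a small sketch, assuming a toy point cloud made of rings stacked along z (a crude cylinder, not real density data). PCA only looks at the covariance matrix, and spinning this cloud about its long axis leaves that matrix unchanged, so PCA has nothing to tell the two orientations apart. The names `covariance`, `rotate_z` and `cloud` are illustrative.

```python
import math


def covariance(points):
    """3x3 covariance matrix of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    return [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
         for j in range(3)]
        for i in range(3)
    ]


def rotate_z(points, angle):
    """Rotate every point about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


# Toy cloud: rings of 8 points stacked along z, roughly a cylinder.
cloud = [
    (math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8), float(z))
    for z in range(-5, 6)
    for k in range(8)
]

before = covariance(cloud)
after = covariance(rotate_z(cloud, 0.7))

# Largest difference between the two covariance matrices.
max_change = max(
    abs(before[i][j] - after[i][j]) for i in range(3) for j in range(3)
)
print(max_change)  # effectively zero, up to floating-point rounding
```

The two short-axis variances (`before[0][0]` and `before[1][1]`) come out equal, which is the situation where the rotation about the long axis is undetermined.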

Bletchley Park Lv 1

Could it be that we align the bare loop in reverse to the ED, and that the button is telling us we should have started at the other end in our alignments? That is a situation I am wondering about now. I have it aligned pretty well, yet my score is half that of the top scorers, and my model gets flipped when I press that button.

Is there a way to help us start at the right end of that loop and save days of work?

jeff101 Lv 1

I wonder if the "Align Protein to Density" or "Center Protein on Density"
button would work better if it operated as follows:

(1) Find the average of the protein's alpha-carbon xyz coordinates,
call this the center of the protein, and subtract the center's xyz coordinates
from all of the protein's xyz coordinates so that the center lies at the origin.

(2) Determine a set of xyz coordinates that tell how far the center of the protein
is translated from a different origin fixed within the electron density cloud.

(3) Determine a set of Euler angles that tell how much the protein is rotated
about its center in an xyz coordinate system fixed with respect to the electron
density cloud.

(4) Use the above angles and coordinates to position the protein in the electron
density cloud and note the score.

(5) Use an optimization algorithm like the Nelder-Mead Simplex Direct Search Method
(implemented in Matlab as the fminsearch command) to vary the 3 xyz coordinates in (2)
and the 3 Euler angles in (3) (6 variables in all) to find the best score it can.
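Steps (1) to (4) can be sketched as follows. This is only an illustration under stated assumptions: the Euler convention (Rz * Ry * Rx) is an arbitrary choice, the atom coordinates are invented, and `misfit` is a stand-in for the real density score, which is not available here. Lower `misfit` plays the role of a higher score.

```python
import math


def center(points):
    """Step (1): shift the points so their average lies at the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [(x - c[0], y - c[1], z - c[2]) for x, y, z in points]


def euler_matrix(a, b, c):
    """Rotation matrix Rz(a) * Ry(b) * Rx(c); the convention is an assumption."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cc, sc = math.cos(c), math.sin(c)
    return [
        [ca * cb, ca * sb * sc - sa * cc, ca * sb * cc + sa * sc],
        [sa * cb, sa * sb * sc + ca * cc, sa * sb * cc - ca * sc],
        [-sb, cb * sc, cb * cc],
    ]


def place(points, pose):
    """Steps (2)-(4): rotate the centered points, then translate them.

    `pose` holds 6 numbers: tx, ty, tz, then the three Euler angles.
    """
    tx, ty, tz, a, b, c = pose
    r = euler_matrix(a, b, c)
    return [
        (
            r[0][0] * x + r[0][1] * y + r[0][2] * z + tx,
            r[1][0] * x + r[1][1] * y + r[1][2] * z + ty,
            r[2][0] * x + r[2][1] * y + r[2][2] * z + tz,
        )
        for x, y, z in center(points)
    ]


def misfit(points, pose, target):
    """Stand-in for the score: summed squared distance to `target` points."""
    return sum(
        (px - qx) ** 2 + (py - qy) ** 2 + (pz - qz) ** 2
        for (px, py, pz), (qx, qy, qz) in zip(place(points, pose), target)
    )


# Invented alpha-carbon coordinates and an invented "correct" pose.
ca_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.5, 1.0, 0.0), (3.0, 2.0, 1.0)]
true_pose = (4.0, -1.0, 2.0, 0.3, -0.2, 0.1)
target = place(ca_atoms, true_pose)

print(misfit(ca_atoms, true_pose, target))        # zero by construction
print(misfit(ca_atoms, (0, 0, 0, 0, 0, 0), target))  # larger for a wrong pose
```

An optimizer for step (5) would then only need to vary the 6 numbers in `pose`.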

While fminsearch is not guaranteed to give the global maximum score, I have seen
fminsearch give good solutions when optimizing 6 variables, so I think it would work here,
at least as well as the algorithm Foldit has been using so far. Exploiting the periodicity
of both the Euler angles and the electron density cloud, as well as varying the size of
the initial simplex, can let multiple uses of fminsearch give increasingly better scores.
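A rough illustration of the restart idea: the sketch below is a hand-written, simplified Nelder-Mead (not MATLAB's fminsearch and not Foldit code), run several times with a shrinking initial simplex on a made-up 6-variable objective. Since this kind of search minimizes, a score to be maximized would be passed in negated. `nelder_mead`, `negative_score` and `TARGET` are hypothetical names.

```python
def nelder_mead(f, x0, step, iters=400):
    """Minimize f from x0 with a basic, simplified Nelder-Mead simplex."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex offset by `step` along each axis.
    simplex = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    for _ in range(iters):
        simplex.sort(key=f)
        best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point on the line through the centroid and the worst vertex.
            return [centroid[j] + t * (worst[j] - centroid[j]) for j in range(n)]

        reflected = along(-1.0)
        if f(reflected) < f(best):
            # Reflection is the new best: try going twice as far.
            expanded = along(-2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(second_worst):
            simplex[-1] = reflected
        else:
            contracted = along(0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [
                    [(best[j] + p[j]) / 2 for j in range(n)] for p in simplex[1:]
                ]
    return min(simplex, key=f)


# Made-up optimum for 3 translations and 3 angles; a real run would call
# the game's score here instead.
TARGET = [1.0, -2.0, 3.0, 0.5, -0.5, 2.0]


def negative_score(pose):
    """Stand-in objective: squared distance from the made-up optimum."""
    return sum((p - t) ** 2 for p, t in zip(pose, TARGET))


pose = [0.0] * 6
for step in (2.0, 1.0, 0.5, 0.25):
    # Restart from the previous result with a smaller initial simplex.
    pose = nelder_mead(negative_score, pose, step)

final = negative_score(pose)
print(final)
```

On a real score with many local optima, the restarts would more likely start from several different poses as well as different simplex sizes.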

For more information, please see
http://www.mathworks.com/help/matlab/ref/fminsearch.html?requestedDomain=www.mathworks.com#moreabout

jeff101 Lv 1

It might also be good to break the optimization into 2 layers,
like nesting fminsearch for the xyz coordinates inside fminsearch
for the Euler angles. The inner layer varies the xyz translation
coordinates and is done more often. The outer layer varies the
Euler angles and is done less often. Varying the translations more
often makes sense because the translations are just addition
operations (a faster calculation) while the rotations involve
multiplying 3x3 rotation matrices by many sets of xyz coordinates
(a slower calculation).
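The two-layer idea might look like the sketch below. To keep it short it uses a simple coordinate ("compass") search in place of fminsearch, an assumed Rz * Ry * Rx Euler convention, invented atom positions, and a squared-distance `misfit` standing in for the real score. All function names are illustrative.

```python
import math


def transform(points, shift, angles):
    """Rotate points by Euler angles (Rz * Ry * Rx, an assumption), then shift."""
    a, b, c = angles
    ca, sa, cb, sb = math.cos(a), math.sin(a), math.cos(b), math.sin(b)
    cc, sc = math.cos(c), math.sin(c)
    r = [
        [ca * cb, ca * sb * sc - sa * cc, ca * sb * cc + sa * sc],
        [sa * cb, sa * sb * sc + ca * cc, sa * sb * cc - ca * sc],
        [-sb, cb * sc, cb * cc],
    ]
    return [
        tuple(r[k][0] * x + r[k][1] * y + r[k][2] * z + shift[k] for k in range(3))
        for x, y, z in points
    ]


def misfit(points, shift, angles, target):
    """Stand-in score: summed squared distance to the target points."""
    moved = transform(points, shift, angles)
    return sum(
        sum((m - t) ** 2 for m, t in zip(p, q)) for p, q in zip(moved, target)
    )


def compass_search(f, x0, step, rounds):
    """Try +/- step along each coordinate; halve the step when nothing helps."""
    x, fx = list(x0), f(list(x0))
    for _ in range(rounds):
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
                    break
        if not improved:
            step *= 0.5
    return x, fx


# Invented atoms and an invented "correct" placement to recover.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.5, 1.0, 0.0), (3.0, 2.0, 1.0)]
target = transform(atoms, (4.0, -1.0, 2.0), (0.3, -0.2, 0.1))


def best_shift(angles):
    """Inner layer: cheap translation search for one fixed rotation."""
    return compass_search(
        lambda s: misfit(atoms, s, angles, target), [0.0, 0.0, 0.0], 1.0, 25
    )


# Outer layer: each trial rotation is judged by its best inner-layer result.
start = best_shift([0.0, 0.0, 0.0])[1]
angles, best = compass_search(lambda a: best_shift(a)[1], [0.0, 0.0, 0.0], 0.2, 40)
print(start, best)
```

Only the final `angles` and the matching shift would be shown to the player.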

I would not recommend graphing the intermediate results. Just show
the protein's position for the best score at the end of the
optimization.