Are Foldit wiggles still out of sync with Rosetta?
Case number: | 699969-996873 |
Topic: | General |
Opened by: | alcor29 |
Status: | Open |
Type: | Bug |
Opened on: | Saturday, February 8, 2014 - 04:53 |
Last modified: | Monday, February 10, 2014 - 22:59 |
In CASP Puzzle 843, rbtta2 (the first alignment that comes up) is presented as already idealized but low scoring: using the "idealize" tool on every segment produces no change. When I then immediately use a high-power "wiggle sidechains" (E on old IF) or "wiggle" (W), it goes out of idealization. Does this mean that these tools, as currently modified, are still incapable of matching the Rosetta core?
Yes, that's what I was trying to grasp. And yes it is confusing. So why not have the generating protocol attempt bond and angle optimization?
I'm not fully certain (I'm not in the structure prediction group, and don't know exactly how that CASP ROLL prediction was generated), but my guess would be that the Robetta structure prediction server isn't sampling close enough to the native state for it to be worth it. Just as a Foldit player wouldn't bother with high-powered wiggling in the early or middle stages of hand-folding something, when the structure is still far from optimal, so also there's no reason for the automated algorithms to be doing computationally expensive fine-scaled optimization when they're still making larger-scale, coarser mistakes.
Interesting to learn that "ideal" means only the dihedrals, and not "fully optimized".
Players have noted in the past that when we are given structures that are too optimized, they are very stiff and hard to improve upon. This used to happen regularly with Robetta predictions. So it's probably a good thing that the starting models are low scoring and not too stiff.
I just want to clarify that the term "ideality," when used in Foldit, does include bond lengths and angles.
Also, as Susume has brought up, fully-optimized Robetta predictions are often so entrenched with the score function that alternative Foldit models would require tons of optimization just to compete. Of course, all this extra optimization usually doesn't make a huge difference in prediction quality, and we like to encourage Foldit players to explore lots of different structures rather than optimizing just one.
For this reason, we usually "idealize" server models before providing them as starting templates in puzzles (this is also the case in Puzzle 843). This effectively undoes Robetta's optimization.
Yes. So now I have a clearer picture, which tells me to use "idealize" a lot because it is working much better and is now a more useful tool, but not to worry too much about idealize as a goal and just, as before, use the score as the final arbiter for folders. I'll let the Rosetta adapters worry about how far we are deviating with the current software setup and leave it to them to improve it further.
Yes, exactly; that's a good strategy. While it is true that VERY stretched bonds are bad, slight deviations from ideality are perfectly fine. And yes, a lot of what we do to optimize the score function involves figuring out how heavily to weight the penalty for nonideal bond lengths and bond angles, to ensure that small deviations from ideality like those seen in proteins are allowed, but giant deviations are penalized.
Or, better put: "idealize" (tool) yes. "Ideality" (subscore) not so much.
I've noticed that the idealize score part can be very low: I idealize, wiggle out, and the idealize score drops back. Is this OK? My intuition, seeing the brown segments, says not.
It is usually okay, provided the rest of the scoring is good. The idealize part is intended to ONLY give penalties. If all of the geometry were perfectly ideal (every peptide bond were EXACTLY 1.33 Angstroms, etc.), then the idealize part should give a penalty of zero. When you wiggle on high power, it is expected that you'll move away from ideal bond lengths and bond angles a tiny bit, resulting in a small penalty, but many of the other score terms should improve by more than the penalty. In a real protein structure, bond lengths will fall within a range, meaning that scoring a real structure with the idealize term would produce a small penalty because they aren't all right on the ideal bond length value. Other, favourable interactions make up for this penalty.
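To make the "penalty only" idea concrete, here is a minimal sketch in Python. It is not the actual Foldit or Rosetta score term: the harmonic (squared-deviation) form, the function name, and the stiffness value are all assumptions chosen for illustration. Only the 1.33 Angstrom ideal length comes from the explanation above.

```python
# Illustrative sketch only; NOT the real Foldit/Rosetta ideality term.
IDEAL_PEPTIDE_BOND = 1.33  # Angstroms, the ideal value quoted above
STIFFNESS = 100.0          # hypothetical weight, arbitrary score units


def ideality_penalty(bond_lengths, ideal=IDEAL_PEPTIDE_BOND, k=STIFFNESS):
    """Sum an assumed harmonic penalty over all bonds.

    The result is never negative, and it is zero only when every
    bond sits exactly at the ideal length.
    """
    return sum(k * (length - ideal) ** 2 for length in bond_lengths)


perfect = [1.33, 1.33, 1.33]    # fully idealized
wiggled = [1.32, 1.34, 1.33]    # tiny deviations after a high-power wiggle
stretched = [1.33, 1.60, 1.33]  # one badly stretched bond

for name, bonds in [("perfect", perfect), ("wiggled", wiggled), ("stretched", stretched)]:
    print(f"{name}: penalty = {ideality_penalty(bonds):.3f}")
```

With these made-up numbers, the idealized set gets no penalty, the lightly wiggled set gets a very small one, and the stretched bond dominates. That is the behaviour described above: small deviations are tolerated and large ones are punished.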
Ah, thanks. In this case, could we please have a way to customize what gets counted in the score coloring, and maybe even which score terms wiggle (NOT scoring) pays attention to, and how much? I know it would probably be quite difficult, but it would really help in focusing my design efforts. Also, I did notice that helices aren't very compact nowadays; personally, I find them a bit too decompressed.
Rosetta has the same abilities as Foldit to optimize bond lengths and bond angles (which is what the high-powered wiggle does). Indeed, these capabilities were added to Rosetta before they were added to Foldit. However, specific Rosetta protocols might not use this functionality. If a predicted structure has ideal geometry, it probably means that the protocol used to generate it did not attempt bond length and bond angle optimization (instead only optimizing dihedral angles). Does that make sense?
(Real proteins are usually slightly non-ideal, meaning that bonds and bond angles can compress or stretch a tiny bit. A small stretch or compression is a little bit energetically unfavorable, but can sometimes result in other, highly favorable interactions that make up for that. When you turn on the high-powered wiggle, you're moving away from ideality a tiny bit, but finding better packing, so the score improves. "Ideal" doesn't necessarily mean more realistic, though that can be confusing.)
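The trade-off described here can be sketched with a toy calculation. The numbers and the two-term score below are invented purely for illustration. Real Foldit scores combine many more terms, and the function name `total_score` is hypothetical.

```python
# Toy illustration with invented numbers; not real Foldit scoring.
def total_score(packing_bonus, ideality_penalty):
    """Higher is better: favourable packing minus the ideality penalty."""
    return packing_bonus - ideality_penalty


# Perfectly ideal geometry, but looser packing (hypothetical values).
ideal_model = total_score(packing_bonus=50.0, ideality_penalty=0.0)

# Slightly non-ideal bonds after a high-power wiggle, with tighter packing.
wiggled_model = total_score(packing_bonus=58.0, ideality_penalty=1.5)

print(f"ideal model:   {ideal_model:.1f}")
print(f"wiggled model: {wiggled_model:.1f}")
print("wiggle helped" if wiggled_model > ideal_model else "wiggle hurt")
```

In this sketch the wiggled model wins because the packing gain outweighs the small ideality penalty. If the penalty grew larger than the gain, the comparison would flip, which is why very stretched bonds still count against a structure.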