So, the CASP10 results have been up for a while on the CASP webpage as most of the natives have been solved and released.
If you looked at the results you saw that Foldit did quite well in the Refinement category!
The Template-Free category is always a very tough one, and unlike in CASP9, there was no amazing de-novo prediction in CASP10. In CASP9 there was one amazing prediction that was highlighted by the assessors: T0581. It had been generated by the RosettaServer and the best prediction came from the Void Crushers (this was in the NSMB paper). But this year it was a very tough category and no group really "nailed" any template-free target.
The Template-Based category was different: lots of CASP10 teams were able to do well. These are the targets where there is already a close structure that you can start from, or many different templates. This category was a lot harder for Foldit because, unlike the other CASP10 teams (who get to use many bioinformatic tools), all we gave you was an extended chain and the Alignment Tool. Even with just that, though, many Foldit players were able to do very well. The main issue is that we have trouble selecting these models.
This was also an issue in the Refinement Category, as you can see in the table below:
Each row represents a different CASP10 refinement target that Foldit was able to participate in (some of the proteins were too big for Foldit).
The second column is the GDT of the starting model given to us by the CASP organizers.
The third column shows you the GDT of the best Foldit prediction in the set of models that was filtered by the WeFold team.
The last column is the GDT of the model that the CASP organizers deemed to be the best prediction from any CASP10 team (not just us).
You clearly generated some amazing predictions (most of them are a lot better than the starting refinement model!) and had we been able to pick them out, they would have beaten the predictions selected by the other participants at CASP10. In terms of the refinement category, the last column highlights the winners of each of those targets, but clearly you generated better models than what they were able to pick out and submit! Unfortunately, we are very bad at selecting those solutions.
What is interesting is that you also seemed to get better as CASP went on, but that could be because the first refinement targets were smack in the middle of CASP10 (during the Template-Based and Template-Free puzzles) whereas eventually you were able to focus solely on the refinement targets.
Sadly, we've known for a while that we are very bad at selection, which is why three Foldit Groups asked us (before CASP10 started) to let them pick out THEIR OWN group's submissions.
The next table shows you which CASP10 team submitted the best prediction for each of the CASP10 refinement targets that Foldit participated in:
Here, the second column is sorted by the "Improvement in GDT over the starting refinement model."
This table shows how much the very best prediction submitted to CASP10 (by any team) actually improved on the starting refinement model that the organizers initially gave us.
It is obvious from this table that the FEIG group won the refinement category in CASP10, but you can see that a lot of their "winning predictions" didn't actually improve over the starting model that much.
[For anyone interested: Michael Feig's CASP10 team utilized many independent explicit solvent molecular dynamics simulations, which Foldit doesn't have access to, since Rosetta currently doesn’t include explicit water molecules]
But, if you look at which refinement predictions had the best improvements in CASP10: Foldit is at the top!
Anthropic Dreams, Void Crushers, and one of the WeFold branches were all able to find, select, and submit those amazing predictions! When Foldit wins, it wins big, but when we select poorly (because we were bold in our selections) then it really hurts us.
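For anyone who wants to play with the idea, the "improvement in GDT over the starting refinement model" used to sort the table above is just a subtraction per target, followed by a sort. Here is a minimal sketch of that bookkeeping. The target names and GDT values are made up for illustration (they are NOT the CASP10 numbers), and the function names are hypothetical.

```python
# Hypothetical numbers for illustration only; these are NOT the CASP10 results.
targets = [
    # (target name, GDT of starting model, GDT of best submitted prediction)
    ("target_A", 60.0, 61.5),
    ("target_B", 55.0, 63.0),
    ("target_C", 70.0, 70.5),
]


def gdt_improvement(start_gdt, best_gdt):
    """Improvement of the best submitted model over the starting model."""
    return best_gdt - start_gdt


def rank_by_improvement(rows):
    """Return (name, improvement) pairs, largest improvement first."""
    scored = [(name, gdt_improvement(start, best)) for name, start, best in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_improvement(targets)
for name, delta in ranking:
    print(f"{name}: {delta:+.1f} GDT")
```

Sorting this way rewards the size of the improvement on a single target rather than the number of targets won, which is why a group can top this ranking without winning the category overall.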
I think the take-away message is that selection is still the main issue... but that you are much better at it than we are! It is important to note, though, that the other CASP groups will surely argue that they have this problem too: that they also generated better models that they weren't able to select.
This leads me to Hand-Folding:
The above figure is an RMSD plot for Puzzle 689b: Hand-Folding CASP10 T0711 Repost. It was a 33-residue freestyle puzzle, with templates, and had 3 disulfide bridges (which you totally got perfectly!). The Rosetta energy (y-axis) is what you see in the game, as it corresponds to the Foldit score (except that it is negative on this plot). On the x-axis is the RMSD representing how far from the native each Foldit solution is (RMSD = 0 is a perfect match, RMSD = 33 is completely wrong).
You can see above that the solution with the lowest Rosetta energy (the top-scoring Foldit solution, the one that is easy to pick out by score) is actually one of the closest to the native.
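If you are curious what the RMSD on the x-axis actually measures, here is a minimal sketch in plain Python. It assumes the two structures have the same number of atoms and have already been superimposed. Real structure comparison normally performs an optimal superposition first, which this sketch skips. The function name and the toy coordinates are made up for illustration and are not taken from Foldit or Rosetta.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Toy three-atom "structures" (hypothetical coordinates, in Angstroms).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 3.0, 0.0)]

print(rmsd(native, native))  # identical structures give 0.0
print(rmsd(native, model))   # one atom displaced by 3 Angstroms
```

An RMSD of 0 means every atom sits exactly where it does in the native, and the value grows as atoms drift away from their native positions.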
Now compare that RMSD plot to the one for Puzzle 694: CASP T0711 Disulfide Repost Round 2:
This is the same plot as above, but for Round 2 of this puzzle (when Lua scripts and sharing were allowed). Here you can see that if we selected the best Foldit score (lowest Rosetta energy), it would not be the solution that is closest to the native. What we want is for the lowest points to be as far to the left as possible.
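One simple way to put a number on "the lowest points should be as far to the left as possible" is to pick the lowest-energy solution and ask how much worse its RMSD is than the best RMSD anyone found. The sketch below does that. The two data sets are invented to mimic the two situations described above (they are not the real puzzle data), and `pick_by_energy` and `selection_gap` are hypothetical helper names, not anything from Foldit or Rosetta.

```python
# Hypothetical (energy, rmsd) pairs for illustration only; not real puzzle data.
funnel_like = [(-120.0, 1.2), (-110.0, 2.5), (-95.0, 4.0), (-90.0, 6.1)]
no_funnel = [(-120.0, 5.0), (-118.0, 1.1), (-100.0, 3.3), (-90.0, 6.1)]


def pick_by_energy(solutions):
    """Return the (energy, rmsd) pair with the lowest energy, i.e. the top Foldit score."""
    return min(solutions, key=lambda s: s[0])


def selection_gap(solutions):
    """RMSD of the lowest-energy pick minus the best RMSD in the set (0 is ideal)."""
    picked_rmsd = pick_by_energy(solutions)[1]
    best_rmsd = min(r for _, r in solutions)
    return picked_rmsd - best_rmsd


for label, data in [("funnel-like", funnel_like), ("no funnel", no_funnel)]:
    print(f"{label}: selection gap = {selection_gap(data):.1f}")
```

A gap of zero means that selecting by score alone would have found the model closest to the native. A large gap means a better model existed, but the score did not point to it.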
[The reason we reposted these puzzles after CASP10 is that when you formed the disulfide bridges during CASP10 it didn't score as high as if you didn't form them, so even though there were many Foldit predictions that were correct, they were not easy to pick since they were not the top-scoring ones. This has been fixed with the new disulfide bonus that was on these recent puzzles.]
These are obviously preliminary results (as this was only the fifth Hand-Folding puzzle we've posted) so we need to look into this some more, but this might explain why we were having so much trouble picking out your solutions that were closest to the native during CASP10.

(Posted by beta_helix | Wed, 04/10/2013 - 21:27)