A brief message about the Rosetta Energy Function
I just wanted to show an example of the Rosetta energy function that is used in Foldit.
We have evidence that the Rosetta energy function is able to pick out the native conformation when compared to many other models.
In the attached plot, the y-axis is the Rosetta score (negative is better here) and on the x-axis is how far off the model is from the native (so closer to the left is better). The wiggled native proteins are shown in blue on the very bottom left with all the black dots representing Rosetta predictions.
If you were to look at all models with a Rosetta score of exactly -170 (draw a line horizontally across the graph at -170) you can see that these models are very different from one another. If you look at the most successful predictions (shown by the red arrow) you can see that they have the best Rosetta score and are closest to the correct native structure. Based on this, we believe that our energy function is good at picking out folds that are similar to the native.
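The two observations above can be sketched in a few lines of Python. This is only an illustration: the `models` list is made-up data (not values from the paper), the distance from the native is assumed to be an RMSD-like number, and the helper names are hypothetical.

```python
# Hypothetical (distance_from_native, rosetta_score) pairs.
# These numbers are invented for illustration; they are not from the plot.
models = [
    (1.2, -195.0),
    (1.5, -190.0),
    (3.8, -170.0),
    (6.1, -170.0),
    (9.4, -170.0),
    (7.7, -160.0),
    (11.0, -150.0),
]


def lowest_score_model(models):
    """Return the (distance, score) pair with the lowest (best) score."""
    return min(models, key=lambda m: m[1])


def distance_spread_at_score(models, score, tolerance=0.5):
    """Return (min, max) distance among models scoring within `tolerance` of `score`.

    Raises ValueError if no model falls inside that score band.
    """
    distances = [d for d, s in models if abs(s - score) <= tolerance]
    return min(distances), max(distances)


best = lowest_score_model(models)
spread = distance_spread_at_score(models, -170.0)
print("best-scoring model:", best)
print("distance range at score -170:", spread)
```

In this toy data set the models at -170 span a wide range of distances from the native, while the lowest-scoring model is also the one closest to it, which is the pattern described for the plot.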
You'll notice that most of the Rosetta predictions on this graph are far from the native structure.
We are hoping that Foldit players will be able to more efficiently make accurate models of protein structures, because the automated methods are essentially random searches.
This figure is taken from the Science article:
"Toward High-Resolution de Novo Structure Prediction for Small Proteins"
Philip Bradley, Kira M. S. Misura, and David Baker
Science 16 September 2005: 1868-1871.
Exciting preliminary results in Grand Challenge puzzles!
First of all, I would like to thank all of you for participating in this most exciting new venture! I think FoldIt has real promise for making it possible for people all over the world and from all different backgrounds to bring their talents and unique skill sets to bear on solving critical problems in biomedicine and bioengineering. And of course, have fun at the same time!
My research group at the University of Washington has been working on the protein structure prediction problem for quite some time. With the Rosetta program and the computing power of Rosetta@home, we can now predict structures accurately for small proteins. The main obstacle preventing us from being able to predict structures of larger and more complex proteins is that finding the correct structure becomes increasingly difficult. The correct structure almost always has lower energy than incorrect structures, so if Rosetta can locate the native structure it can be identified based on its very low energy. We find that Rosetta often gets close to the native structure, but not close enough for the energy to drop significantly. This drop requires a last few twists and contortions that Rosetta, which searches by making random changes and evaluating the energy, is not smart enough to try.
The Grand Challenge puzzles are testing whether FoldIt players can overcome these last obstacles which Rosetta often fails at. The starting points in these puzzles are generally pretty close to the correct structure, but there are one or a small number of errors which prevent the energy from dropping (in FoldIt, to get a score more like a traditional computer game, we take the negative of the energy, so a higher score means lower energy).
The exciting preliminary result is that many of you are able to recognize the problems in the starting structures--many of which are the structures that Rosetta trajectories got stuck at--and fix these problems remarkably well. Over the next month we will be releasing more puzzles with different types of errors in the starting Rosetta generated structures, and doing a full analysis of the results on these challenges which we will report to you directly and to the world via a scientific publication. Beginning with the central issue of whether human intelligence and intuition can overcome obstacles hindering computers from solving problems like protein folding, there are many fascinating questions, most of which have never been considered before.
Keep up the great work!
David
( Posted by David Baker | Mon, 01/26/2009 - 06:39 )
Sneak Preview: All-hands competition
In the process of analyzing the folded proteins produced by the gameplay, on several occasions we noticed that there would be a significant benefit in merging the efforts of different groups and soloists. Specifically, in some instances, solutions from different groups get different parts of the protein very close to their true shape. In addition, some of you mentioned that you would be interested in being able to look at and work on the protein after the puzzle has been completed.
With this in mind we have been working on the all-hands competition stage. This stage starts upon the completion of the puzzle, when all the best group solutions have been determined. At that point anyone can try to further improve the protein structure, starting from any of the top 5 high-scoring but diversified solutions. So unlike current puzzles you can choose to evolve any of the 5 best candidates, and perhaps combine some of the best elements from any of the starting 5 solutions. We think there is a high likelihood that with the ultimate mind-meld of all game players together we can push the results even further. The all-hands score will be kept in the same way that we keep the solo and evolver scores, since we think that this is yet another distinct skill that should be honored in its own right. We will make sure that all Grand Challenge puzzles go through the all-hands mode.
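One way to picture "high-scoring but diversified" is a greedy filter: walk down the solutions from best score to worst and keep one only if it is sufficiently different from those already kept. The sketch below is an assumption about how such a selection could work, not a description of how the game actually chooses the starting solutions. The `features` tuples, the Euclidean distance, and the `min_distance` threshold are all hypothetical stand-ins for a real structural comparison.

```python
import math


def pick_diverse_top(solutions, k=5, min_distance=1.0):
    """Greedily pick up to k high-scoring solutions that are mutually dissimilar.

    `solutions` is a list of (name, score, features) tuples, where `features`
    is a hypothetical numeric summary of the structure used only to measure
    how different two solutions are.
    """
    chosen = []
    # Visit candidates from highest score to lowest.
    for candidate in sorted(solutions, key=lambda s: s[1], reverse=True):
        # Keep the candidate only if it is far enough from every kept solution.
        if all(math.dist(candidate[2], kept[2]) >= min_distance for kept in chosen):
            chosen.append(candidate)
        if len(chosen) == k:
            break
    return chosen


# Invented example data: names, scores, and 2-D feature points.
solutions = [
    ("group_a", 9100, (0.0, 0.0)),
    ("group_b", 9090, (0.1, 0.0)),  # nearly identical to group_a
    ("solo_c", 9050, (3.0, 0.0)),
    ("group_d", 9000, (0.0, 4.0)),
    ("solo_e", 8990, (3.1, 0.1)),   # nearly identical to solo_c
    ("group_f", 8900, (5.0, 5.0)),
]

picked = [name for name, _, _ in pick_diverse_top(solutions, k=3)]
print(picked)
```

With this toy data the second-best solution is skipped because it is almost the same as the best one, so the three starting points cover three different regions.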
We will also be introducing the gallery mode where you should be able to view within the game any of the old best solutions.
All these new features should be coming to a PC near you next week. Please let us know what you think of this in the comments below.
Key differences in the Grand Challenge Puzzles
A few of our players have pointed out the palpable difference in the game play that emerged in the new puzzles.
One of the main reasons for the different feel and different rate of progress for the new puzzles is that they are often the best solutions found by the automatic methods -- meaning that the starting point is a fairly strong local minimum (slight changes invariably lead to worse scores or little difference). We are trying to find out whether people know how to get out of these, and find a much better space of solutions by making one or several key changes and then refining these until a better solution is found. Often making the big key change may initially make the score a lot worse, even though the score would eventually be a lot higher after some refinement. So in many ways this is harder, and also one of the key reasons why computers cannot really do anything to improve the score.
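The idea of a key change followed by refinement can be illustrated on a toy, one-dimensional "score landscape". Everything in this sketch is an assumption made for illustration: the `score` function is an invented two-peak curve (higher is better, as in the game), `refine` is a simple small-step hill climb standing in for local refinement, and the size of the "key change" is chosen by hand. It is not how Rosetta or Foldit scores or moves a protein.

```python
def score(x):
    """Invented landscape: a lower peak at x=1 (100) and a higher peak at x=5 (150)."""
    return max(100 - 50 * abs(x - 1), 150 - 50 * abs(x - 5))


def refine(x, step=0.1, max_iters=1000):
    """Small-step hill climb: move to a neighbour only while it improves the score."""
    for _ in range(max_iters):
        # Listing x first means ties keep the current position.
        best = max((x, x - step, x + step), key=score)
        if best == x:
            break
        x = best
    return x


start = 1.0                  # the starting point: already on the lower peak
stuck = refine(start)        # small tweaks alone cannot leave this peak

jumped = start + 2.5         # one big "key change" across the valley
after_jump = refine(jumped)  # refinement then climbs the higher peak

print("start score:          ", score(start))
print("after small tweaks:   ", score(stuck))
print("right after key change:", score(jumped))
print("after refinement:     ", round(score(after_jump), 2))
```

In this toy landscape the score drops right after the big change and only ends up above the starting score once the refinement has run, which mirrors the behaviour described above.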
Also, since the score initially is already very high (the best the computer can get), the final solution is not going to be significantly higher than the starting position. In the Grand Challenge puzzles 100 or even 20 points makes a huge difference, a difference between a really good prediction and the many flawed ones that massive distributed computing efforts generate.
Also, one possible reason why the solutions may seem to require a lot of small tweaks at some point is that you may have reached the global solution to the puzzle (the native protein), at which point no subsequent tweak can improve the score. At least we hope that is one of the reasons ;)
So keep all this in mind as you tackle the Grand Challenges. They are harder, and every 10 point difference can make a huge impact. You're at the bleeding edge of the current state-of-the-art in protein structure prediction.
The New Grand Challenge
The release of "The New Challenge" puzzle signifies the start of a new sequence of puzzles aimed at determining the specific ways in which human puzzle solving can outperform the best known automated methods. These puzzles are a result of the analysis of all game play results since we introduced the game in May. The puzzles we will be rolling out from now on are specifically selected to confirm certain hypotheses about the nature of problems where the collective "game brain" will outperform other search methods.
The CASP8 results showed us that protein exploration through Foldit puzzles is a promising new paradigm for 3D structural exploration. They did not, however, show how and why people are able to get such good solutions. The "grand challenge" puzzles will hopefully provide that information. Should the results prove to be interesting, we will shortly afterward submit a research paper reporting our findings.
So let the Grand Challenge begin! We're counting on you.