Exciting preliminary results in Grand Challenge puzzles!
First of all, I would like to thank all of you for participating in this most exciting new venture! I think FoldIt has real promise for making it possible for people all over the world and from all different backgrounds to bring their talents and unique skill sets to bear on solving critical problems in biomedicine and bioengineering. And of course, have fun at the same time!
My research group at the University of Washington has been working on the protein structure prediction problem for quite some time. With the Rosetta program and the computing power of Rosetta@home, we can now predict structures accurately for small proteins. The main obstacle preventing us from being able to predict structures of larger and more complex proteins is that finding the correct structure becomes increasingly difficult. The correct structure almost always has lower energy than incorrect structures, so if Rosetta can locate the native structure it can be identified based on its very low energy. We find that Rosetta often gets close to the native structure, but not close enough for the energy to drop significantly. This drop requires a last few twists and contortions that Rosetta, which searches by making random changes and evaluating the energy, is not smart enough to try.
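To make the "random changes, then evaluate the energy" idea concrete, here is a toy sketch in Python. It is not Rosetta's actual search or energy function: the one-dimensional `energy` landscape, the `random_search` helper, and all parameter values are assumptions made up for illustration. The point is only to show how a search that keeps a random change when the energy drops can settle into a shallow well and never reach the deeper one.

```python
import random


def energy(x):
    """Toy double-well energy (hypothetical): a shallow well near x = +1
    and a deeper well near x = -1, separated by a barrier near x = 0."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def random_search(energy_fn, x, steps=5000, max_move=0.1, seed=0):
    """Make small random changes and keep a change only if the energy drops."""
    rng = random.Random(seed)
    current = energy_fn(x)
    for _ in range(steps):
        trial = x + rng.uniform(-max_move, max_move)
        trial_energy = energy_fn(trial)
        if trial_energy < current:
            x, current = trial, trial_energy
    return x, current


# Start in the shallow well; the small moves can never cross the barrier.
x_final, e_final = random_search(energy, x=1.0)
print(f"search ended at x = {x_final:.2f} with energy {e_final:.3f}")
print(f"the deeper well at x = -1 has energy {energy(-1.0):.3f}")
```

In this toy, the search ends close to where it started even though a much lower energy exists elsewhere, which is the situation the paragraph above describes.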
The Grand Challenge puzzles are testing whether FoldIt players can overcome these last obstacles, where Rosetta often fails. The starting points in these puzzles are generally pretty close to the correct structure, but each has one error, or a small number of errors, that prevents the energy from dropping. (In FoldIt, to get a score more like a traditional computer game, we take the negative of the energy, so a higher score means lower energy.)
The exciting preliminary result is that many of you are able to recognize the problems in the starting structures--many of which are the structures that Rosetta trajectories got stuck at--and fix these problems remarkably well. Over the next month we will be releasing more puzzles with different types of errors in the starting Rosetta-generated structures, and doing a full analysis of the results on these challenges, which we will report to you directly and to the world via a scientific publication. Beginning with the central issue of whether human intelligence and intuition can overcome obstacles hindering computers from solving problems like protein folding, there are many fascinating questions, most of which have never been considered before.
Keep up the great work!
David
(Posted by David Baker | Mon, 01/26/2009 - 06:39)
Sneak Preview: All-hands competition
In the process of analyzing the folded proteins produced by the gameplay, we noticed on several occasions that there would be a significant benefit in merging the efforts of different groups and soloists. Specifically, in some instances solutions from different groups get different parts of the protein very close to its true shape. In addition, some of you mentioned that you would be interested in being able to look at and work on the protein after the puzzle has been completed.
With this in mind we have been working on the all-hands competition stage. This stage starts upon the completion of the puzzle, when all the best group solutions have been determined. At that point anyone can try to further improve the protein structure, starting from any of the top 5 high-scoring but diversified solutions. So unlike current puzzles, you can choose to evolve any of the 5 best candidates, and perhaps combine some of the best elements from any of the 5 starting solutions. We think there is a high likelihood that with the ultimate mind-meld of all game players together we can push the results even further. The all-hands score will be kept in the same way that we keep the solo and evolver scores, since we think that this is yet another distinct skill that should be honored by itself. We will make sure that all Grand Challenge puzzles go through the all-hands mode.
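One way to picture "high-scoring but diversified" starting points is the small Python sketch below. The post does not say how the 5 solutions are actually chosen, so everything here is an assumption for illustration: the `pick_diverse_top` helper, the `min_distance` threshold, and the idea of summarizing each solution's shape as a short tuple of numbers are all hypothetical.

```python
import math


def pick_diverse_top(solutions, k=5, min_distance=1.0):
    """Greedy pick: walk solutions from highest to lowest score and keep one
    only if its shape is at least min_distance from every shape kept so far.

    solutions: list of (name, score, shape), where shape is a tuple of floats
    standing in for a structural description (a hypothetical simplification).
    """
    chosen = []
    for name, score, shape in sorted(solutions, key=lambda s: s[1], reverse=True):
        if all(math.dist(shape, kept) >= min_distance for _, _, kept in chosen):
            chosen.append((name, score, shape))
            if len(chosen) == k:
                break
    return chosen


# Made-up candidates: B is a near-duplicate of A, and E is a near-duplicate of C.
candidates = [
    ("A", 9800, (0.0, 0.0)),
    ("B", 9790, (0.1, 0.0)),
    ("C", 9700, (3.0, 0.0)),
    ("D", 9650, (0.0, 4.0)),
    ("E", 9600, (3.1, 0.1)),
    ("F", 9500, (5.0, 5.0)),
]
starting_points = pick_diverse_top(candidates, k=3)
print([name for name, _, _ in starting_points])
```

The greedy rule trades a little score for variety: a near-duplicate of a higher-scoring solution is skipped so that the starting set covers genuinely different shapes.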
We will also be introducing the gallery mode where you should be able to view within the game any of the old best solutions.
All these new features should be coming to a PC near you next week. Please let us know what you think of this in the comments below.
Key differences in the Grand Challenge Puzzles
A few of our players have pointed out the palpable difference in the game play that emerged in the new puzzles.
One of the main reasons for the different feel and different rate of progress in the new puzzles is that the starting points are often the best solutions found by the automatic methods, meaning that each starting point is a fairly strong local minimum (slight changes invariably lead to worse scores or make little difference). We are trying to find out whether people know how to get out of these local minima and find a much better space of solutions by making one or several key changes and then refining these until a better solution is found. Often the big key change will initially make the score a lot worse, even though the score would eventually be a lot higher after some refinement. So in many ways this is harder, and it is also one of the key reasons why computers cannot really do anything to improve the score.
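The "worse first, better later" pattern can be sketched with a toy score function in Python. Nothing here comes from the game itself: the `score` curve, the `refine` helper, and the size of the jump are all assumptions chosen to illustrate the idea of a key change followed by refinement.

```python
def score(x):
    """Toy score (hypothetical): higher is better, as in Foldit.
    There is a modest peak near x = +1 and a much higher one near x = -1."""
    return -100.0 * ((x * x - 1.0) ** 2 + 0.3 * x)


def refine(score_fn, x, step=0.01, max_iters=10_000):
    """Small tweaks only: step left or right while the score strictly improves."""
    for _ in range(max_iters):
        best = max((x - step, x + step), key=score_fn)
        if score_fn(best) <= score_fn(x):
            break
        x = best
    return x


# The starting point is the best that small tweaks alone can reach.
x_start = refine(score, 1.0)

# One big "key change": the score drops at first...
x_jump = x_start - 1.5

# ...but refining from the new position ends up much higher.
x_final = refine(score, x_jump)

for label, x in (("start", x_start), ("after key change", x_jump), ("after refinement", x_final)):
    print(f"{label:>17}: score = {score(x):7.1f}")
```

Refining again from `x_start` goes nowhere, because every small tweak lowers the score. Only the large move, which looks like a mistake at first, opens the way to the higher peak.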
Also, since the score initially is already very high (the best the computer can get), the final solution is not going to be significantly higher than the starting position. In the Grand Challenge puzzles 100 or even 20 points makes a huge difference, a difference between a really good prediction and the many flawed ones that massive distributed computing efforts generate.
Also, one possible reason why the solutions may seem to require a lot of small tweaks at some point is that you may have reached the global solution to the puzzle (the native protein), and further improvements are simply not possible. At least we hope that is one of the reasons ;)
So keep all this in mind as you tackle the Grand Challenges. They are harder, and every 10 point difference can make a huge impact. You're at the bleeding edge of the current state-of-the-art in protein structure prediction.
The New Grand Challenge
The release of "The New Challenge" puzzle signifies the start of a new sequence of puzzles aimed at determining the specific ways in which human puzzle solving can outperform the best known automated methods. These puzzles are a result of the analysis of all gameplay results since we introduced the game in May. The puzzles we will be rolling out from now on are specifically selected to confirm certain hypotheses about the nature of problems where the collective "game brain" will outperform other search methods.
The CASP8 results showed us that protein exploration through Foldit puzzles is a promising new paradigm for 3D structural exploration. They did not, however, show how and why people are able to get such good solutions. The "grand challenge" puzzles will hopefully provide that information. Should the results prove to be interesting, we will shortly afterward submit a research paper reporting our findings.
So let the Grand Challenge begin! We're counting on you.
We've got some very exciting news! The table below shows our placement in CASP8 puzzles. Click on the Target column to go to the respective CASP8 page showing results. The Rank column shows the Foldit solution's rank according to the default CASP8 sorting criterion (GDT_TS). The Entries column shows the number of submissions from biochemistry labs worldwide. The Puzzle column shows the corresponding Foldit puzzle.
| Target | Rank | Entries | Puzzle | Group / User |
| TR389 | 3 | 71 | TS389_5 | Another Hour Another Point |
| TR389 | 7 | 71 | TS389_1 | Another Hour Another Point |
| TR461 | 2 | 83 | TS461_1 | Another Hour Another Point |
| TR461 | 19 | 83 | TS461_5 | Another Hour Another Point |
| TR469 | 2 | 74 | TS469_5 | Another Hour Another Point |
| TR469 | 3 | 74 | TS469_4 | Another Hour Another Point |
| TR488 | 1 | 77 | TS488_1 | Another Hour Another Point |
| TR488 | 3 | 77 | TS488_3 | Another Hour Another Point |
| TR488 | 26 | 77 | TS488_5 | Another Hour Another Point |
| TS492 | 2 | 527 | TS492_3 | Another Hour Another Point |
| TS492 | 96 | 527 | TS492_5 | Another Hour Another Point |
Looking at the numbers, it appears that Foldit players did amazingly well! You placed in the top 3 in a number of puzzles, and it seems that we won one of the puzzles. Congratulations, Foldit players! You're amazing.
Looking beyond the numbers, the preliminary conclusion from biochemists is that Foldit players are on par with, but not better than, protein folding experts trying to solve the same problem with all the tools available to them. It also appears that Foldit outperformed all fully automated server submissions. Hopefully over time Foldit can do even better, but being able to produce solutions of the same quality as experts means that top science research can now also be done outside of labs by game players, significantly speeding up the process of scientific advancement! As developers, we are truly inspired by this.
Of course, we didn't do well on all puzzles, but even this has been very useful to us as we're evolving the game further to significantly improve over these findings.
Some caveats that should be mentioned:
- there is considerable discussion over the best way to score the quality of a solution, and GDT_TS, the metric CASP8 sorts its solutions by, is just one option. The ranking would be slightly different with different metrics, but the overall conclusion should not change.
- it is possible that the specific starting point including all the constraint bands played a significant role in the success of foldit solutions, and we may have gotten a particularly good one from the Baker lab. To understand this better, we'll be introducing new puzzles (more on that tomorrow).
- foldit did not have the sequence alignment tool that is fundamental for protein prediction. Scientists use sequence alignment a lot to find the best score. For the next CASP challenge we will make sure that these tools are also part of the game, which should significantly improve the gameplay and the resulting proteins.
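For readers curious about the GDT_TS metric mentioned in the first caveat, here is a simplified Python sketch. GDT_TS averages, over distance cutoffs of 1, 2, 4, and 8 Å, the percentage of residues in a model that lie within the cutoff of the matching residue in the reference structure. The real calculation also searches over many superpositions to maximize each percentage. This sketch skips that step and assumes the two structures are already superposed, so treat it as an approximation. The `gdt_ts` function name and the example coordinates are made up for illustration.

```python
import math


def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two pre-superposed lists of (x, y, z) positions,
    one per residue, in angstroms. Returns a value from 0 to 100."""
    if len(model) != len(reference) or not model:
        raise ValueError("model and reference need the same non-zero length")
    distances = [math.dist(a, b) for a, b in zip(model, reference)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Made-up 4-residue example: the model residues are off by 0.5, 1.5, 3.0, and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.5, 0.0, 0.0), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (21.4, 0.0, 0.0)]
example_score = gdt_ts(model, reference)
print(f"GDT_TS = {example_score:.2f}")
```

Because the score rewards residues at several tolerance levels, a model can score reasonably well even when one region is badly off. That is one reason rankings can shift when a different metric is used.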