Wiggle Power Results

Earlier this year we introduced different Wiggle Powers, and a couple months ago we tried to simplify this with Auto-Wiggle Power.

We recently posted a bunch of De-novo puzzles where the "High Wiggle Power" option was disabled. Hopefully the results from those puzzles will explain why we've given High Wiggle Power a time-out during CASP11.

Below are RMSD plots for De-novo Freestyle 36 puzzles 864: Low Power & 868: High Power. The green dots represent your many different Foldit predictions, and for all these RMSD plots, you want to be as close to the left as possible (an RMSD=0.0 would be a perfect match to the native).
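
For readers who have not met RMSD before, here is a minimal sketch of how the number on the x-axis could be computed. It assumes the two structures are already superimposed and compared atom by atom; a real comparison would first align them (for example with the Kabsch algorithm). The function name and the toy coordinates are made up for illustration and are not taken from Foldit or Rosetta.

```python
import math

def rmsd(model, native):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumption: the structures are already superimposed, so no alignment is done here.
    """
    if len(model) != len(native) or not model:
        raise ValueError("structures must be non-empty and the same length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(model, native):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(model))

# Toy coordinates (hypothetical): every atom of `shifted` is 2 units from the native.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
shifted = [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0), (2.0, 2.0, 0.0)]

print(rmsd(native, native))   # 0.0, a perfect match to the native
print(rmsd(shifted, native))  # 2.0
```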

You can see that the top-scoring Foldit solution (the lowest Rosetta energy) doesn't change much between puzzles RMSD-wise. So although the top-scoring Foldit solution for puzzle 868: High Power scored 9,208 (compared to 9,098 in puzzle 864: Low Power), it is not any closer to the native.

In general (and this was the case for all the Low Power/High Power plot comparisons) although the scores were better in the High Power rounds, the models were not any more accurate. We hypothesized that this could be happening because we allowed you to load in solutions from the Low Power rounds, and therefore the High Power round was mostly "drilling" down the energy landscape of those previous models (since doing that would obviously improve the in-game score!).

This is why for 880: De-novo Freestyle 38: High Power we did not let you load in solutions from the previous Low Power round.
You can see in the plots below that this High Power round had fewer green dots, but unfortunately the results are actually much worse than the Low Power round:

On the left, 876: De-novo Freestyle 38: Low Power has a very nice plot where the top-scoring Foldit model is one of the left-most points. This is not the case on the right, where the top-scoring Foldit model from 880: De-novo Freestyle 38: High Power is much further from the native than in the Low Power round.

The exciting news, however, is that the results from the Predicted Contacts rounds have been very promising!
You can see this below for the De-novo Freestyle 37 puzzles:

On the left, the top-scoring solutions for 867: De-novo Freestyle 37: Low Power are not the left-most points on the plot (they are quite far away from the native topology) but given predicted contacts, your results on the right for 875b: De-novo Freestyle 37: Predicted Contacts look great!

The results were similar for the most recent Predicted Contacts puzzle, where we disabled High Wiggle Power and did not allow loading of solutions from 881: De-novo Freestyle 39: Low Power.

So we are looking forward to the Contact-assisted CASP11 targets, and hopefully this post explains why we'll give High Wiggle Power a rest during CASP11. The CASP season is long and busy enough that we don't want you to spend your time gaining Foldit points without getting more accurate solutions!

Lastly, Seth and I wanted to thank all of you in the DC area who stopped by when we presented Foldit (and debuted nanocrafter) at the 3rd USA Science & Engineering Festival.
Next time we promise to give everyone a little more advance notice, and we'll make sure to have a camera ready from day 1. At least we managed to snap a photo with Galaxie on the last day:

Thanks for all your hard work, everybody... and keep up the great folding!

( Posted by  beta_helix 78 1433  |  Tue, 05/13/2014 - 16:10  |  9 comments )
Joined: 12/27/2010
Groups: None
Where's my spot?

If the goal of Foldit is to get closest to the native (after 3 years of playing I sometimes forget that), then it would be very helpful if we could see on the above RMSD diagrams exactly where our own personal solutions fall. That way we could decide for ourselves what techniques give the most natural solution. Furthermore, it appears to me that there is no particular correlation between low energy (high Foldit score) and a low RMSD value, which kind of defeats the object of the exercise.

Perhaps it would be helpful, in puzzles that are examining a known structure, if the fold closest to natural structure was also recognised in some way as opposed to just the highest scoring fold which might be completely wrong.

Joined: 09/24/2012
Groups: Go Science
And medium power? (CASP)

Now I wonder if I should concentrate on improving my low wp only, never switching to medium wp (or only on the last day, to get some points for the game).

I tend to abandon lwp after some time in order to concentrate on gaining points for the game. Maybe in doing so, I'm doing worse for CASP (and science)?

Joined: 03/18/2014
Groups: Gargleblasters
Hmmm

Can't scoring be corrected to reflect this?

I thought the idea was for score to reflect usefulness. Otherwise what's the point in gamifying it? If I can get more points for a less scientifically accurate solution, then something has gone wrong with the gamification process, because score is what I aim for, because I'm a player and want to be awesome. So you need to give us accurate feedback on what is awesome and what is not awesome, and that feedback comes in the form of our scores.

wisky's picture
Joined: 07/13/2011
They have been working at this for YEARS

There was an update to wiggle back in January that was designed to improve upon this aspect. Regularly before that update, many of the highest scoring solutions were NOTHING close to the native structure! I THINK this has been improved with Newchapter (the January wiggle update), though I'm still unsure. I would also like to see how close MY structures are to the native structures... Regardless, people will always fold for a way to improve their score. Like you said, Stack, the idea is for score to reflect usefulness. There is always room for improving this aspect, and the devs have worked hard to improve this!

wisky's picture
Joined: 07/13/2011
:)

All this tells me is that we still have more work to do on the scoring function. What changes would need to be made, I'm not sure. I believe we have been making strides in this area though! 2.0 RMSD is very close, but 4.0 RMSD is still pretty close, and most of the highest scoring structures are getting close in.

Joined: 09/21/2011
Groups: Void Crushers
Rosetta energy score

All this leaves me wondering what the Rosetta Energy Score for the natives is on these puzzles. Would that beat the best player solution?

alcor29's picture
Joined: 11/16/2012
Wiggle powers and NC vs. prior

I would like to know a) if you have stats on auto? And b) perhaps more importantly, stats comparing NC vs the prior version of Foldit. My non-statistical impression is that the RMSDs were just as close to the left, or even closer, with the prior version?

jflat06's picture
Joined: 09/29/2010
Groups: Window Group
.

@Bruno - The main point of medium power wiggle is to allow you to resolve any unidealities that have been created in your pose through the act of folding. It does allow you to 'drill' like high power does, but it actually serves another purpose besides optimization. This is why we introduced Auto wiggle - it only turns on the medium power when it's needed to fix an ideality problem.

@StackOverflow - There are a couple of problems here.

1. Our score function isn't perfect. It turns out it's extremely hard to get scoring right. It's an ongoing goal of the Rosetta project to refine the score function to be more accurate.

2. Even assuming a perfect score function, the only guarantee you get is that the native will be the lowest-energy structure (the highest Foldit score). Other structures that are far away from the native can still score very well, but the native will score better. These other structures can be local minima, meaning that any attempt to modify those structures will result in score loss. Which leads me to...

3. The sampling problem. Even if you have a perfect score function, you still have to find the solutions themselves. Just following the score function isn't enough - if this were true, a computer could easily find the native. The hypothesis with Foldit is that players have the intuition of what changes must be made *against* the score temporarily in order to find the native.
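
To make points 2 and 3 concrete, here is a toy sketch on a made-up one-dimensional energy landscape. It is only an illustration under stated assumptions: the `energies` list, both function names, and the `lookahead` parameter are hypothetical, and this is not how Foldit or Rosetta actually sample. Plain greedy descent stops in the first local minimum, while a search that is allowed to cross temporarily worse positions reaches the lowest-energy point.

```python
def greedy_descent(energies, start):
    """Step to the lower-energy neighbour until no neighbour improves."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(energies)]
        if not neighbours:
            return i
        best = min(neighbours, key=lambda j: energies[j])
        if energies[best] >= energies[i]:
            return i  # stuck: every single step makes the energy worse
        i = best

def descent_with_lookahead(energies, start, lookahead):
    """Greedy descent that may cross up to `lookahead` positions, including
    worse ones, when a lower energy lies beyond them."""
    i = greedy_descent(energies, start)
    while True:
        lo = max(0, i - lookahead)
        hi = min(len(energies), i + lookahead + 1)
        best = min(range(lo, hi), key=lambda j: energies[j])
        if energies[best] >= energies[i]:
            return i
        i = greedy_descent(energies, best)

# Hypothetical landscape: index 1 is a local minimum, index 5 is the global one.
energies = [5, 3, 4, 6, 2, 0, 1]

print(greedy_descent(energies, 0))             # 1
print(descent_with_lookahead(energies, 0, 3))  # 5
```

Getting from index 1 to index 5 means passing through energies 4 and 6 first, which is the kind of temporary move *against* the score described above.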

@Timo - I haven't actually looked at the native scores for these puzzles in particular, but usually the natives *do* have quite a bit lower Rosetta energy (that is, they score better).

@alcor29 - a) We can't have stats like these on auto without constraining the puzzles even further (If by auto, you mean players were only allowed to use auto-wiggle, and not medium or low).

b) It's not just about how close solutions are - the shape of the plot matters a lot as well. If the solutions closest to the native all score poorly, then in a blind case, we have no way of differentiating those from the solutions that are further away. The most direct purpose of NC was to ensure that this shape was correct. Players still have to actually find the correct structures themselves in order to sort of "fill in the shape" (This is called sampling).

As for previous plots - the plots actually vary drastically from puzzle to puzzle, depending on how hard the puzzle is. So it is very hard to draw conclusions from just a handful of plots. Even if we run the same puzzles, there are many more factors that can come into play that keep us from comparing them in a completely fair manner.

What we do know objectively is that a poor score function is incapable of correctly selecting the native even if you do find it.
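
As a rough illustration of why the shape of the plot matters in a blind case, the sketch below picks the lowest-energy model (the only thing we can do without knowing the native) and measures how much worse its RMSD is than the best model that was actually sampled. All numbers, field names, and function names are invented for the example; they are not data from any Foldit puzzle.

```python
def blind_pick(models):
    """Return the model a blind prediction would submit: the lowest energy."""
    return min(models, key=lambda m: m["energy"])

def selection_gap(models):
    """RMSD of the blind pick minus the best RMSD that was sampled."""
    best_sampled = min(m["rmsd"] for m in models)
    return blind_pick(models)["rmsd"] - best_sampled

# Hypothetical "funnel-shaped" plot: low energy goes with low RMSD.
funnel = [
    {"energy": -120.0, "rmsd": 2.0},
    {"energy": -100.0, "rmsd": 5.0},
    {"energy": -90.0, "rmsd": 9.0},
]

# Hypothetical misleading plot: the closest model scores poorly.
misleading = [
    {"energy": -95.0, "rmsd": 2.0},
    {"energy": -100.0, "rmsd": 5.0},
    {"energy": -125.0, "rmsd": 9.0},
]

print(selection_gap(funnel))      # 0.0, the blind pick is also the closest model
print(selection_gap(misleading))  # 7.0, a 2.0 model was found but not selected
```

Both toy sets contain a model at 2.0 RMSD, so the sampling is equally good; only the second set fails at selection.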

spmm's picture
Joined: 08/05/2010
Groups: Void Crushers
Black belt folder

Would be great to see galaxie doing a black belt fold for us!

Developed by: UW Center for Game Science, UW Institute for Protein Design, Northeastern University, Vanderbilt University Meiler Lab, UC Davis
Supported by: DARPA, NSF, NIH, HHMI, Amazon, Microsoft, Adobe, RosettaCommons