Mining the solutions received by the server

Case number: 671076-2004305
Opened by: jeff101
Opened on: Tuesday, October 10, 2017 - 00:34
Last modified: Thursday, November 2, 2017 - 02:39

I think each Foldit client automatically sends its latest solutions to the Foldit server every few minutes. Does this continue after a puzzle ends? Do the scientists use solutions received after a puzzle ends when they analyze the results for that puzzle?

I've also heard that when the Foldit team analyzes solutions for a puzzle, they only use the best solo and evo solution for each player plus the latest 5 solutions that player has shared with scientists. Is this correct? If a solution is shared with scientists after a puzzle has ended, is the solution used in the analysis for that puzzle?

When I play Foldit, I often use multiple clients for the same puzzle. Often after a puzzle ends, I need to take a break and end up letting the clients keep running whatever recipes they were running. This often generates better solutions than I get credit/rank/points for, but these solutions could be useful to the scientists when they analyze the results for the puzzle. Also, each client is often working on a unique solution, and the ones that score the best might not be the most useful to the scientists. Why throw away all these solutions just because they were made by the same player?

Is it possible to include in the analysis for each puzzle more of the solutions that are automatically sent to the server?


(Tue, 10/10/2017 - 00:34  |  3 comments)

jeff101
Joined: 04/20/2012
Groups: Go Science

Perhaps you could do something like below:

(1) Make a list called allscores of all solutions stored on the server.
(2) Sort the allscores list by score, highest first.
(3) Make an empty list called keepers of solutions to analyze.
(4) Move the highest-scoring solution from the allscores list to the keepers list.
(5) Take the top-scoring solution left in the allscores list (call it topsolution)
and find its alpha-carbon rmsd from each solution in the keepers list. If any
of these rmsd values is below a certain cutoff, topsolution is too similar to
a solution already chosen for analysis, so don't add topsolution to the keepers
list. If all of the rmsd values are above the cutoff, topsolution is novel, so
do add topsolution to the keepers list.
(6) Remove topsolution from the allscores list.
(7) If any solutions remain in allscores, go to (5) above. Otherwise, continue with (8) below.
(8) The list allscores should be empty now. Analyze the results in keepers.
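
The steps above can be sketched in Python roughly as follows. This is only an illustration of the greedy selection in (1)-(8), not the Foldit team's code. The Solution record and the functions ca_rmsd and select_keepers are hypothetical names made up for this sketch. It also assumes that all solutions have the same number of residues and have already been superimposed in a common frame, so the rmsd can be computed directly from the coordinates; a real analysis would probably align each pair of structures first.

```python
import math
from typing import List, NamedTuple, Sequence, Tuple

Coord = Tuple[float, float, float]


class Solution(NamedTuple):
    """Hypothetical record for one uploaded solution."""
    name: str
    score: float                 # Foldit score, higher is better
    ca_coords: Sequence[Coord]   # alpha-carbon coordinates, one per residue


def ca_rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Alpha-carbon rmsd between two structures.

    Assumption: both structures are already superimposed, so no
    alignment step is performed here.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("structures must be non-empty and of equal length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(a, b)
    )
    return math.sqrt(total / len(a))


def select_keepers(solutions: Sequence[Solution], cutoff: float) -> List[Solution]:
    """Greedy selection: walk solutions from best to worst score and keep
    one only if it is at least `cutoff` rmsd away from every keeper so far.
    (An rmsd exactly equal to the cutoff is treated as novel here; the
    steps above do not say how to handle that case.)
    """
    # Steps (1)-(2): collect and sort by score, highest first.
    allscores = sorted(solutions, key=lambda s: s.score, reverse=True)
    # Step (3): start with no keepers.
    keepers: List[Solution] = []
    # Steps (4)-(7): the first solution is always kept, because the
    # keepers list is empty and all() over nothing is True.
    for topsolution in allscores:
        if all(ca_rmsd(topsolution.ca_coords, k.ca_coords) >= cutoff
               for k in keepers):
            keepers.append(topsolution)
    # Step (8): keepers now holds the structurally distinct solutions.
    return keepers


if __name__ == "__main__":
    a = Solution("a", 9500.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
    b = Solution("b", 9400.0, [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)])  # close to a
    c = Solution("c", 9000.0, [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)])  # far from a
    keepers = select_keepers([c, a, b], cutoff=1.0)
    print([s.name for s in keepers])  # ['a', 'c']
```

In this toy example, solution b is dropped because it is only 0.1 away from the higher-scoring a, while c is kept even though it scores lower, since it is 3.0 away. Each candidate is compared against every keeper, so the cost grows with the number of keepers times the number of solutions.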

bkoep
Joined: 11/15/2012
Groups: Foldit Staff

Excellent suggestions!

To answer your questions:
Yes, your Foldit client will continue to upload solutions to our server if you continue playing after a puzzle expires. However, that does not mean we will wait for you before beginning our analysis. Generally speaking, our analysis will capture only solutions uploaded within an hour or so after puzzle expiration.

We focus most of our attention on the best Solo and Evolver solutions for all players, as well as the Scientist Shares. We do also dig deeper into the bulk of solutions from a puzzle, depending on the amount of interest in the puzzle results (we'll do this for most Design puzzles and special collaborations, for example; but probably not for Revisited or De-novo Freestyle puzzles). In fact, we typically use exactly the process you suggest to cluster these solutions according to structural alignment.

jeff101
Joined: 04/20/2012
Groups: Go Science

Perhaps a good compromise is to allow each player
to share more solutions (say 10) with scientists.
Another idea would be to also include in your
analysis all (or just the most recent list of)
solutions a player shares with himself or his group.

