DEVPREV and scoreboards scoring abnormality

Case number: 845799-2009750
Topic: Biochem
Opened by: Bletchley Park
Status: Open
Type: Bug
Opened on: Friday, May 29, 2020 - 05:55
Last modified: Monday, June 1, 2020 - 01:33

20200527-ebd9381948-win_x86-devprev

This 'devprev' version allows scores to permeate to the leaderboard even though its scoring is done significantly differently, giving a serious unfair advantage over the 'main' client.

AWP wiggling does not actually wiggle at AWP, as subsequent HWP wiggling adds nothing.

Further investigation will have to reveal whether this was already present in the previous devprev version and thus may have affected past puzzles.

Solution shared with scientists: 'ridiculous_score'

(Fri, 05/29/2020 - 05:55  |  12 comments)


Joined: 05/19/2009
Groups: Contenders

A further test has confirmed my suspicion that previous puzzles benefit unfairly from the devprev client. A simple wiggle from our top-scoring solution (developed and evoed in the 'main' client only) gained nearly another 250 points without effort.

I would like to know from the makers whether the alternate scoring routines were already present in the previous devprev client.

Hanto
Joined: 05/10/2008
Groups: None

According to the wiggle correction routine at the beginning of OLD and NEW AFK, my wiggle is low by about .11 on both prior and new devprev on multiple computers. Ratio for me is pretty much what it has always been in my case.

bkoep
Joined: 11/15/2012
Groups: Foldit Staff

Thanks for the quick feedback, Bletchley! Can you tell me which puzzle you shared the "ridiculous_score" solution for?

Joined: 05/19/2009
Groups: Contenders

Hi bkoep,

It was for 1842, but since then I have run this on historic puzzles from the past month or so and found *all* of them affected: when run through the devprev client, they score hundreds of points higher from wiggling alone. Note that we used only the main client for these puzzles. I am not sure whether other teams used the then-current devprev client, so it is important to check the previous devprev for issues, as the scoring may have affected the science outcomes of already closed puzzles.

georg137
Joined: 08/07/2010
Groups: Contenders

1842 was a ratings and science train-wreck for main clients, but this concerns several puzzles and probably quite a few users going back weeks.

Are module coders and testers the same people, or are these duties assigned to different individuals?
Are coders also account-holder users?

Hanto
Joined: 05/10/2008
Groups: None

1841b was the best puzzle we've had in a long time, but many factors are at play. Less than 2 months ago I had a global rank of 37; now look. New members, many of them quite good, are affecting outcomes, as they should. Also, attempts to play design puzzles starting out with no structures are a waste of time and personal/computer energy. There should be an AUTO method for transferring structures to a blank design puzzle. YA HEAR ME DEVs!!!!!!!

joshmiller
Joined: 09/08/2017
Groups: Foldit Staff

I hear you, Hanto! We're working on a new tool that might be helpful. In the meantime, I recommend Loci's wonderful SS Edit recipe: https://fold.it/portal/recipe/103294

Joined: 05/19/2009
Groups: Contenders

The report is for devprev on Windows. What are the differences between devprev for Linux, Windows and Mac? Is it possible that one of the other platforms has incorrect code as well, potentially even in an earlier version? Do you do any cross-checking between platform versions regarding real-world performance to verify consistency?

Joined: 09/24/2012
Groups: Go Science

On puzzle 1844 with 20200527-ebd9381948-win_x86-devprev on Windows: after loading a solution shared at low wiggle power (from Main), wiggle all on "low" wiggle power behaves like "medium" wiggle power.

The score rose by 800. Trying Medium or Auto afterwards didn't change anything. Starting again from the share with Medium or Auto wiggle power gives the same score.

Conclusion: it's the same wiggle power for Low, Medium and Auto.

Now taking a share from Main commented as "Medium" and wiggling raises the score by 600 (it is not certain that the player's last action was at medium wiggle power).

I wonder whether the "default" (low, auto, medium) is actually "high" wiggle power.

I shared "before" and "after" solutions from both tests with scientists.

To verify this hypothesis, I loaded a shared "High" solution from revisiting puzzle 1845. Wiggle all raised the score by about 100. The "default" is thus not "high" wiggle power but something else ("super high"). I shared before and after solutions with scientists for this puzzle too.
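The reasoning in the tests above can be written down as a small Python sketch. It is only an illustration, not Foldit code: the function name, the power ordering list and the noise threshold are all assumptions made for the example, and the gains are the approximate figures quoted in this comment. The idea is that if a solution finished at wiggle power P still gains points when re-wiggled in devprev, the effective devprev wiggle power must be stronger than P.

```python
# Illustrative sketch only; none of these names come from the Foldit client.
# Assumed ordering of the manual wiggle power settings, weakest first.
POWER_ORDER = ["low", "medium", "high"]


def strongest_exceeded_power(observations, noise=1.0):
    """Return the strongest wiggle power that devprev appears to exceed.

    observations: list of (power_share_was_finished_at, score_gain) pairs,
        where score_gain is the gain from a plain wiggle all in devprev.
    noise: assumed threshold below which a gain is treated as no change.
    Returns None if no share gained more than the noise threshold.
    """
    exceeded = [power for power, gain in observations if gain > noise]
    if not exceeded:
        return None
    return max(exceeded, key=POWER_ORDER.index)


# Approximate gains reported above: 800 (low share), 600 (share commented
# as "Medium"), about 100 (High share).
reported = [("low", 800), ("medium", 600), ("high", 100)]
print(strongest_exceeded_power(reported))
```

With the figures reported above, even the "High" share still gains, which is what leads to the "super high" conclusion. The "Medium" observation carries the caveat already mentioned: the player may not have finished at medium wiggle power.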

joshmiller
Joined: 09/08/2017
Groups: Foldit Staff

Thanks all. We think we've identified the bugs described here, and we're working on putting out a hotfix some time this week. Thanks for your patience.

Joined: 05/19/2009
Groups: Contenders

I notice there is currently no active client in the experimental group. Was a 0528 devprev pre-release removed and not replaced with a different client? Which client should currently be in experimental?

joshmiller
Joined: 09/08/2017
Groups: Foldit Staff

To my knowledge, that's correct, Bletchley. We will be replacing the client later this week. Please use Main in the meantime.


Developed by: UW Center for Game Science, UW Institute for Protein Design, Northeastern University, Vanderbilt University Meiler Lab, UC Davis
Supported by: DARPA, NSF, NIH, HHMI, Amazon, Microsoft, Adobe, RosettaCommons