Global soloist scores calculated incorrectly.

Case number: 671076-992897
Topic: Server
Opened by: BootsMcGraw
Status: Open
Type: Bug
Opened on: Saturday, June 9, 2012 - 00:47
Last modified: Wednesday, November 13, 2019 - 14:27

It seems we have gone well past the four-month cut-off point on calculating global scores. Scores should have been cut off February 8th, with "Beginner Puzzle: T-cell Lymphoma" being the last puzzle scored. That would have given me 3502 points, yet my global score is 4032.

Lest we forget the mass exodus of players after the last time the scores weren't updated in a timely manner, I suggest someone cull the scores, immediately.

And then write a program to do this automatically, since it appears the scores are revised manually.

(Sat, 06/09/2012 - 00:47  |  41 comments)


Joined: 12/06/2008
Groups: Contenders

I have an alternate suggestion: Scores should be culled on the first day of every month, or some other fixed day (e.g. the 15th, the last day). That policy should also be posted on the "Players" page and everyone's "My Page", so that folders will see it every time they look at the global scores.

B_2
Joined: 11/29/2008
Groups: None

Why wouldn't there be a script to update the scores every time a puzzle closes? Seems pretty obvious...

tamirh
Joined: 05/11/2012

Assigning

Joined: 11/03/2011
Type: Bug » Suggestion

I would suggest exponential smoothing: to get the new global score, you add the last puzzle score to the last global score multiplied by some factor < 1, like 0.98. That way, you take each and every puzzle into account, and if you have a stable puzzle score, 100 for example, your global score would stay at 5000, avoiding the ups and downs of updating and giving some 'absolute' measure of quality. Moreover, the computation is easy and checkable.
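
A minimal sketch of the smoothing rule suggested above, assuming scores are fed in oldest first. The function names are hypothetical and not part of any Foldit code. With a constant puzzle score s and a factor f < 1, the update g = s + f * g settles at s / (1 - f), which is where the 5000 figure for s = 100 and f = 0.98 comes from.

```python
def smoothed_global_score(puzzle_scores, factor=0.98, start=0.0):
    """Fold puzzle scores (oldest first) into one exponentially smoothed total.

    Each step applies: new_global = puzzle_score + factor * old_global.
    """
    score = start
    for puzzle_score in puzzle_scores:
        score = puzzle_score + factor * score
    return score


def steady_state(puzzle_score, factor=0.98):
    """Value the smoothed score converges to for a constant puzzle score."""
    return puzzle_score / (1.0 - factor)


if __name__ == "__main__":
    # A player who scores 100 on every puzzle drifts toward 100 / 0.02.
    print(steady_state(100))
    print(smoothed_global_score([100] * 2000))
```

Older puzzles are never dropped outright under this rule; their weight simply shrinks by the factor with every new puzzle, so no cut-off date has to be maintained.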

beta_helix
Joined: 05/09/2008
Groups: None

Please read about the new scoring system that will soon replace the old broken one:
http://fold.it/portal/node/992875

But hopefully we can keep the old system from failing miserably before we completely switch over!

Joined: 12/06/2008
Groups: Contenders

The old system wasn't broken. It just lacked the means to automatically keep a rolling four-month scoring window. If the leaderboards were managed regularly, this would be a moot point.

The new "categories" are a great addition, if you want to know what types of puzzles are your strengths. But they are useless to tell me if I am a better player than anyone else. Is a rank of nine in CASP better than a rank of five in Symmetry, since there are so many more CASP puzzles? I don't know. And neither will anyone else.

Please do NOT get rid of the current scoring system. Seriously. You are going to offend untold numbers of players if you do, yours truly, included.

B_2
Joined: 11/29/2008
Groups: None

I completely agree with Boots.

Don't do away with the current system, just get the rolling 4 month re-totaling automated as each puzzle closes.

I really thought this was already happening, and I'm kind of disappointed that we are still talking about this after a couple of years.

spmm
Joined: 08/05/2010
Groups: Void Crushers

Isn't the overall category the same as old global solo/evo? My understanding from jflat was that beginner is the <15 <150 puzzles - not puzzles called beginner but that is not clear from the description.

Description:
This is the overall puzzle category, containing all non-beginner puzzles. Doing well in this category means you're well versed in all forms of folding, even if you're not the best in one particular category.

infjamc
Joined: 02/20/2009
Groups: Contenders

Re: spmm

The answer is "almost, but not quite." (The new "overall" category cuts off at exactly 30 puzzles, which is about a 6-week window.)

jflat06
Joined: 09/29/2010
Groups: Window Group

This is correct. The main difference is the window size.

We were already thinking of decreasing the window size, since it forces new players to play for so long before they're able to be competitive on the leaderboards. When we implemented the new categories boards, we decided on using a number-of-puzzles window instead of time because it works better with the 'skips' we've mentioned (if you had a period of the window where there were a LOT of puzzles posted at once, there would be a disproportionately small number of skips compared to the number of puzzles).

Please remember that the actual numbers on skips/window size (and even who gets gold/silver/bronze) are subject to change - these are just the initial values that we've put up, and they're probably not optimal, so we'll likely make adjustments.
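
A rough sketch of a number-of-puzzles window with skips, as described above. The 30-puzzle window comes from this thread; the number of skips and the rule that a skip drops the player's lowest-scoring puzzle in the window are assumptions made only for illustration, and `windowed_score` is a hypothetical name.

```python
def windowed_score(points_by_puzzle, window=30, skips=5):
    """Sum a player's points over the most recent `window` closed puzzles.

    points_by_puzzle: points per closed puzzle, oldest first, with 0 for
    puzzles the player did not play.
    skips: assumed number of lowest results that are ignored (hypothetical).
    """
    recent = points_by_puzzle[-window:]
    keep = max(len(recent) - skips, 0)
    # Keep the best results; the `skips` lowest ones do not count.
    return sum(sorted(recent, reverse=True)[:keep])
```

Because the window is counted in puzzles, the ratio of skips to puzzles stays fixed no matter how many puzzles are posted in a given week, which is the property the post argues a time-based window lacks.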

Joined: 11/03/2011

Obviously, the problem will be the same with the categories, with additional problems linked to the very different numbers in each category.

jflat06
Joined: 09/29/2010
Groups: Window Group
Topic: Game: Other » Server
Type: Suggestion » Bug

The system is set up to recalculate points automatically every hour. However, there is a bug in the recalculation - one that we are having trouble finding. We thought we had fixed it after the last recalculation error, but it appears not.

We will be keeping the old and new systems side by side for a while - we realize that a lot of players are very fond of the old system, and we've taken that into consideration.

However, the old system does not allow for the puzzle load to scale, and it incentivizes people playing every puzzle. While some people may be willing to try and tackle the massive load, it shouldn't be necessary to do well.

In short, the old system was 'broken' because it didn't allow us to run more puzzles, and prevented us from working on multiple avenues of discovery simultaneously.

As for being able to compare yourself against other folders, I think that comparing a rank 3 in design to a rank 7 in prediction doesn't really make sense - they're different skills.

If your point is that you want to see who's better 'overall', that's exactly what the overall leaderboard is for.

Again, we will be keeping the old scoreboards up until we think the problem is actually solved. Just know that if we are successful in getting puzzles posted in each of the categories in a constant stream, it's going to be harder and harder to keep up with that fact on the old leaderboards.

B_2
Joined: 11/29/2008
Groups: None

"However, the old system does not allow for the puzzle load to scale, and it incentivizes people playing every puzzle. While some people may be willing to try and tackle the massive load, it shouldn't be necessary to do well.

In short, the old system was 'broken' because it didn't allow us to run more puzzles, and prevented us from working on multiple avenues of discovery simultaneously.

As for being able to compare yourself against other folders, I think that comparing a rank 3 in design to a rank 7 in prediction doesn't really make sense - they're different skills.

If your point is that you want to see who's better 'overall', that's exactly what the overall leaderboard is for.

Again, we will be keeping the old scoreboards up until we think the problem is actually solved. Just know that if we are successful in getting puzzles posted in each of the categories in a constant stream, it's going to be harder and harder to keep up with that fact on the old leaderboards."

I really don't understand the reasoning here.

How does the old system prevent you from running more and different types of puzzles? All the scoring system has to do is total up the points of all the completed puzzles over the last 120 days - certainly not a difficult task for any computer.

Also, why do the new players need "help" getting up the leaderboards? They should have to work at it just as we all have since the project started. This smacks of some sort of entitlement system, or the kiddy soccer-league thing when everyone has to be a winner.

Let's just keep the existing 120 day window the way it is, and just fix the auto-scoring. So what if it's 200 puzzles deep instead of 100? If you want to be the best, put the work into it, play all the puzzles. I'm going to be mightily pissed off if any newbie can race to the top of the leaderboard in a couple of weeks because you have made it far too easy.
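
As an illustration of the rolling 120-day total this post asks for, here is a minimal sketch. It assumes each result is a (close date, points) pair for one player; `rolling_total` is a hypothetical helper, not the site's actual recalculation code.

```python
from datetime import date, timedelta


def rolling_total(results, today, window_days=120):
    """Sum points for puzzles that closed within the last `window_days` days.

    results: iterable of (close_date, points) pairs for a single player.
    """
    cutoff = today - timedelta(days=window_days)
    return sum(points for close_date, points in results
               if cutoff <= close_date <= today)


if __name__ == "__main__":
    results = [
        (date(2012, 2, 1), 100),  # older than 120 days on June 9, 2012
        (date(2012, 3, 1), 50),
        (date(2012, 6, 1), 25),
    ]
    print(rolling_total(results, today=date(2012, 6, 9)))
```

Running this whenever a puzzle closes, or on a schedule, would drop expired puzzles without any manual culling.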

infjamc
Joined: 02/20/2009
Groups: Contenders

I'm totally speculating here, but here's my guess:

It might be possible that the old system can only handle a certain number of puzzles over a 120-day period. If this were true, then the inability to post more puzzles makes sense, because the scores for the extra puzzles beyond N would otherwise not be countable. (How could this be the case? My guess is that the server's disk space being finite might be a cause. There have been cases where users could not upload solutions or screenshots due to the server being full, for example.)

B_2
Joined: 11/29/2008
Groups: None

I seriously doubt that storing final scores for puzzles could overwhelm their system. If so, they have a lot bigger problems.

A simple table with columns for Puzzle ID | Category | Close Date | Player ID | Solo Score | Evolver Score should hardly take more than a couple of MB for every puzzle we've ever done.
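
A sketch of what such a table could look like, using Python's built-in `sqlite3` module with an in-memory database. The table name, column names, types, and sample rows are all hypothetical; only the column list comes from the post above, and nothing here reflects the real Foldit schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE puzzle_results (
        puzzle_id     INTEGER,
        category      TEXT,
        close_date    TEXT,     -- ISO date string, e.g. '2012-03-01'
        player_id     INTEGER,
        solo_score    REAL,
        evolver_score REAL,
        PRIMARY KEY (puzzle_id, player_id)
    )
""")

# Made-up rows purely for demonstration.
conn.executemany(
    "INSERT INTO puzzle_results VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Overall", "2012-02-01", 7, 100.0, 0.0),
        (2, "Overall", "2012-03-01", 7, 50.0, 10.0),
        (2, "Overall", "2012-03-01", 8, 80.0, 0.0),
    ],
)


def solo_totals(conn, today, window_days=120):
    """Per-player solo totals for puzzles closed in the last `window_days` days."""
    query = """
        SELECT player_id, SUM(solo_score) AS total
        FROM puzzle_results
        WHERE close_date >= date(?, ?) AND close_date <= ?
        GROUP BY player_id
        ORDER BY total DESC
    """
    return conn.execute(query, (today, f"-{window_days} days", today)).fetchall()
```

Storing dates as ISO strings keeps the comparison a plain string comparison, and one row per player per puzzle is all the window query needs.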

jflat06
Joined: 09/29/2010
Groups: Window Group

It is not a technical limitation. We can handle far more than a 4 month window.

The issue is social (I thought I made this clear). If we are running 15 puzzles a week, the scoreboards will not be determined by who has the most skill, but rather by who has the most time, and computational power.

Again, you might be willing to put in this time, but we are not interested in forcing people to play 24/7 in order to remain competitive.

"They should have to work at it just as we all have since the project started. This smacks of some sort of entitlement system, or the kiddy soccer-league thing when everyone has to be a winner."

The exact reverse may be said. The fact that you've been around for a while does not entitle you to having a game-supported advantage over new players. If you are truly a better folder, your experience should keep you ahead of these new players.

B_2
Joined: 11/29/2008
Groups: None

I disagree that this is a social issue, and I don't think it was at all clear. "Can't run more puzzles..." sounded like a technical limitation.

Put the puzzles up, and people will play them, or not. Their choice. I'll still play every single one I can get to, even with one PC, and running at the most 2 clients.

But don't go messing with the leaderboards!! The four month window is fine, just get the scoring script to work every time.

Having to play for 4 months to get to their proper earned place on the leaderboards isn't a bad thing. It weeds out those who aren't in it for the long haul, and the "just lucky" ones who get a couple of good scores. Do people these days need immediate gratification to stay focused for more than a couple of days? If people stick it out for 4 months, they probably will have learned something, and might stay around. Maybe they will even learn to read the feedback and forums before posting endless duplicates.

jflat06
Joined: 09/29/2010
Groups: Window Group

Again, I understand you fall into the group of people who are willing to play every puzzle. You are looking at it from your perspective - try to think about the system as a whole, and how it will play out in practice.

Under the old system, most people will not be willing to keep up with an increased puzzle load. As they fail to keep up, they would become discouraged, and less likely to stay with the game. You might be okay with this, citing survival-of-the-fittest as the correct governing principle.

But these people are good folders, and are useful to science. The leaderboard should reflect this - NOT just who has the time and energy to tackle every puzzle.

The new system promotes a smaller number of higher quality solutions from each player, resulting in better results from puzzles for us, and also keeps people engaged in the game longer, as they are less likely to burn out. Fewer people burning out means more people contributing these higher quality solutions. The smaller window size allows new players to be engaged in the game sooner, increasing the likelihood that they'll stick with it and actually MAKE it to the 4 month mark. The competition is still there, it's just focused more on who's doing well now than who's been around longer.

I'll ask you to step back for a second, and take an objective look at the design decisions between the two systems, and re-evaluate your position. Remember that just because it's the way it's always been doesn't mean it's the way it should be.

B_2
Joined: 11/29/2008
Groups: None

I think it's pandering to the trophy generation - those who won't participate unless they get immediate rewards.

I suspect you will end up with an endless parade of newbies, and your dedicated base of good experienced players will dry up. Once they (we?) leave, it's exponentially harder to get them back.

You should do everything possible to try to retain the knowledge base, not try to alienate them by screwing with the scoring.

But - perhaps the social experiment is the real science goal here, so this might be just the most recent way the scientists are poking needles at their captive white rats.

jflat06
Joined: 09/29/2010
Groups: Window Group

This system will not alienate old players. The system remains competitive. The best folders will still occupy the top spots. If someone manages to beat you over a period of 6 weeks, dare I say it, they are a better folder than you.

I have yet to see any constructive criticism of the new system vs old, so at this point I'm sure you're just trying to get a reaction out of me, especially bringing up the topic of "captive white rats". We do not think of our players as lab rats.

The new system has very clear design advantages, as I've explained in previous posts. It is being done with the primary goal of increasing the probability of Foldit contributing to scientific discovery.

I'll ask you to please keep the discussion civil.

B_2
Joined: 11/29/2008
Groups: None

You want constructive?
1. Get rid of the ugly green bars that were just added.
2. Keep the overall 120 day puzzle window.
3. Make the scoring windows based on time, not number of puzzles

You all let the cat out of the bag last month when you revealed that we were all subjects of a psych or game theory experiment, analyzing the IRC chat, feedback and forum. How are we to know if changes thrust on us out of the blue are for the good of the science, or to satisfy the scientists studying us?

I still cannot understand how your new scoring system will help anything. Perhaps I'm just dense, or you just aren't explaining it well.

You want to run more puzzles, fine. Just start them up. Why do you have to mess with the scoring system to do that? Outside of not being able to calculate a running 120 day point total, there doesn't seem to be anything that needs to be fixed.

jflat06
Joined: 09/29/2010
Groups: Window Group

Sorry, when I say constructive criticism, I mean that I want to understand the reasoning behind your suggestions, not just what your suggestions are. We are aware that you want a longer window, but what we're interested in is 'why'. We have very clear reasons for making it a shorter window, and I have yet to see any clear reasons not to. I've also explained the reasoning for using a puzzle-based window as opposed to a time-based window.

If you are having trouble understanding my posts, I suggest you re-read them with an open mind, as I think I've done a good job of explaining our reasoning already.

B_2
Joined: 11/29/2008
Groups: None

As I mentioned before, a longer window encourages and rewards a long term commitment to the project, which a short window does not.

A short window will reward a lucky guess or two, but staying at the top or holding position in a long-term window requires commitment and dedication that should be encouraged.

If you just want Slashdot or Nature flash-in-the-pan players, I guess your new scheme will achieve that, but at the expense of the real base of support and knowledge.

Obviously you place less value on the committed players than they deserve. Too bad.

infjamc
Joined: 02/20/2009
Groups: Contenders

I see several ways to reward a long-term commitment while still using a short window:

1. Have a parallel annual and all-time leader board (which is what I'm seeing on the "Top Soloists for Recent Puzzles" page).

2. Reduce the exponent on the score function from 7 to 5 or lower. As things now stand, one top finish is worth many above-average finishes, which could cause major fluctuations in rank for those with a high variance in performance.

3. Another possibility is to place a multiplication factor to one's score based on the amount of time that has passed since joining Foldit. For example:

+1% after 2 months
+2% after 4 months
+3% after 6 months
+4% after 9 months
+5% after 12 months (+ achievement)
+6% after 18 months
+7% after 24 months (+ achievement)
+8% after 36 months (+ achievement)
+9% after 48 months (+ achievement)
+10% after 60 months (+ achievement)
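
The tenure bonus in suggestion 3 could be sketched as a simple lookup. This assumes each tier applies once the stated number of months has been reached (inclusive) and that no bonus applies before 2 months; `tenure_multiplier` is a hypothetical name, and the achievements are left out.

```python
import bisect

# Thresholds and bonuses copied from the list above.
TENURE_MONTHS = [2, 4, 6, 9, 12, 18, 24, 36, 48, 60]
BONUS_PERCENT = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


def tenure_multiplier(months_since_joining):
    """Return the score multiplier for a player's time since joining."""
    # Number of thresholds already reached (inclusive by assumption).
    tier = bisect.bisect_right(TENURE_MONTHS, months_since_joining)
    percent = BONUS_PERCENT[tier - 1] if tier else 0
    return 1 + percent / 100
```

For example, a player who joined 30 months ago falls in the 24-month tier and would have their score multiplied by 1.07.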

Joined: 04/19/2009

There is one very good reason for a long time window for overall rankings.

It will take a while before many of us will be able to forget the months when the software was seriously screwed up. By removing the sliders and backtracking, that fixed it to some degree. A few weeks ago, diversity scoring returned, again to some degree - but unless you folks have figured out why and not shared it, it was something that was not planned for, and was not done on purpose.

My point is that the software has grown incredibly complex. Every improvement you make to it has the possibility of introducing a new (or old) bug. We all acknowledge that as "the price we pay" to play the game, and are as patient as we can be when something happens that adversely affects gameplay.

Had you had a 6 week window when the software was bad, many of the experienced players would likely have stopped playing (most of those who do handbuilding, which was extremely affected) as their rankings dropped severely. As it was, some did stop playing - and more were on the verge of quitting.

As it is, the software does keep having bugs (the ongoing crashes that bertro is having, and the fact that even on a different computer for me, mac snow leopard will hard hang during any mutate script). We also are still missing what used to be much of the fun of the game, walking.

I do understand some of the reasoning behind what I think is an attempt to increase the puzzle load without discouraging players. I like the idea of skips. I like having the beginners be able to gauge themselves to other beginners. I like the ability to see how I am doing in the different types of puzzles.

But I agree with Boots and Brick that there is no reason to severely shorten the overall rankings time period, for the reasons they have stated. I see no reason why we can't have a longer window for overall rankings, which could integrate with average number of puzzles in a new typical 4 month window (including skips).

Jflat, you were obviously assigned this job, and you've done a lot of good work. Please realize that this whole scoring thing was raised once in a dev chat with many objections on the part of the players, and what you have presented was not discussed with the players at all. Much of what you have accomplished is being received positively, but this one thing - the drastic change to the overall ranking window, is going to spark a lot of conflict since the players were unable to make their thoughts known in advance.

jflat06
Joined: 09/29/2010
Groups: Window Group

From talking with Seth, my understanding is that the scoring used to be unlimited - going back to the start of the project. In response to player feedback, the window was shortened.

But the actual length of the window (4 months) was essentially arbitrary. It just 'seemed like a reasonable value'. The same applies to the new 30 puzzle value in the new system. It just 'seemed like a reasonable value'. As I've stated in the blog, these values are subject to change based on our observations and player feedback.

There had been previous player discussion about shortening the window in the thread covering the last time the scoreboards had issues: http://fold.it/portal/node/992082 (Including support from B_2 and yourself).

Again, it could be that the value is off, and it was cut *too* short. As you've mentioned, the issue with cutting the window too short is that it is not as tolerant to short-term issues, such as a set of weird puzzles, or a bug in the game. It also somewhat diminishes the statistical significance of the ranking. But in general, it is desirable to keep the window as short as possible while avoiding these issues.

30 puzzles / 6 weeks is a lower bound on a 'reasonable' value. My plan was to adjust this value until we reach what seems to be the optimal value. In retrospect, it may have been a better idea to start longer, and then incrementally shorten it until the optimal value was found.

jflat06
Joined: 09/29/2010
Groups: Window Group

Also note that the issue of the beginner puzzles (<15) and (<150) that was brought up in the same thread is also resolved by this system, since they are on a separate leaderboard, and don't count towards overall score.

Player feedback on these values is one of the reasons I had the developer chat (http://fold.it/portal/node/992887). I did not hear strong opposition to the current values during the chat. Was the chat too soon for people to evaluate the new system? I wanted to preempt concerns such as this by having a chat to address problems quickly.

B_2
Joined: 11/29/2008
Groups: None

If you didn't have the chats in the middle of the US workday, you might get more participation. So far, I have *never* been able to join a dev or science chat.

jflat06
Joined: 09/29/2010
Groups: Window Group

Unfortunately, we work during US workdays, so this is when we're available. We apologize for the inconvenience.

Joined: 04/19/2009

Can I suggest a few things here?

In terms of another chat - yes, very desirable. And although we know that you prefer to have them during US working hours, we also know that you are working late some evenings :-)

Perhaps you could work a deal with whomever to go into work late some day soon, and do the chat at around 7 pm your time, which would give those on the US west coast a chance to participate after they've come home from work.

From what I've heard, many folders think that much of what you are doing is either enhancing the rankings or neutral (other than the overall window). It's nice to know how a player ranks in certain categories. Removing beginner puzzles from the overall rankings is a strongly positive step.

The only point of contention here would seem to be the severe shortening of the overall ranking window. I suspect that if you started with a 50 puzzle window (in one of the other feedbacks infjamc had counted 53 puzzles in the previous 4 month window), that would likely satisfy most people, coupled with having dropped the <15 and <150 puzzles from inclusion in that scoring/ranking.

If you start inundating us with puzzles, the time period for 50 puzzles will shrink - at that point have another community discussion - it may be desirable, or may need to be tweaked.

If you have reason to have CASP or design or whatever puzzle category to have a shorter window, then simply state it and leave those the way you need them. There is no reason to tie everything together with the same window (in fact if you are not using the <15 & <150 puzzles, then you are already not tying it all together). If the rankings are kept up to date and specified clearly what the windows are, then it's all good.

One last suggestion... many people have commented on how it's really good to have the forever totals for soloists available to see, as you currently have listed in "Top Soloists for Recent Puzzles". Please find a way to keep that list available.

spmm
Joined: 08/05/2010
Groups: Void Crushers

just a comment on the 'top soloist for recent puzzles' - this is a cumulative score and does not really tell you anything much apart from the fact that the longer you play the more points you will collect, that is fine, anyone who has played consistently for years will be nearer the top.

Madde's list of Master Soloist is a much better indicator of quality as you have to have folded in the top ten, not just been around for a long time. http://de.foldit.wikia.com/wiki/Master_Soloist

Joined: 04/19/2009

Then that would be great to include here on the website as well.

Folders who have been here for years, as well as those who have excelled at some point in their folding career should be recognized in some way, as in having these lists easily available on the main website.

spmm
Joined: 08/05/2010
Groups: Void Crushers

Thanks jflat - just to avoid opening a different post slightly off topic - if we are having lots more puzzles, will there also be a review of the client Puzzle Menu? It may be just me, but I find it difficult to find puzzles now, especially in devprev. Beginner/Intermediate/Advanced doesn't really help, particularly when I am looking for an expired puzzle number. As the font and each entry are quite large, scrolling is not that simple either.

jflat06
Joined: 09/29/2010
Groups: Window Group

Yes, we are looking into this. Hopefully the new system will address all these issues.

B_2
Joined: 11/29/2008
Groups: None

Can we go back to the original topic -

Have the 4 month solo, evolver and team leaderboards been recalculated correctly since this feedback was opened?

I think not.

Can you PLEASE at least try to get that to happen reliably? It doesn't make any difference what the scoring window is if the leaderboards don't ever get re-calced every time a puzzle closes.

jflat06
Joined: 09/29/2010
Groups: Window Group

We've found the bug and will be updating the scoreboards soon.

Joined: 09/24/2012
Groups: Go Science

Hello,

The soloist and evolver scores for the categories do not seem to be accurate (as far as I understand the latest ranking rules).

LociOiling
Joined: 12/27/2012
Groups: Beta Folders

Yes, the category scores are incorrect since we moved to the new server.

The Overall category page is extremely slow to load, as are the other category pages. The Overall page doesn't list any soloists or evolvers. The list of open puzzles is wildly inaccurate, showing puzzle 665 as open. The closed puzzle list is empty.

I can't do any further research right now, none of the category links seem to be responding.

LociOiling
Joined: 12/27/2012
Groups: Beta Folders

The Symmetry category started responding again. It shows the top soloists and evolvers with 200 points each.

The top 25 soloists all have 200 points, but have different ranks despite that.

The top 14 evolvers have 200 points, and are shown tied at #1 rank.

Not sure what's going on, but I believe the colloquial phrase that includes "beyond all recognition" would describe the situation.

frood66
Joined: 09/20/2011
Groups: Marvin's bunch

I'm only here to say how great it is to see BootsMcGraw listed in feedback - great player - lost for no good reason imho

Hanto
Joined: 05/10/2008
Groups: None

Why was Boots lost to the community and game, frood? You know the reason; I am ignorant of these things.

