1835: Coronavirus NSP2 Prediction
Status: Closed


Name: 1835: Coronavirus NSP2 Prediction
Status: Closed
Created: 05/07/2020
Points: 100
Expired: 05/14/2020 - 23:00
Difficulty: Intermediate
Description: Fold this coronavirus protein! This is one portion of a larger protein encoded in the viral genome of SARS-CoV-2. It is encoded in a region of the genome called NSP2, but the protein's structure and function are still unknown. If we knew how this protein folds, we might be able to figure out its exact function. The puzzle's starting structure shows secondary structure (SS) predictions from PSIPRED, which hint at which parts of the protein might fold into helices or sheets. Refold this protein to find high-scoring solutions, which will tell us how this protein is most likely to fold!

Categories: Overall, Prediction

Top Groups

Rank	Group	Score	Points
1	Hold My Beer	11,575	100
3	Go Science	11,357	65
4	Team India	11,324	52
5	Marvin's bunch	11,303	41


bkoep
Joined: 11/15/2012
Groups: Foldit Staff
PSIPRED Predictions

Conf: 951668778889987044289999999998313168999999998664231538712316
              10        20        30        40        50        60

Conf: 783248999999999986558897799999999999986399999999999999600663
              70        80        90       100       110       120

Conf: 54635564367999998519
             130       140
Joined: 04/20/2012
Groups: Go Science

Should we assume this protein forms disulfide bonds or not?

beta_helix
Joined: 05/09/2008
Groups: None
disulfide bond predictions:

from http://clavius.bc.edu/~clotelab/DiANNA/

Cys position	Distance	Bond	        Score
116 - 128	12	KFISTCACEIV-GQIVTCAKEIK	0.01037

and from http://disulfind.dsi.unifi.it/monitor.php?query=jKDK6k


DB_state                                     0 0         0            
DB_conf                                      3 1         2  
Joined: 04/20/2012
Groups: Go Science
Thanks, but what does the above mean?

In the DiANNA part above, do scores near 0.01 mean
that line's disulfide bond is not very likely?

In the bottom part, what does DB_state = 0 mean?
Also, what does DB_conf = 1 2 or 3 mean?

Joined: 04/20/2012
Groups: Go Science
DiANNA part:

Going to http://clavius.bc.edu/~clotelab/DiANNA/
and clicking on Help! says the following:

"Disulfide connectivity
For each pair of cysteine in the input sequence,
a neural network trained to recognize disulfide bonds
produce a score ranging from 0 to 1 (higher the score,
higher the prediction reliability)."

It also gives an example with 4 cysteines and
6 possible disulfide bonds with scores ranging
from 0.1 to 0.9. It picks 2 disulfide bonds each
with a score of 0.8 as the best combination.
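
To make the "best combination" idea concrete, here is a minimal sketch. It assumes that the best combination is the set of non-overlapping cysteine pairs with the highest summed score, which may not be exactly how DiANNA does it. The scores below are hypothetical; they only mimic the shape of the Help example (4 cysteines, 6 candidate bonds, scores from 0.1 to 0.9).

```python
from itertools import combinations


def best_pairing(scores):
    """Return (total, pairs) for the set of non-overlapping pairs
    with the highest summed score (brute force, fine for a few cysteines)."""
    best_total, best_pairs = 0.0, []
    pairs = sorted(scores)
    for k in range(1, len(pairs) + 1):
        for combo in combinations(pairs, k):
            used = [cys for pair in combo for cys in pair]
            if len(used) != len(set(used)):
                continue  # a cysteine can only take part in one bond
            total = sum(scores[pair] for pair in combo)
            if total > best_total:
                best_total, best_pairs = total, list(combo)
    return best_total, best_pairs


# Hypothetical pair scores for cysteines numbered 1 to 4 (not real DiANNA output).
example_scores = {
    (1, 2): 0.9,
    (3, 4): 0.1,
    (1, 3): 0.8,
    (2, 4): 0.8,
    (1, 4): 0.2,
    (2, 3): 0.3,
}

total, pairs = best_pairing(example_scores)
print(pairs, round(total, 2))
```

With these made-up numbers, the single highest-scoring bond (0.9) is not part of the answer, because taking it forces the 0.1 bond on the remaining two cysteines. The two 0.8 bonds together score higher.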

Joined: 04/20/2012
Groups: Go Science
disulfind part:

The link given for the disulfind part above says:

DB_state predicted disulfide bonding state (1=disulfide bonded, 0=not disulfide bonded).
DB_conf confidence of disulfide bonding state prediction (0=low to 9=high).

So I guess it predicts that no disulfide bonds form,
but it is not very confident in this prediction.
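
A small sketch of that reading, using the three DB_state and DB_conf digits posted above. The function name and the dictionary layout are just for illustration, and the digits are taken in the order they appear in the post.

```python
def summarize(db_state, db_conf):
    """Pair each cysteine's predicted bonding state with its 0-9 confidence."""
    rows = []
    for state, conf in zip(db_state, db_conf):
        rows.append({
            "bonded": state == "1",   # 1 = disulfide bonded, 0 = not bonded
            "confidence": int(conf),  # 0 = low, 9 = high
        })
    return rows


# One digit per cysteine, copied from the DB_state / DB_conf lines above.
rows = summarize("000", "312")
any_bonded = any(r["bonded"] for r in rows)
max_conf = max(r["confidence"] for r in rows)
print(any_bonded, max_conf)  # False 3
```

No cysteine is predicted as bonded, and the highest confidence is only 3 out of 9.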

Serca
Joined: 02/03/2020
Groups: Go Science

Any ideas why this 140-residue section of the NSP2 protein is not buried inside the other 498 residues of the NSP2 protein?

We cannot ignore hydrophobicity in Foldit, so the best strategy for a top score on this puzzle seems to be to skip any secondary structure prediction, and even the real protein's SS, and build your own.

Susume
Joined: 10/02/2011
based on first round of CASP predictions

My understanding is the CASP organizers have tentatively divided the larger viral proteins into smaller sections (domains) based on the predictions they got back in the first round of competition. Those predictions came from both servers and human teams.

Serca
Joined: 02/03/2020
Groups: Go Science

OK, now it is a bit clearer why we have residues 360-499 of the NSP2 protein. That looks like the domain with the highest distance similarity between different CASP models. And that part has the highest helix propensity according to the SS prediction.

But I still cannot see any evidence that this fragment is spatially separated from the rest of the NSP2 protein, which is what searching for its highest Foldit score in isolation assumes. The Hiding score looks important enough that players will skip any predicted structure in order to hide the hydrophobics of this NSP2 fragment inside itself.

And btw, overall NSP2 has 27 cysteines.
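
For anyone comparing puzzle positions with full-length NSP2 numbering, here is a small sketch of the offset arithmetic. It assumes puzzle residue 1 corresponds to NSP2 residue 360, which follows from the 360-499 range above but has not been confirmed by staff. The helper name is made up for illustration.

```python
FRAGMENT_START = 360  # first NSP2 residue in the puzzle (from the range above)
FRAGMENT_END = 499    # last NSP2 residue in the puzzle
OTHER_RESIDUES = 498  # rest of NSP2, as quoted earlier in this thread


def puzzle_to_nsp2(puzzle_index):
    """Map a 1-based puzzle residue index to full-length NSP2 numbering."""
    return puzzle_index + FRAGMENT_START - 1


fragment_length = FRAGMENT_END - FRAGMENT_START + 1
full_length = fragment_length + OTHER_RESIDUES
print(fragment_length, full_length)
print(puzzle_to_nsp2(1), puzzle_to_nsp2(fragment_length))
```

The fragment length comes out to 140, matching the puzzle, and the last puzzle residue maps back to 499.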

bkoep
Joined: 11/15/2012
Groups: Foldit Staff
CASP domain suggestions

That's right, Serca. We are simply going off of suggestions from the CASP organizers about tentative domain assignments of this target.

These suggestions likely come from inter-residue distance prediction models, similar to AlphaFold. As far as I know, nobody has collected any empirical data about this protein's structure. So, this sequence might form a well-folded domain; but it might not. Foldit predictions might help us figure that out!

Joined: 05/26/2008
Groups: Hold My Beer

I ran my own PSIPRED prediction on this. The only place it varies significantly is the small helix, which it predicted to be just as likely a sheet. So I stripped color from any low-confidence segments and converted the small helix to a sheet, making a small sheet pair with a cysteine at the end of each strand. Though I haven't been able to get a disulfide bond that keeps the score up, it puts all 3 cysteines together in the same small pocket.
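
One way to find low-confidence segments is to scan the PSIPRED Conf digits for runs below a cutoff. This is only a sketch: the cutoff of 5 is an arbitrary assumption, and the Conf string is the one posted by bkoep at the top of this thread, not the separate run described here.

```python
def low_confidence_segments(conf, threshold=5):
    """Return 1-based (start, end) ranges where every Conf digit is below threshold."""
    segments = []
    start = None
    for i, ch in enumerate(conf, start=1):
        if int(ch) < threshold:
            if start is None:
                start = i  # a low-confidence run begins here
        elif start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(conf)))  # run extends to the last residue
    return segments


# Conf digits for residues 1-140, joined from the three lines posted above.
PSIPRED_CONF = (
    "951668778889987044289999999998313168999999998664231538712316"
    "783248999999999986558897799999999999986399999999999999600663"
    "54635564367999998519"
)

for start, end in low_confidence_segments(PSIPRED_CONF):
    print(f"residues {start}-{end}")
```

Raising or lowering the threshold changes how much of the starting structure gets treated as uncertain, so it is worth trying a couple of values.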

Developed by: UW Center for Game Science, UW Institute for Protein Design, Northeastern University, Vanderbilt University Meiler Lab, UC Davis
Supported by: DARPA, NSF, NIH, HHMI, Amazon, Microsoft, Adobe, RosettaCommons