How does Foldit decide which residues are in the core?

Case number: 845833-2012050
Topic: Game: Display
Opened by: jeff101
Opened on: Monday, September 13, 2021 - 19:30
Last modified: Friday, September 17, 2021 - 22:48

I have a design that keeps giving 0's for the Core Existence: Monomer
and Core Complex Objectives. If you check the Show box for one of these 
Objectives, each segment appears blue (protein exterior), green (border 
between exterior and core), or orange (core). I think the # of orange 
(core) residues determines the scores for the Core Objectives.

That being said, how does Foldit decide which residues to color blue
vs green vs orange? Does Foldit use the Packing or Hiding subscores for 
each residue? Does Foldit consider the # of voids near each residue? 
Does Foldit consider how many hydrogen bonds each residue has or how 
many BUNS are near the residue? Does Foldit consider the hydrophobic 
vs hydrophilic nature of each residue? For example, are the amino acids 
vwylmpiacfg (orange in the Selection Interface's Mutate Tool) more likely 
to be treated as core residues than the amino acids stnqrhkde (blue in 
the Selection Interface's Mutate Tool)? The amino acids w & y are odd
in that they are colored orange in the Mutate Tool even though they can
form hydrogen bonds. Are w & y more likely to be in the protein interior
or on the surface of the protein?

If there are web pages or articles that answer these questions, please
post links here. Also, if there are recipes folks have found helpful for
raising the Core Objectives' scores, please let me know.


(Mon, 09/13/2021 - 19:30  |  4 comments)

bkoep (Foldit Staff, joined 11/15/2012):

The Core Existence Objective classifies each residue as surface/boundary/core based roughly on the number of residues nearby. See a previous explanation here.

jeff101 (Go Science, joined 04/20/2012):

The post you cited above says:

The classification of a residue is based on the number of 
neighboring Cα atoms that lie within some distance of its 
sidechain—or rather, within a cone that radiates out along 
the sidechain's Cα-Cβ axis (we usually want to ignore atoms 
"behind" the residue that do not interact with the sidechain). 
The neighbor "counting" is actually continuous (a residue can 
have 3.46 neighbors, for example), but the layer classification 
is applied on a hard threshold of this neighbor count.
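
To make the quoted description concrete, here is a minimal sketch of a continuous, cone-weighted neighbor count followed by a hard-threshold classification. Everything specific in it is an assumption for illustration only: the smooth distance and angle weights, measuring distance from the Cβ, and all parameter values (`dist_midpoint`, `dist_steepness`, `angle_shift`, `angle_exponent`, `surface_cutoff`, `core_cutoff`) are placeholders, not confirmed Foldit settings. The quoted post only says that the count is continuous, that it favors atoms in a cone along the Cα-Cβ axis, and that the layers come from a hard threshold.

```python
import math

def neighbor_count(ca, cb, other_cas,
                   dist_midpoint=9.0, dist_steepness=1.0,
                   angle_shift=0.5, angle_exponent=2.0):
    """Continuous count of neighboring CA atoms in a soft cone along CA->CB.

    All default parameter values are illustrative placeholders.
    """
    # Unit vector along the residue's CA->CB axis (the cone axis).
    axis = [b - a for a, b in zip(ca, cb)]
    axis_len = math.sqrt(sum(x * x for x in axis))
    axis = [x / axis_len for x in axis]

    total = 0.0
    for other in other_cas:
        # Vector from this residue's CB to the neighboring CA (assumed origin).
        to_other = [o - b for o, b in zip(other, cb)]
        dist = math.sqrt(sum(x * x for x in to_other))
        if dist == 0.0:
            continue
        cos_angle = sum(a * t for a, t in zip(axis, to_other)) / dist

        # Smooth distance weight: near 1 for close atoms, near 0 for far ones.
        dist_weight = 1.0 / (1.0 + math.exp(dist_steepness * (dist - dist_midpoint)))
        # Smooth angle weight: 1 straight ahead, 0 for atoms well "behind".
        angle_weight = max(0.0, (cos_angle + angle_shift) / (1.0 + angle_shift)) ** angle_exponent

        total += dist_weight * angle_weight
    return total

def classify(count, surface_cutoff=2.0, core_cutoff=5.2):
    """Apply hard thresholds to the continuous count (cutoffs are placeholders)."""
    if count < surface_cutoff:
        return "surface"
    if count > core_cutoff:
        return "core"
    return "boundary"

ca, cb = (0.0, 0.0, 0.0), (1.5, 0.0, 0.0)
ahead = neighbor_count(ca, cb, [(5.0, 0.0, 0.0)])    # neighbor in front of the sidechain
behind = neighbor_count(ca, cb, [(-5.0, 0.0, 0.0)])  # neighbor behind the residue
print(ahead, behind, classify(3.46))
```

With weights like these, each neighboring atom contributes a fraction between 0 and 1, which is one way a residue could end up with a non-integer count such as 3.46.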

Below are some questions about this.

What are the neighbor count thresholds for blue/green/orange
or surface/boundary/core residues? 

Since glycine doesn't have a Cβ, does it always give 0 for 
its neighbor count?

When viewed from the side, a cone looks like a triangle. 
I think for the neighbor counting, this triangle will be 
isosceles with 2 sides the same length L and a third side
of length B. Is the angle b between the 2 equal sides the 
same for all amino acids with Cβ? Using trigonometry, the
triangle should have a height H that obeys:

   H/L=cos(b/2) or H=L*cos(b/2).

Similarly, the base B should obey:

   B/(2*L)=sin(b/2) or B=2*L*sin(b/2) 

as well as:

   B/(2*H)=tan(b/2) or B=2*H*tan(b/2). 

Do larger amino acids have larger B, H, & L values, making 
their cones larger and giving larger neighbor counts? Do 
you have a table of the angle b and the lengths B, H, & L
for each of the 20 amino acid types?
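
As a quick numeric check of the triangle relations above, here is a small sketch. The values L = 10 and b = 60 degrees are arbitrary example numbers, not actual Foldit cone parameters, and the function name `cone_cross_section` is just a label for this example.

```python
import math

def cone_cross_section(L, b_degrees):
    """Return (H, B) for an isosceles triangle with equal sides L and apex angle b."""
    half = math.radians(b_degrees) / 2.0
    H = L * math.cos(half)        # H = L*cos(b/2)
    B = 2.0 * L * math.sin(half)  # B = 2*L*sin(b/2)
    return H, B

H, B = cone_cross_section(10.0, 60.0)
# The third relation, B = 2*H*tan(b/2), should agree with the other two.
B_from_H = 2.0 * H * math.tan(math.radians(60.0) / 2.0)
print(H, B, B_from_H)
```

For these example values, H = 10*cos(30°) is about 8.66 and B = 2*10*sin(30°) is 10, and the tangent form gives the same B.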

jeff101 (Go Science, joined 04/20/2012):

Another question is how partial neighbor counts occur.
Does Foldit find the volume of the intersection of the 
cone with each nearby Cα atom? Does it divide this 
volume by the volume of a Cα atom to get what fraction 
of the Cα atom's volume is inside the cone? What is 
the Cα atom radius it uses to find these volumes?

Thanks again,

jeff101 (Go Science, joined 04/20/2012):

says that filter.GetData can give some results for the 
"Core Existence" and "Core: Complex" filters. For both 
of these filters, it gives for each segment the results:
  -1 for surface (blue)
   0 for boundary (green) and
   1 for core (orange).

Would it be possible to add for each of these filters the 
number of neighbors each residue has? These numbers are not 
just integers. Instead, they can have values like 3.46. These 
numbers could be used in recipes to tell if a conformation 
change is subtly converting a residue from surface to core 
(from blue to green to orange) or from core to surface 
(from orange to green to blue). For more details, see above.
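
To illustrate what the current -1/0/1 values already allow, here is a minimal sketch in Python (not Foldit's recipe language) that tallies layers and reports which segments changed layer between two conformations. The lists `before` and `after` are made-up example data standing in for per-segment results, and the helper names are hypothetical. With only these coarse codes, a change is visible only once a residue crosses a threshold, which is why the continuous neighbor counts would be useful.

```python
# Layer codes as described above: -1 surface (blue), 0 boundary (green), 1 core (orange).
LAYER_NAMES = {-1: "surface", 0: "boundary", 1: "core"}

def count_layers(codes):
    """Tally how many segments fall in each layer."""
    counts = {name: 0 for name in LAYER_NAMES.values()}
    for code in codes:
        counts[LAYER_NAMES[code]] += 1
    return counts

def layer_changes(before, after):
    """List (segment index, old layer, new layer) for segments whose layer changed.

    Segment indices are 1-based.
    """
    return [
        (index, LAYER_NAMES[old], LAYER_NAMES[new])
        for index, (old, new) in enumerate(zip(before, after), start=1)
        if old != new
    ]

# Made-up per-segment codes for two conformations of a 5-segment design.
before = [-1, -1, 0, 1, 0]
after = [-1, 0, 0, 1, 1]

print(count_layers(after))
print(layer_changes(before, after))
```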

