2109: Symmetric Tetramer Design with AlphaFold Predictions

Closed since 7 months ago

Intermediate Overall Design Symmetry


February 16, 2022

Design a symmetric protein tetramer, with 4 identical chains of 60 residues each! This puzzle enables AlphaFold predictions for the monomer subunit of your design, so you can upload your solution for AlphaFold using the AlphaFold prediction tool. AlphaFold will predict the structure of your monomer subunit only (i.e. in the unbound state, in the absence of other symmetric copies). If you load this prediction, then Foldit will attempt to align the prediction with your solution. If you continue working off of the AlphaFold prediction, you may need to make adjustments at the interface where the monomer subunit interacts with symmetric copies.

This puzzle includes a Secondary Structure Objective, so no more than 50% of your design can form helices. The H-bond Network Objective encourages players to build buried, satisfied H-bond networks at the interface between symmetric chains. H-bond networks are a great way to introduce polar residues at the interface, but it's important that all of the bondable atoms make hydrogen bonds! We've also adjusted the H-bond Network Objective so that poor-scoring H-bonds may not contribute to networks; poor-scoring H-bonds will be displayed in red. This puzzle uses the Buried Unsats Objective, with a large penalty for buried polar atoms that can't make H-bonds. In this puzzle, there are no limits on the Complex Core, but we've included the Complex Core objective so players can see the core residues that can be incorporated into H-bond Networks.

Top groups

  1. Go Science 100 pts. 19,683
  2. Anthropic Dreams 68 pts. 19,485
  3. Contenders 44 pts. 19,474
  4. Marvin's bunch 27 pts. 18,933
  5. Gargleblasters 16 pts. 18,716
  6. Beta Folders 9 pts. 18,158
  7. Australia 5 pts. 15,892
  8. L'Alliance Francophone 3 pts. 15,650
  9. Void Crushers 1 pt. 15,393
  10. BIOF215 1 pt. 13,197

  1. Bruno Kestemont Lv 1 100 pts. 19,683
  2. spvincent Lv 1 96 pts. 19,301
  3. Galaxie Lv 1 91 pts. 19,264
  4. alcor29 Lv 1 86 pts. 19,195
  5. ichwilldiesennamen Lv 1 82 pts. 18,976
  6. Punzi Baker 2 Lv 1 77 pts. 18,903
  7. gmn Lv 1 73 pts. 18,710
  8. georg137 Lv 1 70 pts. 18,687
  9. HuubR Lv 1 66 pts. 18,599
  10. BootsMcGraw Lv 1 62 pts. 18,453


bkoep Staff Lv 1

Buried Unsats (max +500)
Penalizes polar atoms that cannot make hydrogen bonds, -200 points per atom (not including symmetric copies).
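
As a rough illustration of the arithmetic, here is a minimal Python sketch. It assumes the bonus starts at the stated maximum and loses 200 points per buried unsatisfied atom, with no lower bound; the function name and the absence of a floor are assumptions, not something the puzzle text confirms.

```python
MAX_BONUS = 500          # "max +500" from the objective text
PENALTY_PER_ATOM = 200   # "-200 points per atom" from the objective text


def buried_unsats_bonus(n_buried_unsat_atoms: int) -> int:
    """Hypothetical bonus: start at the maximum, subtract a fixed penalty per atom.

    Assumption: no lower bound is applied; the real objective may clamp the result.
    """
    if n_buried_unsat_atoms < 0:
        raise ValueError("atom count cannot be negative")
    return MAX_BONUS - PENALTY_PER_ATOM * n_buried_unsat_atoms
```

Under these assumptions, three buried unsatisfied atoms would already cost more than the full bonus.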

Core Existence: Monomer (max +1600)
Ensures that at least 16 residues are buried in the core of the monomer unit.

Core: Complex (max +0)
Awards no bonuses or penalties. Click Show to see which residues count as "Core" for the H-bond Network objective.

H-bond Network (max +2400)
Rewards networks that comprise at least 2 H-bonds involving core residues.
Between 1 and 12 H-bonds should cross the interface between symmetric units.
Networks must be at least 75% satisfied (i.e. 75% of all bondable atoms in a network must make a H-bond).
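
The two per-network conditions above can be sketched as a simple check. This is only an illustration in Python: the function and parameter names are hypothetical, and the interface-crossing limit (1 to 12 H-bonds) is left out because the text does not say whether it applies per network or to the whole design.

```python
MIN_HBONDS = 2            # "at least 2 H-bonds involving core residues"
MIN_SATISFACTION = 0.75   # "at least 75% satisfied"


def network_qualifies(core_hbonds: int, bondable_atoms: int, bonded_atoms: int) -> bool:
    """Return True if a network meets both stated thresholds.

    core_hbonds:    H-bonds in the network that involve core residues
    bondable_atoms: atoms in the network that could make an H-bond
    bonded_atoms:   atoms in the network that actually make an H-bond
    """
    if bondable_atoms <= 0:
        return False  # an empty network cannot be satisfied
    satisfaction = bonded_atoms / bondable_atoms
    return core_hbonds >= MIN_HBONDS and satisfaction >= MIN_SATISFACTION
```

For example, a network with 8 bondable atoms needs at least 6 of them bonded to reach 75%.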

Interaction Energy (max +500)
Monitors that all large PHE, TYR, and TRP residues are scoring well.

SS Design (max +500)
Penalizes all CYS residues.

Ideal Loops (max +500)
Penalizes any loop region that does not match one of the Building Blocks in the Blueprint tool. Use "Auto Structures" to see which regions of your protein count as loops.

Secondary Structure (max +500)
No more than 50% of residues may form helices. Extra helices are penalized at 10 points per residue.
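
A minimal Python sketch of this penalty, assuming the secondary structure is given as a one-letter-per-residue string with "H" for helix (the string format, the function name, and the rounding of the 50% limit are assumptions):

```python
def helix_penalty(ss: str, max_helix_fraction: float = 0.5,
                  penalty_per_residue: int = 10) -> int:
    """Penalty for helix residues beyond the allowed fraction.

    ss is a hypothetical per-residue string, e.g. "H" = helix,
    "E" = sheet, "L" = loop. The allowed count is rounded down (assumption).
    """
    allowed = int(len(ss) * max_helix_fraction)
    helix_residues = ss.count("H")
    excess = max(0, helix_residues - allowed)
    return excess * penalty_per_residue
```

For a 60-residue monomer, up to 30 helix residues would be free under these assumptions; a design with 34 helix residues would lose 40 points.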

Neural Net (max +0)
Achieve an AlphaFold prediction confidence of 80% or more.

Vincera Lv 1

The AF tool is missing on my Menu panel next to the 'Save and Exit' option. I cannot upload my symm tetra predictions on this puzzle and could not upload my endgame solutions on the NNO puzzle 2106.

bkoep Staff Lv 1

In the latest update, the AlphaFold button has been moved from the Main Menu to the Actions Bar (next to the buttons for Blueprint and Rama Map). Sorry for the confusion!

ichwilldiesennamen Lv 1

As of late I only try to focus on good AF-values. I have noticed that some loop-constraints seem to be counteracting the AF-performance. Especially blue+green 2-segment loops in sheets seem to do this. When I load a prediction with them I always see that the predicted loops are much more "twisted". Sometimes so much that the loop gets green+green. If I then do wiggle etc. with the loop-constraints in, then the twist relaxes but AF-sim goes noticeably down. If I instead remove the constraints and then do wiggle, the AF-sim stays quite good in most cases.
Could it be that these constraints are inaccurate or unsuitable for good AF results? Could they be changed so that they affect the AF prediction less?

BootsMcGraw Lv 1

Finally… we are allowed to incorporate alanines into our structures without penalty. My helices thank you.

bkoep Staff Lv 1

I suspect the underlying issue here is the protein sequence.

When we look at the BG and GG hairpins found in natural protein structures, we see a stark difference in the sequences that fold into these two different shapes. The GG loop is far more common, and usually has an ASN-GLY or ASP-GLY sequence; the BG loop is less common, but it is almost always a PRO-GLY sequence. If we dig a little deeper and check the Rama Map for PRO, we see that PRO in fact never adopts a G (green) backbone, because the special ring shape of the PRO sidechain prevents it.

So, it seems that the GG hairpin shape is generally more stable than BG. For most sequences, AlphaFold will probably predict a GG loop instead of a BG. If you really want a BG hairpin, then the PRO is necessary to prevent the loop from "misfolding" into GG.
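
The rule of thumb in this post can be written as a small lookup. This is only a Python sketch of the sequence preferences described here; the function name is hypothetical, and a real prediction depends on the whole sequence, not only the two loop residues.

```python
# Typical two-residue hairpin sequences, as described in the post above.
GG_SEQUENCES = {("ASN", "GLY"), ("ASP", "GLY")}
BG_SEQUENCES = {("PRO", "GLY")}


def expected_hairpin_shape(first: str, second: str):
    """Return "BG", "GG", or None for a two-residue hairpin sequence.

    None means the post gives no guidance for this pair.
    """
    pair = (first.upper(), second.upper())
    if pair in BG_SEQUENCES:
        return "BG"
    if pair in GG_SEQUENCES:
        return "GG"
    return None


def is_compatible(first: str, second: str, designed_shape: str) -> bool:
    """True if the designed loop shape matches the sequence's expected shape."""
    return expected_hairpin_shape(first, second) == designed_shape
```

For instance, `is_compatible("ASP", "GLY", "BG")` is false: by this heuristic, an ASP-GLY loop designed as BG is likely to be predicted as GG.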

This goes back to the fundamental problem of protein design. Foldit can only optimize the absolute energy of your current solution, but as protein designers we have to consider the entire energy landscape when we pick a sequence for our protein. If you ask Foldit to mutate the sequence of a BG loop, Foldit will pick ASP-GLY if it has a better energy than PRO-GLY – even though the energy landscape of ASP-GLY may prefer the GG loop.

ichwilldiesennamen Lv 1

Thanks bkoep for this great background info! As a player with little to no background in biochemistry, I have no clue what is likely to work in reality. I only see the options in the bp-tool and take whatever I think would fit well. It seems I have to look more at GG loops. I can fully confirm what you wrote: in BG loops, Foldit itself typically does not place a PRO. But it might be interesting to see what happens when I enforce it. Maybe the twisting in the AF prediction will not be as severe anymore. I will also look at GG; I ruled that out a long time ago because I did not like the resulting sheet curvature.
Might it be a good idea to show in the bp-tool a percentage or similar for the likelihood of occurrence in nature, as in the sidechain tool? Or is it not possible to state this in general? Or do you prefer the diversity of players trying different loop types, even though the likelihood of occurrence in nature is low?

bkoep Staff Lv 1

I like the idea to include information about "prevalence" for the different Building Blocks; perhaps we can add this somewhere in the Building Block panel.

In the meantime, note that the Building Blocks are already sorted in order of prevalence. The Building Blocks at the top of the panel are more common than those at the bottom.

ichwilldiesennamen Lv 1

Good to know that this is already sorted. Is this valid for ALL categories in the bp-tool? So not only sheet-sheet but also sheet-helix, for example? I have typically used elements closer to the bottom in each category. These have always worked very well for me, also in terms of AF performance. Would you recommend using elements with higher prevalence instead? Can we expect a scientific advantage from this, or should I rely only on getting good AF values AND score? Because that is what I get when using these less common elements.
For sheet-sheet I do get very good results with the GG loops, so I am fine in that category.
Any guidance or tips on what I should change to get scientifically more useful results are very much appreciated.

bkoep Staff Lv 1

Yes, each category is sorted from most to least prevalent.

You should not worry about using Building Blocks at the bottom of the list; use whichever Building Block suits your needs! All of the Foldit Building Blocks are "canonical" loops that show up frequently in natural protein structures. Some show up more frequently than others, but that doesn't necessarily mean they are more stable.

See Lin et al. (2015) for more details.