Experiment results for MERS-CoV binders

We have lab results for Foldit MERS-CoV binder designs! Several months ago, we challenged Foldit players to design proteins that could bind to the spike protein of the MERS coronavirus and block infection (similar to our previous challenge to bind the SARS-CoV-2 spike protein). After the puzzles ended, we used yeast-display FACS experiments to test the most promising Foldit designs and see whether they stick to the MERS spike protein.

Long story short, this experiment did not reveal any successful binders. Read on for details about these designs and the lab experiments we used to test them, including some new ideas we used to boost our chances of success. Evidence suggests that this particular MERS-CoV spike protein is an especially difficult binding target. And this latest experiment highlights the challenge of the protein binder design problem.

Design strategy

Starting in October 2020, we ran 7 rounds of MERS-CoV Binder Design puzzles. Prior to these puzzles, we had challenges to design binders for the SARS-CoV-2 spike and the IL6 receptor. But these MERS-CoV puzzles were some of the first Foldit puzzles to award bonuses for binder metrics, like SASA and Shape Complementarity.

Since then, we’ve replaced SASA and Shape Complementarity with the new Contact Surface Objective, which is faster to run and seems to be a better predictor of binder success. We were only able to run one MERS-CoV puzzle with the Contact Surface Objective, but the results from that puzzle looked especially good. (In fact, we ended up testing more designs from that puzzle than any other puzzle in the series!)

After the round 7 puzzle closed, we ran some additional analysis on all of Foldit players’ solutions to select the most promising designs. Our selection criteria included binder metrics like DDG (the predicted binding energy), Contact Surface, and BUNS (buried unsatisfied polar atoms). We also ran some other calculations, like secondary structure prediction, that indicate whether a design is likely to fold correctly.

In the end, we selected 59 solutions that we believed could bind to the MERS-CoV spike protein, and queued the designs for testing:

2010629_c0001 OWM3,Enzyme
2010629_c0107 ZeroLeak7
2010629_c0138 Enzyme
2010629_c0143 ZeroLeak7
2010629_c0407 PLAYER_3
2010629_c0998 ichwilldiesennamen
2010629_c1072 alwen
2010629_c1101 Todd6485577
2010667_c0002 ZeroLeak7
2010667_c0004 Galaxie,PLAYER_2
2010667_c0009 grogar7
2010667_c0012 ucad
2010667_c0018 ichwilldiesennamen
2010667_c0021 Enzyme,gm82
2010667_c0023 Bletchley Park,infjamc
2010667_c0165 Todd6485577
2010667_c0428 ZeroLeak7
2010727_c0156 Bruno Kestemont
2010727_c0545 Bruno Kestemont
2010727_c0684 ZeroLeak7,mirp
2010727_c0840 Mike Lewis,Enzyme
2010727_c1402 ZeroLeak7,Dhalion,fisherlr777
2010771_c0001 Enzyme,Skippysk8s
2010771_c0004 nspc
2010771_c0005 Galaxie,grogar7
2010771_c0014 Enzyme
2010771_c0041 PLAYER_4
2010771_c0060 PLAYER_2
2010816_c0010 PLAYER_1
2010816_c0011 ZeroLeak7,mirp
2010816_c0020 ZeroLeak7
2010816_c0028 ichwilldiesennamen
2010816_c0033 LociOiling
2010816_c0034 Bruno Kestemont
2010816_c0054 nspc
2010816_c0087 ucad
2010816_c0103 LociOiling
2010816_c0128 LociOiling
2010816_c0130 LociOiling
2010816_c0154 LociOiling
2010816_c0157 LociOiling
2010816_c0191 LociOiling
2010816_y4289 nspc
2010816_y4442 Formula350
2010882_c0014 nspc
2010882_c0018 ichwilldiesennamen
2010913_c0002 Skippysk8s
2010913_c0017 Skippysk8s
2010913_c0030 Skippysk8s
2010913_c0039 Skippysk8s
2010913_c0063 Skippysk8s
2010913_c0227 Skippysk8s
2010913_c0508 Anfinsen_slept_here
2010913_c0567 Skippysk8s
2010913_c0669 Skippysk8s
2010913_c0789 ZeroLeak7,Dhalion
2010913_c0933 NeLikomSheet
2010913_c0939 ZeroLeak7,Dhalion
2010913_c1161 NeLikomSheet

Boosting Foldit binder designs

If you read our previous blog post about design throughput, you might recall that the success rate for binder design experiments is around 0.1%. We think the main source of failure is designs not folding up exactly as designed. Even a tiny inaccuracy in folding can be ruinous for binding, for example if it creates a clash or leaves a polar atom unsatisfied.

Researchers are working hard to improve our ability to select good binders before testing, so we can increase this success rate. But right now even the best-looking design only has a 1 in 1000 chance of correctly binding the target. This also means that we’d like to be testing at least 1000 designs in each experiment. So, even though Foldit players created 59 excellent binder designs, it is still unlikely that our experiment will reveal a successful binder from a batch of this size.
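To put rough numbers on this, here is a minimal sketch in Python. It assumes that every design succeeds independently with the same 0.1% probability, which is our simplification for illustration and not a property of the real experiment. The function name is hypothetical.

```python
def chance_of_any_binder(n_designs: int, success_rate: float = 0.001) -> float:
    """Probability that at least one of n_designs binds the target.

    Assumption (for illustration only): each design succeeds
    independently with the same probability success_rate.
    """
    # P(at least one success) = 1 - P(every design fails)
    return 1.0 - (1.0 - success_rate) ** n_designs


for n in (59, 1000):
    chance = chance_of_any_binder(n)
    print(f"{n:>5} designs: {chance:.1%} chance of at least one binder")
```

Under these assumptions, a batch of 59 designs has only about a 6% chance of containing a binder, while a batch of 1000 has about a 63% chance. So even 1000 designs is no guarantee of a hit.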

In order to boost our design numbers and make the most of Foldit players’ work, we used a new grafting technique to recombine Foldit binder designs with automated design scaffolds.

Using high-throughput folding experiments, scientists at the Institute for Protein Design (IPD) have accumulated a database with millions of automatically designed proteins that seem to be well-folded in the lab. Even though these proteins don’t do anything (like bind a target), they are good starting points for further design, and serve as useful scaffolds for modification. For many protein design projects at the IPD, scientists prefer to start from one of these well-behaved protein scaffolds rather than try to design a new fold from scratch.

From each of the 59 parent Foldit designs, we extracted the portion that makes the most binding interactions with the target. Then we looked to see if we could computationally graft that portion onto the scaffolds in our database. This method is finicky, and the graft has to match the scaffold backbone very closely for it to have any chance of working. Even though there are millions of proteins in the scaffold database, some Foldit designs cannot be matched to any scaffold.

This technique lets us recycle Foldit-designed interfaces into many unique designs with different folds and different sequences. By recombining the Foldit designs with the scaffolds, we were able to multiply our 59 parent designs into 873 grafted designs.

For good measure, we also redesigned each of the 59 parent designs using the IPD’s latest machine-learning algorithm. Typically, this does not change the parent design drastically, but it still provides a little bit more sequence diversity for the experiment. That brought us to a total of 989 designs to test for binding against the MERS-CoV spike protein.

Experiment results

Below is a preview of the data. You can download the data for all 989 designs here.

pdb_id       	counts1	counts2	counts3	counts4	counts5	counts6	ddg    	contact_surface	BUNS
2010629_c0001	37     	0      	0      	0      	0      	0      	-44.673	401.919     	3
2010629_c0107	4      	0      	0      	0      	0      	0      	-56.559	567.339     	7
2010629_c0138	0      	0      	0      	0      	0      	0      	-41.261	422.217     	6
2010629_c0143	34     	0      	0      	0      	0      	0      	-42.075	430.841     	6
2010629_c0407	2      	0      	0      	0      	0      	0      	-44.809	424.300     	7
2010629_c0998	1      	0      	0      	0      	0      	0      	-39.433	356.197     	4
2010629_c1072	0      	0      	0      	0      	0      	0      	-36.672	440.411     	5
2010629_c1101	0      	0      	0      	0      	0      	0      	-37.413	381.733     	7
2010667_c0002	4      	0      	0      	0      	0      	0      	-43.552	481.502     	3

Sort schedule

  1. Expression
  2. Enrichment at 1000 nM target
  3. Enrichment at 1000 nM target
  4. Binding at 1000 nM target
  5. Binding at 200 nM target
  6. Binding at 40 nM target

For details about the binding experiment and how to interpret these numbers, see previous blog posts here and here. To recap: yeast cells display our designs on their surface, and we do successive rounds of sorting to collect yeast cells that appear to stick to the target. After each round of sorting, we use DNA sequencing to track which designs were collected. A high number for a design indicates that we collected many yeast cells displaying that design, meaning the design appears to bind the target.

Generally speaking, if a design successfully binds the target, we expect to see steady high numbers for that design across all six rounds of the experiment. An unsuccessful design will have decreasing numbers and eventually drop out of the sorting rounds.
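As a rough illustration of this rule of thumb, the Python sketch below scans the preview rows for designs that survive every sorting round. It assumes that column countsN holds the count after round N of the sort schedule. The `min_count` threshold and the function names are hypothetical, and this is not the analysis we actually ran.

```python
# Preview rows copied from the data table above (whitespace-separated).
PREVIEW = """\
pdb_id counts1 counts2 counts3 counts4 counts5 counts6 ddg contact_surface BUNS
2010629_c0001 37 0 0 0 0 0 -44.673 401.919 3
2010629_c0107 4 0 0 0 0 0 -56.559 567.339 7
2010629_c0138 0 0 0 0 0 0 -41.261 422.217 6
2010629_c0143 34 0 0 0 0 0 -42.075 430.841 6
2010629_c0407 2 0 0 0 0 0 -44.809 424.300 7
2010629_c0998 1 0 0 0 0 0 -39.433 356.197 4
2010629_c1072 0 0 0 0 0 0 -36.672 440.411 5
2010629_c1101 0 0 0 0 0 0 -37.413 381.733 7
2010667_c0002 4 0 0 0 0 0 -43.552 481.502 3
"""


def parse_rows(text):
    """Turn the whitespace-separated table into a list of dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def looks_like_binder(row, min_count=1):
    """True if the design is still present in all six sorting rounds.

    min_count is a hypothetical cutoff chosen for illustration.
    """
    counts = [int(row[f"counts{i}"]) for i in range(1, 7)]
    return all(count >= min_count for count in counts)


rows = parse_rows(PREVIEW)
candidates = [row["pdb_id"] for row in rows if looks_like_binder(row)]
print(candidates)  # [] : every preview design drops out after round 1
```

Even with this very lenient cutoff, none of the preview designs make it past the expression sort.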

Unfortunately, none of our 989 designs appeared to bind to the MERS-CoV spike protein.

A difficult target

In parallel with the Foldit designs, IPD scientists also tested about 30,000 designs that were created with an automated design method. From those 30,000 designs, only 11 showed any binding, and only weak binding at that. That translates to a success rate of roughly 0.04%, considerably lower than the usual 0.1%, and hints that the MERS-CoV spike protein is an especially difficult target.

The 11 IPD binders all have especially high Contact Surface values (around 500 or greater). This was a bit surprising, since previous data had suggested a Contact Surface value of 400 can be sufficient for good binding. This new data will help us improve our binder design puzzles in Foldit, and in the future we’ll be challenging Foldit players to strive for binder designs with even stronger binder metrics!
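To see what a stricter Contact Surface bar would mean in practice, here is a small Python sketch that applies both cutoffs to the nine designs in the data preview earlier in this post. The cutoffs of 400 and 500 come from the paragraph above; treating them as hard filters, and the function name, are our assumptions for illustration.

```python
# (pdb_id, contact_surface) pairs copied from the data preview earlier in this post.
PREVIEW_CONTACT_SURFACE = [
    ("2010629_c0001", 401.919),
    ("2010629_c0107", 567.339),
    ("2010629_c0138", 422.217),
    ("2010629_c0143", 430.841),
    ("2010629_c0407", 424.300),
    ("2010629_c0998", 356.197),
    ("2010629_c1072", 440.411),
    ("2010629_c1101", 381.733),
    ("2010667_c0002", 481.502),
]


def at_or_above(pairs, cutoff):
    """Return the ids of designs whose Contact Surface meets the cutoff."""
    return [pdb_id for pdb_id, contact_surface in pairs if contact_surface >= cutoff]


passing_400 = at_or_above(PREVIEW_CONTACT_SURFACE, 400.0)
passing_500 = at_or_above(PREVIEW_CONTACT_SURFACE, 500.0)
print(len(passing_400))  # 7
print(passing_500)       # ['2010629_c0107']
```

Seven of the nine preview designs clear a cutoff of 400, but only one clears 500.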

(Posted by bkoep | Sun, 05/09/2021 - 01:06)