Experiment results for MERS-CoV binders
We have lab results for Foldit MERS-CoV binder designs! Several months ago, we challenged Foldit players to design proteins that could bind to the spike protein of the MERS coronavirus and block infection (this is similar to our previous challenge to bind the COVID-19 viral spike). After the puzzles ended, we used yeast-display FACS experiments to test the most promising Foldit designs and see if they stick to the MERS spike protein.
Long story short, this experiment did not reveal any successful binders. Read on for details about these designs and the lab experiments we used to test them, including some new ideas we used to boost our chances of success. Evidence from a parallel experiment suggests that this particular MERS-CoV spike protein is an especially difficult binding target, and this latest result highlights just how challenging the protein binder design problem is.
Starting in October 2020, we ran 7 rounds of MERS-CoV Binder Design puzzles. Prior to these puzzles, we had challenges to design binders for the SARS-CoV-2 spike and the IL6 receptor. But these MERS-CoV puzzles were some of the first Foldit puzzles to award bonuses for binder metrics, like SASA and Shape Complementarity.
Since then, we’ve replaced SASA and Shape Complementarity with the new Contact Surface Objective, which is faster to run and seems to be a better predictor of binder success. We were only able to run one MERS-CoV puzzle with the Contact Surface Objective, but the results from that puzzle looked especially good. (In fact, we ended up testing more designs from that puzzle than any other puzzle in the series!)
After the round 7 puzzle closed, we ran some additional analysis on all of Foldit players’ solutions to select the most promising designs. Our selection criteria included binder metrics like DDG, Contact Surface, and BUNS. We also ran some other calculations, like secondary structure prediction, that indicate whether a design is likely to fold correctly.
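As a rough illustration of this kind of metric-based filtering (a minimal sketch, not the actual selection pipeline), the snippet below keeps designs that pass a cutoff on each of the three binder metrics. The cutoff values are placeholders chosen for the example; only the Contact Surface figure of 400 echoes a number mentioned later in this post. The four rows are copied from the data preview further down.

```python
from typing import NamedTuple


class Design(NamedTuple):
    pdb_id: str
    ddg: float              # computed binding energy; more negative is more favorable
    contact_surface: float  # larger is better
    buns: int               # buried unsatisfied polar atoms; fewer is better


# Hypothetical cutoffs for illustration only, not the values used for selection.
MAX_DDG = -40.0
MIN_CONTACT_SURFACE = 400.0
MAX_BUNS = 5


def passes_filters(d: Design) -> bool:
    """Return True if the design clears all three illustrative cutoffs."""
    return (
        d.ddg <= MAX_DDG
        and d.contact_surface >= MIN_CONTACT_SURFACE
        and d.buns <= MAX_BUNS
    )


designs = [
    Design("2010629_c0001", -44.673, 401.919, 3),
    Design("2010629_c0107", -56.559, 567.339, 7),
    Design("2010629_c0998", -39.433, 356.197, 4),
    Design("2010667_c0002", -43.552, 481.502, 3),
]

selected = [d.pdb_id for d in designs if passes_filters(d)]
print(selected)
```

With these placeholder cutoffs, 2010629_c0107 is rejected for its BUNS count despite having the best DDG and Contact Surface of the four, which shows why a design has to score well on several metrics at once.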
In the end, we selected 59 solutions that we believed could bind to the MERS-CoV spike protein, and queued the designs for testing:
2010667_c0023 Bletchley Park,infjamc
2010727_c0156 Bruno Kestemont
2010727_c0545 Bruno Kestemont
2010727_c0840 Mike Lewis,Enzyme
2010816_c0034 Bruno Kestemont
Boosting Foldit binder designs
If you read our previous blog post about design throughput, you might recall that the success rate for binder design experiments is around 0.1%. We think the main source of failure is that designs do not fold up exactly as intended. Even a tiny inaccuracy in folding can be ruinous for binding, for example if it creates a clash or an unsatisfied polar atom.
Researchers are working hard to improve our ability to select good binders before testing, so we can increase this success rate. But right now even the best-looking design only has a 1 in 1000 chance of correctly binding the target. This also means that we’d like to be testing at least 1000 designs in each experiment. So, even though Foldit players created 59 excellent binder designs, it is still unlikely that our experiment will reveal a successful binder from a batch of this size.
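To put numbers on that, here is a back-of-the-envelope sketch. It assumes every design succeeds independently with the same 0.1% probability, which is a simplification made for the example (real designs in a batch are not truly independent). Under that assumption, the chance of finding at least one binder among n designs is 1 - (1 - p)^n.

```python
def chance_of_at_least_one_binder(n_designs: int, success_rate: float = 0.001) -> float:
    """Probability that at least one of n independent designs binds.

    Assumes each design succeeds independently with the same success_rate,
    which is a simplification for illustration.
    """
    return 1.0 - (1.0 - success_rate) ** n_designs


for n in (59, 1000):
    print(f"{n} designs: {chance_of_at_least_one_binder(n):.1%}")
```

Under these assumptions, a batch of 59 designs has roughly a 6% chance of containing a binder, while a batch of 1000 has roughly a 63% chance. Even at 1000 designs, success is far from guaranteed.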
In order to boost our design numbers and make the most of Foldit players’ work, we used a new grafting technique to recombine Foldit binder designs with automated design scaffolds.
Using high-throughput folding experiments, scientists at the Institute for Protein Design (IPD) have accumulated a database with millions of automatically-designed proteins that seem to be well-folded in the lab. Even though these proteins don’t do anything (like bind a target), they are good starting points for further design, and serve as useful scaffolds for modification. For many protein design projects at the IPD, scientists prefer to start from one of these well-behaved protein scaffolds rather than try to design a new fold from scratch.
From each of the 59 parent Foldit designs, we extracted the portion that makes the most binding interactions with the target. Then we looked to see if we could computationally graft that portion onto the scaffolds in our database. This method is finicky, and the graft has to match the scaffold backbone very closely for it to have any chance of working. Even though there are millions of proteins in the scaffold database, some Foldit designs cannot be matched to any scaffold.
This technique lets us recycle Foldit-designed interfaces into many unique designs with different folds and different sequences. By recombining the Foldit designs with the scaffolds, we were able to multiply our 59 parent designs into 873 grafted designs.
For good measure, we also redesigned each of the 59 parent designs using the IPD’s latest machine-learning algorithm. Typically, this does not change the parent design drastically, but it still provides a little bit more sequence diversity for the experiment. That brought us to a total of 989 designs to test for binding against the MERS-CoV spike protein.
Below is a preview of the data. You can download the data for all 989 designs here.
pdb_id         counts1  counts2  counts3  counts4  counts5  counts6  ddg      contact_surface  BUNS
2010629_c0001  37       0        0        0        0        0        -44.673  401.919          3
2010629_c0107  4        0        0        0        0        0        -56.559  567.339          7
2010629_c0138  0        0        0        0        0        0        -41.261  422.217          6
2010629_c0143  34       0        0        0        0        0        -42.075  430.841          6
2010629_c0407  2        0        0        0        0        0        -44.809  424.300          7
2010629_c0998  1        0        0        0        0        0        -39.433  356.197          4
2010629_c1072  0        0        0        0        0        0        -36.672  440.411          5
2010629_c1101  0        0        0        0        0        0        -37.413  381.733          7
2010667_c0002  4        0        0        0        0        0        -43.552  481.502          3
...
- Enrichment at 1000 nM target
- Enrichment at 1000 nM target
- Binding at 1000 nM target
- Binding at 200 nM target
- Binding at 40 nM target
For details about the binding experiment and how to interpret these numbers, see previous blog posts here and here. To recap: yeast cells display our designs on their surface, and we do successive rounds of sorting to collect yeast cells that appear to stick to the target. After each round of sorting, we use DNA sequencing to track which designs were collected. A high number indicates that we collected many yeast cells that appear to bind the target with our design.
Generally speaking, if a design successfully binds the target, we expect to see steady high numbers for that design across all six rounds of the experiment. An unsuccessful design will have decreasing numbers and eventually drop out of the sorting rounds.
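The rule of thumb above can be sketched in a few lines of code. This example assumes the data is whitespace-separated text with the column names shown in the preview (the layout of the full download may differ), and it uses a hypothetical min_count threshold to decide what counts as "present" in a round.

```python
# Rows copied from the data preview; the full file format is an assumption.
PREVIEW = """\
pdb_id counts1 counts2 counts3 counts4 counts5 counts6 ddg contact_surface BUNS
2010629_c0001 37 0 0 0 0 0 -44.673 401.919 3
2010629_c0107 4 0 0 0 0 0 -56.559 567.339 7
2010629_c0143 34 0 0 0 0 0 -42.075 430.841 6
2010667_c0002 4 0 0 0 0 0 -43.552 481.502 3
"""


def parse_rows(text: str) -> list[dict[str, str]]:
    """Split whitespace-separated text into one dict per design."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def round_counts(row: dict[str, str]) -> list[int]:
    """Sequencing counts for the six sorting rounds, in order."""
    return [int(row[f"counts{i}"]) for i in range(1, 7)]


def looks_like_binder(row: dict[str, str], min_count: int = 1) -> bool:
    """True if the design is still present in every round.

    min_count is a hypothetical threshold for illustration; a real analysis
    would also compare counts against the total reads in each round.
    """
    return all(count >= min_count for count in round_counts(row))


rows = parse_rows(PREVIEW)
candidates = [row["pdb_id"] for row in rows if looks_like_binder(row)]
print(candidates)
```

For the preview rows, every design drops to zero after the first round, so the list of candidates is empty.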
Unfortunately, none of our 989 designs appeared to bind to the MERS-CoV spike protein.
A difficult target
In parallel with the Foldit designs, IPD scientists also tested about 30,000 designs that were created with an automated design method. From those 30,000 designs, only 11 showed any binding, and only weak binding at that. That translates to a success rate of about 0.04%, considerably lower than the usual 0.1%, and hints that the MERS-CoV spike protein is an especially difficult target.
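The arithmetic behind that comparison is short enough to check directly. The second half of this sketch goes one step further and asks what the IPD rate would imply for a batch of 989 designs. That step assumes the Foldit-derived designs succeed at the same rate as the automated ones, and independently of each other, which is a simplification for illustration rather than something the data establishes.

```python
ipd_binders = 11
ipd_designs = 30_000  # "about 30,000", so the rate below is approximate
foldit_designs = 989

# Observed success rate for the automated designs against this target.
observed_rate = ipd_binders / ipd_designs
print(f"observed success rate: {observed_rate:.3%}")

# Assumption: the same rate applies to each of the 989 designs, independently.
expected_binders = foldit_designs * observed_rate
chance_any = 1 - (1 - observed_rate) ** foldit_designs
print(f"expected binders among {foldit_designs}: {expected_binders:.2f}")
print(f"chance of at least one binder: {chance_any:.0%}")
```

Under these assumptions, a batch of 989 designs would be expected to yield well under one binder on average, so an empty result is not surprising for this target.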
The 11 IPD binders all have especially high Contact Surface values (around 500 or greater). This was a bit surprising, since previous data had suggested a Contact Surface value of 400 can be sufficient for good binding. This new data will help us improve our binder design puzzles in Foldit, and in the future we’ll be challenging Foldit players to strive for binder designs with even stronger binder metrics!
Posted by bkoep | Sun, 05/09/2021 - 01:06