Protein Design Critique: IL-7R Binder Redesign
You’re doing great so far! I've looked at your solutions from the first 4 puzzle rounds, and I think a lot of your designs are going to work! I just wanted to remind everyone that, in addition to the Foldit score you get on each puzzle, you'll also get a binding score at the end, based on our lab tests of these designs!
Designing a protein to fold precisely is a difficult problem! When we test your protein, we are testing whether the sequence you chose folds into the shape of your solution. In Foldit, you can change your solution into whatever shape you want, but in the lab your sequence might not fold into the shape you wanted. It took scientists decades to figure out the shape a given protein sequence folds into (they call this the Protein Folding Problem). The good news for you though is that the Protein Folding Problem has a really simple answer:
I want to emphasize a few guidelines you can use to ensure your designed fold is the most favorable state:
· Secondary structure - use lots of alpha-helices or beta sheets
· Puzzle score - try to have the best score for your chosen fold
· Short loops - you'll need to use loops, but keep them as short as possible
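To make the loop guideline concrete, here is a rough sketch of how you might measure loops in a design. It assumes the secondary structure is written as one letter per residue ('H' for helix, 'E' for sheet, 'L' for loop); the function name and the example string are made up for illustration and are not part of any Foldit tool.

```python
from itertools import groupby


def loop_stats(ss):
    """Summarize loops in a per-residue secondary-structure string.

    Assumed encoding (illustrative only): 'H' = helix, 'E' = sheet, 'L' = loop.
    """
    if not ss:
        raise ValueError("secondary-structure string is empty")
    # Collapse the string into runs, e.g. "HHHLL" -> [("H", 3), ("L", 2)]
    segments = [(kind, len(list(run))) for kind, run in groupby(ss)]
    loop_lengths = [n for kind, n in segments if kind == "L"]
    return {
        "loop_fraction": ss.count("L") / len(ss),
        "longest_loop": max(loop_lengths, default=0),
        "n_loops": len(loop_lengths),
    }


# Hypothetical 3-helix bundle with one short loop and one long loop
design = "L" + "H" * 12 + "L" * 3 + "H" * 12 + "L" * 8 + "H" * 12 + "L"
stats = loop_stats(design)
print(stats)
```

A lower loop fraction and a shorter longest loop are what the guidelines above ask for; this sketch says nothing about the puzzle score itself.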
Next I’ll show some examples and give my thoughts on a few designs from Foldit players. Please note that all of these designs have been chosen because they showcase a single weakness in an otherwise excellent design. We don't mean to disparage anyone's designs—on the contrary, the solutions highlighted in this critique are among our favorites!
A study of two 3-helix bundles
While both of these structures emphasize secondary structure and well-packed cores, design A is more likely to fold because of its shorter loops.
The reason we prefer secondary structure to loops is that loops typically have many alternate conformations (decoys) that score the same or even better than the design model. Shorter loops mean fewer decoys and a better chance of folding as intended. For instance, one can imagine how the loop of design B could misfold so that the third helix is on the wrong side of the bundle.
Bad beta-sheet, better beta-sheet, best beta-sheet
Beta sheets are a tricky secondary structure, because they require distant parts of the protein chain to come together. The point I want to highlight here again is that shorter loops are almost always better. In design C, there are too many loop residues between the helices and sheets. These loop residues are likely to rearrange themselves in real life.
Design D has shorter loops, but I still see a few backbone H-bond pairs that are unsatisfied here. (Also, I'm not so sure about that ARG / GLU zipper there. ARG / GLU like to form helices, so I'd probably go with HIS / THR...)
Design E is an optimized Baker Lab design (not from the IL-7R series), but I wanted to include it to demonstrate my point. Look at how short those loops are! This is a difficult fold to master, but Foldit players like challenges, right?
4-helix bundles, the good, the bad, and the ugly
When it comes to 4-helix bundles (and really all designed proteins), the name of the game is compactness. You want your design to resemble a ball, with every portion stabilized by at least 2 other secondary structures. Design H fails at just that; it's too long and unsupported. This structure will almost certainly fold into something more compact in real life.
Design G also breaks this rule: it leaves a large portion of the structure thin and unsupported. Those two helices would have been better placed on top of the protein, as in the good example.
Yes, design F would be better if the helices were longer, but we didn't give players enough residues for that (unfortunately, we're limited to small proteins for our lab experiment). If you run out of residues for good helix packing, you can try beta-sheets. Previous experiments have shown, though, that helices are more robust than beta-sheets. So if the choice is between an okay beta-sheet and an okay helix, I'd go for the helix.
Don't try to make additional target contacts
First let me say that these designs are very interesting in that they make additional contacts with the target. Especially in design I, I'm not even sure I could design that with all the tools I have! But, I want to remind everyone that in this design challenge, folding is more important than binding.
You've already been given two helices that are guaranteed to bind the IL-7R. If you can just fold the rest of the protein into a stable fold then you'll have a binder!
Great 3-helix bundle, but that long loop isn't going to fly
Finally, one more design to really hammer home the message of shorter loops. Design K looks great with three well packed helices, but look a little closer and you'll see that a long loop is required to stretch back and meet the third helix. I'll admit, this protein has a chance to work, but with a loop that long, who knows where the final helix will actually fold...
We have a lot more puzzles planned for this series, and we look forward to seeing more designs from Foldit players! Round 5 just closed, and we'll get started on the analysis of those solutions right away. In the meantime, check out the Round 6 puzzle, which is online now!
Posted by bcov 78 915 | Fri, 08/16/2019 - 17:57
Redesigning IL-7R Binders
Hi Foldit players! We need your help redesigning protein binders!
I'm bcov, a graduate student in the Baker Lab. My PhD project is to make proteins that stick to other proteins. In my work, I’m given the model of a natural target protein and my task is to design a new protein that will bind to it. It turns out this problem is really hard because not only do my designed proteins need to bind to the target, but they have to properly fold first! Fortunately, I can use a high-throughput binding experiment that allows me to test 100,000 different proteins at once.
At the moment, I’m interested in studying the folding aspect of this problem. I have a clever experiment planned where I should be able to confirm the atomic accuracy of a designed protein even when it’s mixed with thousands of other proteins. For this experiment, I will need lots of binder designs that have different folds, but that share a common binding interface. I'm planning a series of Foldit puzzles in which players can redesign my binders while preserving the binding interface.
My designed binders target a protein called interleukin 7 receptor (IL-7R), which helps to regulate the human immune system, and is an important target for cancer therapy.
Here are the details of the experiment:
· I have 11 designed proteins that are confirmed to bind the target IL-7R
· I want to leave my designed binding interface the same, but redesign the rest of the protein
· In each puzzle, your task is to design the rest of the protein so that it folds the interface-side in precisely the right conformation
· Your designs will be tested for binding against IL-7R
· You will get a binding score based on how well your design binds in the wet lab
The binding score here is really cool actually. After we run the binding experiments at the end of the puzzle series, you will receive cold-hard data from the biochemistry lab about the binding strength of your design. Well-folded proteins that fold precisely into the puzzle structure will likely score the highest. Details about the binding score will be released later, but in general, there are three categories:
1. Your design did not bind to IL-7R
2. Your design bound to IL-7R but was worse than my design
3. Your design bound to IL-7R and was better than my design
If you end up in category 3, congrats! You beat me :P
Nearly all Foldit player designs will be tested experimentally. This is possible because we can test all the designs at the same time in our high-throughput binding experiment. Designs that look especially good will be tested multiple times with various mutations to increase data consistency.
Due to time constraints, puzzles in this series will be shorter than our normal week-long puzzles, and will only run for 4 days at a time. We'd like to generate as many variants as possible for the original 11 binders. So, don't worry if you miss a puzzle; there will be plenty more to follow up!
Check out the first puzzle of the series, Puzzle 1704: IL-7R Binder Redesign: Round 1, which is out now! Happy folding!
( Posted by bcov 78 915 | Thu, 07/25/2019 - 20:37 | 4 comments )
Protein Design Critique: Cubane FeS Binder
A few weeks ago, we challenged Foldit players to design a protein that could bind an iron-sulfur (FeS) cluster, in Puzzle 1688: Cubane FeS Binder Design. A cubane type [4Fe-4S] iron-sulfur cluster is a "cube" made out of alternating iron and sulfur atoms, and is bound by carefully-placed cysteine residues in a protein. Iron-sulfur metallo-proteins are responsible for electron transfer in light-harvesting, cellular respiration, and many other processes. We'd like to design an iron-sulfur protein so that we can better understand electron transfer in proteins. By changing the environment around the iron-sulfur cluster, we could tune the electron transfer properties of the protein, which could open the door to metabolic engineering and new chemistry!
We asked Dr. Anindya Roy, the Baker Lab’s expert on redox proteins, to take a look at Foldit players’ designs from Puzzle 1688. Below are some comments from Anindya, which we hope players will take into account for the Round 2 puzzle, which is online now!
We were very excited about the structural diversity of designs by Foldit players, who developed a variety of different protein folds! Many natural redox proteins adopt a ferredoxin fold, with a secondary structure pattern of (β-α-β)2, and we were worried that Foldit players might also favor the same ferredoxin fold. We were happy to see lots of helical bundles and other α/β folds with different secondary structure patterns, because these folds might have properties that are not possible with the ferredoxins typically found in nature. We encourage players to keep exploring helical bundles and other folds!
Room for improvement
In these initial designs, the two main areas for improvement are excessive loops and incomplete burial of the FeS cluster.
The cubane FeS cluster should be buried inside the protein core as much as possible. If the FeS cluster is to be used to catalyze a chemical reaction, then we want the active site to be protected from the water surrounding the protein. The top-scoring design by toshiue and Wilm, shown below, does a good job of burying the FeS cluster. The frozen FeS-binding loop is highlighted in blue and purple, with helices packing nicely against the cluster on three sides, shielding it from water.
If we zoom in on the FeS cluster, we can see some other nice features of this design. We like to see large, aromatic residues packed near the FeS cluster, like the TRP residue at the left of this protein. Players should try to design aromatic PHE, TRP, and TYR residues around the FeS cluster. Also, because the FeS cluster is negatively charged, it can be stabilized with complementary positively-charged residues, like the LYS residue shown beneath the cluster here. Players should also design positively charged LYS and ARG residues near the FeS cluster.
Unfortunately, we’re afraid this design has too many residues in loops, and not enough secondary structure. We can see in the first image that the frozen loop has been extended to make an even longer loop, which is unlikely to fold as intended. In order for these protein designs to fold up with high stability, we want to minimize the amount of loops in the structure. The more residues in helices and sheets, the better!
Below is a design by Galaxie and grogar7 that has a much smaller proportion of loop residues. The FeS-binding loop is flanked closely by long, stable helices on either side, and all of the other helices are connected by minimal loops. This design would have a much better chance of folding up into a stable structure.
However, in this design the FeS cluster is not completely buried, and will be exposed to the water surrounding the protein. This means we have less control over the electron transfer properties of the FeS cluster, which makes it harder to design an enzyme that can catalyze chemical reactions. One way to improve this design would be to extend the helices on either side of the FeS cluster in order to bury it away from the surrounding solvent.
This design also features lots of positively charged LYS and ARG residues at the binding site, which help to stabilize the negatively charged FeS cluster. Keep in mind that these charged residues have polar atoms that like to make hydrogen bonds. On the protein surface, they can make hydrogen bonds with the surrounding water; but if they’re buried away from solvent then they need to make hydrogen bonds within the protein!
In summary, we encourage Foldit players to design more helical bundles and other folds, with a focus on minimizing loop residues and burying the FeS cluster in the protein core! Large aromatic residues (PHE, TRP, TYR) and positively charged residues (LYS, ARG) help to stabilize the FeS binding site! Play Puzzle 1701: Cubane FeS Binder Design: Round 2 now!
Posted by bkoep 78 593 | Fri, 07/19/2019 - 22:57
The Foldit protein design paper
Today, the scientific journal Nature published a paper titled "De novo protein design by citizen scientists," all about the work of Foldit players!
The paper is written for an audience of professional scientists, and gets somewhat technical. This blog post is meant to summarize the main points of the paper, so that everyone can appreciate the significance of this achievement. If you have trouble accessing the paper on the Nature website, try this view-only online version or check the Baker Lab website.
What is 'de novo' protein design?
The Latin phrase de novo translates literally to “from the new”—we usually use it to mean “from scratch.” Veteran Foldit players will recognize this phrase from De-novo Freestyle Foldit puzzles, where players fold up a protein from a completely unfolded starting position (i.e. from scratch), rather than from a partially-folded starting position.
In the field of protein design, this phrase has a special meaning. De novo protein designs are created without referencing the sequences of natural proteins.
To illustrate, you could imagine designing a 3-helix bundle protein just by looking at the sequences of natural 3-helix bundles and choosing the most common amino acid at each position. Since we have lots of data about natural protein sequences, and powerful ways to extract patterns from data, this method is relatively easy. But it will only ever let us design proteins that are similar to natural proteins.
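The "most common amino acid at each position" idea can be sketched in a few lines. This is only a toy illustration: it assumes the natural sequences have already been aligned to the same length, and the four short sequences below are invented for the example, not real proteins.

```python
from collections import Counter


def consensus_sequence(aligned):
    """Pick the most common amino acid in each column of an alignment.

    Assumes every sequence has already been aligned to the same length.
    Ties go to whichever residue appears first in that column.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must all have the same length")
    # zip(*aligned) walks the alignment column by column
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned)
    )


# Invented toy alignment (not real protein sequences)
toy_alignment = ["MKELLKK", "MKEALKR", "MREALEK", "MKDALKK"]
consensus = consensus_sequence(toy_alignment)
print(consensus)  # MKEALKK
```

Notice that the result can only ever be a blend of the input sequences, which is exactly the limitation described above.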
On the other hand, de novo protein design is much more difficult. Rather than relying on patterns in massive datasets, de novo design requires an understanding of the physical principles behind protein folding. The advantage is that we can use de novo methods to design brand new proteins that are unlike any proteins found in nature.
Why is protein design hard?
A designed protein must fold entirely on its own, without direction or instruction from any outside source.
The number of possible folds for a protein is huge, and a protein dissolved in solution is generally free to sample any of those possible folds. But if the protein sequence is chosen carefully, then the protein chain will have lower energy in one fold than in any other, and the protein will naturally prefer that lowest-energy fold.
It is difficult to choose the sequence because there are also many possible protein sequences (more than there are atoms in the universe!). And, once we choose a sequence for our target fold, we cannot check all the possible folds to ensure that our target fold has the lowest energy.
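A quick back-of-the-envelope check of that claim: with 20 amino acids to choose from at each position, a protein of N residues has 20^N possible sequences. The sketch below compares that to roughly 10^80, a commonly quoted order-of-magnitude estimate for the number of atoms in the observable universe (an assumption brought in here, not a figure from the paper). The 100-residue length is also just an illustrative choice.

```python
import math

AMINO_ACIDS = 20
# Assumption: ~10^80 is a commonly quoted rough estimate, not an exact count
LOG10_ATOMS_IN_UNIVERSE = 80


def log10_sequence_count(length):
    """Return log10 of the number of possible sequences of a given length."""
    return length * math.log10(AMINO_ACIDS)


def shortest_length_exceeding_atoms():
    """Smallest chain length whose sequence count exceeds the atom estimate."""
    length = 1
    while AMINO_ACIDS ** length <= 10 ** LOG10_ATOMS_IN_UNIVERSE:
        length += 1
    return length


print(f"100 residues: about 10^{log10_sequence_count(100):.0f} sequences")
print(f"Shortest chain beating the estimate: {shortest_length_exceeding_atoms()} residues")
```

Under these assumptions, even a fairly small protein already has far more possible sequences than anyone could ever test one by one.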
For a deeper discussion about the difficulties of protein design, see this previous blog post.
How can computer gamers design proteins?
Figure 1 below shows the Foldit game interface. Foldit players have a number of tools that allow them to change both the fold and the sequence of a virtual protein. The player's score is calculated from the energy of the virtual protein, with a state-of-the-art energy function developed by academic protein scientists. By competing with one another to reach the highest score, Foldit players arrive at virtual proteins with extremely low energies (a high Foldit score corresponds to a low protein energy).
Since energy alone is not enough for protein design, the Foldit team has had to make some adjustments to the Foldit score function. Every step of the way, we’ve relied on the work of Foldit players to expose problems with our score function. Foldit players are excellent at exploring new kinds of protein folds that are unlike anything seen in nature. For this reason, Foldit players are incredibly helpful for identifying unanticipated weaknesses in our energy function, and ultimately can improve our understanding of protein folding.
How do Foldit players actually design proteins?
Figure 2 shows that Foldit players design proteins very differently than automatic protein design algorithms. From start to finish, players will routinely accept huge penalties (high-energy spikes; colored traces in panel 2a) that ultimately pay off with low-energy designs.
Automatic algorithms, on the other hand, can only accept very small penalties, and they do so less frequently (gray traces in panel 2a).
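One way to see why an automatic algorithm rarely accepts a large penalty is the Metropolis criterion, a standard acceptance rule in Monte Carlo sampling. The post doesn't say which rule the algorithms in Figure 2 use, so treat this as an assumed, simplified stand-in; the energy and temperature values are arbitrary illustrative units.

```python
import math


def metropolis_accept_probability(delta_energy, temperature):
    """Probability of accepting a move that changes the energy by delta_energy.

    Moves that lower the energy are always accepted; moves that raise it
    are accepted with probability exp(-delta_energy / temperature).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta_energy <= 0:
        return 1.0
    return math.exp(-delta_energy / temperature)


# Arbitrary illustrative values: a small penalty vs. a huge spike
small_penalty = metropolis_accept_probability(0.5, temperature=1.0)
huge_penalty = metropolis_accept_probability(20.0, temperature=1.0)
print(f"small penalty accepted with p = {small_penalty:.3f}")
print(f"huge penalty accepted with p = {huge_penalty:.2e}")
```

Under this rule a small penalty is accepted fairly often, while a huge spike is almost never accepted. A human player faces no such restriction and can deliberately push through a spike if they expect it to pay off later.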
How do virtual Foldit designs behave in real life?
Figure 3 shows data from the lab tests that we perform on protein designs from Foldit players.
The first thing to note, in panel 3a, is that these proteins are extremely diverse and span many different protein folds. Due to the amount of planning and creativity required to conceive a protein fold, a protein engineer will usually focus on a small number of protein folds for a given task. This paper reports a greater number of protein folds than any other protein design paper to date—including a brand new fold that is not observed in any natural proteins!
Panels 3c-f show that these proteins are very well-behaved both on the computer and in the lab. The plots in panel 3c show that Rosetta@home computer simulations predict the designs will fold accurately (details here).
Panels 3d-e show that the proteins don’t aggregate together, and are rigidly structured in solution. And panels 3f-g show that the proteins do not unfold except in extremely harsh conditions (read more here). Most natural proteins unfold with only 3-5 kcal/mol of energy; many of the designed proteins are hyper-stable and require >10 kcal/mol!
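To get a feel for what those numbers mean, here is a rough sketch that converts a stability into the fraction of protein molecules that are unfolded at any moment. It assumes the quoted energies are free energies of unfolding, a simple two-state (folded or unfolded) model, and a temperature of 25 °C; none of these assumptions come from the paper itself.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal / (mol * K)
TEMPERATURE = 298.15      # K, i.e. 25 degrees C (assumed)


def fraction_unfolded(delta_g_unfold):
    """Fraction of molecules unfolded in a two-state model.

    delta_g_unfold is the free energy of unfolding in kcal/mol
    (positive values mean the folded state is favored).
    """
    # Equilibrium constant for folded -> unfolded
    k_unfold = math.exp(-delta_g_unfold / (GAS_CONSTANT * TEMPERATURE))
    return k_unfold / (1.0 + k_unfold)


for stability in (3.0, 5.0, 10.0):
    print(f"{stability:4.1f} kcal/mol -> fraction unfolded = "
          f"{fraction_unfolded(stability):.1e}")
```

Because the relationship is exponential, going from about 4 to 10 kcal/mol does not merely halve the unfolded population; it shrinks it by several orders of magnitude.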
How do we know that the proteins fold up as designed?
Since proteins are smaller than the wavelength of visible light, we can’t see them directly under a microscope. However, in some cases we can use very intensive techniques to determine the structure of a protein indirectly (read more here and here). We used these techniques to solve high-resolution structures of 4 proteins designed by Foldit players.
Figure 4 shows the exact placement of atoms in the real-life protein structures, which is nearly identical to the virtual protein design in every case.
So, what does this all mean?
This is a huge accomplishment for Foldit players! De novo protein design is a very new field, and already citizen scientists are making significant contributions—not just by designing new proteins, but also by helping us improve our understanding of protein design. We hope that scientists in other fields will be able to find similar ways to engage public creativity and enthusiasm, to increase our understanding of the world.
Now that Foldit players can accurately design high-quality proteins from scratch, we can start to challenge Foldit players with more applied protein design problems. We’d like Foldit players to help us design new proteins that can assemble into multi-component structures and materials, or that can bind to biological targets as potent medicines, or that can degrade toxic chemicals!
Because Foldit depends on the cooperation and competition of its player community, our scientific ability grows rapidly with the number of Foldit players. We look forward to expanding the Foldit community and recruiting more creative and curious Foldit players!
Help us design a protein for cancer treatment right now, by playing Puzzle 1683: Integrin Antagonist Design!
( Posted by bkoep 78 593 | Wed, 06/05/2019 - 18:11 | 6 comments )
New Custom Contests feature
We are excited to announce a new feature in Foldit: Custom Contests! As you may know, contests have been a feature that allows anyone to host their own private Foldit puzzle, chosen from a limited, pre-selected list. Now, you can make your own custom Foldit puzzle of whatever you choose and host it as a contest. We designed the Custom Contest feature especially for educators, who can now tailor their Foldit puzzles to their exact curriculum. There are plenty of other uses as well, including private contests that research groups can use to brainstorm new ideas, or even Foldit parties!
We just published a paper, which can be found here, that describes Custom Contests in depth. If you’re interested in making Custom Contests, please email mail.fold.it |at| gmail.com for access.
( Posted by beta_helix 78 1421 | Mon, 03/04/2019 - 16:53 | 8 comments )