Two-sided protein interface design
This blog post explains some of the background science behind recent Two-sided Interface Design puzzles, like Puzzle 1963. IPD scientist Ryan Kibler elaborates on the goals of these puzzles, how they might be used, and the special challenges we face when designing proteins that bind each other to form organized protein assemblies.
Protein assemblies in nature
A major theme of recent Foldit puzzles is designing symmetric proteins. Through playing these puzzles, you have no doubt realized that symmetry allows you to build a large protein assembly by designing a single chain to bind with itself. Nature has apparently realized this, too. About 63% of the different kinds of proteins naturally produced by E. coli bacteria exist as symmetric homo-oligomers (“assemblies of the same chain”) [1].
Hetero-oligomers (“assemblies of different chains”) are much rarer but usually serve important biological functions. A great example of this is the LSm family of proteins. In humans and other eukaryotes, seven different chains assemble into a hetero-heptameric (“seven different parts”) LSm ring. These LSm rings have many functions, but they often involve binding to RNA and acting as hand-holds for other proteins to grab and make modifications to the RNA, or carry it from place to place.
Curiously, LSm proteins are found in all types of organisms (eukaryotes, archaea, and bacteria) but they don’t all form hetero-heptameric rings. In archaea they form homo-heptameric (“seven of the same part”) rings, where multiple copies of the same protein bind each other. Since eukaryotes are thought to have evolved from archaea, this suggests an interesting evolutionary tale where the gene encoding the homo-heptameric LSm protein in archaea was duplicated and diversified to the point where each one of the seven proteins prefers to assemble with different partners rather than with identical copies of itself.
Designing protein assemblies at the IPD
Designing hetero-oligomers with more than a few chains is a difficult task because there is a lot that can go wrong when you design so many interfaces simultaneously. Instead, scientists at the IPD have taken inspiration from this evolutionary story of LSm and are taking a different, incremental approach. Rather than designing all the interfaces at once to make a hetero-oligomer in one step, we have broken the problem down into smaller sub-problems. We start with a single homo-oligomer and redesign both sides of the interface many different times, then confirm through experiments that each redesigned interface forms correctly, and finally recombine the interfaces into a single protein. Using this strategy, we have already successfully transformed a homo-trimer into a hetero-trimer.
Figure 1. A schematic diagram shows how we create hetero-oligomers from a homo-oligomer starting structure. We can break the problem down into smaller pieces, and start by creating lots of different homo-oligomers. After we test the different homo-oligomers in the lab, we can recombine their different interfaces to create hetero-oligomers.
The two keys to this strategy are: (1) keeping the protein backbone fixed, except for the parts that make up the interface, so that the different interfaces can be copied and pasted onto the same starting structure; and (2) making the interfaces as diverse as possible in order to prevent unintended off-target assembly between the wrong interfaces.
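The recombination step can be pictured with a short Python sketch. This is only an illustration under simplifying assumptions: each tested interface is reduced to a label with an "A" half and a "B" half, and a chain is just the pair of halves it carries on the shared backbone. The names `Chain`, `recombine_ring`, and `ring_is_closed` are hypothetical and are not part of any IPD or Foldit tool.

```python
from typing import NamedTuple


class Chain(NamedTuple):
    side_a: str  # label of the interface whose "A" half this chain carries
    side_b: str  # label of the interface whose "B" half this chain carries


def recombine_ring(interfaces: list[str]) -> list[Chain]:
    """Paste one validated interface between each pair of neighboring chains.

    Chain i carries the A half of interface i and the B half of the
    previous interface, so neighbors around the ring always match.
    """
    n = len(interfaces)
    return [
        Chain(side_a=interfaces[i], side_b=interfaces[(i - 1) % n])
        for i in range(n)
    ]


def ring_is_closed(chains: list[Chain]) -> bool:
    """Check that every chain's A half meets the matching B half next door."""
    n = len(chains)
    return all(chains[i].side_a == chains[(i + 1) % n].side_b for i in range(n))


# Three different interfaces give three different chains (a hetero-trimer).
hetero = recombine_ring(["X", "Y", "Z"])
# Reusing one interface three times gives identical chains (a homo-trimer).
homo = recombine_ring(["X", "X", "X"])
print(hetero, ring_is_closed(hetero))
print(len(set(homo)))
```

Because every interface sits on the same frozen backbone, swapping labels in this toy model is all it takes to go from a homo-oligomer to a hetero-oligomer. In the real proteins, of course, each label stands for a full set of redesigned residues.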
Key #1 is easily accomplished by freezing the non-interface region to hold the backbone in place. We have been using the same alpha helical backbone for all interfaces, allowing only small tweaks to the starting structure. But key #2 is harder. So far, we have relied on H-bond networks to prevent unintended off-target assembly. This works because if two wrong chains attempt to bind, their non-matching polar residues are left without H-bond partners and become energetically unfavorable BUNS (buried unsatisfied polar atoms), which prevents binding. Only when the two correct chains come together does the H-bond network form and produce a stable interface with zero BUNS.
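To make the BUNS argument concrete, here is a deliberately crude Python toy model. It assumes each side of an interface can be summarized as a map from a position number to either a donor ("D") or an acceptor ("A"), and that an H-bond forms only when a donor faces an acceptor at the same position. Real scoring depends on 3D geometry, so the function `count_buns` and the example interfaces are hypothetical illustrations, not how Foldit or Rosetta evaluates a design.

```python
def count_buns(side_a: dict[int, str], side_b: dict[int, str]) -> int:
    """Count polar groups left without a partner when two sides meet.

    Toy rule (an assumption): a donor "D" facing an acceptor "A" at the
    same position is satisfied; any other polar group is counted as BUNS.
    """
    complementary = {("D", "A"), ("A", "D")}
    unsatisfied = 0
    for pos in side_a.keys() | side_b.keys():
        pair = (side_a.get(pos), side_b.get(pos))
        if pair in complementary:
            continue  # this position forms an H-bond
        # Every polar group present at this position is buried and unsatisfied.
        unsatisfied += sum(role is not None for role in pair)
    return unsatisfied


# Two made-up interfaces, each designed so that its own halves match.
x_a, x_b = {1: "D", 2: "A", 3: "D"}, {1: "A", 2: "D", 3: "A"}
y_a, y_b = {1: "A", 4: "D"}, {1: "D", 4: "A"}

print(count_buns(x_a, x_b))  # correct partners
print(count_buns(x_a, y_b))  # wrong partners
```

In this toy, the correctly paired halves leave no unsatisfied groups, while the mismatched halves leave several. That asymmetry is what the H-bond network strategy relies on.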
Assemblies with increased complexity
On our path to increasingly complicated hetero-oligomers, we are now trying to make a hetero-tetramer (“four different parts”). The more chains there are in an assembly, the harder it is to make them bind in the correct arrangement, because every added interface creates many more chances for off-target assemblies. We believe we will need more than H-bond networks to prevent off-target assemblies.
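One simple way to get a feel for this growth is to count pairings between interface halves. The sketch below assumes each interface has an "A" half and a "B" half, that only A-to-B contacts are possible, and that a pairing is on-target only when both halves come from the same interface. It counts pairwise mismatches only, not every possible mis-assembled complex, and the function name `pairing_counts` is hypothetical.

```python
from itertools import product


def pairing_counts(n_interfaces: int) -> tuple[int, int]:
    """Return (on_target, off_target) counts of A-half/B-half pairings."""
    on_target = 0
    off_target = 0
    # Try every A half against every B half.
    for a, b in product(range(n_interfaces), repeat=2):
        if a == b:
            on_target += 1   # halves designed for each other
        else:
            off_target += 1  # halves from different interfaces
    return on_target, off_target


for n in (2, 3, 4):
    print(n, pairing_counts(n))
```

Under these assumptions, a trimer has 6 wrong pairings to avoid and a tetramer has 12, while the number of intended pairings only grows from 3 to 4. Each of those wrong pairings needs to be discouraged by the design.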
One very good way of further diversifying the interfaces is to make them physically different by changing the protein backbone. As bkoep mentioned in Lab Report #17, scientists’ designs use mostly alpha helices at the interfaces, because we have a good handle on how to generate alpha helical backbones, and we have a good understanding of how to design sequences for alpha helices. Unfortunately, we’ve observed that interfaces made entirely out of alpha helices are prone to off-target assembly with similar-looking alpha helical interfaces. If we want to get a large number of specific interfaces, it will not be enough to simply make small tweaks to our starting alpha helices.
Two-sided interface design puzzles
This is where Foldit comes in. Foldit players are extremely good at making diverse protein backbones, so we’re challenging you to redesign the interfaces of our homo-tetramers, in a series of Two-sided Interface Design puzzles!
The most recent and most promising starting structure is a homo-tetramer called RC4_20, a circular tandem repeat protein (known internally as a “donut”). This protein was originally designed at the IPD by Phil Bradley, PhD [2], and later modified by Alexis Courbet, PhD, to have a larger interface, making its tetrameric form more stable and also easier to see under an electron microscope.
Figure 2. This homo-tetramer "donut" protein was designed by IPD scientists. We can use it as a starting point and create a hetero-tetramer by redesigning the interface between chains. Highlighted in red are the frozen stubs that were provided in the starting structure of Puzzle 1959.
To work within the constraints of key #1 (making sure that different interfaces can be copied and pasted onto the same starting structure), our Foldit puzzles start with a section that is frozen and a section that you can refold. Keep in mind that this is only a small part of the larger assembly (Figure 2). To convert the homo-oligomer RC4_20 into a hetero-oligomer, we simply swap out the original interface for the different interfaces designed in Foldit.
Initial design results
We’ve been happy to see many players attempting to install beta sheets at the interface. This is exactly the sort of diversity we were looking for! The design below, from an anonymous player, has a beta sheet with the outer edge pointing out to surrounding water and the inner edge placed against the interface. The polar atoms on the inner edge can participate in an H-bond network if they are satisfied by polar residues from chain B. Unfortunately, this design didn’t satisfy all of these backbone polar atoms. In future puzzles, we would love to see players use the edge strand of a beta sheet as part of an H-bond network!
Figure 3. A creative design with a beta sheet at the interface. A variety of backbone shapes will help us create well-behaved assemblies that can avoid off-target binding. The highlighted inner edge strand has polar atoms along the backbone that are buried at the interface. Those polar atoms would be more stable as part of an H-bond network.
Another feature we like to see is tight packing against both starter alpha helices on chain A. While it’s hard to tell from the truncated starting structure we gave you, making good contacts with both of these alpha helices is important to maintain structural rigidity. If your refolded interface only contacts one of these alpha helices, that alpha helix could act like a hinge and cause your interface to swing around and not be in the correct position to bind the other half. The solution below from LociOiling has good packing between both alpha helices of the starting stub and an alpha helix backing up the interface. Also, we like the placement of a key tryptophan buried at the interface. It would be really great to see an H-bond network sprout out of that sidechain nitrogen in future designs!
Figure 4. A design with great sidechain packing by LociOiling. On the left, the highlighted alpha helix packs closely against both alpha helices on the chain A starting stub, and also backs up the alpha helices that actually contact chain B. On the right, a TRP sidechain is buried at the interface for tight binding, but its polar N atom is unsatisfied and would like to make an H-bond.
In the most recent Two-sided Interface Design puzzles, we’ve asked Foldit players to install H-bond networks across the interface, in addition to refolding the backbone, to make the interfaces as different as possible and avoid off-target assembly. We’ve also asked for one disulfide bond across the interface, because we know that a disulfide is important for encouraging assembly of this particular RC4_20 donut.
In solutions from Puzzle 1959, the disulfides you’ve designed look really good, but the H-bond networks need a little work. We’ve noticed that the H-bond networks are usually not very big, and often are at the protein surface, near the water that surrounds the protein. Instead, H-bond networks should be buried in the core of the interface where they can be most effective at discouraging off-target assembly. Remember, the polar network residues will form BUNS if they try to bind against a chain with non-matching polar residues. If you want some inspiration for making effective H-bond networks, you can review some of the published H-bond networks designed by IPD scientists [3].
Thank you all for your solutions to these puzzles! We’re accumulating a list of designs that we plan to test in the wet lab in the coming months. But there’s still a lot of room, so keep up the good work! Check out the latest Puzzle 1963: Two-sided Interface Design now!
1. Goodsell, D. S. & Olson, A. J. Structural symmetry and protein function. Annu. Rev. Biophys. Biomol. Struct. 29, 105–153 (2000)
2. Correnti, C. E. et al. Engineering and functionalization of large circular tandem repeat protein nanoparticles. Nat. Struct. Mol. Biol. 27, 342–350 (2020)
3. Boyken, S. E. et al. De novo design of protein homo-oligomers with modular hydrogen-bond network-mediated specificity. Science 352, 680–687 (2016)