Foldit design update - Part 2

This is an extension of last week's protein design update, in which we discussed recent improvements in backbone quality and showcased a collection of player designs that were brought into the wet lab. Our analysis is ongoing, and some of those designs may still yield results. But a few exceptional designs are already showing promise, and we thought those results warranted a separate, more focused analysis here.

Folded Proteins

Below are four proteins designed by Foldit players, then expressed and purified in the Baker lab (more here). Experimental data from circular dichroism (CD) spectroscopy suggest that these proteins are stable and well-folded (figures explained in the key below).

Note that our testing is not yet complete—we still do not know whether these proteins are folding into their intended conformation or some other, alternative structure. For that we will need atomic-resolution data from x-ray crystallography or other methods.

Susume (Anthropic Dreams) — Puzzle 1248

Waya, Galaxie, Susume (Anthropic Dreams) — Puzzle 1297

fiendish_ghoul - Puzzle 1299

fiendish_ghoul - Puzzle 1299

(A) Cartoon diagram of each Foldit player-designed protein. All of these designs feature α-helices packed against a single β-sheet, but no two designs share the same fold.

(B) Rosetta@home folding predictions (described here). Rosetta@home was able to successfully predict the structure of each design based on its amino acid sequence. The "funneled" cloud of red points reaching toward the lower-left corner of each plot indicates that Rosetta is able to reconstruct the intended fold from sequence information alone, and that the intended fold is furthermore predicted to be the most stable.

(C) The circular dichroism (CD) spectrum of purified protein shows that each protein contains significant secondary structure. This characteristic CD signature—with a broad, flat trough between 208 and 222 nm—suggests that both α-helices and β-sheets are present at 25°C (blue trace). We see that most of this structure is retained at 95°C (red trace), and that lost structure can be recovered upon cooling back to 25°C (green trace).

(D) Each protein is fairly thermostable, retaining a strong CD signal at 220 nm as it is heated from 25°C to 95°C.

(E) These proteins are unfolded by titration of concentrated guanidinium chloride (a chaotropic agent). The steep, sigmoidal transition from the folded to the unfolded state suggests that each of these proteins folds via a cooperative, two-state mechanism.
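As a rough illustration of why a cooperative, two-state transition looks steep and sigmoidal, the sketch below computes the fraction of unfolded protein as a function of denaturant concentration. It assumes the commonly used linear extrapolation model (unfolding free energy falls linearly with denaturant concentration), which is not something stated in this post, and the parameter values are hypothetical rather than fitted to any of the designs shown here.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)

def fraction_unfolded(denaturant_molar, dg_water_kj, m_value_kj, temp_k=298.15):
    """Fraction unfolded for a two-state system (assumed linear extrapolation model)."""
    # Assumption: dG_unfold = dG_water - m * [denaturant]
    dg = dg_water_kj - m_value_kj * denaturant_molar
    # Equilibrium constant K = [unfolded]/[folded] = exp(-dG / RT)
    k_unfold = math.exp(-dg / (R * temp_k))
    return k_unfold / (1.0 + k_unfold)

# Hypothetical parameters, NOT measured values for any Foldit design
DG_WATER = 20.0  # kJ/mol, stability in the absence of denaturant
M_VALUE = 8.0    # kJ/(mol*M), sensitivity to denaturant

for conc in [0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 5.0]:
    frac = fraction_unfolded(conc, DG_WATER, M_VALUE)
    print(f"{conc:.1f} M denaturant: fraction unfolded = {frac:.3f}")
```

With these made-up numbers the midpoint of the transition falls at 2.5 M, where the unfolding free energy is zero and half of the molecules are unfolded. A larger m-value makes the transition steeper.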

Crystallization Trials

The next step is to try to crystallize these proteins. Under very specific conditions, a concentrated sample of purified protein will self-organize into a highly ordered crystal lattice. Protein crystals are useful to us because they comprise a large number (think trillions) of identical protein molecules all locked into the same orientation. If we aim a focused beam of x-rays at a protein crystal, electrons in the ordered crystal lattice will diffract the x-rays to produce an ordered diffraction pattern. From this diffraction pattern we can infer the distribution of ordered electrons in the crystal at extremely high resolution, in the form of an electron density map, thus revealing the atomic structure of the crystallized protein.

Unfortunately, protein crystallization is a delicate process, and is very sensitive to subtle changes in conditions. Different proteins require wildly different conditions for crystallization, and we have no way to predict which conditions will allow a particular protein to crystallize. Protein concentration, buffer, pH, salts, ligands, precipitants, temperature, and time can all be critical factors for crystal growth. Typically, a crystallographer will set up high-throughput crystal screens, incubating concentrated protein in large arrays with hundreds of different conditions, and monitor them over periods of weeks or months.
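To give a feel for how quickly a screen grows to hundreds of conditions, here is a minimal sketch that enumerates every combination of a few factors. The factor names and values are hypothetical placeholders chosen for illustration; they are not the conditions used for these proteins, and real screens are usually assembled from commercial kits rather than a full grid.

```python
from itertools import product

# Hypothetical screen axes (illustrative values only)
ph_values = [5.5, 6.5, 7.5, 8.5]
salts = ["NaCl", "ammonium sulfate", "MgCl2"]
precipitants = ["PEG 3350", "PEG 8000", "MPD"]
precipitant_percent = [10, 20, 30]
temperatures_c = [4, 20]

# One dictionary per well: every combination of the factors above
conditions = [
    {
        "pH": ph,
        "salt": salt,
        "precipitant": precip,
        "precipitant_percent": pct,
        "temperature_C": temp,
    }
    for ph, salt, precip, pct, temp in product(
        ph_values, salts, precipitants, precipitant_percent, temperatures_c
    )
]

print(len(conditions))  # 4 * 3 * 3 * 3 * 2 = 216 conditions
print(conditions[0])
```

Even this small grid of five factors already gives 216 wells, before varying protein concentration, buffer, or ligands.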

Ultimately, protein crystallization is a lottery. Many proteins are never successfully crystallized. But, with a little luck, we'll be able to grow crystals of some of these proteins, collect x-ray diffraction data, and determine their full structure.

(Posted by bkoep | Wed, 03/01/2017 - 17:45 | 9 comments)
Joined: 03/30/2013
Groups: Go Science

Thank you for the blog post. I thought you did a very nice job discussing crystallization and diffraction. I like the analogy of crystallization as a lottery. Sad but true.

But isn't that too pessimistic in this case? With well folded small proteins, crystals are a lot more likely. So do show us the pretty pictures of crystals when you get them. I love looking at crystal pictures.

Just me thinking out loud: I would think these proteins would also be amenable to NMR. NMR can be used like CD to assess folded-ness. But these proteins are small enough that NMR could be used to solve the structure too, in case you don't get crystals.

bkoep's picture
Joined: 11/15/2012
Groups: None
Crystallization difficulties

I wish I shared your optimism! Unfortunately, we've seen that a lot of Rosetta designs do not crystallize as readily as you might expect. All attempts to explain this are pretty hand-wavy, but most will point at the protein surface as unfavorable for making crystal contacts.

You can see that these proteins are covered in LYS, ARG, and GLU residues. These are all charged residues, and some people suspect that electrostatic forces between them disrupt crystal formation (though I think it might not be such a problem in a well-salted buffer). Perhaps a bigger concern is the side-chain entropy of these residues. They all have long, floppy side chains with multiple degrees of freedom; locking them into a single conformation at a crystal contact incurs a greater entropic cost than it would for a short, rigid side chain, which disfavors crystal formation.

I do agree with you that these proteins should be good candidates for NMR structure determination! If we don't see any crystals, that will likely be our next step.

Susume's picture
Joined: 10/02/2011
Reference subscore adjustment?

It seems to me GLU shows up all over Foldit designs in part because it has such a high reference subscore (15.5). ARG and LYS do not have that bonus (0 and 2.9 reference, respectively), but they still show up all over the surface in part because they are heavily rewarded for bonding with the GLU residues that are there. Could Foldit designs (maybe Rosetta designs in general) show improved crystallizability without too much loss of stability if the reference bonus of GLU were reduced and the floppy sidechains became less common on the surface?

I have previously compared the proportion of GLU sidechains in my Foldit designs to those in the successful Rosetta designs I found in the PDB, and they both seem quite high (17.9% of residues in my designs are GLU, 12.9% of residues in Rosetta designs are GLU). It would be interesting to know how the proportions of sidechains in both Foldit and Rosetta designs compare to natural small alpha-beta proteins. Presumably there is a reason for that high reference subscore, but maybe it has a downside as well. Maybe it could be lowered in design puzzles where it may interfere with crystal formation and left alone in prediction puzzles where the original reason for it still holds?

bkoep's picture
Joined: 11/15/2012
Groups: None

I agree that tempering the GLU reference energy might allow more residue types at the surface. A Rosetta developer recently wrote an "AA composition" score term that can be introduced ad hoc to impose explicit limits on the number of each residue type allowed (e.g. "allow no more than 5 GLU residues"). We might start to experiment with that score term in future.

Also, we don't want to get ahead of ourselves. While possibly problematic for crystallography, the charge-peppered surfaces are almost certainly contributing to the stability of these proteins in solution. First we need to be certain that Foldit design is robust for stable, soluble proteins; only then will we start worrying about how well Foldit designs crystallize.

As an aside, the reference energies are indeed optimized against a design benchmark, of sorts (described in this paper). Note that reference energies actually play no role in structure prediction problems, where the sequence is fixed.
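The composition comparison and the "no more than 5 GLU" style of limit discussed in this thread can be sketched in a few lines of plain Python. This is only an illustration of the idea: it is not the Rosetta score term and does not use any Rosetta API, the function names are hypothetical, and the sequence is made up rather than taken from a player design.

```python
from collections import Counter

def composition(sequence):
    """Return the fraction of each one-letter residue type in a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def over_limit(sequence, max_counts):
    """Return how many residues of each type exceed a per-type cap."""
    counts = Counter(sequence.upper())
    return {
        aa: counts[aa] - limit
        for aa, limit in max_counts.items()
        if counts[aa] > limit
    }

# Made-up sequence for illustration (E = GLU, K = LYS, R = ARG)
seq = "MKEELLKKAEELARRSGEEVRVEV"

fractions = composition(seq)
print(f"GLU fraction: {fractions['E']:.1%}")

# Hypothetical cap mirroring "allow no more than 5 GLU residues"
print(over_limit(seq, {"E": 5, "K": 5}))
```

Running the same `composition` function over a set of natural small alpha-beta proteins would give the kind of baseline comparison asked about above.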

Joined: 06/06/2013
Groups: Gargleblasters
seat belts or ankle straps

Last time the examples were all helices attached to the protein at only one end. This time some have connections at both ends, like fiendish_ghoul's design in the third example. I've done one like that. It's more stable, but the packing is obviously tighter, so the side chains will be different.
Can you let us know if you learn which is better?

bkoep's picture
Joined: 11/15/2012
Groups: None
No preference

Both are valid. Some people think that folds with terminal sheets are less stable; on the other hand, there are many natural proteins with ferredoxin-like folds similar to the one you point out.

If we find that either scenario is problematic, we'll let you know. However, we love to see diversity in players' designs, so more than anything I would encourage Foldit players to try anything and everything!

Joined: 09/29/2016
Groups: Gargleblasters

"However, we love to see diversity in players' designs, so more than anything I would encourage Foldit players to try anything and everything!"

If you've never seen any of mine, then I say now's as good a time as any, as I strive to make all of mine as different as possible. lol

Unfortunately, the one I was most proud of (since it scored well enough), which I dubbed a "Macro Helix", turned out to be a real structure: I found it on Wikipedia, and it is a Beta-Helix (the triangle style, not the ladder style).

So I'll be sure to keep doing what I'm doing! :D

bkoep's picture
Joined: 11/15/2012
Groups: None
Yes, please!

And don't forget to share your favorite designs using the Share with Scientist button! We do pay special attention to these, even if they are not high-scoring!

Joined: 09/24/2012
Groups: Go Science
Thanks again, bkoep!

Very interesting and well explained.
