Since our last blog post, we've carried out an x-ray diffraction experiment with one of our protein crystals. We were lucky that the protein crystal yielded high quality diffraction data, and from this data we were able to solve the first-ever crystal structure of a protein designed by Foldit players—a near-exact match to the designed structure! Below we explain a bit more about x-ray diffraction. In a later post, we'll examine the final structure in more detail.
First, the protein crystal is harvested from the drop using a small loop of nylon, about 0.3 mm across. Protein crystals are often very fragile, so looping the crystal requires a steady hand (i.e. optimal coffee dosage). Even in the loop, the crystal is still immersed in an aqueous solution, with the surface tension of the water helping to keep the crystal in the loop. The loop is rapidly submerged in liquid nitrogen, at a temperature of about -200°C, which quenches most of the thermal motion of molecules in the crystal.
Once frozen, our looped crystal is mounted on a robotic arm that positions the loop in the path of an x-ray beam. During x-ray exposure, the crystal is kept under a steady stream of cold nitrogen gas to limit temperature increases in the crystal. X-rays have a high energy, and a protein crystal can only endure so much exposure to x-rays before it starts to degrade. The protein lattice could disintegrate from the increased thermal motion of individual protein molecules, or else the x-rays could trigger chemical reactions within the protein, distorting its structure.
X-rays are simply a type of electromagnetic radiation with a very short wavelength, in this case about one angstrom. In an x-ray diffraction experiment, it's important that all radiation has exactly the same wavelength and is focused into a very narrow beam. With our crystal mounted in the path of the x-ray beam, an x-ray detector is positioned behind the crystal, and measures x-rays after they strike the crystal and are diffracted by electrons of the protein molecules within. Because of the regular arrangement of atoms in the protein crystal, diffracted x-rays undergo constructive interference in particular directions. This occurs when a set of equivalent, evenly spaced "slices" of the crystal is oriented so that x-rays scattered from neighboring slices travel paths that differ by a whole number of wavelengths. Wherever constructive interference occurs, the detector registers an especially intense signal, shown as a dark spot on the image below. Taken together, these spots make up a diffraction pattern.
Above is an x-ray diffraction pattern from a protein crystal. In the inset at the right, we can see that some spots seem to have duplicates which are slightly offset. This indicates that there are actually two identical crystals in the path of the x-ray, lying in slightly different orientations. Most likely, the crystal cracked in two during freezing. Fortunately, the image-processing software we use is sophisticated enough to correct for this issue.
The spacing and position of spots is governed by the size and shape of the crystal’s unit cell, the repeating unit that makes up the crystal. The intensity of each spot is determined by the distribution of electrons within the unit cell (i.e. the positions of atoms in the protein). Every atom of the unit cell contributes to each spot in the diffraction pattern. If you could change the electron density around just one atom of your crystallized protein, this would alter the intensity of every spot in the diffraction pattern!
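To see why every atom touches every spot, here is a minimal sketch of the textbook structure-factor sum, in which each reflection (h, k, l) adds up one scattered wave per atom in the unit cell. The three-atom "unit cell", its fractional coordinates, and the use of plain electron counts as scattering weights are made-up simplifications for illustration, not values from our crystal.

```python
import cmath

def structure_factor(hkl, atoms):
    """Sum one scattered wave per atom:
    F(hkl) = sum_j f_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j))."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

def intensities(atoms, reflections):
    """Spot intensity is proportional to |F(hkl)|^2."""
    return {hkl: abs(structure_factor(hkl, atoms)) ** 2 for hkl in reflections}

# Toy unit cell: (electron count, fractional coordinates). Hypothetical values.
atoms = [(6.0, (0.10, 0.20, 0.30)),
         (7.0, (0.40, 0.15, 0.85)),
         (16.0, (0.75, 0.60, 0.25))]

# Same cell, but the third atom now carries 34 electrons instead of 16.
modified = atoms[:2] + [(34.0, atoms[2][1])]

# A small block of reflections, skipping (0, 0, 0), which lies in the direct beam.
reflections = [(h, k, l) for h in range(3) for k in range(3) for l in range(3)
               if (h, k, l) != (0, 0, 0)]

before = intensities(atoms, reflections)
after = intensities(modified, reflections)
changed = sum(1 for hkl in reflections if abs(after[hkl] - before[hkl]) > 1e-9)
print(f"{changed} of {len(reflections)} reflections changed intensity")
```

In this toy cell, changing the electron count of a single atom shifts the intensity of all 26 reflections, because that atom's wave is a term in every sum.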
Notice that spots farther from the center of the detector tend to be less intense. More distant spots contain higher resolution data about the electron density of the protein. If we adjust the contrast of this image, we can discern spots close to the edge of the detector. This protein diffracts x-rays to a resolution limit of 1.20 Å! In an electron density map derived from these diffraction patterns, we should be able to distinguish the positions of individual atoms.
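The link between a spot's distance from the center and its resolution follows from Bragg's law, λ = 2d·sin(θ), where d is the spacing between crystal planes and the diffracted ray is deflected by 2θ from the direct beam. The sketch below assumes a flat detector perpendicular to the beam, a wavelength of 1 Å, and a hypothetical crystal-to-detector distance of 150 mm; the real geometry of our experiment is not given here.

```python
import math

def bragg_angle(d_spacing, wavelength=1.0):
    """Bragg angle theta (radians) from lambda = 2 * d * sin(theta), first order."""
    return math.asin(wavelength / (2.0 * d_spacing))

def spot_radius(d_spacing, detector_distance, wavelength=1.0):
    """Distance of a spot from the beam center on a flat detector.
    The diffracted ray leaves the crystal at 2*theta from the direct beam."""
    return detector_distance * math.tan(2.0 * bragg_angle(d_spacing, wavelength))

def resolution_at_radius(radius, detector_distance, wavelength=1.0):
    """Inverse of spot_radius: the d-spacing recorded at a given radius."""
    two_theta = math.atan2(radius, detector_distance)
    return wavelength / (2.0 * math.sin(two_theta / 2.0))

DISTANCE_MM = 150.0  # hypothetical crystal-to-detector distance

for d in (10.0, 3.0, 2.0, 1.2):
    r = spot_radius(d, DISTANCE_MM)
    print(f"d = {d:4.1f} A -> spot at {r:6.1f} mm from center")
```

Smaller spacings (finer detail) land farther from the center, which is why the faint spots near the detector edge carry the highest-resolution information.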
If the crystal is rotated relative to the x-ray beam, we observe a different diffraction pattern, as the new orientation produces constructive interference in different directions. We typically measure a new diffraction pattern at rotation intervals of 0.5 degrees, eventually rotating the crystal a total of 180 degrees (sometimes less for highly symmetric crystals) to collect a complete dataset. This dataset was collected with a state-of-the-art detector that can measure individual photons; collecting a full dataset takes no more than a few minutes. In the early days of protein crystallography, it could take a whole day to collect a complete dataset!
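As a quick back-of-the-envelope check, a 180-degree sweep in 0.5-degree steps comes to 360 images. The sketch below works that out; the 0.2-second exposure per image is a hypothetical figure chosen only to show how such a dataset can fit into a few minutes, not a setting from our experiment.

```python
import math

def frame_count(total_rotation_deg, step_deg):
    """Number of diffraction images needed to cover the rotation range.
    The small tolerance guards against floating-point round-off."""
    return math.ceil(total_rotation_deg / step_deg - 1e-9)

def collection_time_s(total_rotation_deg, step_deg, exposure_s):
    """Total exposure time, ignoring detector readout and motor overhead."""
    return frame_count(total_rotation_deg, step_deg) * exposure_s

frames = frame_count(180.0, 0.5)
seconds = collection_time_s(180.0, 0.5, exposure_s=0.2)  # hypothetical exposure
print(f"{frames} frames, about {seconds:.0f} s of exposure")
```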
The processing and interpretation of these x-ray diffraction patterns is a complex, technical procedure, and we won't go into it here. But suffice it to say, this x-ray diffraction data revealed a full, high-resolution crystal structure of this Foldit player-designed protein!
Congratulations to Waya, Galaxie, and Susume, who contributed to this solution in Puzzle 1297! All players should check out Puzzle 1384 to explore the refined electron density map from this data, and see if you can fold up the protein sequence into its crystal structure! We'll follow up later with a more detailed comparison of the designed model and the final crystal structure.
( Posted by bkoep 78 1232 | Tue, 05/30/2017 - 04:59 | 4 comments )
It's time for an update on Foldit protein design! If you recall, our last update showed that several Foldit player-designed proteins appear folded and stable in solution. However, we'd like to have crystal structures of these proteins to show that they are indeed folding into their intended folds. The first step in getting a crystal structure is getting a protein crystal. Here we take a closer look at the protein crystallization process.
Above is a 96-well crystallization tray. We use a robot to rapidly set up crystallization experiments with 96 different conditions per tray. For this protein we set up four trays, to test a total of 384 crystallization conditions.
Each “well” in the 96-well tray is actually divided into four distinct regions. In the upper right, a square reservoir holds the mother liquor. The mother liquor is typically an aqueous buffer with some salt and a high concentration of precipitant. The reservoir is accompanied by three circular drop wells, each of which contains a drop of our protein sample mixed with the mother liquor. In this tray, the three drop wells are used to test different drop ratios, with protein and mother liquor combined in a ratio of 1:1, 2:1, or 1:2.
Each of the 96 wells is sealed off from the air and from neighboring wells. However, within a well, the three drops share an atmosphere with the reservoir, so that the drops can equilibrate with the reservoir by vapor diffusion. Over time, water evaporates from the drops and condenses in the reservoir. As the drop volume decreases, the protein concentration in the drop gradually increases. Eventually, the protein concentration reaches a critical point and the protein crystallizes.
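A rough sketch of how far each drop ratio can concentrate the protein is given below. It rests on an idealized picture that is our assumption here, not a measurement: the protein stock contains no precipitant, only water leaves the drop, and the reservoir is large enough that its composition does not change, so the drop shrinks until its precipitant concentration matches the reservoir.

```python
def equilibrated_drop(protein_parts, liquor_parts, stock_conc=1.0):
    """Idealized vapor-diffusion endpoint for a drop mixed protein:liquor.

    Returns (initial protein conc, final/initial volume, final protein conc),
    with concentrations relative to the protein stock (stock_conc)."""
    total = protein_parts + liquor_parts
    # Mixing dilutes the protein and the precipitant alike.
    initial_conc = stock_conc * protein_parts / total
    precipitant_fraction = liquor_parts / total  # relative to the reservoir
    # Water leaves until the drop's precipitant matches the reservoir,
    # so the drop shrinks to this fraction of its starting volume.
    volume_ratio = precipitant_fraction
    final_conc = initial_conc / volume_ratio
    return initial_conc, volume_ratio, final_conc

for protein, liquor in ((1, 1), (2, 1), (1, 2)):
    start, shrink, end = equilibrated_drop(protein, liquor)
    print(f"{protein}:{liquor} drop: protein {start:.2f}x -> {end:.2f}x stock, "
          f"volume shrinks to {shrink:.2f} of original")
```

Under these assumptions the three ratios sample quite different endpoints (the 2:1 drop ends at twice the stock concentration, the 1:2 drop at half), which is one reason to set up several ratios per condition.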
In the drop above, we see several plate-like crystals radiating outward from a single origin. Most likely a small dust particle at the center served to “seed” the growth of all these crystals.
The crystals are not actually colored, per se, but exhibit birefringence—meaning that they refract light waves differently, depending on the orientation of the light waves with respect to the crystal lattice. When viewed through a microscope equipped with a light-polarizing filter, the birefringent crystals appear colored.
These crystals appear to be thin and plate-like, suggesting this particular crystal lattice extends readily in height and width, but less easily in depth. Sometimes, this is indicative of imperfections in the crystal packing, and can limit the quality of x-ray diffraction. To follow up, we’ll try to optimize the crystallization conditions by setting up a number of similar drops with slight alterations in the composition, in hopes that we get larger, more substantial crystals. However, there's a chance one of these crystals will diffract well enough to yield a crystal structure.
Once we have a nice, high-quality crystal that yields a good x-ray diffraction pattern, we can set about solving the crystal structure. A solved crystal structure will tell us definitively whether the protein folds up as the designer intended!
( Posted by bkoep 78 1232 | Sat, 04/15/2017 - 00:09 | 8 comments )
Foldit design update - Part 2
This is an extension of last week's protein design update, in which we discussed recent improvements in backbone quality and showcased a collection of player designs that were brought into the wet lab. Our analysis is ongoing, and some of those designs may still yield results. But a few exceptional designs are already showing promise, and we thought those results warranted a separate, more focused analysis here.
Below are four proteins designed by Foldit players, then expressed and purified in the Baker lab (more here). Experimental data from circular dichroism (CD) spectroscopy suggest that these proteins are stable and well-folded (figures explained in the key below).
Note that our testing is not yet complete—we still do not know whether these proteins are folding into their intended conformation or some other, alternative structure. For that we will need atomic-resolution data from x-ray crystallography or other methods.
Susume (Anthropic Dreams) — Puzzle 1248
Waya, Galaxie, Susume (Anthropic Dreams) — Puzzle 1297
fiendish_ghoul — Puzzle 1299
fiendish_ghoul — Puzzle 1299
(A) Cartoon diagram of each Foldit player-designed protein. All of these designs feature α-helices packed against a single β-sheet, but no two designs share the same fold.
(B) Rosetta@home folding predictions (described here). Rosetta@home was able to successfully predict the structure of each design based on its amino acid sequence. The "funneled" cloud of red points reaching toward the lower-left corner of each plot indicates that Rosetta is able to reconstruct the intended fold from sequence information alone, and that the intended fold is furthermore predicted to be the most stable.
(C) The circular dichroism (CD) spectrum of purified protein shows that each protein contains significant secondary structure. This characteristic CD signature—with a broad, flat trough between 208 and 222 nm—suggests that both α-helices and β-sheets are present at 25°C (blue trace). We see that most of this structure is retained at 95°C (red trace), and that lost structure can be recovered upon cooling back to 25°C (green trace).
(D) Each protein is fairly thermostable, retaining a strong CD signal at 220 nm as it is heated from 25°C to 95°C.
(E) These proteins are unfolded by titration of concentrated guanidinium chloride (a chaotropic agent). The steep, sigmoidal transition from the folded to the unfolded state suggests that each of these proteins folds via a cooperative, two-state mechanism.
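For readers curious why a cooperative, two-state protein gives a steep sigmoidal curve in panel (E), here is a minimal sketch using the standard linear extrapolation model, in which the free energy of unfolding falls linearly with denaturant concentration. The stability (5 kcal/mol) and m-value (2 kcal/mol per molar) are hypothetical numbers picked for illustration; they are not fits to the data shown.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def fraction_folded(denaturant_molar, dg_water=5.0, m_value=2.0, temp_k=298.15):
    """Two-state folded fraction at a given denaturant concentration.

    dG_unfold = dg_water - m_value * [denaturant]   (linear extrapolation model)
    K_unfold  = exp(-dG_unfold / RT)
    folded    = 1 / (1 + K_unfold)
    """
    dg_unfold = dg_water - m_value * denaturant_molar
    k_unfold = math.exp(-dg_unfold / (R_KCAL * temp_k))
    return 1.0 / (1.0 + k_unfold)

for conc in (0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 6.0):
    print(f"{conc:3.1f} M denaturant: {fraction_folded(conc):.3f} folded")
```

With only two populated states, nearly all of the change happens in a narrow window around the midpoint (here 2.5 M, where dG_unfold is zero), which is the steep transition described above.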
The next step is to try to crystallize these proteins. Under very specific conditions, a concentrated sample of purified protein will self-organize into a highly-ordered crystal lattice. Protein crystals are useful to us because they comprise a large number (think trillions) of identical protein molecules all locked into the same orientation. If we aim a focused beam of x-rays at a protein crystal, electrons in the ordered crystal lattice will diffract the x-rays to produce an ordered diffraction pattern. From this diffraction pattern we can infer the distribution of ordered electrons in the crystal at extremely high resolution, in the form of an electron density map, thus revealing the atomic structure of the crystallized protein.
Unfortunately, protein crystallization is a delicate process, and is very sensitive to subtle changes in conditions. Different proteins require wildly different conditions for crystallization, and we have no way to predict which conditions will allow a particular protein to crystallize. Protein concentration, buffer, pH, salts, ligands, precipitants, temperature, and time can all be critical factors for crystal growth. Typically, a crystallographer will set up high-throughput crystal screens, incubating concentrated protein in large arrays with hundreds of different conditions, and monitor them over periods of weeks or months.
Ultimately, protein crystallization is a lottery. Many proteins are never successfully crystallized. But, with a little luck, we'll be able to grow crystals of some of these proteins, collect x-ray diffraction data, and determine their full structure.
( Posted by bkoep 78 1232 | Wed, 03/01/2017 - 17:45 | 9 comments )
Foldit design update - Part 1
It's been a long time since our last update on Foldit protein design! Here we lay out some recent progress and highlight the latest improvements in proteins designed by Foldit players.
Local Backbone Quality
Unlike designed α-helical bundles, which Foldit players have mastered with relative ease, the design of α/β folds has proven to be more problematic. For some time, we've suspected that the crux of the problem lies in unfavorable local backbone conformations. In particular, we found that the α/β proteins designed by Foldit players seemed to have loops that are never observed in natural proteins.
The Ideal Loop Filter, which was introduced last June, has helped Foldit designs remarkably. And in subsequent updates spanning the last several months we've seen further improvement in the backbone quality of Foldit-designed proteins. The box plot below shows the average local deviation from natural protein backbones in top-scoring Foldit designs. (Imagine breaking up each designed protein into 9-residue fragments, for each fragment searching natural proteins for a fragment with a similar backbone, and then measuring the RMSD to the closest match. If every backbone fragment of a design has a close match in a natural protein, that design should have a low mean RMSD; if there are regions of the design that have an unusual backbone, the design will have a higher mean RMSD.)
You can see that backbone quality in Foldit designs improved significantly after imposing the Ideal Loop Filter; disabling Rebuild; adjusting the IdealizeSS torsions; and introducing the Blueprint Panel. The dotted line marks a reference value from successful Baker lab designs; all designed proteins from Koga et al. fall below that line. In the latest design puzzles with the Blueprint Panel, we see that most high-ranking Foldit designs also fall below that line.
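The fragment comparison described above can be sketched in a few lines. To stay self-contained, this version swaps in a simpler measure than the one we describe: instead of superimposing fragments and computing RMSD, it compares the internal Cα-Cα distances of two fragments (a distance-matrix RMSD, which needs no superposition but cannot tell mirror images apart). The helix and strand geometries and the one-protein "library" are toy stand-ins, not real structures.

```python
import math

def fragments(coords, size=9):
    """All overlapping windows of `size` consecutive residues."""
    return [coords[i:i + size] for i in range(len(coords) - size + 1)]

def distance_rmsd(frag_a, frag_b):
    """RMS difference between the two fragments' internal pairwise distances."""
    diffs = [(math.dist(frag_a[i], frag_a[j]) - math.dist(frag_b[i], frag_b[j])) ** 2
             for i in range(len(frag_a)) for j in range(i + 1, len(frag_a))]
    return math.sqrt(sum(diffs) / len(diffs))

def mean_closest_match(design, library, size=9):
    """For each design fragment, find its closest library fragment; average the scores."""
    library_frags = [f for protein in library for f in fragments(protein, size)]
    best = [min(distance_rmsd(f, g) for g in library_frags)
            for f in fragments(design, size)]
    return sum(best) / len(best)

def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Approximate C-alpha trace of an alpha-helix (toy geometry)."""
    return [(radius * math.cos(math.radians(i * turn_deg)),
             radius * math.sin(math.radians(i * turn_deg)),
             i * rise) for i in range(n)]

def straight_chain(n, spacing=3.3):
    """A perfectly straight chain, standing in for an unusual backbone."""
    return [(i * spacing, 0.0, 0.0) for i in range(n)]

library = [ideal_helix(20)]                      # toy "natural protein" library
helical_design = [(x + 5.0, y - 3.0, z + 2.0)    # same shape, moved in space
                  for x, y, z in ideal_helix(12)]
odd_design = straight_chain(12)

print(f"helical design: {mean_closest_match(helical_design, library):.2f} A")
print(f"odd design:     {mean_closest_match(odd_design, library):.2f} A")
```

A design whose every fragment resembles something in the library scores near zero, while a design with backbone shapes the library never contains scores much higher, which is the contrast the box plot summarizes.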
Rosetta@home Folding Funnels
The improvement in Foldit backbones is reflected in other types of analysis. With the improved backbones, Rosetta@home is better able to predict the structure of Foldit designs from their amino acid sequences (explained here).
Below is a set of 14 Foldit player designs that were successfully folded by Rosetta@home—all but one originate from puzzles using the Ideal Loop Filter. The strong "funnel" shape of each plot indicates not only that Rosetta is able to sample the intended fold (note the numerous red points with RMSD < 2 Å), but also that Rosetta predicts the intended structure to be the most stable. Compare these folding funnels to those of earlier α/β designs.
mimi, Mark- (Contenders) — Puzzle 1245
Bletchley Park, Mark- (Contenders) — Puzzle 1248
tokens, Galaxie (Anthropic Dreams) — Puzzle 1251
tokens, Galaxie (Anthropic Dreams) — Puzzle 1257
tokens (Anthropic Dreams) — Puzzle 1257
dcrwheeler — Puzzle 1263
fiendish_ghoul — Puzzle 1285
gitwut (Contenders) — Puzzle 1290
Bletchley Park, Cyberkashi, Mark- (Contenders) — Puzzle 1294
Hollinas, Bruno Kestemont, Scopper (Go Science) — Puzzle 1294
tokens (Anthropic Dreams) — Puzzle 1294
fiendish_ghoul — Puzzle 1297
fiendish_ghoul — Puzzle 1299
retiredmichael (Beta Folders) — Puzzle 1299
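One simple way to put the "funnel" idea into code is to ask whether the lowest-energy predictions are also the ones closest to the designed structure. The check below is a rough sketch of that idea, not the analysis we actually run: the 2 Å cutoff comes from the description above, but the choice of looking at the five lowest-energy points, and all of the (RMSD, energy) values, are made up for illustration.

```python
def looks_funneled(points, rmsd_cutoff=2.0, top_n=5):
    """True if the `top_n` lowest-energy predictions all lie within
    `rmsd_cutoff` angstroms of the designed structure.

    `points` is a list of (rmsd_to_design, energy) pairs, one per prediction."""
    lowest_energy = sorted(points, key=lambda p: p[1])[:top_n]
    return all(rmsd < rmsd_cutoff for rmsd, _energy in lowest_energy)

# Made-up predictions: energy drops steadily as RMSD approaches zero.
funneled = [(0.8, -250), (1.1, -248), (1.2, -247), (1.5, -245), (1.9, -241),
            (3.0, -232), (4.0, -225), (6.5, -210), (9.0, -200)]

# Made-up predictions: the lowest energies sit far from the design.
not_funneled = [(0.9, -220), (1.4, -222), (1.8, -225), (3.1, -230),
                (5.0, -240), (6.0, -238), (7.5, -245), (8.2, -243)]

print("funneled set:    ", looks_funneled(funneled))
print("not funneled set:", looks_funneled(not_funneled))
```

The second set still contains points with RMSD under 2 Å, so the intended fold was sampled, but lower-energy alternatives exist elsewhere; that is the difference between sampling the fold and predicting it to be the most stable.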
Each of the designs above has been reverse-translated into synthetic DNA, which is inserted into E. coli and expressed in our lab for further testing (read more about lab testing here). However, in the list above I've omitted four particularly promising designs that are already showing encouraging results. Next week we'll post a follow-up with more information about those designs, alongside some brand new experimental data.
A big thank you is due to all the Foldit players who have been designing proteins every week! We're learning a lot about protein design from your contributions, and credit goes to all participants, not just to those players acknowledged above. We appreciate your patience and persistence as we experiment with new tools and filters. Keep up the great folding!
( Posted by bkoep 78 1232 | Tue, 02/28/2017 - 19:34 | 7 comments )
Sci Chat Roundup (Developer Edition)
We received a number of development-related questions during our last open call for science chat questions, but weren’t able to prioritize them during the busy chat itself. We still wanted to get these answered for everyone, so here they are.
Will a parallel programming language, such as CUDA or OpenCL, ever be used to optimize processing speed in Foldit? Source question
Most of the heavy-weight processing is done inside of Rosetta, so if and when support for such a language is added there, we can consider it. Aside from the technical problems, this also introduces potential social problems. If the benefits of using these platforms are meaningful, it could make a high-end graphics card a requirement for competing. While this is also true for CPUs, CPU performance does not vary as widely as GPU performance, and CPUs are not as expensive. We don't want a situation where your ability to compete is determined by your graphics card.
Will Foldit ever be open source? Currently, only Rosetta is open source, not Foldit. Source question
Unfortunately, this is unlikely. Please note that Rosetta itself is not open source, either.
The descriptions of remix and rebuild in jflat06's blog post raise some questions about how things work. First, if rebuild always works with a fragment length of 3, what does a rebuild of length 2 do? (A certain recipe defaults to starting with length 2.) Source question
In this case, rebuild is not actually inserting a fragment at all. It just places a cut, and then does a loop closure to close it again.
Could someone comment on the internal code:
LocalWiggleSequence in GUI versus structure.LocalWiggleSelected() in LUA:
Are they the same internally?
Or what would be the equivalent LUA function for this GUI one?
Extract from chat on 2017-01-13:
22:09 Wbertro TomTaylor5: converted "Wiggle by sheets" to LUA
22:09 TomTaylor5 Great
22:10 Wbertro but the code does not give the same results as the GUI one
22:10 Wbertro however it does not crash the client at the end
22:11 TomTaylor5 What would you like to hear first? The good news or the bad news?
22:11 Wbertro I think the GUI LocalWiggleSequence code is NOT the same as the LUA LocalWiggle code
22:11 TomTaylor5 That was probably the function I couldn't find.
22:12 Wbertro but the 1322 puzzle I tested them against is so sensitive that two runs don't give the same result twice
22:12 Wbertro I don't think it is a missing function
22:13 Wbertro it is simply DIFFERENT internal code, it seems
22:13 Wbertro I think I will ask on the next chat
They are using the same underlying procedure, but they differ in how they call it. The GUI script weirdly wiggles the residues sequentially, one at a time.
Thanks for the questions, everyone! We hope these additional answers help.
( Posted by jflat06 78 1060 | Thu, 02/23/2017 - 19:05 | 7 comments )