Newsletter July 3: Initial Reactions
(This post was originally sent out on July 3 to our mailing list. You can sign up for the mailing list here to receive weekly updates about Foldit, including tips and tricks and the top-scoring solutions to the week's puzzles. Don't forget to join our Discord as well to stay in the chat even when you're not folding!)
Dev Josh here with your weekly Foldit update.
This week we saw the introduction of the Reaction Design tool. The devs are working hard on polishing it up and making it more usable! As always, thanks for your feedback and bug reports. You can submit more feedback here.
Top Results from Puzzle 1856: Coronavirus Round 12
In this puzzle, I accidentally evo'ed on a broken developer build and got the top score. Whoops, sorry about that!
Here are some of the solutions at the top of the leaderboards. [A note from our scientists: the top of the leaderboards doesn't always mean the most scientifically useful. These highlights are not scientific feedback and are not officially endorsed as scientifically valid designs by the Foldit team.]
Join the mailing list to see what others are folding!
Recipe of the Week
This week's recipe is an oldie but a goodie from drjr. The recipe is called Reset, and it does what it says on the tin: reset to the best score, unfreeze the protein, remove all your bands, and set the CI to 1. A simple recipe, but a handy quality of life tool for when you just need to backtrack a little.
Player of the Week
Quick shoutout to argyrw for always being a friendly voice in chat! Say hi to her in global or veteran chat.
Today’s Master Folding Tips
Beginner: Are you still using Pull to draft your protein in the early game? Try making cutpoints and moving pieces around with the Move tool; it's so much easier! Don't forget to disable cutpoint bands in the Behavior tab, or they'll all come together again when you wiggle.
Intermediate: It can be really tempting mid-game to just switch to running recipes. But take some time to carefully inspect every acceptor and donor (the red and blue dots) to see what hydrogen bonds you can form, and manually mutate as needed. Not only will this lower your BUNS, but it'll help form a strong hbond network. The scientists love this, and your rank will too!
Expert: If you haven't already, read bkoep's blog on binder design metrics. DDG, SASA, and SC are going to become really important soon since we're looking to add objectives for them. So understanding and practicing these principles now can help you get a head start on the competition! Use the protein design sandbox to try out some ideas.
Have a tip to share or a recipe to recommend? Reply with your suggestions or make a wiki page for your ideas! Reaction Design doesn't have a page yet, so if you understand this tool, help out your community by writing about it! (Since writing this post, LociOiling has graciously created the page for Reaction Design puzzles.)
Until next time, happy folding!
( Posted by joshmiller 93 950 | Mon, 07/06/2020 - 18:11 | 3 comments )
Experiment results for IL6R binders
The results from our IL6R binder experiment are back! This experiment tested 100 Foldit designs from the first two rounds of our Coronavirus Anti-inflammatory puzzles, to see if any of them bind to the IL6R target.
In short, we did not see any successful binding from the Foldit designs. This is unfortunate, but we should not be too discouraged! Read on for more details about the experiment, and what these results mean for Foldit (hint: more binder design puzzles!).
This is a long blog post, broken into a few different sections. First, we’ll explain some background about DNA libraries and fluorescence-activated cell sorting techniques that were used for this experiment. Then we’ll go over the experiment results for protein expression and target binding. Finally, we’ll close out with some discussion about these results, and thoughts about what’s next for Foldit.
In order to test lots of proteins at once, we order a custom DNA library. A DNA library is a mixed pool containing thousands of different DNA genes that encode our designed proteins.
In this experiment, the library includes genes for 100 Foldit player designs and thousands of designs from IPD researchers. All of these designs are intended to bind to the IL6R target.
We insert this mixture of genes into a yeast culture so that each yeast cell gets a gene for just one binder design.
We insert our designed gene alongside a companion gene that encodes a yeast membrane protein. When these genes are decoded, our designed protein is linked to the companion membrane protein. The yeast cell exports these to the cell membrane, so that our designed binder is displayed on the outside of the yeast cell, but is still tethered to the companion protein embedded in the membrane.
Although we expect the yeast cell to have lots of binders on the surface, those binders should all be identical since they came from the same gene.
Figure 1. A DNA library is a mixture with DNA genes encoding thousands of protein designs. The genes are inserted into yeast cells so that the yeast cells can decode the genes and express the designed proteins. The yeast cells export the designed proteins to the cell membrane so that they are displayed on the yeast surface.
Now we have a culture with millions and millions of yeast cells, which are displaying our library with thousands of different binder designs. Each yeast cell displays only one of the designs from the library; but there may be many identical yeast cells that each display the same design.
Fluorescence-activated cell sorting (FACS)
Now that our designed protein is displayed on the yeast surface, we tag the protein with a fluorescent molecule that emits green light. The intensity of green fluorescence corresponds to the amount of protein displayed on the yeast surface (higher intensity = more protein).
In a separate tube, our target protein (IL6R) is free-floating in solution, and we tag it with a different fluorescent molecule that emits red light.
Then we mix the free-floating target IL6R with our yeast cells. We expect the target will stick to binders that are displayed on the yeast surface. However, if one of our designed proteins does not bind the target, then no target molecules will stick to that yeast cell.
Now we'd like to measure how much target is stuck to each yeast cell. We use a microfluidics device to pass yeast cells, one at a time, in front of a sensitive photometer, which measures the intensity of green and red fluorescence in two separate measurements.
These two measurements are typically plotted as a scatter plot. Each point represents one yeast cell, where the x-axis is intensity of green fluorescence (the amount of displayed protein), and the y-axis is intensity of red fluorescence (the amount of bound target).
Figure 2. (A) Green-tagged designs are tethered to the yeast surface, while red-tagged target is free-floating. If a design successfully binds the target, then a yeast cell will have high-intensity green and red fluorescence. (B) FACS scatter plot of yeast fluorescence measurements. Each point is a yeast cell, with green fluorescence (expression) on the x-axis, and red fluorescence (binding) on the y-axis. Points in the top right corner represent cells with both red and green fluorescence, indicating good expression and binding. (Note that the colors in the plot represent point density; for example, the patch of red near the center of the plot means there are lots of overlapping points in this region.)
After taking these measurements, the cell sorter can redirect each individual yeast cell to one of two buckets (“select” or “reject”), based on their fluorescence. Normally, we are looking for cells that have strong expression (intense green) and strong binding (intense red). So we want to select the top right quadrant of the scatter plot, and reject everything else.
After sorting, we end up with a “select” bucket of all the yeast cells displaying successful binders (these were cells with intense red and green fluorescence, indicating that they express well and stick to the target).
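To make the sorting rule concrete, here is a minimal sketch of a quadrant gate. The `Cell` structure, the `sort_cell` function, and the gate values are hypothetical, chosen only for illustration; in a real experiment the gates are drawn on the scatter plot rather than fixed in advance.

```python
from dataclasses import dataclass

# Hypothetical gate positions in arbitrary fluorescence units (not from the experiment).
GREEN_GATE = 1000.0
RED_GATE = 1000.0


@dataclass
class Cell:
    green: float  # expression signal (amount of displayed protein)
    red: float    # binding signal (amount of bound target)


def sort_cell(cell: Cell) -> str:
    """Send a cell to 'select' only if it falls in the top right quadrant."""
    if cell.green >= GREEN_GATE and cell.red >= RED_GATE:
        return "select"
    return "reject"


cells = [
    Cell(green=5000, red=4000),  # expresses and binds
    Cell(green=5000, red=50),    # expresses, no binding
    Cell(green=80, red=60),      # neither
    Cell(green=90, red=3000),    # red signal without expression
]
buckets = [sort_cell(c) for c in cells]
print(buckets)
```

Only the first cell lands in the "select" bucket; the other three are rejected because they miss at least one of the two gates.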
The last step of this experiment is to figure out which proteins were displayed on those cells. There were thousands of designs in our library; which ones stick to the target?
For this, we use DNA sequencing to read the genes of everything in our “select” bucket. If we read a gene encoding one of our designs, then we know that a yeast cell displaying our design was sorted into the select bucket, and so it must have had strong red and green fluorescence.
The final output of our experiment is a list of genes that were found in the "select" bucket, and the number of times we read each gene. If our bucket contains multiple, identical yeast cells with the same gene, then we expect to see multiple reads of that gene.
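The counting step can be pictured with a small sketch. The design names and reads below are made up for illustration, and a real pipeline would first have to match raw DNA reads to designs, which is skipped here.

```python
from collections import Counter

# Hypothetical library and sequencing reads; each read is already labeled
# with the design it encodes (an assumption made to keep the example short).
library = ["design_A", "design_B", "design_C", "design_D"]
reads = ["design_A", "design_B", "design_A", "design_C", "design_A", "design_B"]

read_counts = Counter(reads)

# Report every design in the library, including those with zero reads.
counts = {design: read_counts[design] for design in library}
print(counts)
```

A design that never shows up in the "select" bucket, like `design_D` here, simply gets a count of zero.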
Below is a preview of the data from this experiment. You can download the data for all 100 Foldit designs here.
design_id      counts1  counts2  counts3  counts4  counts5  counts6  DDG      SASA      SC     BUNS
2009432_c0003  21       0        0        0        0        0        -26.908  946.664   0.600  9
2009432_c0004  57       3        0        0        0        0        -35.443  1198.221  0.669  8
2009432_c0006  29       0        3        0        0        0        -40.365  1386.322  0.647  10
2009432_c0007  17       0        5        0        1        0        -53.948  1635.076  0.679  15
2009432_c0009  67       0        0        0        0        0        -31.730  1032.899  0.665  6
2009432_c0010  94       0        0        0        0        0        -31.894  1267.798  0.672  10
2009432_c0011  57       0        0        0        0        0        -30.796  1122.379  0.553  9
2009432_c0012  111      1        0        0        0        0        -37.067  1340.479  0.641  10
2009432_c0014  5        0        0        0        0        0        -44.323  1378.069  0.554  13
2009432_c0016  16       0        0        0        0        0        -39.257  1460.892  0.649  10
...
In the table above, you can see that each design has six “counts” columns. These correspond to six different FACS experiments with the IL6R binder library, which we'll describe below:
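- Experiment #1: Expression
- Experiment #2: Binding at 1000 nM
- Experiment #3: Binding at 100 nM
- Experiment #4: Binding at 10 nM
- Experiment #5: Binding at 1 nM
- Experiment #6: Binding at 0.1 nM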
Sorting for expression
In experiment #1, we try to measure how well the yeast can express and display our designed proteins. We don’t mix the target IL6R protein with our yeast and we don’t measure red fluorescence for binding. We only select yeast with strong green fluorescence, collecting cells that have lots of designed protein displayed on their surface.
The expression experiment is a helpful control for the later binding experiments, but it can also tell us something about how well our proteins behave. Stable, well folded proteins are easily displayed by the yeast, and these yeast will have strong green fluorescence. In contrast, unstable, poorly folded proteins are less likely to be displayed, and will show weaker fluorescence.
For many of the Foldit designs, the sequencing counts from experiment #1 are a little low. The median expression count for a design in this entire library was about 50, and only a third of the Foldit designs met this threshold. This suggests that some of these protein designs are not folding very well.
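As a quick illustration of this comparison, the sketch below applies the approximate library median of 50 to the expression counts (counts1) of the ten preview designs listed earlier in this post. The variable names are hypothetical, and the ten preview rows are not a representative sample of all 100 Foldit designs, so the fraction printed here differs from the one-third figure above.

```python
# counts1 (expression sort) for the ten preview designs shown earlier in this post.
expression_counts = {
    "2009432_c0003": 21,
    "2009432_c0004": 57,
    "2009432_c0006": 29,
    "2009432_c0007": 17,
    "2009432_c0009": 67,
    "2009432_c0010": 94,
    "2009432_c0011": 57,
    "2009432_c0012": 111,
    "2009432_c0014": 5,
    "2009432_c0016": 16,
}

LIBRARY_MEDIAN = 50  # approximate median expression count quoted in the post

meets_median = [d for d, c in expression_counts.items() if c >= LIBRARY_MEDIAN]
fraction = len(meets_median) / len(expression_counts)

print(meets_median)
print(f"{len(meets_median)} of {len(expression_counts)} preview designs ({fraction:.0%})")
```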
This is in line with our expectations. When Foldit players design monomer proteins from scratch, we see about a 50% success rate for good folding in the lab (50% is very good by protein design standards!). Binder design is harder than bare monomer design, because we generally have to sacrifice folding stability to optimize binding. So we should expect that <50% of binder designs will fold properly.
Sorting for binding
After selecting for expression, we can start selecting designs from our library based on binding.
This time we mix our yeast cells with red-tagged target IL6R that is free in solution. In the early experiments we mix with a high concentration of the target (1000 nM).
A binding measurement at high concentrations of target is a lenient test for binding. There are lots of target molecules floating around, so even weak binders are likely to have some target stuck to them.
After letting the yeast cells equilibrate with the target in solution, we pass the yeast through the cell sorter and measure the intensity of both red and green light. If a cell lights up for both expression and binding (in the top right quadrant), then we send it to the select bucket for sequencing.
Figure 3. FACS scatter plots. (A) The fluorescence measurements from expression experiment #1. We see two clusters of cells in the bottom left and bottom right quadrants, representing cells with poor expression and high expression, respectively. We select everything in the bottom right quadrant. Note that this experiment does not include any IL6R target, so there is no red fluorescent signal for binding (there are no cells in the top left or top right quadrants). (B) The fluorescence measurements from binding experiment #2. After incubating the yeast cells with target IL6R, we see that some cells have both green and red fluorescence (the top right quadrant). This indicates both strong expression and also strong binding.
We typically repeat the binding experiment, reducing the concentration of target each time. Binding measurements at low concentrations of target provide a stringent test for binding. At 0.1 nM target concentration, we are likely to see binder and target stuck together only if they bind very tightly.
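One way to see why lower concentrations are more stringent is the textbook model for simple one-to-one binding at equilibrium, where the fraction of displayed binder occupied by target is [T] / ([T] + Kd). The sketch below evaluates it at the five concentrations used here. The Kd values are hypothetical, the model assumes the target is in large excess, and this is not necessarily how the experimental data were analyzed.

```python
def fraction_bound(target_nM: float, kd_nM: float) -> float:
    """Equilibrium occupancy for simple 1:1 binding with target in excess."""
    return target_nM / (target_nM + kd_nM)


CONCENTRATIONS_NM = [1000, 100, 10, 1, 0.1]

# Hypothetical dissociation constants, for illustration only.
HYPOTHETICAL_KD_NM = {"tight binder": 1.0, "weak binder": 500.0}

for name, kd in HYPOTHETICAL_KD_NM.items():
    row = ", ".join(
        f"{c} nM: {fraction_bound(c, kd):.3f}" for c in CONCENTRATIONS_NM
    )
    print(f"{name} (Kd = {kd} nM) -> {row}")
```

With these made-up values, the weak binder is about two-thirds occupied at 1000 nM but only about 2% occupied at 10 nM, while the tight binder is still half occupied at 1 nM.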
We see very low sequencing counts for all of the Foldit designs--even at high concentration of target--which indicates zero binders. Some designs show a couple of reads in one or two of the binding experiments, but this is within the range of noise that we would expect for zero binders.
Why didn't the Foldit designs bind to the target?
These results are slightly disappointing, but we should not be too discouraged!
Although none of our Foldit designs bound to the IL6R target, we did see a few binders from the designs by IPD researchers. Below are the counts from the tightest IPD binder:
design_id   counts1  counts2  counts3  counts4  counts5  counts6  DDG      SASA      SC     BUNS
IPD_design  144      38       69       56       13       52       -39.114  1720.442  0.640  9
Figure 4. An IPD-designed protein binder with exceptional binder metrics, which appears to bind IL6R. The IL6R library included thousands of proteins designed by IPD researchers with highly optimized binder metrics. Only a handful of designs successfully bound to the target.
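To put the difference in perspective, the sketch below totals the reads from the five binding sorts (counts2 through counts6) for this IPD design and for the ten Foldit preview designs listed earlier in this post. Summing across sorts is just a convenient summary chosen for this example, not the analysis used by the team.

```python
# Binding-sort counts (counts2..counts6) copied from the tables in this post.
binding_counts = {
    "2009432_c0003": [0, 0, 0, 0, 0],
    "2009432_c0004": [3, 0, 0, 0, 0],
    "2009432_c0006": [0, 3, 0, 0, 0],
    "2009432_c0007": [0, 5, 0, 1, 0],
    "2009432_c0009": [0, 0, 0, 0, 0],
    "2009432_c0010": [0, 0, 0, 0, 0],
    "2009432_c0011": [0, 0, 0, 0, 0],
    "2009432_c0012": [1, 0, 0, 0, 0],
    "2009432_c0014": [0, 0, 0, 0, 0],
    "2009432_c0016": [0, 0, 0, 0, 0],
    "IPD_design": [38, 69, 56, 13, 52],
}

# Total binding reads per design, sorted from most to fewest.
totals = {name: sum(counts) for name, counts in binding_counts.items()}
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

for name, total in ranked[:3]:
    print(f"{name}: {total} binding reads")
```

The IPD design totals 228 binding reads, while the highest total among the Foldit preview designs is 6.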
Why did we see binding from IPD designs but not from Foldit designs? The IPD designs had exceptional binder metrics. Recall from our previous blogpost that certain metrics seem to correlate with good binding (DDG, SASA, BUNS, shape complementarity). If we rank the tested designs using these metrics, we find that this IPD design outranks all but three of our Foldit designs.
In order to design successful protein binders in Foldit, we will need to focus on these binder metrics. If we can make these metrics available in Foldit puzzles, we are confident that Foldit players will be able to optimize them just as well as IPD researchers. To that end, the Foldit team has been working to add new Objectives that can compute all of these metrics in Foldit. We should be able to release the first prototype Objectives in an update very soon!
Another important consideration here is the sheer number of IPD designs tested. The library for this experiment included thousands of IPD designs, and all of them had top-tier binding metrics like the one above. Even with those thousands of designs, we only got a few binder hits out of the library.
Unfortunately, such high failure rates are typical for protein binder experiments. We have to remember that protein design is a difficult challenge with many pitfalls, and our understanding of protein folding and binding is imperfect. To succeed in protein binder design, we will need to generate lots of designs to test.
What's next for Foldit?
The Foldit designs in this experiment came from just the first two rounds of the anti-inflammatory puzzles, back in April. Since then, we’ve seen even more great designs from Foldit players, and we’ll continue to run binder design puzzles as we work to improve the Foldit tools.
Soon Foldit will have prototype Objectives for calculating DDG, SASA, and shape complementarity. Already, it seems that players have been able to use the new BUNS Objective to improve designs in recent weeks.
We’re excited to keep pressing on the problem of protein binder design! We are used to tackling hard problems in Foldit, and we tend to learn a lot about proteins in the process. We think that Foldit players have a lot to contribute in this arena, and we’ll be looking to tackle new (and harder) targets in the coming months.
Remember that we also have an experiment under way to test Foldit-designed binders for the coronavirus spike protein, and we should have results from that experiment soon. So stay tuned for more, and happy folding!
( Posted by bkoep 93 886 | Tue, 06/30/2020 - 22:34 | 9 comments )
Reaction Design Tool
The new Reaction Design Tool is live, and we have a puzzle headed your way! First, a little bit about the tool itself. You all have been amazing in the realm of protein design, and now it’s time to step into the world of small molecule design. One approach in small molecule design is to modify or individually place each atom. This is a great approach, but it can have some shortcomings, like the creation of chemically infeasible molecules. The last thing we want is to create a wonderfully scoring small molecule that wouldn’t be possible in the real world, or worse, would explode! So, the way around this is to use a reaction-based approach. With this approach you will be given fragments of a small molecule, and it’s up to you to find the best way to combine them. The great thing about these fragments, or reactants as they will be known in the game, is that they are already determined to be synthesizable. This means that the small molecules you create can be produced in a lab, and possibly used for therapeutics.
The layout of the tool is in three major parts. First, at the top of the tool is the Reaction Panel. This panel allows you to choose the base of your new small molecule, or ligand as it will be known in game. These reaction options are the center of your new ligand. The reactions are surrounded by black spheres. Let’s call these linking atoms. These linking atoms are simply there to denote where your chosen fragments will connect to your reaction base. Note: linking atoms will mostly appear this way, but not always. The second major part of the tool is the Reactant Panel. The reactant panel is where the fragments are stored. In some puzzles you will only have one reactant to choose from, while in others you will be able to combine two or three. These reactants also display linking atoms, so you can see how the reactant you choose will connect with the reaction base. The last part of the tool is the Accept Button. The accept button allows you to realize your creation in the context of the protein. Once you have selected your reaction base and your reactants, you will notice that not only is your ligand glowing blue, but it is now in the shape of the ligand you are trying to create. Once you are satisfied, click the accept button, and you will have created your new ligand! To get the best results you will need to optimize your newly created ligand, just like you would optimize a rebuilt/remixed loop. Wiggle, shake, and move your ligand to discover if it really is the best design for the protein.
Here are some tips to get you started. Just like the Genie from Aladdin, ligands have “phenomenal cosmic power!” OK, well, maybe not, but they are quite powerful when it comes to their interactions with proteins. However, just like the Genie, they have itty bitty living spaces. This living space is known as the activity pocket. You will need to design a ligand that best fits in this activity pocket. One way to do this is to look for hydrogen bonding. Hydrogen bonds help the ligand bind to the protein and therefore are immensely important. If, after wiggling, the ligand gets pushed out of the pocket, or if it appears to be bending and stretching in odd ways, try lowering the wiggle power. A little nudge can go a long way. Also, the Reaction Design Tool tries its best to fit your newly designed ligand to its starting structure. This means that a different starting structure could produce a better resulting ligand, because it is oriented differently in the activity pocket.
We really hope you all enjoy working with this new tool and are extremely excited to see what you all come up with. Expect more small molecule/ligand design puzzles in the future.
Be sure to check out the new Reaction Design tutorial in the Campaign Menu.
Happy folding everyone!
( Posted by jtscott 93 1642 | Wed, 06/24/2020 - 20:03 | 7 comments )
Anti-inflammatory designs queued for testing!
In our last blog post, we announced a similar experiment for binders to the coronavirus spike protein. (We had some issues getting the necessary materials for that experiment, but we've come up with a workaround and that experiment is back on track!) These latest designs will be tested using the same kind of experiment, using yeast display and flow cytometry techniques, but swapping in the IL6R target instead of spike protein. See our previous YouTube video for more.
IL6R is a protein found on human immune cells, and plays a role in the "cytokine storm" that can cause dangerous inflammation in severe cases of COVID-19. A protein that binds to IL6R might be useful as a drug to temper this inflammation. We'll be testing these 100 Foldit player-designed proteins to see if they bind to the IL6R in a controlled lab setting. It will be several weeks before the experiment results come in, but we'll continue running more anti-inflammatory puzzles to try and develop even better designs.
2009432_c0004 ZeroLeak7, puxatudo, Phyx, PLAYER_21, w1seguy, Bruno Kestemont
2009432_c0009 PLAYER_5, TheGUmmer
2009432_c0012 CharlieFortsConscience, Bletchley Park, georg137, spvincent
2009432_c0016 Bruno Kestemont
2009432_c0030 Steven Pletsch, AntiVaccine
2009432_c0034 Crossed Sticks
2009432_c0096 silent gene
2009432_c0101 PLAYER_4, spdenne
2009432_y6445 silent gene
2009565_c0001 ZeroLeak7, Bruno Kestemont, mirp, RockOn
2009565_c0002 Mike Lewis, Formula350, Skippysk8s, actiasluna, Joanna_H, Jpilkington
2009565_c0025 Scopper, NinjaGreg, RockOn
2009565_c0063 Mike Lewis
2009565_c0067 Bletchley Park
2009565_c0074 actiasluna, Formula350
2009565_c0075 Mike Lewis, Joanna_H, Jpilkington, ManVsYard
2009565_c0115 actiasluna, Formula350
2009565_y1631 silent gene
Coronavirus binder designs queued for testing!
After the first three rounds of our Coronavirus Binder Design challenge, we've selected 99 of the most promising Foldit player solutions for experimental testing!
Once a Foldit puzzle closes, we run some further analysis to figure out which designs are the most likely to fold and bind to the target. You can read more about some of that analysis on our previous blog post. To select promising designs, we consider Foldit score in addition to metrics that correlate with proper folding and others that correlate with binding.
We've combined those metrics to choose 33 designs from each of rounds one, two, and three of the Coronavirus Binder Design challenge. In total, 99 Foldit binder designs will be tested at the UW Institute for Protein Design, with the same experiments that have already begun for computationally-designed binders.
It will be a few more weeks before genes arrive and we can begin experiments on the Foldit designs. In the meantime, we'll continue to work on designing better binders in Foldit, so stay tuned for more puzzles! Be sure to review our tips for designing successful binders and watch coronavirus expert Lexi Walls, Ph.D. discuss early Foldit designs!
Below are the 99 designed proteins that we'll test for binding to the SARS-CoV-2 spike protein. Remember to fill out our username sharing form if you want to see your username in Foldit updates!
2008926_c0069 Galaxie, robgee, alwen
2008926_c0071 silent gene
2008926_c0193 silent gene
2008984_c0002 Caraline_nelson, Phyx, mirp, PLAYER_17, jeff101, silent gene
2008984_c0003 Bletchley Park, spvincent
2008984_c0036 silent gene, edpalas
2008984_c0046 Steven Pletsch, PLAYER_18
2008984_c0058 PLAYER_12, frood66
2008984_c0239 PLAYER_15, frood66
2008984_y9747 silent gene
2008984_y9800 silent gene
2009030_c0002 Bletchley Park, PLAYER_5, georg137, spvincent
2009030_c0005 Bletchley Park
2009030_c0009 Steven Pletsch, PLAYER_18
2009030_c0020 Galaxie, jamiexq
2009030_c0049 actiasluna, Jpilkington, ManVsYard
2009030_c0073 Crossed Sticks
2009030_c0105 Steven Pletsch
2009030_y0378 silent gene
2009030_y3873 PLAYER_10, PLAYER_17, Bruno Kestemont
2009030_y5708 Bruno Kestemont