New Foldit Interface
We are excited to release a new, single interface for Foldit.
One of the issues we’ve had in Foldit development is supporting both of the prior interfaces (Original Interface and Selection Interface). Not only does it take more effort to implement features in both interfaces, it’s also something of a user support issue: it’s difficult for new players to learn one interface and then get help from documentation and experienced players that use the other.
People familiar with the old Selection Interface will see similarities with the new interface, as we leaned heavily on the behavior of the Selection Interface for the new UI. (The Selection Interface is the preferred interface of a number of top users and the Foldit team, and it allows more flexibility for future improvements.) Most importantly, we kept the “selection” mechanic from the Selection Interface. To work on sub-sections of the protein, first select segments by clicking (and/or shift-clicking, ctrl-clicking [cmd-click for Mac] or double clicking) those sections you want, then select the action to use. Simply click the background to deselect segments and return to global actions.
Menu Panel: The Menu Panel can be accessed by clicking the Foldit icon in the upper left corner. This allows you to access overall game controls like changing puzzles, saving/loading solutions and exiting the program.
Puzzle Title: Click on the Puzzle title to go to the web page for the puzzle. (Try it on intro puzzles!)
Score Display: We’ve simplified the score display to focus more on the thing that counts -- your score.
Objective Panel: Previously, in puzzles with a large number of objectives, having the objective panel directly underneath the score meant that expanding it would block the main display. We’ve moved it off to the side, to lessen the interference when it’s expanded.
Side Buttons: For easy access, a number of panels can be opened from buttons on the left hand side of the screen: Help, Undo, View Options, Behavior Options and the Cookbook.
Action Bar: Similar to the old Selection Interface, most of the actions you’ll use in working with the protein are found in the action bar. Unlike the SI, though, all the actions which are available on the puzzle should be present from the start - they may be disabled until you have the correct selection (check the tooltip on disabled actions), but they’ll be present. If you’re familiar with the Original Interface, the actions from the action popup menu and the right-click pie menu will now be found at the bottom of the screen.
No Modes: Gone are the “Modes” from the Original Interface. Instead, all functionality from the independent modes is available in the action bar. No more switching back-and-forth!
Mouse Control: The control of the view by clicking on the background should be unchanged. Use left click (either single or double, potentially while holding shift or ctrl [cmd for Macs]) to select residues. Left click and drag pulls. Right click [control-click for single button Macs] freezes a segment, double right click freezes a region, and right click and drag adds a band.
View options: The behavior of the View Options menu has gotten an update. We’ve added support for view presets. This should allow you to develop your own customized combinations of view settings for different purposes, and easily switch between them. It will also allow us to put out suggested per-puzzle view settings, such that design puzzles can get a different default view from electron density puzzles or ligand puzzles.
To use a preset, simply select it from the list. You can also create a new preset from the current settings, edit the current preset, or delete your custom preset. Editing a preset brings up the customization options. Your changes are only written to the preset if you actually Save it; if you instead return to the preset list, the current view settings are kept but the preset itself is unchanged.
We’re excited about the new, unified interface. This should allow us to more rapidly implement improvements to the program and the UI. We’re open to suggestions on how to make the new user interface more useful for all Foldit users.
Wondering where things went? We’ve put together a short guide on where various buttons and functionality have moved to.
( Posted by rmoretti | Tue, 11/16/2021 - 18:17 | 17 comments )
PROTAC small molecule design project
We’re happy to officially announce an ambitious new scientific project making use of the updated Small Molecule Design tool.
Controlled protein degradation
Your cells need to constantly recycle old proteins. One way they do this is by tagging unneeded proteins with a specific signal molecule (ubiquitin) which sends them to the proteasome where they are cut up into individual amino acids. The ubiquitin tag itself is sufficient to cause the protein to be degraded, so the cell has a number of complex signalling pathways which normally control which proteins get tagged at what time. (This is the E1/E2/E3 enzyme cascade.)
Proteolysis targeting chimeras (PROTACs) are an exciting new approach to hook into that system, to promote degradation of proteins which wouldn’t otherwise be degraded. They work by having one end which binds to the protein to be destroyed, and another end which binds to an E3 ubiquitin ligase. Simply by binding to both proteins and bringing them together, a PROTAC causes the E3 ubiquitin ligase to tag the nearby target protein with ubiquitin and thus send it for degradation. The beauty of the system is that the portion of the molecule that binds the target and the portion which binds the E3 ligase are completely independent, connected by a generic linker. This makes it much easier to develop the parts separately and then combine them.
Not only are PROTACs an exciting possibility for developing new drugs, they’re an excellent research tool to figure out how proteins work in cells and organisms. Current approaches for gene function research rely heavily on “knockout” studies in model organisms, where the gene is removed entirely. This approach has limitations in that you can’t control when and where the protein is removed (it’s removed from everywhere always). It also requires difficult and costly genetic manipulation of the organism. PROTACs allow you to control the timing of protein removal by when you apply the drug, and by which E3 ligase you target. And you can do this in unmodified “wildtype” cells and organisms.
Currently, we’re somewhat limited by the number of small molecules which can target E3 ubiquitin ligases. Humans have over 500 different E3 ligase genes, each with their own expression profile and localization. We have comparatively few small molecules which can bind to the E3 ligases, and the ones we do have aren’t necessarily great for drug purposes. (Thalidomide is a popular choice for research purposes.) This limits the control we have over when and where the PROTACs work.
We hope you can change that! Using the small molecule design tool, you can make small molecules which bind to an E3 ligase, and thus can be a potential base for future PROTACs. The hope is that by developing a library of compounds which bind to different E3 ligases with different specificities, we’ll develop an arsenal of PROTAC-halves which can be used by future researchers. They’ll only have to worry about finding a binder to their protein of interest, and can then take an E3 ligase binder “off the shelf” and “simply” connect the two. With a library of E3 ligase binders, they can use the same protein of interest binder and target different E3 ligase binders, depending on their purposes.
The initial E3 ubiquitin ligase we’re targeting is the von Hippel-Lindau tumor suppressor protein (VHL). VHL has already been shown to be useful as the E3 target of PROTACs, and there are existing publications showing what makes a good VHL binder. However, the existing binders are not particularly “drug like”, so we think there’s plenty of room for Foldit players to improve the state-of-the-art.
Compounds will be tested - will yours?
We’ve teamed up with a major pharmaceutical company - Boehringer Ingelheim (BI) - for this project. They approached us, wondering how they could support the small molecule design efforts in Foldit. BI has a history of supporting open science, for example creating opnMe to share BI-generated molecules. They are excited about the possibilities citizen science has in supporting open research in small molecule drug design, and think Foldit is a great way to achieve this.
Boehringer Ingelheim has committed to help evaluate and test the molecules which Foldit players have designed. Molecules you create in Foldit will be passed on to the team at BI, who will evaluate them based on the same criteria used for their own internal small molecule development. Compounds which pass the test will then be synthesized and tested for binding by BI. They have also volunteered to try to determine the crystal structure of successful protein-small molecule complexes, so we can better determine how well the Foldit design matches the actual experimental structure.
All participants and game sponsors of current and future small molecule design games commit to complying with the Foldit Terms of Service including those pertaining to intellectual property.
All compounds created as part of the collaboration puzzles will be made publicly available. Experimental results from testing the molecules will also be released publicly.
( Posted by rmoretti | Wed, 10/20/2021 - 21:11 | 0 comments )
Small Molecule Design Tool
The highly anticipated updates to the Small Molecule Design Tool are here, and we have a series of puzzles on the way! Your continued efforts in protein design have been nothing short of remarkable, and we are excited to be able to further help this amazing community contribute to small molecule design. While protein based drugs are an increasingly important category of therapeutics, the majority of drugs continue to be small-molecule based. With the Small Molecule Design tool you will be able to create, edit, and enhance small molecules.
Let's dive in and see how the Small Molecule Design Tool works. The tool will come up as an option in the Selection Mode action bar when you select a designable small molecule on a puzzle which supports it. With the Small Molecule Design Tool open, you can select particular atoms in the designable ligand to act on. The tool has two distinct panels: Atom Selection and Fragment Selection. You can toggle between them with the central Panel Selection Button.
The Atom Selection Panel has two unique areas: The Atom Selection area at the top and the Bond Selection area in the middle. With Atom Selection, selected atoms can be replaced with an atom of your choosing, assuming the placement is allowed. With Bond Selection, you will be able to select two atoms, and change the bond type between them.
The Fragment Selection Panel works with groups of atoms. The category buttons in the middle organize the fragments into four separate groups: Functional Groups, Unsaturated Cyclic, Saturated Cyclic, and Polycyclic. Each choice will change the contents of the Fragment Selection area above. Once you’ve selected the type of fragment, you can select the location (the atoms) on the small molecule where you want it to go, and then select which fragment you want from the Fragment Selection area. This will pop up a sub-panel which allows you to select where on the fragment you want to place the attachment. Using the `A`, `S` and `TAB` keys while hovering over the selection in the subpanel will allow you to further customize how the fragment is placed.
You may have noticed that both panels share a couple of elements. These are the Panel Selection Button, which allows you to switch between the Atom Selection Panel and the Fragment Selection Panel, and the Cleanup Structure area, with which you can delete atoms, delete bonds, and deselect highlighted atoms.
Be sure to check out the new tutorials to help get you started! We can't wait to see what you all come up with. Happy building!
( Posted by Sciren | Tue, 10/12/2021 - 21:45 | 4 comments )
The AlphaFold prediction tool in Foldit
We are announcing a brand new Foldit feature that will enable players to use the revolutionary AlphaFold algorithm from DeepMind!
The AlphaFold feature is now available for all users in select Foldit puzzles. (It was first released to devprev users ahead of the main update.)
AlphaFold v2.0 is an algorithm to predict the folded structure of a protein from its sequence, and was developed by the company DeepMind in 2020.
Previously, in the 2018 CASP competition for protein structure prediction, DeepMind had made a splash with their initial version of AlphaFold, outperforming dozens of research groups from around the world. The DeepMind group specializes in a type of algorithm called a neural network, and they showed that this type of algorithm held huge potential for the field of protein structure prediction. We wrote a blog post about the initial AlphaFold algorithm when DeepMind published it in January 2020.
After this initial success, DeepMind completely restructured their algorithm, and at the 2020 CASP competition they amazed the world with an even bigger leap forward. The new AlphaFold v2.0 is able to predict protein structures with astounding accuracy. The 2020 CASP results promised big advances for protein research, and the scientific community has been anxiously waiting for DeepMind to release the details about AlphaFold v2.0.
AlphaFold for protein design
AlphaFold is especially accurate for predicting natural proteins, where it can draw on the rich information in evolutionary patterns. But we’ve also found it to be very good at predicting the structures of designed proteins—even though these proteins have no evolutionary history. In fact, when we check against solved structures of designed proteins, we find that AlphaFold is usually more accurate than the design model itself!
Figure 2. Comparing the accuracy of AlphaFold predicted models and design models for 22 designed proteins with solved structures. The diagonal represents the line of equality. Points above the diagonal are cases where the AlphaFold prediction is more accurate than the design model.
We’ve also found that AlphaFold may be able to help us pick out designs that will fail lab testing. Whenever AlphaFold predicts a structure, the algorithm also produces a confidence value for the prediction. We see that AlphaFold tends to report a higher prediction confidence for successful protein designs.
In 2019, we tested 148 Foldit designs in the lab and found 56 were successful designs—a total success rate of about 38%. If we had rejected designs with AlphaFold confidence under 80%, then we still would have found 50 successful designs, with a success rate of over 60%!
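The arithmetic behind these rates can be checked with a short sketch. The post does not say how many designs clear the 80% confidence cutoff, so the sketch only derives the upper bound implied by "over 60%". The function names are illustrative and not part of Foldit or AlphaFold.

```python
import math


def success_rate(successes: int, tested: int) -> float:
    """Fraction of tested designs that succeeded in the lab."""
    return successes / tested


def max_tested_for_rate(successes: int, min_rate: float) -> int:
    """Largest number of tested designs for which successes/tested still exceeds min_rate."""
    # successes / n > min_rate  is equivalent to  n < successes / min_rate
    bound = successes / min_rate
    return math.ceil(bound) - 1


baseline = success_rate(56, 148)              # all designs tested in 2019
max_filtered = max_tested_for_rate(50, 0.60)  # designs kept by the 80% cutoff
lost = 56 - 50                                # successful designs the cutoff would reject

print(f"baseline success rate: {baseline:.1%}")
print(f"designs passing the cutoff: at most {max_filtered}")
print(f"successful designs rejected by the cutoff: {lost}")
```

In other words, the cutoff would have discarded at least 65 of the 148 designs while giving up only 6 of the 56 successes.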
A new Foldit feature
We are excited to announce a new Foldit feature that will let you get AlphaFold predictions for proteins you design in Foldit.
Certain puzzles will display a new DeepMind AlphaFold button in the Main Menu. This button opens up a dialog with a list of your saved solutions on the right-hand side. To request an AlphaFold prediction for a solution, select the solution and click the Upload for AlphaFold button. This will send your solution to the Foldit server and remotely run the AlphaFold algorithm.
A new solution will appear in the left-hand list and show the message “Pending…” while AlphaFold makes its prediction. It will take at least a few minutes to run, and the wait time may be longer depending on how busy the server is.
You may have up to 5 AlphaFold uploads pending at once; if you already have 5 pending, you must wait for one to complete before making another submission. Click the Refresh Solutions button to check if your AlphaFold job is done.
When the AlphaFold algorithm has completed, the left-hand solution will display two values:
Confidence is AlphaFold’s own estimate about the accuracy of its prediction. Figure 3 above suggests that designs with higher confidence are more likely to fold successfully. Players should aim for confidence values of 80% or higher.
Similarity measures how closely the AlphaFold prediction matches your designed structure. If similarity is low, then AlphaFold has predicted that your design sequence will fold into a different shape than your designed structure.
To load the AlphaFold prediction into the Foldit puzzle, select the left-hand AlphaFold solution and click the Load button at the bottom of the dialog. Note that AlphaFold predictions may not score as well as solutions that have been optimized in Foldit. If you decide to work off of the AlphaFold solution, we recommend a quick Wiggle and Shake of the raw AlphaFold model.
The AlphaFold confidence and similarity values will not affect your Foldit score in any way. For the time being, the AlphaFold feature is simply a tool that you can use to get feedback about your solution, and to see how your design sequence is predicted to fold up.
Unlike typical Foldit tools, the AlphaFold algorithm runs remotely on an online server.
Normally, when you run Foldit on your computer, all of the Foldit computations are performed by your computer. If your internet connection fails in the middle of a puzzle, you can still continue to use all of the Foldit tools.
This AlphaFold feature is different, and the actual computations will run on a server hosted at the UW Institute for Protein Design (IPD). So, when you click the Upload for AlphaFold button, your solution is sent to the IPD server, which runs the AlphaFold algorithm and then sends the result back to your computer.
The biggest reason for this is that the AlphaFold algorithm is... big. Even the basic slimmed-down version requires several GB of disk space. If we wanted to distribute the AlphaFold software with Foldit, that would increase the download size of Foldit by 10x.
Another reason is that the AlphaFold algorithm runs much less efficiently on common CPUs than on GPUs, which many players may not have. If you ran AlphaFold on your CPU at home, it might take an hour to get a result back. However, if we use our GPUs at IPD, the actual processing will go much faster. Since most of our recent Science puzzles have had fewer than 100 active players at a time, we think that players can get results faster if we process AlphaFold jobs on our server GPUs.
This is an exciting time for the world of protein research! DeepMind has inspired other research groups, including the IPD, to explore similar kinds of neural network algorithms for protein structure prediction. As more researchers publish their findings and learn from one another, we can probably expect to see even more accurate algorithms in the future.
AlphaFold is already transforming the study of natural proteins, and has provided researchers with confident predictions of important proteins with unknown structures. But in the field of protein design, we are still learning how to make the best use of these advances. We hope that Foldit players will find the AlphaFold predictions helpful for designing creative new proteins!
Please note that the new AlphaFold feature is experimental, and it may change or even disappear in the future. Foldit is sharing the server GPUs with other research projects, and we may need to adjust our usage or develop new strategies for running GPU-heavy computations.
Edit Nov 2, 2021: Predicting native vs. designed proteins
Since we launched the AlphaFold tool, several Foldit players have pointed out a puzzling result in certain AlphaFold predictions:
"I copied a native protein sequence onto my design, but the AlphaFold prediction is completely different from the native structure, or it has an extremely low confidence. I thought AlphaFold was supposed to be good at predicting native proteins. What's going on?"
This is because in Foldit we are using an "abbreviated" version of AlphaFold that is not expected to work well on natural protein sequences.
The official, complete AlphaFold pipeline requires an extra step, scanning a large database for sequences that are similar to your query sequence. These similar sequences should all be evolutionarily related. AlphaFold is extremely good at extracting patterns from this evolutionary data, and this seems to be one of the reasons it performed so well in CASP.
When we use AlphaFold to predict Foldit designs, we skip this extra step because it is slow and because we do not expect to find "evolutionarily related" sequences for our designed proteins. Our internal benchmarking shows that AlphaFold is still good at predicting Foldit designed proteins, even though they don't have evolutionary data. However, skipping this step means that AlphaFold may underperform for natural protein sequences.
( Posted by bkoep | Sat, 07/31/2021 - 22:39 | 16 comments )
Experiment results for MERS-CoV binders
We have lab results for Foldit MERS-CoV binder designs! Several months ago, we challenged Foldit players to design proteins that could bind to the spike protein of the MERS coronavirus and block infection (this is similar to our previous challenge to bind the COVID-19 viral spike). After the puzzles ended, we used yeast-display FACS experiments to test the most promising Foldit designs and see if they stick to the MERS spike protein.
Long story short, this experiment did not reveal any successful binders. Read on for details about these designs and the lab experiments we used to test them, including some new ideas we used to boost our chances of success. Evidence suggests that this particular MERS-CoV spike protein is an especially difficult binding target. And this latest experiment highlights the challenge of the protein binder design problem.
Starting in October 2020, we ran 7 rounds of MERS-CoV Binder Design puzzles. Prior to these puzzles, we had challenges to design binders for the SARS-CoV-2 spike and the IL6 receptor. But these MERS-CoV puzzles were some of the first Foldit puzzles to award bonuses for binder metrics, like SASA and Shape Complementarity.
Since then, we’ve replaced SASA and Shape Complementarity with the new Contact Surface Objective, which is faster to run and seems to be a better predictor of binder success. We were only able to run one MERS-CoV puzzle with the Contact Surface Objective, but the results from that puzzle looked especially good. (In fact, we ended up testing more designs from that puzzle than any other puzzle in the series!)
After the round 7 puzzle closed, we ran some additional analysis on all of Foldit players’ solutions to select the most promising designs. Our selection criteria included binder metrics like DDG, Contact Surface, and BUNS. We also ran some other calculations, like secondary structure prediction, that indicate whether a design is likely to fold correctly.
In the end, we selected 59 solutions that we believed could bind to the MERS-CoV spike protein, and queued the designs for testing:
2010667_c0023 Bletchley Park,infjamc
2010727_c0156 Bruno Kestemont
2010727_c0545 Bruno Kestemont
2010727_c0840 Mike Lewis,Enzyme
2010816_c0034 Bruno Kestemont
Boosting Foldit binder designs
If you read our previous blog post about design throughput, you might recall that the success rate for binder design experiments is around 0.1%. We think the main source of failure is from designs not folding up exactly as designed. Even a tiny inaccuracy in folding can be ruinous for binding, for example if it creates a clash or an unsatisfied polar atom.
Researchers are working hard to improve our ability to select good binders before testing, so we can increase this success rate. But right now even the best-looking design only has a 1 in 1000 chance of correctly binding the target. This also means that we’d like to be testing at least 1000 designs in each experiment. So, even though Foldit players created 59 excellent binder designs, it is still unlikely that our experiment will reveal a successful binder from a batch of this size.
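To make the batch-size argument concrete, here is a minimal sketch of the chance that an experiment finds at least one binder. It assumes every design succeeds independently with the same 0.1% probability, which is a simplification and not something the experiment guarantees. The function name is illustrative.

```python
def prob_at_least_one_hit(n_designs: int, per_design_rate: float) -> float:
    """P(at least one binder) = 1 - (1 - p)^n, assuming independent, equally likely successes."""
    return 1.0 - (1.0 - per_design_rate) ** n_designs


RATE = 0.001  # the roughly 0.1% success rate quoted above

for n in (59, 1000):
    chance = prob_at_least_one_hit(n, RATE)
    print(f"{n:>5} designs: {chance:.1%} chance of at least one binder")
```

Under these assumptions a batch of 59 designs has roughly a 6% chance of containing a binder, while a batch of 1000 has roughly a 63% chance.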
In order to boost our design numbers and make the most of Foldit players’ work, we used a new grafting technique to recombine Foldit binder designs with automated design scaffolds.
Using high-throughput folding experiments, scientists at the Institute for Protein Design (IPD) have accumulated a database with millions of automatically-designed proteins that seem to be well-folded in the lab. Even though these proteins don’t do anything (like bind a target), they are good starting points for further design, and serve as useful scaffolds for modification. For many protein design projects at the IPD, scientists prefer to start from one of these well-behaved protein scaffolds rather than try to design a new fold from scratch.
From each of the 59 parent Foldit designs, we extracted the portion that makes the most binding interactions with the target. Then we looked to see if we could computationally graft that portion onto the scaffolds in our database. This method is finicky, and the graft has to match the scaffold backbone very closely for it to have any chance of working. Even though there are millions of proteins in the scaffold database, some Foldit designs cannot be matched to any scaffold.
This technique lets us recycle Foldit-designed interfaces into many unique designs with different folds and different sequences. By recombining the Foldit designs with the scaffolds, we were able to multiply our 59 parent designs into 873 grafted designs.
For good measure, we also redesigned each of the 59 parent designs using the IPD’s latest machine-learning algorithm. Typically, this does not change the parent design drastically, but it still provides a little bit more sequence diversity for the experiment. That brought us to a total of 989 designs to test for binding against the MERS-CoV spike protein.
Below is a preview of the data. You can download the data for all 989 designs here.
pdb_id         counts1  counts2  counts3  counts4  counts5  counts6  ddg      contact_surface  BUNS
2010629_c0001  37       0        0        0        0        0        -44.673  401.919          3
2010629_c0107  4        0        0        0        0        0        -56.559  567.339          7
2010629_c0138  0        0        0        0        0        0        -41.261  422.217          6
2010629_c0143  34       0        0        0        0        0        -42.075  430.841          6
2010629_c0407  2        0        0        0        0        0        -44.809  424.300          7
2010629_c0998  1        0        0        0        0        0        -39.433  356.197          4
2010629_c1072  0        0        0        0        0        0        -36.672  440.411          5
2010629_c1101  0        0        0        0        0        0        -37.413  381.733          7
2010667_c0002  4        0        0        0        0        0        -43.552  481.502          3
...
- Enrichment at 1000 nM target
- Enrichment at 1000 nM target
- Binding at 1000 nM target
- Binding at 200 nM target
- Binding at 40 nM target
For details about the binding experiment and how to interpret these numbers, see previous blog posts here and here. To recap: yeast cells display our designs on their surface, and we do successive rounds of sorting to collect yeast cells that appear to stick to the target. After each round of sorting, we use DNA sequencing to track which designs were collected. A high number indicates that we collected many yeast cells that appear to bind the target with our design.
Generally speaking, if a design successfully binds the target, we expect to see steady high numbers for that design across all six rounds of the experiment. An unsuccessful design will have decreasing numbers and eventually drop out of the sorting rounds.
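The rule of thumb above can be sketched as a small filter over the sequencing counts. This is only an illustration using the nine preview rows shown earlier. The `min_count` threshold is a hypothetical choice, not the criterion used in the actual analysis.

```python
# The six counts columns for the nine preview rows shown above.
PREVIEW = """\
2010629_c0001 37 0 0 0 0 0
2010629_c0107 4 0 0 0 0 0
2010629_c0138 0 0 0 0 0 0
2010629_c0143 34 0 0 0 0 0
2010629_c0407 2 0 0 0 0 0
2010629_c0998 1 0 0 0 0 0
2010629_c1072 0 0 0 0 0 0
2010629_c1101 0 0 0 0 0 0
2010667_c0002 4 0 0 0 0 0
"""


def parse_counts(text: str) -> dict[str, list[int]]:
    """Map each design id to its list of per-round counts."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:
            rows[fields[0]] = [int(value) for value in fields[1:7]]
    return rows


def persists(counts: list[int], min_count: int = 1) -> bool:
    """True if the design was collected in every sorting round (hypothetical threshold)."""
    return all(count >= min_count for count in counts)


def last_round_seen(counts: list[int]) -> int:
    """1-based index of the last round with a nonzero count, or 0 if never collected."""
    last = 0
    for round_number, count in enumerate(counts, start=1):
        if count > 0:
            last = round_number
    return last


rows = parse_counts(PREVIEW)
candidates = [design for design, counts in rows.items() if persists(counts)]
print("designs present in all six rounds:", candidates)
for design, counts in rows.items():
    print(design, "last seen in round", last_round_seen(counts))
```

For these preview rows the candidate list comes out empty: every design that appears at all is gone after the first round.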
Unfortunately, none of our 989 designs appeared to bind to the MERS-CoV spike protein.
A difficult target
In parallel with the Foldit designs, IPD scientists also tested about 30,000 designs that were created with an automated design method. From those 30,000 designs, only 11 showed any binding--and only weak binding at that. That translates to a success rate considerably lower than the usual 0.1%, and hints that the MERS-CoV spike protein is an especially difficult target.
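A quick calculation puts these numbers side by side. The last line assumes, purely for illustration, that the 989 Foldit-derived designs would succeed at the same rate as the automated designs; the experiment does not establish that.

```python
TYPICAL_RATE = 0.001        # the usual ~0.1% binder success rate
ipd_rate = 11 / 30_000      # weak binders found among the automated designs

ratio = TYPICAL_RATE / ipd_rate       # how many times lower than usual
expected_foldit_hits = 989 * ipd_rate  # illustrative only, same-rate assumption

print(f"automated design success rate: {ipd_rate:.3%}")
print(f"about {ratio:.1f}x lower than the usual rate")
print(f"expected binders among 989 designs at that rate: {expected_foldit_hits:.2f}")
```

At that rate, a batch of 989 designs would be expected to yield fewer than one binder on average, which is consistent with the empty result.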
The 11 IPD binders all have especially high Contact Surface values (around 500 or greater). This was a bit surprising, since previous data had suggested a Contact Surface value of 400 can be sufficient for good binding. This new data will help us improve our binder design puzzles in Foldit, and in the future we’ll be challenging Foldit players to strive for binder designs with even stronger binder metrics!
( Posted by bkoep | Sun, 05/09/2021 - 01:06 | 0 comments )