Foldit Blog
Blog Feed
This is the place where we will describe some of the outcomes and results of your folding work, provide a glimpse of future challenges and developments, and in general give you a better sense of where we are and where Foldit hopes to go in the future.

Foldit Education Mode

Although Foldit was originally made for science, we always knew it had potential as a learning tool. Until recently, we haven’t done a lot to help teachers use Foldit in their classrooms. We added Custom Contests so teachers could make their own puzzles, but this still takes a lot of time and energy.

The Foldit team had been talking about making a version of Foldit for education, but when the pandemic hit it became clear students across the world needed more remote learning options. So we accelerated our plans, and today we are proud to announce the release of Education Mode!

foldit_edu_blog1.png

Figure 1: The Education Mode version of Wiggle teaches you both how to wiggle, and what it’s actually doing.

Education Mode will be launched as a separate app from the main Foldit game. This may change in the future, but for now, if you want to use Education Mode you need to have it installed separately. The downloads can be found on our new educator’s page here.

The core idea of Education Mode is to teach a section of a protein biochemistry class through Foldit. We hope this is helpful not only for students, but for anyone curious about the basic science behind protein biochemistry. Even if you’ve been playing Foldit for a while, check out Education Mode for some bonus science and tutorials!

foldit_edu_blog2.png
Figure 2: New Primary Structure Puzzle. This is a protein design puzzle, but the purpose is to help you think about which amino acids fit best where in a protein and why based on the underlying chemistry. You’ll notice that the design wheel has had all of the pictures removed to encourage you to visualize the amino acids.

Education Mode has 29 puzzles in 9 tiers. Many of these puzzles are variants of the campaign puzzles, which are designed to teach Foldit gameplay. However, you’re also likely going to learn some biochemistry along the way! In the typical campaign puzzles, we don’t emphasize the biochemistry learning part of it so that you can get to the game quicker and without having to feel like you’re going through a biochemistry class. In Education Mode, the tips focus on teaching you the biochemistry behind the puzzle while learning to play Foldit along the way.

foldit_edu_blog3.png
Figure 3: New Idealizing Structure Angles Puzzle. This puzzle is an evolution of the Structure and Idealize campaign puzzle, but now expanded to relate secondary structure to the Rama map, and how to use it.

The Education Mode puzzles start on atomic interactions (like clashes and hydrogen bonds), then focus on amino acid structure before proceeding through different levels of protein structure (primary, secondary, and tertiary structure). Finally, there are a few puzzles on how proteins actually fold in nature, and a final puzzle on protein binding to DNA.

A new feature that you won’t find in the campaign levels is that on many of the puzzles, you can explore the puzzle before clicking through the tutorial, and then reset the puzzle to start scoring. This is so you can explore and experiment before attempting the puzzle for real. You’ll notice that the education tips have both forward and back buttons, and some of them now have pictures to illustrate more abstract concepts! Like the campaign mode, once you’ve completed a puzzle, it will prompt you to move to the next one, but you can also keep playing the puzzle to see if you can improve your score even more.

Some of these puzzles are intentionally hard. We’ve enabled the Save function so that you can take a break and reload your progress, and we have also made it so that you can skip puzzles, in case you want to move on to another topic.

foldit_edu_blog4.png
Figure 4: New Tertiary Structure Puzzle. This puzzle is geared specifically to teach students about the difference between secondary and tertiary structure in proteins.

You might notice that we’ve disabled some popular tools (like Wiggle) in many of the Education Mode puzzles. This is to encourage more hand-folding and critical thinking about your choices as opposed to letting the computer do it for you.

Some tools, like Blueprint, are missing from Education Mode because we are still developing lessons for them. For now, the regular Campaign levels are still the best way to learn these tools.

One last feature that we added to Education Mode is extra camera controls. Pressing Shift+Home makes the camera rock back and forth. Pressing Alt/Option+Home sets the camera spinning. Press the hotkey again to stop the motion. We hope that these new features can help you better visualize the 3D space of your protein!

As always, thanks for playing, and we hope that you enjoy the new Education Mode! Please let us know what you think of it by submitting feedback or emailing us at mail.fold.it@gmail.com!

( Posted by  horowsah 89 2865  |  Sat, 08/01/2020 - 12:53  |  0 comments )

The energy landscape optimization paper

This blog post is a walk-through for an upcoming paper, showing how researchers at UW and Harvard developed a new method for protein design. This research relied heavily on the work of Foldit players, who will be listed as authors on the paper. (If you have played a Monomer Design puzzle in Foldit, you can opt in to the author list here).

The paper has not yet gone through peer review, but a pre-print draft of the paper is already available online. The paper is written for highly-specialized academics with a scientific background, but we think its content can be appreciated by anyone with an interest in protein folding and design.

Below we discuss some of the background for this research, take a look at the figures, and review the main points of the paper.

What is an energy landscape?

The title of the paper is Protein sequence design by explicit energy landscape optimization. Before we jump in, we will need to make sure we understand the idea of an energy landscape. We’ve discussed energy landscapes previously, so let’s recap:

There are a lot of possible ways that an unfolded protein might fold up (think of all the different knots you can tie with a shoestring). Each of these possible folds has some amount of free energy, which depends on the amount of clashing, voids, H-bonding, etc. The lower the free energy, the more stable the fold.

In an energy landscape, we like to imagine all of these possibilities laid out on a grid, like a map of possible folds. Then we imagine that the depth at each point of the map corresponds to the energy of each fold. There will be deep valleys and wells where we have stable, low energy folds; and there will be hills and peaks where we have unstable, high energy folds. This map is our energy landscape.

An unfolded protein will naturally fold into its most stable structure. This is the structure with lowest free energy (the deepest point in the landscape).

Every protein sequence has a different energy landscape. Most random protein sequences have a featureless landscape with many, many shallow wells of similar depth. These sequences will not have a strong preference for any particular fold, and they will be poorly folded in real life.

On the other hand, a well-designed protein sequence will have an energy landscape with a single deep well. This sequence has an overwhelming preference for the low energy fold at this well, and the sequence will be well-folded in real life.
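To make the difference between these two kinds of landscape concrete, here is a minimal sketch. It assumes that fold populations follow Boltzmann weights, which is standard statistical mechanics but is not spelled out in this post. The energies and the function name `fold_probabilities` are invented for illustration.

```python
import math

def fold_probabilities(energies, kT=0.6):
    """Convert fold energies (kcal/mol) into Boltzmann probabilities.

    kT of about 0.6 kcal/mol corresponds to room temperature (assumption).
    """
    lowest = min(energies)
    # Subtract the lowest energy first so the exponentials stay well-behaved.
    weights = [math.exp(-(e - lowest) / kT) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

# Toy landscape with one deep well (fold 0) and four shallow alternatives.
single_well = [-10.0, -2.0, -2.0, -1.5, -1.0]
# Toy landscape with five shallow wells of similar depth.
flat = [-2.0, -2.0, -1.9, -1.8, -2.1]

p_single = fold_probabilities(single_well)
p_flat = fold_probabilities(flat)

print("single deep well, most populated fold:", round(p_single[0], 3))
print("many shallow wells, most populated fold:", round(max(p_flat), 3))
```

In the single-well case nearly all of the probability lands on one fold, while in the flat case no fold dominates, which is the "poorly folded" situation described above.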

Normally, we try to approximate the energy landscape of a sequence by folding the sequence into thousands and thousands of different structures, and calculating the energy of each one (details here). Even though this only gives us a partial view of the energy landscape, it is computationally intensive, and it takes tens of thousands of CPU hours to compute. (A big thanks goes to Rosetta@home volunteers for providing this CPU power!)

Because energy landscapes are expensive to compute, most protein design methods focus on just the design structure, and ignore the rest of the landscape. We only try to reduce the free energy of our design, and we cross our fingers that the energy landscape has no other energy wells. This is sometimes effective, but it can lead to an energy landscape that has multiple low-energy wells (which means the protein could fold into an unintended structure).

Ideally, we would like a design method that considers the entire energy landscape, but without requiring thousands and thousands of energy calculations.


Figure 1. Energy landscapes and trRosetta. (A) An energy landscape visualizes the energy (depth) across all different folds, or "conformations." Suppose that we want to design a protein with fold P. Most design methods optimize the free energy of fold P and arrive at sequence B (green). Since these methods are blind to the rest of the energy landscape, sequence B might have a landscape with alternative energy wells. A better design method would consider the entire energy landscape to produce sequence A (blue), which has a single low-energy well. (B) The trRosetta neural network takes an input sequence and makes predictions about how the residues will be oriented in the folded structure. This new work shows that trRosetta predictions serve as a good proxy for the energy landscape. The neural network can optimize the sequence to improve the match between the predictions and the desired structure, molding the landscape to favor our desired structure.

Neural networks and sequence likelihood

trRosetta (transform-restrained Rosetta) is a machine learning program developed after the breakthrough AlphaFold program (details in this blog post). The input for trRosetta is a 1D protein sequence, and the output is the predicted distance and orientation between every pair of residues in the 3D folded protein structure.

Previously, researchers at the Baker Lab showed that these distance and orientation predictions are good for protein structure prediction problems. The orientations help us generate a complete 3D model of the folded protein, which accurately shows how the protein will actually fold.

In the new paper, researchers turn trRosetta on its head to evaluate and design proteins. Rather than use the predictions to generate a structure for the input sequence, they compare the distance and orientation predictions to the intended structure, and calculate the sequence likelihood for that structure.

This sequence likelihood score tells us whether trRosetta thinks the design sequence is a good match for the design structure. If the intended distances and orientations of the design structure are a close match to the trRosetta predictions, then the sequence likelihood for that structure will be high. If the design structure is a poor match to the predictions, the sequence likelihood will be low.
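As a rough illustration of the idea, the sketch below scores a design by asking how much probability the predictions assign to the distances actually found in the design structure. This is a toy under stated assumptions, not the scoring used in the paper: the bin edges, the residue pairs, the probabilities, and the function names are all hypothetical, and it ignores the orientation predictions entirely.

```python
import math

# Hypothetical distance bins in angstroms: [2,6), [6,10), [10,14), [14,20)
BIN_EDGES = [2.0, 6.0, 10.0, 14.0, 20.0]

def bin_index(distance):
    """Return the index of the bin containing `distance`, clamped to the ends."""
    if distance < BIN_EDGES[0]:
        return 0
    for i in range(len(BIN_EDGES) - 1):
        if distance < BIN_EDGES[i + 1]:
            return i
    return len(BIN_EDGES) - 2

def mean_log_likelihood(predicted, design_distances):
    """Average log-probability that the predictions give to the design's distances."""
    total = 0.0
    for pair, distance in design_distances.items():
        prob = predicted[pair][bin_index(distance)]
        total += math.log(max(prob, 1e-9))  # floor avoids log(0)
    return total / len(design_distances)

# Distances between two residue pairs in the intended design structure.
design = {(0, 5): 7.5, (2, 9): 12.0}

# Predicted probability per distance bin for each pair (made-up values).
good_match = {(0, 5): [0.05, 0.85, 0.05, 0.05], (2, 9): [0.05, 0.10, 0.80, 0.05]}
poor_match = {(0, 5): [0.70, 0.10, 0.10, 0.10], (2, 9): [0.05, 0.05, 0.10, 0.80]}

score_good = mean_log_likelihood(good_match, design)
score_poor = mean_log_likelihood(poor_match, design)
print(score_good, score_poor)
```

When the predictions put most of their weight on the design's own distances, the score is close to zero; when they favor other distances, the score drops sharply.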

Predicting energy landscapes

The researchers used sequence likelihood to show that trRosetta can predict useful information about the entire energy landscape of a protein sequence -- not just information about the preferred structure.

To show this, they used a dataset of energy landscapes for >4000 Foldit designs, which have been accumulated from several years of Foldit design puzzles. This dataset represents about 100 million CPU hours of energy landscape calculations! They divided this dataset into favorable and unfavorable energy landscapes.

First, they calculated the sequence likelihood just for the design structure. They found that the sequence likelihood of a design is a good predictor of whether a design has a favorable or unfavorable energy landscape. Importantly, trRosetta sequence likelihood was a much better predictor than just the Rosetta energy (or Foldit score) of the design. Since trRosetta takes just a couple minutes to run, this could cut down the need to run expensive landscape calculations!
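One common way to put a number on "better predictor" is to ask how often a favorable design outscores an unfavorable one. The sketch below computes that quantity (the area under the ROC curve). It is only an illustration: the post does not say which statistic the researchers used, and every score below is invented rather than taken from the paper.

```python
def auc(favorable_scores, unfavorable_scores):
    """Chance that a random favorable design outscores a random unfavorable one.

    Ties count as half a win. 1.0 means perfect separation, 0.5 means no better
    than a coin flip.
    """
    wins = 0.0
    for f in favorable_scores:
        for u in unfavorable_scores:
            if f > u:
                wins += 1.0
            elif f == u:
                wins += 0.5
    return wins / (len(favorable_scores) * len(unfavorable_scores))

# Invented scores for a metric whose two distributions barely overlap.
separated = auc([0.8, 0.9, 0.7, 0.85], [0.3, 0.4, 0.5, 0.75])
# Invented scores for a metric whose two distributions overlap heavily.
overlapping = auc([0.5, 0.6, 0.4, 0.7], [0.45, 0.55, 0.65, 0.35])

print(separated, overlapping)
```

The less the two distributions overlap, the closer this value gets to 1.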

Next, the researchers calculated sequence likelihoods for many different structures across the energy landscape of each design. They found that these likelihoods accurately reflect the shape of the landscapes.

For example, when they looked at a favorable energy landscape with a single energy well, they saw that models within the well had a high sequence likelihood, and models outside the well had low likelihood.

They also looked closely at a few special cases, where an energy landscape shows two competing energy wells. One of these wells represents the intended design fold, and the other well represents a decoy fold that is equally stable. We expect that a protein sequence with this kind of energy landscape is equally probable to fold into the design fold or the decoy fold. This is correctly reflected in the sequence likelihood scores, which are reduced for the design fold, and are comparable between design and decoy folds.


Figure 2. trRosetta predicts information about energy landscapes. (A) Histogram of sequence likelihood (left) and Rosetta energy (right) for 4200 Foldit designs. The distribution of favorable landscapes is shown in blue, and unfavorable landscapes in gray. There is significant overlap in the distributions of Rosetta energy, showing that Rosetta energy is a poor predictor of the whole energy landscape. Sequence likelihood is a better predictor, with less overlap between blue and gray distributions. (B) Energy landscape plots for Foldit designs, with color gradient showing the trRosetta sequence likelihood of models across the landscape. At the top, a landscape with a single well has very high sequence likelihood within the well. Below, landscapes with multiple wells have weaker, more dispersed likelihood. Cartoon illustrations show the design and decoy folds X and Y. On the right, example bimodal distributions show the “ambivalency” of trRosetta distance predictions when a landscape has two energy wells.

This is all well and good. We’ve seen that trRosetta is really useful for predicting theoretical energy landscapes, and can help us cut down on computational work. But does it actually reflect physical reality? A more stringent challenge would compare trRosetta against real experimental data from lab testing.

Last year we published the experimental testing results for 145 Foldit player designs. When the researchers checked this data, they found that trRosetta sequence likelihood was a good predictor of success in the lab!


Figure 3A-B. trRosetta predicts experimental testing results. (A) When we look at the testing results for 30,000 IPD-designed proteins, we see that trRosetta sequence likelihood correlates well with folding stability (as approximated by protease resistance). By contrast, Rosetta energy of the design is poorly correlated with this stability measure. (B) Histogram of sequence likelihood (left) and Rosetta energy (right) for 145 experimentally-tested Foldit designs. Successful designs are in blue, and failures in gray. Sequence likelihood is a better predictor than energy alone, with less overlap between the success and failure distributions.

Optimizing the energy landscape

Finally, the researchers put trRosetta to the test, to see if it could actually redesign proteins to have favorable energy landscapes.

From the 4000 Foldit designs, they selected a representative set of 200 models and used trRosetta to redesign their sequences. Remember that, in Foldit, the original designs were made to optimize the energy (the Foldit score) of just the target fold. Now, trRosetta is trying to optimize the entire energy landscape, which encompasses the energies of all possible folds.

The results were surprising: although trRosetta was good for eliminating decoys and coarsely sculpting the energy landscape, the resulting landscapes lacked a sharp, deep energy well that we like to see for a stable, well-folded protein design. Instead, a combination of trRosetta (optimizing the landscape) and traditional design (optimizing the design energy) yielded the best energy landscapes, with a single deep energy well.


Figure 3C-D. Redesigning proteins with trRosetta energy landscape optimization. (C) Example energy landscapes for two redesigned Foldit proteins. Redesign with trRosetta alone produces a landscape with a single shallow well, and Rosetta lowers the energy without favoring a single energy well. Combining both approaches gives a favorable energy landscape with a single deep energy well. (D) The quality of energy landscapes across all 200 redesigned proteins. The colored lines show how many redesigns (y-axis) meet a threshold for energy landscape quality (x-axis; increasingly stringent threshold). Traditional Rosetta redesign (green) is susceptible to low energy decoys, and less than 50% of redesigns pass the lowest threshold; however, Rosetta redesigns that do pass have very deep energy wells and also tend to pass higher thresholds. trRosetta (purple) improves landscapes that fail the low-quality threshold, but cannot achieve deep energy wells that meet a high-quality threshold. A hybrid approach, in magenta, achieves the best of both worlds.
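For readers who want to see how a curve like the one in panel D is built, here is a minimal sketch: for each threshold, count the fraction of redesigns whose landscape quality meets it. The quality values and the labels `method_a` and `method_b` are invented for illustration and do not come from the paper.

```python
def fraction_passing(quality_scores, thresholds):
    """For each threshold, return the fraction of designs whose quality meets it."""
    n = len(quality_scores)
    return [sum(q >= t for q in quality_scores) / n for t in thresholds]

thresholds = [0.0, 0.5, 0.8]

# Invented landscape-quality values for two hypothetical design methods.
method_a = [0.2, 0.5, 0.9, 0.7]
method_b = [0.6, 0.6, 0.7, 0.65]

curve_a = fraction_passing(method_a, thresholds)
curve_b = fraction_passing(method_b, thresholds)
print(curve_a)
print(curve_b)
```

A method whose curve stays high as the threshold rises is producing deep, clean energy wells; a curve that starts high and drops quickly means the landscapes are acceptable but shallow.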

What does this mean for Foldit?

In all Foldit design puzzles so far, we’ve seen that players are very good at optimizing the score of their designs. But the real challenge of protein design is how to account for the rest of the energy landscape, and we still haven’t found a good way to do this in Foldit.

Some players probably remember the 2018 Foldit Partition Tournament, which challenged players to explore the energy landscape of each other's designs. That showed some promise, but was still time-consuming and low-throughput (we generated only 20 landscapes in 6 weeks).

trRosetta offers a fast alternative for predicting energy landscapes, and we may be able to combine it with normal Foldit scoring. trRosetta might be able to report the sequence likelihood of a Foldit solution, and even suggest mutations to improve its energy landscape.

One disadvantage with machine learning programs like trRosetta is that they are “opaque” and sometimes difficult to make sense of. We can’t really say why trRosetta makes certain suggestions, or ask which design features are causing problems. That could make it difficult to reconcile trRosetta suggestions with Foldit score components like clashing and H-bonding.

Another shortcoming of trRosetta is that it cannot suggest how to refold the protein backbone to improve an energy landscape. Some protein backbones are inherently more difficult to design than others (or even impossible). Finding designable backbones is an important aspect of protein design, and we think that’s a particular strength for Foldit players.

Still, trRosetta is clearly a useful tool for protein design, and we’ll be looking at ways to incorporate trRosetta into Foldit. Maybe players could find new and unexpected ways to use feedback from neural networks!

( Posted by  bkoep 89 941  |  Fri, 07/31/2020 - 21:00  |  7 comments )

Newsletter July 24: A Good Week for Go Science

Hey folders!

Dev Josh here with your weekly Foldit update. Congratulations to Go Science for taking the top spot on all three puzzles this week! Go Science has been an open and active group since 2010. One of the best ways to learn and improve in Foldit is to join a group.

If Go Science isn't your style, try the hopeful and determined Anthropic Dreams, the fun and light Gargleblasters, or the dedicated Contenders.

Solutions from This Week's Puzzles

(Disclaimer: This is not scientific feedback; these solutions are not officially endorsed by the Foldit scientists.)

Puzzle 1863: Refinement R1043

I've heard this puzzle was crashing pretty frequently. Thanks for your patience everyone, the devs are hard at work trying to fix these issues!

Puzzle 1864: Symmetric Trimer Design: Limited Interface

To master this puzzle, you needed to limit how big your binding interface was. Notice how the top scores rotated their helical bundles to limit their attachments!

Puzzle 1865: Coronavirus Anti-inflammatory Design 8

Bkoep said there were 15 unsolvable BUNS, but some of the top solutions got them down to 11! Great job on satisfying those BUNS everyone, keep it up!

Want to know more about why we're designing binders from scratch? Check out this forum thread for details on why we're not just using the ACE2 receptor design.

Recipe of the Week

This week's recipe is new but with great potential:
mwm64's UnBun is designed to help you reduce BUNS. This recipe only works on puzzles with the BUNS objective, and I haven't personally tried it out much, but I've heard a few folks are trying it. Plus, if you're looking to get involved with recipe evolving, this simple recipe could be a good way to get some practice with Lua. Given how important the BUNS objective is, we're going to need more recipes like this! So thanks mwm64 for making the first de-BUN-ifier!

Player of the Week

A quick shout-out this week to malphis, a friendly newcomer who joined a couple of months ago and has been really active in chat. Malphis has also been super helpful submitting bug reports to help the devs track down issues. Thanks!

Art of the Week

Looking for some more protein beauty? Check out this beautiful proteins blog! It's got a ton of real proteins that are naturally amazingly beautiful.

Today’s Master Folding Tips

Beginner: Before trying to wiggle your designed protein into the perfect shape, give it a mutate first! This will help the protein pack together better and give you a cleaner structure to work with. You can also mutate by hand: for example, although all of your amino acids start as isoleucine, it's actually better to set your loops to asparagine to start with.

Intermediate: Have you learned how to use the Rama map yet? We're working on a few new guides that should help make it easier to learn, but in the meantime Susume has two guides on how to use the Rama map to fix un-ideal loops and even copy a loop.

Expert: Are you planning your design before you make it? Before you start drafting, spend a few minutes thinking about what your design will look like. How long will each helix and sheet be? Will you try to make pi stacks? What part of the protein will bind at the interface, and how will that give it shape complementarity? Once you're ready, use Loci's AA Edit and SS Edit to enter your design and give it a quick early/midgame rinse. Then hand it off to a novice member of your group to evolve and try another design!

Have a tip to share with the community? Reply with your wisdom, or post on our Forums!

Until next time, happy folding!

( Posted by  agcohn821 89 1773  |  Wed, 07/29/2020 - 18:53  |  0 comments )

Foldit Newsletter July 17: Bonjour Encore Triple Hélice

Hey folders!

Dev Josh here with your weekly Foldit update.

3 Solutions from This Week's Puzzle

(Disclaimer: This is not scientific feedback; these solutions are not officially endorsed by the Foldit scientists.)

Puzzle 1861: Symmetric Trimer Design: Buried Unsats

Triple helix is here to stay. Look how clean and neat these bundles are! Great job, silent gene and Spvincent!

This solution took a less common approach to the triple helix meta. I'll let you decide for yourself whether you think it scored well or not. What do you think of it? Let us know in the Discord!

Puzzle 1862: Coronavirus Round 13

An extra special congratulations goes to clark92 for being top rank for this puzzle! This up-and-coming folder only started folding at the end of February, and already they've taken the leaderboards by storm!

These solutions come from some of our beginner folders! Can you tell what they could do better?

As a reminder, here are some helpful tips from bkoep on designing a good binder!

Want to get your top solution featured in the weekly newsletter? Click the "Share with Scientists" button in the "Open/Share Solution" menu and your solution might get featured! Don't forget to fill out our username sharing form if you'd like your username to be shown with your solutions!

Recipe of the Week

Not sure what recipes are good? Check out this all-in-one recipe: Constructor by Grom!
This mini-cookbook contains 19 different recipes all packed in one. Check it out for some inspiration this week!

Player of the Week

Big thanks to nspc this week for putting out two new French tutorial videos on how to get started with design puzzles and prediction puzzles.

If you're still on the intro puzzles, nspc also has a video on beating Hydrophobic Disaster.

I think I speak for everyone when I say merci beaucoup! Nspc (pc on Discord) is a beginner folder who has been learning fast by being really active in the chat. Say hi next time you see them around!

Art of the Week

Here's some art from 1861: a cool-looking triangle and a crazy ball of... I don't even know what... Thanks for sharing!

Today’s Master Folding Tips

Beginner:
Despite how common they are, I really recommend trying a helix bundle like the ones you've seen from the top-scoring solutions! Helices are easier to make than loops or sheets, so practicing on helix bundles is a great way to get a higher rank and practice the basics before trying something tricky and advanced like long loops or a sheet structure.

Intermediate:
Are you paying attention to which structures your AAs prefer to be in? It's not a hard-and-fast rule, but check out the wiki for AA structure preferences. I find this especially helpful for getting started by mutating my isoleucines away into something more suitable for the structure I'm designing, like asparagines for loops, valines for sheets, and MALEK for helices.
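If you like to plan on paper first, the rule of thumb above can be written as a tiny standalone Python sketch. This is not a Foldit recipe and does not use the Foldit API. It assumes one-letter secondary structure codes (H for helix, E for sheet, anything else treated as loop), and cycling through the letters M, A, L, E, K along each helix is just one arbitrary way to use the "MALEK" hint.

```python
HELIX_CYCLE = "MALEK"  # helix-friendly amino acids from the tip above

def starting_sequence(secondary_structure):
    """Suggest a starting amino acid for each position of a secondary structure string."""
    sequence = []
    helix_position = 0
    for ss in secondary_structure:
        if ss == "H":
            # Cycle through M, A, L, E, K along the helix (arbitrary choice).
            sequence.append(HELIX_CYCLE[helix_position % len(HELIX_CYCLE)])
            helix_position += 1
        elif ss == "E":
            sequence.append("V")  # valine for sheets
            helix_position = 0
        else:
            sequence.append("N")  # asparagine for loops
            helix_position = 0
    return "".join(sequence)

print(starting_sequence("HHHHHHLLEEE"))
```

Treat the output as a first draft only: you'll still want to mutate and hand-tune it for core packing and the puzzle's objectives.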

Expert:
How many structural motifs can you name? Most of you know pi stacking, some of you even know about beta hairpins. But do you know about ST turns, Greek keys, and Omega loops? What about sequence motifs?

Having these concepts in your toolkit will give you more conceptual legos from natural proteins to think about when designing. There's plenty of research out there on common patterns, and if you're looking for expert tips, then you're ready to dig into real literature. Good luck, and let us know what you find on the Discord!

Want to give your group a shoutout in the next newsletter? Reply with a blurb about what your group is and why new players should join, and your group might get featured in the next newsletter!

Until next time! Happy folding!

( Posted by  agcohn821 89 1773  |  Fri, 07/24/2020 - 03:47  |  0 comments )

Newsletter July 10: Triple-Quad Helices and Borromean Rings

(This post was originally sent out on July 10 to our mailing list. You can sign up for the mailing list here to receive weekly updates about Foldit, including tips and tricks and see the top-scoring solutions to the week's puzzles. Don't forget to join our Discord as well to stay in the chat even when you're not folding!)

Hey folders!

Dev Josh here with your weekly Foldit update.

Solutions from This Week's Puzzles

(Disclaimer: This is not scientific feedback; these solutions are not officially endorsed by the Foldit scientists.)

Puzzle 1858: Symmetric Trimer Design

Personally, I went with a 4-helix design for this puzzle, and it seems like that's what a lot of the highest scoring solutions did. But there were also a couple of 3-helix designs, and even some sheets!

Puzzle 1860: Refinement R1040

The highest scoring solutions for this puzzle kept two medium-sized sheets lined up and folded the rest into short helices around a core.

Compare this to some of the intermediate solutions. Although these folds are okay, they had some minor problems: some loose helices and poor scoring ends.

What was the trickiest part for you about this puzzle? Let's talk about it in Discord!

Recipe of the Week

This week's recipe has been described by Phyx as "The Best Recipe of 2014": Wisky's Repeating Rebuild All!

Let this late-game recipe run for 3-4 hours and it will do some rebuilding magic on your pose.

Player of the Week

I want to honor LociOiling for constantly being the #1 contributor to our wiki! This week he created the pages for Reaction Design Puzzles and Camera Controls! If you've ever read a wiki page that was made in the last few years, chances are Loci wrote it. Give him your thanks in chat next time you use the wiki!

Art of the Week

This week's most beautiful fold comes from Formula350 for his Borromean rings! This would never fold up in real life, but wow, is it pretty!

Today’s Master Folding Tips

Beginner: Don't be afraid to reassign your secondary structures to different sheets and helices! While this might seem like you're "changing the puzzle," you're really just making a suggestion for what shape the protein should take, and this suggestion can help your other tools better serve you. Try a bunch of different secondary structure assignments and use Ideal SS on them afterward, then see how this new arrangement might be easier or harder to fold. Play around with it, Foldit is about experimenting!

Intermediate: If you haven't learned to use Backbone Pins yet, I highly recommend it. This tool, hidden away in the view options, gives you more control over wiggling than CI alone. A locked pin is similar to a ZLB: it keeps that spot locked in place during wiggle while everything else moves more.

Expert: Although it might seem like more hbonds means better binding, hbonds at the interface don't actually add to the strength of the bind, since they aren't much stronger than these atoms simply binding to water. What use are interface hbonds then? Their purpose is eliminating BUNS. The real strength of your binding comes from hydrophobic interactions, shown in the Hiding and Packing subscores, and your hbond network gives the bind its specificity.

Want to recommend a recipe of the week or have your solution featured in the next newsletter? Send us your cookbooks and screenshots, we'd love to see what you're up to!

Until next time, happy folding!

( Posted by  joshmiller 89 1019  |  Tue, 07/14/2020 - 20:19  |  0 comments )
Developed by: UW Center for Game Science, UW Institute for Protein Design, Northeastern University, Vanderbilt University Meiler Lab, UC Davis
Supported by: DARPA, NSF, NIH, HHMI, Amazon, Microsoft, Adobe, RosettaCommons