agcohn821's picture
Joined: 11/05/2019
Groups: Foldit Staff


How to Foldit, Part 1: What’s a Protein?

Hey folders!
Dev Josh here with the first post in a series of tips for better folding. If you’ve already been playing for some time, you might already know most of this, but I’ll try to make sure there’s something for everyone to learn. Let’s get started!

(A note for readers of the future: if you’re reading this after this series was initially posted, I highly recommend giving yourself a few days to a week -- depending on how much you’re playing -- between each post. There’s a lot to digest here, and you’ll gain more from this series if you get some practice in-between posts instead of bingeing everything at once.)

The Basics

In every puzzle, your goal is to make a well-shaped, stable protein. Biology says "form follows function," so a beautiful shape makes an effective, stable protein and vice versa. If something looks wrong -- like it’s twisted weird or just kinda spaghetti -- that’s probably not how nature would want it. A good protein is neat and compact. It’s folded up, hence the name of the game. If a protein were just a long chain, it wouldn’t do much; it works by being a particular shape. Plus, proteins don’t like empty spaces in the middle of their shape. If there’s a gap, they’ll naturally collapse into it. But they also don’t want to be too tight -- like if you compress a spring, it’ll just fly apart when you let go. And that gets to Fundamental Folding Rule 1: Not Too Close, Not Too Far. In Foldit, things too close are "clashing" and empty spaces create "voids." Try not to have too many of either.

Okay, so we want to gently fold it up into a good shape. But what makes a good shape? We’re going to need a bit of jargon to talk about that, so get ready. <<I’ll put the extra sciencey bits like this, so if you don’t care about the details, you can skip these sections.>>

Your protein is made up of a series of segments called residues. Think of it like a long chain of bendy Lego pieces with arms sticking out of each piece. <<Each residue is an amino acid: there are 20 kinds of them in Foldit.>>

The parts of the residues that chain together are called the backbone, while the bits that stick out are the sidechains. Each residue is blue <<(hydrophilic)>> or orange <<(hydrophobic)>>.

Phew! Jargon over. Now we can talk about Fundamental Folding Rule 2: Orange In, Blue Out. Why? For now, you can think of oranges as sticky and blues as slippery. <<Of course, at the scientific level it starts to get more complicated as you factor in hydrogen bonds, entropy, and hydrophobic collapse. But we’ll save that for later.>>
Your protein will be submerged in water, so it needs a slippery outer coating to move around. The sticky inside helps it keep its shape by holding together like glue. If it had an orange outside, it might stick to something else and pull apart. Imagine making a paper plane with a wad of gum on the right wing. Maybe it would get stuck to the floor and you would try to pull it up, but because the gum is stuck, the whole thing unfolds. If the gum were on the inside instead, the folds would hold well together and it would be less likely to come apart. Remember: the goal is stability.

Okay, simple enough. But what if you’re stuck with a sequence like this? How are you supposed to move all the oranges to the center?

Let’s look closer at the pattern here: 2 orange, 2 blue, 2 orange, 2 blue... Nature is telling us something here -- it’s asking for a very particular and common shape, called a helix. <<Scientists use the Greek letter alpha as well, calling them α-helices.>> By making a corkscrew, we can get all the orange on one side and all the blue on the other!

Tip for the pros: Helices can range from 4 to over 40 residues!

Okay, new problem. This sequence is orange, blue, orange, blue. Our corkscrew approach won’t work because helices take about 4 residues per "turn," which is why they work well for the 2-2 pattern.

In this case, nature tends to like a different pattern, called a sheet. <<Technically, a sheet generally refers to a group of strands, where the picture below would be a single strand. A helix can be a helix by itself, but strands need to be near each other to be truly considered a "sheet." Scientists also preface sheets and strands with the Greek letter beta, as in, "β-strands." We’ll come back to this in Part 10.>>

Sheets are flat, and in Foldit they look zig-zaggy. In a sheet, sidechains alternate which direction they’re facing, so blue-orange-blue-orange becomes all blue on one side and all orange on the other.
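
If you like seeing this spelled out, here’s a rough Python sketch of why the two color patterns favor different shapes. It assumes each sidechain points a quarter turn (90 degrees) further around the axis in a helix, which is the "about 4 residues per turn" approximation, and a half turn (180 degrees) in a strand. The function name, the O/B letters, and the "left"/"right" labels are made up for illustration; none of this is Foldit code.

```python
def colors_by_side(pattern, degrees_per_residue):
    """Group residue colors by which side of the axis their sidechain faces.

    pattern: string of 'O' (orange) and 'B' (blue), one letter per residue.
    degrees_per_residue: how far the sidechain direction rotates per residue.
    """
    sides = {"left": set(), "right": set()}
    for i, color in enumerate(pattern):
        angle = (i * degrees_per_residue) % 360
        side = "left" if angle < 180 else "right"
        sides[side].add(color)
    return sides

HELIX = 90    # assumption: 4 residues per turn
STRAND = 180  # assumption: each sidechain faces opposite its neighbor

print(colors_by_side("OOBBOOBB", HELIX))   # 2-2 pattern on a helix
print(colors_by_side("OBOBOBOB", STRAND))  # 1-1 pattern on a strand
print(colors_by_side("OBOBOBOB", HELIX))   # 1-1 pattern on a helix
```

In the first two cases each side ends up with a single color. In the last case both sides get a mix of orange and blue, which is why the corkscrew doesn’t help with an alternating sequence.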

Helices and sheets are very stable structures -- they make pretty proteins that hold well together. Residues that aren’t helices or sheets are called loops. While loops aren’t very stable, they’re more flexible than helices or sheets, so they’re useful for making turns and connecting structures. These three types of backbone are called the secondary structures (or SS) of the protein. <<The primary structure is the sequence of amino acids.>>

But wait, what are those blue and white stripes in the picture? Those are hydrogen bonds. They form across sheets and within helices, which is part of what makes those structures so stable. <<Really, a hydrogen bond forms whenever an acceptor atom is adjacent to a donor and they share a hydrogen atom.>> And that brings me to Fundamental Folding Rule 3: Make Bonds. Hydrogen bonds are the most common, so we’ll focus on those for now. They form when one of the small blue dots is close (but not too close) to a red dot. How do you make them in bulk? Just like in the picture: build helices and line up sheets!

Tips for the pros:

  • Look for all the other ways you can form hydrogen bonds. Sidechains have blue and red dots on them too! Some even have hybrid purple dots (Tyrosine, Serine, and Threonine), which function as blue AND red.
  • If you turn on "Show Bondable H" in the View Options, you’ll see the white hydrogen dots. Hydrogen bonds (Hbonds) need the white dot to point toward the acceptor and away from the donor.

And that’s it! Three simple rules for folding proteins. Of course, it gets more complicated, and it’s not always easy to just "add more hydrogen bonds." So in the next part, we’ll take a look at some real folds to see what these rules look like in practice. In the meantime, make sure you’re on the Foldit Discord. It’s a great place to ask questions, get help, and have fun talking to other Foldit players.

Until next time, happy folding!

Summary:

  • A protein is made of a chain of residues, each with a sidechain, connected along a backbone
  • There are three types of secondary structures (SS): helices, sheets, and loops
  • Avoid clashes (stuff too close) and voids (empty spaces)
  • Orange on the inside, blue on the outside
  • Make hydrogen bonds by lining up sheets and building helices

Ready to take on the tutorial levels? Here’s a quick guide by S0ckrates or you can look up walkthroughs on the wiki. Don’t forget to check the FAQs for some basics about Foldit.

Looking for help with coronavirus? LociOiling made a three part video series (1, 2, 3) on the coronavirus puzzles! Susume made a longer video that goes into more depth on the coronavirus puzzles.

<<Want to know more science about an intro to proteins? Check out this video!>>



How to Foldit, Part 2: An Eye for Beauty

Hey folders!

Dev Josh here with the second post in a series of tips for better folding. <<Like last time, more detailed scientific information will be bracketed off like this.>>

Last time we went over the basics: the big picture of what we’re trying to do. Today we’re going to look at some examples to see what this actually looks like. Want to see this in practice? Click here for a video of a veteran folder drafting a design in a symmetric design puzzle.

Examples of Good Folds

This 2-helix, 4-sheet design comes from Bruno Kestemont from the recent coronavirus puzzles, where the goal is to make a binder protein that binds to the coronavirus spike protein (in gray). Notice how Bruno’s sheets line up very cleanly with each other and with the coronavirus. The loops are short and mostly there to connect the helices and sheets. This design is something Foldit players would also call “surfing hotdogs” because of how the helices “surf” over the sheets -- this is a very common type of design. Let’s look at a few more examples of winning solutions.

What can we learn from these examples? For one thing, helices are very strong! Silent gene’s and ZeroLeak7’s helical bundles show that. But a good line of sheets can be equally stable. We also see that “Orange In, Blue Out” isn’t a hard-and-fast rule: there can be a couple of exceptions, as shown by Skippysk8s in the upper left. <<And, in reality, hydrophobicity exists on a spectrum.>>

Another type of design is the beta barrel <<so named because sheets are more formally beta-sheets, and helices are alpha-helices.>> This one comes from Murlow in puzzle 733. Each sheet is slightly curved, allowing it to make a barrel shape.

So now we’ve seen some good examples. Now let’s look at the ugly stuff: the misfolded proteins. Some of these examples come from the coronavirus puzzles again, so ignore the gray parts: we’re just looking at the protein design right now.

Examples of Bad Folds

Notice any issues? For one thing, this design doesn’t have a “core.” There’s no “inside,” so it’s free to flop about, which can cause it to pull itself apart. And although there are some helices, most of the helices are too short to be stable. The shortest helices are a full turn (4 residues), so anything less than 4 residues of a helix will just become a poor loop. Generally, you want a high volume to surface area ratio by making the protein compact rather than spread out. Okay, let’s try another one.
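
To put a number on "compact rather than spread out," here’s a quick back-of-the-envelope sketch in Python. It treats the protein as a simple box made of 64 unit blocks, which is a big simplification I’m assuming purely for illustration, and compares a cube with a long rod of the same volume.

```python
def volume_to_surface(a, b, c):
    """Volume divided by surface area for an a x b x c box."""
    volume = a * b * c
    surface = 2 * (a * b + b * c + a * c)
    return volume / surface

print(round(volume_to_surface(4, 4, 4), 2))   # compact cube, 64 blocks
print(round(volume_to_surface(1, 1, 64), 2))  # stretched-out rod, 64 blocks
```

The cube comes out around 0.67 and the rod around 0.25: same amount of material, but the rod exposes far more of it to the outside.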

The helices are better here: clearly stable, nice and straight. The one in the upper left might be too short though, and it’s likely to become a really long loop. Loops are useful for their flexibility, but that comes at a trade-off to stability. A long loop like that might just flop about and start twisting on the rest of the protein, or get caught on something because it’s not tucked into the protein’s core. This protein is also taking up too much space -- notice how the helix on the bottom is not attached to anything, free to wave around from the long loop it’s connected to. There’s so much space between this helix and the others, so this is very unlikely to be how a protein would want to fold up.

Okay, this one has some sheets: what do you think of how the sheets are folded? See how the two in the center are fairly straight, but the one in the upper right curls up a bit? That’s a little weird. Generally, sheets will have similar curvature -- they may not always be perfectly straight, but if they’re going to curl, they’ll usually all curl in a similar way. It’s like drawing a character in visual art, where it’s good to have a clear “line of action” in the pose. Coincidentally, a protein’s fold is also called a pose.
There’s an orange sidechain exposed on the end too, which is what that yellow orb is pointing out. What about the two sheets on the bottom -- how exactly do they fit in with the overall structure? They’re lined up okay with each other, but they’re at a somewhat weird angle compared to the rest of the protein.

This one is super interesting because I originally included it as an example of a bad fold. The sheet closest to the camera is just totally misaligned, right? But I checked with our scientists, and actually this design looks really good! They said that Foldit players often try too hard to straighten their sheets, when in reality sheets are messy: bent a bit, and sometimes too long or too short. Obviously, this complicates the nice simple rules we were trying to establish for what makes a good protein, but nature is messy sometimes and that’s okay. What does this mean for you? Don’t be too worried about whether sheets look perfectly straight; allow them to bend the way they want to bend!

Alright, last example for now. I’ll leave this one for you. What problems can you identify? How are the sheets, helices, and loops? Is there a core? Is there too much empty void? Have a question? Come talk about it on Discord.

Check out this page for more on aesthetics. To see more examples, check out my personal gallery of images, which includes good examples, bad examples, and real scientific models of actual proteins.

Tips for the pros: Get some practice looking at real protein models to keep training your eye.

Now that you can see what’s good and bad, how do you actually fold a protein? In the next part, we’ll look at your tools of the trade to see what you can use to make beautiful folds.

Want to keep practicing? The best way to see more examples is to join a group! In a group, you can play as an evolver, taking other players’ solutions and evolving them to be better. This is one of the best ways to get practice and see more folds. You can see the list of groups by most members or by highest leaderboard scores.

Until next time, happy folding!

Summary:

  • Helices are strongly stable but need to be at least 4 residues long
  • Helices are especially strong in a “bundle” of 3-4 helices
  • Sheets are stable but need to be connected with a clear “line of action”
  • Loops are flexible but unstable, so short loops are typically better
  • A good protein has a core with little void space in the middle
  • Common designs include “surfing hotdogs” and “beta barrels”


How to Foldit, Part 3: Tools of the Trade

Hey folders!
Dev Josh here with the third post in a series of tips for better folding. Please see Part 1 and Part 2 above. <<As always, more detailed scientific information will be bracketed off like this.>>

It’s time to actually get to work! Let’s look at our toolkit to see what we can use to fold our proteins. I’ll be talking about when to use each tool, so you need to know the three phases of gameplay in a Foldit puzzle:

  • Early game - This is the drafting phase. Don’t even look at your score, just go by rough shape. Mock something good up and don’t worry about minor problems.
  • Mid game - You’ve got a rough draft, now clean it up a bit. Here you’re looking for voids, clashes, and exposed oranges. You’re looking for more bonds you can make and problematic angles to straighten out. Your score will go positive during this phase and you’ll start looking at it for indications that you’re doing something right.
  • Late game - This is pure refinement. There’s actually very little scientific value in this phase, so if you’re just here for the science, you can submit what you’ve got at the end of mid game and try a different design. But if you want to get to the top of the leaderboards, you can put a few days of work into polishing out that last fraction of a point.

I’ll also mention each tool’s default hotkey, or keyboard shortcut. For tools that “run,” you can cancel them by pressing the hotkey again. Pressing Esc or spacebar will stop/cancel any tool.

Main Tools

These are your bread and butter, your go-to staples for daily folding. They’re mostly used in the early game when you’re “hand-folding,” or moving the protein manually (as opposed to running automated scripts).

Pull
The Pull tool, also sometimes called "Nudge", is the first tool you get. Click and drag to tug the protein how you want it. Pull is a useful tool for drafting your shape in the early game, especially in combination with rubber bands and freezing.

Tips for the pros:

  • Nudge is really just a local wiggle with a rubber band to your cursor. You can get more precise nudges through a combination of rubber bands, freezing, and wiggling.
  • You can turn Clashing Importance down to 0 to Pull the protein through itself without it bouncing off. This is really handy for drafting a shape in the early game without making cuts.
  • On the subject of terminology, veteran LociOiling thinks of nudges as a “short pull plus shake and wiggle,” which can be useful in the early game.

Shake
Shake is exactly what it says: it’s what you get if you hold the backbone steady and give the sidechains a good jostling. What’s it good for? Shake makes sure all of your sidechains are where they want to be. Typically, this means they’ll move into the space that’s least crowded, so shake is good for clearing clashes between sidechains. Shake also comes in a local variant if you just want to shake part of your protein. The default hotkey for Shake is S. <<The scientific term for shaking is “repacking”.>>

Tip for the pros: Shake is modified by the Clashing Importance.

Wiggle
Oh, wiggle. Wiggle wiggle wiggle. Wiggle is by far the most used tool in Foldit. It has single-handedly won tutorial levels and is a key player in most of Foldit’s best algorithms. At its most basic level, the gameplay loop of Foldit is (1) try something, fully knowing it will lose you many points, (2) shake and wiggle to get the points back and see if your change made it better. So what makes wiggle so good?

At a high level, wiggle jostles the entire protein into a more comfortable shape. But really what it’s doing is making a ton of micro-adjustments to the protein: if the score goes up, it keeps that change; otherwise, it tries something else. It does this until it gets stuck and can’t find a better score. This doesn’t mean that’s the best score your pose could ever have, it just means the computer needs your help to find something better. Like shake, you can wiggle part of your protein (locally) or all of it (globally). The default hotkey for Wiggle is W.

<<Wiggle is scientifically known as an energy minimizer. It works by finding the local minimum energy (or maximum game score) in the multidimensional energy landscape. More on that here>>
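
For the curious, the "keep it if the score goes up" loop can be sketched in a few lines of Python. This is only a toy: the one-dimensional score function, the step size, and every name here are my own assumptions, and Foldit’s actual Wiggle works on a whole protein rather than a single number.

```python
import random

def toy_score(x):
    """Made-up score with a small peak at x = -2 (score 1) and a tall peak at x = 2 (score 3)."""
    return max(1 - (x + 2) ** 2, 3 - (x - 2) ** 2)

def toy_wiggle(x, step=0.1, tries=2000, seed=0):
    """Try small random nudges to x and keep only the ones that raise the score."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    best = toy_score(x)
    for _ in range(tries):
        candidate = x + rng.uniform(-step, step)
        score = toy_score(candidate)
        if score > best:
            x, best = candidate, score
    return x, best

print(toy_wiggle(-3.0))  # climbs the small peak and gets stuck there
print(toy_wiggle(0.5))   # starts elsewhere and finds the tall peak
```

Starting at -3, the toy wiggle settles on the small peak and never discovers the tall one, because every small step toward it would lower the score first. Moving the starting point by hand is the "computer needs your help" part.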


Wiggle Power
In the Behavior tab, you’ll also find “Wiggle Power,” which can be set to Low, Medium, or (sometimes) High, or left on Auto to let the computer choose which power to use. Low power won’t try to compute “ideality,” or enforce proper bond angles. This is useful during the early game when you want to gently fold the protein into a rough shape. Low power keeps your fold pliable and malleable so that you can keep working with it. Medium and High power increasingly force ideality, becoming ruthless optimizers that will find every fraction of a point at the cost of hardening your protein’s shape. This kind of wiggle is good for late game refinement when you’re done making changes. If you do a high-powered wiggle too early, your pose will become hard to work with, like firing your clay before you’re done shaping it. High power wiggle is usually only available during Revisiting puzzles. For more information, check out the original blog post on wiggle power.


Fuzing
Like Shake, Wiggle considers Clashing Importance (CI). As taught in the tutorial level Control Over Clashing, this means that trying to wiggle with clashes present will result in everything flying apart. If you want to wiggle with clashes, it’s often a good idea to try shaking first to see if that gets rid of them. If you still need to wiggle with clashes, the answer is Fuzing. Fuzing (sometimes “fusing”) is a general strategy of lowering the clashing importance and then slowly raising it while wiggling. Shakes are often thrown into this process as well, for example: Shake (0% CI), Wiggle (0% CI), Shake (5% CI), Wiggle (5% CI), . . . and so on. In many fuzing strategies, the CI is lowered again and brought back up repeatedly. In fact, the fuzing recipe Blue Fuse was compared favorably with the scientific algorithm Fast Relax in a scientific publication in 2011 by the Foldit devs.
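
Here’s one way to write that kind of schedule down, as a small Python sketch. It only builds the list of steps; it doesn’t talk to Foldit. The 0% and 5% levels come from the example above, while the 25%, 50%, and 100% levels and the function name are assumptions I picked for illustration, not the values any particular recipe uses.

```python
def fuze_schedule(ci_levels=(0, 5, 25, 50, 100)):
    """Return (tool, CI percent) steps: a shake then a wiggle at each CI level."""
    steps = []
    for ci in ci_levels:
        steps.append(("shake", ci))
        steps.append(("wiggle", ci))
    return steps

for tool, ci in fuze_schedule():
    print(f"{tool.capitalize()} ({ci}% CI)")
```

To mimic strategies that drop the CI and bring it back up again, you could pass a longer sequence such as (0, 5, 25, 5, 25, 50, 100).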

Mutate
Although this tool is only available in design puzzles, it can come in handy often. Mutate will look at all of the “mutable” (able to be mutated) sidechains and change the sidechain to a different one if the change would improve your score. Although this can be handy for quickly generating decent sidechains, the Mutate tool doesn’t think about "Orange In, Blue Out", so you might want to consider some mutation recipes. By default, press M to Mutate.

Rubber Bands
As their name implies, rubber bands tie two parts of the protein together, though each band is more of a suggestion than a hard constraint. You can modify the strength (how strongly it pulls) and the length (how far apart it’s trying to pull to) of each band. Rubber bands are useful for lining up sheets (set the length to around 2 for forming hydrogen bonds; the default length of 3.5 is better for backbone-to-backbone bands), or keeping your helices straight, or generally compacting your protein by banding everything to everything else and crunching it together. You can also make a band to empty space (a Band in Space, or BiS) to tether the protein to the invisible 3D map. Another common strategy is making a bunch of bands of length 0 (Zero Length Bands, or ZLB). This has the effect of tethering your protein to where it is, so that any serious jostling will be more likely to keep its rough shape. Foldit players have found thousands of uses for bands; they are one of the most versatile tools in your toolkit. By default, press D to toggle the bands being Disabled or not, or press R to Remove all bands.

<<Rubber band lengths are measured in ångströms (symbol: Å), which is the canonical scientific unit of length measurement at the level of proteins. An angstrom is a metric unit equal to one hundred-millionth of a centimeter, or 0.1 nanometers.>>

Tip for the pros: Rubber bands can push too! If the current length is less than the band’s desired length, the band will try to push the two ends apart.
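
A tiny Python sketch of the push/pull idea, plus the unit conversion. I’m assuming a band behaves like a simple spring (force proportional to how far off the target length it is); that’s my simplification for illustration and not necessarily how Foldit computes it. The function names are made up too.

```python
def angstrom_to_nm(angstroms):
    """Convert ångströms to nanometers (1 Å = 0.1 nm)."""
    return angstroms / 10

def band_pull(current_length, target_length, strength=1.0):
    """Positive result: the band pulls its ends together. Negative: it pushes them apart."""
    return strength * (current_length - target_length)

print(angstrom_to_nm(3.5))   # the default band length, in nanometers
print(band_pull(5.0, 3.5))   # ends too far apart: positive, so it pulls
print(band_pull(2.0, 3.5))   # ends too close: negative, so it pushes
```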

Freeze
Computers can do a good job optimizing a fold, but because they can’t see the overall shape, they don’t know what shapes are good and what shapes need fixing. Freeze is your ability to tell the computer “this is a good shape, don’t change it.” A frozen residue will not bend or move its sidechain (if the sidechain is frozen), though it can still move in space. Some uses of freeze are to limit what you’re Pulling on or to keep a structure’s shape while you’re working on arranging the pieces in the early to mid game. By default, press F to freeze/unfreeze everything, or shift-click to freeze a residue (shift + double click to freeze an entire structure).

Cut and Move
I mention these tools together because "cut and move" is a classic strategy for repositioning your protein. Clicking on your pose will bring up the Move tool, allowing you to rotate or translate the pose in 3D space. By itself, this is mostly equivalent to moving the camera unless there’s something else in the puzzle like a ligand or binding target. But by Cutting the protein into different pieces, you can Move them individually. The benefit of this, of course, is that it’s much MUCH easier to move your pose around, like Lego pieces rather than a massive structure. The drawback is that merging cutpoints back together can sometimes introduce distorted bond lengths and angles that you’ll need to clean up later. By default, wiggle will pull cuts together as if they were banded, which is good for merging the cutpoints. But you can disable this in the Behavior tab by unchecking “Enable Cut Bands.” This makes your protein pieces entirely independent.

Tip for the pros: Having trouble merging your loops back together? Try making the cutpoints in the middle of helices and sheets instead of at the loops.

Ideal SS
Similar to Idealize described in the next part, Ideal SS adjusts the angles of sheets and helices to straighten them out. This is useful in the early game, as straight sheets are easier to line up. Ideal SS is also helpful when first making structures in Structure Mode. During late game refinement, sheets and helices might want to bend just slightly, but they shouldn’t look too different from what an ideal structure would be.

Tip for the pros: You can also straighten sheets using the Tweak tool. Start tweaking a sheet, then grab the purple dot and pull it away from the sheet to straighten it out.

Modifiers
I see these tools as “support” rather than primary functions on their own, but they’re still a major part of your folding toolkit.

Clashing Importance
As mentioned above, Clashing Importance (CI) changes the percent to which tools like shake, wiggle, and mutate care about clashing. Literally, when considering if the score got better or worse, this percent is used as a multiplying factor for your clashing score. Lowering the CI is useful for Fuzing strategies, and setting it to 0 is useful when you’re just drafting in the early game. Generally, a lower CI will help the pose become more compact (since it cares less about getting too close), while a higher CI will give it breathing room.

<<Clashing comes from the repulsive part of the Van der Waals interaction. It’s calculated with the Lennard-Jones potential, which approximates Van der Waals forces as a function of distance.>>
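
Here’s the "multiplying factor" idea as a minimal Python sketch. The numbers are invented, and lumping everything that isn’t clashing into a single "other" value is my simplification; the real score has many more parts.

```python
def weighted_change(other_change, clash_change, ci):
    """Score change as the tools see it: the clashing part is scaled by CI (0.0 to 1.0)."""
    return other_change + ci * clash_change

# A made-up move that packs things tighter: +10 from everything else, -30 from clashing.
print(weighted_change(10, -30, ci=1.0))   # full CI: looks like a bad move
print(weighted_change(10, -30, ci=0.25))  # low CI: looks like a good move
print(weighted_change(10, -30, ci=0.0))   # zero CI: clashes are ignored entirely
```

The same move is rejected at full CI and accepted at low CI, which is why lowering the CI lets the pose squeeze together.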

Backbone pin
Hidden in the View Options, the backbone pin sets the “root” of how your protein folds onto itself. Imagine holding a piece of cooked spaghetti with two fingers: the pin is where you are pinching it. This means that the protein will be rooted to the pin, not moving much nearby, and have a great deal of flexibility furthest from the pin. There’s one pin per “piece” of your pose, so if you make a cut, you now have two pins. A backbone pin can be moved, locked, and unlocked -- while it’s unlocked, it’s allowed to slide along the protein when doing something like wiggling. But if you lock it, the pin will stay exactly on that residue until unlocked or moved. Here’s a video demonstration if you’re a visual learner. Why move and lock it? Essentially, you want to put pins on the good parts, where you want the protein to stay still. It can take some getting used to, but once you attune yourself to the physics of pinning and wiggling, the backbone pin is a powerful tool for controlling how you shape your pose.

Blueprint
The Blueprint panel is not available for every puzzle, but it can be useful in the early game when you’re drafting out a secondary structure. Blueprint offers a selection of building blocks for constructing an ideal secondary structure, in particular with ideal loops. I see it as a modifying tool because it adds “torsion constraints” to your SS. If Blueprint is available, I recommend using it in the early game for drafting a structure and then removing the constraints (by dragging the yellow rectangles outside of the panel) during the “eke and tweak” late game.

Quicksave
Although it’s not really a tool on its own, the Quicksave feature is a beloved part of every veteran folder’s toolkit. By default, ctrl+shift+[number] will save to a quicksave slot, and ctrl+[number] will load that quicksave. This can be very helpful for trying different approaches in quick succession.

Tip for the pros: You can manually access quicksave slots 1-8 using hotkeys, but recipes can use quicksaves from 1-99!

Selection Interface
While on the subject of “not a tool but incredibly useful,” the Selection Interface is your key to expert folding. There are some things that can only be accomplished in this mode. The Selection Interface completely revamps your UI, changing many hotkeys and mouse functionality. In this mode, you can select a set of residues to apply a tool to, allowing you to, for example, shake exactly 12 residues and nothing else. For precision folding, this mode is a must.

That’s a lot of information, so I’ll wrap up this part for now. Next time, we’ll look at some of your tools for optimizing your pose.

Until next time, happy folding!

Summary:

  • Use Freeze and Pull or Cut and Move to draft a rough shape in the early game
  • Use Ideal SS to turn a freshly-assigned secondary structure into the perfect shape
  • Shake and wiggle (and sometimes mutate) are your primary tools for cleaning up a hand-fold
  • Rubber bands are a versatile everyday tool for adding constraints to where things should go
  • Shake and wiggle are modified by clashing importance, backbone pins, and blueprint constraints
  • The most common hotkeys are Shake (S), Wiggle (W), Disable/Enable bands (D), Remove bands (R), and Mutate (M). For more hotkeys, check the wiki.

Can’t get enough tips? Check out the Black Belt Folding series on YouTube.



How to Foldit, Part 4: Putting the Fine in Refinement

Hey folders!
Dev Josh here with the fourth post in a series of tips for better folding. << As always, more detailed scientific information will be bracketed off like this. >>

Last time we covered your main tools and some modifier tools. Today we’re looking at your toolkit for pose optimization. In other words, “I have a draft, what now?”


Optimizers

Idealize
“The Idealize tool adjusts the lengths and angles of peptide bonds in the protein backbone.” What does this mean? Well, sometimes after merging cutpoints, your backbone ends up being unrealistically stretched out. Idealize will fix this so your pose is realistic again. This can have the unfortunate side effect of moving things around in a way you don’t expect or want, though, so it can be tricky to idealize while keeping everything where you want it. It can take a combination of freezing, lowering CI, applying Zero Length Bands (ZLBs), and even other cutpoints to idealize a problem without causing more problems.

Tip for the pros: The recipes Microidealize and Cut and Wiggle are great for idealizing.

Pick sidechain
Once you start trying to form hydrogen bonds (more commonly, hbonds, pronounced “H-bonds”) between sidechains, you’ll need to be very selective about your rotamers, or the positions of the sidechains.

Although Shake selects the highest scoring rotamer, sometimes you’ll want to manually pick one to form more hbonds. In a pinch, you can do this with the Pull tool, but for fine-grain precision you want to Pick Sidechain. Open this up and you’ll see a cloud of possible configurations for your sidechain. On the right is a list of your options: the blue bar shows how common each rotamer is, with more blue being more common.

Tip for the pros: In Selection Mode, you can select multiple sidechains at once.

What’s up with the names for these rotamers? They’re shorthand notations for the rotation of each bond within the sidechain, with “m” meaning “Minus 60 degrees,” “p” being “Plus 60 degrees,” and “t” being “Trans (180 degrees).”
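
If it helps, here’s that shorthand as a small Python lookup. It assumes a name made up only of the letters m, p, and t, one letter per bond; any other character is outside this sketch and will raise an error. The function name is my own.

```python
ROTAMER_ANGLES = {"m": -60, "p": 60, "t": 180}

def decode_rotamer(name):
    """Turn a name like 'mt' into a list of approximate bond rotations in degrees."""
    return [ROTAMER_ANGLES[letter] for letter in name]

print(decode_rotamer("mt"))   # [-60, 180]
print(decode_rotamer("ptm"))  # [60, 180, -60]
```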

Rebuild and Remix
These two tools, both similar in name and function, trip up every new player. But understanding them is critical for master folding, so let’s break them down a bit.

Both Rebuild and Remix are midgame refinement tools for finding a new shape for a fragment of your pose, that is, a contiguous section of residues. << Technically, “fragment” refers to the shape itself, and the section of residues is called the region. >> They do this by comparing the current shape of your fragment to a fragment library. By searching through this database of common (and good) shapes, the game can suggest alternative similar shapes that might be better.

<< For more on fragment libraries, check out this scientific article. >>

Rebuild is an automatic tool. Let it run and it’ll search through the fragment library for 3-residue fragments that could go into your selected section: you can rebuild any length, but it searches 3 residues at a time, using the rest of the residues you selected as a “tail” to bridge the changes with the rest of your pose. Manually rebuilding is hard, and most players rely on recipes like Enhanced Deep Rebuild Worst (EDRW) for rebuilding. (A full guide to using EDRW is available on the wiki!)

Tips for the pros:

  • Rebuild considers your assigned SS, freezing, and any bands attached to the fragment -- use these to influence what rebuild suggests for a new shape.
  • Assign loops to your fragment if you’re not sure what the SS should be.
  • When manually rebuilding, keep an eye on your options. Sometimes the computer will reject a low-scoring shape when the low score is only because of a clash that you can easily clean up!

Remix is a manual tool. Select a fragment of 3-9 residues and you’ll get a set of choices that you can flip through to see what shape might fit. Unlike Rebuild, Remix has libraries for fragments other than length 3. Or another way to think about it: Rebuild forces random fragments of length 3 (or “3-mers”) into your selected region, while Remix looks for a complete fragment that’s compatible with your selection (the endpoints match).

Tips for the pros:

  • Each length has a different fragment library. If you don’t get any good options, try adding or removing a residue from your selection.
  • Try including the ends of your structures to get more options for your loops.
  • Use cuts to isolate the effects of your remixing.
  • Remix is a good tool if you’re looking for ideal loops.

Tweak
Tweak is a pretty versatile tool for adjusting your SS. Tweak can:

  • Straighten a helix
  • Straighten a sheet
  • Rotate a helix
  • Flip a sheet

Straightening a sheet is really helpful for lining up your sheets in the early game, as it gives them all the same flat line of action to initially set up your hbonds. Rotating a helix is good for burying those exposed oranges, as is flipping a sheet. On a sheet, the even-numbered sidechains are on one side, and the odd-numbered on the other (remember: this is why sheets are great when the sidechains alternate orange, blue, orange, blue). Flipping the sheet reverses which side each sidechain is on.
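Here’s a small Python sketch of that even/odd bookkeeping. The function name and the “A”/“B” side labels are invented for illustration, and it only tracks which face each sidechain points to, not the actual geometry of the flip.

```python
def sheet_sides(residue_numbers, flipped=False):
    """Label each residue's sidechain as pointing to side "A" or side "B"."""
    sides = {}
    for n in residue_numbers:
        even = (n % 2 == 0)
        # Flipping swaps which face the even and odd residues point to.
        points_to_a = even != flipped
        sides[n] = "A" if points_to_a else "B"
    return sides

print(sheet_sides(range(10, 14)))                # {10: 'A', 11: 'B', 12: 'A', 13: 'B'}
print(sheet_sides(range(10, 14), flipped=True))  # {10: 'B', 11: 'A', 12: 'B', 13: 'A'}
```

If side “A” is the one facing your core, you want the flip that puts the oranges on “A”.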

Tips for the pros:

  • Flipping a sheet is a non-reversible action, as in flipping one way and then the other way won’t bring you to your starting position. Use Undo instead if you don’t like your flip.
  • You can flip helices too! Assign the helix to be a sheet, flip it, then re-assign it to a helix. This has the effect of rotating the helix 90 degrees. You can use a similar procedure to rotate sheets. Veterans know this as the “Timo maneuver,” named after TvdL’s advice on the Black Belt Folding series.

Rama Map
The Rama Map, or Ramachandran Map, is notorious for being the tool that no one understands. But we’re about to fix that.

First: what’s the point of the tool? The Rama Map is the only in-game ability that directly lets you rotate a residue’s backbone without affecting anything else. When is this useful? You can use it to fix your backbone angles, quickly rotate a region of your pose, or even copy your loops.

How does it work? Each dot on the map represents one of your residues. When you click on a dot, the map will show colored regions for that amino acid’s preferred rotations. If the highlighted dot is in a white section, that’s unrealistic.

The x-axis and y-axis represent two different backbone angles << phi (φ) and psi (ψ) respectively. >> Because these are rotation angles, the length of each axis is 360 degrees. Imagine the map wrapping around on itself like a donut: the edges are -180 degrees and +180 degrees, which are effectively the same, so a dot that slides off one edge reappears on the opposite one. These angles are what make a secondary structure what it is. In the last part, we discussed Ideal SS. Now I can tell you that this early-game tool will set the phi and psi backbone torsion angles to the ideal values for a helix or sheet.
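The wrap-around is easier to trust once you’ve seen the arithmetic. This is just a Python sketch with helper names of my own choosing; it has nothing to do with how Foldit stores angles internally.

```python
def wrap_angle(degrees):
    """Map any angle onto the Rama map's range of [-180, 180)."""
    return (degrees + 180) % 360 - 180

def angle_gap(a, b):
    """Smallest rotation (in degrees) separating two angles on a wrapping axis."""
    return abs(wrap_angle(a - b))

print(wrap_angle(190))       # -170
print(wrap_angle(-190))      # 170
print(angle_gap(179, -179))  # 2
```

Two dots at +179 and -179 look like they’re on opposite edges of the map, but they’re only 2 degrees apart.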


Try it out for yourself!
With the Rama map pulled up, assign a loop or otherwise straight chain of residues to be a helix, then use Ideal SS and watch all the dots collapse down into one point in the red region!

What about the colors, what do they mean? Besides being the preferred angles, the colors are in ABEGO coloring, the same color scheme used in the Blueprint panel. Hop aboard the tangent train, because we need to go down the rabbit hole a bit for this to make sense.


ABEGO Coloring
The ABEGO color scheme categorizes backbone angles by “torsion angle class.” There are 5 classes, one for each kind of SS:

  • A (red) - right-handed (α-)helices
  • B (blue) - right-handed (β-)sheets
  • E (yellow) - left-handed (β-)sheets
  • G (green) - left-handed (α-)helices
  • O (no color) - this special case represents the cis omega (ω) angle; because it doesn’t relate to phi and psi, it has no color and isn’t on the Rama map; see the wiki page on ABEGO coloring for more information
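Since each class is really just a region of the map, the list above can be sketched as a tiny Python lookup. This is only an illustration: the function name is made up, and the cutoff angles are assumed, approximate values in the style of the Rosetta ABEGO convention, so Foldit’s exact boundaries may differ.

```python
def abego_class(phi, psi, omega=180.0):
    """Return the ABEGO letter for one residue's backbone angles (degrees).

    The cutoffs are assumed, approximate values, not Foldit's confirmed ones.
    """
    if abs(omega) < 90:
        return "O"  # cis peptide bond, regardless of phi and psi
    if phi >= 0:
        # Right half of the map: the left-handed classes.
        return "G" if -100 <= psi < 100 else "E"
    # Left half of the map: the right-handed classes.
    return "A" if -75 <= psi < 50 else "B"

print(abego_class(-60, -45))   # A (typical helix angles)
print(abego_class(-120, 130))  # B (typical sheet angles)
print(abego_class(60, 45))     # G
```

Notice that the left/right split is decided by phi alone: everything with a negative phi is A or B, and everything with a positive phi is G or E.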

So now we know that the red regions are for helices and blue are for sheets. Green is a bit less common, and is only a major region for glycine: the amino acid with no sidechain (giving it the most flexibility), although some other amino acids like asparagine have a green area too. Yellow is even rarer, only appearing at all for glycine.

But now we’ve introduced a new question: what are left-handed and right-handed secondary structures? If you are really interested in mastering the game or learning about the science, we’re going to take this tangent further to talk about chirality. Otherwise, feel free to skip this next section.


Chirality
Something has chirality if “it is distinguishable from its mirror image; that is, it cannot be superimposed onto it.” What does that mean? Well, take our hands for example, which are actually what chirality is named after (in Greek).

No matter how you rotate them, your left hand and right hand can’t be lined up while still facing the same direction (palm up). Amino acids have this property too. So a structure will either be “left-handed” or “right-handed” depending on which way the backbone is facing. This has important implications for designing loops and tertiary structures, or the way your secondary structures fold up with each other. But that’s getting a bit deep for right now, so let’s come out of this rabbit hole.

So how do you use the Rama map? Make sure every dot is in a colored region, and if you want to rotate your loops, you can adjust the rotation of each residue individually by shifting its dot around a bit.

The top of the tool also has a gallery preview of ideal loops to let you see what kind of loops look good. Some of these loops are also available as building blocks in the Blueprint panel. (Note that this view is in Stick mode, which I’ll cover in the next part.)

For more tips on using the Rama Map, check out Susume’s videos on fixing a bad dot and copying a loop.

I think that’ll about do it for this part, my brain is fried just writing this. In the next part, we’ll look at the tools for visualization: once you know how to refine, where do you refine? What do you look for, and how do you find it?

But remember, Foldit isn’t just about refinement! The primary scientific goal of Foldit is about trying many possibilities. If something’s just not working, scrap it! Try something else! Try two helices, try a helix and three sheets, or maybe four sheets and three helices with loops like bendy straws. Or look up examples of real-life proteins and try to make something similar to those. By trying everything, you’re learning faster and giving yourself more opportunities to discover something great.

Have a question? Leave a comment below, or make a post in our forums.

Until next time, happy folding!

Summary:

  • Idealize is an early/mid-game tool to fix your bond angles and lengths in the backbone.
  • The Pick Sidechain tool lets you manually select rotamers for finding the perfect rotation of a sidechain.
  • Rebuild is a mid-game tool that will automatically force length 3 fragments into a region to try to find a better shape for it.
  • Remix is a mid-game tool that lets you manually compare a region of 3-9 residues to the fragment library of that length to find a better shape for it
  • Tweak is an early/mid-game tool for straightening, rotating, and flipping your SS
  • The Rama map lets you visualize and manually adjust the rotation angles of your backbone
    • It uses ABEGO coloring, which classifies angles based on SS and chirality
Blog 5


How to Foldit, Part 5: Seeing the Problem

Hey folders!
Dev Josh here with the fifth post in a series of tips for better folding. << As always, more detailed scientific information will be bracketed off like this. >>

Last time we covered your tools for optimizing and refining. But where do you look to make refinements? Today we’re covering your tools for visualization. Starting with...


Camera Controls


Controlling the camera is the first step to really feeling in control of the protein. By now you probably know about clicking and dragging on the background to rotate your view. But there are many more hidden features to the camera! I also cover these in my tutorial video on moving stuff around in Foldit.

Rotating
Clicking and dragging around the edges of the background will keep the camera’s position fixed and only spin it around while still facing your pose.

Zooming
You might know that scrolling the mouse wheel zooms in and out, but you can zoom in/out even faster by holding the right mouse button (or Ctrl + left click) down and moving the mouse forward and backward.

Panning
You can hold the middle mouse button down (or Shift + left click) to pan the camera with your mouse.

Resetting
The Home key resets the camera to the default position.

Focusing
Pressing Q on a residue will focus the camera on it, while Shift+Q on the residue will also clip the foreground of the protein so you can get a better view of the residue you’re trying to focus on. Pressing Q or Shift+Q on the background will reset the focus on the center of the protein, which is useful when your protein is off-screen.

Clipping and Fading
You can fade the lighting (“fog”) on the protein by holding Ctrl + Alt + Left click and dragging. Or you can clip the background entirely by holding Ctrl + Shift + Left click and dragging. You can clip the foreground by holding Ctrl + Alt + Left click and dragging. In combination with Shift+Q to focus on a residue, these controls let you see just the part of the pose that you care about.

Visualization Options


If you open your View options, there's an overwhelming amount of control that you get over how you look at the puzzle. And there's a reason for this: how you see the protein is really important! It's one of the major differences separating experts from novices. In fact, I wrote a paper on this. So let's break down your options to know what they all do and why you'd want to use them. I think of them in 8 categories: issues, bonds, hydrogens, sidechains, color, shape, personal preference, and puzzle-specific options.

Issues
These options shine in the mid to late game when you have a draft and you're looking to clean it up by fixing the problems with it. The three main ones of course are “Show clashes”, “Show exposeds” (unburied orange hydrophobics), and “Show voids.” When possible, you want to leave these off since they add a lot of visual clutter. But while you're focusing on them, this is critical information. A finished protein isn't going to have any clashes at all, and a natural protein rarely has voids, at least not big ones. Depending on the protein's function, you may or may not have a couple exposed hydrophobics -- there's not always a good way to bury every single one in prediction puzzles.

If you’re focusing on backbone refinement, it can be a good idea to trim the sidechains (see Sidechains) but turn on “Show sidechains with clashes or exposeds.” This makes the issues stick out really obviously.

Turning on “Show backbone issues” will point out places in your backbone that want to be Idealized. You can use Idealize or click the bubble to idealize the backbone: just watch out for the usual problems of Idealizing messing up other parts of your fold. This option will also show when a peptide bond is flipped 180 degrees <<the cis omega (ω) angle, or “O” in ABEGO>>. Flipped bonds like this are pretty rare, so you can click the bubble to flip it back. By mid-game, you’ll want this on for the rest of the puzzle.

The last super important issue finder is “Relative score coloring.” While using a score-based coloring (see Color), this option will color the pose relative to itself. Without this option, your scores might look pretty average across the pose, but with relative coloring turned on, the better parts will become more green and the worse parts will become more red. This is really handy for figuring out what sections of your pose need the most attention.
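One way to picture “relative to itself” is a simple rescaling of the per-residue scores. The Python sketch below uses plain min-max rescaling, which is my assumption for illustration; I’m not claiming this is the exact formula Foldit uses.

```python
def relative_scores(scores):
    """Rescale per-residue scores so the worst is 0.0 and the best is 1.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Every residue scores the same, so nothing stands out.
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

# Three residues with nearly identical scores still spread across the full range.
print(relative_scores([10, 12, 11]))  # [0.0, 1.0, 0.5]
```

Think of 0.0 as the reddest residue and 1.0 as the greenest: even when the raw scores are close together, the weakest section still stands out.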

Advanced issue options for the pros
Although it’s unclear if this option is still working, “Show expected residue burials” is supposed to show a blue halo around hydrophobics that are expected to be buried inside the protein.
The isosurface, on the other hand, does work consistently and can be very handy for fine-tuning the shape of your pose. Essentially, the isosurface shows the accessible surface area, or the areas where water can reach. You can use this to make sure the isosurface doesn’t have any holes that would let water get into your hydrophobic core. The blue and red coloration on the isosurface cloud shows the electrostatic potential.


Tip for the pros: Sometimes the isosurface is thought of (mistakenly) as showing potential connections for hydrogen bonds. This is because the same color scheme is used for both, and there’s often a correlation between electrostatic potentials and unsatisfied hydrogens. (In fact, electrostatic potentials in hydrogen bonds are a key part in why the BUNS objective was introduced.) For example, oxygen is usually negative when its hydrogen is unsatisfied, and nitrogen is usually positive. But this is only a correlation: hydrogen bonds and electrostatic potentials are two fundamentally different forces. You can see electrostatic potentials most clearly in the positively charged amino acids (arginine, lysine, and histidine) and the negatively charged amino acids (glutamate and aspartate).
<< Glutamate and aspartate are often equated with glutamic acid and aspartic acid. But the acid forms are actually when the residue is “protonated,” or has a proton added to it, which neutralizes its electrostatic potential. Only in the -ate form (the “conjugate base”) are these residues electrostatically charged. In a protein’s normal environment (that is to say, cellular pH), the deprotonated, negative form is more stable, which is why the -ate forms are more commonly talked about.>>
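<< If you want numbers for that, the standard Henderson-Hasselbalch relationship gives the fraction of a group that is deprotonated at a given pH. The Python sketch below is only an illustration: the pKa values are approximate textbook figures for the free amino acid sidechains (the real values shift depending on the protein environment), and the function name is my own. >>

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch: share of an acidic group in its "-ate" form."""
    return 1.0 / (1.0 + 10 ** (pka - ph))

# Approximate sidechain pKa values (assumed textbook figures).
APPROX_PKA = {"aspartate": 3.9, "glutamate": 4.2}

for name, pka in APPROX_PKA.items():
    share = fraction_deprotonated(7.4, pka)
    print(f"{name}: {share:.2%} deprotonated at pH 7.4")
```

At a pH equal to the pKa the split is exactly half and half. A few pH units above the pKa, almost everything is in the charged -ate form.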

Bonds
These options are useful when you’re trying to form or monitor bonds, though you want to keep these options off when you’re not using them to lower lag. Most of them, prefaced with “Show bonds” are pretty simple: they show any bonds that connect to the structure described. You can show bonds for helices, sheets, loops, sidechains, or non-protein (such as ligands or DNA). Note that these options are not mutually exclusive: a hydrogen bond could connect from a sidechain to a helix and count as both.

But the most valuable option here is “Show bondable atoms.” This turns on the red and blue dots (<< acceptors and donors>>) that show you what atoms are able to form bonds. Looking at these opens up a whole new jigsaw game of connecting the dots, especially in light of the new BUNS objective and the hydrogen bond network objective.

Hydrogens
While showing bondable atoms is nice, it’s the hydrogens that really show you where there’s room for hydrogen bonds, especially on sidechains. In Foldit, hydrogens are tiny white dots. You want to Hide All when you’re not worrying about hydrogens, as this saves some visual clutter. Showing All is useful for getting a better visualization of the space that the hydrogens take up when trying to pack your protein. But the most valuable option here is “Show Bondable H,” because this will show you what hydrogens are still available for making new bonds. It’s a good middle-ground when looking at your bond networks.

Sidechains
Similar to the hydrogen settings, you can hide all of your sidechains to reduce lag, show them all to see how things are packed together, or “Show Stubs.” Stubs gives you a preview of where the sidechains are, and you can hover over a stub to reveal the full sidechain.

The other sidechain option is “Show sidechains with clashes or exposeds.” As I talked about with Issues, setting your sidechains to “Don’t Show” or “Show Stubs” and turning this on can really help problems stick out visually.

Color
There are a LOT of options for coloring your pose, so let’s break them down one at a time.


AAColor
This gives each of the 20 amino acids a unique color. It’s particularly helpful when you’re looking for one amino acid type, like cysteines (to form disulfide bridges) or prolines (to find expected kinks/bends in your pose).


AbegoColor
Remember ABEGO colors from the Rama map? With this color scheme, you can directly see the ABEGO colors of your entire pose.


CPK
CPK is based on the existing CPK coloring convention in chemistry. This scheme keeps your backbone dark gray, but colors oxygens red, nitrogens blue, and sulfurs yellow in a “sleeve” around the atom. This view is handy for seeing cysteines to make disulfide bridges or looking for oxygens and nitrogens that might form hydrogen bonds. Be careful not to confuse CPK with donors/acceptors, as shown in the picture below.


EnzDes
EnzDes is short for Enzyme Design. This mode is very similar to CPK, except it colors the backbone green instead of gray. It also puts a black nub on prolines to mark their unusual relationship with the backbone. EnzDes is good for looking at ligands, like in the Aflatoxin puzzles or anything using the Reaction Design tool.


Hydro…
There are three versions of Hydro coloring: basic Hydro, Hydro with Score, and Hydro with Score and CPK. Your basic Hydro view colors the entire protein as orange (hydrophobic) or blue (hydrophilic). With Score, it leaves the backbone alone but colors the sidechains to show how they’re currently scoring. Add CPK and now it will also show CPK coloring as described above: red oxygens, blue nitrogens, and yellow sulfurs. Your choice of color here depends on what you want to see: choose the level of nuance to fit your needs. Personally, I like Hydro/Score as a generally versatile view while working early/mid game with the hydrophobic interactions.


Ligand
A ligand, for the purposes of Foldit, is a small molecule that’s expected to interact with your protein in some way. The ligand-specific view is similar to EnzDes or CPK, except it gives a unique color to the ligand itself. When working with a ligand puzzle, use this or EnzDes to focus on your ligand target. For more tips on ligands, see this wiki page.


Rainbow
In the words of the great S0ckrates:

“Rainbow isn’t only just pretty to look at, it’s a natural progression of ROYGBIV from one end of the protein to the other, so you can find the middle and ends really quickly.”

In Rainbow view, the first residue is violet and the last one is red. This view is useful when you’re trying to work from one end of the protein to another, such as in Electron Density (ED) puzzles, or techniques that “walk the backbone.” (Recipes that do this are sometimes called “walkers” or include “rainbow” in their name.)
Rainbow also does make a pretty picture when showing off your fold!
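For recipe writers curious how a gradient like this could be computed, here’s a Python sketch that walks the hue from violet to red along the chain. The HSV mapping and the function name are assumptions for illustration; Foldit’s actual palette may be different.

```python
import colorsys

def rainbow_color(index, total):
    """RGB color (floats from 0 to 1) for residue `index` of `total`, 1-based."""
    t = 0.0 if total == 1 else (index - 1) / (total - 1)
    hue = 0.75 * (1.0 - t)  # 0.75 is violet and 0.0 is red on the HSV wheel
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)

print(rainbow_color(1, 100))    # (0.5, 0.0, 1.0), violet
print(rainbow_color(100, 100))  # (1.0, 0.0, 0.0), red
```

Because the color depends only on position in the chain, the green-ish middle of the gradient always marks the middle of your protein.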


Score…
Like Hydro, Score comes in 3 versions: Score, Score/Hydro, and Score/Hydro + CPK. With pure Score coloring, the pose will show how well each residue is scoring: green for good, red for bad, and a very bland yellow in the middle. Adding Hydro will color the sidechains by their hydrophobicity. Adding CPK will color the tips with CPK coloring. If you’re not sure what other view to use, Score/Hydro + CPK is an excellent go-to default choice for making all of the information you need available.

If you’ve got Score coloring on, consider turning on relative score coloring, as described above. This will make your greens greener and your reds redder so you can more clearly see where your problem regions are.

Shape
The game calls these options “View Protein,” but I’m calling it Shape. Like Color, you have a lot of options here. These options drastically change how the pose itself looks.


Cartoon
Cartoon is the default view. It’s the easiest to work with and the best general purpose option. If views were fantasy RPG classes, this is your basic warrior with balanced stats. Cartoon has two slight variations: Cartoon Ligand and Cartoon Thin.

  • Ligand
    Cartoon Ligand makes ligands a bit thinner than in standard Cartoon view. I haven’t personally played around with ligands enough to know if this is helpful, but if your ligand is looking too chunky, this is your view.
  • Thin
    Cartoon Thin makes sidechains a bit thinner. The difference is slightly noticeable. Use this view if your sidechains feel self-conscious about their weight.


Sphere
I’m going out of order from how this list is presented in the game because Sphere is different from the remaining four views. Similar to Isosurface, Sphere gives you a preview of how much space your protein is actually taking up. Although it’s bulky to do any hand-folding with, sphere view is great for getting a better estimate of how bad clashes, voids, and exposeds are in real-life physics, and checking the overall shape and surface coverage of your pose.


Line, Stick, Trace Line, and Trace Tube
These four options are similar in that they more realistically show how the atoms of your protein are actually arranged, showing bonds as straight lines and atoms as corners. Line is the thinnest option, then Stick, then Trace Line, then Trace Tube is the thickest. These views are sometimes useful for ligands or Electron Density (ED) puzzles, though your mileage may vary. Have a good use for these options? Post it in the comments below!


Tips for the pros:
The cylinders in Stick representation are a good approximation of the electron density of the covalent bonds. Use this for estimating the real solidity of your protein’s electron densities.

Personal Preference
These options are mostly aesthetic choices.


Show Outlines
I’ve heard outlines can be helpful in puzzles that have a guide, commonly called Quest to the Native (QTTN) puzzles. Mostly this is an aesthetic choice to make your folding look more cartoonish.


Light Background
Up until very recently, Foldit’s default background was an eye-bleeding bright yellow color. We recently switched it to default to dark mode, but if you’d like light mode back, here it is. If you’re reading this from the future and saying “Wait, the light background used to be the default and setting it to dark was buried in this menu?”, you’re welcome.


Pulse when working
Sometimes it’s nice to get visual feedback that your wiggle is wiggling or your shake is shaking. If you are photosensitive or want to save GPU power, turn this off.


Fade GUI
With this option on, the GUI will fade out after a few seconds of holding a mouse button down (as in, any camera movement). The GUI (Graphical User Interface) in this case includes the leaderboards, chat, and any information in the corner about your current tool and its operations. Good when you want a clean screen to look at your beautiful work.


Hide GUI
This option manually turns off the GUI elements described above. Good for taking screenshots and the like, or if you just want to be alone with your protein.

Puzzle-Specific
These options aren’t part of your everyday toolkit, but they can be critically important on specific puzzle types.


Show constraints
Sometimes a puzzle wants your pose to be in a specific area, such as attached to a binding target. With this option, you see those constraints as thin red lines drawing your pose to where it’s supposed to be. This is usually good to keep on, since they don’t appear while you satisfy the constraints.


Show mutated segments
This is the “track changes” option of Foldit. For every residue you’ve mutated to be something else since starting the puzzle, this will color that residue white. This is especially handy for evolving (evo’ing) other players’ solutions or just to remind yourself which residues weren’t what they started as.


X-ray tunnel for ligand
As far as option names go, this one confused me the most starting out. By now we’ve talked a bit about what ligands are, but what’s an X-ray tunnel? This one’s actually pretty cool, and best explained visually.

Essentially, this view gives you x-ray vision that will peel away any parts of the protein between you and your ligand. Critically useful when your ligand is hidden in the middle of a protein.


Show Symmetric Chains
A symmetry puzzle is where you design a pose that has one or more symmetric copies of itself. <<This is called a dimer for two copies, a trimer for three, and so on. Your original pose is called the monomer. When the monomer is put together with its copies, this is called an oligomer, though Foldit usually just calls it a “complex.”>> When working with a symmetry puzzle, this option shows the symmetric copies of your pose (the “main chain”).


Symmetric Chain Colors
Without this option, your symmetric chains will all be the same dark color. With this option, they each get their own unique color. Personally I just think it’s pretty.


Show Guide
If the puzzle has a guide, this toggles its visibility.


Pulse Guide
If the puzzle has a guide, this gives it the faintest pulse while a tool is running (similar to “Pulse when working”).


Color relative to guide
Contrary to popular belief, this option does not color your pose based on how close it is in aligning to the guide. Rather, it shows how well that residue is scoring relative to how well that residue scores on the guide pose. In this way, this option is a kind of relative score coloring.

Customizing Colors
In addition to colorblind mode, you can fully customize all of the colors of Foldit by editing the theme file.

The Best Visualization
Now you know all of the ways you can visualize your pose, but the best visualization of all is someone else’s eyes. When you have a pose that you don’t know what to do with or that you want to show off, use the camera button in the chat window to share a screenshot with other players. Or better yet, join a group and share your solution directly, then load their solutions and check out what they did. Evo’ing many players’ solutions is the best way to train your eye for what’s good and what’s bad when it comes to Foldit.

This is the end of the beginner’s introduction to Foldit. Congratulations on making it this far! In the next part, we’ll return to our original question: what makes a good fold? Back when you were just starting out, I gave you three simple rules. But now that you’re becoming an intermediate player, it’s time to add some nuance and talk about your other objectives for folding a scientifically useful and high-scoring solution.

Until next time, happy folding!

Summary:

  • There are a lot of camera controls
  • Your view options help you look at/modify:
    • Issues, like clashes, voids, and exposeds
    • Bonds and bondable atoms
    • Hydrogens
    • Sidechains
    • Color
    • Shape
    • Personal preferences
    • Puzzle-specific stuff
Part 6

How to Foldit, Part 6: Mastering Objectives

Hey folders!
Dev Josh here with the sixth post in a series of tips for better folding. <<As always, more detailed scientific information will be bracketed off like this.>>

In the very first part of this series, I said that folding had three simple rules:

  1. Not Too Close, Not Too Far
  2. Orange In, Blue Out
  3. Make Bonds

Now that you’re skilled in the ways of basic folding, it’s time to add some secondary objectives.

Objectives
Objectives, previously called filters, come in two forms: Bonuses that give points or penalties, and Conditions that set requirements for what your solution needs to be valid.

Why do you need to care about Objectives? Well, these rules were added by the scientists specifically to make sure your solution is making a helpful contribution to science. Also, if you want to get to the top of the leaderboard, you need to be earning all of the possible objective bonuses.

Why can’t you let your tools worry about objectives for you? Well, the basic tools like Shake, Wiggle, and Mutate don’t pay attention to objectives when working, so they won’t automatically solve the objectives. And most recipes temporarily disable Objective checking while running to speed up performance. (That’s another thing: Objectives are slow. If you don’t need an update on how you’re doing with them, disable them for less lag.)

So what kinds of objectives come up?


Cutpoints
Your solution isn’t valid if you have any open cuts. This Condition will invalidate your score until you merge your cutpoints.


Evolving
When you’re evo’ing someone’s solution, you need to get a minimum of 2 points more than they did for it to count as “evolved.” This Condition invalidates your solution until you’ve earned 2 more points.


Ideal Loops
This bonus awards points for ideal loops. Ideal loops are common design patterns in naturally occurring proteins with short loops. These loops are known to be stable and reliable, folding up well, thus “ideal.” You can click “Show” to have Foldit highlight your bad loops in red. Use the Blueprint panel, Rama map, Remix, and ABEGO coloring to fix un-ideal loops.

“Why does Foldit say this is an un-ideal loop? I made it a sheet!” This is probably the most common issue people have with Ideal Loops. The reason Foldit does this is because its definition of loops is based on the SS h-bond patterns (h-bonds between parts of the backbone). You can see what Foldit thinks of your protein at any time by pressing the Auto-Structures button, which will assign secondary structures based on the current backbone h-bonds. For more tips on making ideal loops, check out this thread.


Core Existence
Having a hydrophobic core is really important -- so important, the scientists made an objective for it. It’s not enough to have hydrophobics, they need to be buried where water can’t get to them. To help you visualize this, you can “show” the Core Existence objective to color your protein orange (core), blue (exposed), and green. The green residues are on the “boundary”, meaning they are close to being buried but are still slightly exposed. Try switching to Sphere view or turning on the Isosurface to see where water can get in.

In symmetry puzzles, you might see this objective described as “Core exists: monomer.” Your monomer is just your “main chain,” the protein you’re working with. When the symmetric chains come together, they form a complex. That complex also needs a core, so sometimes there’s a “Core complex” objective. How do you get both cores at once? Try building your main chain as you would normally, with orange in the middle and blue outside, then make one outer side orange so it can stick to the symmetric chains and form the core of the complex.


Tip for the pros: When making your complex’s core, don’t forget to build a strong hydrogen bond network! The hydrophobics are what contribute to the strength of the core binding (in the Hiding and Packing subscores), but it’s the hydrogen bond network that ensures your protein won’t stick to anything else. The h-bonds give your binding specificity.


Residue Count
This one’s simple. Big proteins are sometimes hard to synthesize in the lab, so this Objective gives you a bonus for keeping your protein to a smaller number of residues. Generally, you can get the max bonus here by not adding too many more residues than you start with.


Secondary Structure
This objective places Conditions on what SS you’re allowed to make. Typically, it’s something like “your residues can’t be more than 50% helices,” but it can also be “your residues must be at least 50% sheets.” This condition ensures that the scientists get the types of designs they’re looking for. Just make sure your SS is designed appropriately (and check it with the Auto-SS button).
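If you want to sanity-check a design by hand, the percentage is easy to work out from a string of SS letters. This Python sketch assumes the common one-letter convention of “H” for helix, “E” for sheet, and “L” for loop; the function names and the 50% default are just examples taken from the rule above, not Foldit’s own code.

```python
from collections import Counter

def ss_fractions(ss_string):
    """Fraction of residues assigned to each SS letter."""
    total = len(ss_string)
    if total == 0:
        return {}
    counts = Counter(ss_string)
    return {letter: count / total for letter, count in counts.items()}

def meets_helix_limit(ss_string, max_helix=0.5):
    """True if helices make up no more than `max_helix` of the residues."""
    return ss_fractions(ss_string).get("H", 0.0) <= max_helix

design = "LHHHHHHLLEEEELLEEEEL"  # 20 residues, 6 of them helix
print(ss_fractions(design)["H"])  # 0.3
print(meets_helix_limit(design))  # True
```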


SS Design
This objective limits what AAs you can put in your secondary structures. For example, one condition might be “no glycine, proline, or alanine in helices,” while another bans cysteines entirely.
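As a rough illustration of a condition like “no glycine, proline, or alanine in helices,” the sketch below scans a sequence alongside its secondary structure. The one-letter SS codes (H, E, L), the function name, and the example strings are assumptions for this example only; the game does this check for you.

```python
def banned_in_helices(sequence, ss, banned=frozenset("GPA")):
    """Return 1-based positions where a banned amino acid sits in a helix."""
    if len(sequence) != len(ss):
        raise ValueError("sequence and ss must be the same length")
    return [
        position
        for position, (aa, code) in enumerate(zip(sequence, ss), start=1)
        if code == "H" and aa in banned
    ]


# Made-up 8-residue design: a 5-residue helix followed by a loop.
example_seq = "MALEKGPS"
example_ss = "HHHHHLLL"
print(banned_in_helices(example_seq, example_ss))
```

Here only the alanine at position 2 is flagged: the glycine and proline are in the loop, where this particular condition doesn’t apply.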


Residue IE
“IE” stands for “Interaction Energy.” This bonus gives or takes points based on how well your aromatic residues (Tyrosine, Tryptophan, and Phenylalanine; the ones with the rings) are scoring. Typically, they don’t score well if they’re on the outside, and they get more interactions in the core, especially with other aromatic residues. You can see more detailed information on how a residue is scoring by pressing Tab on it. Try pi stacking to get a better IE score! More on that at the end of this post.


Hydrogen Bond Network
An “hbnet” can be one of the best but hardest parts of designing a well-folded protein. If hydrophobics are like glue that will just stick to anything, hydrogen bonds are like tiny magnets that will clasp to each other. In this way, designing a solid hbnet is insurance that your design will fold how you expect, because the many hbonds will act like jigsaw pieces, ensuring the fold fits only in the configuration you’ve designed.

An hbnet is a “web” of hydrogen bonds that connect sidechains across multiple residues.

To form an hbnet, use “Show bondable atoms” and “Show bondable H” to visualize the potential connections. When fine-tuning your pose like this, you’ll want to work by hand. It’s a slow, precise process, but well worth the points you get out of it.

Although the scoring has changed slightly, the details of the hbnet objective are in this blog post. Basically, you want your hydrogen bonds to be good (which is based on the angle and distance of the connection), you want a lot of hydrogen bonds, and you want to minimize the number of BUNS (see BUNS below).


Tip for the pros: Be careful using CPK to look for hydrogen bonds. CPK shows you oxygens and nitrogens, and while this usually corresponds with being an acceptor or donor, in cases like histidine and the hybrids (serine, threonine, and tyrosine), this isn’t a one-to-one correspondence. Also, some oxygens and nitrogens can bond more than once! Tryptophan’s nitrogen only makes one hydrogen bond, but Lysine’s can make three! A full list of how many hydrogen bonds each sidechain can make is given in gallery form and table form on the wiki.


Disulfide Count
This bonus gives you extra points for every disulfide bridge you can form. Use recipes like Bridge Wiggle by Brow42 to bind cysteines together. Read the objective carefully (you can hover over it for more details), since the puzzle might only reward you for a certain number of disulfide bridges.


BUNS
BUNS stands for Buried UNSatisfied polar atoms, and is the newest objective in Foldit. Essentially, this objective asks that if you have any buried blues, they’re making every hydrogen bond they can make. As bkoep points out in the BUNS blog post: “It is possible for blue sidechains to fold in the core of the protein, but only if every polar atom makes hydrogen bonds.”
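To make the idea concrete, here is a toy sketch that follows the description above: a polar atom counts as a BUNS if it is buried and isn’t making every hydrogen bond it could. The `PolarAtom` class, its fields, and the example atoms are hypothetical, and the real objective is computed by the game with its own rules for burial and bonding.

```python
from dataclasses import dataclass


@dataclass
class PolarAtom:
    name: str        # label for the atom, e.g. "Lys N"
    buried: bool     # True if water can't reach the atom
    capacity: int    # how many hydrogen bonds the atom can make
    bonds_made: int  # how many it is making right now


def count_buns(atoms):
    """Count buried polar atoms that are not making all of their possible bonds."""
    return sum(1 for atom in atoms if atom.buried and atom.bonds_made < atom.capacity)


example_atoms = [
    PolarAtom("Lys N", buried=True, capacity=3, bonds_made=1),   # buried, unsatisfied
    PolarAtom("Trp N", buried=True, capacity=1, bonds_made=1),   # buried, satisfied
    PolarAtom("Ser O", buried=False, capacity=2, bonds_made=0),  # exposed, so not counted
]
print(count_buns(example_atoms))
```

In this toy model there are two ways to clear the flagged atom: un-bury it, or give it the missing bonds.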

The BUNS objective was developed during the coronavirus binder puzzles, when most players thought that a good binder would have many hbonds to the target. In reality, hbonds at the interface (where the binder attaches to the target) are only good for eliminating BUNS. The real metrics for a good bind are DDG, SASA, and SC, which I’ll explain below.


Binder Metrics
Although they aren’t objectives yet, these metrics are important for binding, and will become objectives soon, so it’s critical to understand them for designing a good binder protein.


DDG
Short for “delta delta G” <<or ΔΔG>>, this metric captures the difference in energy between the bound and unbound states. So, using Foldit numbers, if you’ve designed a protein that scores 15,000 on its own, but then scores 16,000 when bound to the target, that’s a DDG of 1,000. A good bind has a very high DDG to ensure the protein really wants to bind to the target. To see how your DDG is doing, try moving the protein away from your target and back -- watch the difference in score: you want the protein to score well on its own (so it folds up properly in the first place), but even better on the target.
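The arithmetic in the example above is just a subtraction. This small sketch uses Foldit score numbers (where higher is better), as the post does; the function name is made up for illustration.

```python
def foldit_ddg(bound_score, unbound_score):
    """Score gained by binding: the bound score minus the unbound score."""
    return bound_score - unbound_score


# The example from the text: 15,000 on its own, 16,000 on the target.
print(foldit_ddg(bound_score=16_000, unbound_score=15_000))  # prints 1000
```

A negative result would mean the protein scores worse on the target than on its own, which is the opposite of what you want from a binder.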

In the last section, I said that interface hbonds don’t improve the binding. Using DDG, we can understand why: if the protein isn’t bound to the target, the sidechains that would be at the interface will instead bind to water -- they still form an hbond, just not with your target! So binding to the target with hbonds alone isn’t any stronger than binding to water. That’s why you need hydrophobics for a good connection.


SASA
A good binder connects with the target on a large surface area. Or, in other words, a good binder has a large interface. But the scientists don’t care about the nooks and crannies, for reasons you can read in bkoep’s blog post. Instead, they only care about the surface area where water can already reach, which is why SASA stands for “solvent-accessible surface area.”


SC
A good binder fits like a glove. No, really, that’s all that shape complementarity (SC) means. If the target is glove-shaped, we want the binder to be hand-shaped. A good binding protein is designed to have an even contact with the target, complementing its shapes and bends.

Pi Stacking
In this post and the previous one, I made off-hand references to the fact that some AAs are good for pi stacking, but I didn’t really go into what that was or why you’d want it. Although pi stacking isn’t officially an objective, it can be a good motif to look for in making your protein as stable as possible.

Basically, by stacking the aromatic rings of tyrosine, tryptophan, and especially phenylalanine next to each other at the correct angles, you create extra attractive forces that hold them together better. << For more science on this, see non-covalent interactions and pi interaction.>> This can take some manual nudging, but it can boost your score.


Tips for the pros:

  • Check the Hiding and Packing subscores to see the effects of your pi stacking.
  • Use Sphere view to better visualize how your stacks are packing together.

[Image: Pi_stack_sphere.png, a pi stack shown in Sphere view]

Are there other motifs to look for? Plenty! But we’ll save that for the really advanced stuff.

In the next part, we’ll look at some of the more advanced jargon so you can start speaking like an expert folder (foldologist?).

Until next time, happy folding!

Summary:

  • Objectives are extra challenges beyond “get a high score” that make your solution more scientifically valid.
  • Objectives are either Bonuses (reward points) or Conditions (required for a valid solution).
  • Solutions with cutpoints are invalid.
  • When evolving, you need to score an extra 2 points for the evolved solution to be valid.
  • You can get the Ideal Loops bonus by using Blueprint, Remix, ABEGO coloring, and the Rama map to fix your un-ideal loops.
  • You can get the Core Existence bonus by having lots of orange on the inside and blue on the outside.
  • The Residue Count bonus gives you points as long as you don’t add too many residues to your design.
  • The Secondary Structure condition limits what SS you can make.
  • The SS Design condition limits what AAs you can put in your SS.
  • The Residue IE bonus gives you points for packing aromatics well, which you can do by pi stacking.
  • The Hydrogen Bond Network bonus gives you points for forming a web of hydrogen bonds across sidechains.
  • The Disulfide Count bonus gives you points for making disulfide bridges between cysteines.
  • The BUNS bonus gives you points for decreasing the number of buried hydrophilics that aren’t making hydrogen bonds (either by un-burying them or bonding them).
  • Other binder metrics that will be important soon are DDG, SASA, and SC:
    • DDG: the improvement in score between a binder’s bound and unbound states
    • SASA: the binding interface’s surface area
    • SC: how well the binder and target’s shapes complement each other
  • Pi stacking is a motif that improves how well your aromatics pack together.
Part 7

How to Foldit, Part 7: Learning the Lingo

Hey folders!
Dev Josh here with the seventh post in a series of tips for better folding. << As always, more detailed scientific information will be bracketed off like this. >>

Now that you’re starting to become an intermediate player, it’s time to finally learn some of the jargon that the experts keep talking about. I’ve organized the list alphabetically with stars (***) marking how common each term is (more stars is more common).

Dictionary of Foldit

This list is for intermediate players who have read the previous posts in this series. Get the beginner’s 101 version from the wiki or Foldit in 20 words.
For a fuller list, see the glossary and the Foldit common language.

AA (***) - Amino acid, the building block of proteins; there are 20 types; can be used synonymously with residue
AT - Acid Tweeker, a recipe genre by Steven Pletsch and later updated by Bruno Kestemont. This very late game recipe optimizes sidechain and rotamer positions, making your pose very stiff but squeezing out a final few points before the puzzle expires. Get a recent version here.
AWP - Auto Wiggle Power
Bander - A recipe that creates bands.
BiS (*) - Short for “Band(s) in Space,” see Spaceband.
BUNS - Buried UNSatisfied polar atoms, a problem in some Foldit designs. The BUNS objective was introduced to reduce the number of BUNS in player designs.
BWP - Banded Worm Pairs, a very late game recipe for squeezing out a couple of extra points. Originally written by KarenCH and later updated by Bruno Kestemont. Recipes with “worm” in the title likely derive from this recipe genre.
Condition - See Objective
Core (*) - The hydrophobic inside of a good protein.
Chain - A single protein formed by a string of amino acids. Positions along a chain are given by residue number, as in “residues 8-16.”
CI, Clashing Importance (***) - A scaling factor modifying how much the computer should consider clashing when shaking, wiggling, and mutating, changed by the CI slider. This changes the energy landscape, allowing tools like wiggle to more easily find a better fold. Your Score is always computed at 1.0 CI.
Decoy - A high-scoring fold that is very different from the native fold or desired designed shape. Decoys are “wells” (sometimes called “holes” by Foldit players) in the energy landscape, and are visible on a folding funnel.
Dihedral angles - Angles between two intersecting planes. The most important dihedral angles in Foldit are the phi and psi torsion angles, which determine the backbone shape.
DRW (**) - Short for “Deep Rebuild Worst,” this recipe genre was developed by Rav3n_pl and Timo van der Laan. This late-game recipe strategy attempts to Rebuild the worst sections of your pose.
Evo (***) - Short for Evolver or Evolving. Evo’ing is the process of improving another player’s solution. Currently you can only evo solutions within your group. To be creditable as a solution you evolved, you need to earn at least 2 points more than the original solution. After 2 points, a solution is said to be evolved, or evo’ed. Evo’ing is one of the best ways to practice and learn from more advanced players. Because evo’ing focuses on late game refinement, it’s considered a separate skill from solo play.
EDRW (*) - Enhanced Deep Rebuild Worst (DRW), an improved version of DRW by Timo van der Laan.
ED, Electron Density (**) - See ED puzzles, the ED tool, or the ED tutorial level. Electron density refers to measurements of where the electrons (parts of an atom) sit within the protein, taken with techniques such as X-ray diffraction or electron microscopy. These measurements give scientists an approximation of how the protein is folded: ED puzzles are about finding a good fold within this noisy approximation.
Filter - See Objective
Fuse/Fuze (**) - A common strategy, both manually and with recipes, of changing the CI while stabilizing. The Blue Fuse genre of recipes is compared with the scientific algorithm Fast Relax.
GAB (*) - Short for “Genetic Algorithm for Bands/Banding,” this is a recipe genre originally developed by Rav3n_pl. The technique “breeds” bands (“critters”), iteratively evolving better-scoring bands. Some variations of GAB also used Bands in Space (BiS). This recipe genre emerged from “compressor” recipes, and works best as a mid/late-game refinement tool. A recent example of a GAB recipe can be found here.
Guide - A shadow protein used for reference. Puzzles with guides are often called QTTN puzzles based on the Quest To The Native intro puzzle; in these puzzles, the guide is usually used to show the native fold of the protein. You can also load a groupmate’s solution as a guide.
HWP - High Wiggle Power
JET - The JET 3.5 (“Join Evolver Team”) recipe by Bruno Kestemont, which has an updated version. This very late game recipe combines Local Wiggle Sequence (see LWS) and Acid Tweeker (see AT). The name comes from the recipe’s guarantee to gain 2 points so you can “join the evolvers.”
LWP - Low Wiggle Power
LWS - Local Wiggle Strategy (Strategy is sometimes replaced with Sequence or Shake), an old recipe genre for late game refinement.
Mojo - Your pose’s kinetic stability, or its tendency to return to its current shape after a change. See also Stability.
MWP - Medium Wiggle Power
Nub - The bends in a sidechain. See the Nub wiki page for more details.
Objective (**) - A secondary goal to the puzzle, shown below your Score. Objectives can be Bonuses, that offer points or penalties, or Conditions, which set requirements for what a valid solution is. Previously referred to as “filters,” because they filter the score through additional considerations.
Pose (*) - The shape a protein is folded in. For example, “This pose has six sheets in a circular pattern with a cysteine residue at the end.”
Qstab - A “quick stabilize,” see Stabilize.
QTTN - See Guide.
Residue, “res” (***) - An amino acid, used to mean a single unit as part of a protein chain, often in reference to its position in the chain, such as “residue 17.”
Segment, “seg” (**) - See Residue.
SS, Secondary Structure (***) - The structure of residues as sheets, helices, or loops.
Spaceband (*) - A rubber band with one end attached to an empty point in the 3D space of Foldit, as opposed to having both ends attached to proteins or other puzzle elements. Spacebands can be used to pull or push the protein, or keep it where it is. ZLBs are a common type of spaceband.
Stabilize, “stab” - To shake and wiggle.
Torsion angles - A type of dihedral angle. In Foldit, the torsion angles of phi and psi determine the shape of the backbone.
TvdL - Short for Timo van der Laan, the godfather of recipes.
Tweek - See AT (Acid Tweeker).
Vet - A veteran Foldit player, someone who is experienced with Foldit.
Walking - To perform an action region-by-region from one end of the protein to the other end. Examples of usage include “walking the backbone” and “walking the sidechains.” Recipes that perform actions in this way are called “walking scripts” or “walkers,” and often have “Rainbow” in the title, in reference to the Rainbow coloring which helps visualize the protein from one end to the other.
Wiggle Factor - A common option in many recipe dialog windows, this sets a multiplier for how many iterations of wiggle to run during the recipe. For example, a wiggle factor of 3 would run 3 times as many iterations of wiggle as the recipe would normally.
ZLB (**) - Zero-Length Bands, or spacebands with their length set to 0. This kind of banding is useful for keeping your pose relatively stable in the shape you’ve set while you apply a change that would otherwise cause it to fly apart, such as wiggling with many clashes. The popular recipe for applying ZLBs manually is Zero Length Bands by Brow42.
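To picture what “Walking” in the list above means, here is a minimal sketch that steps a fixed-size window of residues from one end of the protein to the other. The function name, the window size, and the step are assumptions chosen for illustration; real walking recipes pick their own regions and run game actions on each one.

```python
def walk_segments(n_residues, window=3, step=1):
    """Yield (first, last) residue ranges, 1-based and inclusive, from one end to the other."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be at least 1")
    for first in range(1, n_residues - window + 2, step):
        yield first, first + window - 1


# Walk a 5-residue chain three residues at a time.
for first, last in walk_segments(5, window=3):
    print(f"work on residues {first}-{last}")
```

A walker would do its work (a local wiggle or a rebuild, say) inside the loop, one region at a time.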

For more recipe terms, see this wiki page.

Here’s one last tidbit, just for fun: “Players, F.” This phrase comes from earlier scientific publications from the Foldit team, where the Foldit Players were treated as a single author, resulting in their “name” being abbreviated as “Players, F.” More recently, Foldit players are credited in papers as a “consortium,” or collection of authors.

Amino Acids

<<For more information on amino acids, see here.>>

Some amino acids prefer certain structures. The five amino acids that most prefer helices are summarized in the acronym “MALEK” (Methionine, Alanine, Leucine, Glutamate, and Lysine). Proline tends to break or kink a helix, though it’s sometimes the first residue in a helix. Glycine is rare in a helix because it’s too flexible. More detail on helix preferences is available here. For sheets, typical amino acids include the large aromatics (Tyrosine, Phenylalanine, and Tryptophan) and the beta-branched AAs (Threonine, Valine, and Isoleucine). These AAs are most often in the middle of sheets, whereas Proline is more common at the edges. In loops between sheets, the most common AAs are Asparagine, Glycine, Aspartate, Serine, and Threonine. For more on amino acid structure preferences, see here.
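As a quick way to apply the “MALEK” rule of thumb, this sketch measures what fraction of a stretch of sequence is made of those five helix-loving amino acids. The function name and the example sequences are made up, and a high fraction is only a hint, not a guarantee that the stretch will form a helix.

```python
HELIX_FAVORING = frozenset("MALEK")  # Met, Ala, Leu, Glu, Lys


def helix_favoring_fraction(sequence):
    """Fraction of residues in the sequence that are M, A, L, E, or K."""
    if not sequence:
        return 0.0
    hits = sum(1 for aa in sequence.upper() if aa in HELIX_FAVORING)
    return hits / len(sequence)


print(helix_favoring_fraction("EELLKKAEELLKK"))  # every residue is in MALEK
print(helix_favoring_fraction("GSPNGD"))         # loop-like stretch, none are
```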

Alanine, ala, A
Alanine is identifiable as the AA with a single little nub. This slightly hydrophobic AA likes helices but will fit nearly anywhere. You can swap another AA for Alanine with little change in SS, attributed to the Alanine World Hypothesis.

Arginine, arg, R
Arginine is a long forked hydrophilic, found evenly in helices, sheets, and loops, so long as it’s on the outside to interact with the polar environment. Arginine has many rotamers.

Asparagine, asn, N
Asparagine is a short, forked hydrophilic. The sidechain makes efficient hydrogen bonds with the peptide backbone, making N good at the edges of a helix and in turn motifs of sheets, capping the hydrogen bond interactions which would otherwise need to be satisfied by the backbone. Asparagine has a nitrogen donor and an oxygen acceptor.

Aspartic acid (Aspartate), asp, D
Aspartate is a short, forked hydrophilic that does not do well in helices. It has two oxygens which can be hydrogen bond acceptors.

Cysteine, cys, C
Cysteine is a hydrophobic with a sulfur at the end that can uniquely bind to other cysteine sulfurs to create a disulfide bond or disulfide bridge. These are stronger than hydrogen bonds, so try to pair up cysteines at right angles when possible to form these.

Glutamine, gln, Q
Glutamine is like Asparagine: a forked hydrophilic with a nitrogen donor and an oxygen acceptor. Glutamine is a little longer, so it’s better at stabilizing helices and sheets, and less often found in loops that need to be flexible.

Glutamic acid (Glutamate), glu, E
Glutamate is a forked hydrophilic with two oxygen acceptors, most often found in helices.

Glycine, gly, G
Glycine is a unique hydrophobic in that it has no sidechain at all. This makes glycine the most flexible amino acid. It’s rarely in helices because of this abundance of flexibility, but that makes it good for loops, especially U-turns. When a residue next to a glycine is having conformation problems, try transferring some of the bending to glycine.

Histidine, his, H
Histidine is a hydrophilic with a nitrogen donor and a nitrogen acceptor. It also has an aromatic ring, uniquely shaped as a pentagon in Foldit. Histidine has two tautomers.

Isoleucine, ile, I
Isoleucine is a branched hydrophobic with a chiral sidechain, most often found in sheets.

Leucine, leu, L
Leucine is a branched hydrophobic most often found in helices.

Lysine, lys, K
Lysine is a long hydrophilic with a nitrogen donor most often found in helices.

Methionine, met, M
Methionine is a long hydrophobic most often found in helices. Although it has a sulfur, it can’t form disulfide bridges like cysteine can.

Phenylalanine, phe, F
Phenylalanine is a hydrophobic with an aromatic ring, making it useful for pi stacking.

Proline, pro, P
Proline is the only amino acid that binds twice to the backbone. Because of this, proline excels at making bends and kinks, such as linking two sheets together. It’s rarely in the middle of helices or sheets, but it can often be found in the first turn of a helix. It’s weakly hydrophobic, but often on the surface of the protein.

Serine, ser, S
Serine is a short hydrophilic with an oxygen hybrid acceptor/donor. Because of its small size, it’s often useful in loops and less useful in helices. Serine is often found next to Threonine forming ST-motifs.

Threonine, thr, T
Threonine is a short forked hydrophilic with an oxygen hybrid acceptor/donor. Threonine is often found next to Serine forming ST-motifs.

Tryptophan, trp, W
Tryptophan is a large hydrophobic -- in fact, the largest amino acid -- and the only amino acid with two rings. Tryptophan is useful for pi stacking and filling out the hydrophobic core.

Tyrosine, tyr, Y
Tyrosine is a hydrophobic with an oxygen hybrid acceptor/donor and an aromatic ring that can be used for pi stacking.

Valine, val, V
Valine is a branched hydrophobic most often found in sheets.

Wow, what an info dump! But at least now you can talk like the pros, saying things like “I was able to evo your latest share on MWP. I put on some ZLBs, gave it a good nudge, qstabbed and fused and it was evo’d. That share has good mojo. A bit of GAB and DRW overnight and we’ve got a top solution right there.” (This might be a bit much, I don’t think the pros would actually talk like this.)

Anyway, I’ll let you sit with that for a while. In the next part, we’ll start looking at the late-game. I’ve talked about recipes for a while now, and you might have even downloaded a couple out of curiosity. In the next part, it’s finally time to dig into them: what are recipes good for? How do you use one, or make one? Get your spatulas ready, because next week it’s time to start cooking!

Until next time, happy folding!

Summary:

  • Foldit has a lot of jargon. I covered some basics, like AA, core, CI, DRW, evo, ED, fuse, GAB, pose, res, seg, SS, spaceband, and ZLB.
  • Foldit has 20 amino acids. I covered some tips on identifying them and what secondary structures they prefer to be in.
Part 8

How to Foldit, Part 8: Cooking Your Fold

Hey folders!
Dev Josh here with the eighth post in a series of tips for better folding. << As always, more detailed scientific information will be bracketed off like this. >>

Recipes. Some players build their entire strategy around them. Others only dabble. But whatever your strategy, there will be some recipes that you’ll find incredibly useful. So what are they?

A recipe is a script, a piece of code that tells the game to execute particular instructions in sequence. In a way, recipes are like mods, since they are player-created tools. And, much like mods in World of Warcraft and Angry Birds, Foldit’s recipes are written in the Lua programming language.

Recipes are stored in your Cookbook, where you can organize them into dividers/folders to categorize your recipes (use the plus button to make a new divider). Players who write recipes are called chefs, and like the movie Ratatouille suggests, anyone can be a chef. The wiki has tons of tutorials on getting started with Lua, and compared to other programming languages it’s relatively simple.

When running a recipe, don’t forget to click “Show Output,” which will pull up an output window so the recipe can tell you more information about what it’s doing and how well that’s going.

Downloading Recipes
Recipes can be downloaded from the Foldit website (or click the folder button in-game to jump to the recipes page). Just be logged into the game, then find a recipe and click the “Add to Cookbook” button. When a chef makes a recipe, they can either keep it for themselves, share it with the public, or share it with their group, making recipes one more reason to join a group.


Tip for the pros: Your recipes are stored locally in the all.macro file.

A Brief History of Recipes in Foldit
I mention this only because these terms are bound to come up as you play. Originally, Foldit only had “GUI” recipes, which were more like drag-and-drop instructions than programming. Then came Lua v1, which introduced proper scripting, and finally Lua v2, which replaced v1. You can still make GUI recipes for simple tasks (although some GUI features are currently broken), but the devs are planning on phasing this out in favor of giving Lua v2 more power and usability.

Types of Recipes
What can recipes do? A paper that the Foldit scientists published in 2011 identified four major types of recipes:

  • Aggressive rebuild
  • Change and stabilize (which they call “perturb and minimize”)
  • Local optimize
  • Banding (“set constraints”)

Of course, some recipes are hybrids of these approaches, and many more types of recipes have come out since 2011! So instead of trying to categorize recipes into types, I’m just going to review some of the best and most classic Foldit recipes. Get ready, this might be a long one...

The Best, Most Common Foldit Recipes
Organized alphabetically for reference. For sorting purposes, the recipe title comes first, e.g. Timo’s “TvdL DRemixW” becomes “DRemixW, TvdL”. This list was curated mid-2020. A fuller, but outdated, list of recipes is available here.

AA Edit
Link to recipe page
Author(s): LociOiling
Usage: Early game design
Main tool: Design
Complexity: Simple
Duration: Short
AAEdit lets you quickly modify the entire protein’s amino acids by showing you the sequence as a string of one-letter codes for each amino acid. You can copy out the text, change the amino acid codes you want, then paste it back in and run to quickly make all of your specified mutations. Great for when you know what AAs you want to mutate and could use a word processor’s “Find and Replace” faster than doing them one at a time in the game.
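The “Find and Replace” idea above is easy to try on a sequence string outside the game. This is only a sketch of the text-editing step, not how AA Edit itself is written; the function names and the example sequence are made up for illustration.

```python
VALID_CODES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 one-letter amino acid codes


def swap_amino_acid(sequence, old, new):
    """Replace every occurrence of one amino acid code with another."""
    if old not in VALID_CODES or new not in VALID_CODES:
        raise ValueError("old and new must be one-letter amino acid codes")
    return sequence.replace(old, new)


def list_mutations(before, after):
    """Return (position, old code, new code) for each changed residue, 1-based."""
    if len(before) != len(after):
        raise ValueError("sequences must be the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(before, after), start=1)
        if a != b
    ]


original = "MGSGAL"                            # made-up 6-residue sequence
edited = swap_amino_acid(original, "G", "A")   # turn every glycine into alanine
print(edited)
print(list_mutations(original, edited))
```

Listing the mutations before pasting the edited string back in is a cheap way to confirm you changed only what you meant to.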

Acid Tweeker (AT)
Link to recipe page
Author(s): Steven Pletsch and Bruno Kestemont
Usage: Very late game refinement
Main tool: Rotamer selection
Complexity: Average
Duration: Long
This “last step” recipe puts all of your sidechain rotamers in their optimal positions to eke out the last 1-2 points of a design. See the comment on the recipe page for usage instructions.

AFK (BounceWiggle)
Link to recipe page
Author(s): Enzyme and neon_fuzz
Usage: Late game refinement
Main tool: Fusing
Complexity: Simple
Duration: Long
BounceWiggle randomizes the CI then gives the protein a shake and wiggle, also known as CI wiggling. This is a decent late game recipe to run for a few extra points, jostling the protein into a slightly more optimal position.

Banded Worm Pairs
Link to recipe page
Author(s): KarenCH and Bruno Kestemont
Usage: Very late game refinement
Main tool: Bands
Complexity: Simple
Duration: Long
Sometimes called Banded Worm Pairs - Infinite Filter (BWP IF), this script iteratively tries pairs of bands based on the simulated annealing algorithm. This is a slow but steady late game recipe to run for a few extra points, jostling the protein into a slightly more optimal position.

BandFuze
Link to recipe page
Author(s): MurloW
Usage: Late game refinement
Main tool: Bands/Fuzing
Complexity: Average
Duration: Long
BandFuze combines several bander approaches using random bands and fuzing for some late game points. Several options are available for customization, including Quake (compression via bands and wiggling), BiS (spacebands), Shock (fuzing with short spacebands), and Tailgrab (banding the ends of the protein together and fuzing).

Band Equalizer
Link to recipe page
Author(s): truestone
Usage: Band utility
Main tool: Bands
Complexity: Simple
Duration: Short
This handy utility can rapidly adjust band properties. Want all your bands of length 3.5 to be length 2.0? Done. Want every band to be strength 10? You got it. Never manually adjust band properties again: this recipe does it all at once!
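The kind of bulk edit described above can be sketched with a plain list of bands. The `Band` class and both helper functions are hypothetical stand-ins for illustration; the actual recipe works on the bands in your puzzle through the game’s own scripting interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Band:
    length: float
    strength: float


def relength(bands, old_length, new_length, tolerance=1e-6):
    """Give every band whose length matches old_length the new length instead."""
    return [
        replace(band, length=new_length)
        if abs(band.length - old_length) <= tolerance
        else band
        for band in bands
    ]


def set_all_strengths(bands, strength):
    """Set every band to the same strength."""
    return [replace(band, strength=strength) for band in bands]


bands = [Band(3.5, 1.0), Band(5.0, 1.0), Band(3.5, 2.0)]
bands = relength(bands, old_length=3.5, new_length=2.0)
bands = set_all_strengths(bands, 10.0)
print(bands)
```

Comparing lengths with a small tolerance avoids missing a band whose length is stored as something like 3.4999999.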

Band Sheets
Link to recipe page
Author(s): Serca
Usage: Early/mid game sheet alignment
Main tool: Bands
Complexity: Simple
Duration: Short
A beautifully simple recipe for banding your nearby sheets together. The default settings are great for automatically banding nearby sheets, but it also has options for adjusting the strength, length, and how far apart the recipe will look to find sheets to band up. If you leave the “Report” checked, it’ll tell you in the recipe output window what sheets it found and how far apart they were.

Blue Fuse 2020
Link to recipe page
Author(s): LociOiling and marsfan
Usage: Mid game refinement
Main tool: Clashing Importance
Complexity: Simple
Duration: Short
Based on the original Blue Fuse by vertex and Steven Pletsch, this recipe is a go-to for fusing your mid-game pose.

Bridge Wiggle
Link to recipe page
Author(s): Brow42
Usage: Disulfide bridges
Main tool: Bands
Complexity: Average
Duration: Short
This fantastic little recipe will band your cysteines together to form disulfide bridges, which is especially great if there’s a Disulfide Count objective in the puzzle. Once you run this recipe, it’ll ask you to input the residue numbers for the pairs of cysteines. You can find this using AAEdit or by turning your color view to AAColor and pressing Tab on the yellow cysteines to find their residue index. Type in the number pairs and the recipe will pull them together. There are plenty of options for customization, but the default settings work pretty well.
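If you already have your sequence as a string of one-letter codes, finding the cysteine residue numbers is a short scan. This sketch assumes residues are numbered from 1, as they are in the game; the function names and the example sequence are made up for illustration.

```python
from itertools import combinations


def cysteine_positions(sequence):
    """Return the 1-based residue numbers of every cysteine (C) in the sequence."""
    return [
        position
        for position, aa in enumerate(sequence.upper(), start=1)
        if aa == "C"
    ]


def candidate_pairs(sequence):
    """Return every possible pairing of cysteine residue numbers."""
    return list(combinations(cysteine_positions(sequence), 2))


example_seq = "MCKLCAAC"                # made-up 8-residue sequence
print(cysteine_positions(example_seq))  # prints [2, 5, 8]
print(candidate_pairs(example_seq))
```

The candidate list only tells you which pairs exist; which cysteines are close enough in 3D to bridge is something you still have to judge from the fold.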

Compressor, TvdL DRW
Link to recipe page
Author(s): Timo van der Laan and LociOiling
Usage: Compression
Main tool: Bands, Clashing Importance, Rebuild
Complexity: Complex
Duration: Long
This recipe combines the classic TvdL DRW (see DRW) with some compression algorithms that band your protein together and wiggle at varying clashing importance to try to pack the protein a bit better. Just beware that this recipe has more customization options than you could ever want! For more good compression recipes, check out ComputerMage’s Compressor (based on compression recipes by Rav3n_pl) and Compress for Symmetry by Rav3n_pl and Brow42.

Cut and Wiggle Everything
Link to recipe page
Author(s): KarenCH
Usage: Mid game refinement
Main tool: Cut, Wiggle
Complexity: Average
Duration: Medium
It does what it says, and it does it well. This recipe inserts cutpoints along your protein, then wiggles everything. You can expect a decent amount of points from this.

Deep Rebuild 2020
Link to recipe page
Author(s): ProteinProgrammingLanguage
Usage: Learning
Main tool: Documentation
Complexity: Complex
Duration: Long
On the surface, this recipe is a copy of TvdL’s enhanced DRW (see Enhanced DRW). But what’s brilliant about this recipe is the exceptional amount of comments and documentation within the recipe code. If you are looking to get started with editing or writing recipes, this is a fantastic resource for learning about creating Foldit scripts in Lua. Read the comments on the recipe page for more usage notes. Despite having all of the complexity of DRW, ProteinProgrammingLanguage has taken the time to try to explain what all of the options do, so this recipe can help you make your way through the DRW options. Just be aware that depending on your screen’s resolution, you may not be able to see the entire recipe dialog box, making this recipe unusable. Since it’s primarily a learning tool for coding, I still think it’s a good recipe for what it does.

Doom, Team AD
Link to recipe page
Author(s): Rav3n_pl and Ocire
Usage: Compression
Main tool: Bands
Complexity: Simple
Duration: Medium
This simple compression script will band your protein together, throw in some shakes and wiggles, and generally score you points in the mid game.

DRemixW, TvdL
Link to recipe page
Author(s): Timo van der Laan
Usage: Late game refinement
Main tool: Remix
Complexity: Complex
Duration: Long
This variant of Deep Rebuild Worst (DRW) uses Remix instead of Rebuild. Like its original counterpart, DRemixW is good for late game refining, remixing your worst areas for the most improvement. This recipe is also a good substitution for DRW when Rebuild isn’t allowed in the puzzle. Note that as of writing this, DRemixW crashes on puzzles with the BUNS objective.

DRW, TvdL
See Enhanced DRW.

Enhanced DRW, TvdL
Link to recipe page
Author(s): Timo van der Laan
Usage: Late game refinement
Main tool: Rebuild
Complexity: Complex
Duration: Long
Deep Rebuild Worst (DRW) and its upgrade Enhanced DRW (EDRW) are so popular that they have their own wiki pages. In short, EDRW refines your late game pose by rebuilding the worst parts. The wiki has documentation on its usage to walk you through its complexities. If you’re interested in getting into Lua scripting, there’s even a heavily documented debuggable version of EDRW.

Fracture
Link to recipe page
Author(s): MurloW
Usage: Refinement multi-tool
Main tool: Multiple
Complexity: Average
Duration: Long
Fracture is an excellent multi-use recipe for everything from early to late game. It’s on the high-end of average complexity because there are many options if you’d like them, but the primary dialog is easy enough to use. Make sure to open up the recipe output for a quick guide to its usage: essentially, you check whether you’re doing early, mid, or late game, and which preset script you want. The options include DRW, Acid Tweeker, and RR, which stands for Rainbow Rebuilder. Like other rainbow strategies, RR walks the backbone and rebuilds in sequence.

GAB BiS
Link to recipe page
Author(s): Rav3n_pl and Bruno Kestemont
Usage: Mid game refinement
Main tool: Bands
Complexity: Complex
Duration: Long
This genre of recipes comes from the strategy of compressing the protein: make bands from everything to everything and the protein is bound to pack itself together better. But this recipe does two things differently. First, it uses BiS (Bands in Space, or spacebands). Second, it “evolves” the bands using Genetic Algorithm Banding (GAB). Basically, it spawns a lot of bands, then “breeds” the best-scoring ones and deletes the worst-scoring ones. This recipe also has options for mutates, fuzing, and qstabbing (quick stabilization: shake and wiggle).
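
If it helps to see the “breeding” idea as code, here’s a toy sketch in Python (not Foldit’s Lua, and not the actual GAB BiS code). Everything in it is an assumption for illustration: a band is just a made-up tuple of (start residue, end residue, strength), and `toy_score` is a hypothetical stand-in for the score Foldit would report after wiggling with that band in place.

```python
import random

def toy_score(band):
    """Hypothetical stand-in for Foldit's score: favors bands that
    span about 20 residues with a strength near 1.0."""
    start, end, strength = band
    return -abs((end - start) - 20) - abs(strength - 1.0)

def random_band(n_residues, rng):
    """Spawn a random band between two different residues."""
    start = rng.randint(1, n_residues - 1)
    end = rng.randint(start + 1, n_residues)
    return (start, end, rng.uniform(0.1, 2.0))

def breed(mom, dad, n_residues, rng):
    """The child takes each 'gene' from one parent, with an occasional mutation."""
    start = rng.choice([mom[0], dad[0]])
    end = rng.choice([mom[1], dad[1]])
    if end <= start:
        # Repair an invalid child so the band still joins two residues.
        end = min(n_residues, start + 1)
    strength = rng.choice([mom[2], dad[2]])
    if rng.random() < 0.2:
        strength = min(2.0, max(0.1, strength + rng.uniform(-0.2, 0.2)))
    return (start, end, strength)

def evolve(n_residues=100, population=20, generations=30, seed=0):
    rng = random.Random(seed)
    bands = [random_band(n_residues, rng) for _ in range(population)]
    for _ in range(generations):
        bands.sort(key=toy_score, reverse=True)
        survivors = bands[: population // 2]  # delete the worst-scoring half
        children = [breed(*rng.sample(survivors, 2), n_residues, rng)
                    for _ in range(population - len(survivors))]
        bands = survivors + children
    return max(bands, key=toy_score)

best_band = evolve()
```

Because the best-scoring half always survives, the best band never gets worse from one generation to the next, which is the property that makes this kind of search worth leaving running.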

Helix Curler
Link to recipe page
Author(s): Rav3n_pl
Usage: Helix optimization
Main tool: Bands
Complexity: Simple
Duration: Medium
This script is simple goodness for giving your helices a better curl. It bands your helix together and fuzes.

Local Mutate
Link to recipe page
Author(s): MicElephant
Usage: Early game mutation
Main tool: Mutate
Complexity: Average
Duration: Medium
There’s local shake and local wiggle, so why not local mutate? This handy script lets you specify some parameters for what that means to you: where to mutate and in what way. Great for when you only want to mutate a pocket of your protein to find a better fit.

Loop Rebuild
Link to recipe page
Author(s): spvincent
Usage: Mid game refinement
Main tool: Rebuild
Complexity: Simple
Duration: Long
In the same genre as DRW, this recipe rebuilds your loops for a better fit. By default, it will run forever, so you can leave it on overnight to wake up to better loops.

Maaa
Link to recipe page
Author(s): Bruno Kestemont
Usage: Early/mid game design
Main tool: Mutate
Complexity: Complex
Duration: Medium/Long
Short for Mutate Authorized Amino Acids, this recipe gives you more control over auto-mutation, letting you specify which residues you want to mutate and what they’re allowed to mutate into, with options like preserving hydrophobicity, trying different secondary structures, and more! See also the Mutate No Wiggle recipe.

MicroIdealize
Link to recipe page
Author(s): spvincent and LociOiling
Usage: Mid game refinement
Main tool: Idealize
Complexity: Simple
Duration: Medium
This tool lets you idealize your protein in sections at a time. It also has some customization options, like only selecting part of your protein, disabling filters (objectives), and adding more wiggles in between idealizing.

Mutate No Wiggle
Link to recipe page
Author(s): LociOiling
Usage: Early/mid game design
Main tool: Mutate
Complexity: Complex
Duration: Medium/Long
Like Maaa, this recipe gives you more control over auto-mutation, letting you specify which residues you want to mutate and what they’re allowed to mutate into, with options like preserving hydrophobicity, enforcing design rules, and preserving cysteines, glycines, and/or prolines.

Protein Print
Link to recipe page
Author(s): marie_s and LociOiling
Usage: Information
Main tool: Output
Complexity: Simple
Duration: Short
This incredible utility will tell you everything you want to know about your protein, from a summary of the subscores to the sequence information and contact maps. At the end of the quick script, it gives you a way to copy all of the information out to paste into a spreadsheet if you’d like to save the information for further use. Relatedly, experts can check out the Atom Tables recipe by Brow42 which identifies donors, acceptors, and polar hydrogens by atom number and prints this to the recipe output window.

QuakeR
Link to recipe page
Author(s): Rav3n_pl and GaryForbis
Usage: Compression
Main tool: Bands
Complexity: Simple
Duration: Medium
Quaking is a common strategy of compressing or perturbing the protein through a combination of banding and fuzing. This recipe is great early to mid game for squeezing the protein while it’s still flexible in order to reduce voids. The “R” in QuakeR is short for random. This genre of recipe originates from Quake by Grom, and now has many variants including QuakeR with Mutate by ZeroLeak7, Local Quake by spvincent, and Quaking Remix by Bruno Kestemont.

Quickfix
Link to recipe page
Author(s): spvincent and Bruno Kestemont
Usage: Mid game refinement
Main tool: Multiple
Complexity: Average
Duration: Medium
The point of this recipe is to “fix” issues while preserving your fold (keeping the structure of the protein intact). It uses a variety of methods from mutate to remix to idealize, and is a generally good go-to for mid game cleanup.

Random Idealize
Link to recipe page
Author(s): mirp
Usage: Late game refinement
Main tool: Multiple
Complexity: Average
Duration: Long
This popular late game recipe randomly idealizes parts of your protein and intersperses this with some fuzing and mutations to squeeze out some late game points.

SS Edit
Link to recipe page
Author(s): LociOiling
Usage: Early game design
Main tool: SS Assignment
Complexity: Simple
Duration: Short
This recipe lets you see and modify your secondary structure in plaintext, which can be helpful for SS drafting. Simply run the recipe, copy the text to your word processor of choice, edit it, and copy it back in to change the SS. You can also use this handy ruler; just make sure to use a fixed-width font like Courier New:


         1         2         3         4         5         6         7
----+----0----+----0----+----0----+----0----+----0----+----0----+----0
LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
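
If your protein isn’t exactly 70 residues long, a ruler like the one above is easy to generate. Here’s a small Python sketch (the function name `ss_ruler` is made up for this example) that builds the two ruler lines for any length:

```python
def ss_ruler(length):
    """Return (tens_line, ticks_line) for a protein of `length` residues."""
    tens = ""
    ticks = ""
    for i in range(1, length + 1):
        if i % 10 == 0:
            tens += str(i // 10 % 10)  # tens digit sits above every 10th residue
            ticks += "0"
        elif i % 5 == 0:
            tens += " "
            ticks += "+"
        else:
            tens += " "
            ticks += "-"
    return tens.rstrip(), ticks

tens_line, ticks_line = ss_ruler(70)
print(tens_line)
print(ticks_line)
```

Past residue 99 the tens digit wraps around (100 shows as 0), so count the wraps yourself on longer proteins.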

There are many more great recipes, but this should be enough to get your cookbook started! Part of the fun of Foldit is discovering new recipes to use, so ask around and share your favorites on the Discord!

Until next time, happy folding!

Summary:

  • Recipes are player-made scripts that you can share, download, and run.
  • There are a LOT of good recipes to help you at every stage of your folding
    • Early game
      • AA Edit
      • Band Sheets
      • Local Mutate
      • Maaa
      • Mutate No Wiggle
      • SS Edit
    • Mid game
      • Blue Fuse
      • Cut and Wiggle Everything
      • GAB BiS
      • Helix Curler
      • Loop Rebuild
      • MicroIdealize
      • Quickfix
      • Compression
        • Compressor
        • Doom
        • QuakeR
    • Late game
      • Acid Tweeker
      • AFK
      • Banded Worm Pairs
      • BandFuze
      • DRemixW
      • (Enhanced) DRW
      • Random Idealize
    • Other
      • Band Equalizer (utility)
      • Bridge Wiggle (disulfide bridges)
      • Fracture (multi-tool)
      • Protein Print (information)

How to Foldit, Part 9: Black Belt Folding

Hey folders!
Dev Josh here with the ninth post in a series of tips for better folding. <<As always, more detailed scientific information will be bracketed off like this.>>

By now, you’re starting to reach the advanced stuff, the stuff of veteran folders. You can draft an early fold, clean it up, and have a cookbook full of recipes for squeezing out the most points possible. But if you’re like me, your mid-game is falling short, and sometimes you’ll struggle for what feels like too long with an issue that you just can’t seem to buff out. In this post, I’ll cover what you need to know for dealing with those irritating problems that get in the way of a beautiful fold.

Subscores
In Part 5, I covered all of the ways you can look at your pose, except for one really important one: the score breakdown. As mentioned very briefly in Part 6, by pressing tab while hovering over a residue, you can see all of the subscores contributing to how that individual residue is scoring. Let’s break down the breakdown:

  • Clashing - are atoms too close, creating clashes? To fix: try a shake and wiggle.
  • Packing - are atoms compacted together, or are there voids? To fix: reposition your secondary structures or try the Void Crusher recipe.
  • Hiding - are orange sidechains buried and blue exposed? To fix: orange in, blue out!
  • Backbone - are the backbone angles realistic? To fix: use Idealize, Ideal SS, or the Rama Map.
  • Sidechain - are the sidechain shapes realistic? To fix: shake and wiggle.
  • Disulfides - are the cysteines forming high-quality disulfide bridges? To fix: use the Bridge Wiggle recipe.
  • Ideality - are the peptide bonds ideal? To fix: use Idealize.
  • Bonding - are there hydrogen bonds? To fix: form hydrogen bonds.
  • Density - for ED puzzles, is the residue within the ED cloud? To fix: move the residue into the ED cloud.
  • Reference - this is the reference energy for each amino acid; it’s used as a normalization factor (balancing out the other subscores), and you can’t affect it.
  • Pairwise - how good are the electrostatic interactions? (Some amino acids are electrically charged: opposites attract and like-charges repel.) To fix: consider each amino acid’s electrostatic charge: are their hydrogens satisfied? Are the positives (arginine, lysine, and histidine) near negatives (glutamate and aspartate) instead of near other positives, and vice versa?
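
One way to use the list above is as a lookup: find the residue’s lowest subscore that you can actually affect, then try the matching fix. Here’s a rough Python sketch of that habit. The names `SUBSCORE_FIXES` and `worst_subscore` are made up, and the example numbers are hypothetical, not from a real puzzle.

```python
# Fixes condensed from the list above. "Reference" is left out on purpose
# because you can't affect it.
SUBSCORE_FIXES = {
    "Clashing": "shake and wiggle",
    "Packing": "reposition secondary structures or try the Void Crusher recipe",
    "Hiding": "orange in, blue out",
    "Backbone": "Idealize, Ideal SS, or the Rama Map",
    "Sidechain": "shake and wiggle",
    "Disulfides": "the Bridge Wiggle recipe",
    "Ideality": "Idealize",
    "Bonding": "form hydrogen bonds",
    "Density": "move the residue into the ED cloud",
    "Pairwise": "put opposite charges near each other",
}

def worst_subscore(subscores):
    """Return (name, value, fix) for the lowest subscore you can affect."""
    fixable = {k: v for k, v in subscores.items() if k in SUBSCORE_FIXES}
    name = min(fixable, key=fixable.get)
    return name, fixable[name], SUBSCORE_FIXES[name]

# Hypothetical subscores for one residue.
residue = {"Clashing": -3.2, "Packing": 12.0, "Hiding": -8.5, "Reference": -20.0}
print(worst_subscore(residue))
```

Here Reference is the lowest number, but since you can’t change it, the sketch points you at Hiding instead.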

    You can get a summary of your score breakdown with this recipe by Susume.

    <<For more information on score parts, check out Alford et al. 2017 reviewing the Rosetta energy function.>>

    <<While we’re on the topic of score, let’s talk for a second about how your score correlates to something scientifically useful. In some cases, a high scoring fold is actually unlikely to fold up in real life! In the next part, I’ll talk about why that is (spoiler: it’s the energy landscape problem -- some seemingly good designs are actually decoys!). But why haven’t we been able to make a scoring system that accounts for what’s scientifically accurate? This blog post goes into some depth on why it’s really difficult to make a scoring system for Foldit. Ultimately, the scoring system is an approximation of the protein’s energy, because we don’t actually know how proteins fold. If we did, we might be able to automate a few of the things that we need players for (but not all of them!). Our approximation is pretty good at this point, so there’s a strong correlation between a high score and a scientifically good fold, especially with the recent objectives; but it’s important to remember that this is a correlation, not a perfect one-to-one match.>>

    Problematic Secondary Structures
    Got your sheets in a knot? Here are some tips on aligning sheets and flattening sheets. For fixing helices, I like Susume’s recommended strategy: since a helix turns once every 3.6 residues, make bands from residue N to residue N+3 and N+4. This will help wiggle idealize the turns of the helix in a gentler way than Ideal SS, and it can keep it that way while you’re working on the rest of your pose.
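
To see exactly which bands that strategy asks for, here’s a small Python sketch that lists the residue pairs for a helix. It only prints the pairs (the function name `helix_band_pairs` is made up); you’d still add the bands in the game yourself.

```python
def helix_band_pairs(first, last):
    """List the (N, N+3) and (N, N+4) pairs that stay inside helix first..last."""
    pairs = []
    for n in range(first, last + 1):
        for offset in (3, 4):
            if n + offset <= last:
                pairs.append((n, n + offset))
    return pairs

# A short helix from residue 10 to residue 15.
print(helix_band_pairs(10, 15))
```

For that six-residue helix, you get five bands: 10-13, 10-14, 11-14, 11-15, and 12-15.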

    This can get tedious, so luckily there’s a recipe called makehelix by jeff101 that does this for you. The dialog is a bit confusing, but all you need to do is set the first two sliders (“numa” and “numb”) to be the first and last residues of your helix, and the “way” slider controls what kind of helix to make, based on the descriptions Jeff gives. Ignoring way 2 (which is no longer useful now that we have the Ideal SS tool), you’re left with 7 ways to make a helix. The even-numbered ways are “right-handed” and the odd-numbered ways are “left-handed” (for more about that, check out “Chirality” from the end of Part 4). Ways 3 and 4 are traditional alpha helices, which will give you 3.6 residues to a turn. Ways 5 and 6 give you a “3₁₀” (or 3-10) helix, which is a bit thinner at 3 residues per turn. Ways 7 and 8 are pi (π) helices, or “4.4-16” helices, and are a bit wider at around 4.1 to 4.4 residues per turn. Finally, ways 9 and 10 are gamma (γ) helices, the widest variety, at 5 residues per turn. To quit, turn on the “qflag” before hitting OK.
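
If you want a feel for why fewer residues per turn means a thinner, tighter helix, you can turn the residues-per-turn figures above into how far the backbone rotates at each residue. This is just arithmetic (360 degrees divided by residues per turn), sketched in Python. For the pi helix I’ve assumed the upper end of the range, 4.4.

```python
# Residues per turn, taken from the descriptions above.
# Assumption: the pi helix uses 4.4, the top of its 4.1 to 4.4 range.
RESIDUES_PER_TURN = {
    "alpha": 3.6,
    "3-10": 3.0,
    "pi": 4.4,
    "gamma": 5.0,
}

def degrees_per_residue(helix_type):
    """Rotation around the helix axis contributed by each residue."""
    return 360.0 / RESIDUES_PER_TURN[helix_type]

for name in RESIDUES_PER_TURN:
    print(name, round(degrees_per_residue(name), 1))
```

An alpha helix turns about 100 degrees per residue, while a 3-10 helix turns 120 degrees, so it winds up tighter.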

    <<Pi helices are lumpy! Most pi helices are only 7 residues in length and don’t have regular dihedral angles like α-helices and β-sheets.>>

    Getting Unstuck

    What do you do when you’re stuck and don’t know what to try next? This question is unfortunately really important and really hard to answer. Here are some options:


    Ask Chat
    Post a screenshot in the chat (or in your group chat, if you’re in a group) and get some feedback on your design.


    Sleep On It
    Sometimes you might feel stuck, but after resting overnight and letting it sit for a day, you’ll have new ideas when you come back to the puzzle.


    Try Something Else
    If you’ve come to the end of your design, maybe it’s time to try a new design! Pull everything apart or reset the puzzle and try something new. Or keep the one piece you really like and re-imagine the rest of the design.

    General Strategies

    For fixing a problematic region, try a local rebuild or a local wiggle. Especially for fixing problems that arose when merging cutpoints, it can be handy to freeze either end of the problem region (with a residue or two of leeway) and wiggle, rebuild, or remix that section. Some fuzing can also clean up minor issues. However, if you’re still struggling with a problem, sometimes it’s best to really break things apart, whether this means restarting entirely and trying something new or ripping apart the problem section and re-considering that region. Chances are, if you’re having trouble getting the region to score well, nature would too, and would opt for another fold instead.

    For more strategies, check out The Foldit Labs.

    Playing in Parallel

    You’re in the middle of a solution when two different ideas come to you: what do you do? You’re at a fork in the road. Why not take them both? In the Undo menu, there’s a feature called “Tracks.” A track is like a save file within your solution: you can make a new track at any time, splitting the solution off to try an idea while your original track pursues something else.

    The only trouble with tracks is that you can only work on one at a time. If you have a good computer, you can run multiple games in parallel. “Multiboxing,” or running multiple clients simultaneously, is pretty common among the hardcore Foldit players, and if you enjoy the thrill of your computer crunching numbers for you, you’ll be happy to know there’s a precedent for pushing this to the limit. Check out the wiki page for more information on how to do this properly. Just note that multiboxing is not officially supported by the devs, so any issues you experience specific to multiboxing are unlikely to be fixed.

    See, Share, Evolve

    Another great way to figure out your problems is to cooperate! By evolving other players’ solutions, you can see a wider variety of solutions to the puzzle. And when you get stuck, you can share your solution and other players can help you fix it. Some groups have a shared shorthand notation for describing solutions, such as what recipes have been run on the solution already and who has worked on it. This forum thread is a great primer on some common shorthand descriptions.

    With these tips, your mid-game will be sharper than ever. And this concludes the intermediate section of my guide. Congratulations! You now have the knowledge to be a veteran folder! In the next part, we’ll end this series with some professional wisdom, linking to actual scientific research so you can start thinking and folding like a true scientist.

    Until next time, happy folding!

    Summary:

    • Your score is divided into score parts for each residue
    • Local rebuilds and local wiggles can clean up problematic areas
    • You can test out multiple ideas by creating separate tracks or playing on multiple clients
    • Still stuck? Try asking chat, sharing your solution, evolving other solutions, sleeping on it, or trying something totally different!

How to Foldit, Part 10: Thinking Like a Scientist

Hey folders!
Dev Josh here with the tenth and final post in a series of tips for better folding. <<As always, more detailed scientific information will be bracketed off like this.>>

Wow, it’s been a journey, huh? If you’ve made it all the way here, thanks for sticking through it with me! From beautiful proteins, to the tools of Foldit, to recipes and jargon, we’ve come a long way since the basics. But we’re not done yet! In this final post, my goal is to equip you with the tools you need to think and design like a scientist, and maybe even start you on your own quest to dig up scientific information from the corners of the internet. Most of what I’ll talk about here is specific to designing proteins, but hopefully this will help you start thinking about how you can apply scientific thinking and design patterns to other puzzle types, like small molecule and prediction puzzles.

What Makes a Scientifically Good Fold

What is protein folding, anyway? What makes one design better than another? If you care about your scientific contributions, then it’s not just the score that matters: you also need to think about your protein’s energy landscape (more on that here). In short, a given protein sequence will have a landscape of possibilities for how it can fold up in 3D space. A protein will tend to fold in the way that minimizes its energy, much like how wiggle will jostle the protein into a higher score. In this way, the protein will slide down the landscape to the global minimum.

(Conceptual illustration of a protein energy landscape from Dill, K.A. and MacCallum, J.L. (2012))

Why is the energy landscape important? Well, for one thing, this explains why high-scoring solutions are better: a higher score means a lower energy, and a lower energy means it’s more likely to be the bottom of the energy landscape, which is where the protein wants to be. But this also means we need to watch out for decoys in our energy landscape.

A decoy is a fold that is energetically low but not what we want the protein to fold into. Like a game of Plinko, if several outcomes are equally likely, we can’t control which one will happen. Likewise in protein folding, if there are multiple low-energy states, the protein could become any one of these.

Take a look at these two folds from bkoep’s blog. On the left, there’s really only one good low-energy state. On the right, there are a few different ways the protein can reach an energy minimum. If we’re designing a protein, we want the landscape on the left because it gives a better guarantee of reaching the design we want, rather than misfolding into a decoy. This type of energy landscape is called “funnel-shaped,” because as the protein folds, it follows the folding funnel toward the singular, native fold.

How can you test your design for decoys? Well, one way is to try making one! Once you’ve designed your protein, leave your amino acid sequence in place and try changing the secondary and tertiary structures of the protein. If you can find another fold that scores about as well, you’ve got decoys! You’ll need to change something about your sequence in a way that ruins the decoy’s score but keeps your original pose’s high score.
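
If you keep notes on the scores of your alternative folds, the decoy check above boils down to a simple comparison. Here’s a Python sketch. The 2% tolerance is my own assumption for what “about as well” means, and the fold names and scores are hypothetical.

```python
def find_decoys(target_score, alternative_scores, tolerance=0.02):
    """Return the alternative folds that score within `tolerance`
    (as a fraction of the target score) of your design, or better."""
    threshold = target_score - abs(target_score) * tolerance
    return [name for name, score in alternative_scores.items()
            if score >= threshold]

# Same sequence, refolded two different ways (hypothetical scores).
alternatives = {"all-helix refold": 9900, "swapped strands": 9300}
print(find_decoys(10000, alternatives))
```

Anything this flags is a fold your sequence is nearly as happy in, so that’s where to change the sequence until only your intended pose scores well.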

How does a protein fold into its desired state? The question of protein folding pathways comes up a lot: what steps does the protein actually take to go from an extended chain to a fully folded structure? In theory, the short, strong, local interactions like secondary structure h-bonds are the first to form, and the tertiary structure comes next. In practice, smaller proteins do this in a single step, from unfolded to folded, but in larger proteins, there are separate stages of foldedness.

Designing Like a Scientist

How can you improve the scientific validity of your designs? One way is to read the scientists’ design critiques, like this one. Another way is to think about the designs in the same way the scientists do. For example, scientists look at sheets as aligning either parallel (going in the same direction) or anti-parallel (going in opposite directions, as in the image below). Anti-parallel sheets are generally stronger. A sheet can also be mixed, having some parallel and some anti-parallel strands. Check out this page for more!

Scientists also think about the tertiary structure, such as sheet sandwiches and beta barrels. When will a protein form a sandwich or a barrel? Check out this post for more details on that.

Lastly, scientists use advanced algorithms to try to predict what the secondary structure will be given a primary sequence of amino acids. One of the more common algorithms is PSIPRED (PSI-blast based secondary structure PREDiction), which you can use for free online. <<PSI-BLAST itself is an acronym for Position-Specific Iterative Basic Local Alignment Search Tool.>> Typically, for prediction puzzles, the Foldit scientists will give you PSIPRED’s prediction in the puzzle description or as the starting secondary structure. But PSIPRED can also be helpful for design puzzles as an indicator of whether your designed sequence might fold up how you think it will!

Following the Journey of your Designs

Ever wonder where your design goes when a puzzle is done? This blog post about the puzzle process covers every step that the designs go through: from hand-selection to bacteria growing to purifying, chromatographic measurements, and crystallization. For binding designs, the scientists use yeast display or fluorescence-activated cell sorting (FACS), which bkoep summarizes here. The Foldit team also put out a video about how coronavirus designs are tested.

The testing process is also described in this wiki page, which even summarizes the results of some successful player designs that have gone on to become real proteins!

Principles of Ideal Structures

Foldit players aren’t the only ones interested in designing good proteins. Some researchers have also been looking into what makes a good fold, and they came up with some principles for designing ideal protein structures. Within the Foldit community, these became known as the “Koga & Koga” rules, named after the first two authors. The paper introduces some useful terminology:


Unique ordered structures
When Koga and Koga use this phrase, they are referencing Anfinsen’s dogma, which claims that for certain sequences of amino acids, there will be a unique structure they fold into.


Kinked α-helices
Unlike most helices Foldit players design, natural helices tend to be bent or kinked.


β-strands
Up until now, I’ve been calling every piece of sheet a sheet. And, within the game, Foldit calls these sheets as well. But the technical terminology is that each sheet is composed of multiple strands.


Negative design
A good protein design not only encourages the right fold, it discourages decoys.


β-hairpins
A hairpin is a short loop that makes a hairpin-shaped turn between two anti-parallel strands.


Parallel and Anti-Parallel
In the last section, I talked about how sheets can be parallel or anti-parallel to each other. A connection between a sheet and a helix can also be parallel or anti-parallel. When you look at a sheet-loop-helix progression, if the end of the sheet (the last “pleat”) is pointing toward the helix, it’s parallel; otherwise anti-parallel. For helix-loop-sheet arrangements, it’s the reverse: a sheet pointing away from the helix is parallel.


Chirality
If you don’t remember chirality from Part 4, check it out again, because chirality is an important factor in Koga & Koga’s design rules.

Now, without further ado, here are the three Koga & Koga Rules:


Sheet-Sheet (the ββ rule)
A 2-residue loop most commonly has left chirality. A 3-residue loop also prefers left chirality, but less strongly. A 4-residue loop is split 50/50. A 5-residue loop is more likely to have right chirality. You can remember in this way: the shorter the loop, the leftier it wants to be. Notice also that this rule will almost always apply to anti-parallel strands because a short loop means the second strand is going in the opposite direction to the first one.


Sheet-Helix (the βα rule)
A 2-residue loop connecting a sheet to a helix is most commonly parallel, but a 3-residue loop is most commonly anti-parallel. Unlike sheet-sheet interactions, sheet-helix connections are more about “forward or backward” than “left or right.” Parallel helices “fall toward” the last sidechain, and anti-parallel helices “fall away from” the last sidechain. Or, with the last sidechains facing away from you again, the helix will be behind for parallel and in front for anti-parallel. Check out Susume’s reference images below! (Remember that in Foldit’s rainbow view, the protein direction goes from blue to red!)


Helix-Sheet (the αβ rule)
Loops of length 2, 3, or 4 are usually parallel, though there are some anti-parallel natural proteins. The effect is strongest at length 2 and decays as the loop size increases. Loops of length 3 are most common in natural proteins. Just to confuse you, the parallel/anti-parallel rule is reversed here! For helix-sheet connections with the first sidechain of the sheet (the one closest to the loop, which last time was the “last” sidechain but now we’ve switched directions) facing away from you, a helix in front of the sheet is parallel, and a helix behind the sheet is anti-parallel. This means that normally, your helices will be in front of the sheet (parallel), but with longer loops, you can sometimes get away with putting it behind the sheet.

In summary, here’s a reference table:

Connection    Loop Length   Chirality (L/R) or Parallelity (P/A),   When properly oriented*,
                            with Strength (+/-)                     SS curves
Sheet-Sheet   2             L+                                      Left
              3             L                                       Left
              4             -                                       -
              5             R                                       Right
Sheet-Helix   2             P                                       Behind
              3             A                                       In Front
Helix-Sheet   2             P+                                      In Front
              3             P                                       In Front
              4             P-                                      In Front

*Proper orientation in this case means taking the sidechain in the sheet closest to the loop and facing it away from you.
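
If you’d rather look these up than memorize them, the reference table translates directly into a small lookup. Here’s a Python sketch of it. The names `KOGA_RULES` and `koga_preference` are made up for this example, and the values are copied straight from the table.

```python
# (connection, loop length) -> (preference with strength, curve direction)
KOGA_RULES = {
    ("sheet-sheet", 2): ("L+", "Left"),
    ("sheet-sheet", 3): ("L", "Left"),
    ("sheet-sheet", 4): ("-", "-"),
    ("sheet-sheet", 5): ("R", "Right"),
    ("sheet-helix", 2): ("P", "Behind"),
    ("sheet-helix", 3): ("A", "In Front"),
    ("helix-sheet", 2): ("P+", "In Front"),
    ("helix-sheet", 3): ("P", "In Front"),
    ("helix-sheet", 4): ("P-", "In Front"),
}

def koga_preference(connection, loop_length):
    """Return (preference, curve direction), or None if the table has no entry."""
    return KOGA_RULES.get((connection.lower(), loop_length))

print(koga_preference("Sheet-Sheet", 2))
print(koga_preference("Helix-Sheet", 6))
```

A result of None just means the rules don’t cover that loop length, not that the loop is forbidden.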

The wiki page on Design Structures offers some tips on how to create these connections within Foldit.

Although Koga & Koga mostly studied short loops, loops can be much longer in natural proteins. However, the longer a loop is, the harder it is to predict exactly how it will fold, which is why the researchers focused on short loops.

A sequel to the Koga & Koga paper was what Foldit players call “Lin & Koga,” which has its own supplement as well. Unpacking this work, however, is left as an exercise to the reader, since there is no wiki page summarizing this work at the time of writing.

Motifs
Another way to think about your design is by identifying its sequential and structural motifs, like pi stacking. These motifs can join your toolkit like conceptual legos for designing proteins that mimic those found in nature.


Heptad Repeats
A heptad repeat is a common motif in helical bundles that consists of a repeating pattern of seven amino acids: orange, blue, blue, orange, charged, blue, charged. <<Orange means hydrophobic and blue means polar.>> The charged amino acids are lysine, arginine, aspartate, glutamate, and sometimes histidine. <<In the common leucine zipper motif, leucine is the fourth residue of the heptad.>> Check out tandem repeats for more about sequence repetition.
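
To check how closely a helix sequence follows the heptad pattern, you can count the matching positions. Here’s a Python sketch with a few loud assumptions: the hydrophobic set is a common textbook grouping rather than Foldit’s exact orange list, histidine is treated as plain polar, the heptad is assumed to start at the first residue you give it, and a charged residue is allowed to fill a blue slot (charged residues are polar too).

```python
HYDROPHOBIC = set("AVLIMFWY")  # assumption: not Foldit's exact "orange" list
CHARGED = set("KRDE")          # histidine is treated as polar in this sketch
HEPTAD = "obbocbc"             # o = orange, b = blue, c = charged

def classify(residue):
    """Map a one-letter amino acid code to o, c, or b."""
    if residue in HYDROPHOBIC:
        return "o"
    if residue in CHARGED:
        return "c"
    return "b"

def heptad_matches(sequence):
    """Count positions whose class fits the heptad pattern."""
    hits = 0
    for i, residue in enumerate(sequence.upper()):
        want, got = HEPTAD[i % 7], classify(residue)
        if got == want or (want == "b" and got == "c"):
            hits += 1
    return hits

sequence = "LSSLKSE" * 2  # hypothetical two-heptad helix
print(heptad_matches(sequence), "of", len(sequence))
```

If your helix scores low, try sliding the starting position by a residue or two before deciding the sequence is off-pattern.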


Sheet Motifs
Here are some common structural motifs for sheets and the loops that connect them.


Small Barrels
The small beta barrel is a bunch of sheets in a criss-crossing circle. There are two kinds of small barrels:


Up-and-down
The up-and-down is the simplest barrel: it’s just a bunch of strands going up and then down in a circle.


Jelly roll
The jelly roll fold, or Swiss roll, is usually made of four two-strand pairs and is an elaboration of the Greek key motif described below. Its name comes from the resemblance to a Swiss roll cake.

(Image source)

If you’re curious to learn more, check out this scientific article on beta barrels, or look into beta propellers!


β-hairpins
As Koga & Koga describe, a hairpin is a turn when two anti-parallel strands are linked by a loop of 2-5 residues. Glycine and proline are very common in these loops.


Greek Key
The Greek key is named after the Greek meander art design. The simplest version of this motif has four anti-parallel strands where strands 1-3 have hairpin loops and the fourth strand bonds with the third but has a longer loop connecting it to the first strand, as in the image below. To extend this motif, imagine the third strand also being connected by a long loop to a strand that binds all the way to the left, to the first strand.


β-α-β
When the protein has a sheet-helix-sheet region, it almost always twists with right chirality. Another way to think about this is that if you’re facing the motif with the helix toward you, the helix attaches at the northwest and southeast corners. A similar β-α-β-α motif is seen in TIM barrels.


β-meander
This motif is just a bunch of anti-parallel strands all connected by hairpin loops.


Beta Bulge
A beta bulge is when there are extra residues on one strand of a sheet, causing the strand to bulge out because these extras can’t make hydrogen bonds with the other sheets. This almost always happens in anti-parallel sheets. Only 5% of bulges are in parallel sheets. Beta bulges are briefly covered in this blog post.


Psi-loop
The psi-loop (ψ-loop) puts a connecting strand in between two antiparallel strands, as shown below. This motif is much rarer than the others.


Helix Motifs
Here are some common structural motifs for helices. If you’re curious about more motifs, check out the helix-turn-helix.


Coiled-coil
A coiled-coil is when two helices wrap around each other in a “supercoil” structure. Coiled-coils often have a heptad repeat pattern to their amino acids. A common coiled-coil is the leucine zipper. A coiled-coil can also be considered a two-helix bundle.


Three-helix bundle
A helix bundle is when several helices are nearly parallel or anti-parallel to each other. The three-helix bundle is especially fast-folding.


Four-helix bundle
A four-helix bundle is a pair of coiled-coils. Most often, helices that are adjacent in sequence are anti-parallel. In four-helix bundles, there are usually sequence patterns, such as every fourth and seventh residue being hydrophobic.
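
If you like to tinker, here is a minimal Python sketch of how you might check a sequence for the heptad pattern mentioned above. It assumes the usual convention of labeling the seven positions a through g, with the hydrophobic residues at a and d. The function names and the set of letters counted as hydrophobic are my own choices for illustration; they are not part of Foldit, and other hydrophobicity scales draw the line differently.

```python
# One-letter codes treated as hydrophobic for this illustration.
# This set is an assumption; hydrophobicity scales disagree on the edges.
HYDROPHOBIC = set("AVILMFWC")


def heptad_score(sequence, offset=0):
    """Fraction of heptad positions 'a' and 'd' holding a hydrophobic residue.

    offset is the index of the residue treated as position 'a'.
    """
    hits = 0
    total = 0
    for i, aa in enumerate(sequence):
        pos = (i - offset) % 7  # 0 is position 'a', 3 is position 'd'
        if pos in (0, 3):
            total += 1
            if aa in HYDROPHOBIC:
                hits += 1
    return hits / total if total else 0.0


def best_heptad_register(sequence):
    """Try all seven offsets and return (offset, score) for the best one."""
    candidates = [(off, heptad_score(sequence, off)) for off in range(7)]
    return max(candidates, key=lambda pair: pair[1])
```

For example, `best_heptad_register("LKELEDK" * 4)` reports offset 0 with a score of 1.0, because every a and d position in that made-up sequence is a leucine. A high score only says the sequence is compatible with the pattern. It does not prove the helices will actually bundle.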


Loop Motifs
One important fact to remember for loops is that loops are almost always on the surface of the protein, rather than being folded into the core.


U-Turn / Hairpin
A U-turn is Foldit’s term for a loop that bends 180 degrees. U-turns are most common between two strands of a sheet, but they can also connect helices. They’re closely related to beta turns and beta hairpins.


Omega Loop
An omega loop is a long loop (6+ residues) that forms the shape of the Greek capital letter Omega (Ω).


Beta Bulge Loop
A beta bulge loop is when 5-6 residues in a loop form a beta bulge. This usually happens at the loop ends of β-hairpins.
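
The loop lengths quoted in this post (2-5 residues for a hairpin between two strands, 6 or more for an omega loop) can be turned into a quick sanity check. The Python sketch below is only an illustration: it assumes a secondary-structure string where E means sheet, H means helix, and L means loop, and the function names are hypothetical. Length alone cannot tell you a loop’s shape, so the long-loop label is just a candidate flag.

```python
from itertools import groupby


def segments(ss):
    """Split a secondary-structure string into (type, start, length) runs."""
    out = []
    start = 0
    for kind, run in groupby(ss):
        length = len(list(run))
        out.append((kind, start, length))
        start += length
    return out


def describe_loops(ss):
    """Label every loop that sits between two structured segments.

    Returns a list of (start_index, length, label) tuples.
    Assumed letters: 'E' = sheet, 'H' = helix, 'L' = loop.
    """
    segs = segments(ss)
    labels = []
    # Walk over each segment together with its two neighbors.
    for prev, cur, nxt in zip(segs, segs[1:], segs[2:]):
        kind, start, length = cur
        if kind != "L":
            continue
        if prev[0] == "E" and nxt[0] == "E" and 2 <= length <= 5:
            label = "hairpin-sized loop"
        elif length >= 6:
            label = "long loop (omega-loop candidate)"
        else:
            label = "other loop"
        labels.append((start, length, label))
    return labels
```

Calling `describe_loops("EEEEELLEEEEELLLLLLLHHHHHH")` labels the 2-residue loop between the two strands as hairpin-sized and the 7-residue loop before the helix as an omega-loop candidate. Loops at the very ends of the chain are skipped, since they don’t connect two structures.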

Further Reading
Where do you go from here? One great resource is bkoep’s three-part video series, “Through the Eyes of a Scientist” (Part 1, Part 2, Part 3). You can also check out this course on super secondary structures. For small, h-bonded motifs, this website is a great resource. This database lets you find real examples of protein motifs. Or look for scientific articles like this one, or blog posts like this one. There is plenty, plenty more out there; I’ve only just scratched the surface. But hopefully, this guide has you thinking like a scientist and curiously consuming everything the internet has to offer on protein design and protein structure prediction. And when you do find something, share it with us! Talk about it on Discord, make a forum post, or write a wiki page!

This is as far as I can take you, veteran folder. By now, if you have read and understood this guide, practicing on your own in between each post and trying every puzzle available to you, then you have learned all that I have to teach you. Soon the student will become the master and you will go on to teach me things about Foldit.

Thanks for reading, thanks for playing, and until next time…

Happy folding.

Summary:

  • A scientifically good fold has a funnel-shaped energy landscape
  • For short loops between structures, the Koga & Koga rules describe whether they are likely to have left/right chirality and whether they are parallel or anti-parallel
  • There are lots of common sequential and structural patterns called motifs for how sheets, helices, and loops tend to fold up

Developed by: UW Center for Game Science, UW Institute for Protein Design, Northeastern University, Vanderbilt University Meiler Lab, UC Davis
Supported by: DARPA, NSF, NIH, HHMI, Amazon, Microsoft, Adobe, RosettaCommons