Here are some tips you can follow to increase your chance of us selecting and testing your design:
1) Don't bury the TNT ligand too deeply; this may disrupt the overall protein structure.
2) Avoid using charged residues (ASP, GLU, ARG, LYS, HIS) to make hydrogen bonds when possible (although one or two may be fine).
3) Make sure that the tail of the ligand has a way to get out of the binding site. The tail is the end of the ligand that is directed away from the center of the protein initially. (The linker isn't modeled in, but that tail is connected to more atoms, so make sure it can get out!)
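As a rough illustration of tip 2, here is a minimal sketch that counts how many of the residues hydrogen-bonding to the ligand are on the charged list. The helper name and the example residue list are made up for illustration; this is not a Foldit function.

```python
# Hypothetical helper, not part of Foldit: count residues that are on the
# "charged" list from tip 2 (ASP, GLU, ARG, LYS, HIS).
CHARGED = {"ASP", "GLU", "ARG", "LYS", "HIS"}


def count_charged(residues):
    """Return how many three-letter residue names are in the charged set."""
    return sum(1 for name in residues if name.upper() in CHARGED)


# Made-up list of residues that hydrogen-bond to the ligand in some design.
hbond_residues = ["SER", "ASP", "TYR", "THR", "LYS", "ASN"]
n_charged = count_charged(hbond_residues)
print(f"{n_charged} charged residue(s) bonding to the ligand")
```

By the tip above, one or two is probably fine; more than that may be worth a second look.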
Everyone who joins the group Go Science?
So when this is finished, you plan to get a protein that one of us crazy non scientists has designed, take it to the wet lab, and then see what happens when you throw it at a piece of TNT.
Brave. Very brave :)
Thank you both for your comments in the puzzle description and in the comments down below here. They both offer some helpful information.
If I could vote more than once there'd be yet another thumbs up.
Thanks to austinday for providing us with such detailed descriptions!
This is the first design puzzle I've attempted, and I'm not quite sure of the goal (aside from ftw points)
From the description, it seems* that the tail (at least) needs a way out of the protein. What's not quite so clear about the ligands in general is how to treat deformations in the ligand structure during the "adjustment" of the design (since presumably this is truly a refinement rather than a redesign puzzle, at least in the ?polypeptide? (backbone) layout).
Specifically, is one of the goals to maintain the initial structural conformation of the ligand? As opened, the ligand is very monoplanar (aside from the "Y" structures on the head end).
Some adjustments result in the tail end curling slightly, but it would be very useful to understand if this is an indication of bad judgement with respect to "adjustments"...grrr tweaks, I mean tweaks! (not the foldit function, the common-english action of slightly modifying something) or just a necessary result of the transformations?
*having said that, by getting out of the binding site, I'm not exactly sure if that means literally getting out of the protein or just to another part of the protein. My assumption is/was that it means "a clear path to the outside", but it would be helpful to understand that statement in slightly more detail.
Perhaps it has to do with what a binding site is. Maybe a brief parenthetical (a short description enclosed by parentheses) after the use of something like "binding site", which for novices and non protein scientists might be a bit opaque otherwise...
Also on a side note, I'm not quite clear on the "mutate side chains" function. Is that essentially an automatic version of "manually changing a single side chain into another" function from design mode that acts on the whole protein? I accidentally hit it instead of something else and noticed my score jumped a few points. (16 points to be exact)
I'll try to take you through the thinking process used to make this puzzle. I think that will *hopefully* answer the questions you had. So the overall goal is to make a physical protein, in the real world, that will bind TNT. Before we start designing the protein in the computer, we needed to figure out how to even test the designs. What we're going to be using is a yeast display system which requires that whatever molecule we are attempting to bind (TNT, in this case) be biotinylated (has an attached biotin group at one end). Also, if we just coupled a biotin directly to the TNT molecule, the biotin would most likely interfere with the way TNT binds. To reduce that possibility, there is a long linker (a long chain of carbon atoms, more or less) that is made in between the TNT and the biotin molecules. In the design puzzle, you only see the TNT molecule and part of that linker (since it would be cumbersome to include the entire linker and biotin in the puzzle).
As for the goal of the puzzle, there is usually a gap between what the score says and what we actually hope to get out of the puzzle. The puzzle setup requires quite a bit of refinement in order for the score to directly relate to what we would consider a good design. Because of that, I included those "hints" in the description, such as making sure the tail has an exit, trying to use fewer charged residues, etc., since it's difficult to get the score to directly reflect that in a proper way. We do, however, log every move that is made, so even if you don't get the highest score and those tasty points, if your design is good, we'll find it.
Okay, so that's what I've been thinking. I'll try to answer some of the specific questions you had now:
The ligand should only be allowed to "bend" at certain joints (i.e. the aromatic ring in TNT has resonance going on, which favors it being planar with respect to the nitro groups). However, the two nitro groups near the linker have been found to be rotated out of plane in some instances. I believe the current version of Foldit supports "joint" minimization instead of using a predefined library, which may be problematic (we'll find out after the puzzle is done). So basically, if you can bend the ligand into a conformation, the energy *should* reflect how good that new conformation is. In the real world, everything will be moving around and that tail will be flopping around and bending every which way that it can, so if you can find a minimum energy state (highest score state), that will *most likely* be the structure that the protein/ligand will take. (But there are also other considerations like enthalpy vs entropy which we aren't able to model very well... it's a hard problem.)
An exit for the tail just means that if you imagine extending the "tail" portion of the ligand out further, the carbon that makes up the tail would be able to get clear of the protein without running into the backbone. (We can trim back the side chains if they're in the way)
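To make the "imagine extending the tail" idea concrete, here is a minimal geometric sketch. It assumes the tail continues in a straight line from its last two atoms, which is a simplification (the real linker is flexible), and the clearance, reach and step values are made-up numbers rather than anything used in the puzzle. Only backbone atoms are checked, since side chains can be trimmed back.

```python
import math


def tail_has_exit(tail_prev, tail_end, backbone_atoms,
                  clearance=3.0, reach=20.0, step=0.5):
    """Extend the tail in a straight line and check it clears the backbone.

    tail_prev, tail_end: (x, y, z) of the last two modeled tail atoms.
    backbone_atoms: list of (x, y, z) backbone atom positions.
    clearance, reach, step: distances in angstroms (assumed values).
    """
    direction = [e - p for p, e in zip(tail_prev, tail_end)]
    length = math.sqrt(sum(d * d for d in direction))
    if length == 0:
        raise ValueError("the two tail atoms must not coincide")
    unit = [d / length for d in direction]

    # Walk outward from the tail end and test each point against the backbone.
    for i in range(1, int(reach / step) + 1):
        point = [e + u * step * i for e, u in zip(tail_end, unit)]
        if any(math.dist(point, atom) < clearance for atom in backbone_atoms):
            return False
    return True


# Toy coordinates: the tail points along +x.
blocked = tail_has_exit((0, 0, 0), (1, 0, 0), [(10, 0, 0)])
clear = tail_has_exit((0, 0, 0), (1, 0, 0), [(0, 10, 0)])
print(blocked, clear)
```

A straight ray is the strictest reading of "exit"; a tail that can bend around an obstacle would fail this check and still be fine in practice.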
The mutate side chain function (which can be used on all side chains at once, or on a subset) just uses the Rosetta energy function to choose the side chain that gives the lowest energy at that position. Of course, if you make a selection of multiple side chains, it can become a huge non-linear optimization problem, so the algorithm may not find a global optimum, but it'll at least find a local one. (You can also check out the tutorials for more information on how to use that function.) So we're hoping that's where the Foldit players will come in. You guys are able to do a more efficient search of the possible sequence space and may have a better chance of finding a global optimum, or at least a better local optimum than Rosetta would alone. Also, when you allow the mutate algorithm to "run free" on your protein, it will tend to favor charged residues where we wouldn't like them. (A problem with the energy function, which is constantly being improved.)
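To illustrate the local versus global optimum point, here is a toy sketch. It is not the Rosetta algorithm or the Rosetta energy function: the search is a plain greedy sweep and `toy_energy` is a made-up function, chosen only so that single substitutions get stuck short of the best sequence.

```python
def toy_energy(seq):
    """Made-up energy (lower is better): each L is worth -1, and each
    adjacent "KE" pair is worth -3. Not a real energy function."""
    text = "".join(seq)
    return -1.0 * text.count("L") - 3.0 * text.count("KE")


def greedy_mutate(sequence, positions, choices, energy):
    """Sweep over positions, keeping any single substitution that lowers
    the energy, until a full sweep finds no improvement."""
    seq = list(sequence)
    best = energy(seq)
    improved = True
    while improved:
        improved = False
        for pos in positions:
            for res in choices:
                if res == seq[pos]:
                    continue
                trial = seq[:pos] + [res] + seq[pos + 1:]
                trial_energy = energy(trial)
                if trial_energy < best:
                    seq, best = trial, trial_energy
                    improved = True
    return "".join(seq), best


found, found_energy = greedy_mutate("AAAA", range(4), "AKEL", toy_energy)
print(found, found_energy)                      # what the greedy sweep settles on
print("KEKE", toy_energy(list("KEKE")))         # a better sequence it never reaches
```

In this toy case the sweep settles on a sequence where no single change helps, even though changing two neighbouring positions together would score better. That is the kind of move a player can spot and an automatic one-at-a-time search can miss.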
Oh! And some more hints:
1. It would be desirable if, when you move the entire ligand away from the protein and repack, the side chains which were making the ligand interactions don't move much. A less stringent test would be just to repack with the ligand still there and make sure that the hydrogen bonding residues don't move away.
2. Also, if you can "back up" residues that interact with the ligand, that would help to "glue" them into place (which is desirable). When I say "back up", I mean you want to be making good hydrogen bonds not only to the ligand, but to the residues which interact with the ligand.
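If you had side-chain coordinates from before and after a repack, the test in hint 1 could be sketched roughly like this. The data layout, the residue numbers and the 1.0 angstrom cutoff are all assumptions for illustration; Foldit does not export data in this form as far as this sketch knows.

```python
import math


def rmsd(before, after):
    """Root-mean-square deviation between two equal-length atom lists."""
    if len(before) != len(after) or not before:
        raise ValueError("atom lists must be non-empty and the same length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(before, after))
    return math.sqrt(total / len(before))


def moved_residues(before, after, cutoff=1.0):
    """Return ids of residues whose side-chain RMSD exceeds the cutoff.

    before, after: dicts mapping residue id -> list of (x, y, z) atoms.
    cutoff: assumed threshold in angstroms.
    """
    return [rid for rid in before if rmsd(before[rid], after[rid]) > cutoff]


# Made-up coordinates for two ligand-contacting residues.
with_ligand = {45: [(0, 0, 0), (1, 0, 0)], 72: [(0, 0, 0)]}
repacked = {45: [(0, 0, 0), (1, 0, 0)], 72: [(3, 0, 0)]}
print(moved_residues(with_ligand, repacked))
```

Residues that show up in the returned list are the ones that would benefit most from being "backed up" as in hint 2.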
I hope that answers your questions! I think I wrote a little too much, but if you need any more clarification, don't hesitate to ask!
Thanks for the detailed reply. That was exactly what I was looking for. When I first got into this program (well before my posted start date for this simulacrum me), I established a set of beliefs, based partially on in-chat discussion and partially on a wide range of preconceptions, about what was and wasn't the right way to think about how the protein needs to be "modified" for best purposes of the actual design/refinement/fold in the lab, some I think on target, some way off base. Posts like the ones you just made, I believe (if people read them thoroughly), will benefit the overall conceptualization of what needs to be done to make a good effort, whether high or low scoring... At least for people like me, who can become fuddled by the various paradigms others are using to succeed (whatever that means).
I also really appreciate the comment about viewing all the solutions. It has concerned me that perhaps only top scores were evaluated. That goes a long way to remedying some or perhaps most of my worry about "productive time spent".
Thanks again. Very very helpful to me, and hopefully to others as well.
Why can we not mutate any residue?
I mean, if the goal here is to design a better binder for TNT and if the basic shape you gave us is to be used as a starting point, why can we not modify all of the positions?
What if (for example) we wanted to improve the rigidity (cross bonding) between the inner sheets and the outer helices?
What if we wanted to replace the outer helices with sheets in some places?
If we could modify the outer shell, we may be able to build a better cage that would allow us to push the ligand binding sites closer to the edge. With our detectors nearer to the surface, we would need a shorter "tail" and we would be more likely to catch the target molecule.
Am I just dreaming?
The main reason that we might not want to mutate the outer shell is that the structure could change considerably. In this puzzle, the backbone is held rigid, but when the mutated protein actually folds, that rigidity no longer applies, which means that you risk changing the structure too much by modifying the outer shell. In fact, even mutating only the inner residues is a risk because sometimes a difference of a single residue is enough to prevent it from functioning properly. (Example: the Delta F508 mutation, which results in the removal of a single residue from a human protein, is a common cause of cystic fibrosis.) This is part of the reason the resulting structure has to be re-checked with actual experimentation.
==> That being said, redesigning the outer shell is something that can be considered if necessary. Ultimately, it boils down to the issue of efficiency-- namely, whether the increase of complexity (larger search space) is worth the extra effort of coming up with an outer structure from scratch.
I appreciate the complexity of what we do in Foldit. And I understand the nuanced complexity of shell modification. Also, I admit that I am just a newbie compared to many of you in the Foldit community. However, I am not sure that Foldit has the tools we need to do this kind of binding design.
A: The full ligand is not available, only the active tail of it. Maybe the designers could have included a few more segments of tail to give us a better image of what the biotinylated model would be?
B: The scoring system, by which many people are judging the "worth" of their modifications is based on optimizing the design protein and not necessarily the matching space to the ligand. The way I understand the criteria:
i: must bind to ligand
ii: must stay open enough to allow the ligand to reach the binding sites
iii: design for minimal protein energy that meets criteria i) and ii)
Maybe what I am really wishing for are the additional scores of "accessibility to the binding site" and "ligand compatibility"?
C: What types of changes are the designers looking for in the design protein once it binds with the target ligand? What I mean is that the normal shape (best energy score) of the design protein will be the one without the ligand present, so it only makes sense that any binding to the ligand will alter that configuration in some way. How much of a difference should there be once the ligand is added to the mix? Should the "score" rise, fall, or stay the same? How does Foldit adjust the total score with the protein-ligand bonding in play?
As I said above: I am still very much a noob to folding and using Foldit, so please take my questions with a grain of salt.
@infjamc: Right on. The reason we hold the backbone locked and restrict the residues you can mutate is because the more we modify a protein, the less chance we have of it actually folding into that predicted structure. Additionally, we have found that doing design while keeping the backbone locked also yields a better chance. (Backbone movement is a REALLY hard thing to predict accurately.)
@saksoft2: A) I'll see if I can put up a pdb or an image of the full ligand we are trying to bind.
B) In the puzzle, the score contributions that the ligand participates in are upweighted relative to the rest of the protein (I believe I made this puzzle such that the interactions with the ligand are worth 2X). I totally understand the desire for a better scoring function for certain aspects. There are definitely things we can add, like a shape complementarity measure or a local measure of solvent accessible surface area (to perhaps get an idea of how solvated the last atom in the tail is?) I'm not a great programmer though, so I can only pass on these suggestions to the masters and hope there aren't too many other programming priorities ahead of mine. Although I'm sure if we get enough complaints/suggestions from players, that might get some attention. :)
C) I believe that the ligand score is basically the score difference between the score when the ligand isn't there and the score when the ligand is there. Due to the upweighting, the optimal score should be with the ligand present.
There are examples of proteins that don't really change their conformation much at all when they bind their target (antibodies), and there are examples of proteins that change around quite a bit upon binding (periplasmic binding proteins, streptavidin) so we don't think that it is necessary that we design in a movement that occurs upon binding. (it might be sufficient to try and design the perfect shape that doesn't move regardless of the presence of the ligand) That's kind of the strategy we are going for right now (since it's way too inaccurate to try and design in a backbone movement right now). That's kind of the idea behind that last "hint" I gave. (The one where it would be good to see that the residues that interact with the ligand stay in those positions even when you remove the ligand and repack.)
I think I might have answered a different question above, so I'll try again with possibility #2. In terms of the total score, the ligand interactions are scored the same as interactions in the rest of the protein except with a 2X amplification, in this case. The way the score works (more or less) is that there are various components that make up the score function (i.e. terms that try to account for solvation, electrostatics, how likely a residue will take a certain rotamer position given its context, van der Waals interactions, hydrogen bonding, etc.). All of these terms are basically weighted according to their importance (figured out via many methods, one of which is by seeing how good the automatic algorithms are at reconstituting a native sequence/structure). So all these components are summed up for each residue (which makes up the residue score when you press tab over it), and all of the residue scores are added up to give the entire puzzle score (which is why you can get "higher" scores on proteins with more residues). The ligand is basically treated as another residue in this case, except that the scores attributed to it are upweighted.
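The bookkeeping described above can be sketched in a few lines. The term names follow the list in the paragraph, but the weights and term values are made up and the real score function has more to it; only the 2X ligand factor comes from the description of this puzzle.

```python
# Made-up weights for illustration; the real weights are not given here.
WEIGHTS = {
    "solvation": 1.0,
    "electrostatics": 0.5,
    "rotamer": 0.5,
    "vdw": 1.0,
    "hbond": 2.0,
}
LIGAND_UPWEIGHT = 2.0  # the 2X amplification mentioned for this puzzle


def residue_score(terms, is_ligand=False):
    """Weighted sum of a residue's score terms (higher is better here)."""
    score = sum(WEIGHTS[name] * value for name, value in terms.items())
    return LIGAND_UPWEIGHT * score if is_ligand else score


def total_score(residues):
    """Sum of residue scores; the ligand counts as one upweighted residue."""
    return sum(
        residue_score(r["terms"], r.get("is_ligand", False)) for r in residues
    )


# Toy "puzzle": one protein residue and the ligand, with made-up term values.
residues = [
    {"terms": {"vdw": 2.0, "hbond": 1.0}},
    {"terms": {"vdw": 1.0, "hbond": 1.5}, "is_ligand": True},
]
print(total_score(residues))
```

Because the total is a plain sum over residues, adding more residues gives more terms to add up, which is why bigger proteins can reach "higher" scores.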
I hope that makes sense! For sure let me know if I can clarify parts of that. (I wasn't sure if that was what you were asking, so I made it "brief"... sort of.)
I think my alternative solution is better than my top scoring solution: no ASP, GLU, ARG, LYS, HIS.
How can I choose which solution I submit?
I save it at "2".
As Austin mentioned:
"We do, however, log every move that is made, so even if you don't get the highest score and those tasty points, if your design is good, we'll find it."
If you are referring to the CASP9 Design Contest, that is the puzzle where we plan to examine your top scoring solution, but that was mostly so you wouldn't worry about the scores of other players for that contest.
Sorry for the confusion, and if you find a cool shape for the contest (that doesn't have your highest score) feel free to let us know and post it in that Forum link!