Here is the CASP link for this target (showing the amino acid sequence):
Here is the sequence logo predicted by the SAM server.
H = helix
E = sheet
C = loop (or coil)
The taller the letter at each position, the higher the probability of that specific secondary structure for that amino acid.
You can see that these secondary structure predictions imply that this protein is mostly loop!
Link to template PDBs:
Perhaps it is more important to get the cysteines in the right spot than to match the amino acid sequence (these are insensitive to substitution in general, aren't they?).
This sequence has 1-11-5-3-1-5-1 loops. The templates are x-6-5-3-1-5-x. The next time a probable cysteine knot comes up, could you include the best homologue that has the same internal loop lengths, even if it doesn't match AA or type?
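For anyone who wants to check loop lengths on other candidates, here is a minimal sketch of how the pattern can be counted from a one-letter sequence. It assumes the numbers above mean "residues before the first cysteine, residues between each pair of consecutive cysteines, residues after the last cysteine". The function names and the toy sequence are made up for illustration; the toy sequence is not the real target sequence.

```python
def cysteine_loop_lengths(sequence):
    """Count residues before, between, and after the cysteines in a sequence."""
    positions = [i for i, aa in enumerate(sequence.upper()) if aa == "C"]
    if not positions:
        return [len(sequence)]
    lengths = [positions[0]]  # residues before the first cysteine
    for prev, cur in zip(positions, positions[1:]):
        lengths.append(cur - prev - 1)  # residues strictly between two cysteines
    lengths.append(len(sequence) - positions[-1] - 1)  # residues after the last cysteine
    return lengths


def same_internal_loops(a, b):
    """Compare two patterns while ignoring the N- and C-terminal tails (the 'x' ends)."""
    return a[1:-1] == b[1:-1]


# Toy sequence built to have the 1-11-5-3-1-5-1 pattern (hypothetical, alanine filler).
toy = "C".join("A" * n for n in [1, 11, 5, 3, 1, 5, 1])
pattern = cysteine_loop_lengths(toy)
print("-".join(str(n) for n in pattern))
```

Comparing with `same_internal_loops` ignores the two end values, which matches how the template pattern is written as x-6-5-3-1-5-x.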
Should a disulfide bridge give a negative score? Can someone please explain this?
Negative disulfide score
Disulfides are tricky. They do score negative sometimes. This means when you shake sidechains, the client will usually break the bridge to improve the score, making it really hard to evolve a good bridge. This is why I made the Bridge Wiggle script.
However, that still didn't work well enough, because if you got into position for a good disulfide score, the sidechain score would be negative, like -20. Auuuugh! A shake sidechain would again break the bridge to raise the score, or you'd keep the bridge but have a low scoring segment. I've modified Bridge Wiggle to optimize the combined sidechain + disulfide score.
However, that still didn't work well enough. The protein isn't flexible enough to find a good scoring position, and if you did find a good position, the backbone is negative! Auuuugh!
This puzzle and 573b are easier because the templates all have proper bridges, so you start with high scoring bridges and they tend to stick around. But people are passing me in rank, and I bet they don't have all 3 bridges.
I don't know the details of how mini-rosetta scores this, but in general I think it measures dihedral angles and then looks up the probability of those angles based on real proteins. The Wikipedia page here http://en.wikipedia.org/wiki/Disulfide_bond#Occurrence_in_proteins says that the C1-S-S-C2 dihedral strongly prefers to be close to 90 degrees, meaning the C1-S-S plane and the S-S-C2 plane, each formed by the last segment of one cysteine and the S-S bond itself, are rotated about 90 degrees relative to each other. Make an L with each of your two hands and join the thumbs, then rotate one hand 90 degrees: that's what it has to look like. I assume anything else is highly penalized with a negative score for being unrealistic.
That doesn't say anything about the C-S-S angle, but I strongly suspect that also needs to be 90 degrees, L instead of V.
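To make the C1-S-S-C2 dihedral described above concrete, here is a minimal sketch of how a dihedral angle can be computed from four atom positions. This is plain vector geometry, not how mini-rosetta or Foldit computes its score, and the coordinates are made up. The sign of the result depends on convention, so the example only looks at the magnitude.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p1, p2, p3, p4):
    """Angle between the p1-p2-p3 plane and the p2-p3-p4 plane, in degrees."""
    b1 = _sub(p2, p1)  # C1 -> S
    b2 = _sub(p3, p2)  # S  -> S (the bridge itself)
    b3 = _sub(p4, p3)  # S  -> C2
    n1 = _cross(b1, b2)  # normal of the first plane
    n2 = _cross(b2, b3)  # normal of the second plane
    b2_len = math.sqrt(_dot(b2, b2))
    m1 = _cross(n1, tuple(c / b2_len for c in b2))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


# Made-up coordinates: the two "L" shapes are twisted a quarter turn about the S-S bond.
c1, s1, s2, c2 = (1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)
print(abs(dihedral_degrees(c1, s1, s2, c2)))
```

With the two hands analogy, a flat arrangement where both carbons sit on the same side gives 0 degrees, and the quarter-turn arrangement in the example gives a magnitude of 90.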
I made Bridge Wiggle band up cysteines in the exact lengths to force these bond angles...I thought it would give me good scores...It didn't! Auuuugh!
I've suspected that maybe mini-rosetta needs to have a separate sidechain table for bonded and not-bonded cysteines. Maybe it does, or maybe they aren't really different. However, if foldit/rosetta consistently fail to converge on knotted conformations in cysteine knots, maybe they'll look into that. We'll find out when the CASP natives are released.
For the record, my current best on this puzzle has disulfide scores of 7 to 9 (9 is the highest I've ever seen) and sidechain scores of -2 to -1, which is pretty darn good. So maybe it does work. I got this far by using the templates, and by modifying scripts to 'restore-best-with-3-bridges' instead of just restore-best.
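The 'restore-best-with-3-bridges' idea can be sketched roughly like this. It is only an illustration of the selection rule, written in Python rather than as a real Foldit script, and the `Pose` record, its fields, and the scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    label: str
    total_score: float
    bridge_count: int  # number of intact disulfide bridges (hypothetical field)


def best_with_bridges(poses, required_bridges=3):
    """Return the highest scoring pose that keeps enough bridges, or None."""
    eligible = [p for p in poses if p.bridge_count >= required_bridges]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p.total_score)


# Made-up history: the top scoring pose has lost a bridge, so it is skipped.
history = [
    Pose("start from template", 8500.0, 3),
    Pose("after shake", 8720.0, 2),
    Pose("after bridge wiggle", 8650.0, 3),
]
print(best_with_bridges(history).label)
```

A plain restore-best would pick the highest total score regardless of bridges; the filter is the only difference.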