smortier, per your request...connection issues discussed in IRC. Loaded 1440 approx. 0100 PST. Beginning around 0200 until bed (0600), various rhythmic connection issues. Diagnostics said it was neither me nor Comcast, but Comcast is infamous for late Sunday night undocumented meddlings. "Signature" of such is usually the rhythm pattern. Aside from dropped connections (very unusual here), variety of upload/download problems on shares, continuing into today. Top share on 1440 crashing regardless of client I try it on...unusual problems, not the usual garden variety we're familiar with. Sorry for not being more specific, problem's being especially vague (all the usual diagnostics /corrections/workarounds done, to not much avail). Thanks...
The loops tend to go crazy if I move one fairly radically, then wiggle. They stretch their cutpoints out really long, then zip back and go beyond, causing intersections with other SS, and my score goes to like -400000. I have to run a recipe to slowly work out the stresses before I can work on the moved loop. Adding bands just aggravates the loops and makes them go wilder. It makes progress slow.
I would like to know if the sponsoring parties also agree to not patent the designs or derivatives and place them in the public domain ?
A question on 1440: is it important to have as many bonds as possible on the ligand, or should we just go for a high score?
We expect that the best designs will make additional interactions with the ligand.
These interactions can be hydrogen bonds with polar atoms on the ligand; but they can also be nonpolar residues that pack nicely against the ligand, or sidechains that occlude voids nearby. Good designs may also include improvements that indirectly favor ligand-binding, by stabilizing protein residues near the binding pocket or elsewhere. We hope that (if the puzzle is set up well) the score will reflect all of these factors.
All findings from the Aflatoxin challenge puzzles will be in the public domain. This was promised in the challenge kick-off talks.
As Susume says, this was in the materials we received ahead of the launch event.
It's easy to miss, but it's in the puzzle description. It's also in the Foldit aflatoxin blog post, at the end, right above the logos and the footnotes:
By participating in these Aflatoxin Challenge puzzles, the players agree that all player designs will be available permanently in the public domain, and the players will not seek intellectual property protection over the designs created as part of the challenge.
This is a little different than paragraph g of the Terms and Conditions, which says the University of Washington decides ownership of results, including filing any patents. The T&Cs do say "[a]ll significant scientific discoveries [...] made in game will be made publicly available". (Publicly available does not mean free for commercial use, as in the case of the LZW data compression technique, where the algorithm was published, but you had to pay a license fee to use it in a product.)
I like the idea of the aflatoxin results being "public domain", especially given the nature of the problem. To me, "public domain" means no licensing fees, as in the case of a literary work being public domain, but I am not the authority here. (I hope players don't start lawyering up.)
[Edit: clarified "public domain" appears in both puzzle description and blog post.]
I would like to see a legally binding statement by the promoters (and affiliates) that they too will abide by the rule that all their findings and derivatives will be in the public domain.
I agree that another sentence or two wouldn't hurt, but the term "public domain" is reasonably clear, and it appears in the blog post and the puzzle description. I did miss it the first couple of times I looked, could be a late edit, but I think I got hung up on the part about players not seeking their own IP protection for results. The Foldit Terms and Conditions (dated 8 March 2013) never allowed that anyway, as far as I can tell, giving the University of Washington the right to decide who owns what, results-wise.
One handout we received before the aflatoxin launch mentioned all results being in the public domain. The blog post is basically the same text as the handout, but adds the part about players agreeing not to seek IP protection.
I added a comment to the blog post asking for a bit more clarity on the IP issue.
Loci, Susume, The statement explicitly only mentions PLAYERS designs, NOT DERIVATIVE works from the SPONSORS. The fact that only players are so specifically mentioned casts the impression that it will be ok for sponsors to make patentable derivative works based on these (now free from claims !) 'players' designs.
These derivative works can then be turned into products and subsequently be sold to the very farmers we, the players, try to help. THAT patenting is what I explicitly would like to avoid, hence I will not participate in this project unless that is unambiguously clarified.
All the organizations involved (Mars Inc., Thermo Fisher, UW, etc.) have agreed that no one will pursue IP protecting the direct results from Foldit players in these Aflatoxin puzzles.
As Aubade01 noted below, there is nothing we can do about protecting derivatives of work in the public domain. Anyone (including the sponsors, their competitors, and Foldit players themselves) will have access to the Foldit player designs from these puzzles, and will be free to use the results to inform their own original research.
We nonetheless consider this an opportunity for Foldit players to do constructive science and make significant, positive contributions to aflatoxin research. However, I understand your misgivings, and recognize that any Foldit player may abstain from Aflatoxin Challenge puzzles if he or she so chooses.
As I read the T&C, the U of W has always been free to exploit our results commercially as it sees fit. Having the results at least start life as public domain seems like a step forward.
Developing any results into something actually useful would likely require substantial further R&D. Maybe a foundation or government would fund something like that, maybe not. "Public domain" means that any non-profit or for-profit entity could develop derivative works. I'm not sure that the sponsors would be the most likely candidates for that R&D role, but I don't see a good reason to bar them from pursuing it while allowing the big agrichemical companies to jump in.
Personally, I'm willing to hazard a certain amount of time and effort without any more details. A detailed licensing statement would be nice, but it would require lawyers to interpret properly, and of course other lawyers could then dispute those interpretations. I would prefer to look on the new puzzles as a gift and play the game.
Public Domain research, funded by government grants and published in research journals, is used by private companies doing corporate research on commercial for-profit products. There is no way to stop this unless the research is kept private and a commercial secret. Even then, if a new product using this new idea is sold publicly, the product can be reverse engineered by a competing company so they can make one like it. Patents can help protect a novel product, but in the case of very big commercial hits they can be invalidated on narrow technical legal grounds. Universities with a big commercial success idea will certainly make money. Graduate students (and on-line ones like ourselves) may get some recognition, but rarely financial rewards. Consider it a gift towards the public good.
>>>>> Aubade00 / 01
In the View menu, turn on Show bonds (sidechain), Show bonds (non-protein), and Show bondable atoms. This will let you see the oxygen atoms (red or purple) on the ligand (the aflatoxin) that can form H bonds, and see if there are any bonds formed. It also lets you see the red, blue, or purple atoms of the protein that can form bonds to the ligand. Forming bonds to the ligand helps the protein grab it and break it apart, and also helps your score. If your View menu does not have the Show options listed above, click menu, options, advanced GUI to get the additional view options.
If your game is running slowly during wiggles, you may want to set View Sidechains to Don't Show, just during wiggles or scripts. The hotkeys for this are Shift-D (Don't Show sidechains) and Shift-A (Show All sidechains).
In addition to H bonds, the protein can grab the ligand by forming pi stacks with it. This is when the hexagon or pentagon of a sidechain lines up neatly with a hexagon or pentagon of the ligand. This Wikipedia article has a nice diagram of three different shapes that pi stacks can take.
In addition to Susume's tips, the Foldit aflatoxin launch blog post is a pretty good place to start. It's pretty long and somewhat technical, but there's a lot of good information. Here are some nuts-and-bolts details that may help translate the science into Foldit terms.
In puzzle 1440, the aflatoxin is the small molecule that appears as segment 303. It's not connected to the rest of the protein, making it a "ligand". You can use "X-ray tunnel for ligand" in the advanced view options to help spot the aflatoxin, but the tunnel view doesn't always center the ligand. Rotating the protein with the X-ray tunnel open makes it easier to see the ligand. Unlike the protein, the ligand shows lots of double bonds, another guide to spotting it.
The aflatoxin in 1440 is locked, so you won't be able to use any of the small molecule design tools on it. Instead, you're allowed to design selected parts of the protein. You can use all the usual protein design and refinement tools, including mutate, on the unlocked part of the puzzle.
The starting protein in puzzle 1440 is a "gluconolactonase", specifically the one in PDB id 3DR2. PDB is the protein data bank, available at rcsb.org. (More on that in another post.)
Gluconolactonase is an enzyme which works on ring structures called lactones. The version of aflatoxin in puzzle 1440 has a lactone group. According to the blog post, degrading that lactone is "demonstrated to decrease aflatoxin toxicity by more than 20-fold". Unfortunately, the puzzle 1440 protein doesn't work on aflatoxin. The goal is to fix that. No one knows what changes to the protein might get the job done.
The protein in 3DR2 doesn't include aflatoxin. 3DR2 is actually a dimer, meaning there are two copies of the protein. There's only one copy of the protein in puzzle 1440. There are some calcium ions in 3DR2 that aren't included with the puzzle 1440 protein.
This post is the first in a series. More on some other slight differences between PDB 3DR2 and what you see in Foldit in the next post.
(Edit: couple of minor edits. Need either more sleep or more coffee.)
In puzzle 1440, the starting protein is based on a published experimental solution, which you can look at online. It's not clear how much this will help, since the goal is to change the structure and function of the protein, but the additional information in the Protein Data Bank (PDB) might at least provide inspiration.
The PDB website is at rcsb.org. You can look up proteins by their PDB id. Puzzle 1440 is based on PDB id 3DR2. A given PDB entry contains an abstract of the article describing the experiment, and lots of technical detail about the protein. A given protein may have multiple PDB entries, reflecting different experiments.
You can also look at the actual PDB data file, which, among many things, includes the XYZ coordinates of each atom in the protein. (The PDB files are available under the "Display Files" and "Download Files" dropdowns on a protein's page at rcsb.org.)
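For anyone who wants to poke at those coordinates outside a viewer, here is a minimal Python sketch of reading atom records from a PDB-format file. It assumes the standard fixed-column PDB layout for ATOM/HETATM lines; the names `Atom`, `parse_atom_line`, and `read_atoms` are made up for this example and aren't part of Foldit or any PDB tool.

```python
from typing import Iterable, List, NamedTuple


class Atom(NamedTuple):
    serial: int     # atom serial number
    name: str       # atom name, e.g. "CA"
    res_name: str   # three-letter residue name
    chain: str      # chain identifier, e.g. "A"
    res_seq: int    # residue ("segment") number as used in the PDB entry
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM record using the fixed PDB column positions."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def read_atoms(lines: Iterable[str]) -> List[Atom]:
    """Keep only coordinate records and parse each one."""
    return [
        parse_atom_line(line)
        for line in lines
        if line.startswith(("ATOM  ", "HETATM"))
    ]
```

To try it, download the PDB file for an entry, open it with `open(path)`, and pass the file object to `read_atoms`. Other record types (headers, remarks, and so on) are simply skipped.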
In the "Mol" viewers, all you see initially is a cloud of atoms. You can right-click on the background, then select Style -> Scheme -> Cartoon to switch to a view similar to the Foldit cartoon view. If you right-click again, and select Style -> Bonds -> 0.15 Å, you get something similar to showing all sidechains in Foldit. These options are the same in JSMol and JMol, but I haven't verified them in PyMol.
One nice thing about the "Mol" cartoon view is that helixes and sheets have arrows (or rockets) pointing in the direction of the higher-numbered segments. This feature makes it easier to tell which end of the protein is which and which way a structure is aligned.
The "Mol" viewers also have true (simulated) 3D options under Style -> Stereographic. Lots of fun if you "forgot" to return those red-and-blue glasses after the movie.
If you're looking at 3DR2, you need to make a few adjustments to get to what you see in Foldit, especially at the segment-by-segment level.
3DR2 is a dimer, meaning it has two copies of the protein. In this case, the copies are not quite identical, due to problems with the experiment.
In the "Mol" viewers, you can limit the view to chain A, the first of the two copies. Right-click on the background and select "Console". In the console window, enter these commands:
These commands select the first copy of the protein, and limit the display to that copy. (Close the console window to see the protein again.)
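If you'd rather work with the downloaded PDB file than the viewer, the same "chain A only" idea can be sketched in Python. This is not the console commands from the viewer, just a rough equivalent that assumes the standard PDB layout, where the chain identifier sits in column 22 of each ATOM/HETATM line. The function name `keep_chain` is made up for this example.

```python
from typing import Iterable, List


def keep_chain(pdb_lines: Iterable[str], chain_id: str = "A") -> List[str]:
    """Return only the ATOM/HETATM lines that belong to the given chain."""
    kept = []
    for line in pdb_lines:
        is_coordinate_record = line.startswith(("ATOM", "HETATM"))
        # Column 22 of the record (index 21 in Python) holds the chain ID.
        if is_coordinate_record and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    return kept
```

Passing the lines of a two-chain file through `keep_chain(lines, "A")` leaves just the first copy of the protein, which is the copy puzzle 1440 is based on.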
The sequence for 3DR2 shows three additional segments at the beginning of both chains. The experiment didn't find these segments for chain A, so the model starts at segment 4. (For chain B, the first *four* segments somehow went missing.)
So, looking at chain A of 3DR2, segment 4 is segment 1 in puzzle 1440. You see the segment numbers (or "residue" numbers) when you hover over the segment in a "Mol" viewer, similar to shift-tab in Foldit.
Overall, segments 1 to 302 in Foldit are segments 4 to 305 of chain A of 3DR2 in the PDB. Segment 303 in Foldit is the aflatoxin ligand, which isn't present in 3DR2.
A few segments in the middle of 3DR2 also didn't show up in the results. For the A chain, segments 210, 211, and 212 are missing. The puzzle designers filled in the corresponding segments, 207 to 209, in puzzle 1440. These formerly missing segments are included in the ones which can be designed in 1440.
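The numbering offset described above is easy to get wrong by one, so here is a small Python sketch of it. It only encodes what's stated in this post: Foldit segments 1 to 302 are chain A segments 4 to 305 of 3DR2, and chain A segments 210 to 212 were missing from the experiment and filled in by the puzzle designers. The function and constant names are made up for this example.

```python
# Offset between chain A of PDB 3DR2 and puzzle 1440, per the post above.
PDB_OFFSET = 3
FOLDIT_FIRST, FOLDIT_LAST = 1, 302          # protein segments in puzzle 1440
MISSING_IN_CHAIN_A = {210, 211, 212}        # not resolved in the experiment


def foldit_to_pdb(segment: int) -> int:
    """Convert a puzzle 1440 segment number to 3DR2 chain A numbering."""
    if not FOLDIT_FIRST <= segment <= FOLDIT_LAST:
        raise ValueError(f"segment {segment} is not part of the protein")
    return segment + PDB_OFFSET


def pdb_to_foldit(residue: int) -> int:
    """Convert a 3DR2 chain A residue number to a puzzle 1440 segment."""
    segment = residue - PDB_OFFSET
    if not FOLDIT_FIRST <= segment <= FOLDIT_LAST:
        raise ValueError(f"residue {residue} has no segment in puzzle 1440")
    return segment


def was_filled_in(segment: int) -> bool:
    """True if this Foldit segment was missing from 3DR2 chain A."""
    return foldit_to_pdb(segment) in MISSING_IN_CHAIN_A
```

For example, `pdb_to_foldit(210)` gives 207, matching the filled-in segments 207 to 209 mentioned above. Segment 303 (the ligand) is deliberately rejected, since it has no counterpart in 3DR2.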
(Second in a series, collect them all!)
As Susume has noted, the protein in puzzle 1440 has a "beta propeller" shape.
This complicated shape no doubt is key to how the protein works as an enzyme. The abstract for PDB entry 3DR2 describes it in technical terms, saying it
...forms a novel disulfide-bonded clamshell dimer comprising two doughnut-shaped six-bladed beta-propeller domains, yet with an exceptionally long N-terminal subdomain forming an extra helix and four additional beta-strands to enclose half of the outermost beta-strands of each propeller.
At least the "clamshell" and "doughnut" parts are easy. Some of the other stuff requires more explanation.
A "dimer" is just a protein built in two parts. Both parts may be identical, as in 3DR2. (OK, *mostly* identical.)
The "disulfide-bonded" part refers to the disulfide bridges that would connect the two halves of the dimer. Since one half is missing, these bridges aren't present in puzzle 1440. The cysteines at segments 5 and 157 in Foldit would join the two halves of the dimer. (Segment 5 on the A chain would be bridged to segment 157 on the B chain, and vice-versa, so two disulfide bridges.)
There *is* a bridge between segments 242 and 279 in 1440, but both these segments are locked, so there's no chance to do anything with it. Segments 15, 99, 258, and 295 also start out as cysteine, and are unlocked. You might be able to mutate other segments to cysteine to form bridges with them, or you can also mutate them to be something other than cysteine. The PDB for 3DR2 doesn't even list the bridge between 242 and 279, so you're on your own if you want to add more.
Continuing on in the abstract, "domains" and "sub-domains" are really just "parts" or "chunks".
The "N-terminal" of the protein means the part that starts at segment 1, the amiNo end. The other end is the "C-terminal", the aCid end. N-terminal refers to the nitrogen in the "amino group" of the amino acid. Segment 1 is the first amino acid added to the protein, and its nitrogen isn't bonded to anything.
"Beta-strands" are what we call "sheets" in Foldit. Technically, each flat zigzag part in the cartoon view would be a "beta strand", and you could only call it a "beta sheet" when two or more separate strands are bonded together. For Foldit, dropping the "beta" part of the name and just calling a strand a sheet works well enough.
The "beta" name came about because sheets/strands were discovered second back in the early days of X-ray crystallography. Helixes were discovered first, so you got alpha-helixes.
It's easy to find *three* additional sheets ("beta-strands") in the N-terminal part of 3DR2 and puzzle 1440. Each of these three sheets is bonded to a different but adjacent blade of the propeller. We'll look for a fourth "additional" sheet in yet another post, which will hopefully include some pictures.
Here's the aflatoxin ligand from puzzle 1440, segment 303. It's a little different than the structures you see for aflatoxin B1 (AFB1) and other common varieties of aflatoxin. For example, atom 24, at the top of the image, is calcium. Everything else is either carbon (light blue, using EnzDes coloring here), oxygen (red), or hydrogen (white). It can be a little hard to tell, but the hydrogens are atoms 26 through 38.
The scientists will have to explain exactly what's going on here.
The atoms were identified by using structure.GetAtomCount, then drawing a band between the ligand and a nearby segment of the protein. Atom 24 was a little difficult to pin down, but it was traced through the config files 0002004322.ir_puzzle.params and database/chemical/atom_type_sets/fa_standard/atom_properties.txt.
Puzzle 1440 aflatoxin ligand with atom numbers: Puzzle 1440, segment 303, aflatoxin ligand.
Thanks Loci! That makes it much easier to talk about specific atoms in the ligand!
The pentagon that hangs directly off the lactone ring (atoms 5, 7, 8) has two hydrogens each on carbons 7 and 8 (hydrogens 26, 27, 28, 29) so the hydrogens don't lie in the plane of the pentagon. Can that pentagon form a pi sandwich with an aromatic sidechain (or two), or do the hydrogens get in the way?
Also, can a pentagon and a hexagon form a pi sandwich with each other, or do they have to be the same shape?
Here's aflatoxin B1 as seen in JMol. It's quite a bit different than the ligand at segment 303 in puzzle 1440.
Aflatoxin B1: Aflatoxin B1 seen in JMol.
I think the only difference is that where ours has an OH (25, 38) and an oxygen with the calcium ion (23, 24), the B1 has only a single O. All the rest is the same. The reason the bonds look different is that Foldit draws rings with double bonds all the way around as a convention, while JMol uses a different convention and draws some as double and some as single. The Baker lab scientists who were at the launch event explained this to me when I asked why the carbons in our ligand look like they have more than 4 bonds each.
They also said the ligand we were given is an aflatoxin that has been partly metabolized - that's how it gains the extra OH and the calcium. Edit to add: after re-reading rmoretti's comment on a recent feedback, I realize I completely misunderstood about the state of the ligand. It's not partly metabolized - it's a transition state - see s0ckrates' helpful analogy below for what that means, because I haven't really got a clue.
The Foldit convention of doubling all the bonds in a ring kind of makes sense. There's something about how there aren't quite enough electrons to go around, and they end up being shared among all the atoms in the ring. I think this case is sometimes represented as a hexagon with a circle inside it, but that may be something else. (Getting tired of looking things up on Wikipedia.)
Nothing like getting used aflatoxin. Seems like par for the course. Other games probably have nice fresh aflatoxin to work with.
On a related (?) front, the ring where the calcium is attached in ours is probably the lactone group that's the target. Not sure if it could *still* be considered a lactone, due to the changes Susume has noted, but I personally don't need another funky/weird chemical name at the moment.
Speaking of chem names, the oxygen, carbon, and three hydrogens on the lower right of both images above is a methoxy group, which you may see as OMe on some diagrams. Or if you prefer, it's a methyl group (the CH3 part), also known as Me, bound to oxygen (O).
Phew, I need to rest for a bit.
Here are the details of the "propeller" shape of the protein in this puzzle.
PDB id 3DR2 propeller blades: "beta propeller" seen in puzzle 1440
As described previously, there are six "blades" on this propeller. Each blade has at least four sheets (or four "beta strands" if you're writing for publication). Three blades have an extra sheet taken from the "N-terminal domain", or the beginning part of the protein.
Blade six is a further exception. It actually has two of its sheets (strands) from the N-terminal domain. It's where the fourth N-terminal sheet mentioned in the abstract for 3DR2 is hiding. The first three sheets are attached to blades 1, 6, and 5, respectively. Then the N-terminal domain reverses course into a helix. The fourth N-terminal sheet ends up bound to the second one, forming the outer two sheets of blade 6 in this diagram.
N-terminal stuff aside, each blade is made from consecutive sheets arranged in a back-and-forth "anti-parallel" style. (Or maybe, "boustrophedonically".) The outer sheet from each blade goes into a somewhat long section of loop, crossing over to start the inner sheet of the next blade.
The blades in this protein are not at all the same, so it would make a pretty shaky propeller. There's a small helix between the innermost two sheets of blade 1. Blade 3 has a much larger and very bent helix between its middle two sheets. (The dividing line I've drawn goes through this helix, but it's all part of blade 3.) That and the whole N-terminal part would throw the whole thing way out of balance.
There's nothing magic (or scientific) about how I numbered the blades here, but they are in segment order, except for the N-terminal contributions.
This image was captured from JMol. The annotations were done in Paint Shop Pro, which also handled the screenshots. It's not clear to me how much this information helps with folding, but it's kind of interesting in its own right.
Thanks to Loci for all the information. I'd found the JMOL info, but must admit I'm still a bit lost about what to do with this for a "good science" answer. I'd recommend we let the puzzle close first -- and then perhaps show some examples of how to solve this usefully
I can get solutions that bond the 3DR2 to the ligand, but they don't score well. I have ancient and limited chemistry background, so something directed at what sorts of things to work on before the next scoring puzzle would be really useful. Short of adding the dimer -- that would put the puzzle size beyond my machine
I'd be happy to contribute to a thread of questions after the close
I come from a gamer and science background, so here's my time to shine.
The protein we're working on here is an enzyme, if I remember correctly, and part of an effective enzyme is stabilizing the "transition state" of its substrate.
Say we were trying to make an enzyme to snap a stick in half. The "transition state" would be a bent, but not snapped, stick. If we designed the enzyme to be able to have a super stable "bent stick state" (which is actually what we have in puzzle 1440, a transition state of aflatoxin), then it can do its job better. The bent stick state needs less energy to snap the stick than the normal stick state, and the enzyme's usual job for catalyzing is
To have a stable bent sti--er transition state, we just follow mostly the same rules for Foldit, except we're not so much concerned with blue/orange placement but we're more so focused on packing the aflatoxin in nice and cozy (minimize voids, that is) and hydrogen bonding the hell out of aflatoxin (to hold it in place for that "bent stick state") with a whole bunch of your polar blue boys ready to go.
The team has decided to release the results around mid-December when the aflatoxin challenges end. This way, no player can gain an advantage by viewing other players' solutions.