Recipe: AA Edit 2.0.1
Created by LociOiling


Name: AA Edit 2.0.1
ID: 102879
Created on: Sun, 09/02/2018 - 22:33
Updated on: Thu, 04/16/2020 - 20:24

AA Edit displays the protein's amino acid sequence using 1-character codes. For puzzles with "mutable" segments, it applies any updates entered to the protein. Version 2.0 handles multiple chains, and puzzles with DNA or RNA. Version 2.0.1 has a quick fix for proline at the N terminal.



Joined: 12/27/2012
Groups: Beta Folders
more science...

AA Edit 2.0 now displays each protein chain or non-protein section separately, making it easier to use with the PDB and other "real scientist" tools. There are several other minor changes, such as displaying the segment count and revisions to the scriptlog output, plus new support for RNA, DNA, and other rare puzzle features. See "Version 2.0 changes" below for more detail.

As before, AA Edit displays the primary structure of the protein as a sequence of single-character amino acid codes. The sequence can be copied and pasted using the normal keyboard shortcuts. (For example, on Windows, ctrl-c for copy, and ctrl-v for paste.)

For proteins which contain "mutable" segments, a new sequence can be pasted into the "seq" box. When you click the "change" button, the recipe attempts to mutate each segment to the new value entered.

For puzzles with multiple chains, there will be chain A, chain B, and so on. The sequence for each chain can be cut and pasted separately.

The recipe does not attempt to change non-mutable segments. It does not change any segments where the current amino acid is the same as the new amino acid. It does not change any segments where the new amino acid code is not one of the 20 amino acids recognized by Foldit.

In puzzles with ligands, the ligand may be represented by one or more segments which return amino acid type "unk" instead of one of the normal codes. The recipe changes "unk" to "x" for display purposes. Since "x" is also not one of the recognized Foldit codes, it's ignored when the change is applied.
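The display rule above can be modeled in a few lines. This is a Python sketch of the logic only, not the recipe's actual Lua code; the function names are made up for illustration, and it assumes Foldit reports lowercase one-letter codes for normal amino acids.

```python
# The 20 standard amino acids as lowercase one-letter codes (assumed casing).
STANDARD_CODES = set("acdefghiklmnpqrstvwy")


def display_sequence(codes):
    """Join per-segment codes into one string, showing "unk" as "x"."""
    return "".join("x" if code == "unk" else code for code in codes)


def is_recognized(code):
    """True only for the 20 standard codes, so "x" is never applied."""
    return code.lower() in STANDARD_CODES


if __name__ == "__main__":
    # Two protein segments followed by a ligand segment.
    print(display_sequence(["g", "a", "unk"]))
    print(is_recognized("x"))
```

Because "x" fails the recognition check, a displayed ligand can be copied and pasted back without the recipe trying to mutate it.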

The "change" button is not displayed for proteins with no mutable segments, and any changes to the "seq" box are ignored in this case.

The general behavior of this recipe is similar to the companion recipe, SS Edit 1.2. If the input sequence is shorter than the length of the protein, the segments at the end of the protein are left unchanged. If the input sequence is longer than the protein, the extra portion of the input is ignored.
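Taken together, the skip rules and the length handling described above might look like the following. This is a hedged Python sketch rather than the recipe's own Lua; `plan_changes` and its parameters are hypothetical names, and the mutable flags are assumed to be supplied by the caller.

```python
# The 20 standard amino acids as lowercase one-letter codes (assumed casing).
STANDARD_CODES = set("acdefghiklmnpqrstvwy")


def plan_changes(current, mutable, new_seq):
    """Return (segment number, new code) pairs that would be mutated.

    current: current one-letter code of each segment, in order
    mutable: one bool per segment, True if the segment is mutable
    new_seq: the sequence pasted into the "seq" box
    """
    changes = []
    # zip() stops at the shorter input: a short new_seq leaves the trailing
    # segments unchanged, and the extra part of a long new_seq is ignored.
    triples = zip(current, mutable, new_seq.lower())
    for number, (old, is_mutable, new) in enumerate(triples, start=1):
        if not is_mutable:
            continue  # never touch non-mutable segments
        if new == old:
            continue  # already the requested amino acid
        if new not in STANDARD_CODES:
            continue  # e.g. "x" for a ligand, or a typo
        changes.append((number, new))
    return changes


if __name__ == "__main__":
    # Four glycines, the second one locked; input is longer than the protein.
    print(plan_changes(list("gggg"), [True, False, True, True], "aaxgzz"))
```

Segment numbers start at 1 here to match how Foldit numbers segments.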

A new sequence can be pasted in its entirety by clicking to the left of the first character of the displayed sequence, then typing ctrl-v or the equivalent "paste" shortcut in your environment.

Alternatively, you can clear the input field (via ctrl-x or the equivalent), then paste the new sequence onto a "blank slate".

The recipe also references the "select all" shortcut (ctrl-a or the equivalent). In general, this does not seem necessary in a Foldit textbox. For example, clicking anywhere in the textbox and typing ctrl-c copies the entire contents of the textbox. Clicking anywhere in the textbox and typing ctrl-x clears the entire textbox. Use "select all" if your environment exhibits different behavior. Thanks to Bertro for pointing out the nuances of copy-and-paste in Foldit.

See also print protein 2.8 for a recipe which captures more information about the protein.

Version 2.0 changes

As mentioned above, handling multiple chains is the key feature in version 2.0.

A few puzzles have had more than one protein chain. Examples include the "insulin mutant", periodically seen as a revisiting puzzle, and the recent hydrogen bond puzzles 1552, 1555, and 1564.

AA Edit now displays each chain separately, so you'll see chain A and chain B for the puzzles above. Most Foldit puzzles have only one chain, which will be displayed as chain A.

For clarity, each ligand is now treated as a separate chain. On a typical ligand puzzle, you'll see protein chain A and ligand chain B.

AA Edit now also handles DNA and RNA, although testing has been very limited. Each section of DNA or RNA is treated as a separate chain. The single-character codes are displayed. We've only had one RNA puzzle, 1463, and it was all RNA (no protein), and did not allow mutations. Although there are mutable DNA intro puzzles, they don't allow recipes. Bugs may emerge if we ever get more DNA or RNA puzzles, but at least AA Edit won't display all x's any more.

AA Edit now handles non-standard amino acids, such as the ones found in puzzles 879 and 1378b. Both these puzzles had one amino acid with an attached glycan. These AAs were indicated by the code "unk", but unlike a ligand, had a normal secondary structure code (H, E, or L). AA Edit recognizes these non-standard AAs, and treats them as a part of the protein chain.

Chain detection

AA Edit uses at least three different methods to detect chains.

For proteins, the atom count can be used to detect the ends of the chain. Since a disulfide bridge can also change the atom count, the recipe includes some disulfide logic, but it doesn't try to determine which cysteines are part of a bridge.

The secondary structure code is another indicator, with type "M" (for molecule) indicating a ligand. Each ligand is treated as a one-segment chain.

Foldit uses special codes for RNA and DNA. For example, "ra" indicates RNA adenine, and "da" presumably would mean DNA adenine. These codes are the third way AA Edit detects a different chain.
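As a rough illustration of how the three indicators could be combined, here is a Python sketch. It is not the recipe's Lua code. The `Segment` record and its `starts_chain` flag (standing in for the atom count check) are hypothetical, and treating any two-letter code starting with "r" or "d" as RNA or DNA is an assumption extrapolated from the "ra" example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    number: int                 # segment number, starting at 1
    code: str                   # e.g. "g", "unk", or "ra"
    ss: str                     # secondary structure: "H", "E", "L", or "M"
    starts_chain: bool = False  # hypothetical result of the atom count check


def kind(seg):
    """Classify a segment as ligand, rna, dna, or protein."""
    if seg.ss == "M":
        return "ligand"
    if len(seg.code) == 2 and seg.code[0] == "r":
        return "rna"   # assumed pattern, based on "ra"
    if len(seg.code) == 2 and seg.code[0] == "d":
        return "dna"   # assumed pattern, based on "da"
    # Includes "unk" with a normal H/E/L code (non-standard amino acid).
    return "protein"


def split_chains(segments):
    """Group segments into (kind, [segment numbers]) chains."""
    chains = []
    previous = None
    for seg in segments:
        current = kind(seg)
        new_chain = (
            current != previous                             # section type changed
            or current == "ligand"                          # one chain per ligand
            or (current == "protein" and seg.starts_chain)  # atom count says N terminal
        )
        if new_chain:
            chains.append((current, []))
        chains[-1][1].append(seg.number)
        previous = current
    return chains
```

Note that adjacent RNA or DNA segments always land in the same chain in this sketch, since nothing in the available information marks a boundary between them.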

The current chain logic may not be 100% reliable. For example, a glycosylated (had to look that one up) or other non-standard amino acid at the beginning or end of a chain might be an issue. There's also no known way to detect the case of two separate but adjacent RNA or DNA chains. Based on the limited information available, it seems like the atom count method won't work for RNA or DNA.

The code in AA Edit 2.0 is a work in progress, but may be useful in other recipes. The next step is to add the new logic to the "print protein" family.

[Edit: that's "insulin mutant"...]

Version 2.0.1 quick fix

Version 2.0.1 fixes chain detection when proline is at the N terminal. In this case, the atom count is 18, instead of 15 when proline is mid-chain. Usually the atom count increases by 2 at the N terminal.

(Edit: mid-chain count is 15.)
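The proline special case could be expressed along these lines. This is a Python sketch under stated assumptions, not the recipe's Lua code: the function names are hypothetical, the table of mid-chain atom counts must be supplied by the caller, and only the proline numbers (15 and 18) come from the note above. Disulfide bridges and the C terminal are ignored here.

```python
def n_terminal_count(code, mid_chain_count):
    """Expected atom count when this amino acid starts a chain."""
    if code == "p":
        return 18  # proline: 18 at the N terminal (15 mid-chain), per the note above
    return mid_chain_count + 2  # the usual increase at the N terminal


def looks_like_chain_start(code, atom_count, mid_chain_counts):
    """True if atom_count matches the N-terminal count for this code."""
    mid = mid_chain_counts.get(code)
    if mid is None:
        return False  # unknown code: no basis for a decision
    return atom_count == n_terminal_count(code, mid)


if __name__ == "__main__":
    # "p": 15 is from the text; "g": 7 is an illustrative placeholder only.
    counts = {"p": 15, "g": 7}
    print(looks_like_chain_start("p", 18, counts))
    print(looks_like_chain_start("p", 17, counts))
```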




Developed by: UW Center for Game Science, UW Institute for Protein Design, Northeastern University, Vanderbilt University Meiler Lab, UC Davis
Supported by: DARPA, NSF, NIH, HHMI, Amazon, Microsoft, Adobe, Boehringer Ingelheim, RosettaCommons