Recipe: NetSurfP 1.0
Created by LociOiling


Name: NetSurfP 1.0
ID: 102278
Created on: Sun, 01/08/2017 - 23:27
Updated on: Sun, 01/08/2017 - 23:27

Convert NetSurfP webpage output into secondary structure prediction and copy-and-paste spreadsheet format.

Joined: 12/27/2012
Groups: Beta Folders
convert NetSurfP to secondary structure string and tab-delimited

NetSurfP is yet another web-based secondary structure prediction service. (JPred is another.)

NetSurfP outputs its results in a columnar format. The predictions for helix, sheet, and loop are expressed as probabilities.

(NetSurfP also predicts the "surface accessibility" of a given residue, which seems to be more or less the inverse of the likelihood that the residue is buried in the hydrophobic core.)

The formatting for the NetSurfP results doesn't lend itself to being pasted directly into a spreadsheet.

This recipe does two things. First, it converts the NetSurfP output to a tab-delimited format that can be pasted into a spreadsheet. Second, it creates a secondary structure string. Both formats are also used by the print protein 2.4 recipe.
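As a rough illustration of the first step, here is a minimal Python sketch. It is not the recipe's actual Lua code. The function name `netsurfp_to_tsv` is hypothetical, and the sketch assumes each data row has exactly the 10 whitespace-separated columns listed in the NetSurfP header, with no spaces inside the sequence name.

```python
def netsurfp_to_tsv(text):
    """Convert whitespace-separated NetSurfP rows to tab-delimited lines.

    Hypothetical helper: assumes 10 columns per data row, as described
    in the NetSurfP comment header.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the optional "#" comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Ignore anything that does not look like a 10-column data row.
        if len(fields) != 10:
            continue
        rows.append("\t".join(fields))
    return "\n".join(rows)


sample = """# Column 10: Probability for Coil
E T  Sequence               1    0.865 120.003   0.476   0.003   0.003   0.994
E E  Sequence               2    0.758 132.423   0.324   0.694   0.003   0.303
"""
print(netsurfp_to_tsv(sample))
```

Each output line has one tab between fields, so a spreadsheet should place every value in its own column when the text is pasted.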

To use this recipe, run a NetSurfP prediction, then copy the output to the clipboard. (It's not necessary to include the comment lines, but you can if you wish.)

Run the recipe, and paste the NetSurfP output into the textbox on the first screen. When you click OK, the second screen displays two text boxes, one containing the spreadsheet output, and one containing the secondary structure string.

The secondary structure string is created by picking the secondary structure type with the highest probability for each segment. The picking logic is quite simple, and doesn't worry about ties or close finishes.
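The picking logic described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the recipe's Lua code: the function name `secondary_structure` is hypothetical, the one-letter codes H (helix), E (sheet), and L (loop) are assumed to be the ones Foldit expects, and a tie simply goes to whichever type comes first in helix, sheet, loop order.

```python
def secondary_structure(text):
    """Build a secondary structure string from NetSurfP output.

    Hypothetical helper: for each residue, pick the type with the
    highest probability among columns 8 (helix), 9 (strand), 10 (coil).
    """
    codes = "HEL"  # assumed codes, in the same order as columns 8, 9, 10
    ss = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comment lines, and anything not 10 columns wide.
        if len(fields) != 10 or fields[0].startswith("#"):
            continue
        probs = [float(p) for p in fields[7:10]]
        # index() returns the first maximum, so ties are not treated specially.
        ss.append(codes[probs.index(max(probs))])
    return "".join(ss)


sample = """# Column 10: Probability for Coil
E T  Sequence               1    0.865 120.003   0.476   0.003   0.003   0.994
E E  Sequence               2    0.758 132.423   0.324   0.694   0.003   0.303
E E  Sequence               3    0.741 129.488   0.588   0.782   0.003   0.216
"""
print(secondary_structure(sample))  # prints "LHH"
```

Residue 1 has a coil probability of 0.994, so it becomes L; residues 2 and 3 have helix as the largest value, so they become H.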

This recipe depends heavily on the NetSurfP output format. Any changes to NetSurfP output will require revisions to the recipe.

Sample NetSurfP output:

# For publication of results, please cite:
# A generic method for assignment of reliability scores applied to solvent accessibility predictions.
# Bent Petersen, Thomas Nordahl Petersen, Pernille Andersen, Morten Nielsen and Claus Lundegaard
# BMC Structural Biology 2009, 9:51 doi:10.1186/1472-6807-9-51
# Column 1: Class assignment - B for buried or E for Exposed - Threshold: 25% exposure, but not based on RSA
# Column 2: Amino acid
# Column 3: Sequence name
# Column 4: Amino acid number
# Column 5: Relative Surface Accessibility - RSA
# Column 6: Absolute Surface Accessibility
# Column 7: Z-fit score for RSA prediction
# Column 8: Probability for Alpha-Helix
# Column 9: Probability for Beta-strand
# Column 10: Probability for Coil
E T  Sequence               1    0.865 120.003   0.476   0.003   0.003   0.994
E E  Sequence               2    0.758 132.423   0.324   0.694   0.003   0.303
E E  Sequence               3    0.741 129.488   0.588   0.782   0.003   0.216
E R  Sequence               4    0.409  93.707   0.281   0.858   0.002   0.139
E K  Sequence               5    0.380  78.063   0.459   0.923   0.002   0.076
E K  Sequence               6    0.597 122.844   1.114   0.938   0.007   0.055
E E  Sequence               7    0.609 106.340   1.316   0.970   0.001   0.030
B I  Sequence               8    0.066  12.284  -0.022   0.970   0.001   0.030
E Q  Sequence               9    0.436  77.941   0.867   0.970   0.001   0.030
E K  Sequence              10    0.613 126.012   1.216   0.970   0.001   0.030
Joined: 12/27/2012
Groups: Beta Folders
note on line formats for Mac and Linux

I had to kludge a bit to get the line breaks right on Windows. When pasted into a Foldit textbox on Windows, the lines apparently retain the CRLF format (0x0d0a). The Lua pattern used to read the pasted lines allows for an optional carriage return.

On Mac and Linux, the pasted lines will probably have just a newline (0x0a). Hopefully, the code allows for this, but it's not been tested on other platforms.
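The idea of accepting both line endings can be illustrated with a short Python sketch. The recipe itself uses a Lua pattern, so this is only an analogous example, and the function name `split_pasted_lines` is hypothetical.

```python
import re

# "\r?\n" matches either a Windows CRLF (0x0d0a) or a bare newline (0x0a).
LINE_BREAK = re.compile(r"\r?\n")


def split_pasted_lines(text):
    """Split pasted text into non-empty lines, whatever the line ending."""
    return [line for line in LINE_BREAK.split(text) if line]


windows_text = "E T  Sequence 1\r\nE E  Sequence 2\r\n"
unix_text = "E T  Sequence 1\nE E  Sequence 2\n"
print(split_pasted_lines(windows_text) == split_pasted_lines(unix_text))
```

Making the carriage return optional means the same parsing code should work whether the paste comes from Windows, Mac, or Linux.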

Please send a PM if there are problems.

Joined: 09/24/2012
Groups: Go Science

It's interesting to define your own structure from an Excel sheet, with your own preferences.
The SS proposed by this recipe can be copied and pasted into note 19, then run, in order to apply it to the puzzle.
