fun with clay - My software as a protein

Case number: 699969-989581
Topic: General
Opened by: Seagat2011
Status: Closed
Type: Suggestion
Opened on: Wednesday, April 20, 2011 - 06:25
Last modified: Saturday, July 27, 2013 - 05:36

My software as a protein

I decided to design my software as a protein. The goal was to be able to reverse engineer it.

APIs - Active Sites

Helices - for/while/repeat commands
Beta strands - horizontal function calls
Loops - linear flow of execution

I use helices to represent while-loop procedures because of their natural way of looping while also flowing forward.

I use beta strands to represent lua functions because of their "cross-linking" behavior.

Foldit uses 26 common amino acids, so I decided to allow each instruction word 26 degrees of freedom (00-25, A-Z).

lua keyword -- Amino Acid (AA) designation -- comment

for - A(00) - philic - Used to indicate the start of a helix (software loop)
while - B(01) - philic - Used to indicate the start of a helix (software loop)
do - C(02) - philic / phobic - I designate it as phobic because do prefers to be inside the software loop
end - D(03) - philic - Used to indicate the end of a helix (software loop) or beta strand (lua - api / function call)

In Lua we also have "print" statements, which send visible feedback to the outside environment :) I needed a way of doing this in proteins too, so I decided to use "aromatic" AAs, which absorb UV light. Measuring the UV absorbance of the aromatic residues (PHE, TYR, and TRP) in the polypeptide chain helps determine the concentration of protein present.

lua keyword -- Amino Acid (AA) designation -- comment

print - E(04) - no pref - aromatic - emit to the outside environment
print - F(05) - no pref - aromatic - emit to the outside environment
print - G(06) - no pref - aromatic - emit to the outside environment

I choose to leave H(07) - Z(25) undesignated: These can be used to define custom variables, procedures, and function calls :) You will also see later, in solving the overall structure, it doesn't matter what these lengths are.
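
The designations above can be written down as a small lookup table. This is only a minimal sketch and not part of the original script: the names KEYWORD_CODE and residue_label are made up for illustration, and it assumes that code NN maps to the NN-th uppercase letter (00 = A ... 25 = Z).

```lua
-- Hypothetical lookup table for the keyword designations listed above.
-- Only the first code assigned to print, E(04), is used here.
local KEYWORD_CODE = {
  ["for"]   = 0,
  ["while"] = 1,
  ["do"]    = 2,
  ["end"]   = 3,
  ["print"] = 4,
}

-- Format a numeric code as "LETTER(NN)", assuming 00 = A ... 25 = Z.
local function residue_label(code)
  assert(code >= 0 and code <= 25, "code must be in 00-25")
  return string.format("%s(%02d)", string.char(string.byte("A") + code), code)
end

print(residue_label(KEYWORD_CODE["for"]))  -- A(00)
print(residue_label(KEYWORD_CODE["end"]))  -- D(03)
```

Codes 07 through 25 are left out of the table on purpose, since they are free for custom names.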

Here is my original piece of code:

local function _compile_segment_score ( total_segs )

    -- running total of the per-segment scores
    local segment_scor = 0
    for idx = 1,total_segs do
        segment_scor = segment_scor + get_segment_score ( idx )
    end

    return segment_scor

end

do

    local segment_score
    local total_segments

    segment_score = 0
    total_segments = get_segment_count ()

    -- sum the score of every segment, then report it
    segment_score = _compile_segment_score ( total_segments )

    print ( segment_score )

end
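
get_segment_count and get_segment_score are supplied by Foldit, so the script above only runs inside the game. To try the logic elsewhere, one option is to stub the two calls. The stubs and the made-up scores in FAKE_SCORES below are assumptions for illustration only; they say nothing about what Foldit actually returns.

```lua
-- Made-up per-segment scores; real values come from Foldit at run time.
local FAKE_SCORES = { 10.5, -2.0, 7.25 }

-- Stand-ins for the two Foldit API calls used by the original script.
local function get_segment_count()
  return #FAKE_SCORES
end

local function get_segment_score(idx)
  return FAKE_SCORES[idx]
end

-- Same logic as the original function: sum the score of every segment.
local function _compile_segment_score(total_segs)
  local segment_scor = 0
  for idx = 1, total_segs do
    segment_scor = segment_scor + get_segment_score(idx)
  end
  return segment_scor
end

local total_score = _compile_segment_score(get_segment_count())
print(total_score)
```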

..in the above code we have:

2 Active Sites (i.e. 2 API calls: get_segment_count, get_segment_score)
1 beta sheet (2 strands, i.e. 1 function call with return)
1 for
2 do
3 end
1 print
5 custom variables (i.e. segment_score, total_segments, segment_scor, total_segs, idx)
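
A tally like this can be cross-checked mechanically. The sketch below is not part of the original script: count_words is a hypothetical helper that counts every whole word in the source text. Because it counts plain textual occurrences, `do` shows up twice, once in the `for` header and once opening the main block.

```lua
-- The original script, kept as a string so it can be scanned.
local SOURCE = [[
local function _compile_segment_score ( total_segs )
local segment_scor = 0
for idx = 1,total_segs do
segment_scor = segment_scor + get_segment_score ( idx )
end
return segment_scor
end
do
local segment_score
local total_segments
segment_score = 0
total_segments = get_segment_count ()
segment_score = _compile_segment_score ( total_segments )
print ( segment_score )
end
]]

-- Count how often each whole word (identifier or keyword) occurs.
local function count_words(source)
  local counts = {}
  for word in source:gmatch("[%a_][%w_]*") do
    counts[word] = (counts[word] or 0) + 1
  end
  return counts
end

local counts = count_words(SOURCE)
for _, word in ipairs({ "for", "do", "end", "print" }) do
  print(word, counts[word])
end
```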

..and now we can assign residue types..

2 Active Sites - get_segment_count - K(10), get_segment_score - M(12)
1 beta sheet - (2 strands) - _compile_segment_score - Z(25), with return and end (accounted for, below)
1 for - A(00)
2 do - C(02)
3 end - D(03)
1 print - E(04)
5 custom variables (i.e. segment_score - N(13), total_segments - O(14), segment_scor - P(15), total_segs - Q(16), idx - R(17))

..now line up the statements..

_compile_segment_score - total_segs - segment_scor - for - idx - total_segs - do - segment_scor - segment_scor - get_segment_score - idx - segment_score - end - do - segment_score - total_segments - segment_score - total_segments - get_segment_count - segment_score - _compile_segment_score - total_segments - print - segment_score - end

..then assign residues..

_compile_segment_score(strand)(25) - total_segs(16) - segment_scor(15) - for(00) - idx(17) - total_segs(16) - do(02) - segment_scor(15) - segment_scor(15) - get_segment_score(12) - idx(17) - segment_score(13) - end(03) - do(02) - segment_score(13) - total_segments(14) - segment_score(13) - total_segments(14) - get_segment_count(10) - segment_score(13) - _compile_segment_score(25) - total_segments(14) - print(04) - segment_score(13) - end(03)
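
The assignment step is a plain table lookup over the lined-up tokens. The following sketch reproduces it; the helper name encode and the tables CODE and TOKENS are made up for illustration, while the token order and the numeric codes are copied from the post.

```lua
-- Residue codes from the assignments above (keywords plus custom names).
local CODE = {
  ["for"] = 0, ["do"] = 2, ["end"] = 3, ["print"] = 4,
  get_segment_count = 10, get_segment_score = 12,
  segment_score = 13, total_segments = 14,
  segment_scor = 15, total_segs = 16, idx = 17,
  _compile_segment_score = 25,
}

-- Token order copied from the lined-up statements above.
local TOKENS = {
  "_compile_segment_score", "total_segs", "segment_scor", "for", "idx",
  "total_segs", "do", "segment_scor", "segment_scor", "get_segment_score",
  "idx", "segment_score", "end", "do", "segment_score", "total_segments",
  "segment_score", "total_segments", "get_segment_count", "segment_score",
  "_compile_segment_score", "total_segments", "print", "segment_score", "end",
}

-- Hypothetical helper: turn a token list into a dash-separated code string.
local function encode(tokens, code_table)
  local out = {}
  for i, token in ipairs(tokens) do
    local code = code_table[token]
    assert(code, "no residue code for token: " .. token)
    out[i] = string.format("%02d", code)
  end
  return table.concat(out, "-")
end

print(encode(TOKENS, CODE))
```

The printed string is the bare residue sequence, without the secondary-structure annotations.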

..then assign my secondary structures..

(strand)25-16-15-(helix-start)00-17-16-02-15-15-12-17-13-(helix-end)03-02-13-14-13-14-(Active Site)10-13-(strand)25-14-(Aromatic)04-13-03

..for the next step, I incorporate the entire function call into the beta strand because, although strands can be any length, they are usually of equal length. The helix, now embedded in the strand, should no longer be necessary.

E(Active Site)-02-13-14-13-14-10(Active Site)-13-E-14-(Aromatic)-13-03

..the resulting sequence is..

E(Active Site A)-L-L-L-L-L-L-L(Active Site B)-L-E(Active Site A)-L-L(Aromatic)-L-L

..and so we have a resulting 14-residue sequence. Now if each residue further takes on 3 possible conformations, according to restricted (torsion) angles, then the entire sequence can take on 3 ^ 14 = 4,782,969 possible conformations. This is a very large number, considering the code hasn't really "done" anything yet. So how do we now go backwards and interpret what the sequence means?
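
When each of the 14 residues independently picks one of 3 conformations, the choices multiply: 3 x 3 x ... x 3, with one factor per residue. A minimal sketch of that arithmetic follows, assuming the residues really are independent; count_conformations is a hypothetical helper name.

```lua
-- Number of conformations when every residue independently takes one of
-- `states` shapes: `states` multiplied together once per residue.
local function count_conformations(residues, states)
  local total = 1
  for _ = 1, residues do
    total = total * states
  end
  return total
end

print(count_conformations(14, 3))  -- 4782969
```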

Answer: the active sites.

There are many different ways the code can be composed to serve this single purpose, but we will always need 2 active sites to accomplish this task. The "aromatic" designation in my example was unimportant. I call this protein's intended purpose its "theme" - because no matter how differently the sequence is constructed, its purpose does not change.
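
One way to picture the "theme" idea is to reduce a token sequence to the set of active sites it touches and ignore everything else. The sketch below is an illustration under that assumption, not the author's method: the helper theme, the table ACTIVE_SITES, and both example token lists are made up.

```lua
-- Names treated as active sites (the Foldit API calls in the example).
local ACTIVE_SITES = {
  get_segment_count = true,
  get_segment_score = true,
}

-- Hypothetical helper: the "theme" of a token list is the sorted set of
-- active sites it contains, ignoring all other tokens and their order.
local function theme(tokens)
  local seen, names = {}, {}
  for _, token in ipairs(tokens) do
    if ACTIVE_SITES[token] and not seen[token] then
      seen[token] = true
      names[#names + 1] = token
    end
  end
  table.sort(names)
  return table.concat(names, ",")
end

-- Two differently written token lists that use the same two API calls.
local version_a = { "do", "total", "get_segment_count", "for", "idx", "do",
                    "sum", "get_segment_score", "end", "print", "sum", "end" }
local version_b = { "get_segment_score", "while", "n", "get_segment_count",
                    "do", "end" }

print(theme(version_a))
print(theme(version_b))
```

Both lists reduce to the same theme even though their keywords, variables, and ordering differ.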

(Wed, 04/20/2011 - 06:25  |  2 comments)



3 ^ 14 possible conformations

spmm
Status: Open » Closed

can be found by search
