cjmarsh's picture
User offline. Last seen 5 years 4 weeks ago. Offline
Joined: 05/21/2011
Groups: Go Science

I've received some requests for more information about a recipe/program we (Seagat2011, thom001, rav3n_pl, and I) are currently creating for foldit. Since the recipe will eventually be available under a GNU license, it certainly couldn't hurt to answer any questions about what we're doing. To start out with, I'll just post some of the posits the recipe idea is based on to get the discussion started:

-- posit: foldit exists to leverage human ingenuity ( when defined as human meta-pattern recognition )
-- posit: human capacity for meta-pattern recognition is unmatched by traditional computational Turing machines
-- posit: traditional computational machines are unmatched in computing power in R^1 ( where R^1 is the real numbers )
-- posit: creating an interface between human meta-pattern recognition and machine computation in R^1 would yield exponentially greater efficiency in relation to the separate sum of their efficiencies
-- posit: indexing levels of complexity with members of a finite set in R^1 models human meta-pattern recognition
-- posit: if members of R^1 are Godel numbers representing each function in the foldit API then they can be manipulated in one level of complexity
-- posit: recursively defining complexity and projecting everything to base two yields an axiomatic operation of two binary numbers for every instruction in a program
-- posit: if the binary numbers are encoded into sufficiently small bit types then every program can be defined the same way
-- posit: A program which defines the set of every segment containing ideal positions, "score", and time would then yield a number representing the ideal protein
-- posit: if the protein's Godel number is used to calculate the necessary band forces then the ideal protein can be engineered
-- posit: if the program implemented the manipulation of the protein's Godel number then a protein of any structure can be engineered
-- goal: write a program according to the definitions outlined above and below to enable generation and manipulation of a protein's Godel number, apply user-defined forces and provide a set of modular, exposed methods for protein engineering
-- ... more content/implementation ...
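
To make the Godel-number posits a little more concrete, here is a minimal Python sketch of classic prime-power Godel numbering applied to a short sequence of instructions. It is only an illustration under stated assumptions, not the recipe itself: the names in FUNCTION_CODES are hypothetical placeholders rather than actual Foldit API functions, and the encoding shown is the textbook one, which may differ from whatever the recipe ends up using.

```python
from typing import Dict, List

# Hypothetical instruction table: placeholder names, NOT the real Foldit API.
FUNCTION_CODES: Dict[str, int] = {"shake": 1, "wiggle": 2, "add_band": 3}


def first_primes(count: int) -> List[int]:
    """Return the first `count` primes using simple trial division."""
    primes: List[int] = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def godel_encode(codes: List[int]) -> int:
    """Encode positive integer codes c1, c2, c3, ... as 2^c1 * 3^c2 * 5^c3 * ..."""
    if any(c < 1 for c in codes):
        raise ValueError("every code must be >= 1")
    number = 1
    for prime, code in zip(first_primes(len(codes)), codes):
        number *= prime ** code
    return number


def godel_decode(number: int) -> List[int]:
    """Recover the list of codes by reading off the exponent of each prime in order."""
    codes: List[int] = []
    primes: List[int] = []
    candidate = 2
    while number > 1:
        if all(candidate % p for p in primes):
            primes.append(candidate)
            exponent = 0
            while number % candidate == 0:
                number //= candidate
                exponent += 1
            if exponent == 0:
                raise ValueError("not a valid encoding: a prime was skipped")
            codes.append(exponent)
        candidate += 1
    return codes


program = ["wiggle", "shake", "add_band"]
program_number = godel_encode([FUNCTION_CODES[name] for name in program])
```

Because prime factorisation is unique, the single integer program_number can be decoded back into the same instruction sequence, which is the property the posits rely on.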

If you've got a question about something in particular, please ask. Thanks.

cjmarsh's picture
more content:


" = " is the logical assignment
" is " = logical assignment
" -- " = comment/note
" in " = logical subset member of a superset
" and " = logical and -- (inclusive)
" or " = logical or -- (inclusive)
" xor " = logical exclusive or
" != " = logical assignment negation -- ( does not equal )
" ~ " = logical negation
" <==> " = logical if and only if -- (a.k.a. iff)
" ==> " = logical therefore
"n" = variable number in R^1
"m" = variable number in R^1
"< n >" = variable number in R^m
" R^m " = reals in m dimensions
"< m, n >" = vector in R^2 of ( m and n ) in R^1
" PE " = potential energy signature -- a.k.a. score
" Min " = infimum (global minimum) of PE = inf(PE) = < n, M(n) > -- Note: the infimum is the minima of all local minima, which indicates the limit of the trajectory at this point collapses into a local minima.
" M(n) " = < M(n-1), M(n-1) > -- M(0) = ( 1 or 0 ) in the case of a Turing machine.

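One possible reading of the recursive definition of M(n) in the notation list is that M(n) is a pair of two M(n-1) values, bottoming out in single bits at M(0). The Python sketch below builds such a nested pair from a flat list of bits. This interpretation, and the helper names build_m and depth, are assumptions made for illustration only.

```python
from typing import List, Tuple, Union

# An M value is either a single bit (level 0) or a pair of lower-level M values.
MValue = Union[int, Tuple["MValue", "MValue"]]


def build_m(bits: List[int]) -> MValue:
    """Build the nested pair M(n) from 2**n bits (assumed reading of the definition)."""
    if not bits:
        raise ValueError("need at least one bit")
    if len(bits) == 1:
        if bits[0] not in (0, 1):
            raise ValueError("M(0) must be 0 or 1")
        return bits[0]
    if len(bits) % 2:
        raise ValueError("number of bits must be a power of two")
    half = len(bits) // 2
    return (build_m(bits[:half]), build_m(bits[half:]))


def depth(m: MValue) -> int:
    """Return n for a value built as M(n)."""
    return 0 if isinstance(m, int) else 1 + depth(m[0])


example = build_m([1, 0, 0, 1])
```

Under this reading M(n) carries 2^n bits, so each extra level doubles the amount of information.
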
A1 = The "Native fold" is the most stable fold
A2 = Any potential energy signature is non-unique

-- Hypothesis: the function for PE is actually an m-dimensional manifold where the inf(PE) occurs at infinitely many points, hence the native fold is both the most stable and one of many.

POSIT: [1] ( A1 and A2 ) and ( A1 and ~A2 ) <==> [2] Min : R^1 -> R^m <==> [3] p = ~p

Let Min = { m and < n > }, m in R^1, < n > in R^m
==> Min = { inf(PE) = m, inf(PE) = < n > }
==> inf(PE) = m = < n > ( <==> < n > = < 1 > )
==> Min : R^1 -> R^1
==> A1 and A2
qed 1a
==> inf(PE) = m != < n > ( <==> < n > != < 1 > )
==> A1 and ~A2
qed 1b
==> Min : R^1 -> R^m
qed 2
==> ( A1 and A2 ) and ( A1 and ~A2 ) <==> Min : R^1 -> R^m
==> If A2 = n = p
==> A2 = ~A2
==> p = ~p
qed 3
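
As a small sanity check on the propositional part of the posit only, the Python sketch below enumerates every truth assignment. It shows that ( A1 and A2 ) and ( A1 and ~A2 ) is false for every assignment, just as p = ~p is, which is consistent with the two being placed on either side of an iff. It says nothing about the Min mapping in [2]; the function and variable names are illustrative.

```python
from itertools import product


def posit_lhs(a1: bool, a2: bool) -> bool:
    """Statement [1]: (A1 and A2) and (A1 and not A2)."""
    return (a1 and a2) and (a1 and not a2)


# Every assignment of A1, A2 that makes statement [1] true.
satisfying = [
    (a1, a2)
    for a1, a2 in product([False, True], repeat=2)
    if posit_lhs(a1, a2)
]

# Every value of p that makes statement [3], p = ~p, true.
p_equals_not_p = [p for p in (False, True) if p == (not p)]
```

Both lists come out empty, so statements [1] and [3] are each unsatisfiable in ordinary two-valued logic.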


itskimo's picture
User offline. Last seen 9 years 43 weeks ago. Offline
Joined: 08/10/2010
Groups: foldeRNA
for seagat about the cci script

5/10/2011 10:12:47 AM
[04:40] hi seagat :)
[04:40] i read the paper and looked at the charts, so i understand the problem better
[04:43] as i see it the AAs have too many similarities and do not vary enough, so the computer has too many choices.
[04:45] in the eterna game there are only 4 colors: green, red, blue and yellow. with even just 4 variables the computer can match the charges.
[04:46] even changing just one of the colors in a chain can totally disrupt the protein and make it change into a completely different shape
[04:48] so with your cci script there are just too many variables to have the computer sift through them all
[04:49] so enter the human, who might have an intuitive feeling for what might be changed
[04:50] of course you need to be in a puzzle that allows you to change the AAs
[04:50] so it seems to me that there needs to be another function involved
[04:50] this would be a stepping feature
[04:51] so that you enter the segments you want and it would do 2 things
[04:54] the stepping feature would either step through a series of segment selections by matching s1, s2, then s1, s3, then s1, s4 etc. so that you can run a series of segment selections without having to start the script over and over
[04:55] also it might be nice if the stepper could slide the selected segments back and forth through a series of steps a selected distance
[04:58] for example you select the segments to be checked and the program checks that and then steps the segments 1 segment to the left and then 1 segment to the right and then 2 segments and then 3 or however many you specify in the stepping window
[04:59] this would give you a range of solutions without having to run the script over and over, thereby giving you a choice
[05:01] i think you are onto something. i just hope it's not the tiger's tail. what you need to do is get up on top of the tiger's back so that you can ride him through the forest :)
[05:03] anyway that's about it. in conclusion i like the idea of having st tool scripts so that you can at least see what's happening
[05:04] but you also need to animate the boring parts and get a range of solutions, not just one at a time
[05:04] i hope this helps :)
[05:05] let me know
[05:07] oh, i do hope you have caught all this. in the middle of the night i can think better :)
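
The stepping feature described in the chat above could be sketched roughly as follows. This is a Python illustration of the two behaviours (pairing one segment with a run of others, and sliding a selection left and right by growing distances), not part of the cci script. The function names, the 1-based segment numbering, and the segment_count bound are assumptions.

```python
from typing import Iterator, Tuple


def pair_steps(anchor: int, first: int, last: int) -> Iterator[Tuple[int, int]]:
    """Yield (anchor, s) for s = first..last, i.e. s1,s2 then s1,s3 then s1,s4 ..."""
    for other in range(first, last + 1):
        if other != anchor:
            yield (anchor, other)


def slide_steps(
    start: int, end: int, max_shift: int, segment_count: int
) -> Iterator[Tuple[int, int]]:
    """Yield the selection, then the same selection shifted 1 left, 1 right, 2 left, 2 right ...

    Shifted selections that would fall outside segments 1..segment_count are skipped.
    """
    yield (start, end)
    for shift in range(1, max_shift + 1):
        for offset in (-shift, shift):
            lo, hi = start + offset, end + offset
            if lo >= 1 and hi <= segment_count:
                yield (lo, hi)


pairs = list(pair_steps(1, 2, 4))
windows = list(slide_steps(5, 7, max_shift=2, segment_count=8))
```

A script could run its check once per yielded selection and collect the results, giving the range of solutions asked for without restarting by hand.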

cjmarsh's picture
cci script


You are completely right in thinking we need to reduce the number of variables for the computer to process. Technically, any Turing machine is efficient in R^1 and nowhere else (unless we translate for it). So, distilling the variables down into R^1 is essentially what we are currently working on.

In order to do so, we need to translate every variable (and therefore every protein) into a one-to-one function in R^11 (R^1:->R^10). This function will then be mapped to a one-to-one function in R^1 (R^1:->R^1) which can be computed by a Turing machine. When the results are parsed by the program into the R^11 point represented by the protein function, any variable in the protein can be expressed in terms of the others with a one-to-one function. This is significant because all one-to-one functions have a global minimum and global maximum (infimum and supremum) and so can be optimized for any variable (like score).
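
As a loose analogy for collapsing an 11-component point into a single number and back, the Python sketch below uses the Cantor pairing function, a well-known one-to-one mapping between pairs of non-negative integers and single non-negative integers. It works on integers rather than reals and is not the mapping the recipe uses; the names pack and unpack are illustrative.

```python
from math import isqrt
from typing import Tuple


def pair(x: int, y: int) -> int:
    """Cantor pairing: a one-to-one map from two non-negative integers to one."""
    return (x + y) * (x + y + 1) // 2 + y


def unpair(z: int) -> Tuple[int, int]:
    """Inverse of pair()."""
    w = (isqrt(8 * z + 1) - 1) // 2
    y = z - w * (w + 1) // 2
    return w - y, y


def pack(values: Tuple[int, ...]) -> int:
    """Fold a tuple of non-negative integers into one integer, right to left."""
    if not values or any(v < 0 for v in values):
        raise ValueError("need at least one value, all non-negative")
    packed = values[-1]
    for value in reversed(values[:-1]):
        packed = pair(value, packed)
    return packed


def unpack(packed: int, length: int) -> Tuple[int, ...]:
    """Recover a tuple of the given length from its packed integer."""
    values = []
    for _ in range(length - 1):
        value, packed = unpair(packed)
        values.append(value)
    values.append(packed)
    return tuple(values)


point = (3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)  # an arbitrary 11-component example
single_number = pack(point)
```

Since every step is invertible, unpack(single_number, 11) returns the original 11 components, so nothing is lost by working with the single number.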

PS: If you're curious about why we chose R^11:
Basically, we believe R^11 to be the minimum amount of information required to represent any object (in the universe). In protein terms, it comprises two quaternions (think vectors with a twist around the vector line, a.k.a. rotation vectors), four components each, which represent the atom's overall pushing force and the atom's overall attracting force (or the overall pushing force on the atom), plus variables for charge, mass, and time. In the general case, the two quaternions represent the perspective rotation and the universal rotation (i.e. your eye and the world around you).
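
To show how the 11 numbers add up, here is a minimal Python sketch of one atom's state as two quaternions plus charge, mass, and time. The class and field names are hypothetical and chosen only for this illustration; they do not come from the recipe.

```python
from dataclasses import dataclass
from typing import Tuple

Quaternion = Tuple[float, float, float, float]


@dataclass(frozen=True)
class AtomState:
    """Hypothetical container for the 11 values described above."""

    push: Quaternion  # overall pushing force
    pull: Quaternion  # overall attracting force
    charge: float
    mass: float
    time: float

    def as_vector(self) -> Tuple[float, ...]:
        """Flatten to a point in R^11: 4 + 4 quaternion components + 3 scalars."""
        return (*self.push, *self.pull, self.charge, self.mass, self.time)


atom = AtomState(
    push=(1.0, 0.0, 0.0, 0.0),
    pull=(0.0, 1.0, 0.0, 0.0),
    charge=-1.0,
    mass=12.0,
    time=0.0,
)
vector = atom.as_vector()
```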

Any other questions/ideas, please don't hesitate.

