Developer Chat

inkycatz Welcome everyone to our March science chat! 13:59
inkycatz As always our chats are logged so you don’t miss a thing, be sure to look afterwards for the link in here and on the news 13:59
inkycatz We’ve got a nice pile of questions today. 13:59
inkycatz But as always if yours doesn’t hit the main question period take heart, you may see it soon on a blog 13:59
inkycatz or just answered in the thread itself. (we have a couple like that this time I think) 14:00
inkycatz As usually happens, 1-2 side questions are fine but only if they’re quick ones, 14:00
inkycatz please no multi-part “followups” today, or hold those until the end. 14:00
inkycatz This keeps it moving for everyone 14:00
inkycatz and especially those who posted in advance. :) 14:00
inkycatz (And thank you to all our contributors to that thread) 14:00
inkycatz Let’s get rolling with our first question 14:01
inkycatz What are the differences in the scoring function between design puzzles and prediction puzzles? 14:01
bkoep In addition to the filters, Foldit design puzzles usually include a few other adjustments to the score function. 14:02
bkoep The two big ones affect the Backbone subscore and the Ideality subscore 14:02
bkoep The Backbone subscore is augmented so that penalties (but not bonuses) are steeper as you move away from shaded regions of the Rama map 14:03
bkoep Effectively, we like designed proteins to have all residues in the fully "favored" regions of the Rama map; but it's unrealistic to expect this of native proteins, which will typically have a few residues on the fringes 14:04
bkoep The Ideality subscore is also upweighted, so that Foldit designs are less tolerant of deviant bond lengths and angles 14:05
bkoep Again, this is a high bar to enforce for native proteins 14:06
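
A minimal sketch of the two adjustments bkoep describes above, not Foldit's actual score function. It assumes that a negative subscore is a penalty and a positive one is a bonus, and the names `penalty_scale` and `ideality_weight`, along with their values, are hypothetical.

```python
def adjust_backbone_subscore(raw_score, penalty_scale=2.0):
    """Steepen penalties (negative values) and leave bonuses untouched.

    penalty_scale is a made-up illustrative factor, not a Foldit setting.
    """
    if raw_score < 0:
        return raw_score * penalty_scale
    return raw_score


def design_total(subscores, ideality_weight=1.5, penalty_scale=2.0):
    """Sum subscores using the hypothetical design-puzzle adjustments."""
    total = 0.0
    for name, value in subscores.items():
        if name == "backbone":
            # Asymmetric: only the penalty side gets steeper.
            total += adjust_backbone_subscore(value, penalty_scale)
        elif name == "ideality":
            # Upweighted: deviant bond lengths and angles cost more.
            total += ideality_weight * value
        else:
            total += value
    return total
```

With these toy numbers, a residue that earns a Backbone bonus scores the same in both puzzle types, while one that takes a Backbone penalty loses twice as much in a design puzzle.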
inkycatz Do you ever take an interesting backbone from a foldit design and use Rosetta to find better sidechains for it? 14:06
bkoep We've experimented with this a little bit in the past, but it wasn't very fruitful 14:07
bkoep In order to get appreciably different sidechains, Rosetta would also have to let the backbone move quite a bit 14:08
bkoep Also, there's no reason that Rosetta should be any better at picking sidechains than Foldit's Shake tool 14:09
inkycatz What are the reasons the Blueprint tool is available on some puzzles and not others? 14:10
bkoep The Blueprint tool has only been available on design puzzles, for reasons similar to why the score function is a little different 14:11
bkoep The Blueprint tool is pretty restricting; there are only a handful of super-stable loop types available. 14:11
bkoep We like this for design, because it restricts the backbone to super-stable, highly probable shapes 14:12
bkoep But any given native protein is likely to have a few strange, moderately-stable loops 14:13
bkoep So the Blueprint tool doesn't seem terribly appropriate for structure prediction puzzles. However, if players think it would be useful, we could certainly try it in some prediction puzzles! 14:14
tokensIRC The blueprint tool would be useful when we get a prediction puzzle for a design made in foldit (but that might be considered cheating a bit) 14:14
bkoep @tokens, haha, yes I see your point 14:14
inkycatz I have noticed the prediction scores for the SS do not appear in puzzles 1344, 1347, 1350. What criteria do you use to determine if these are shown or not? 14:15
bkoep If not outright cheating, it would definitely reduce the usefulness of those prediction puzzles 14:16
bkoep If "SS prediction scores" is in reference to the old logo plots, we haven't included those plots for some time 14:17
bkoep We used to use a SS prediction server that generated those nice logo plots for a query sequence, but that server fell out of maintenance 14:18
bkoep We've since been using PSIPRED, which does also report a confidence measure for its predictions, but it doesn't print the nice logo plots 14:19
bkoep I suppose we could generate those plots ourselves, or report the SS prediction confidence to players some other way 14:20
inkycatz Is it really worthwhile to submit to scientists a fold which is only r50? 14:21
bkoep YES 14:21
inkycatz :D 14:21
inkycatz well that was clear! 14:22
bkoep The whole motivation behind the "Submit to Scientists" button is to highlight low-scoring, but potentially interesting, designs 14:22
inkycatz If possible, can you elaborate on what makes the wet lab folds work and not work, rather than the score? So far, we mostly have either horseshoe helices (from one end), helices from each end, or a "seatbelt" fold which is similar to the Koga & Koga designs. 14:23
bkoep As it is, this feature is still very much under-utilized by players 14:23
bkoep If you have designed a protein that you really like, or think is interesting for some reason, please share it with us, and make a note in the 'Description' text box about why you like it 14:24
spvincent It is hard to know what is interesting 14:25
jflat06 It's worth noting that we DO look at these, and we have found things that have made us excited. 14:25
spvincent as opposed to random quirky rubbish 14:25
jflat06 as an example: 14:26
bkoep @spvincent, we like to know what YOU think is interesting 14:26
jflat06 you may feel that the score function just isn't rewarding what you intuitively feel is a better fold. 14:26
bkoep Quirky rubbish is fine 14:26
spvincent ok, fine 14:26
alcor29 Guess I have to keep submitting quirkies. 14:27
Susume2 it's useful to know that you read the Description field 14:27
jflat06 You might have a crazy idea that you wanted to try out and let us see the result, even if it doesn't score well 14:28
jflat06 for example 14:28
jflat06 like a pi helix! 14:28
jflat06 (but really, we've had enough pi helices) 14:28
tokensIRC The good old pi helix days :) 14:29
inkycatz (it has to be a really quirky pi helix, if you must) 14:29
bkoep About what makes certain folds work: I can't elaborate on what works and what doesn't, mostly because we don't know 14:29
bkoep One very interesting question in biology is whether nature has exhaustively searched all possible protein folds 14:30
bkoep There's a lot of evidence suggesting it hasn't, which would mean that it's possible to design protein folds that do not exist in nature 14:30
bkoep The Top7 designed protein is a good example of a designed protein that has a unique fold—no known natural protein has the same fold as Top7 14:31
inkycatz In the recent blog comments, bkoep mentions that the box plots about improvement in "top scoring solutions" are based on the top 10 solo and evo solutions per puzzle. Was 10 the cutoff just for those graphs, or is that a cutoff you use for sending things to Rosetta as well? For example, if my best solution finishes at rank 11-30, does it get looked at equally with the others, or do I need to share that best one with scientists as well as my lower-scoring tracks? 14:34
bkoep There are two different pathways for a Foldit design to reach the wet lab 14:35
jeff101 do you ever e-mail/PM players who have been making interesting but low-scoring designs to encourage them ? 14:37
bkoep The first is that it is high-scoring. The cream of the crop is automatically submitted to Rosetta@home, mainly as a way to benchmark our progress from week to week. We don't even really inspect these designs. 14:37
bkoep The second pathway involves manual inspection. Here we're looking for designs that look like promising folders to us. 14:38
Susume2 like roughly how many are the "cream of the crop"? 14:39
bkoep We start with the shared solutions. Then we move on to the bulk solutions, which are clustered (to remove identical or near-identical solutions), and ranked by score. 14:39
inkycatz (so another good benefit to sharing solutions!) 14:39
bkoep We go down the list of clustered solutions (which naturally includes the top-ranking solutions) and inspect anything that looks "plausible"—usually around 100 models. 14:41
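
The cluster-then-rank step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the team's actual pipeline: the distance measure (a plain root-mean-square deviation over flat coordinate lists, with no superposition), the `threshold` value, and the dictionary layout are all hypothetical.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def cluster_and_rank(solutions, threshold=1.0, limit=100):
    """Greedy clustering: walk solutions from best score to worst and keep
    one only if it is farther than `threshold` from every solution kept so far.

    Each solution is a dict with "name", "score", and "coords" keys
    (a hypothetical layout chosen for this sketch).
    """
    ranked = sorted(solutions, key=lambda s: s["score"], reverse=True)
    representatives = []
    for sol in ranked:
        if all(rmsd(sol["coords"], rep["coords"]) > threshold
               for rep in representatives):
            representatives.append(sol)
        if len(representatives) == limit:
            break
    return representatives


solutions = [
    {"name": "a", "score": 10, "coords": [0.0, 0.0, 0.0]},
    {"name": "b", "score": 9, "coords": [0.1, 0.0, 0.0]},  # near-duplicate of "a"
    {"name": "c", "score": 8, "coords": [5.0, 5.0, 5.0]},
]
shortlist = [s["name"] for s in cluster_and_rank(solutions)]
```

Because the list is walked in score order, each cluster is represented by its highest-scoring member, so the top-ranking solutions are naturally included in the shortlist.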
tokensIRC I have made several designs where the prediction tool raptorx was unable to predict a good contact map for the design. Is raptorx a good indicator for whether the design is likely to fold as designed? 14:42
bkoep @Susume, it varies a little bit based on the solo/evolver make-up of the rankings, but think "top 10" 14:42
Susume2 thx 14:42
bkoep @tokens, no this is probably not an appropriate use of RaptorX 14:43
bkoep I'm not terribly familiar with RaptorX, but in general, contact map predictions are based on sequence homology 14:44
bkoep Sometimes these predictions can be based on very distant sequence homology, but nevertheless rely on an alignment of multiple proteins that are similar to the query protein 14:44
bkoep Unless your design is closely related to a natural protein (and it shouldn't be), then I would not use contact prediction as a proxy for "likeliness to fold" 14:46
tokensIRC ok, thx 14:47
rmoretti From skimming the paper, the contact predictions from RaptorX do indeed work with evolutionary coupling and sequence conservation, so they're unlikely to be of use for novel protein designs. 14:47
inkycatz Do you use filters when you send some "ideal" design to Rosetta? 14:49
bkoep @jeff101, I don't think we've reached out to any specific players. If we see something interesting, we'd be more likely to make a public discussion out of it. 14:50
bkoep We have not been using any additional filters when submitting designs to Rosetta@home 14:51
inkycatz Looks like we’re about at our last question! 14:51
inkycatz Are there disordered loops and ordered loops? Can you explain these to us if so? 14:52
bkoep We run some additional analysis on Foldit designs, but this is more for benchmarking than for selection 14:52
Wbertro http://fold.it/portal/node/2003280#comment-34163 14:52
Wbertro http://fold.it/portal/comment/reply/2003481/34472 14:53
Wbertro will there be follow-up? 14:53
inkycatz These are on our list 14:53
Wbertro Thanks 14:53
bkoep Yes, in natural proteins there are ordered loops and disordered loops 14:53
inkycatz (but as noted, probably likely in a separate post outside chat :) 14:54
Wbertro I understand 14:54
bkoep Really, in a solution, it's better to think of this as a spectrum. Different parts of the protein will be more flexible and have more freedom to move around. 14:54
spvincent I was wondering if all the exterior Lysines and Arginines that Mutate puts in introduce any kind of bias 14:54
bkoep In crystallography, when most of the protein is locked down into a rigid conformation, there is a more distinct boundary between 'ordered' and 'disordered', but this does not necessarily reflect how the protein behaves in solution. 14:55
bkoep In an electron density map, for example, if density is missing for some residues, that just means that those residues were disordered in the crystal (and usually this means those residues are also disordered in solution) 14:56
bkoep However, residues that appear rigid in a crystal can sometimes be very flexible in solution. In many proteins, this flexibility is essential for their biological function. 14:57
jeff101 should we expect more voids in disordered parts of a protein? 14:58
bkoep @jeff101, no, probably not 14:59
bkoep Because Foldit uses an implicit solvent model, this is a little tricky (sometimes a void can be filled with water) 15:00
inkycatz (Just as a side note, we’re past our official hour so I want to be sure to thank bkoep, jflat, our lurking scientists and of course… all of you before you run off.) 15:00
bkoep But usually, disordered just means: "There are protein atoms in this region, but their position and orientation varies from molecule to molecule in the crystal" 15:01
Wbertro Will there be a follow-up on the Drug design puzzles? 15:02
jflat06 Regarding bertro's questions: 15:02
jflat06 Remix is designed to find shapes for a backbone between two endpoints. As such, it isn't equipped for dealing with end-points. This is something that we would like to address. One solution we have considered is reverting to rebuild in these cases. 15:03
jflat06 We'd also like to get rid of rebuild if possible, though, as it requires some large libraries that inflate our download size. 15:04
jflat06 (and I believe also increase initial load times) 15:04
Wbertro Two things almost everyone can live with I think 15:05
jflat06 But we would like to have a solution for endpoints 15:05
jflat06 Rebuild also isn't the most ideal solution in general 15:05
Wbertro yes, it is a bit harsh on the protein 15:06
Susume2 rebuild seems to sample nearby space more broadly than remix (finds more similar shapes that score better) 15:07
jflat06 With the way rebuild works, it just lowers its standards until it finds something new, then tries to fix it up. 15:08
jflat06 it's useful when you really want to put something there, but ends up introducing some pretty poor quality fragments in the process 15:08
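
A toy illustration of the "lower the standards until something new turns up" behaviour jflat06 describes for Rebuild. This is not Foldit's implementation: the 0-100 quality scale, the step size, and the function name `find_new_fragment` are all assumptions made for the sketch.

```python
def find_new_fragment(candidates, current, start_quality=90, step=10, floor=0):
    """Relax the quality threshold until a fragment other than `current` passes.

    candidates: list of (name, quality) pairs, quality on a hypothetical 0-100 scale.
    Returns (name, threshold_used), or None if nothing new exists at any threshold.
    """
    threshold = start_quality
    while threshold >= floor:
        for name, quality in candidates:
            if name != current and quality >= threshold:
                return name, threshold
        threshold -= step  # nothing new at this bar, so lower it
    return None


fragments = [("helix_a", 95), ("loop_b", 40), ("loop_c", 65)]
result = find_new_fragment(fragments, current="helix_a")
```

Here the only fragment that clears the initial bar is the one already in place, so the threshold drops to 60 before `loop_c` is accepted. That is the sense in which a relaxing search can end up introducing lower-quality fragments.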
tokensIRC Some things cannot be done with remix afaik, like banding segments to space and let rebuild move them to their new position 15:09
Wbertro I think Remix disregards all bands when working 15:10
jflat06 we haven't looked at Remix in a while - it would be worth revisiting it to deal with some of these inadequacies 15:12
jflat06 another possibility that we have is to increase the library from which remix pulls, which would increase how often you could find things, as well as the quality of the things you find. 15:12
Wbertro :) 15:13
jflat06 that would ramp up the download size even larger (hundreds of megabytes), but if we remove the rebuild library, that could be a reasonable tradeoff 15:13
jflat06 (and again, not just download size - memory requirements and potentially load times as well) 15:14
Wbertro we download GBs for other games, so 100-300 MB more is nothing 15:15
jflat06 i can respect that is true for some players, but some new players with poorer connections may immediately dismiss the download if they see it taking too long 15:16
jflat06 but more pressing is the memory requirements 15:16
jeff101 do the remix/rebuild libraries just get downloaded when we update clients? 15:16
jflat06 i believe any time we update the "database" part of the game, you have to redownload it 15:16
jflat06 as well as the first initial download 15:17
jflat06 if i recall correctly, the whole "remix" library is ~1 gigabyte, so it's really a question of what is an acceptable size for a tradeoff 15:17
Wbertro sorry have to go, I will read the chat logs 15:18
Wbertro thanks for your time and help/support 15:19
jeff101 yes, thanks for having chats like this. they are very useful. 15:19
jflat06 any other last-minute questions? 15:20
jeff101 how does the new President affect Foldit ? 15:20
inkycatz Looking forward to the log, jflat - thanks for doing them :) 15:21
inkycatz Well. As you know Foldit is funded from a variety of sources, which include the NIH. (You can see a list here: http://fold.it/portal/info/credits) 15:28
inkycatz Once more, thanks everyone for coming. Next chat likely in May :) 15:36
inkycatz Great questions today. 15:36


Developed by: UW Center for Game Science, UW Institute for Protein Design, Northeastern University, Vanderbilt University Meiler Lab, UC Davis
Supported by: DARPA, NSF, NIH, HHMI, Microsoft, Adobe, RosettaCommons