The Secular Creationist: protein space

Saturday, September 28, 2013

islands of functionality and the flash of genius

Hazen et al. illustrate some alternative ideas of functional protein accessibility in protein space, where the plane represents the dimensionality of protein space (two dimensions are far fewer than would be needed) and the E axis represents the catalytic usefulness of the various points in protein space (in essence, a fitness landscape):

D has more of a needle-in-a-haystack problem than A, B, or C, due to its relatively small hypervolume (area in the figure) of protein space. But that is not the only aspect that makes it inaccessible; there is also its relative distance from other islands of functionality. Using the above figure a little outside its intended representation, we see that B and C are relatively close together, so that not as vast a distance of neutral mutation would have to be crossed to get from B to C as from A to D.

But vast oceans of neutral mutation to be crossed are not the only impediment to finding the points of especially high functionality ... There is also the fact that less optimal peaks might serve as attractors that divert computational resources away from the brass ring. In this case, the good is the enemy of the best.

D of Hazen et al's figure above corresponds to this diagram from one of Douglas Axe's papers, where sequence/protein space is represented by only one dimension:

Here the white noise of suboptimal adaptations might be considered negligible, all of them being more or less neutral in that they don't change the survivability rate of the organisms enough to inhibit the traversal of sequence space. It is possible that the neutral region is filled with many little hills and valleys, a so-called rugged landscape. In the Picasso-esque landscape below, it may be that the difficulty in finding the high peak of innovation is compounded, both by the volume of sequence/protein space to search and by the attractive "force" of suboptimal solutions.
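The attractor effect can be illustrated with a toy model. What follows is only a minimal sketch under loud assumptions: the one-dimensional `landscape` values are invented for illustration (they are not taken from Hazen's or Axe's figures), and the walker is a plain greedy hill climber that moves only to a strictly better neighbour, which ignores neutral drift entirely.

```python
def hill_climb(fitness, start):
    """Greedy adaptive walk: move to the best adjacent point while it is strictly better."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(fitness)]
        best = max(neighbours, key=lambda p: fitness[p])
        if fitness[best] <= fitness[pos]:
            return pos  # no strictly better neighbour: a local peak or a flat spot
        pos = best

# Hypothetical 1-D landscape: several low hills and one narrow high peak at index 12.
landscape = [1, 2, 3, 2, 1, 2, 4, 2, 1, 1, 1, 5, 9, 5, 1, 2, 3, 2]

# Where does a walker end up from each possible starting point?
peaks = {start: hill_climb(landscape, start) for start in range(len(landscape))}
global_peak = max(range(len(landscape)), key=lambda p: landscape[p])
reached = sum(1 for end in peaks.values() if end == global_peak)

print(f"{reached} of {len(landscape)} starting points reach the highest peak")
```

In this made-up landscape most starting points are captured by one of the lower hills, which is the sense in which the good is the enemy of the best.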

The size of the relevant space to be searched along with the distractive force of more accessible (more "obvious") solutions might contribute to the Non-Obviousness of the more optimal solution.

It would seem that both of these have relevance to Bennett's concept of "logical depth", as they both may drive up the necessary computational resources (or, the amount of brute force "tinkering") to realize the non-obvious solution -- where a flash of genius might render all that brute force tinkering unnecessary.

In Shadows of the Mind, in which Roger Penrose argues for mathematical insight requiring something beyond computation, Penrose has a section on "Things that computers do well -- or badly":

Conscious understanding is a comparatively slow process, but it can cut down considerably the number of alternatives that need to be seriously considered and thereby greatly increase the effective depth calculation.

In other words, a flash of insight can cross large distances of "logical depth" a la Charles H. Bennett. Insight is like a wormhole, a directed wormhole, through solution space.

Sunday, September 8, 2013

not protein space exactly

Stuart Kauffman writes about the protein space idea of Smith's. It's an interesting concept but more primary is that a string of codons inhabits configuration (genotype) space as well as protein (phenotype) space. Both spaces together constitute the hyperspace or phase space of the string.

The "walk" through phase space is not a traversal where each site is a dimension of 20 interconnected nodes. Each base pair choice is directly connected to three others, so each codon is directly connected to nine single-point mutants. In typographical space, these nine other values are one "jump" away, three of them being slightly closer than the others (transitional mutations). In protein space, the alternate codons may be near or far, depending on how the change affects the functional utility of the protein itself and, through that, the probability that the mutation will be propagated. This could be approximated, for a simple model, in terms of average substitutability, in three dimensions.
http://www.evolutionnews.org/2006/07/mathematicians_and_evolution002387.html
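The one-jump neighbourhood of a codon can be enumerated directly. This is a small sketch, not anything from Kauffman or Smith; the function names are my own, and the codon `GGA` is an arbitrary example. It lists every codon that is one substitution away and labels each change as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion.

```python
BASES = "ACGT"
PURINES = {"A", "G"}  # C and T are the pyrimidines

def is_transition(old, new):
    """True when both bases are purines or both are pyrimidines."""
    return old != new and ((old in PURINES) == (new in PURINES))

def single_point_neighbours(codon):
    """Return (neighbour, kind) pairs for every codon one substitution away."""
    result = []
    for i, old in enumerate(codon):
        for new in BASES:
            if new != old:
                kind = "transition" if is_transition(old, new) else "transversion"
                result.append((codon[:i] + new + codon[i + 1:], kind))
    return result

neighbours = single_point_neighbours("GGA")  # arbitrary example codon
transitions = [n for n, kind in neighbours if kind == "transition"]

print(len(neighbours), "neighbours, of which", len(transitions), "are transitions")
```

Each of the three positions has three alternative bases, one of which is a transition, which gives the nine neighbours with three "closer" ones described above.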

What is the combined measure of distance between nodes in phase space if typographical space has Hamming distance and protein space has some kind of functional dissimilarity measure? Probability is the common denominator. Hamming distance may be reconceived as log probability and so may functional dissimilarity.
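One way to make the log-probability idea concrete is sketched here. Everything in it is an assumption of mine, not an established measure: `mu` is a hypothetical probability of one particular substitution at one site, sites are treated as independent, the factors for unchanged sites are ignored, and `p_propagate` is a stand-in for whatever probability the functional dissimilarity would translate into. Under those assumptions the negative log probability of a typographical change is proportional to Hamming distance, and the two costs simply add.

```python
import math

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def typographical_cost(a, b, mu):
    """-log P of the specific substitutions, assuming independent sites.

    mu is a hypothetical per-site probability of one particular substitution;
    the (1 - mu) factors for unchanged sites are ignored.
    """
    return hamming(a, b) * -math.log(mu)

def combined_cost(a, b, mu, p_propagate):
    """Sum of the typographical cost and -log of a hypothetical propagation probability."""
    return typographical_cost(a, b, mu) - math.log(p_propagate)

# Illustrative numbers only; neither value is measured.
cost = combined_cost("GGA", "GCT", mu=1e-8, p_propagate=0.01)
print(round(cost, 2))
```

Because the costs are logarithms, adding them corresponds to multiplying the underlying probabilities, which is what makes probability usable as a common denominator.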

In the big picture, what matters is not simply the ability of the search through phase space to discover a "winning" amino acid sequence, but whether the utility of that sequence can be sensed by the sensing apparatus of natural selection.

Note: Long jumps through phase space are part of a bigger picture, but one that does a lot less to describe the actual behavior as a Markov process. Intragenic recombination and frame shift mutations are little wormholes through phase space, but they do little to describe evolutionary behavior. It would be interesting to be able to model these dei ex machina, but my guess is that their probabilities are very poorly understood.
http://www.sciencedirect.com/science/article/pii/S0888754306001807

Treat: Schutzenberger's "Algorithms and the neo-Darwinian Theory of Evolution":
http://www-igm.univ-mlv.fr/~berstel/Mps/Travaux/A/1967-3NeoDarwinianFullPaper.pdf

Bonus: Negoro's explanation of why he doesn't think that nylonase resulted from a frame-shift mutation.
http://www.sciencedirect.com/science/article/pii/S0022283607005347