In this video Robert Sapolsky explores the biological and conceptual / scientific implications of complex adaptive systems including discussions on cellular automata, networks, fractals, and power law distributions. He concludes with profound takeaways for science and for design:

Quality, excellence, complexity and adaptive optimization can emerge from a large quantity of simple elements operating with simple rules. "The simpler the constituent parts, the better. ... The more simple the building blocks, the better. ... More random interactions make for better more adaptive networks. ... That's how you stumble onto optimal solutions. Randomness is a good thing. ... [A]t the time that we're making new neurons in the cortex, that's when you induce the transposable events in the genome, that's where you juggle the DNA producing randomness. ... Randomness is a good thing. Randomness adds to the excellence of networks."

Gradients of attraction and repulsion provide a lot of the optimization in these systems. Nearest neighbor interactions with simple rules are vitally important. "Generalists work better than specialists. Generalists are more likely to come up with these adaptive outcomes." These emergent properties are where the complexity of human brains and their behavior comes from: but a lot more work needs to be done to really get a handle on how to think about these emergent properties.

Chaotic strange attractors show that the notion that there is an ideal or essential optimal result is a myth: "we are all deviating from the optima because the optima is just an emergent imaginary thing."

You can get complex adaptive optimized systems without top-down blueprints.

He implies throughout that reductionism has serious limitations and although we do not yet know how to get beyond reductionism in science (especially in the lab), these preliminary results from thinking about biology from this new complex systems / chaoticism perspective help us to understand better what makes us human and how the Universe actually operates.

Detailed notes below.

Cellular automata are a great way to see the principles of chaos (Melanie Mitchell defines "chaos" as sensitive dependence on initial conditions), complex systems, the butterfly effect, etc. They use simple binary rules (each "cell" is either filled in or not) to generate the next generation from a given starting state. In most cases, the rules bottom out to "all off" or "extinction". Many of the rules and starting states that "succeed" (meaning they do not go extinct) look very similar to each other after some number of generations. That is, there is a convergence to a small number of patterns, which means that you cannot recover the starting state from the "mature" pattern. Also, given the starting state and the rule, it is impossible to predict what the pattern will look like in 20 generations without actually calculating it out. That is, the starting state gives no predictive shortcut for describing the mature state. Starting states that are asymmetrical tend to produce more complex patterns than symmetrical ones.
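As a rough illustration of the kind of rule described above, here is a minimal sketch of a one-dimensional, two-state cellular automaton. The function names, the wrap-around edges, and the use of Wolfram's rule numbering are choices made for this example, not something taken from the lecture.

```python
def step(cells, rule):
    """Compute the next generation of a 1-D binary cellular automaton.

    cells: list of 0/1 values (1 = filled in).
    rule:  Wolfram rule number 0..255; bit k of the number gives the new
           state for the neighbourhood whose (left, centre, right) bits
           spell out k in binary.
    The row wraps around at the edges (an assumption of this sketch).
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        nxt.append((rule >> index) & 1)
    return nxt


def run(cells, rule, generations):
    """Return the starting row followed by `generations` successive rows."""
    history = [list(cells)]
    for _ in range(generations):
        history.append(step(history[-1], rule))
    return history


if __name__ == "__main__":
    start = [0] * 15 + [1] + [0] * 15   # a single filled cell in the middle
    for row in run(start, 90, 15):
        print("".join("#" if c else "." for c in row))
```

Two small observations from this sketch match the notes: rule 0 drives every starting state to "all off" in one step, and under rule 90 a row and its complement produce exactly the same next row, so the mature pattern does not tell you which starting state produced it.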

Analogously, we find that very simple patterns can generate different patterns of sea shells. In Africa and the Andes the very few plants that survive in the glacial equatorial mountains (at the 15,000 foot level), look very similar but are taxonomically (biologically) unrelated. So convergence onto common forms that survive in glacial equatorial mountains can happen from distinct starting points. Similarly we see convergence in animals and plants from arid regions. Biology seems to exhibit the same behaviors as do cellular automata: "slight differences magnify enormously (butterfly effects)".

Emergence: "ways in which you can code for a lot of the complexity in the natural world with small numbers of simple rules".

Neural networks: "information is not coded in a single molecule, a single synapse, a single neuron ... instead, information is coded in networks / in patterns of neural activation". He does an exercise with Paul Gauguin, Claude Monet, Vincent van Gogh, and Henri Marie Raymond de Toulouse-Lautrec-Monfa (and Picasso) as abstract expressionists, to try to illustrate how neural networks work: the more connected neurons "understand" abstract expressionism better.

Computers can effectively do sequential analytical calculations. The human brain can do parallel processing, looking for patterns and similarities in a neural network where each neuron is at the intersection of many inputs. This gives the world of similarity and metaphor. Creativity, in this model, would be very broad networks finding connections and relationships in the neural network: "A broader network in some way is going to have wiring that is more divergent at the intersection of a bunch of networks that are acting in a convergent way." In the associational cortex (which no one understands very well), you find neurons that are multimodal in their responses: "all sorts of things stimulate them". They are neither sensory neurons (in the first few layers of sensory processing) nor grandmother neurons that respond to just one thing.

Karl Lashley was searching for engrams (a postulated representation of memory) storing individual facts by destroying parts of the brains of experimental animals. "He couldn't make the information disappear" until he destroyed broader areas. He concluded in his famous paper "In search of the engram" that there could be no such thing as memory. Sapolsky takes this as evidence that neurons work in networks and not in localized structures. Early on in Alzheimer's, when just a few neurons have been killed, the memories are just harder to recall. Individual memories don't die; the network is weakened by the losses. Memories appear to be stored in networks of neurons.

Fractal genes: "genes whose instructions are scale-free". The rule "grow until five times longer than the width, then bifurcate" is a simple fractal rule for growing tree-like patterns. Thought experiment: if protein stability were a function of length (for example, when structuring a tube), at a certain length relative to width, the proteins could stop cohering until the frayed ends form a bifurcation split.
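The scale-free character of that rule can be sketched in a few lines. This is only an illustration: the function name `grow` and the `taper` parameter (how much narrower each daughter branch is than its parent) are hypothetical and do not come from the lecture; only the "five times longer than the width, then bifurcate" rule does.

```python
def grow(width, levels, ratio=5.0, taper=0.7, depth=0):
    """Apply the same rule at every scale and list the resulting segments.

    Each segment grows until it is `ratio` times longer than its width,
    then splits into two narrower daughters. `taper` is an assumed value
    chosen for illustration only.
    Returns a list of (depth, length, width) tuples.
    """
    segments = [(depth, ratio * width, width)]
    if levels > 0:
        for _ in range(2):  # bifurcate into two daughter branches
            segments += grow(width * taper, levels - 1, ratio, taper, depth + 1)
    return segments


if __name__ == "__main__":
    for depth, length, width in grow(1.0, 3):
        print("  " * depth + f"length={length:.2f} width={width:.2f}")
```

The same instruction produces the trunk and the smallest twig alike; nothing in the rule mentions absolute size, which is what "scale-free" means here.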

There is no cell in your body that is more than five cells away from a blood vessel. The circulatory system comprises less than 5% of body mass. It is a system that is "everywhere" but uses almost no space. The Cantor set (see p. 93 of James Gleick's "Chaos") is built by an infinite process of cutting out the middle third of a line, then the middle third of each remaining piece, and so on ad infinitum. The result is "an infinitely large number of objects, lines, that take up an infinitely small amount of space". It has a fractal dimension between 0 (the dimension of a point) and 1 (the dimension of a line).
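The middle-third construction is easy to carry out exactly for a finite number of steps. The sketch below (function names are this example's own) uses exact fractions so that the piece count and the total remaining length can be checked without rounding.

```python
from fractions import Fraction


def cantor_step(intervals):
    """Remove the open middle third of every interval (a, b)."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))   # keep the left third
        out.append((b - third, b))   # keep the right third
    return out


def cantor_intervals(steps):
    """Intervals remaining after `steps` rounds, starting from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(steps):
        intervals = cantor_step(intervals)
    return intervals


if __name__ == "__main__":
    for n in range(6):
        pieces = cantor_intervals(n)
        total = sum(b - a for a, b in pieces)
        print(n, len(pieces), total)
```

After n rounds there are 2^n pieces whose combined length is (2/3)^n, so the number of pieces grows without bound while the space they occupy shrinks toward zero.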

The Koch snowflake (see Gleick p. 99): cut out the middle third of a line segment and replace it with the two other sides of an equilateral triangle, each 1/3 the length of the original segment, then repeat on every new segment. This results in the seemingly impossible situation of a curve with an infinite perimeter enclosing a finite area. Its fractal dimension is between 1 and 2 (the dimension of a plane). It gives an idea of how we can have a large surface area while using a negligible amount of space.

The Menger sponge (see Gleick p. 101), in the limit case (at infinity), is an object with an infinite amount of surface area but no volume. Its fractal dimension lies between 2 and 3 (the dimension of ordinary space).

These three examples, the Cantor set, the Koch snowflake, and the Menger sponge, give an idea of how the body solves the packing problem for the circulatory system and other biological systems: something "that is everywhere and taking up virtually no space". They are examples of "how you can use a fractal system to solve the packing problem."
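For readers who want numbers, the fractal dimensions of these three objects can be computed from the standard similarity-dimension formula, log(number of copies) / log(shrink factor). The sketch below is an illustration using textbook values for each construction; the function names are this example's own.

```python
import math


def similarity_dimension(n_copies, shrink):
    """Dimension of a shape made of n_copies of itself, each 1/shrink the size."""
    return math.log(n_copies) / math.log(shrink)


def growth_per_step(n_copies, shrink, k):
    """Factor by which k-dimensional content (length, area, volume) changes
    with each construction step: below 1 it vanishes in the limit, above 1
    it grows without bound."""
    return n_copies / shrink ** k


# (copies per step, shrink factor) for each construction
FRACTALS = {
    "Cantor set": (2, 3),      # 2 pieces, each 1/3 as long
    "Koch curve": (4, 3),      # 4 segments, each 1/3 as long
    "Menger sponge": (20, 3),  # 20 sub-cubes, each 1/3 the edge length
}

if __name__ == "__main__":
    for name, (n, s) in FRACTALS.items():
        print(f"{name}: dimension {similarity_dimension(n, s):.3f}")
```

The same ratio explains the "everywhere but taking up no space" behaviour: for the Menger sponge the volume factor 20/27 is below 1 (volume vanishes) while the area factor 20/9 is above 1 (surface grows without bound); for the Koch curve the length factor 4/3 is above 1; for the Cantor set the length factor 2/3 is below 1.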

Fractal mutations: a mutation that has consequences that are scale-free. Kallmann syndrome: stuff that is wrong with midline structures in the body such as the nasal septum (separating the nostrils), in the hypothalamus, and in the septum of the heart.

Paul Green (1931-1998), developmental biologist, explained how when you heat a disk with a softer interior than the surface, you get a double saddle shape in the resultant form. "So that's how you get a potato chip." Similarly when plants bifurcate with shoots coming out, example after example have a double saddle shape. The form may be a mathematical necessity with no specification in the genetics. "It is an emergent property of the physical constraints of the system."

In 1906 Francis Galton observed at a county fair that the median guess of the weight of an ox was very accurate even though most of the guesses were not. This has become a famous example of the "wisdom of the crowds". On the quiz show "Who Wants to Be a Millionaire", contestants often choose their answer with the aid of the average guess of the audience, which tends to be correct over 90% of the time. Prediction markets: average the predictions of a number of experts to get a more accurate guess. "Put a lot of somewhat decent experts together on a problem and they will be more accurate than almost any one single amazing expert." Provided they really are experts and provided their biases are randomly distributed. A group of ants moving a food source to the nest may use a wisdom of the crowds kind of average to act in concert to head there directly even though each ant tends to wander a bit.
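A small simulation makes the two provisos concrete. This is a sketch under stated assumptions, not a reconstruction of Galton's data: the guesses are drawn from a normal distribution, and the true weight, spread, and crowd size are illustrative numbers chosen for this example.

```python
import random
import statistics


def crowd_vs_individuals(true_value, n_guessers, spread, shared_bias=0.0, seed=0):
    """Compare the error of the crowd's median with a typical individual's error.

    Guesses are simulated as normally distributed around
    true_value + shared_bias (a modelling assumption of this sketch).
    Returns (crowd_error, typical_individual_error).
    """
    rng = random.Random(seed)
    guesses = [rng.gauss(true_value + shared_bias, spread) for _ in range(n_guessers)]
    crowd_error = abs(statistics.median(guesses) - true_value)
    individual_errors = [abs(g - true_value) for g in guesses]
    return crowd_error, statistics.mean(individual_errors)


if __name__ == "__main__":
    print("random errors:", crowd_vs_individuals(1200, 800, 100))
    print("shared bias:  ", crowd_vs_individuals(1200, 800, 100, shared_bias=150))
```

When individual errors are random, they largely cancel and the median lands far closer to the truth than a typical guesser does. When everyone shares the same bias, the crowd inherits it, which is why the "biases are randomly distributed" condition matters.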

Field of emergent complexity. A small number of rules governing the behavior of many "simple" participants (agent based systems). When they act in sufficient numbers with these nearest neighbor rules, a complex adaptive system emerges. It is a bottom up organization as no agent (ant in Sapolsky's example) has any instructions or notion of what needs to be done at the aggregate, emergent level.

Travelling salesman problem: visit each of n randomly spaced locales with the least effort (distance or time). It cannot be solved by brute force for more than a handful of locales (the number of possible routes grows so fast that listing them all becomes impossible). We can program virtual ants with "swarm intelligence" so that after many iterations, a pretty efficient solution emerges. The idea is for the ants to travel from location to location randomly, leaving a pheromone trail as they go. Because the pheromone steadily evaporates, the less efficient paths will have less pheromone, and gradually more and more ants will follow the pheromone trails of the shorter paths and reinforce them with their own pheromone. This is not a wisdom of the crowds example because the ants have no concept of shortest distance or of finding an efficient path. It is an emergent swarm intelligence that isn't present at the level of the agents in the system.
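The pheromone idea can be sketched as a stripped-down ant colony routine. This is a minimal illustration, not the algorithm from the lecture or from any particular paper: the parameter names and defaults are hypothetical, and depositing pheromone in proportion to 1/tour length is an assumption standing in for "shorter paths end up with more pheromone". Cities are assumed to be distinct points in the plane.

```python
import math
import random


def tour_length(tour, cities):
    """Length of the closed loop visiting cities in the given order."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


def ant_colony_tsp(cities, n_ants=10, n_iters=50, evaporation=0.5, seed=0):
    """Return (best_tour, best_length) found by pheromone-following ants."""
    rng = random.Random(seed)
    n = len(cities)
    pheromone = [[1.0] * n for _ in range(n)]   # uniform trails to start
    best_tour, best_len = None, float("inf")
    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            tour = [rng.randrange(n)]
            unvisited = set(range(n)) - {tour[0]}
            while unvisited:
                here = tour[-1]
                options = sorted(unvisited)
                # Ants know nothing about distance: they pick the next
                # city at random, weighted only by pheromone strength.
                weights = [pheromone[here][j] for j in options]
                nxt = rng.choices(options, weights=weights)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            tours.append(tour)
        # Pheromone steadily evaporates everywhere...
        for row in pheromone:
            for j in range(n):
                row[j] *= 1.0 - evaporation
        # ...and each ant reinforces its own route, shorter routes more strongly.
        for tour in tours:
            length = tour_length(tour, cities)
            if length < best_len:
                best_tour, best_len = tour, length
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length
    return best_tour, best_len


if __name__ == "__main__":
    rng = random.Random(1)
    cities = [(rng.random(), rng.random()) for _ in range(12)]
    print(ant_colony_tsp(cities))
```

No individual ant compares routes; the preference for short tours exists only in the shared pheromone table, which is the sense in which the solution is emergent. The result is a good route, with no guarantee that it is the best one.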

Sapolsky suggests that bees, in relocating their colony, will use a similar method to move to where the better food source is by swarming toward those locations that elicit the longer waggle dance (which implies a better food source).

Another approach is to have agents that attract or repel one another (for example magnetic poles). So if different shops are attracted to each other but repelled by identical shops, an agent based model will emerge with clusters as business districts where the forces of attraction and repulsion balance. An urban plan can be developed in this way. Hmm, maybe that is how our cities actually formed: by a complex of agent-based actions of humans as mindless ants acting in their world?

Neural growth can be modelled on similar attraction and repulsion rules. Attractive signals cause a neural projection to be sent toward them, while repulsive signals cause projections to be sent in the other direction.

Sapolsky may be referencing the study in Science, "Rules for Biologically Inspired Adaptive Network Design", which he says shows that these agent-based systems can map out complex systems like subway systems more accurately than skilled humans.

Since molecules can have + and - charges on them, proteins and other molecules can assemble according to these same attraction and repulsion models.

The Harold C. Urey and Stanley L. Miller experiment synthesized complex organic compounds, including amino acids, from simple molecules by adding electricity. The experiment suggests that with the right ingredients and lightning, life could have formed spontaneously on the early Earth. Later research showed that the electricity might not be needed: just having charged molecules and random mixing will eventually result in complex organic compounds.

In a power law distribution, the rate of occurrence decays steadily as the size of the event grows. For example, less severe earthquakes occur much more frequently than major devastating earthquakes such as the 7.8 one in Nepal or the 9.0 one in Japan in 2011. The same power law distribution is observed in the distance to the other party in phone calls and in the distance that dollar bills travel (he may be referring to Brockmann's paper). The number of links between web sites, the complexity of proteins, the rate at which people send e-mails, and the degrees of separation from Kevin Bacon all follow a power law distribution. Such distributions are intrinsically fractal: the pattern is similar at different scales and for different subjects.
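The "similar at different scales" property can be written out in one line. The symbols here are notation introduced for this note (they are not from the lecture): p(x) is the rate of events of size x, C is a constant, alpha is the exponent, and k is any rescaling factor.

```latex
p(x) = C\,x^{-\alpha}
\quad\Longrightarrow\quad
p(kx) = C\,(kx)^{-\alpha} = k^{-\alpha}\,p(x),
\qquad
\log p(x) = \log C - \alpha \log x .
```

Rescaling the size by k only multiplies the curve by the constant k^{-alpha}, so its shape is unchanged at every scale, and on log-log axes a power law appears as a straight line with slope -alpha.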

The distribution of neural projections in the brain follows a power law which seems to be an efficient way to get stable interactions amongst clusters of nodes (neurons) but also has a few long distance connections. Sapolsky suggests that data from fetal neural development supports the power law distribution behavior.

Some evidence suggests that the projection profile of neurons in the cortex of autistic children follows a steeper than normal power law distribution so that there are more close-by projections and too few long distance connections. This may produce modules of function that are isolated from others. Autism may be a lack of integration between brain functions. Could it be a disease in the shape of the power law distribution of neural connections? Even though there are a perfectly normal number of neurons and probably a normal number of connections between neurons, we might have a malady due to the distribution of those connections.
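To see why a steeper power law means fewer long-distance connections, here is a small sketch. It assumes connection lengths follow a pure power law (Pareto) density proportional to x^(-alpha) above a minimum length `x_min`; the function name and the exponent values in the example are illustrative only and are not measurements from any study.

```python
def long_range_fraction(distance, alpha, x_min=1.0):
    """Fraction of connections longer than `distance` when connection
    lengths have density proportional to x**(-alpha) for x >= x_min.

    Requires alpha > 1 (so the distribution can be normalised) and
    distance >= x_min.
    """
    if alpha <= 1.0:
        raise ValueError("alpha must be greater than 1")
    if distance < x_min:
        raise ValueError("distance must be at least x_min")
    return (distance / x_min) ** (1.0 - alpha)


if __name__ == "__main__":
    for alpha in (2.0, 2.5, 3.0):   # illustrative exponents only
        print(alpha, long_range_fraction(10.0, alpha))
```

Under these assumptions, raising the exponent from 2 to 3 cuts the share of connections longer than ten units from 10% to 1%, while the total number of connections stays the same. Only the distribution of their lengths has changed.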

Male brains also have a steeper power law distribution with more modular (local) connections and fewer connections to distant regions. Due to this, the average corpus callosum is thicker in women than in men (even thinner in the autistic). The corpus callosum is a wide, flat bundle of neural fibers beneath the cortex that connects the two hemispheres of the brain in most mammals.

Instead of the old world of expert reviews, we now live in a bottom-up, self-correcting world of Amazon and Netflix ratings built on attraction and repulsion rules (likes and dislikes), wherein good recommendations emerge from a cacophony of voices (inputs). Wikipedia is one of the preeminent examples. A study in Nature found that the hard-nosed facts in Wikipedia were nearly as accurate as those in Encyclopaedia Britannica. The wisdom of the crowds can produce a self-correcting, accurate adaptive system with no blueprint and no top-down control. Weighted wisdom of the crowds sites appear to generate even better results. A significant drawback to these systems is a bias toward conformity, that is, facts or opinions with large variance (including outliers) may not emerge accurately from the bottom up.

How does the nervous system wire up its power law distribution of neural projections? Swarm intelligence: radial glial cells form a pioneer generation of radially oriented cells. Then neurons in later generations act as "random wanderers" preferentially growing along the radial glial cells. This process produces the optimal power law distribution all from simple local rules.

"Looking at a single neuron you can't tell which species it came from". Humans and fruit flys are indistinguishable. We have the same neurotransmitters, ion channels, action potentials as flys and worms ("minor details are different"). The difference is that we have more neurons. Fun anecdote: Kasparov vs. Deep Blue: the first time a computer beat a top-level chess grandmaster. Kasparov said "with enough quantity [computational power] you invent quality". It is the quantity of neurons in the human nervous system that gives us our more nuanced mental faculties despite the fact that our brains have the same basic neuronal building blocks as all other animals (and insects).

Now that the human and chimp genomes have been sequenced, we can compare them. Humans and chimps share 98% of their DNA. What's in the 2% that differs? It disproportionately codes for transcription factors, splicing enzymes, and non-coding regions. Humans have about 1000 fewer olfactory receptors than chimps do, due to pseudo-genes that are not expressed in humans: that's about half the difference between chimps and humans! There were some changes related to morphology and bone development (probably related to our bipedalism), there was one change in genes affecting hair development (humans are less hairy), and some differences in some mating-related genes. There were very few differences related to brain development: in humans, neural progenitor cells undergo more rounds of cell division than in chimps. Our only significant neural difference from chimp brains is the number of cells in our brains! Qualitatively, chimps have the same kind of brain cells as we do.

Sapolsky predicts that within our lifetime there will be a revolution that collapses some government, carried out from our living rooms through a bottom-up Internet initiative, with no physical demonstrations or bloodshed.