> The Manifold Hypothesis: The Elegant Principle Driving Deep Learning's Success


Have you ever wondered why we can so effortlessly spot a familiar face in a crowd?

Despite the seemingly infinite variations of faces around us, your brain employs an astonishing principle known as the 'Manifold Hypothesis'.

In this article we will:

  • Dive into the fascinating paradox surrounding Deep Learning
  • Grapple with the underlying concepts of Manifolds
  • Explain how modern neural networks utilise the Manifold Hypothesis
  • Discuss the possibility of whether Manifolds could lead us to AGI

Deep Learning Shouldn't Work

On paper, deep learning shouldn't work; the numbers simply don't add up.

Consider this: a 1000x1000 pixel color image, where each pixel has three color values (RGB), results in a staggering 3 million dimensions to examine. The number of possible images you can make within these dimensions is astronomical — even larger than the number of atoms in the known universe.
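To put that in perspective, here is a quick back-of-the-envelope calculation in plain Python (the variable names are just for illustration):

```python
import math

# Raw input dimensionality of a 1000x1000 RGB image.
width, height, channels = 1000, 1000, 3
dimensions = width * height * channels  # 3,000,000 dimensions

# With 256 intensity levels per dimension, there are 256 ** 3_000_000
# possible images. We can't print that number, but we can count its digits.
digit_count = int(dimensions * math.log10(256)) + 1
```

That works out to a number with over seven million digits, whereas the atom count of the observable universe has only around 80.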

Yet somehow neural networks can learn to identify patterns from relatively small datasets.

How is this possible?

The answer, as so often in AI, lies in the data itself. Despite occupying an incredibly high-dimensional space (3 million dimensions in our example), the data actually lies on a much lower-dimensional structure known as a manifold.

Understanding Manifolds: A Journey Through Dimensions

The key insight behind manifolds: they can twist through higher-dimensional spaces while maintaining a simpler internal structure.

But what does this actually mean?

In our image example, it means that a relatively simple internal structure governs natural images, and that structure merely twists and folds its way through the full 3-million-dimensional pixel space.
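A classic toy example makes this concrete: a helix is a one-dimensional curve, described by a single parameter, that twists through three-dimensional space. The sketch below is just an illustration of that idea:

```python
import math

# A helix: intrinsically 1-D (one parameter t), embedded in 3-D space.
def helix(t):
    return (math.cos(t), math.sin(t), 0.1 * t)

# Each point has three coordinates, yet the entire curve is generated
# by a single number -- its intrinsic dimension is 1, not 3.
points = [helix(i * 0.5) for i in range(20)]
```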

Now, let's apply this to faces. A photo of a face might have millions of pixels, but faces don't vary in millions of independent ways. Instead, they vary along just a handful of meaningful dimensions:

  • Head rotation (left/right & up/down)
  • Expression (happy, sad, angry, surprised)
  • Lighting conditions
  • Age
  • Other Identity characteristics

Whilst this breakdown may seem simplistic, the same principle applies to all natural data, including written text and even the way you talk!
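As a purely hypothetical sketch (the field names below are illustrative, not taken from any real model), a face image with millions of pixels might be summarised by a handful of latent factors like the ones listed above:

```python
from dataclasses import dataclass

# Hypothetical latent factors for a face -- a few values standing in
# for millions of pixel intensities.
@dataclass
class FaceFactors:
    yaw: float        # head rotation left/right
    pitch: float      # head rotation up/down
    expression: str   # e.g. "happy", "sad", "angry", "surprised"
    lighting: float   # 0.0 = dark, 1.0 = bright
    age: float

face = FaceFactors(yaw=0.2, pitch=-0.1, expression="happy",
                   lighting=0.8, age=34.0)
```

The manifold hypothesis suggests that a generator could map this tiny description back to a full, realistic image.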

The Dimensional Dance

Neural networks exploit the Manifold Hypothesis when they learn to recognise or generate images and text.

Each layer transforms the data one step closer to its simpler, lower-dimensional structure, so the more layers a network has, the more tangled a manifold it can unravel.

Returning to our image example, a deep learning algorithm may follow these steps:

  • Initial layers: Identify and map local relationships between nearby pixels
  • Intermediate layers: Untwist the high-dimensional space into lower dimensions, much like smoothing out a crumpled sheet of paper
  • Deeper layers: Group similar representations together and separate them from dissimilar ones

These gradual steps outline how deep learning algorithms unfold high-dimensional images onto a manifold. Larger networks excel because they have more capacity to untangle these complex structures.
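To make the three stages above a little more tangible, here is a deliberately simplified sketch in plain Python (the function names and the toy 1-D "image" are illustrative, not a real deep learning pipeline):

```python
def initial_layers(image):
    # Map local relationships: differences between neighbouring pixels.
    return [image[i] - image[i - 1] for i in range(1, len(image))]

def intermediate_layers(features):
    # "Untwist" with a ReLU-style nonlinearity, squashing the features.
    return [max(f, 0.0) for f in features]

def deeper_layers(code):
    # Group: collapse the code into a single summary score.
    return sum(code) / len(code)

image = [0.1, 0.4, 0.35, 0.9, 0.8]  # a tiny 1-D "image"
score = deeper_layers(intermediate_layers(initial_layers(image)))
```

A real network would learn these transformations from data rather than having them hand-written, but the flow from raw input to compact representation is the same.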

Manifolds: The Missing Link to AGI

So when you recognise that familiar face, your brain isn't processing millions of raw signals one by one; it is following the intricate low-dimensional pathways it has already encoded.

These pathways are nature's own manifolds in action, and they appear in everything we do!

By some estimates, your brain processes roughly 11 million bits of information every single second, yet you are only consciously aware of about 50 of them.

The human brain is so remarkably efficient that it may well hold the key to AGI.

Finding a way to emulate the human brain is our best bet for achieving AGI — not through brute force computation, but by following nature's elegant blueprint.

And I, for one, can't wait to see where it leads.