What's the secret behind deep neural networks efficiency ?

Deep learning methods have changed the artificial intelligence world over the last few years. The skills and strategies that humans once believed were special to us have started to fall under the assault of ever more powerful machines.

One by one. In such tasks as facial reconnaissance and object recognition, deep neural networks are much stronger than humans. The old Games were perfected and the best human players thrashed.

Yet a dilemma persists. There are no mathematical explanations why layered

networks should be so good at this. They're flummoxed mathematicians. No one knows how they succeed despite the massive success of deep neural networks.

That is changing today through Henry Lin 's work at Harvard and Max Tegmark 's work at MIT. These guys claim that the source of mathematicians' shame is that the answer is decided by the existence of the universe. In other words, the answer is not mathematics but physics.

First, let's set the issue with a megabit grayscale classification example

to figure out if a cat or a dog is showing it. A million-pixel image consists of one of 256 gray-scale values. Thus in principle, 256100000 images can be made, and if it displays a cat or a dog for each, it must be measured. Neural networks, with just thousands or millions of parameters, always handle this classification role in some way easily.

Neural networks operate in the language of mathematics by approximating complex math functions with simpler functions. When the images of cats and dogs are categorised, the neural network must perform a process that takes a million grayscale pixels as input and outputs the distribution of probabilities.

The problem is that order of magnitude exists for approximating more mathematical functions than possible networks. But somehow deep neural

networks get the right reply. Lin and Tegmark are now saying that they worked out why. The response is that a limited sub-set of all possible functions rules the universe. In other words, when mathematical writing is carried out of the laws of

physics, all functions with a remarkable set of simple properties can be defined.

Therefore profound neural networks should only approximate a small subset of mathematical functions. Take the order of a polynomial function that is the height of its highest exponent into account. Thus a quadratic equation like y = x2 is 2, the equation y = x24 is 24 etc.

The number of orders obviously is limitless and yet in physics laws only a small subset of polynomials appear. "Our universe can precisely be represented by low-order polynomial Hamiltonians due to reasons that are still not fully understood," Lin and Tegmark claim. The polynomials representing physics laws usually have orders ranging from 2 to 4.

Physics laws have other major characteristics. In rotation and translation, for

instance, they are typically symmetrical. Turn a cat or dog by 360 degrees and it looks the same, translate by about 10 metres. Translate it by 100 metres. This makes it easier to approximate the identification process for cats or dogs.

These characteristics mean that neural networks do not actually have to approximate an infinity of potential mathematics, just a small subset of the easiest functions.

The neural networks manipulate another property of the universe. This is the systemic hierarchy. 'Primary particles form atoms that in turn form chemical, cellular, organism, planetary, nuclear, galaxy, etc.' Lin and Tegmark say so. In several instances, complex structures are built in a series of simple steps.

That is why it is also important to structure neural networks, because each stage in the causal sequence can be approximated by these networks.

The Big Bang echoes that pervade the universe, Lin and Tegmark give the example of the cosmic radiation from the background microwave. Different spacecrafts have mapped this radiation in growing resolution in recent years.

And physicists have, of course, perplexed as to why these maps are taking shape.

Tegmark and Lin stress that, for whatever reason, the causal hierarchy is inevitably the result. "The power spectrum of variations in density within our universe is determined by a number of cosmological parameters (density of dark matter, etc.) which in turn determine the pattern of cosmic radiation from our early universe which is coupled with our galactic earth radio noise to make the frequency-dependent cell maps reported by the satellite-based .

Each of these causal layers gradually contains more data. The maps and the noise that they provide are made up of milliards of numbers just a handful of cosmological parameters. Physics is aimed at evaluating large numbers in a way that displays smaller numbers.

This work, with major consequences, is fascinating and relevant. Neural artificial networks are well-known for organic networks. So Lin and Tegmark not only explain why deep learning machines work so well but also why human brains can make sense of the universe. Evolution has focused on an optimal brain system to dismantle the mystery of the universe.

This work opens the way for important artificial intelligence advances. Now that we understand finally why deep neural networks function so well, mathematicians

can explore the unique math properties that cause them to perform so well.

Lin and Tegmark say, "Reinforcing the empirical awareness of deep learning will suggest ways to develop it."

I have get this error when i calculated the geometrical optimization for prco3, it takes 12 hours until gives this message in the outputfile?

How much total RNA concentration to be extracted from sorted plasma cells from bone marrow of C57BL/6 mice for RT-PCR ?

How can we use mobile apps for improving students' academic performance?

Is someone interested in testing some of our new Alluaudite compounds (Molybdate) in the fields of heterogeneous catalysis and photocatalysis?

Has anyone used Jump 2 before?

Why applying a saved template of a bar chart in a different/new one doesn't take effect in SPSS?

IHC profiler in image j software. has anyone used it to quantify nuclear DAB positivity?

While using the IHC PROFILER plugin in image j software, to quantify the epithelial cytoplasmic DAB staining do you crop remove the stromal tissue?

What is EPOC ?

How calculate the conversion alph from tga analysis?

Feedback defines the constitution of an organism?

Self-Organizing Superorganisms—as envisaged by Nenad Sestan (2018)?

Measuring the Intelligence of a Species?

How can i do multivariate Time Series forecast using MLP, ANFIS and LSTM?

The Curse of Evolution and Complexity?

Need help with my research project on open source SIEM and machine learning?

Swimming/space travel depends on the proprioceptive muscle spindles?

What are the limitations and challenges of using machine learning for predicting concrete compressive strength in practical applications?

Some new emerging problems on application of RL for scheduling in IoT networks?

How to Compress Information Neurally?