If Isaac Newton had used Big Data


England, in the 1680’s. Isaac Newton is hard at work formulating his famous Laws of Motion. Except that this time, he has access to that great big savior technology, Big Data.

Naturally, what does he do? First, he collects as much data as he can about the movement of things. Balls rolling down slopes. Stars moving in the sky. His cat chasing the rats. He feeds it all into the Big Data machine, and turns it on.

After a while, the Big Data machine says “Ready”. Lo and behold, he enters the data from the latest comet, the machine makes some strange noises and then makes a prediction where the comet is going to go. 97% of the time, it gets it almost right. It’s a great success.

He calls his machine Newton’s Big Motion Data machine, sells access to everybody (artillery! Where will the shell land? Keys! Where did my keys go?) and gets filthy rich in the process. Sure, 3% of the time the machine gets things really wrong, but that’s a great fundraising opportunity: obviously, the machine needs more R&D.

For the next 300+ years, the machine gets fed more and more data, and it can predict more and more things. So it’s not a big deal at all when in the 1890’s, some data shows up suggesting that when things move really quickly, close to the speed of light, Newton’s Big Motion Data machine is really, really off. But nobody pays much attention because as soon as it happens, the then-owners of the machine simply add the aberrant data to the pool of Big Data, and now it can make 97% correct predictions close to the speed of light, too. Theory of Relativity, what’s that? Just feed more data to the machine.

And, for the next 300+ years and ever after, nobody ever actually understands anything. Nobody knows of, or cares about, the key abstractions that Newton came up with: mass, friction, impulse, and the beautifully simple way they relate to each other. Because the laws of motion aren’t known, when electricity is being discovered, nobody comes up with the key abstractions and models there either: how voltage, current, charge and so forth relate. We need no Maxwell’s equations. Just go and ask the machine, it’s going to be mostly right most of the time.

And because of that, we don’t get electric motors or cell phones, or a ton of other inventions. Because, you see, all of those require us to actually *understand* how nature works. To work hard, like Newton did, and Maxwell did, and thousands of other scientists, to come up with the right abstractions and a model for how they relate. A model that allows us to come up with crisp, falsifiable hypotheses. A model that we humans can understand, so we can examine it from all sides, turn it, look inside, combine it with other things, and eventually come up with the steam engine, and p-n junctions that produce blue light.

All that Big Data does is pretend it understands. Admittedly it is so good at pretending, it can be really spooky. But when you open the box, nothing is inside that you can build on.

I’m so happy Isaac Newton didn’t have a Big Data machine.