We know that correlation doesn’t imply causation. Here is one great example:
Popularity of the first name Stevie correlates with Netflix’s stock price (NFLX) (r=0.996)
Statistically, the two series are strongly correlated, but we know this is pure chance and there is no real connection. But how do we know? We use logical reasoning.
What is the difference between logic and statistics? They represent two different models for understanding the world. Each has strengths and weaknesses.
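To see how easily a high r can appear without any connection, here is a minimal Python sketch. The two series are made-up numbers for illustration (they are not the real Stevie or NFLX data); the only thing they share is that both happen to rise over time.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical, invented numbers: two quantities that both happen to grow.
babies_named = [110, 118, 131, 140, 152, 161, 175, 183]
share_price = [52, 60, 66, 78, 85, 97, 103, 115]

r = pearson_r(babies_named, share_price)
print(f"r = {r:.3f}")
```

With these invented values r comes out above 0.99, even though nothing in the code links one series to the other.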
Statistics
Statistics uses math and works with probabilities, generalizing from patterns in data instead of relying on rules. It is good at making predictions and decisions based on previous examples.
The data can be huge, even terabytes. In fact, statistics thrives in data-rich environments where patterns can be learned. Machine learning, for example, depends heavily on statistical inference.
But…
It can’t prove absolute correctness, only probabilities of correctness.
It struggles with unreliable data.
It can lead to misleading conclusions. See the spurious correlation above.
Logic
When a logical argument is set up properly, you can rely on its certainty: if the premises are true, the conclusion follows from them. For this reason, a logical system is usually easier to understand.
When there is no data, you must rely on logical reasoning. It is good in no-data or bad-data environments.
But…
When solving logical problems, the more variables you have, the harder the solution becomes. Logic does not scale well.
Logical rules are strict and cannot be easily adapted to new data.
Classical logic cannot handle probability: a statement is either true or false, with nothing in between.
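The scaling problem can be sketched in a few lines of Python. This is the naive brute-force approach, assuming we simply try every true/false combination; the function name and the example rule are hypothetical, chosen only for illustration.

```python
from itertools import product

def count_satisfying(formula, n_vars):
    """Brute force: try every true/false assignment of n_vars variables."""
    tried = 0
    satisfied = 0
    for assignment in product([False, True], repeat=n_vars):
        tried += 1
        if formula(assignment):
            satisfied += 1
    return tried, satisfied

# Hypothetical example rule: "at least one variable is true".
def at_least_one(values):
    return any(values)

for n in (2, 4, 8, 16):
    tried, satisfied = count_satisfying(at_least_one, n)
    print(n, tried, satisfied)
```

Each extra variable doubles the number of cases to check, so 16 variables already mean 65,536 combinations. Smarter methods exist, but this doubling is the reason large logical problems get hard quickly.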
Human Reasoning
Neither approach is perfect, so we use both every day.
Here are a few examples:
We use logic to check if the door is locked. This is a binary (true/false) problem. If we remember locking the door and know it cannot unlock itself, we can logically conclude it’s locked.
“I turned the key and heard the lock click, so the door must be locked.”
The example above is called deductive reasoning. Its structure is:
General principle → Specific conclusion.
or its reverse:
But that is too strict, and we usually face more complex problems. Here are more examples:
When should we go to sleep? We rely on experience to estimate how much sleep we need and how tired we will feel the next day. So we use statistics.
“When I sleep less than 6 hours, I feel exhausted the next day, so I’ll go to bed now.”
Deciding what to eat for lunch can be a mix. If we have dietary restrictions, we use logic (“I can’t eat peanuts because I’m allergic”). But if we’re choosing between two restaurants, we might rely on past experience (statistics).
Most real-life decisions mix logical and statistical reasoning, and emotions and other factors such as domain knowledge can also affect the decision.
Consider this situation:
You are sitting in a café, enjoying your breakfast. You reach for your cup and realize that your wallet is missing from the table. A moment later you notice a man in a hoodie walking toward the exit, looking suspicious. Maybe he stole your wallet?
If you jump to conclusions, your reasoning may look like the straightforward deductive pattern described above:
Premise 1: If someone takes my wallet, it will no longer be on the table.
Premise 2: My wallet is no longer on the table.
Conclusion: Someone must have taken it.
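This reasoning has one major gap: the conclusion assumes only one possible explanation. Premise 1 says that theft leads to a missing wallet, not that a missing wallet can only come from theft. This mistake is known as affirming the consequent.
The wallet could have fallen under the table, or you may have forgotten that you put it in your bag, and so on. We need a practical explanation that involves possibilities: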
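The gap can be checked mechanically. The sketch below, with hypothetical helper names, tests every true/false combination of the two statements and reports whether an argument form is valid, meaning the conclusion holds in every case where all premises hold.

```python
from itertools import product

def implies(p, q):
    """Material implication: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q

def is_valid(premises, conclusion):
    """Valid means: the conclusion is true in every case where all premises are true."""
    for a, b in product([False, True], repeat=2):
        if all(premise(a, b) for premise in premises) and not conclusion(a, b):
            return False  # counterexample found
    return True

# a = "someone took the wallet", b = "the wallet is not on the table"
def rule(a, b):
    return implies(a, b)

# Proper deduction: (a -> b) and a, therefore b
proper_deduction = is_valid([rule, lambda a, b: a], lambda a, b: b)
# The café argument: (a -> b) and b, therefore a
cafe_argument = is_valid([rule, lambda a, b: b], lambda a, b: a)

print(proper_deduction)  # True
print(cafe_argument)     # False
```

The café argument fails because of one case: nobody took the wallet, yet it is still not on the table.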
In this case:
A = “The wallet was stolen.”
B = “The wallet is missing.”
The fact that B is true might increase the plausibility of A, but it does not prove it.
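One common way to put numbers on "plausibility" is Bayes' rule. The text does not name it, so treat this as one possible formalization rather than the only one. All probabilities below are invented for illustration.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Bayes' rule: P(A | B) from P(A), P(B | A) and P(B | not A)."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    return p_b_given_a * prior_a / p_b

# All numbers below are made up for this example.
prior_stolen = 0.01         # P(A): how plausible theft is before we look
p_missing_if_stolen = 1.0   # P(B | A): a stolen wallet is certainly missing
p_missing_if_not = 0.05     # P(B | not A): it fell down, it is in the bag, ...

p = posterior(prior_stolen, p_missing_if_stolen, p_missing_if_not)
print(f"P(stolen | missing) = {p:.3f}")
```

Under these assumed numbers, seeing that the wallet is missing raises the plausibility of theft from 1% to roughly 17%. A is more plausible than before, but still far from proven.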
We can make similar reasoning more subtle:
Notice that we went from true-or-false statements to plausible ones.
Let’s take a look at the correlation example again. How do we know that the relationship is not causal?
We apply logical reasoning and real-world knowledge (which comes from previous experience, that is, statistics):
Is there a believable chain of events where names can influence stock price? Not really.
Is there a known mechanism that could explain this? Not really.
Do we have reverse causality? Could changes in Netflix’s stock price influence the number of babies named Stevie? Not really.
Is there a hidden factor that might affect both? Unlikely.
Reasoning of Machines
AI, like us, uses both statistical and logical reasoning.
Chatbots predict the most probable next word in a sentence based on data. Fraud-detection models look at past transactions and assign a probability score to determine if a transaction is fraudulent.
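As a toy illustration of "predict the most probable next word", the sketch below counts which word follows which in a tiny invented text. Real chatbots use far larger models and data; the corpus and function names here are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    followers = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def most_probable_next(followers, word):
    """Return the most frequent follower of `word` and its estimated probability."""
    counts = followers.get(word)
    if not counts:
        return None, 0.0  # word never seen with a follower
    best, count = counts.most_common(1)[0]
    return best, count / sum(counts.values())

# Invented training text.
corpus = "the door is locked the door is open the door is locked"
model = train_bigrams(corpus)
print(most_probable_next(model, "is"))
```

In this corpus "is" is followed by "locked" twice and "open" once, so the sketch predicts "locked" with an estimated probability of 2/3. The prediction is only as good as the examples it has seen.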
Rule-based medical systems diagnose diseases based on predefined logical conditions. Simple game-playing programs use if-then rules to make moves.
Of course, some systems combine the two. Self-driving cars use logic for traffic rules (stop at a red light) and statistics for decision-making (predict if the pedestrian will cross).
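A minimal sketch of such a combination might look like this. The function, its parameters, and the threshold value are all hypothetical and not taken from any real driving system.

```python
def should_brake(light_is_red, p_pedestrian_crosses, threshold=0.3):
    """Combine a hard rule with a probabilistic estimate.

    `threshold` is an invented tuning value used only for this example.
    """
    if light_is_red:
        # Logic: a strict traffic rule, no probabilities involved.
        return True
    # Statistics: act when the predicted chance of crossing is high enough.
    return p_pedestrian_crosses >= threshold

print(should_brake(True, 0.0))    # True
print(should_brake(False, 0.8))   # True
print(should_brake(False, 0.05))  # False
```

The rule always wins when it applies; the probability estimate only decides the cases the rule says nothing about.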
What is the difference between brain and brAIn then?
János Neumann (better known as John von Neumann) said in 1948:
“You insist that there is something a machine cannot do. If you tell me precisely what it is that a machine cannot do, then I can always make a machine which will do just that!”
When we say machines cannot think, we are really saying that we cannot describe the problem in enough detail. Machines can think if we can define what thinking is.
And they can be better than us! As mentioned above, statistics scales well because we can analyze gigabytes of data, but humans cannot hold that much information in their heads. Logic, on the other hand, does not scale well: people can follow simple logical trees, but a tree with 100 outcomes will set your brain on fire. Machines can handle those as well.
Again, it is up to us to define our problems well so that machines can solve them.