Why NULL Is a Trouble-Maker in SQL - Explained with Ice Cream
Imagine that you are running an Ice Cream shop in your city. You are surveying people on paper to improve your service and one of your questions in the survey is the following:
Which is your favorite ice cream flavor?
Simple right? You expect a single flavor people will add as an answer.
After the survey period you sit down to record the answers into a database. Reaching this question you realize that the question was not simple at all, or at least the answers are not helpful.
Missing values
A customer answers your questions but leaves the favorite flavor answer box empty. You were expecting a choice, but they never gave one. You record because you don’t know their preference.
Unknown Value
The customer wrote something down, but you cannot read it. The answer was provided but got lost. You add NULL
because the information exists, but it is unknown.
Miscellaneous Value
A customer walks in and lists chocolate, vanilla, or strawberry. Now you finally have somewhat good answers, but you expected a singular flavor. You record NULL
because the preference doesn’t fit neatly into the expected answers.
Inapplicable Value
You see this answer: Pepperoni Pizza. That’s a great choice for sure, but not for an ice cream. You didn’t ask about favorite food. What do you do? You store NULL
because no valid response applies.
Let’s talk SQL
In all the above cases, the system just sees NULL
- but NULL
doesn’t tell you why the data is missing.
If you’re trying to calculate the most popular flavor, do you count NULL
s as “vanilla”? Do you ignore them? Should you assume they were “chocolate”?
NULL
ads complication to databases. It is a special value and has no data type, meaning it can appear in text, numerical, or any other columns as well. The biggest confusion comes from the fact that NULL
can mean 4 things in SQL. It can be missing, unknown, miscellaneous, or inapplicable value in the data. NULL
makes filtering, aggregating, and comparing data tricky because it isn’t truly a value.
Just like you at the ice cream shop with the answers, an SQL database struggles to interpret NULL
correctly in queries. NULL
behaves unpredictably when used in calculations and other operations. Staying at the ice cream shop example:
NULL
in Math
Imagine trying to count how many scoops of each flavor were sold today. But some orders are NULL
in the “scoops” column. What happens when you add them up? The NULL
will destroy the calculation and your result will be NULL
.
NULL
in Comparisons
You want to check if two customers ordered the same flavor. If one order is NULL
, SQL doesn’t say “yes” or “no” - it says UNKNOWN. It’s like asking, “Did two customers order the same thing?” and getting the response: “I have no idea.” That’s SQL’s three-valued logic in action—TRUE, FALSE, and UNKNOWN.
NULL
in Sorting
When sorting orders by flavor, NULL
sometimes appear at the top, sometimes at the bottom, depending on the database settings. It’s like rearranging your menu board but having mystery items that randomly jump to the top or bottom whenever you sort the list.
NULL
in Grouping
They group for some operations but not for others.
It’s like reviewing your sales and seeing:
Vanilla: 50 orders
Chocolate: 40 orders
NULL: ??? (How many? Do they even belong together?)
What to do then?
If you are doing an online survey instead, you can set more limitations. People can only choose one value from a list of your flavors, you can mark the answer as a must, and so on.
The best problem-solving is prevention. In SQL you can add a NOT NULL
constraint to avoid NULL
and all the confusion that comes with it.
But if you end up with NULL
in the DB, then you need to handle them. That’s a different story, so see you in the next post!