
Almost every developer learns early that adding an index makes queries faster, and almost every developer has at some point added one that did nothing at all. Indexes are one of the highest-leverage tools for performance, but they are also widely misunderstood. Treating them as a magic “go faster” switch leads to bloated tables, slow writes, and the frustrating experience of adding an index and watching the query plan ignore it. Understanding what an index really is turns that guesswork into something you can reason about.
The Book Analogy, and Where It Holds
The familiar comparison is the index at the back of a book. If you want every mention of a topic, you can either read all four hundred pages front to back, or you can flip to the alphabetized index, find the term, and jump straight to the listed pages. A database index works the same way. Without one, the engine performs a full table scan, reading every row to find the matches. With one, it can navigate a sorted structure and go almost directly to the rows it needs.
Most relational databases implement this with a B-tree, a balanced structure that keeps keys in sorted order and allows lookups, range scans, and ordered reads in a number of steps that grows very slowly as the table grows. On a table of ten million rows, a B-tree lookup touches only a handful of nodes rather than ten million rows. That is the difference between a query that returns in a millisecond and one that returns in several seconds.
Read the Plan Instead of Guessing
The single most useful habit when working with indexes is to stop guessing and ask the database what it is doing. Every major engine exposes a command, usually some form of EXPLAIN, that shows the plan for a query: whether it is scanning the whole table, using an index, how many rows it estimates it will examine, and how it is joining tables together.
When you see a sequential or full table scan on a large table for a query that filters on a specific column, that is your signal that an index might help. When you see an index scan but the query is still slow, the plan often reveals the real cause, such as the database reading the index and then making millions of separate trips back to the table to fetch other columns. Learning to read these plans is more valuable than memorizing any list of indexing rules, because it tells you the truth about your specific data.
Composite Indexes and the Order of Columns
A single-column index is easy to reason about, but real queries usually filter on several columns at once, and that is where composite indexes come in and where people most often go wrong. A composite index on multiple columns is sorted by the first column, then by the second within each value of the first, and so on, exactly like sorting a list of people by last name and then by first name.
This ordering has a critical consequence known as the leftmost prefix rule. An index on (last_name, first_name) can efficiently serve a query filtering on last_name alone, or on last_name and first_name together, but it cannot efficiently serve a query filtering on first_name alone. The reason is intuitive from the phone-book analogy: a book sorted by last name is useless for finding everyone named “Maria” regardless of surname. If your most common query filters on a column, that column generally belongs near the front of the index.
When the Index Sits There Unused
The most common surprise is adding an index and watching the query planner refuse to use it. There are several recurring reasons, and each one is worth recognizing on sight.
- Wrapping the column in a function, such as filtering on the year extracted from a date column, usually defeats a plain index because the stored values are no longer what you are comparing against. An expression index that indexes the same transformation can fix this.
- A type mismatch, such as comparing a numeric column against a string literal, can force a conversion that prevents the index from being used.
- Low selectivity, meaning the filter matches a large fraction of the table, can make a full scan genuinely cheaper than jumping in and out of an index thousands of times, so the planner correctly skips it.
- Leading wildcards in a text search, such as matching any value that ends with a suffix, cannot use a standard B-tree because the search does not start at a known prefix.
In several of these cases the index is not broken and the planner is not wrong. The query is simply written in a way the index cannot serve, and the fix is to change the query or build an index that matches how you actually search.
Indexes Are Not Free
It is tempting to conclude that if indexes speed up reads, you should index everything. This is a trap. Every index is a separate sorted structure that the database must keep in step with the table, which means every insert, update, and delete has to update every relevant index as well. On a write-heavy table, a pile of rarely used indexes can measurably slow down the operations that matter most, and they consume storage and memory that could serve your hot data.
Over-indexing also has a hidden maintenance cost. Indexes can become bloated and fragmented over time as rows change, and a table with a dozen overlapping indexes is harder to reason about when you are trying to understand why a plan chose one over another. A smaller set of well-chosen indexes that match your real query patterns almost always beats a large set added speculatively.
A Sensible Way to Approach It
The dependable process is to start from your actual slow queries rather than from the schema. Find the queries that run often or run slowly, look at their plans, and add indexes that serve the specific filters, joins, and sort orders those queries use. Consider covering indexes that include the columns a query returns so the database can answer entirely from the index without touching the table. Then measure again, because the only proof that an index helped is a plan that changed and a query that got faster. Indexes reward the developer who observes over the one who assumes.