The Database Doctor
Musing about Databases

Tag: performance

TPC series - TPC-H Query 4 - Semi Join and Uniqueness

Today we are looking at Q04, which on the surface is similar to Q17. Like Q17, Q04 has a correlated subquery that can be de-correlated using a join. But sometimes, a regular INNER JOIN is...

TPC-H series - TPC-H Query 3 - Join Ordering and Heap Sorting

I want to teach you an important skill that will serve you well as a database specialist. One blog entry is not going to be enough, but here is my goal: When you look at an SQL query in the you...

TPC series - TPC-H Query 2 and 17 - De-correlation

The great promise databases make to programmers is: "Tell me what you want and I will figure out the fastest way to do it." A database is a computer science engine — it knows and...

Joins are NOT Expensive! - Raw Reading

When talking about Data Lakes and how people access them - we must address some of the misconceptions that made them popular in the first place. One of the largest misconceptions is are I...

Introducing the TPC series - TPC-H Query 1: Column Storage and Local Aggregation

After the wonderful feedback on the previous blog about Iceberg - it is now time to switch gears. Databases are more than row storage engines. They are algorithm machines, helping that...

TPC series - TPC-H Query 9 - Composite Key Joins

TPC series - TPC-H Query 5 - Diamond shaped Joins

customer must be joined to orders, which in turn must join lineitems, which must join supplier. Diagramming: TODO...

TPC series - TPC-H Query 20 - Nested De-correlation

TPC series - TPC-H Query 6 and Query 14 - The boring ones

TPC series - TPC-H Query 7 - Bloom Filter Pushes

TPC series - TPC-H Query 16 - Anti Joins

Testing is Hard and we often use the wrong Incentives

I have been spending a lot of time thinking about testing and reviewing testing lately. At a superficial level - testing looks simple: Write test matrix, code tests, run tests, learn we...

Why are Databases so Hard to Make? - Logging to Disk

Transaction logs. Why are they so important and why are they so hard to make?

Why are Databases so Hard to Make? - CPU usage

In our previous blogs, we have visited the idea that "databases are just loops". At this point, my dear readers may rightfully ask: "if those databases are indeed just -...

Databases are Just Loops - Row and Batch execution

Our database journey makes a brief stop. We need to appreciate an important design decision every database must make: Should I use row or batch execution? Depending on the database - or...

Databases are just Loops - GROUP BY

In my previous post - I introduced the idea that you can think of database queries as a series of loops. Let me take this idea even further - introducing more complex database concepts in...