Tag: database

When talking about Data Lakes and how people access them, we must address some of the misconceptions that made them popular in the first place. One of the largest misconceptions is are I...

Let us finally look at what is so wrong with the Iceberg spec and why this simply isn't a serious attempt at solving the metadata problem of large Data Lakes. In the first part of this I took...

Iceberg: The great unifying vision finally allowing us to escape the vendor lock-in of our database engines. One table and metadata format to find them ... And in the darkness bind I the...

In my last post about high speed DML, I talked about how it is possible to modify tables at the kind of speeds that a modern SSD can deliver. I sketched an outline of an algorithm that can easily us...

After a brief intermezzo about testing (read about my thoughts here: Testing is Hard and we often use the wrong Incentives), it is time to continue our journey together to where we will A...

Transaction logs. Why are they so important and why are they so hard to make?

In our previous blogs, we have visited the idea that "databases are just loops". At this point, my dear readers may rightfully ask: "if those databases are indeed just -...

Our database journey makes a brief stop. We need to appreciate an important design decision every database must make: Should I use row or batch execution? Depending on the database - or...

In my previous post, I introduced the idea that you can think of database queries as a series of loops. Let me take this idea even further - introducing more complex database concepts in...

I decided to write this post in response to my recent discussions with Matt Martin. Matt and I have been sparring lately over software performance. During one of these discussions, the...