Introducing Drizzle

This is a post about an exciting new ORM tool (that’s “object relational mapper”) that is different than any ORM I’ve used before—and I’ve used quite a few! Spoiler: it’s Drizzle.

Wait, what’s an ORM?

Even if you’re using a non-relational database now (think MongoDB or Redis), sooner or later you’ll likely need a relational DB.

Knowing SQL is an essential skill for any software engineer, but writing SQL directly can be tricky! The tooling is usually primitive, with only minimal auto-complete to guide you, and you invariably go through a process of running your query, correcting errors, and repeating until you get it right.

ORMs try to help you with this process of crafting SQL. Typically, you tell the ORM about the shape of your DB and it exposes APIs to do typical things. If you have a books table in your DB, an ORM will give you an API for it where you can do stuff like:

const longBooks = books.find({ pages: { gt: 500 } });Code language: JavaScript (javascript)

Behind the scenes, the SQL created might be something like:

SELECT * FROM books WHERE pages > 500;Code language: SQL (Structured Query Language) (sql)

That may not look like a massive simplification, but as the parameters get more complex or strung together, the SQL can get a bit mindbending. Not to mention the ORM keeps that code in a language you’re likely already using, like JavaScript.

This ease of use may seem nice, but it can cause other problems. For example, you might struggle figuring out how to do non-trivial queries. And there are performance foot-guns, such as the infamous Select N + 1 problem, which you might cause without realizing it due to the abstracted away syntax.

Why Drizzle is Different

Drizzle takes a novel approach. Drizzle does provide you a traditional ORM querying API, like we saw above. But in addition to that, it also provides an API that is essentially a layer of typing on top of SQL itself. So rather than what we saw before, we might query our books table like this

const longBooks = await db
  .select()
  .from(books)
  .where(gt(books.pages, 500));Code language: JavaScript (javascript)

It’s more lines, but it’s closer to actual SQL, which provides us some nice benefits: it’s easier to learn, more flexible, and avoids traditional ORM footguns.

Let’s dive in and look closer. This post will take a brief overview of setting up Drizzle, and querying, and then do a deeper dive showing off some of its powerful abilities with this typed SQL querying API. The docs are here if you’d like to look closer at anything.

Using Drizzle in general, and some of the advanced things we’ll cover in this post requires a decent knowledge of SQL. If you’ve never, ever used SQL, you might struggle with a few of the things we discuss later on. That’s expected. Skim and jump over sections as needed. If nothing else, hopefully this post will motivate you to look at SQL.

Setting up the Schema

Drizzle can’t do much of anything if it doesn’t know about your database. There’s lots of utilities for showing Drizzle the structure (or schema) of your tables. We’ll take a very brief look, but a more complete example can be found here.

Drizzle supports Postgres, MySQL, and SQLite. The ideas are the same either way, but we’ll be using MySQL.

Let’s start to set up a table.

import { int, json, mysqlTable, varchar } from "drizzle-orm/mysql-core";

export const books = mysqlTable("books", {
  id: int("id").primaryKey().autoincrement(),
  userId: varchar("userId", { length: 50 }).notNull(),
  isbn: varchar("isbn", { length: 25 }),
  pages: int("pages"),
});Code language: JavaScript (javascript)

We tell Drizzle about our columns (we won’t show all of them here), and their data types.

Now we can run queries:

const result = await db
  .select()
  .from(books)
  .orderBy(desc(books.id))
  .limit(1);Code language: JavaScript (javascript)

This query returns an array of items which match the schema we provided Drizzle for this table.

Alternatively, as expected, we can also narrow our select list.

const result = await db
  .select({ id: books.id, isbn: books.isbn })
  .from(books)
  .orderBy(desc(books.id))
  .limit(1);Code language: JavaScript (javascript)

Note that the types of the columns match whatever we define in the schema. We won’t go over every possible column type (check the docs), but let’s briefly look at the JSON type:

export const books = mysqlTable("books", {
  id: int("id").primaryKey().autoincrement(),
  userId: varchar("userId", { length: 50 }).notNull(),
  isbn: varchar("isbn", { length: 25 }),
  pages: int("pages"),
  authors: json("authors"),
});Code language: JavaScript (javascript)

This adds an authors field to each book. But the type might not be what you want. Right now it’s unknown. This makes sense: JSON can have just about any structure. Fortunately, if you know your json column will have a predictable shape, you can specify it, like this:

export const books = mysqlTable("books", {
  id: int("id").primaryKey().autoincrement(),
  userId: varchar("userId", { length: 50 }).notNull(),
  isbn: varchar("isbn", { length: 25 }),
  pages: int("pages"),
  authors: json("authors").$type<string[]>(),
});
Code language: JavaScript (javascript)

And now, when we check, the authors property is of type string[] | null.

If you were to mark the authors column as notNull() it would be typed as string[]. As you might expect, you can pass any type you’d like into the $type helper.

Query Whirlwind Tour

Let’s run a non-trivial, but still basic query to see what Drizzle looks like in practice. Let’s say we’re looking to find some nice beach reading for the summer. We want to find books that belong to you (userId == “123”), and is either less than 150 pages, or was written by Stephan Jay Gould. We want the first ten, and we want them sort from most recently added to least recently added (the id key is auto-numbered, so we can sort on that for the same effect)

In SQL we’d do something like this:

SELECT *
FROM books
WHERE userId = '123' AND (pages < 150 OR authors LIKE '%Stephen Jay Gould%')
ORDER BY id desc
LIMIT 10Code language: SQL (Structured Query Language) (sql)

With Drizzle we’d write this:

const result = await db
  .select()
  .from(books)
  .where(
    and(
      eq(books.userId, userId),
      or(lt(books.pages, 150), like(books.authors, "%Stephen Jay Gould%"))
    )
  )
  .orderBy(desc(books.id))
  .limit(10);Code language: JavaScript (javascript)

Which works!

[
  {
    "id": 1088,
    "userId": "123",
    "authors": ["Siry, Steven E"],
    "title": "Greene: Revolutionary General (Military Profiles)",
    "isbn": "9781574889130",
    "pages": 144
  },
  {
    "id": 828,
    "userId": "123",
    "authors": ["Morton J. Horwitz"],
    "title": "The Warren Court and the Pursuit of Justice",
    "isbn": "0809016257",
    "pages": 144
  },
  {
    "id": 506,
    "userId": "123",
    "authors": ["Stephen Jay Gould"],
    "title": "Bully for Brontosaurus: Reflections in Natural History",
    "isbn": "039330857X",
    "pages": 544
  },
  {
    "id": 412,
    "userId": "123",
    "authors": ["Stephen Jay Gould"],
    "title": "The Flamingo's Smile: Reflections in Natural History",
    "isbn": "0393303756",
    "pages": 480
  },
  {
    "id": 356,
    "userId": "123",
    "authors": ["Stephen Jay Gould"],
    "title": "Hen's Teeth and Horse's Toes: Further Reflections in Natural History",
    "isbn": "0393311031",
    "pages": 416
  },
  {
    "id": 319,
    "userId": "123",
    "authors": ["Robert J. Schneller"],
    "title": "Cushing: Civil War SEAL (Military Profiles)",
    "isbn": "1574886967",
    "pages": 128
  }
]Code language: JSON / JSON with Comments (json)

The Drizzle version was actually a little bit longer. But we’re not optimizing for fewest possible lines of code. The Drizzle version is typed, with autocomplete to guide you toward a valid query, and TypeScript to warn you when you miss. The query is also a lot more composable. What do I mean by that?

Composability: Putting Queries Together

Let’s write something slightly more advanced and slightly more realistic. Let’s code up a function that takes any number of search filters, and puts together a query. Here’s what the filters look like

type SearchPacket = Partial<{
  title: string;
  author: string;
  maxPages: number;
  subjects?: number[];
}>;Code language: TypeScript (typescript)

Note the Partial type. We’re taking in any number of these filters—possibly none of them. Whichever filters are passed, we want them to be additive; we want them combined with and. We’ve seen and already, and it can take the result of calls to eq, lt, and lots of others. We’ll need to create an array of all of these filters, and Drizzle gives us a parent type that can hold any of them: SQLWrapper.

Let’s get started.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
}Code language: TypeScript (typescript)

We’ve got our array of filters. Now let’s start filling it up.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
}Code language: TypeScript (typescript)

Nothing new, yet. This is the same filter we saw before with authors.

Speaking of authors, let’s add that query next. But let’s make the author check a little more realistic. It’s not a varchar column, it holds JSON values, which themselves are strings of arrays. MySQL gives us a way to search JSON: the ->> operator. This takes a JSON column, and evaluates a path on it. So if you had objects in there, you’d pass string paths to get properties out. We just have an array of strings, so our path is $, which is the actual values in the array. And the string comparrisons when we’re filtering on JSON columns like this is no longer case insensitive, so we’ll want to use the LOWER function in MySQL.

Typically, with traditional ORM’s you’d scramble to the docs to look for an equivalent to the ->> operator, as well as the LOWER function. Drizzle does something better, and gives us a nice escape hatch to just write SQL directly in situations like this. Let’s implement our authors filter.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
  if (args.author) {
    searchConditions.push(
      sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`
    );
  }
}Code language: TypeScript (typescript)

Note the sql tagged template literal. It lets us put arbitrary SQL in for one-off operations that may not be implemented in the ORM. Before moving on, let’s take a quick peak at the SQL generated by this:

{
  "sql": "select `id`, `userId`, `authors`, `title`, `isbn`, `pages` from `books` where (`books`.`userId` = ? and LOWER(`books`.`authors`->>\"$\") LIKE ?) order by `books`.`id` desc limit ?",
  "params": ["123", "%gould%", 10]
}Code language: JSON / JSON with Comments (json)

Let’s zoom in on the authors piece. What we entered as…

sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`Code language: JavaScript (javascript)

… gets transformed as:

LOWER(`books`.`authors`->>"$") LIKE ?Code language: SQL (Structured Query Language) (sql)

Our search term was parameterized; however, Drizzle was smart enough to not parameterize our column. I’m continuously impressed by small details like this. The maxPages piece is the same as before

if (args.maxPages) {
  searchConditions.push(lte(books.pages, args.maxPages));
}Code language: JavaScript (javascript)

Nothing new or interesting. Now let’s look at the subjects filter. We can pass in an array of subject ids, and we want to filter books that have that subject. The relationship between books and subjects is stored in a separate table, booksSubjects. This table simply has rows with an id, a book id, and a subject id (and also the userId for that book, to make other queries easier).

So if book 12 has subject 34, there’ll be a row with bookId of 12, and subjectId of 34.

In SQL when we want to see if a given row exists in some table, we use the exists keyword, and Drizzle has an exists function for this very purpose. Let’s move on with our function

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
  if (args.author) {
    searchConditions.push(
      sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`,
    );
  }
  if (args.maxPages) {
    searchConditions.push(lte(books.pages, args.maxPages));
  }
  if (args.subjects?.length) {
    searchConditions.push(
      exists(
        db
          .select({ _: sql`1` })
          .from(booksSubjects)
          .where(
            and(
              eq(books.id, booksSubjects.book),
              inArray(booksSubjects.subject, args.subjects),
            ),
          ),
      ),
    );
  }Code language: TypeScript (typescript)

We can pass an exists() call right into our list of filters, just like with real SQL. This bit:

_: sql`1`Code language: JavaScript (javascript)

… is curious, but that’s just us saying SELECT 1 which is a common way of putting something into a SELECT list, even though we’re not pulling back any data; we’re just checking for existence. Lastly, the inArray Drizzle helper is how we generate an IN query. Here’s what the generated SQL looks like for this subjects query:

select `id`, `userId`, `authors`, `title`, `isbn`, `pages`
from `books`
where (`books`.`userId` = ? and exists (select 1
                                        from `books_subjects`
                                        where (`books`.`id` = `books_subjects`.`book` and
                                               `books_subjects`.`subject` in (?, ?))))
order by `books`.`id` desc
limit ?Code language: SQL (Structured Query Language) (sql)

That was our last filter. Now we can pipe our filters in to execute the query we put together.

async function searchBooks(args: SearchPacket) {
  const searchConditions: SQLWrapper[] = [];
  if (args.title) {
    searchConditions.push(like(books.title, `%${args.title}%`));
  }
  if (args.author) {
    searchConditions.push(
      sql`LOWER(${books.authors}->>"$") LIKE ${`%${args.author.toLowerCase()}%`}`
    );
  }
  if (args.maxPages) {
    searchConditions.push(lte(books.pages, args.maxPages));
  }
  if (args.subjects?.length) {
    searchConditions.push(
      exists(
        db
          .select({ _: sql`1` })
          .from(booksSubjects)
          .where(
            and(
              eq(books.id, booksSubjects.book),
              inArray(booksSubjects.subject, args.subjects)
            )
          )
      )
    );
  }

  const result = await db
    .select()
    .from(books)
    .where(and(eq(books.userId, userId), ...searchConditions))
    .orderBy(desc(books.id))
    .limit(10);
}Code language: TypeScript (typescript)

The ability to treat SQL queries as typed function calls that can be combined arbitratily is what makes Drizzle shine.

Digging deeper

We could end the post here, but let’s go further and see how Drizzle handles something fairly complex. You might never need (or want to) write queries like this. My purpose in including this section is to show that you can, if you ever need to.

With that out of the way, let’s write a query to get aggregate info about our books. We want our most and least popular subject(s), and how many books we have with those subjects. We also want to know any unused subjects, as well as that same info about tags (which we haven’t talked about). And also the total number of books we have overall. This data might be displayed in a screen like this.

To keep this section manageable we’ll just get the book counts and the most and least subjects. The other pieces are variations on that theme. You can see the finished product here.

Let’s look at some of the SQL for this and how to write it with Drizzle.

Number of books per subject

In SQL we can group things together with GROUP BY.

SELECT
    subject,
    count(*)
FROM books_subjects
GROUP BY subjectCode language: SQL (Structured Query Language) (sql)

Now our SELECT list, rather than pulling items from a table, is now pulling from a (conceptual) lookup table. We (conceptually) have a bunch of buckets stored by subject id. So we can select those subject id’s, as well as aggregate info from the buckets themselves, which we do with the count(*). This selects each subject, and the number of books under that subject.

And it works:

But we want the most, and least popular subjects. SQL also has what are called window functions. We can, on the fly, sort these buckets in some order, and then ask questions about the data, sorted in that way. We basically want the subject(s) with the highest, or lowest number of books, including ties. It turns out RANK is exactly what we want. Let’s see how this works

SELECT
    subject,
    count(*) as count,
    RANK() OVER (ORDER BY count(*) DESC) MaxSubject,
    RANK() OVER (ORDER BY count(*) ASC) MinSubject
FROM books_subjects
WHERE userId = '123'
GROUP BY subjectCode language: SQL (Structured Query Language) (sql)

We ask for the rank of each row, when the whole result set is sorted in whatever way we describe.

Subjects 79, 137 and 150 all have a minSubject rank of 1, which means they are the least used subject, which makes sense since there’s only one book with that subject.

It’s a little mind bendy at first, so don’t worry if this looks a little weird. The point is to show how well Drizzle can simplify SQL for us, not to be a deep dive into SQL, so let’s move on.

We want the subjects with a MaxSubject of 1, or a MinSubject of 1. We can’t use WHERE for this, at least not directly. The solution in SQL is to turn this query into a virtual table, and query that. It looks like this:

SELECT
    t.subject id,
    CASE WHEN t.MinSubject = 1 THEN 'MinSubject' ELSE 'MaxSubject' END as label,
    t.count
FROM (
    SELECT
        subject,
        count(*) as count,
        RANK() OVER (ORDER BY count(*) DESC) MaxSubject,
        RANK() OVER (ORDER BY count(*) ASC) MinSubject
    FROM books_subjects
    WHERE userId = '123'
    GROUP BY subject
) t
WHERE t.MaxSubject = 1 OR t.MinSubject = 1Code language: SQL (Structured Query Language) (sql)

And it works.

Moving this along

We won’t show tags, since it’s basically identical except we hit a books_tags table, instead of books_subjects. We also won’t show unused subjects (or tags), which is also very similar, except we use a NOT EXISTS query.

The query to get the total number of books looks like this:

SELECT count(*) as count
FROM books
WHERE userId = '123'Code language: SQL (Structured Query Language) (sql)

Let’s add some columns to get it in the same structure as our subjects queries:

SELECT
    0 id,
    'Books Count' as label,
    count(*) as count
FROM books
WHERE userId = '123'Code language: SQL (Structured Query Language) (sql)

Now, since these queries return the same structure, let’s combine them into one big query. We use UNION for this.

SELECT *
FROM (
    SELECT
        t.subject id,
        CASE WHEN t.MinSubject = 1 THEN 'MinSubject' ELSE 'MaxSubject' END as label,
        t.count
    FROM (
        SELECT
            subject,
            count(*) as count,
            RANK() OVER (ORDER BY count(*) DESC) MaxSubject,
            RANK() OVER (ORDER BY count(*) ASC) MinSubject
        FROM books_subjects
        GROUP BY subject
    ) t
    WHERE t.MaxSubject = 1 OR t.MinSubject = 1
) subjects
UNION
    SELECT
        0 id,
        'Books Count' as label,
        count(*) as count
    FROM books
    WHERE userId = '123';Code language: SQL (Structured Query Language) (sql)

And it works! Phew!

But this is gross to write manually, and even grosser to maintain. There’s a lot of pieces here and there’s no (good) way to break this apart and manage separately. SQL is ultimately text, and you can, of course, generate these various pieces of text with different functions in your code, and then concatenate them together.

But that’s fraught with difficulty too. It’s easy to get small details wrong when you’re pasting strings of code together. And believe it or not, this query is much simpler than much of what I’ve seen.

The Drizzle Way

Let’s see what this looks like in Drizzle. Remember that initial query to get each subject, with its count, and rank? Here it is in Drizzle

const subjectCountRank = () =>
  db
    .select({
      subject: booksSubjects.subject,
      count: sql<number>`COUNT(*)`.as("count"),
      rankMin: sql<number>`RANK() OVER (ORDER BY COUNT(*) ASC)`.as("rankMin"),
      rankMax: sql<number>`RANK() OVER (ORDER BY COUNT(*) DESC)`.as("rankMax"),
    })
    .from(booksSubjects)
    .where(eq(booksSubjects.userId, userId))
    .groupBy(booksSubjects.subject)
    .as("t");Code language: JavaScript (javascript)

Drizzle supports grouping, and it even has an as function to alias a query, and enable it to be queried from. Let’s do that next.

const subjectsQuery = () => {
  const subQuery = subjectCountRank();

  return db
    .select({
      label:
        sql<string>`CASE WHEN t.rankMin = 1 THEN 'MIN Subjects' ELSE 'MAX Subjects' END`.as(
          "label"
        ),
      count: subQuery.count,
      id: subQuery.subject,
    })
    .from(subQuery)
    .where(or(eq(subQuery.rankMin, 1), eq(subQuery.rankMax, 1)));
};Code language: JavaScript (javascript)

We stuck our query to get the ranks in a function, and then we just called that function, and queried from its result. SQL is feeling a lot more like normal coding!

The query for the total book count is simple enough.

db
  .select({ label: sql<string>`'All books'`, count: sql<number>`COUNT(*)`, id: sql<number>`0` })
  .from(books)
  .where(eq(books.userId, userId)),Code language: JavaScript (javascript)

Hopefully we won’t be too surprised to learn that Drizzle has a union function, to union queries together. Let’s see it all together:

const dataQuery = union(
  db
    .select({
      label: sql<string>`'All books'`,
      count: sql<number>`COUNT(*)`,
      id: sql<number>`0`,
    })
    .from(books)
    .where(eq(books.userId, userId)),
  subjectsQuery()
);Code language: JavaScript (javascript)

Which generates this SQL for us:

(select 'All books', COUNT(*), 0 from `books` where `books`.`userId` = ?)
union
(select CASE WHEN t.rankMin = 1 THEN 'MIN Subjects' ELSE 'MAX Subjects' END as `label`, `count`, `subject`
 from (select `subject`,
              COUNT(*)                             as `count`,
              RANK() OVER (ORDER BY COUNT(*) ASC)  as `rankMin`,
              RANK() OVER (ORDER BY COUNT(*) DESC) as `rankMax`
       from `books_subjects`
       where `books_subjects`.`userId` = ?
       group by `books_subjects`.`subject`) `t`
 where (`rankMin` = ? or `rankMax` = ?))Code language: SQL (Structured Query Language) (sql)

Basically the same thing we did before, but with a few more parens, plus some userId filtering I left off for clarity.

I left off the tags queries, and the unused subjects/tags queries, but if you’re curious what they look like, the code is all here and the final union looks like this:

const dataQuery = union(
  db
    .select({
      label: sql<string>`'All books'`,
      count: sql<number>`COUNT(*)`,
      id: sql<number>`0`,
    })
    .from(books)
    .where(eq(books.userId, userId)),
  subjectsQuery(),
  unusedSubjectsQuery(),
  tagsQuery(),
  unusedTagsQuery()
);Code language: JavaScript (javascript)

Just more function calls thrown into the union.

Flexibility

Some of you might wince seeing that many large queries all union‘d together. Those queries are actually run one after the other on the MySQL box. But, for this project it’s a small amount of data and there’s not multiple round trips over the network to do it. Our MySQL engine executes those queries one after the other.

But let’s say you decide you’re better off breaking that union apart, and sending N queries, with each piece, and putting it all together in application code. These queries are already separate function calls. It would be fairly easy to remove those calls from the union, and instead invoke them in isolation (and then modify your application code).

This kind of flexibility is what I love the most about Drizzle. Refactoring large, complex stored procedure has always been a pain with SQL. When you code it through Drizzle, it becomes much more like refactoring a typed programming language like TypeScript or C#.

Debugging Queries

Before we wrap up, let’s take a look at how easily Drizzle let’s you debug your queries. Let’s say the query from earlier didn’t return what we expected, and we want to see the actual SQL being run. We can do that by removing the await from the query, and then calling toSQL on the result.

import { and, desc, eq, like, lt, or } from "drizzle-orm";

const result = db
  .select()
  .from(books)
  .where(
    and(
      eq(books.userId, userId),
      or(lt(books.pages, 150), like(books.authors, "%Stephen Jay Gould%"))
    )
  )
  .orderBy(desc(books.id))
  .limit(10);

console.log(result.toSQL());Code language: JavaScript (javascript)

This displays the following:

{
  "sql": "select `id`, `userId`, `authors`, `title`, `isbn`, `pages` from `books` where (`books`.`userId` = ? and (`books`.`pages` < ? or `books`.`authors` like ?)) order by `books`.`id` desc limit ?",
  "params": ["123", 150, "%Stephen Jay Gould%", 10]
}Code language: JSON / JSON with Comments (json)

result.toSQL() returned an object, with a sql field with our query, and a params field with the parameters. As any ORM would, Drizzle parameterized our query, so fields with invalid characters wouldn’t break anything. You can now run this query directly against your database to see what went wrong.