LevelDB & Crypto

Methods and Lexicographically

James Halliday

James Halliday

LevelDB & Crypto

Check out a free preview of the full LevelDB & Crypto course

The "Methods and Lexicographically" Lesson is part of the full, LevelDB & Crypto course featured in this preview video. Here's what you'd learn in this lesson:

After James reviews the methods that are available with LevelDB, he notes that LevelDB uses lexicographic sorting, which alphabetical order of string values based on the alphabetical order. J


Transcript from the "Methods and Lexicographically" Lesson

>> All right, so let's start talking about different kinds of methods that we've got. So this is pretty much it, there's a couple of extra things. But if you know these methods, that's all you need to know about LevelDB. Of course, knowing that the methods exists and knowing what arguments it takes.

They take isn't the whole story, because you have to think about data in a kind of a unique way if you haven't used a database that works like this before. So I'll get more into that, but what we can do to start is we can use a couple of methods put and get to just play around on the repple or maybe we can write a little program.

So why don't I write a little program, we can just make a simple counter that's going to write out a number. And every time you run the program, it'll read the previous number and it'll increment it and then print out the new number. That's a pretty simple thing to do just to give you an idea of get input.

I mean, they work pretty much like I think you might expect. So there's no surprises. Let's just write that out. So I'll call this increments or inc.js. So we're gonna require level like before. We're gonna now have a database location to use. So we call this inc.db, why not?

So increment db. Let's just use a value encoding of JSON, so I don't have to muck about with turning strings into numbers and that kind of thing. And so what we can do now is we can call db.get on count, let's say. So the basic idea with LevelDB is that you have keys and values and there's some subtleties there that I'll get into about like lexicographic orderings, and things.

But for this example, we can just have a key called count. Give it a callback and then the callback gets an error if there is one. So it's always good to bail out if you get an error. Otherwise, what we can do is take that value and increment it.

But actually if a key doesn't exist, we get an error. So in this case, we're just gonna ignore the error on. Because if there was an account already set just use zero, that's fine in this case. So we can take the value or 0 and we can add 1 to it, and this is the value that we're gonna store.

So we can call db.put now on counts with our new value. So the key value and then callback. Now if there's an error, we probably want to know about it. So. But otherwise, we wanna print out what the new count is. So I actually should have saved this to a variable.

So if I do that, let's call it n, console.log(n). Put an else in front of that. All right, so stepping through this pretty simple program. Now, again, really quick and then we'll run it. So we require level. You have to npm install level first to make this work.

We instantiate the LevelDB instance and we get a handle back called db. DB has methods like put and get and create read string, and batch, and all kinds of fun stuff. In this case, we're using a value encoding of JSON, so that we don't have to convert the string to a number over and over.

So we get a count, it's the key. If it existed, then there's no error and the value will be whatever the last value was. Otherwise, there's going to be an error called like not found keynote founder or something like that. So then we increment that number and put it back into the database, and print it out.

So here, I'm gonna run this program. So first time I run it, I get 1. Second time 2, 3, 4, etc. So pretty fun. I'll put that in repo. And you can play around, if you like. So that was an example of get input. There's also methods for doing things like deleting keys.

So we can also like db.del and then it takes the key names, accounts and callback fairly straightforward. The more interesting ones are batch and create read stream. And these are kind of related and there's some ideas that we have to go through about atomic inserts, and structuring our keys in such a way that we can pull out range queries from the data.

All right, so first, I think it's important to cover this topic or cover this idea called atomicity. This means either if you have a bunch of operations, they should either all succeed or all fail. You don't want like, for example, if you're creating a new user accounts on a website and you have to create a new user account and then you have to store their address or store their email or send an email, all this kind of stuff.

So you don't wanna have accidentally created an email and then there's no user account that matches it. You want your data to sort of be in a consistent state and this is a guarantee that you can get with LevelDB using this API called batch, which we'll look at in a moment.

All right and this is an important way to enforce another idea called consistency which just means that your data is not gonna be in some in-between state between two valid states, so that you can make certain guarantees about how your data works and how can deal with it.

All right, so a good way to enforce atomicity with LevelDB is this method called batch. So batch takes an array of operations. All of these objects in this array have a key and a value. And so this is just like doing db.put over and over again. Except if there's an error, all of those operations will be unrolled atomically or they'll all be inserted at the same time.

You're not going to get into a situation where like one of the documents was written and another one wasn't. And then at the end, you get a callback like most of the other API calls in level. So why don't we play around with batch just a little bit?

So we'll call this one batch.js. Maybe we can take some arguments on the command line, I don't know. So as before, you can require level and you can setup a LevelDB instance like this. Maybe we'll do value encoding of json again or not, whatever. So now, we can call db.batch with a bunch of records.

So maybe we wanna insert the numbers 0 o through 10. So can do a little loop through that. I'll just create an array called batch and we can push things onto that array in a loop. And then if there's an error, we wanna know about it. So. All right, so make a little loop and then we're gonna push things onto that loop.

So we have to provide a key. We'll just call it n and then the number. That's simple thing to do and then we'll put a value of maybe i*1000. Pretty contrived example, but I think you should get the idea how this works. So if I run this now, it's gonna just do nothing.

Because we're not printing anything, except for an error. If there was one, then there wasn't an error. So everything is good. This now gives us a good opportunity to look at the final method in the LevelDB API, which is called createReadStream. So if we wanna pull that data out of our database, we can use this method.

So the main thing that LevelDB gives you which is really powerful I mean, all of the rest of this stuff you can get pretty much anywhere else, right? But the main thing where a lot of the power comes from is the ability to structure your data lexicographically with these range queries.

So you sort of have to think a lot about what the c keys are going to sort, which we'll get a little more into. But once you've done that, it can be a very powerful tool that works in a really modular way with a big ecosystem of packages that you can use.

All right, so let's modify our example. Now, I'm just gonna comment out the previous stuff with this batch function that we were doing and we can now call db.createReadStream. You get an object mode stream back, an object mode readable stream. So there's a lot of ways to consume one of these, but we can pipe into this package that I covered yesterday called to2.

It's a silly name, but whatever. It's an easy way to consume a readable stream. So we're gonna get objects. So I like to call those row and then you get an encoding. You can always ignore that and you get a next function. So you have to call the next function to keep the stream flowing.

So you can do things asynchronously and you're not gonna get a bunch of data that's like backed up on the other side. The node stream API is kind of intelligent about making sure that it's only pulling the data that it needs to from the source, which in this case is going to be our database.

Pretty handy if you have to pump data out to a slow connection on 2G edge network or something and can't go very fast. Okay, so what we can just do right now is I'm just gonna print out what those objects look like. So we can use console.log with row and we can call the next function, cuz it's a batch function.

So we're going to use db.createReadStream. We're going to pipe it into just this stream that's just going to print out all of the records in our database. These are objects. Actually, I did this incorrectly anyways. So it's an object mode stream that we get back from the local db.createReadStream functions.

So you need to put .obj or you need to create your stream with object mode and camel case colon true to enable that behavior in node streams like a little bit of string, obscure stream trivia. But it comes up a lot in LevelDB, because you get objects and streams back.

So let's look at what data we get back that we put in there before. Here we go. So we get back the data that we put in there, keys go from 0 to 9 and the values look good as well like 0 to 9,000. An important thing to note is that it's not just coincidence that these are in order.

This is something that you always get back from LevelDB. So when you do a range query, you're always going to get back the results in order. The one caveat to that is if you pass in an option called reverse true, then you're going to get back the results in reverse order which is can be slower.

So it's usually not if you can get away with not doing that, it's good to do it. But sometimes, you already have data in a certain way and you just need to get it backwards. So all right, so I'm gonna put that. So we can start looking at some of the other things we can do with read streams.

Now, one really useful thing that you can do. I mean, there's also a limit function. But one of the most useful things that you can do is you can use these options to sort of have a subsection in the results. For example, if I want to do greater than n5, only gonna get back results that are greater than the key n5 when comparing in string order which is also called lexicographical order.

So now indeed, I get 6, 7, 8, and 9 and I don't get any of the other ones. There's also a less than, less than or equal to. So I can window that down even a little more, less than 8. So now, I should just get 6 and 7.

Yep, so these are just some basic primitives that you can mess around with.
>> So when you mentioned ordering, is that strictly by the key property or is that based on insertion order or?
>> The ordering is based on the key property. So it just so happened that our insertion order was the same as the keys, but that doesn't often even hold it.

Just this silly example. But yeah, so things that are in LevelDB are always sorted by the key in string order. So it always kind of be thinking about how strings are going to compare like how is the string going to be less than another one. So you wanna play around with that.

You can use the Unix sort command or you can use the erase sort command to just sort of like test out, or you can use the less than and greater than operations as well to test that out on the command line. Also, note that numbers get converted into strings.

So 12 comes before 2, because 1 as a string comes before 2. And things that are longer will come after equivalent things that are shorter. So like for example, if one and 12, 1 comes before, because it's shorter than 12, okay? So I think this is a good opportunity to play around.

I'm gonna add those files to the repo. It's kind of a mess but whatever.
>> Add your package json.
>> What?
>> Add your package json.
>> Did I not add the package.json.
>> [LAUGH]
>> All right, cool. I though that I did.
>> The less than and the greater than, were they on the key or the value?

>> They're on the key. Yeah, so the comparisons with greater than less than, etc., are always on the key.
>> From an application standpoint, you would mostly compare the values, right?
>> We'll get a little more into how to do that, you can build secondary indexes. But again, LevelDB really isn't something that you probably wouldn't want to use it for something that's just going to be a drop in replacement for like an existing kind of crud workflow crud app.

LevelDB is for things that are way over in left field in like fun lands where everything is relatively new and experimental, and we're gonna get really far into that territory at the end of today. So yeah, you can kind of use it for that kind of stuff and like people kind of work.

But yeah, it's probably easier if you have one of those problems to use something like more traditional like Postgres API. But you can't run Postgres in the browser. I don't think so, anyways. Okay, so let's see. I see a question on the chat, can you run greater than?

Nope, you can't run greater than value, whatever. So just nope, but I will show you how to do operations on the values. We can actually combine the things that we've already seen. We can combine using range queries and combined using batch, so that we can insert additional keys which is what a relational database will do behind the scenes for you anyways.

So we'll get into that in a bit. All right, another question. Can you store more complex data types like objects or arrays, or they get converted into strings? So depending on what value encoding you're using, you can store whatever kind of values fit into that. So arrays and objects are fine, although I probably wouldn't wanna store something that's like a super nested hairy object with too many elements.

LevelDB is usually best for kind of storing small documents. And if you have other bigger pieces of information like files, especially, it's good to just include like the hash or the file name in the document and then use another store for that kind of information. We'll cover some of those stores at the end of this particular set as well, done.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now