Database Replication and Duplex Streams

James Halliday

Substack

Check out a free preview of the full LevelDB & Crypto course

The "Database Replication and Duplex Streams" Lesson is part of the full, LevelDB & Crypto course featured in this preview video. Here's what you'd learn in this lesson:

James discusses how to synchronous two databases. James takes questions from students.

Get Unlimited Access Now

Preview

Transcript from the "Database Replication and Duplex Streams" Lesson

[00:00:00]
>> What if we have a second log that lives in a different place we populate it with some different content. Hopefully you can see where I'm going with this. So call this one log2.db. And I just modified the add function so it's gonna add it over there. And we can run add with some other properties like hi.

[00:00:26]
And we'll link to that message and then third one, links back to that key. Okay, so now we've got two databases, log and log2.db. And so what if we want to synchronize these datasets? What if we want to concatenate them together? So we can do that. That'll be fun.

[00:01:01]
There's a lot of ways that we could wire up that duplex transformation. But I've got a really nifty little way of doing that on the command line. This program will let us use standard in and standard out and it will just sort of swap them together. So it's like an Ouroboros, the snake that eats itself.

[00:01:16]
That's what we're gonna be doing on the command line. Okay, so well, so I'm gonna create a new file called replicate.js. And replicate unlike the other programs is not gonna take, it's not gonna hard code the log file, we're gonna take that in as a command line argument.

[00:01:38]
So that'll be the first command line argument, first and only. And then there's a function in hyper log called replicate. Sorry, replicate. So this is a duplex stream. Duplex streams are kind of like a telephone conversation you can have where there's a readable side and writable side, but both sides can sort of talk and listen whenever they want.

[00:02:04]
It's the kind of thing that whenever you see A.pipe, B.pipe A, you've got a duplex stream, which is exactly what we're going to do. Although it might not look like it because we have a layer of indirection here using standard and non-standard app. Okay, so using this magical program that I alluded to moment ago, so what we wanna do is we wanna run two copies of this program, right?

[00:02:30]
We wanna run rep.js with log.db and we wanna run rep.js with log2.db. So that's exactly what dupsh does. So, log2.db, so it takes this program dupsh takes the first commands, takes the standard out of that commands and pipes it into the standard in of the second commands, but then it takes the standard out of the second command and pipes it back into the standard in of the first command.

[00:03:01]
So this is a way of doing a kind of symmetric replication song and dance. Really useful trick if you're building symmetric protocols because then you don't have to set up a TCP server and a client for basically the same code, but set up in a different way. So, this is just a nifty little trick.

[00:03:18]
Do that. So sometimes it hangs but usually it works anyways. So now if I run list.js, I don't actually see them. I think there might be a bug in this. That's too bad. Yeah,so I think there's a bug in my dupsh command or in hyper log replicate. But what it should have done is it should have printed out all of the documents.

[00:03:52]
Why don't I go ahead and modify the replicate command to cheat a little bit and we'll just load both databases and will probably work then. So, we'll load db1,db2. So log.db is a string and log2.db, so we have log1. We have log2 and we can pipe those replication streams together.

[00:04:22]
So log1.replicate and log2.replicate and now this one actually is going to look more like the duplex stream pattern. So to r1.pipe(r2).pipe(r1) again. Now if I run the list program, that's really weird. It's still only shows the, It's not including the other ones. Maybe because there's no common root.

[00:04:54]
It's like refusing to deal with this. So let's go ahead and modify one of our log files so that there's sort of like a common piece coming to them both. So if I run add.js, it's still hooked up to log2. And I'm gonna point at the latest key.

[00:05:12]
So I'm gonna say hi from log2 and then I'm gonna link it that one. And then whoa. All right Hyperlog is actually refusing to insert a key that it doesn't know about yet. So what I actually wanna do is I'm gonna copy log.db just overtop of log2 then it should work.

[00:05:51]
There we go. Okay, so now drumroll Okay, so we wrote something into log2 that log1 doesn't know about yet. And if I do the replication song and dance, now if I run the list which is still hooked up to the first log I still don't see the second one.

[00:06:14]
That's okay. That's really, well anyways, you can imagine how this would work if I'd prepared a little more. And there's a question on the chat, can we add a blog as a value to a Hyperlog?You can, in fact the default in coding with Hyperlog is to deal with binary pieces of data.

[00:06:37]
But just like with level db and Hyperlog is using level three, it's probably a better idea to stick to small documents instead and sort of use another mechanism for dealing with large files. Like an example of that that works pretty well is called Hypercore, which is part of the debt project.

[00:06:57]
And it has a streaming replication. It also internally has a log in a cap architecture, but you can use that to exchange blobs and it's a lot better at that kind of thing. So,
>> This would have worked, it would have messed up the chain rate?
>> No, the chain would have just been fine.

[00:07:17]
I don't know why it didn't work. That's too bad but-
>> Because for the other, for the log2 database we started with one hash, and that started a chain. At log1 there was a chain-
>> It did,yeah.
>> -So when you concatenate both of them there will be no flow rate.

[00:07:31]
>> Yeah, but then I deleted the second one and based it on the contents of the first one, so I'm not sure why it didn't work then either. Could be that I accidentally put it into an inconsistency state or something but

Learn Straight from the Experts Who Shape the Modern Web

In-depth Courses
Industry Leading Experts
Learning Paths
Live Interactive Workshops

Get Unlimited Access Now