Advanced Elm

Derived Data

Richard Feldman

Richard Feldman

Vendr, Inc.
Advanced Elm

Check out a free preview of the full Advanced Elm course

The "Derived Data" Lesson is part of the full, Advanced Elm course featured in this preview video. Here's what you'd learn in this lesson:

Richard explains why, given the option of rendering a value that is derived from another in view and throwing it away, this is a much better alternative than recording it in the model.


Transcript from the "Derived Data" Lesson

>> Richard Feldman: Other one is derived data. So let's say we have this timestamp on our server, of one of our articles, and that's what's stored in the database, and that's sort of the original source of truth, right? This is data that lives on the server, and if anybody disagrees with the server, they're wrong.

The server is right. Let's general trend with client server interaction. It's usually if the server things one thing on the client things on the different thing. Generally speaking to the servers gonna win, so if you're gonna have to pick one source of truth. It's usually best to assume the servers is correct.

So it's coming back to the server and says, Come here cross the wire in ISO-8601 string format. So it's a string that is gonna come across and some JSON blog and we're going to decode that into a Time.Posix, and from there, we're going to translate that, we're going to render it, into the end user as a nicely formatted humanly readable string in the client.

And then finally, we're going to put that string in Html and we're going to see something like this. So all of this is derived data. We have one source of truth at the root here which is the database Timestamp and everything else is derived from that. This ISO-8601 string can be derived from having this, given that we can derive this, given that we derive this and given that we derive this.

So there is really no motivation for an example like this for us to have multiple sources of truth for this. We're just saying yeah database is the source of truth and everybody is just saying tell me a thing that came from the database and I will translate it into one additional step, or translate it one step further.

So as far as like where these things actually live, the database time stamp itself is hidden from the client. The first thing that we actually see is this decoded ISO-8601 string coming across the wire in the browser. And we take that, we convert it through Time.Posix, from the HTTP request we store in the the model.

And then we calculate that string in the view, that nicely formatted human-readable string says, May 5th, 2018, or whatever it was. And then of course, the HTML message is also gonna be calculated in view. Now here we have an example of drive data. But we actually do unfortunately, regrettably end up with multiple sources of truth anyway.

So here, when we're storing this in the model this is now a client cache of something where the source of truth is the database. And because we're caching it, cache invalidation is a well known, hard problem in computer science. This can get out of date. Now in the case of a database timestamp, it really shouldn't be changing.

If it's an actual timestamp of when the post is created. But if it's a timestamp of when the post was most recently edited, that could actually change on the database, and now our cached version is out of date. Unless we have some sort of web socket set up to live update every single time the database changes, we're probably just stale.

Like the user's looking at this timestamp on the screen, and it's just wrong. If you wanna know the ultimate source of truth for what that timestamp is, it's on the database, and it's a different value than the one we've got. So again, we have ended up with multiple sources of truth, they are competing.

We are rendering the one from the client cache, but actually the one that we want to trust is the one in the database, which we don't have access to. So we actually end up catching a lot of things in the model. In general, any time something comes in from a HTTP request, we're probably caching it in the view.

This is just a reality of doing client-side development. I mean, the reason we cache things is for performance. Imagine if every single time you wanted to render, first you had to go to the server with an HTTP request. Get everything back and then render based on that. It will be incredibly slow.

Unusably slow. So of course we're gonna cache things. But we have to be aware that when we are caching things, we have to deal with caching validation which means some of the time we're potentially going to be wrong. This is sort of an accepted cost of doing business in web development or in any kind of client development.

But it is something that's important to be aware of. This can be a source of bugs.
>> Richard Feldman: Here we have some intermediate representations of this derived data. This is a lot less error prone. Like if we're saying, okay, we accept the model as the source of truth that we're gonna render from.

Which in the HTML application sort of goes without saying. Then each of these can be derived from those. And as we saw in the previous slide, we can just put those in the view function. And then there's sort of transient. They are intermediate representations on the way to the final representation that we want, which was the actual done nodes on the screen.

So the string which is derive from the model and this HTML was derive from the string are both intermediate representations on a way to the final form. And it's important that we don't store those intermediate representations. As competing sources of truth in our model. Unless we really, really have to.

The reason we might have to is again performance. Then for some reason this operation here, we're calculating the intermediate representation is really expensive and it's slowing everything down. Then yeah, maybe we have to cache that and the model too. Separately from the original source of truth that it comes from.

But then that point where cache of a cache of the original source of truth which is the database time stamp, and the more times we do that the more convening source of the truth we have, the more potential for bugs we have. So the moral of the story is number one, we're probably stuck we having that client side cache of the model and the fact that it potentially out of date with the underlying server.

That's something we're just going to have to live with for performance reasons. But we don't need to make it worse than, that unless we actually have another performance problem. By default, if you have the option of rendering something in the view and then discarding it afterwards rather than recording it in the model, always do that.

If it's derived data and it's something that comes from the model and nothing else, great, put it in the view and don't have it be anywhere else. So if you're ever wondering, hey, where should this logic go? If the answer could be in the view, start with that.

And only if you have a performance problem should you move it to update and then store the result in the model.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now