Transcript from the "Sanitizing User Data" Lesson
>> Mike North: So in terms of how we can save ourselves from these things, you want to escape user data before putting it into HTML. So there are two things you wanna do. One is sanitize it before data is persistent. And the other is sanitize data before it's rendered onto the screen.
[00:00:22] And so typically, you want to do both because there are always holes in sanitatization methods, little holes. And if you have two layers of defense, then you're less likely to have something that is truly exploitable. And furthermore, it's quite a bit easier to sanitize data right before you render it, than to worry about properly disallowing all bad things without interfering with genuine user input, right?
[00:01:19] You will see that scrip tag as text instead of feeling it being run as code, if that makes sense. So every view library you've heard of for the most part does this automatically. This is not a guarantee. I'm not saying you are safe with the thing that you're using.
[00:01:39] But most of them react to in ember and view. They have considerable cross-site scripting protection built in, right? Where you have to do work to get out to deviate from it. You have to do extra work in order to make it so that a vulnerability is present, basically opt in to the unsafe.
>> Mike North: So here's a little cheat sheet for different libraries and how you can know that you're in the danger zone.
[00:02:48] These are the danger zones. In Ember and Vue it's the triple curlies, triple handlebars. Anytime you see that, it's gonna take that string unescaped and it's gonna treat it as HTML, right? So if that turned into a script tag, you are potentially vulnerable there. React, I love their approach here.
[00:03:09] You have to actually write the words dangerously set in their HTML. And if you find yourself writing that word, stop and think. Ejs, that is the unsafe thing there. So there are a couple of things like that in this app, and we may be on our way to going to fix them.
[00:03:27] So you'd wanna change that minus sign after the first percent in the Ejs row to an equals. And then we would get our protection, right? Angular, there's an attribute. Inside square brackets called innerHTML, and that will be Unescaped. That one is not necessarily as obvious, unless you know that innerHTML is potentially dangerous.
[00:03:48] You could let that sail through a code review pretty easily. And then, finally, in Jade, if anyone is using Jade, that is the form with exclamation marks on either side. That'll be treated as basically an HTML value instead of a text value.
>> Mike North: So that is defense
>> Mike North: Yeah, that is defense number one, escaping data.
[00:04:17] Sanitize it first. Also, when you are putting stuff in attributes, Escape it as well. There's a great node library that is maintained by a set of penetration testers. It is not small, but it is comprehensive. So it might be good to take this if you wanted to sanitize a database, this would be a good service, I think, to use where you're going through all of your values, and basically evaluating whether anything seems to raise any flags.
[00:04:48] There are so many different tricks you can use involving things like capitalization and adding funky forms of white space that are designed to basically pass through rudimentary cross-site scripting filters. So it's not as simple as just look for a script tag. There are other ways that you can defeat some of the basic detection methods.
[00:05:38] You want user input to be treated as values, not as code. And in this case, that could easily become code, right? If it starts and ends with a quote, and then we have alert whatever in the middle, that'll be executed. But there's no way that that would happen if, for example, it was a templated JSON file that your code was reading, cuz you cannot have executable code in there.
[00:06:04] Does that make sense to everyone? Keep user values a values, not as code.