Web Security Web Security

Sanitizing User Data

Learning Paths:
Check out a free preview of the full Web Security course:
The "Sanitizing User Data" Lesson is part of the full, Web Security course featured in this preview video. Here's what you'd learn in this lesson:

Mike reviews methods for sanitizing data that an application's user would enter to thwart potential XSS attacks.

Get Unlimited Access Now

Transcript from the "Sanitizing User Data" Lesson

[00:00:01]
>> Mike North: So in terms of how we can save ourselves from these things, you want to escape user data before putting it into HTML. So there are two things you wanna do. One is sanitize it before data is persistent. And the other is sanitize data before it's rendered onto the screen.

[00:00:22] And so typically, you want to do both because there are always holes in sanitatization methods, little holes. And if you have two layers of defense, then you're less likely to have something that is truly exploitable. And furthermore, it's quite a bit easier to sanitize data right before you render it, than to worry about properly disallowing all bad things without interfering with genuine user input, right?

[00:00:56] So in this case, we can see that the script above turns into the string below. And the JavaScript functions you'd use to run that are escape URI component, or escape URI. That will basically make it so that you end up with a string that when rendered will be text for a scrip tag on a screen.

[00:01:19] You will see that scrip tag as text instead of feeling it being run as code, if that makes sense. So every view library you've heard of for the most part does this automatically. This is not a guarantee. I'm not saying you are safe with the thing that you're using.

[00:01:39] But most of them react to in ember and view. They have considerable cross-site scripting protection built in, right? Where you have to do work to get out to deviate from it. You have to do extra work in order to make it so that a vulnerability is present, basically opt in to the unsafe.

[00:02:01]
>> Mike North: For the library we're using here today, here's the convention that's used. So anything between percent signs is generally, that's a JavaScript expression. If you use percent equals, that ends up being an escaped expression, meaning that's cross-site scripting protection is working for us there. If instead of the equals, we use a dash, there is no protection.

[00:02:25] And if we have neither an equals nor a dash, just the percent sign alone, that's just a JavaScript expression that will be evaluated, but the output will not be rendered into the HTML.
>> Mike North: So here's a little cheat sheet for different libraries and how you can know that you're in the danger zone.

[00:02:48] These are the danger zones. In Ember and Vue it's the triple curlies, triple handlebars. Anytime you see that, it's gonna take that string unescaped and it's gonna treat it as HTML, right? So if that turned into a script tag, you are potentially vulnerable there. React, I love their approach here.

[00:03:09] You have to actually write the words dangerously set in their HTML. And if you find yourself writing that word, stop and think. Ejs, that is the unsafe thing there. So there are a couple of things like that in this app, and we may be on our way to going to fix them.

[00:03:27] So you'd wanna change that minus sign after the first percent in the Ejs row to an equals. And then we would get our protection, right? Angular, there's an attribute. Inside square brackets called innerHTML, and that will be Unescaped. That one is not necessarily as obvious, unless you know that innerHTML is potentially dangerous.

[00:03:48] You could let that sail through a code review pretty easily. And then, finally, in Jade, if anyone is using Jade, that is the form with exclamation marks on either side. That'll be treated as basically an HTML value instead of a text value.
>> Mike North: So that is defense
>> Mike North: Yeah, that is defense number one, escaping data.

[00:04:17] Sanitize it first. Also, when you are putting stuff in attributes, Escape it as well. There's a great node library that is maintained by a set of penetration testers. It is not small, but it is comprehensive. So it might be good to take this if you wanted to sanitize a database, this would be a good service, I think, to use where you're going through all of your values, and basically evaluating whether anything seems to raise any flags.

[00:04:48] There are so many different tricks you can use involving things like capitalization and adding funky forms of white space that are designed to basically pass through rudimentary cross-site scripting filters. So it's not as simple as just look for a script tag. There are other ways that you can defeat some of the basic detection methods.

[00:05:15]
>> Mike North: And then, be super, super careful, anywhere that you're putting a user's value templated into code. Meaning if you have a JavaScript file that, when it's rendered up by a server, it puts something into place, that is absolutely bad. Do not do that. Have it so that your code will go and retrieve a value from somewhere and consume the value that way.

[00:05:38] You want user input to be treated as values, not as code. And in this case, that could easily become code, right? If it starts and ends with a quote, and then we have alert whatever in the middle, that'll be executed. But there's no way that that would happen if, for example, it was a templated JSON file that your code was reading, cuz you cannot have executable code in there.

[00:06:04] Does that make sense to everyone? Keep user values a values, not as code.