Transcript from the "Network/Proxy Bug" Lesson
>> With specifically trackJS, [LAUGH] because there's your two options, you can include script tag linking to your CDN or you can load it yourself. What would you recommend?
[00:00:19] You can either include directly from our CDN, which is the easiest way to get started. Or you can pull the script yourself. It's the same risk. But if you reference it from the CDN, you're trusting us not to screw up your page, which we are going to try very hard not to screw it up.
[00:00:34] But if you're very concerned about it, pull it off of MPM and bundle it in with the rest of your scripts hosted off of your own network and then you have complete control. There is an intermediary step that you can do is we host, this isn't just us like a lot of vendors will do this.
[00:00:51] Where you can go to a particular URL and get the most current version of our script that has like all the latest bug fixes, all the latest capabilities. Or there's a different URL where you can ask for a named version. You want script version 2.7.1. Which we promised we won't ever change, but you're still inherently trusting us that like, our CDN is not going to get hacked or we're not gonna accidentally change it or stuff like that.
[00:01:17] So if you want to guarantee that you have control of when changes making your app, pull your scripts yourself. Anything beyond that is a balance between how much do you trust your vendors to not make changes? And what's the impact and risk if they do? Like all things, it depends on what your app is doing.
[00:01:37] Like, if your app fails to people die, then you wanna be really careful about who has the ability to break your app. If your app fails, somebody is minorly inconvenienced and they don't really care that much. They'll do it tomorrow, then it's not so big a deal, right?
[00:01:57] We don't all need the same levels of risk in our applications. And so it's up to all of us to look at our own applications and understand, what is the risk and impact of outages and how careful do we need to be above it? All right, here is our last bug that we gonna do together today.
[00:02:18] And this one is kinda tricky. This one is something that's actually happening to all of you. All of your apps I guarantee have this problem. Because it happens to everybody's app and there's no way you can really do anything about it. If any of you have monitoring I bet you've seen the second one.
[00:02:35] jquery is not defined right? If you've turned on monitoring, you've probably seen that report once or a while. And you're like, how can jquery not be defined? It's just right there on the page, like, how could it not be there? It also manifests itself in my app as analytics is not defined, cuz I have some other calls to analytics.
[00:02:56] That's the same root issue. It doesn't happen a lot. But it happens to some it's not just one random user with a crazy configuration. It's happening to a sizable group of people across a wide time range. So let's see how could these core level things like analytics.js and jquery.js not be defined for our applications.
[00:03:31] All right, so I'm gonna actually take a look at the source code. And we're gonna take a look at analytics.js. Now analytics, we're gonna look in source Oops and src. Now, analytics I use all over my application. I use its API to do different things. I call the, say I wanna track conversion or I want to make another call to something.
[00:04:04] But there's no real magic about how it's included. As we looked at in the last error that we solved together, analytics.js is just a vendor script. And I include that vendor script, just like any others as a script tag directly here on my page, here on line 44.
[00:04:19] I'm not loading it through some arcane mechanism. I'm not like pulling it in and evaluating it. I'm not really doing anything. I'm dropping a script tag on the page. So how would that, how would that even fail? If I just reload my page now and take a look at my Console, I'm obviously not getting that error.
[00:04:43] It's not happening to me. It's not happening to everybody. Let's see if I can recreate it. So what happens if for whatever reason, that script tag just wasn't there. What error would get reported? So I'm gonna comment that line out, reload. And when I do something, I see analytics is not defined.
[00:05:11] So this error seems to happen when analytics is not there. Which makes sense based on the message that it gets created. But how would that happen? How could this analytics not happen? Well, as a few of you had problems with it earlier today, one thing could be ad blockers.
[00:05:30] In fact, a lot of times it's ad blockers. Ad blockers go in and stop certain requests from being made. Maybe somebody has an ad blocker that thinks analytics.js is something bad that we don't want on the page hence blocking it. Now I don't have any ad blockers installed.
[00:05:47] But we can simulate some behavior using some other tools. Now you don't necessarily need to have this but I'm going to kind of show it off because it's fun. On Mac there is a application called Charles which on Windows, you could also use Fiddler, if you're familiar with that.
[00:06:06] And this is a Web Debugging Proxy. I really need to purchase a license for this. So I don't need to wait the 10 seconds of neg wear for this thing to purchase because Charles is really very cool. It is an HTTP proxy that allows me to manipulate the traffic as it's going by.
[00:06:26] So I can simulate conditions of my users in a more thorough way than what I could do with Chrome alone. For example, what if I wanted to simulate a bad network connection? One of the tools that Charles has is called Throttling. And what this lets us do is it lets us set how do we want our network to behave?
[00:07:06] Can we simulate somebody who's on a poor connection? And not just the chrome level connection that just slows it down and introduces latency, we can actually introduce bad networks. I have one that I've used before, I don't have it, that I call this one, bad 3G on a subway.
[00:07:26] That not only is it slow, 3G on subway there it is. Not only is it slow, but it occasionally will simply drop packets. Like will really happen if you are on a subway or an elevator or anywhere where you're on a mobile device that sometimes just loses signal for no apparent reason.
[00:07:51] You can do this by not only changing like the bandwidth and upload limitations and how to utilize everything is. You can change the reliability and stability of the network. So I'm going to drop the reliability of my network and stability of my network to 95%, still illustrate the point.
[00:08:07] So I'm gonna turn this thing on. It's recording and Throttling. If I reload my page here, I had a bunch of stuff break. That's good. Sometimes I have to do this like three or four times to get stuff to break. So, moment.js failed to come down, because the content that came down didn't match how big we thought it was going to be.
[00:08:39] The network broke down and the packet did not come through successfully. ERR_CONTENT_LENGTH_MISMATCH. Therefore, we created downstream impacts we had an error moment is not defined. And another moment is not defined. A couple of our ads also failed for other reasons So using this in stable network each time we reload we'll see slightly different versions of colossal failure.
[00:09:15] This time it came down but our fonts failed. This time just the hair ad failed. So using these search tools we can simulate how does our application act in bad connections. Now, this isn't super common. Like, you probably don't see this all the time. And it's uncommon enough that most of us didn't know that it happened.
[00:09:42] But at the 2016 Google IO conference, they released some internal metrics from Android. And a lot of this was from older Android that had some connectivity problems. But they were saying that between one and 4% of all HTTP requests made from Android failed. That's a lot. But if one percent of all HTTP requests failed, like that's a lot of requests that just wouldn't work.
[00:10:11] And a lot of applications that just wouldn't work. Like how would your app act if like, one out of every ten of those just would fail to load randomly for your application. How does your application look? Well, in our case, we have direct evidence that a bunch of times analytics.js is failing for us.
[00:10:36] So I wanna actually simulate that. I'm gonna turn off Throttling here. I'm gonna turn on another tool in Charles called rewrite. That enables me to dynamically rewrite a request that's being made, which is really cool. It's actually really useful for more advanced complicated system debugging. I have this rule that I already kind of put together.
[00:10:57] Here's a rule that says, hey, when the request is made here for, well that I'm going to work, I gotta change that. What's that dub.getranter.com now. Yeah, getrancher.com Port 9000/scripts/vendor/analytics when that request gets made. What I want you to do is I want you to not return the 200 you were thinking you were gonna return I want you to return a 404.
[00:11:30] And I want you to rewrite the body to be an empty string. Using this kind of simple rule builder, you can basically force your local system to return anything you want to return. You can intercept third party scripts and fix their bugs form and see if you can like confirm or fix.
[00:11:45] You can intercept production environments and interject your dev environment into a production request. You can do all kinds of very interesting debugging use cases, and this is available in both Charles and Fiddler. So you should check those tools out. So I'm gonna activate this. And now if we reload the page, Hopefully it works as I think it's going to.
[00:12:22] Analytics, it's /vendor/analytics. Sorry I wrote the rule incorrectly, /vendor/analytics. Now it's still coming down. It's shoot. Cdn.getranter.com. There we go. So now we've made the request and now analytics.js is coming down with a 404 and an empty response body because we're intentionally intercepting in our proxy. And now we can see what does our application actually do when these failures have happened.
[00:13:40] We see the error when we attempt to do something. We attempt to create a rant and it generates this problem. And so the impact of this error for those 60 users is that they're unable to really use the site. Every time they click rant, it just generates another error.
[00:14:01] So that's not a very good user experience for them. If they're one of those unlucky few who land in this case where some critical file fails to come down, they just think get ranter is broken and useless and they'll go back to Twitter. And is not what we want at all.
[00:14:17] So let's see if we can figure out a way to do some sort of intelligent fall back. Because we know that a script can fail to load sometimes. So if it fails to load that sometimes what can we do about it? Here is where we're actually making the call.
[00:14:36] And inside of that safety check we did this morning. The user is about to add a rant, we've checked to see that there is rant text and the type of that rant text is a string. And just before we try and create that new rant. We track a conversion.
[00:14:50] We track a, this was an important event that I want to track in my analytics tool. But this blows up if analytics is not defined, if that script never came down. So we should probably safety check that. And that It's not a complicated thing to do. The first time you make a call for something, make sure it's there.
[00:15:19] Because at some small percentage of the time, single scripts will fail to load. With this minor change, if we reload the page here, all I did was wrap analytics.trackConversion and an if analytics. When I reload the page analytics still doesn't come down. Under that not save. I lost my mapping sorry.
[00:15:47] Let me make that change again here. If (analytics), It's not saving my mapping. Why is it not saving my mapping? Map to File System Resource Right, we'll just change here, adrantView, Now with that changes in place according to the Console, Alex j has still failed to load so we're still in our fail mode use case.
[00:16:43] But now dang it. [LAUGH] Global values you need to anchor them. Sorry. We need to check if analytics exists on Window before we call it. If you reference a global variable without Window, that works if you're just going to call it. But if it's not there, you actually need to reference what global object it's on.
[00:17:10] So if there is a such thing is window.analytics, then let's call analytics.trackConversion. Now when we actually click on the ramp and even though analytics has failed to load, the user can still go on their merry way. The user does not need to be aware that I've had a failure.
[00:17:28] This can be our little secret. The only impact really, in this case of analytics failing to load, is that I've failed to capture my conversion metrics. But the user can go on their merry way