Transcript from the "Resolving Bugs" Lesson
>> Resolving bugs is all about lowering the risk of change. When you have a bug, especially a bug that is made all the way out in production, fixing that bug is a change. But you should fix bugs. But what if you introduce more bugs by fixing that bug?
[00:00:17] What if you introduce a worst bug by fixing that easy bug? What if it's Friday at 4 o'clock and all your buddies are down at the pub and you're trying to decide, do you fix the bug and risk introducing a worse bug or do you just let the bug go till Monday?
[00:00:34] Lowering the risk of change helps make this whole thing a non-issue. How do you make it so that releasing a new version of your app is a non-issue? This goes back to the impact in the context of your application. Is this a life or death thing? Are we talking about pacemakers?
[00:00:56] Are we talking about people's lives? If we are, releasing of a new version of your application needs to have the appropriate safeguards so that people don't die. For most of us, I think we can probably reach an appropriate level of confidence to make releasing our applications a non-issue.
[00:01:15] And that can come from a lot of different angles. It can come from continuous integration, continuous delivery build process. Do you have the ability to check in your code to your source control and know that it works? Will it automatically build? Will it automatically run test suites? Do you have test suites?
[00:01:34] Do you do unit testing for your lower level code? Do you do browser level testing to make sure that actually all work together? If you have these in place, checking in a change to your code and deploying it can be a relatively confident thing. I will tell you a little bit about some changes I made to the biggest code base that I maintain.
[00:02:14] So there's a tremendous risk that I will break my customer's pages if I do something wrong. And so my impact was quite high. I was quite scared to make changes to it. And so we had to make a number of changes to how we built the app over our lifecycle.
[00:02:31] So, introduced a better unit testing cycle for the things that needed to be tested at that level. We introduced using Sauce Labs, a full end to end browser testing cycle that runs every time we check in. And we track 400 different classes of errors across 18 different browsers every time we check in.
[00:02:51] Every time we commit to master, we run our full test suite to make sure that we're not gonna break anybody. And then because I still then trust that, we used a strategy called Canary, where I have a handful of my customers who have been very generous. Have decided they are willing to point their web applications at my Canary release.
[00:03:10] And so I can release to Canary when I've committed a change in my test pass, and I let that bake for two hours, four hours, six hours a day. However big I need to be to be confident or however long I need to be to be confident that it's going to work.
[00:03:28] And I know at the end of that, that it is going to work. I've let real users and real applications run this thing. And I have all the instrumentation to know that it's not gonna break. That was a story about how we lowered the risk for our change.
[00:03:42] And all the stuff that I needed to do to reach a level of confidence so that when I run into a bug at 4:30 on a Friday, I'm not worried about it. I can fix it in my test pass, I'm gonna send it to Canary. And then maybe when I get back from the pub, if everything looks good, I'm gonna go ahead and ship it to production.
[00:03:58] Because our testing and our analytics cycle has enabled that. That does a tremendous amount to lower the stress and lower the overhead to build quality software. If releasing a fix to your bugs is a non-issue, you'll fix more bugs. If fixing a bug is hard, your bug backlog is just gonna get bigger and bigger and bigger.
[00:04:25] And nobody's gonna wanna do it because it's scary and hard and nobody wants to be blamed for making it worse.