C Fundamentals

Strings & Memory Exercise

C Fundamentals

Lesson Description

The "Strings & Memory Exercise" Lesson is part of the full, C Fundamentals course featured in this preview video. Here's what you'd learn in this lesson:

Students are instructed to follow the code comments in the "2.c" exercise file. The steps experiment with byte array usage, null terminators, and the %zud and %s formatters.

Preview
Close

Transcript from the "Strings & Memory Exercise" Lesson

[00:00:00]
>> Richard Feldman: Let's do the exercise for part 2. So moving on right here. Let's close one, okay, so first we're gonna once again build and run the program. So, I'm just gonna copy and paste this I'm gonna leave that one open just in case we wanna refer to it later, oops.

[00:00:16]
I need to go into the exercises again, hmm, okay. So essentially, what we're seeing here is that HTTP, 1.1 200 okay, and it says that output was from right, and this is from printf. And then here's the same thing, and let's go take a look at what the differences are in there.

[00:00:33]
So first it says, let's try replacing this number 15 with a call to strlen and then you have to remember to include string.h above. So, this is essentially going to be like calling right just like we did before, but now that we've made header be a variable. Let's actually just use strlen instead of like hard coding the length like we did in part one.

[00:00:53]
After you're done using that, now try introducing a null terminator in the middle of the string. So C allows you to write backslash zero and what backslash zero means kind of similar, like how backslash n is a new line. Backslash zero is basically like write a zero byte at this exact point, you can do this anywhere you wanted a string.

[00:01:10]
So, for example, like you could put it right after the HTTP, sorry, HTTP and HTTP, and just say backslash zero right there, and you might be able to predict what that's going to do. But try to make a guess of what you think it's gonna do before you actually run the program and then run it and see what happens.

[00:01:27]
Okay, and then we have the second thing that's like, hey, that output was from right, this is from printf. And now we can try changing the %s here to %zud, there's gonna be a little compiler warning, but you can just ignore that and that's gonna print out the memory address.

[00:01:38]
We can actually see what that is and that's gonna be different depending on what machine you're running on, potentially. And then finally, try changing this back to percent s, and then replace the printfs last argument, which was right now header, with this argument instead. And then also try it with the number zero instead of that number and you can see some interesting stuff for what those prints.

[00:02:04]
>> Richard Feldman: Let's go through the solutions really quick, so to begin with, we've just got this, we'll just make sure that's working okay, looks good. So now we can go ahead and delete this part, so first thing we're gonna do is replace the 15 with a call to strlen of header.

[00:02:17]
So again, remembering that strlen is going to receive this actual memory address, so sterland takes an integer. The recurring theme that we're gonna see throughout this course is a lot of C functions actually just take integers. They just might have fancier types, but at runtime they're all integers, how does it know where the end of the strength is to calculate the length?

[00:02:33]
Well, it's just gonna keep walking until it gets to a zero byte. Now we do have an error here, which is basically saying like, hey, this is an undeclared library function, been fair enough. And the solution to that is we're just going to go grab string dot h, and then we're all set so this should do exactly the same thing.

[00:02:48]
The only difference is that now it's computing the length of the string. Normally, this would be a runtime, but after optimizations, it's actually going to be done at compile time, and we won't actually have to wait for it to walk that at runtime. Okay, now that we're using strlen, we can mess this up by adding a backslash zero in here somewhere so that's essentially going to insert.

[00:03:07]
We're gonna have three normal bytes, then we're gonna have a zero byte, we're gonna have a bunch of other bytes. But because write is like, give me how many bytes you want to write, now it's gonna truncate that. It's just gonna say HTTP, okay and notice that this affects not only write, but also printf, because printf is automatically doing the thing that we are now doing manually and write.

[00:03:27]
In other words, printf is under the hood doing that same thing, like just keep marching along the array until we hit a zero byte and then stop. So if you put a zero byte in the middle of your string, and then printf it, it's just like I guess that's where the string ends.

[00:03:41]
I hit the zero, so there might be more stuff after that, you kind of intended to be part of the string, but printf and strlen both are just like, I don't know. I saw a null terminator, so I guess that's the end of the string, again, an example of C.

[00:03:52]
There are no guardrails on this stuff with C, cool. I'm gonna go ahead and just put that back, though, so we can actually see our real HTTP header. And now we're gonna change this percent S to percent ZUD, and there is gonna be a little compiler warning, but we're just gonna ignore that.

[00:04:08]
And basically, what this is doing is now over here we're printing out the actual string, but in print F with the ZUD, that's saying, okay, treat this as a 64-bit integer. And it's like, yeah, I'll just print that out for you, now in this case, you might notice that when I print this out, I'm seeing slightly different numbers here.

[00:04:24]
So, they all start with four, three, something, something, something. But they do actually end up with, I guess we don't need the Z, sorry, we don't need the zud, I guess it's just zu, because there's a little D on the end there. But yeah, so this address does actually change a little bit.

[00:04:42]
So, and the reason for that is that basically what we are seeing [LAUGH] is that the memory address is slightly different. Now, you might be wondering why is the memory address different? Like, isn't the compiler deterministic? Why is the compiler generating a slightly different memory address for string the same string and the final binary, why would it be generating a slightly different address?

[00:05:02]
And the answer is actually it's not. What's actually happening here is that there is a security feature that operating systems will do called address space randomization, I think, something like that. At any rate, the point is that because a lot of exploits will actually try to do things like what we did in part one, where we Intentionally read too much memory.

[00:05:21]
We're trying to like get secrets in order to try to defeat hackers that are trying to do that type of thing on purpose. What they'll do is they'll actually randomize the offsets of things a little bit. So you take the same exact compiled binary, and you run it in the operating system.

[00:05:36]
The operating system will just move the addresses around a little bit. And also go in and like tweak whenever the binary goes and like ask for things. Just to intentionally try to mess around with hackers who are trying to be like, I know if I put in this magic number right here, it's gonna get the right offset.

[00:05:51]
I know where the secret is exactly 78,000 bytes after this thing, so if I inject this perfect number right here, then I get the secret right there and then I continue on. They're trying to thwart the hackers by introducing randomness, so that the addresses are unpredictable and therefore hard to do malicious things to so that's what's happening here.

[00:06:11]
It's not that the binary itself is like the compiled binary is putting these in different places, but rather that that's the operating system trying to defeat hackers. And you can see that in C by just printing this out as hey, yeah, this is the actual output of the address itself.

[00:06:26]
And we can, in this way, really, really confirm yeah, for real, this is a number in memory. Okay, finally, we're gonna change that back to %s and then replace the last argument, which was originally header, with this argument instead. So, there is that, so now we're getting a segmentation fault, very consistently, we're not printing anything out.

[00:06:47]
So why is that happening? Well, essentially, again, this is just an integer. And if I say, basically this parenthesis is saying like, hey, I know I wrote an integer here. But just please choose to interpret this as if it were a pointer to a, it's a tar star, it's the, as if it were the address of the beginning of a string.

[00:07:06]
And I just picked the number one, two, three, four, five, six, because this is a low enough address that it's part of the operating systems reserved. You may not ever touch this memory memory, and so, because I'm basically saying, hey, let's print it, let's touch that memory. The operating system is like, absolutely not I'm shutting this down immediately segmentation fault.

[00:07:23]
This is a very reliable way to segfault your program if you wanna do that. But again, C doesn't have guardrails on this if you decide that you want to send it that memory address instead of the one of the actual string, C will happily run. And then the operating system will segfault for you and then finally, we can try it with the number zero instead of that.

[00:07:38]
And now we will actually see something that is a little bit nicer, which is it actually prints out parentheses, null. So again, zero meaning null byte, as opposed to, like the actual number, null, this is something that printf just automatically does. If you give printf a zero here, well, printf obviously is like in most languages.

[00:07:57]
This is where you get a null pointer exception of course he doesn't have exceptions so it's like, well you gave me a null and, I don't know what to do with that. I'm certainly not gonna like to go and try and read memory and address zero cause that's definitely gonna seg fault.

[00:08:09]
So, printf has special case this to be like, if you give me a zero, I'm just gonna print out parentheses null so that at least you don't have a seg fault to deal with. But it doesn't handle any other cases for you than that, as we saw earlier when I gave it 1, 2, 3, 4, 5, 6, and it happily segfaulted, so there you go.

[00:08:26]
This is an example of not only printing some basic string operations, but also, I really, really wanna drive home this point that this is an integer at runtime. And so many of the things that we're gonna see and see are actually integers of memory addresses. And it's just C is deciding how to interpret those ones and zeros based on what we ask it to do.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now