C Fundamentals

File I/O Exercise

C Fundamentals

Lesson Description

The "File I/O Exercise" Lesson is part of the full, C Fundamentals course featured in this preview video. Here's what you'd learn in this lesson:

Students are instructed to use malloc and char* when declaring the buffer. The free function should also be added to avoid memory leaks. Finally, the program needs error handling for the open, stat, read, and malloc operations. Use the Linux man pages for error handling examples.

Preview
Close

Transcript from the "File I/O Exercise" Lesson

[00:00:00]
>> Richard Feldman: Let's go through the part four exercise. So we're gonna start off by building and running the program, as we've been doing, cool. So you might notice that we're printing out what appears to be a gigantic index.html file, actually two of them. So it says index.html contents and then blah blog/index.html contents.

[00:00:22]
If you look inside the exercises folder, there are a couple of extra things in here. So there's this index.html. This is like, in addition to the all the dot c files, and it looks like this says, like, Hello World. This through slash page. In the future, when we get the actual web server running, we're gonna literally be like, serving this to a browser.

[00:00:39]
But for right now it's just hard coded in our exercises directory. And then we also have a blog directory which has an index.html inside of it, which just says, this is a blog. And basically the only purpose of these two things is just to be able to verify that like our parsing and translating things into paths is actually working correctly.

[00:00:56]
Cool, so, yeah. Let's build and ran the program. Let's take a look at what this is doing. So this part right here, this two path, this is gonna look familiar. This is basically just like, if you take what happened at the end of the solution for part three, this is what's there.

[00:01:12]
So this has the like the refactor that we did. So all of this is just stuff that we've already seen before nothing new to see there. We just so still needed it because, we're building up to a static web server. Now we're going to have this new thing called print_file, which is basically like takes a path.

[00:01:27]
And this path is gonna be parsed out of the request from earlier. And then basically we're going to go through this whole process that we just learned about. So do the open to get the file descriptor, make the metadata struct, call fstat to populate it. Right now, by the way, you might notice that we are not handling any errors here, and we're gonna see why by the end of the exercise, we've chosen not to handle any errors.

[00:01:49]
And here we say, change this charge star to and malloc, sorry. Change this to charge star and malloc. So right now, this is allocating metadata, st_size + 1, which as we saw from the slides, this is not safe. This could very easily blow the stack, and if we want to avoid that, malloc is one way to do that.

[00:02:05]
And by the way, don't forget to handle the case for malloc returns null, which is what malloc returns when there's an error. So make sure that we handle errors, because malloc can fail to and also don't forget to call free on it at the appropriately making sure not to use use after free.

[00:02:19]
Freeze, or double freeze, or anything like that. Just call free at the correct times so that this will free this actual memory after we malloc it, okay. And then finally, after we've done that, then just go back and add error handling for all the cases that could happen.

[00:02:35]
So if you want to see some documentation for these different functions, this is like you. A little bit of a your first taste of, like, C documentation. But I do think that learning how to read C Docs is actually one of the most valuable things you can take away from this course.

[00:02:48]
Because if you're trying to do, interop with another language, or you're trying to, get a little tiny bit of C involved in a really performance critical section of something you're doing at work. This is, kind of the way to do that. And so these links will give you sort of a little sense of what that can look like.

[00:03:10]
>> Richard Feldman: Okay, before I go through the solutions for this, let me just start by just bringing this up on the screen so you can kind of see this for those who are like not reading this as we go along. This is the the actual docks for the the open function.

[00:03:22]
So this is the Linux version of this. In part six we're actually gonna see that, although almost all the time the versions of these things are identical for our purposes. At the very end of the workshop, we are gonna encounter a case where there actually is a slight and very important difference between how a particular function works on Linux versus on how it works in BSD, which is the same MAC OS.

[00:03:43]
And we actually are gonna have to account for that difference, but for the rest of the workshop, you can just get away with looking for either the Linux docs or the BSD docs. And I think the Linux docs are a little easier to read, but basically, like this is a pretty representative example of like, what it looks like to read docs for Linux, like C functions.

[00:04:03]
You can just kind of search for these on the web and just like open and then like maybe Linux or something like that, or even just like open and then see, and you'll get a lot of results from this website. This is man7.org. And basically, here's what it looks like.

[00:04:15]
You've got a variety of related functions. They're sort of all defined in the same place. They're whoever wrote the docs decided to organize them this way, so here's open. Basically, all of these things are things that we have at some point seen. So const char, we haven't seen.

[00:04:29]
I mean we saw const, but in a different context where it was sort of out at the top. And again, this just means, basically, you're not supposed to write into this. This is the pathname to open. So this is a char star, so terminated string int. We know about that one.

[00:04:42]
That's the flag. So this is, like the read only flag that we put there, the dot dot dot basically means Varg. So it means that you can optionally send additional arguments. And then this is just a little comment saying that, like, hey, there's like, one var arg that you're allowed to send, and it should be one of these mode t things.

[00:04:57]
Mode_t is a struct that. That specifies, or actually, sorry, it might be flags, not sure, but the type for this is defined somewhere else. But this is basically like if you want the, yeah, some additional configuration around how you want to open the file. Then there's other functions like create and open app and open app two and stuff like that.

[00:05:17]
We don't need to worry about these, but the point is just that like this is our first example of. Now we've learned enough that we can actually start reading C documentation, which opens up a whole world of exciting things to us. All right, having said all that, let's go ahead and go through these two things.

[00:05:32]
So, [COUGH] Okay, first one is we're gonna change this to char star and malloc. So we're gonna say char star buff = malloc and we're gonna get an error at first because this is not imported. We're gonna import stdlib.h. Actually, let me just go ahead and do that, right there.

[00:05:49]
Okay, stdlib.h. Great, and jump back there. Okay, and how much do we want to malloc? So malloc takes as its one argument, how many bytes do you want to allocate, which is essentially the same exact things we have in there. So metadata.st_size + 1. As usual, we got the plus one because we want to have a null terminator in there, because this is going to be a string that we're gonna write out.

[00:06:13]
Okay, great, so we've got this thing right here. It's giving us a little error, because I have both of these defined at the same time, but. So that's all we're gonna do there. So hint one, don't forget to handle the case where malloc returns null. Thanks very much.

[00:06:24]
So let's go ahead and handle that real quick. If buff double equals null, then we're just gonna return without printing anything. So this will essentially mean that make sure that we're not causing any problems with like read trying to write into null right there because that would immediately cause all sorts of explosions.

[00:06:42]
And then, of course, don't forget to free buff later to prevent memory leaks. So right down at the end here where we're closing the FD, let's free that. Great, okay, looking good. All right, there's that. Let's run the program again. And sure enough, everything looks great. We are now printing both of these files and we're doing it with malloc and free.

[00:07:06]
I don't see what all the fuss is about. That was no proble, we just added a little free in there. All right, and now finally let's go back and add error handling 'cause we omitted that earlier, so let's go ahead and just do that. Here are the docs for all these things.

[00:07:17]
So if you look at the docs for open, you can see that. It will say, hey, if FD is -1, then that means an error happened. And again, normally, we would like, handle this more gracefully. Print out, hey, no, I couldn't open this path, not found or something like that.

[00:07:34]
Maybe look at what specific error had happened. But for our purposes, we're just like, yeah, just return no problem. So that's that one. Then we have stat which can also fail sorry, fstat. So gonna say, if this is w = -1 here, we're not bothering to introduce a variable because we're, we would only read it for error handling anyway.

[00:07:54]
So just go ahead and do that. Okay, all right. And read, okay, where is read there. It is great. And now I can just say if bytes_read w = -1 return and malloc, or is that one? We already did malloc, right? Yeah, great, okay, yeah, we can print of that an error handle, but I'm not going to bother.

[00:08:19]
All right, let me run the program again and we're all good because none of those error cases even came up and we're all done and everything is perfect and I didn't make any mistakes, right? Anyone see anything I didn't do correctly? Here's a question. Let's suppose that I'm coming along here and I'm like, I called malloc here and remember malloc.

[00:08:38]
Does a long lived allocation on the heap that outlives the function, and then it's very important that we make sure to free that, otherwise we have a memory leak. So let's suppose we're coming down here, and we get down here and we're like, cool. We freed that. We didn't forget very good.

[00:08:53]
Did anyone notice that we didn't free it? Nope, we had a memory leak. [LAUGH] This is the problem, if you don't have any conditionals, yeah, sure, it's no problem. It's just free, no big deal. The problem is, real programs actually have lots of conditionals, and you have to remember to do that.

[00:09:14]
Here's another question. This one will be a little bit easier. Do we always close the file?
>> Speaker 2: No.
>> Richard Feldman: Nope, [LAUGH] yeah, what if bites red was negative one?
>> Speaker 2: We did an early return.
>> Richard Feldman: We never reached this line. So yeah, we needed to close there too.

[00:09:30]
What about up here? Okay, if buff = null, we don't need to free it because it was null, so it didn't get allocated in the first place. But we did actually need to close the file descriptor there. What if this happened? Yeah, I also need you to close the file descriptor there.

[00:09:41]
But here we need to not free because we haven't done the malloc yet. So, and then again here we don't need to close the file descriptor because this is for equals one which means that there was no file descriptor. This is the problem. This is why it's so easy to make these mistakes is because we're not usually accustomed to thinking in terms of yeah, it's just this chore, right?

[00:10:00]
We're like, open it. And then just remember to close it at the end. You're done. You malloc it. And then just remember to free at the end and you're done. It's not actually that simple because in real world programs, you do have all sorts of conditionals going on.

[00:10:11]
You have early returns. You have things that like only happen sometimes and other times don't happen. And you have to remember to close things in just the right spot, like close your file descriptors in just the right spot. And to free your malix in just the right spots.

[00:10:23]
And again, as we talked earlier, if you don't remember to free at just the right time, you get a memory leak. If you free too early, you get a use after free, which is way worse than a memory leak. And if you free multiple times, and we are calling free, multiple times in this function, then you get a double free which is even worse than a use after free.

[00:10:40]
This is how C gets a reputation for its memory on safety, causing problems deservedly. So, because this does really cause problems. Now one, right, you can go with this is you can go with rust, which basically, you don't have clothes entry in rust and rust. It just takes care of all these things automatically for you with plenty of other trade-offs, if you want to learn all about that, I have a whole course on Rust that you can watch right here on Final Masters.

[00:11:03]
But it's also worth noting that, again, if we choose not to use malloc in the first place. If we're going to do our chunked reads anyway, and we're gonna just do them on the stack, then I don't need to worry about any of this. I don't need to do the free thing because I have chosen to organize my program in a way where I didn't have to use malloc and therefore I don't have to use free and the entire category of bugs just goes away.

[00:11:26]
Granted, other trade offs happen, but I get better performance, and I don't have to manage these lifetimes and worry about what I'm like on the hook to close it or free at just the right time. Unfortunately, when it comes to file descriptors, there's kind of no way around it, like you just do need to remember to call close at, like, just the right times that if you mess that up.

[00:11:44]
Then you have a file descriptor leak on the plus side, file descriptor leaks are generally not they tend not to be as consequential as memory leaks, although, granted. It can be kind of annoying if you have a very long running program, but eventually you run out of file descriptors.

[00:11:58]
That is annoying. One other note about this is that this right here, is actually one of the major things that the sort of C successor languages that are not in the like c++ or rust family. So this is like Zig, Jai and Odin. I believe all three of them do this is that they have a keyword called Defer.

[00:12:16]
And basically what defer lets you do is that I'm just gonna make up syntactic because c does not have this is you basically say something like this, defer. I guess that would probably be down here, actually. So C, if it's negative one, if so early return. And then basically what defer means is basically like at the end of the function, whenever the function returns, whether that's just because the function naturally ended.

[00:12:37]
Or because we did an early return, just make sure you always do this close fd. And then now you wouldn't need this, and you wouldn't need this, and you wouldn't need this, and you could do the same thing for free if you wanted to. Now, granted, that only works if, like you're doing this inside the scope of this one function, like, if that works fine for closing a file descriptor.

[00:12:52]
But if the whole point of the malloc that you're doing is that you want to return. The thing that it malloced from the function then defer doesn't really help you, whereas, like the things that rust does do help you. But it is worth noting that, especially for things like file descriptors, this is a major ergonomics improvement that these languages give you.

[00:13:10]
And I have heard talk about people trying to add this to C, or, like, writing proposals to do that, but C being a 50 plus your old language. New features don't just get added on a whim. [LAUGH] it's like a long process to get things through the standards.

[00:13:22]
So who knows if we'll ever actually see a deferred keyword in C, but fortunately, these other sort of C improvement languages already do have that.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now