Transcript from the "Files & Databases Overview" Lesson
>> Jem Young: My professor in college, my favorite professor, Dr. David Gibson, from my university. He once told me that 90% of what you do as a software engineer is writing things into a database and reading things out of a database. And having been a software engineer for many years, I know that [LAUGH] is absolutely true.
[00:00:19] When you think about what your job really is, it's mostly reading stuff and writing stuff out of a database. But especially as UI engineers and web developers, we don't give a lot of thought to how that data is getting saved, where it's going, where it should even live.
[00:00:34] So when we talk about saving data, what's the easiest, simplest way of saving something? Persisting it over time?
>> Student1: Write it to a file?
>> Jem Young: A file, [LAUGH] yeah. A file is the simplest way of saving some data somewhere. What's the problem with using a file? Why can't we just save everything to a file?
>> Student2: There's overhead on reading and writing. And it's hard to manage the data.
>> Jem Young: Yeah, but a bit more, why is it hard to manage the data?
>> Student3: Cuz it's not organized the way that you might need it.
>> Jem Young: Yeah, we could fix that, though. That's absolutely right.
[00:01:16] But we can organize the data and save it in such a way that it makes sense to us. What's the problem?
>> Student3: You might only need one word out of several pages. And you have to read the entire file in order to get access to that one word.
>> Jem Young: Yeah, files are not optimized to be read in the middle, in the beginning, and the end. We can, but they're not built for that. And at the speed we're talking about, we're talking about billions of transactions a day. We need something that's optimized to do that. But I'd say the biggest problem with using just a plain, pure-flat file for saving data, it's not portable.
[00:01:52] It's only gonna live on one server. And let's say I have ten servers, do we all write to one file at the same time? That would be hard, because writing to a file requires writing to a hard disk, and hard disk is like, not like, it's the slowest level of caching.
[00:02:07] And if you think of writing data as some sort of caching as persisting data, there's gonna be your CPU cache that's gonna be the fastest. You're gonna have your memory cache. You're gonna have other layers of caching. And the slowest is gonna be a hard drive cache. So writing from a file is gonna be slow.
[00:02:21] And also, persisting that across thousands of servers, it's just not scalable. So we need some sort of data platform that is built specifically for saving data and information. That's why the database was invented. It gives us a structured way of saving data in such a way that it's readable and writable by all parties.
[00:02:42] And it just makes sense. When you're talking big data, so we're talking petabytes of data, you have to structure that data in a way that it's queryable. We can't just save things to a hard drive somewhere, and hopefully, it works. And hopefully, I can run queries on it.
[00:02:56] You have to do it in a way that makes sense.