Open Source AI with Python & Hugging Face

Parameter-Specific Fine Tuning

Steve Kinney
Temporal

Lesson Description

The "Parameter-Specific Fine Tuning" Lesson is part of the full, Open Source AI with Python & Hugging Face course featured in this preview video. Here's what you'd learn in this lesson:

Steve explains the concept of low-rank adaptation, where a small set of extra layers is added and tuned on top of a model to achieve significant results with minimal effort. This approach allows for easy customization and adaptation of models for specific tasks without the need to store entirely new models; the added layers act as plug-ins that enhance functionality.


Transcript from the "Parameter-Specific Fine Tuning" Lesson

[00:00:00]
>> Steve Kinney: LoRA is like if you took that textbook and you shoved a bunch of sticky notes on the pages with some extra context, but you didn't rewrite the whole book. You're only taking a small set of layers and fine tuning those. That's low-rank adaptation, the part I forgot earlier, where you don't touch most of the knobs, right?

[00:00:21]
You add a few tiny extra layers on top and you just tune those, right? That seems like it shouldn't work. Turns out it works like 90% as well, right? It's one of those much-to-everyone's-surprise kinda things.
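
Here's a minimal sketch of that idea with Hugging Face's peft library; the rank, alpha, and target module values below are illustrative placeholders, not numbers from the lesson.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model -- none of its original weights get touched.
base = AutoModelForCausalLM.from_pretrained("gpt2")

# Describe the tiny extra layers (low-rank adapters) to bolt on top.
config = LoraConfig(
    r=8,                         # rank of the small adapter matrices
    lora_alpha=16,               # scaling applied to the adapter output
    target_modules=["c_attn"],   # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()   # only a tiny fraction of the knobs are trainable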

[00:00:43]
It's the same kind of law that gave us JavaScript becoming the most popular programming language in the world. If you'd asked anyone to predict that in 1995, you wouldn't have gotten a lot of takers. Turns out it works, and so basically, just taking a small set of extra layers and tuning them will get you most of where you need to go.

[00:01:05]
And this is what we're gonna do in a second. We're gonna take GPT-2, we're gonna take a dataset, we're gonna train it on that dataset, and we're gonna see what we get for the amount of effort we put in. Is it gonna be perfect?

[00:01:18]
It's not gonna be perfect, but for the time spent doing it, it's very good. So yeah, we take a relatively small and very specific dataset, right? Again, a chat tone, a style, something like that. And effectively, hey, the pattern of the first few words is gonna be the same, right?
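
Roughly, the setup looks like this. Abirate/english_quotes is a stand-in quotes-style dataset from the Hub, not necessarily the one used in the lesson.

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token   # GPT-2 ships without a pad token

# A quotes-style dataset: each row has an author and the quote itself.
quotes = load_dataset("Abirate/english_quotes", split="train")
print(quotes[0])   # {'quote': '...', 'author': '...', ...}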

[00:01:45]
And we're gonna hit it with 16,000 examples, which is not a lot if you think about it, on any scale. Maybe 14,000, 16,000 examples. Each one is just a name, a colon, an opening quote, some words, and a closing quote, right? And we hammer it with just those 16,000.
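
Continuing the sketch above, each example gets flattened into that name-colon-quote pattern before tokenizing; the exact template and the max_length value are assumptions for illustration.

def format_example(row):
    # Assumed template: Author: "quote" (strip any quote marks already in the text)
    quote = row["quote"].strip('“”" ')
    text = f'{row["author"]}: "{quote}"'
    return tokenizer(text, truncation=True, max_length=64)

tokenized = quotes.map(format_example, remove_columns=quotes.column_names)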

[00:02:07]
That's not a huge amount. And we'll see that if we give it those initial first few characters, it will mostly stay in line, right? And we're not retraining the entire model. We're simply adding a few layers of extra knobs on top of it, and we'll see that we get most of the way there. Full fine tuning will still outperform it most of the time.
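
A sketch of that training-plus-check loop, again with placeholder hyperparameters rather than the lesson's exact settings:

from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

args = TrainingArguments(
    output_dir="gpt2-quotes-lora",
    per_device_train_batch_size=8,
    num_train_epochs=1,
    learning_rate=2e-4,
)

trainer = Trainer(
    model=model,          # the peft-wrapped GPT-2 from earlier; base weights stay frozen
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Give it the first few characters and see whether it stays in the pattern.
prompt = tokenizer('Oscar Wilde: "', return_tensors="pt").to(model.device)
out = model.generate(**prompt, max_new_tokens=30)
print(tokenizer.decode(out[0], skip_special_tokens=True))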

[00:02:38]
But it's always one of those trade-offs, okay, maybe it's 10% better, but it costs you 28 times as much. I don't know, sometimes that's the right answer, and if so, right on. And the nice part, though: when you fully fine tune a model, let's say you took a model that clocked in at 48 gigabytes and you fine tune it, guess what?

[00:03:04]
You have a new 48 gigabyte model. With these extra layers, you only have to store the extra layers. You're putting a hat on top of GPT-2 that fine tunes it to do what you want. And then you can put that hat on something else, or you can take the hat off, right?
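
In peft terms, saving the fine-tuned model from the sketch above writes out only the adapter, not another copy of the base weights:

# Saves only the adapter weights plus a small config file -- megabytes,
# not another multi-gigabyte copy of the base model.
model.save_pretrained("gpt2-quotes-lora")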

[00:03:24]
It's just an extra plugin. It's effectively plug-ins for a model, right? And so you can fine tune it way cheaper, way faster. You don't have to store an entirely new model. You just have an extra layer, an adapter, on top of it to do the thing that you want to do.
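
And the plug-in analogy is fairly literal: you load the base model, then attach or detach the adapter on top of it. A sketch, assuming the adapter saved above:

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("gpt2")          # plain GPT-2
tuned = PeftModel.from_pretrained(base, "gpt2-quotes-lora")  # GPT-2 with the hat on

# Take the hat off again: inside this block, it behaves like plain GPT-2.
with tuned.disable_adapter():
    pass  # generate here with the adapter disabled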
