
Lesson Description
The "Prompt Engineering for Images" Lesson is part of the full, Open Source AI with Python & Hugging Face course featured in this preview video. Here's what you'd learn in this lesson:
Steve discusses the impact of prompt quality, including negative prompts, on the image generation process and explores options like attention slicing for balancing quality, speed, and memory usage.
Transcript from the "Prompt Engineering for Images" Lesson
[00:00:00]
>> Steve Kinney: But the interesting part is, you know, you see those people on YouTube doing prompt engineering, right? It's like, I figured out how to write a prompt that does this. And I think that truly worked two years ago, but now, it turns out, with a lot of this stuff, it does matter a little bit more.
[00:00:22]
In fact, I'm gonna spoil a future slide. What's really interesting about this is that you not only have support in the library for prompts, you have support for negative prompts, right? And if you think about it in terms of the math piece, it's like, what is this not? Yeah, not this, right? And that will pull the actual statistics toward where you want them to go.
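If you want to see where that "not this" lives in the math, it's typically classifier-free guidance: the negative prompt's embedding stands in for the empty, unconditional prompt, so each denoising step is pushed away from it. A sketch of the guidance formula (my gloss, not a slide from the course):

$$\hat{\epsilon} = \epsilon_\theta(x_t, c_{\text{neg}}) + s \cdot \big(\epsilon_\theta(x_t, c_{\text{prompt}}) - \epsilon_\theta(x_t, c_{\text{neg}})\big)$$

where $\epsilon_\theta$ is the model's noise prediction and $s$ is the guidance scale; the subtraction is literally what pulls the statistics away from the negative concept and toward the prompt.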
[00:00:42]
And so "what you don't want" is actually a field that you can fill out, and it will actually have an impact here. This is the recommended kind of stack of things, right? And you don't have to do it.
[00:01:00]
There's nothing mandatory here; if you do not do this, you'll be okay. But this is the recommended way to structure a prompt: what do you want, in what style? If you've ever used Midjourney or something like that, you definitely see this in all of the examples, so on and so forth, right?
[00:01:23]
Yeah, so: what subject do you want? Then the style, details, environment, composition, and lighting. I truly do believe that the composition and lighting part works because it's very easy to get the camera data, for the lens at least, and stuff like that.
[00:01:39]
That is almost certainly in the metadata on all these training images. So as much as you're like, that's not real, it's probably more real than some of the other pieces. And yeah, I don't know that "masterpiece" gets you anything better. But I was reading the docs, so who am I to argue? Sure, it works.
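As a sketch of that subject / style / details / environment / composition-and-lighting recipe, a structured prompt might be assembled like this (the wording is my own illustration, not from the course notebooks):

```python
# Hypothetical prompt, assembled in the recommended order.
prompt = (
    "a pygmy hippo wading in a river, "            # subject
    "watercolor illustration, "                    # style
    "soft pastel colors, visible brushstrokes, "   # details
    "misty jungle at dawn, "                       # environment
    "wide shot, shallow depth of field, "          # composition
    "golden hour lighting, 35mm lens"              # lighting / camera data
)
```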
[00:02:00]
And then we have those negative prompts, right? The concepts you don't want, like no watermarks, you know what I mean? That stuff actually does work; the model will pull away from those concepts during the denoising process.
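Here's a minimal sketch of what that looks like with a diffusers text-to-image pipeline; the model ID and prompt strings are just examples, not the course notebook:

```python
import torch
from diffusers import StableDiffusionPipeline

# Example model ID; any compatible text-to-image checkpoint works here.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# negative_prompt is a real argument on the pipeline call: the sampler
# steers away from these concepts at each denoising step.
image = pipe(
    prompt="a pygmy hippo wading in a river, watercolor illustration",
    negative_prompt="watermark, text, blurry, low quality",
).images[0]
image.save("hippo.png")
```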
[00:02:20]
Now, there's not necessarily a concept of that in at least some of the older transformer models. I wonder whether, with the reasoning steps in the newer reasoning, thinking models, maybe that process does something a little bit more impactful. But in the older ones, it's not built in; for these image ones, it is. And like I said before, especially if we are seeking to do this on, like, the free tier or whatever, there are some trade-offs.
[00:02:48]
And all these things come at some kind of cost, right? If there were no cost to making it faster, that's just how it would always be, right? So we have this idea of a scheduler, and there are other ones you can swap in from Hugging Face. And there's this attention slicing, right?
[00:03:07]
It does the attention computation in kind of chunks. It might be slower, but you use a lot less memory. Do you want to offload idle parts to the CPU to reduce the VRAM usage, and stuff like that? You can play around with all these things and see how it works for you. I did play around with them, at great expense to my mental health, in these notebooks.
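A minimal sketch of those knobs in diffusers, assuming the same kind of text-to-image pipeline as above (the model ID is an example):

```python
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Swap in a different scheduler: same model, different denoising strategy.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# Compute attention in chunks: a bit slower, much lower peak VRAM.
pipe.enable_attention_slicing()

# Park idle submodules on the CPU to reduce VRAM (requires accelerate).
pipe.enable_model_cpu_offload()
```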
[00:03:32]
So you can kind of see what I did. I will be honest with you: some of it was me changing numbers until I got what I wanted. I will not say that there was a ton of science going on all the time. There are tons of different models you can choose from.
[00:03:49]
Most of the time, I chose just for speed. Again, I'm working under the constraint that we don't want to sit here forever waiting for things, so I sometimes chose the cheapest, fastest model. But if you're like, I am totally okay going downstairs and making a sandwich while I wait for this, you don't have to make all the choices that I did.
[00:04:11]
Actually, for a few of them, I have dropdowns where you can run it with a different model; you can swap the models in and out. That's the nice part of the Hugging Face SDKs: working with those models is not always the same, but they put that abstraction over it.
[00:04:26]
So you just switch out the string for the model and it downloads a different one, and it's great. So let's go play with it. I will say, if your goal is to read every line of code in these, it is a little rougher, right? Because there are just a lot more knobs to tweak, and a lot of stuff like that, that we did for the optimization in this case.
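As a sketch of that switch-out-the-string idea, diffusers' auto pipeline picks the right pipeline class from the model ID alone (the IDs here are examples):

```python
from diffusers import AutoPipelineForText2Image

# Changing only this string downloads and wires up a different model;
# the call below stays exactly the same.
model_id = "stabilityai/sdxl-turbo"  # e.g. "runwayml/stable-diffusion-v1-5"

pipe = AutoPipelineForText2Image.from_pretrained(model_id)
image = pipe(prompt="a pygmy hippo in a misty jungle").images[0]
```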
[00:04:51]
But the reward is really up there too, right? Like, seeing a quote that looks like a quote is cool. It's not pygmy hippo cool.