Build AI-Powered Apps with OpenAI and Node.js

Generating Images with DALL·E

Scott Moss

Scott Moss

Superfilter AI
Build AI-Powered Apps with OpenAI and Node.js

Check out a free preview of the full Build AI-Powered Apps with OpenAI and Node.js course

The "Generating Images with DALL·E" Lesson is part of the full, Build AI-Powered Apps with OpenAI and Node.js course featured in this preview video. Here's what you'd learn in this lesson:

Scott adds an additional function to the application, which allows the AI to generate an image based on a prompt. The function uses the DALL•E OpenAI API to generate the image and return the URL.

Preview
Close

Transcript from the "Generating Images with DALL·E" Lesson

[00:00:00]
>> Yes.
>> Does it mostly base that decision on what function to run, if any, based on whatever description you give to that function?
>> Yeah, this is when to run the function. This is the description on how to calculate this parameter. So yeah, there's nothing stopping you from making that.

[00:00:23]
Again, it's like, ask a friend, [LAUGH] right? And then you can just put something in here as a description as a function like, use this function if this question feels hard. If you need help asking this question, run this function, and then it'll ask another AI, right, or versus DALL.E.

[00:00:41]
DALL.E generates images, right? ChatGPT just got released with DALL.E support. How do you think they did it? They did this, they just made a function here that says, generateImage, and it takes in a prompt, Right, and it just calls the DALL.E API, which, I mean, we can try it.

[00:01:00]
Let's see. Let's see what happens. We can say, result = await openai.images.generate(). What does this take? Does this take a prompt? I don't even know what this thing takes. Body, I think it takes that thing. Yeah, okay, so you can do that. I don't know what the result is gonna be, let's just log it.

[00:01:34]
Cool, and I'll just respond back with nothing for now. So then I could do that, right? I can take this whole object, ooh, I can take this whole object here, make a new one. Okay, it didn't copy the way that I thought it was going to, I'll just do the whole thing.

[00:01:53]
And, I put it in the wrong place, didn't I?
>> One more bracket down.
>> Yeah. There we go. So I can call this createImage. And I'll just give this the same name too, or generateImage. Okay, let's do that. generateImage, and this is a, Create or generate image based on a description, right?

[00:02:28]
And then I'll take a prompt, it's a string. And I want the description of the image to generate, and then this needs to be prompt, there we go. So I don't know, this might work, [LAUGH] let's see. So then now I can go down here and I can say, I want an image, or create me an image of a cat flying through space.

[00:03:11]
See what happens. I don't know what that is. It looks like it just tried to create some markdown. There it is. It did it. Damn, [LAUGH] I was not expecting that. Let's go see it. Wow, [LAUGH] it's kinda crazy. It's just responded back with markdown, because obviously I didn't respond back with the image, right?

[00:03:38]
I just put an empty string here. So it's like, I don't know how to respond back, but we logged it, right? So, Card?
>> So it's generating that prompt, which is just text that it's in, passing to it? So is it pulling the exact same prompt that you gave or is it creating its own prompt based on-

[00:03:57]
>> That's a good question, let's see.
>> Yeah.
>> Let's check it out. So let's look at the prompt that it sent in. I don't know, I put temperature on 0, so I'm guessing it's the exact same thing. If I change the temperature, I bet it'll get crazier.

[00:04:09]
So let's log the prompt. And actually, let me send back the result. What was the result? It was an object with the data in an array. Let's send that back, so that way you can respond back with it. So result.data[0].url. There we go. Let's try that. So, same thing, image of a cat flying through space.

[00:04:44]
Yeah, the prompt was a cat flying through space. And yeah, here's what I got back. Yeah, pretty crazy, the reason why it's not responding back here is because JavaScript, right? I'm trying to do an async. I'm trying to do an await in a while loop. So it's not waiting for that to finish.

[00:05:06]
So the completion response finishes before the API call to the image finishes. So there's a race condition. So yeah, it's just like not doing it, right? So yeah, that's pretty crazy. [LAUGH] Yeah.
>> This is more of a question from the calculate example. But does the model reprocess the output of the function call?

[00:05:29]
For example, will it return whatever calculate said without double-checking?
>> Yes, it will return. It won't do anything you don't ask it to do. So if you didn't ask it to check it, it won't. What that person is describing is more of what an agent is, where it can think for itself, where it can reason about the result of a function and determine if the result was good enough.

[00:05:58]
And if not, it will respond back with, no, I gotta call another function, that didn't work out. Or in this case, it's not thinking about it. It just assumes whatever the result of the function is the answer. But you could write some prompt engineering like, hey, if the results of the functions that you might use come back is not helpful, then I need you to say they're not helpful.

[00:06:17]
Or could you ask it again with a different parameter? The possibilities are endless of how you could do it, so yeah, yeah?
>> Can the AI read back the elevated privileges that's needed on a database or active directory? For example, Well, can it find out the permissions and return each item to the prompt or are permissions are loaded by default and it can tell you what it has so far?

[00:06:53]
>> The AI can do whatever your function allows it to do. So if you have a function that does all that and exposes that and you gave the AI the ability to call that function, then yes, it can and probably will do that. [LAUGH] So don't write a function that can do that.

[00:07:11]
I mean, I've seen people add authentication to prompts. It's insane. So it's like, yeah, we're gonna teach the AI about these functions it can run. But one of the arguments that the function must have is a password, or a secret, or something like that that you have to add to your prompt in order for this to run.

[00:07:33]
So therefore, only people with this token can run this prompt that then runs this function, and the function, you'll get a token here, right? You'll get a token, and then you'll check that token to see if it's valid, and only then will you run the function, right? So I've seen [LAUGH] people do that, I'm like, this is kinda insane, but, yeah.

[00:07:54]
>> Maybe they should authenticate out of band of the AI call.
>> Yes, that's probably, yeah, exactly, don't show this view to someone who's not authenticated, right? But yeah, I've seen people do that, or they'll add the authentication here to the system message, right? It's like, here's the authentication for the logged person, they need this for function calling or something like that.

[00:08:15]
I've seen people do that. I don't know, it's crazy. It's a crazy world we're living in, I'll say that.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now