Episode Transcript
0:19
So we have guests with us today. Go
0:21
ahead and introduce yourselves
0:23
I'm Ben Glickstein. I am
0:25
a filmmaker, immersive
0:27
artist, immersive and interactive artist.
0:29
And Doug and I run
0:31
the Art of AI podcast and
0:34
soon to be newsletter. And we
0:37
are both filmmakers and we both
0:39
got into AI from the art
0:41
perspective through Stable Diffusion and
0:43
ChatGPT, things like that. And,
0:45
Just are looking at AI
0:49
and what it means
0:51
for artists and what that means
0:53
in terms of both a workflow for
0:56
them and what actually is
0:58
art at this point since now we can just
1:00
put in a word and it spits out 300
1:02
images that look great.
1:05
Nice. I am definitely curious to hear
1:07
your thoughts on that, Doug.
1:10
Thanks so much for having us. I'm Doug Carr and
1:12
I'm a filmmaker and entrepreneur
1:15
and do a lot of different stuff in tech.
1:18
And most recently
1:20
getting deeper down the rabbit hole of
1:22
AI. And co-hosting the
1:24
Art of AI podcast with Ben. So
1:26
yeah, just really excited to chat with you about
1:29
prompts and how AI
1:31
is beginning to reshape everything we do.
1:33
Yeah, definitely. So before
1:35
we get into that, and I'm very curious
1:37
cuz you're the first artists that I've had
1:39
on. How did you both start learning
1:42
about prompt engineering?
1:45
So I would say for me it goes
1:47
back many years. I
1:49
want to, if I'm gonna put a number on it, I'd say
1:51
maybe seven or eight. When AI
1:53
systems were just starting
1:56
to have chat
1:58
bots that you could interact with, I got
2:00
really excited about the idea of storytelling.
2:03
with AI. And,
2:05
at that time these
2:07
LLMs were so limited in
2:10
terms of what you could do with
2:12
them. And it was a frustrating
2:14
exercise. But fun, like interesting.
2:16
And it just felt like basically I
2:19
would try to interact with the
2:21
AI and it would give me some vague,
2:24
barely cohesive
2:27
language. But even like even
2:29
at that point I was like, oh, but there's two
2:31
sentences here that I can pick out. And that's
2:34
actually really interesting. And it's, just
2:36
as a kind of brainstorming exercise,
2:39
it would just trigger
2:41
things in my head that
2:43
would help me with writing and Ben
2:46
and I have worked on a number of different
2:48
collaborative writing projects and for
2:50
film and video games, et cetera. And I
2:52
was always, let's just dip back into the
2:54
AI, let's see what it can do to help
2:57
prompt and do interesting things. And I think
2:59
at that point it was more like the kind of reverse of what we
3:01
have now, where it was prompting
3:03
us cuz it was so vague. And
3:06
it's only of course, Yeah.
3:09
Yeah. And now it's really flipped where, now
3:11
when we interact with the AI, it's so
3:13
much more organized together and
3:15
able to complete language so well
3:18
that, now you have to get smart at prompting
3:21
the ai and then it becomes this kind of wonderful
3:23
dynamic interchange back and forth.
3:26
in terms of text and imagery. But
3:28
yeah so that's where it started for
3:30
me. And obviously in the last
3:33
few months, it's been really
3:35
exciting to now be able to
3:39
feed the AI
3:42
story elements and then have it
3:44
begin to organize an entire arc
3:47
for how the story can be told. And
3:49
there's a lot of really crazy advancement
3:51
going on there. And it's really dynamic,
3:54
amazing process. Ben, you gonna,
3:56
you
3:56
Yeah. I actually think I started
3:58
with prompting from the visual side of it.
4:01
So I, like we, we
4:03
were, I remember Doug was playing around more with that,
4:05
and I like, touched on early GPT
4:07
and I was eh, I'm not like super into
4:09
this right now. But then I got into
4:11
Starry AI and Wombo
4:13
I think were like these early apps,
4:16
like before DALL·E was released to the public,
4:18
right? There were these early apps
4:21
that were coming out and they were doing like text to
4:23
image generation. And then three months
4:25
later, DALL·E 2 got released and
4:27
I got into the open beta for that and
4:29
that was off to the races for me. Like
4:32
I was literally, like literally the day
4:34
I left, I just moved to California a few
4:36
months ago and literally the day we
4:38
were driving across country, I got access
4:40
to DALL·E and I was just literally in
4:42
the car like half the time just prompting DALL·E
4:45
and having it spit out all this weird stuff.
4:47
And then I think by the time like we got settled
4:50
in LA, like Stable Diffusion started
4:52
to be a thing. I got really into that and that sort
4:54
of took over, and then ChatGPT
4:57
3.5 came out and I started playing
4:59
around with it and I was like, oh, okay, it's actually good. And
5:01
four came out and then I was like, oh, great. And
5:03
I literally signed up and paid for Plus
5:05
the day before four came out. So
5:08
it was. It's been working pretty good.
5:10
Yeah, and that's where I've been coming
5:12
from, and I've been using GPT-4
5:14
a lot to help me with code since
5:16
I'm not a native coder and I understand
5:19
a little Python, but like it's really useful
5:21
for like getting structures and stuff
5:23
that I just don't understand.
5:26
So how do you approach the very
5:28
iterative process of refining
5:30
and testing prompts to make them give you
5:32
what you want?
5:35
So for me it's not so much
5:37
like I focus less on
5:39
the prompts and more on the results. So
5:41
for me it's especially with that CodDev
5:43
prompt that I shared with you, that
5:46
has been really great because you can break
5:48
everything down into a structure. You
5:50
can then put that in like in Unity,
5:52
which is what I'm using for the current thing I'm working
5:54
on. You upload that into Unity
5:56
and you hit run and if it doesn't
5:59
work, it spits out an error and then you just
6:01
take that error and I put it in ChatGPT,
6:03
I'm like, yo, what's up with this? And it's ah, sorry.
6:05
It actually should be this. And I put that in. I'm
6:07
like, no, that's not right. And it's oh, sorry, it should be this.
6:10
And then, it like, it eventually works and between
6:12
that and like feeding documentation
6:14
to ChatGPT. Like
6:16
also looking up the errors and then being, oh,
6:18
this is what it's saying, since you're not connected
6:21
to the internet. Like, this is what I've
6:23
found is the best way to work with it.
6:25
That's for ChatGPT.
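For readers who want to see the shape of that "paste the error back in" loop outside the chat window, here is a minimal sketch against the OpenAI chat API. It is an illustration only: the guests do this by hand in ChatGPT, and the model name, Unity error text, and helper function below are placeholders, not their actual setup.

```python
# Hypothetical sketch of the error-feedback loop described above.
# The model name, error string, and prompts are illustrative placeholders.
import openai  # pip install openai

openai.api_key = "YOUR_API_KEY"  # placeholder

messages = [
    {"role": "system", "content": "You are an expert Unity/C# programmer."},
    {"role": "user", "content": "Write a MonoBehaviour that moves a surfboard with tilt input."},
]

def ask(history):
    # One round trip to the chat model; returns the assistant's reply text.
    resp = openai.ChatCompletion.create(model="gpt-4", messages=history)
    reply = resp["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

code = ask(messages)

# Paste `code` into Unity, hit run, and copy whatever error the console prints.
unity_error = "CS1061: 'Rigidbody' does not contain a definition for 'AddTilt'"  # example only

# Feed the error (plus any documentation you found) straight back, as described above.
messages.append({"role": "user", "content":
    f"Unity gives this error: {unity_error}\n"
    "Here is the relevant documentation I found: <paste docs here>.\n"
    "Please fix the code."})
fixed_code = ask(messages)
print(fixed_code)
```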
6:28
If I'm using Stable Diffusion, honestly,
6:31
I mostly go and, if
6:33
I have an idea, like I will
6:36
start with a prompt. And then
6:38
basically start doing image
6:40
to image stuff. A lot of it is, I
6:43
find prompting is a great start off point,
6:45
but I find most of the control comes
6:47
from the other tools they have, like image
6:49
to image and ControlNet. I
6:52
am, I'm of the mind that prompt is
6:55
a great start, but I don't.
6:58
I don't buy into prompt
7:01
refining as much since
7:03
I find that the jumping
7:05
off point is more what you need and then the other
7:07
tools come in and refine it as much
7:09
We haven't actually covered image to image or
7:12
ControlNet yet, so if you could explain
7:14
both of those for the audience, that would be great.
7:16
Absolutely. In Stable Diffusion,
7:18
image to image is when you take
7:21
a photo you have already created, or a
7:23
photo from a prompt, and bring it
7:25
into Stable Diffusion.
7:27
And it's using that as a guide
7:30
to create the new image. So you're going, Hey,
7:32
this is the basic layout of the image, but I want the
7:34
person there to have red hair or
7:37
something like that. And then it,
7:39
gives you something that's completely different. But
7:41
that's in theory how it works. And what
7:43
ControlNet does is it
7:45
basically creates a depth model of the image
7:48
so that you can then like isolate a person
7:51
to get them out of there
7:53
and Use that to
7:55
then create a much more controlled image.
7:58
And what people are eventually gonna be
8:01
doing is using this to create videos
8:03
and they already are. So you can
8:05
basically use that, take 24 frames of
8:07
a video, run it through that, and then pretty
8:10
cleanly have that done in there.
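For reference, here is a minimal image-to-image sketch using the Hugging Face diffusers library. It is not the guests' setup: the checkpoint name, strength, and guidance values are just common defaults, and ControlNet would be wired in separately with its own pipeline and a preprocessed depth or pose map.

```python
# Minimal image-to-image sketch with Stable Diffusion via diffusers.
# The init image guides layout; the prompt changes details (e.g. red hair).
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # commonly used checkpoint; swap for your own
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("portrait.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="portrait of the same person, but with red hair",
    image=init_image,
    strength=0.6,        # lower = stay closer to the original layout
    guidance_scale=7.5,
).images[0]
result.save("portrait_red_hair.png")
```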
8:11
That's one area that we're super excited
8:14
about is as this moves into the moving
8:16
image and as AI
8:18
gets better at dealing with elements
8:20
on a timeline, that's the next
8:23
big evolution from the art scene.
8:25
That's awesome. I haven't used it, but
8:27
I believe Midjourney has the same ability to
8:29
do image to image. Is that
8:31
correct?
8:33
You can feed Midjourney images through
8:35
the web, basically. No,
8:36
Gotcha.
8:37
But is it image to image or are you using
8:39
that to
8:39
not,
8:40
a person? Like
8:41
it's more, yeah.
8:42
right?
8:44
Yeah, exactly.
8:45
I think that's the thing that's driving a lot of people
8:47
from Midjourney to Stable Diffusion is
8:50
that you can have so much more control
8:52
over it. Whereas Midjourney, for
8:54
me it feels more like
8:56
Midjourney's a bit like I'm asking a genie for something
8:59
and there might be a little bit of a monkey paw situation
9:02
going on there. And like
9:04
Stable Diffusion, like while
9:07
it's definitely not as advanced as Midjourney
9:10
it's the level,
9:13
the rate that it advances at, I
9:15
find is really amazing, but
9:18
also the level of control you get
9:20
over it is a lot more
9:22
than you get with Midjourney, at least
9:24
in my mind. So
9:27
I think for an artist, Stable Diffusion
9:29
makes a lot more sense since
9:32
as an artist, you're looking for control over the image.
9:34
Yeah. The hand of the artist.
9:36
You're just, you're not generally like going, oh,
9:38
I need 300 images, and then just selecting
9:41
the 10 of them and putting them online
9:43
and going, I'm curating images from
9:45
Midjourney,
9:47
totally makes sense. Yeah, and that's a helpful
9:50
distinction between them. I
9:52
haven't even delved into Stable Diffusion
9:54
yet. I've only touched on Midjourney
9:56
a little, and mostly just text output
9:58
has been really what I've focused on. So
10:00
I think the comparison is between
10:02
Apple and Microsoft, or even Apple
10:05
and Linux and that Apple is really
10:07
great, but it's a walled garden, and
10:09
it puts out things that are really beautiful. They're
10:11
great. You don't have to play around with them at all.
10:14
Whereas Windows can actually do a lot
10:16
more. There's a lot more like
10:19
levers to pull, but there's a lot
10:21
more freedom in that.
10:23
Yeah. Not more control. Getting
10:25
back to the text-based prompting. Like
10:28
I, yeah, I think I totally agree with Ben with the,
10:30
basically the prompt is a jumping off
10:32
point, and "you are gonna act as an engineer,"
10:34
"you are gonna be Picasso," whatever you're going
10:36
for, is helpful. And it gives, I
10:38
think it gives the language
10:41
model a guide for trying to understand
10:43
who you are and what you're trying to do. So
10:45
it's a good place to start, but what I've found
10:47
is, Invariably, it's always
10:49
about how much you give it. If you
10:51
start off by feeding GPT-4,
10:53
for example, a paragraph
10:56
of information versus feeding it
10:58
a page or five pages,
11:00
you're gonna get a completely
11:03
different understanding of where you're coming
11:05
from. So like I always lean
11:07
towards, If it's a quick question, if I want
11:09
to just, a simple answer I'll obviously
11:12
just ask the question. But if I'm actually trying to craft
11:15
a project and have the project
11:17
be, have some intelligent discourse on
11:19
it and have, then yeah, no
11:21
question. I'll start with as much information as
11:23
I can to feed the LLM, and
11:25
then that conversation just becomes
11:27
so much more dynamic, nuanced
11:29
and specific which is incredible. And the
11:32
fact that you can do that and,
11:35
come back to something many
11:37
months or forever later, basically.
11:39
And the memories are all there is
11:41
amazing. I'm actually, one of the kind of pet
11:44
projects I've started up is
11:46
working on a novella with
11:48
my 10 year old son. And
11:50
we fed GPT
11:53
an idea and now
11:55
it's got a 10 chapter
11:58
structure that we've worked
12:00
up with it. And then we'll do, some
12:02
of the writing and we'll allow... we'll
12:04
get GPT to do some of the writing and then we'll see
12:06
what the credit's gonna be at the end of all this. I
12:08
think probably we're gonna... it's gonna be an
12:10
author. And, but it's really
12:12
such a dynamic experience and it's so cool for
12:15
him to see what, where this is going. I think so
12:17
many people, are, they're like,
12:19
oh God, but the children are never gonna learn
12:21
how to do anything. And it's yeah.
12:23
They're gonna learn how to interact really well with
12:25
AI and that's that, lean
12:28
into it.
12:28
Something I heard someone mention was
12:31
comparing it to the invention of the loom
12:34
is we're, we don't need to know how to weave anymore.
12:36
We can all just use the loom to create as many
12:38
things as we want. And I think that's just a great
12:41
analogy is, it's just another, it's
12:43
a next step in automation for humanity.
12:46
Now it's on a level like we never thought we'd see
12:48
in our lifetime, frankly, but.
12:51
It's that's what it is. And I
12:53
think like what
12:55
it's best at, in my mind is
12:58
providing structure to things. Like
13:00
Doug was saying, he has like this whole, the
13:02
whole structure of his book outlined in it. And
13:05
in a way it provides these rails
13:07
for you to go on, like when you're writing
13:09
or when you're like providing, doing game
13:12
dev or whatever. It just
13:14
provides like this really great structure that
13:16
you can go in and treat it like a coloring book
13:18
almost. You're like, oh, great, okay. Here's a three
13:20
act structure for a movie. Because
13:22
we've never seen a movie with a three act
13:24
structure before. Let's, we have to reinvent
13:26
the wheel every time, so
13:28
Three act structure. Who does that?
13:30
Yeah. Every single
13:32
movie, ever.
13:33
Yeah. We've,
13:34
Yeah. All these
13:35
we've already seen the endless Seinfeld,
13:37
but what happens when that's good,
13:39
nice.
13:40
Yeah, that's where it's heading. Yeah, and I think
13:42
that's what's cool about it. A business plan,
13:44
Like it's a pretty straightforward document
13:46
and I'm working on one with my fiance
13:48
and it's, yeah, GPT now knows
13:50
how to write a business plan as well as any
13:53
MBA, and it's just, the
13:55
cool thing is if you feed it all the information
13:57
that you're working from, it
13:59
can then tell you what you're missing.
14:01
It's once it understands the company you're trying to build,
14:04
it's gonna point out all the things that you're not
14:06
thinking about and that, that's pretty wild. That, that level
14:08
of kind of expertise it, and
14:11
it, it's also a little nerve wracking cuz it's like
14:13
this black box that hallucinates all
14:15
over the place.
14:16
I don't know how closely you guys are
14:18
keeping an eye on this, but I feel every few
14:20
days I see something where they're like, there's a new
14:22
language model that can run on your computer
14:24
and it's 95% as good as ChatGPT.
14:28
And it was only trained for $600.
14:31
Been three days and yeah. There's actually
14:34
a few different variations of an automatic
14:36
ChatGPT system that's connected
14:38
to the internet. There's 12 different versions
14:40
of it. And I've tested out,
14:42
I think two or three of them for
14:44
something really simple, just so I'm gonna make an
14:47
email list for this podcast and the mastermind
14:49
and all that. I want to know how
14:51
each of the free plans of all the usual
14:53
places, MailChimp and I don't even
14:55
remember the others, compare. So
14:58
give me a table with that. And God
15:01
Mode GPT I think was the one that
15:03
I tried for 45 minutes. It
15:05
just couldn't do it. And then I tried
15:07
another one. It was like, boom.
15:09
Oh, it was Google Bard was one I tried and
15:12
it gave me a table in about seven
15:14
seconds and it was wrong.
15:16
Like very visibly, obviously wrong. I'm like
15:19
that number is wrong. Are the other ones correct?
15:21
Yes, they're all correct. Okay. Gimme the table
15:23
again and it gives me the same wrong table.
15:25
I'm like, how about you fix
15:27
the number that's wrong? It's okay, it fixes
15:29
that one number. I'm like,
15:31
apologies. Yeah.
15:33
I had... I've had a completely
15:35
frustrating experience with Bard. I tried
15:37
it out and I was like, Hey, you're
15:39
Gmail, so I want you to organize my email
15:41
for me. And it's oh, I can't access
15:44
Gmail, but if you provide me
15:46
your username and password, I can log
15:48
in and do that for you. And I'm like, cool,
15:50
do it. And then it's great.
15:52
I'm like, Okay. I'm like, how many emails
15:55
do I have? And it's oh, you have 71 emails.
15:57
I'm like, I have like over 9,000
16:00
emails in my inbox. Like something like that.
16:02
So it just started
16:03
Over 9,000.
16:05
Yeah, no, I was like, what's the first email? And
16:07
it's oh, welcome to Bard. okay.
16:12
And then I'm like, yo, why are you lying to me? It's I'm
16:14
so sorry. I was just trying to please you.
16:18
Wow.
16:19
let me go change my password.
16:21
Yes. Good idea.
16:22
really what I want. That's really what I want
16:25
like an AI to do, is I want the Her
16:27
movie experience
16:29
yes. I'm not sure who is working on
16:31
it, but I'm sure at least five different
16:33
companies are working
16:34
Uh, yeah.
16:35
GPT, accessing your email.
16:37
I don't know if I want to trust any of those companies, but
16:39
I felt like Google might be a decent one to start
16:41
with.
16:43
Yeah.
16:44
Yeah.
16:45
Yeah.
16:46
Nice.
16:47
Yeah, but my experience with the local language
16:49
models. I haven't tried AutoGPT as
16:51
much, but I have tried Oobabooga's
16:53
web interface. And my experience
16:56
with them is that they're very good at being creative,
16:58
but I would never take any actual
17:01
advice from them on something like that,
17:03
like seemed important, you
17:05
know? Like, if you're like,
17:07
oh, I just want you to like, come up and
17:10
write a, give me cool character lines
17:12
or something like that, it's gonna give you some wild
17:14
stuff. But as far as I've seen,
17:16
like with actual planning or like
17:18
doing something that I might use GPT for, it's
17:21
not there yet.
17:53
And that's, I think, the tricky thing to some extent in
17:55
terms of deciding where you're gonna
17:57
do some of this stuff. Like for instance, I was working
17:59
on a horror movie
18:01
I've been working on for, about a year
18:04
and there was a sequence that I wanted to get some
18:06
feedback on, and I instantly got flagged
18:08
on GPT. It was like, no, this
18:10
involves Satanic stuff. This is bad.
18:13
I can't talk about this. I was like, all right, can't
18:15
work on horror movies in GPT. That
18:17
doesn't work. So that, for that
18:20
you gotta go to your open source, locally
18:22
run models. I think that's becoming
18:24
true too for so many things. And this is gonna be
18:26
the big kind of shake up for humanity
18:28
is, when is it a good time
18:30
to use AI and when is it a good time
18:32
to use your brain? And that becomes
18:34
a whole, which one's more effective, which
18:36
one's more gonna get you there faster
18:39
and with a better result and
18:41
with less hallucination.
18:43
I've seen exactly that with Code Generation.
18:46
I did an experiment about a month
18:48
ago generating Solidity
18:50
contracts, and Solidity is the Ethereum
18:52
Smart contract programming language, and
18:55
I think I spent probably four hours just
18:58
working back and forth with it. And it did a good
19:00
job up until the point the complexity
19:03
of the contract was so big that it couldn't
19:05
keep it all in memory, and then it suddenly started
19:07
hallucinating changes in other
19:09
functions. So I'd ask, change
19:11
this function to do this. And it'd be like, cool. It's
19:14
also calling a function that doesn't exist anywhere
19:16
else. And I'm like Where'd that function come from? Please
19:19
output the entire contract and it outputs
19:22
a completely different contract. And I was like,
19:24
oh...
19:25
totally. Before you know it, you're
19:27
you're sending ETH a thousand
19:29
dollars of ETH and it's getting given
19:31
to OpenAI. Huh?
19:35
Come on, what's the problem? I'm
19:37
sure that's
19:38
What contract is this again?
19:39
You're paying, you're prepaying for tokens.
19:43
Yeah.
19:44
That you don't get,
19:45
you not gonna use them? Come on.
19:47
Yeah. So I'm curious to go back to
19:49
the artist's perspective that you two were talking about
19:52
before. I would imagine it feels pretty
19:54
threatening, but how is it feeling with
19:56
all of the AI generation stuff
19:59
coming along?
20:00
I think it's exciting. Obviously the
20:03
entire economy's gonna be disrupted and there's
20:05
gonna be a lot of people who, are used
20:07
to making money doing one thing and that's
20:09
not true anymore. Certainly
20:11
one that immediately comes to mind is
20:13
just the VFX industry. We're testing
20:16
out some different softwares, integrating
20:18
AI. I generated a shot this morning,
20:20
took me two minutes. It's the kind of thing that
20:22
would've taken me two to three weeks
20:25
before yeah, it was insane. It's
20:27
basically, yeah, like using a piece
20:29
of software to be able to bring
20:32
a character. You, you just feed
20:34
it a little bit of video and then it turns it
20:36
into a 3D character that's
20:39
completely good to go, lit
20:41
for the scene. And there
20:44
was a microphone in the shot in
20:46
front of the character. So some foreground. It took
20:48
me like two minutes to roto that back in
20:50
front of the character, but everything else was just
20:52
done. And, but so these
20:54
are the kinds of things like that, that you
20:57
have 500 shots like that
20:59
in a feature film. That's a massive
21:01
team working for months and months
21:03
to put that together. Now that
21:05
can be one person
21:07
The tool he is talking about, Wonder Dynamics,
21:09
is also able to take
21:12
motion capture data from an iPhone
21:14
video and
21:17
you can then just take that and put that directly into Blender
21:19
and apply your own character to it. So
21:21
this is something that, six
21:24
months ago you needed $5,000
21:27
in equipment and an actor that can
21:29
put all this stuff on and do everything. And
21:31
now you can just take basically any iPhone
21:33
video and do it. And it's
21:36
80% as good as this thing.
21:39
You gotta go in and clean some things up manually, but
21:41
they give you all the details
21:44
that are there.
21:44
Yeah. Yeah. And that cleanup and
21:47
they give you plates so you've got a blank
21:49
plate, you've got a front, like these are all the things that
21:51
take, so much time to do on set.
21:53
So much time to do in post. And
21:55
you're getting all these elements and it, we're part of the,
21:57
we're in the beta so, this is probably not
22:00
released yet, but it's gonna be soon and it's
22:02
totally gonna disrupt. There's a reason why
22:04
Steven Spielberg's an advisor
22:06
to this company. This is gonna be a massive disruption.
22:09
But, that said I'm excited about it. I think
22:11
it means that we can do so
22:13
much more as individuals. Like the bar
22:16
to entry is lower to do cool things.
22:19
Obviously, I think the biggest, and this is
22:21
always true with new technology, the
22:24
wow factor is gonna go
22:26
away very quickly. People are gonna, like
22:28
what's wow anymore about that? If everyone
22:30
can do it with their phone and then it, the nice
22:32
thing is, to me, and even this
22:35
might start to erode. But it comes
22:37
back to what's the vision? What's the storytelling?
22:39
How is this how is this dynamic? Why
22:41
does this engage us as humans?
22:44
And I think that's for
22:46
this middle ground before we're all first
22:48
making and being made into paperclips.
22:51
We've got a lot of fun
22:53
to play and expand and
22:56
then what, no one knows what that timeline's
22:58
hopefully, hopefully it's gonna be real
23:01
long.
23:01
Yeah. Yeah, it's
23:03
interesting cuz it already feels like we're at peak
23:06
content. There's already so much
23:08
content out there, like how many
23:10
shows are Netflix and Amazon and Warner
23:12
Brothers or HBO and everything putting
23:14
out every month that nobody has time to
23:16
watch. And what happens
23:19
when the production process for
23:21
these gets turned into a 10th
23:23
of what it was? So now
23:25
it's just all going out there. Do people still
23:27
care? Do people still want to tune
23:29
in? Other than like the one
23:32
show that gives them comfort, or
23:34
like the, two really buzzy movies
23:36
that come out every year that everybody
23:38
like wants to run out to do. So I think that's
23:41
more questions that, that I'm thinking about
23:43
at least is what does this massive
23:45
influx of content mean
23:48
for what people find and appreciate.
23:51
Yeah. Thin line between content and spam.
23:54
Yes, definitely.
23:56
Yeah.
23:58
And a thinner line between
24:00
content and art. Now,
24:03
I'm curious to just understand
24:05
what techniques you used, but
24:07
first for particularly our listening
24:09
audience. Let me read some
24:12
of this off. It's quite a long prompt, so I'm not gonna read
24:14
all of it, but: you are CodDev,
24:16
an expert fullstack programmer and
24:18
product manager with deep system and application
24:20
expertise, and a very high reputation
24:22
in developer communities. Calling
24:25
out: role playing is what's being done here. But:
24:27
you always write code, taking into account all failure scenarios
24:30
and errors. I'm your manager, you're expected
24:32
to write a program following the commands that I will
24:34
instruct. And then it goes on
24:36
to give a bunch more context refinement.
24:38
So you know, use the latest language features,
24:40
follow the below commands only output
24:43
file names, folder structures, code tests,
24:46
et cetera. Then it gives a list of commands
24:48
and I'll read a couple. Project,
24:51
which takes in a summary, a task, a language,
24:53
and a framework. When you receive this
24:55
command, output the list of files and folder
24:58
structure you'll be having in this project based
25:00
on the project summary and task you're
25:02
to accomplish, use the programming language
25:04
listed as part of the languages variable
25:07
and, wherever possible, use the framework API
25:10
packages indicated under framework variable.
25:13
Another command is code file
25:15
name. When you receive this command, output the code for the
25:17
file indicated with file name. Tests
25:20
explain, there's quite a few others. So
25:22
just kinda give us an idea of how
25:25
you found this and how you approached
25:27
any refinement.
25:29
Yeah I've, so I found this
25:31
actually on the ChatGPT
25:33
Discord, the day that GPT-4
25:35
came out, and then I went
25:37
back and it was deleted. I haven't seen
25:39
this one anywhere. I don't know who originally
25:41
posted it. If you're out there, please speak
25:44
up, but I've found it really
25:46
useful because I'm not a coder.
25:49
But it's really great
25:51
when it gives you a, basically
25:53
it gives you a project structure, right? And
25:55
you're like, oh, okay, I can break this all down.
25:58
And I'm using, I'm working on a
26:00
surfing game that sort of has... do you guys know Tech
26:02
Decks? Do you remember Tech Decks?
26:05
Like the little fingerboards, like the
26:07
Oh yeah, totally. I was playing with
26:09
one of those pro skates the
26:10
Yeah. So like the idea is that it's
26:13
a surfing game on your iPhone
26:15
that uses those controls basically,
26:18
So like I really love the idea of like interaction
26:20
and like motion controls and stuff like that. That's
26:22
really what I'm into these days. And
26:25
so like I'm using this to like basically
26:27
be like, okay, I don't really understand what
26:29
we need code wise. Like I can
26:31
do that. Like I'm a visual coder, like I know
26:33
touch designer, I know blueprints, I know stuff
26:35
like that. But like when it comes
26:38
to writing out, Logs, like I don't
26:40
understand this. And this is great
26:42
cuz it gives you the whole project structure
26:44
and then you just get to go in bit by bit and do
26:46
it and you can keep going back and
26:49
refining it. So it's just a really modular way
26:51
to approach it as opposed to like
26:53
just being like, Hey, I'm making a game. What
26:55
should I do? Write me the code for that. So
26:57
I think like anybody who doesn't understand
27:00
code, this is just like a really awesome
27:03
starting point. And like
27:05
each of the individual things, and
27:07
as I said earlier, like you put it in,
27:09
you can run it in whatever program you're
27:11
doing it in. And then like any errors,
27:13
you can just put them back in and be like, Hey, I got an error
27:15
for this file. And it's okay, here's this. And
27:18
you take that and you like look online as well,
27:20
and you feed it any documentation you find if it's
27:22
looking confused. And I just, I
27:24
think it's great. It's like really working
27:26
well. Like right now my biggest problem
27:29
is just getting the assets made.
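As a rough illustration of the modular, command-driven workflow Ben describes, here is a sketch that drives a CodDev-style prompt through the OpenAI chat API. The system prompt is paraphrased from the excerpt read out earlier in the episode, and the wrapper function, model name, and example commands are assumptions for illustration, not the original prompt author's code.

```python
# Illustrative driver for a CodDev-style command prompt (paraphrased, not the
# original author's exact text). Model name and command strings are assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

SYSTEM_PROMPT = (
    "You are CodDev, an expert fullstack programmer and product manager. "
    "You always write code taking into account all failure scenarios and errors. "
    "Follow the commands I give you. "
    "'project <summary> <task> <language> <framework>': output only the file and "
    "folder structure for the project. "
    "'code <filename>': output only the code for that file."
)

history = [{"role": "system", "content": SYSTEM_PROMPT}]

def command(text):
    # Send one command and keep the running conversation, so later commands
    # refer back to the same project structure.
    history.append({"role": "user", "content": text})
    resp = openai.ChatCompletion.create(model="gpt-4", messages=history)
    reply = resp["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

# First get the project skeleton, then fill it in file by file ("coloring book" style).
print(command("project: tilt-controlled surfing game; task: prototype; language: C#; framework: Unity"))
print(command("code PlayerTiltController.cs"))
```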
27:32
That actually makes sense. That is something that
27:35
is a very different generation.
27:37
So yeah, ChatGPT can't really help you with that.
27:40
And I want to call out specifically, putting
27:44
in the documentation is something that's come
27:46
up in previous episodes, but it is
27:48
a very powerful method of,
27:51
improving your output because then ChatGPT
27:54
knows the correct structure
27:56
and the correct commands and lots
27:58
of other relevant things. And then it can
28:00
generate the right stuff cuz yeah, it was trained
28:03
years ago.
28:04
Yep. Yeah, no, and I think as
28:06
you were saying before, the only problems I've really
28:09
had is when it exceeds the memory
28:11
and it starts forgetting what we're talking about, but
28:13
even with this because it breaks everything down,
28:15
it's really easy to go back and be like, no, we're talking
28:17
about this one document. Let's narrow
28:21
in on that.
28:22
Get back on track. For the longest time I've
28:24
had this vision in my mind of
28:26
of governance in the future run
28:28
by ai going back
28:30
and looking at the history of how we've interacted
28:33
with our phones and everything
28:35
else and putting us on trial for
28:37
it. Just because like eventually
28:39
there's going to be this like database
28:41
of information that of who
28:44
we've been with in terms of our
28:46
interaction with these machines as they get smarter and
28:48
smarter. And I've found that
28:50
in the last as I've gotten more
28:53
used to relating to large
28:55
language models, I'm
28:57
less polite like I was I was so
29:00
aware of it. I would say, please and
29:02
thank you to Siri when I would ask for
29:04
a timer to be said or a reminder or
29:06
whatever. And now it's just you are this,
29:08
do this for me now. And it's just
29:11
a funny flip and now
29:13
I'm trying to remind myself,
29:15
Oh yeah. This is, especially with, like
29:17
other people involved and also
29:20
we're training the ai, we want it to be
29:22
polite. We want it to learn, empathy
29:24
or emulation of empathy. Anyway,
29:27
so it's just an interesting facet
29:30
I think of the whole prompt game
29:32
of okay, you can still give positive reinforcement
29:34
and say Please and thank you. I don't
29:36
want the next generations to talk to
29:38
each other like they talk to the AI and that's
29:40
all: do this for me now. It is just
29:42
a
29:43
I have actually heard of parents, none
29:45
that I know personally, but I've heard about them, parents
29:47
who let their children interact with the
29:50
Alexa lady or Siri or whoever,
29:52
but make it very clear. You need
29:54
to say, please, you need to say thank you. You need
29:56
to be polite because they want to train
29:58
the kid to be a, nice,
30:01
polite person. And they realize that the kid
30:03
can't tell the difference between talking to the
30:05
Alexa lady or the
30:08
mail lady, or whoever the case may be.
30:10
And so it's
30:11
Totally. Yeah. Absolutely.
30:14
And the AI is obviously learning from
30:16
us, ultimately. So it's a back
30:18
and forth, it's a symbiotic, hopefully
30:20
not parasitic relationship that we, as we
30:22
move forward.
30:23
I don't know maybe a couple years from now, you'll talk
30:25
to an AI and it'll say something,
30:28
oh, yes, that's exactly what I was trying
30:30
to provide. I hope this was helpful for your day.
30:32
And someone else in the same room will be like,
30:35
wait, but what about lunch? And it'll be like,
30:37
Forget lunch, idiot. I don't know, maybe it's
30:39
gonna do radically different responses
30:42
to best fit the person listening
30:45
I would imagine that is where we're heading
30:47
ultimately. Yeah.
30:47
So what is the most interesting
30:52
prompt that you have ever built.
30:53
I don't know that I really relate
30:55
to prompts like that. Are you familiar
30:57
with TouchDesigner?
31:00
I'm actually not.
31:00
TouchDesigner is a
31:02
live video, it is
31:04
like a live VFX platform essentially. So
31:07
like you use it for like concert visuals,
31:09
like any sort of interactive art at a
31:11
museum is probably running on TouchDesigner.
31:14
I guess you would, but it's also a tool
31:16
to build tools. So
31:19
you, like right now, I'm
31:22
working on something where we're integrating,
31:24
I'm trying to integrate the Stable Diffusion API
31:27
into TouchDesigner to provide
31:29
live visuals that are based off of
31:31
images coming in from a
31:33
3D camera that I'm doing. So
31:35
that's like where I'm
31:38
at with it. Obviously the biggest thing that
31:40
needs to happen here is you need to get
31:42
Stable Diffusion to a model
31:44
that's like running closer to
31:47
like 12 frames a second. Everything
31:49
is real time if you throw enough power at it, but
31:51
I haven't got there yet. So
31:54
that's what I've been working on is
31:56
more how we can bring
31:58
things like Stable Diffusion and
32:00
ChatGPT to create live interactive
32:02
experiences with people.
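One common way such a setup is wired up is to run a local Stable Diffusion server that exposes an img2img HTTP endpoint (the AUTOMATIC1111 web UI does, for example) and call it from Python inside TouchDesigner. The sketch below is a guess at that glue code, not Ben's actual project: the endpoint path, payload fields, and parameter values are assumptions to show the shape of the idea.

```python
# Hedged sketch: push a camera frame from TouchDesigner (or any Python host)
# to a locally running Stable Diffusion img2img HTTP endpoint and get a
# stylized frame back. Endpoint and payload follow the AUTOMATIC1111 web UI
# convention; adjust for whatever server you actually run.
import base64
import io
import requests
from PIL import Image

SD_URL = "http://127.0.0.1:7860/sdapi/v1/img2img"  # assumed local server

def stylize_frame(frame: Image.Image, prompt: str) -> Image.Image:
    buf = io.BytesIO()
    frame.save(buf, format="PNG")
    payload = {
        "init_images": [base64.b64encode(buf.getvalue()).decode()],
        "prompt": prompt,
        "denoising_strength": 0.45,  # keep the incoming frame recognizable
        "steps": 12,                 # few steps, trading quality for latency
        "width": frame.width,
        "height": frame.height,
    }
    r = requests.post(SD_URL, json=payload, timeout=30)
    r.raise_for_status()
    out_b64 = r.json()["images"][0]
    return Image.open(io.BytesIO(base64.b64decode(out_b64)))

# In TouchDesigner this would be called from a Script TOP/CHOP callback with
# the latest camera frame; here it just reads a file for demonstration.
styled = stylize_frame(Image.open("camera_frame.png"), "neon wireframe ocean, synthwave")
styled.save("styled_frame.png")
```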
32:04
When I look at art, when I look at film, when I look
32:07
at all of these things that are
32:09
happening, I'm like,
32:11
oh my God. Like film is gonna become
32:13
banal. It's gonna become like the Sunday morning
32:15
comics. There's gonna be so much that nobody's
32:18
gonna want to do it. Now though, we have
32:20
the power to create these amazing interactive projects,
32:23
right? We can build these generative
32:25
art projects that are like creating
32:27
things. And I think that's what's most exciting
32:30
about this technology and where it's going,
32:32
is that now it's going to enable
32:34
us to create new forms of art that
32:36
we couldn't do before. As Doug
32:38
said, what once took him three weeks to do,
32:41
he can now do in a couple of hours,
32:43
and that's just tip of the iceberg. So
32:46
in three years, what's it gonna be like when
32:48
ChatGPT is integrated
32:51
directly into Unreal Engine and Unity?
32:53
And anybody can just be like, I wanna make a game
32:55
that like, makes me go flippy flip
32:58
or whatever. And that's all you need
33:00
to do. And then you need to upload the art assets. So
33:02
okay, so everybody can do that. Everybody's
33:04
gonna be playing like the custom games that their kids
33:07
make or whatever, and then shares among their
33:09
friends. So we're, in a way, like, content
33:12
creation is becoming a pastime,
33:15
Working on the Art of AI
33:17
podcast that we're working on there's a really nice
33:20
dovetailing that's going on there
33:22
in terms of like actually working with the AI to
33:24
create imagery and and text
33:26
and interaction. We've got AI
33:28
Charlie, a character in
33:31
our podcast, and so we feed
33:33
them a combination of generative
33:36
large language model stuff. And then
33:38
we will write some things and that's just been really
33:40
fun to see how
33:43
to really create that character so
33:45
that it's, feels like both the kind of sometimes
33:48
very authentic AI and then sometimes just
33:51
totally a kind of sound effect.
33:53
Something to bounce off of, for comedy.
33:56
So that's been really interesting and a fun thing to
33:58
dive into. And one thing that I'm really
34:00
excited about in the same way Ben's talking about interactivity,
34:02
I'm working on a project where I'm working with
34:05
a really well known Canadian
34:08
theatrical actor. And
34:10
they won the Order of Canada for
34:12
their work on the stage performing
34:15
Shakespeare. And I
34:18
had a day scheduled with them on a location
34:20
and crafted
34:23
a Shakespearean sonnet with
34:25
the help of ChatGPT 4 that I
34:27
then took to my
34:29
day and gave that
34:32
to this actor. And then they're performing
34:34
the AI's version of Shakespeare that's on,
34:36
on topic with what we're doing for this film.
34:39
And I've got a motion control rig, so
34:41
I'm duplicating them 12 times
34:44
in one sequence, and
34:46
it's like, that's what gets me excited,
34:48
cuz like, how, taking the
34:50
machine learning algorithms,
34:52
putting them on location, and then I can
34:55
take that and I can put it back in,
34:57
into these tools and use the
34:59
AI to keep working on it. And it's this iterative
35:01
process where you can just feed
35:04
the machine, the machine feeds the real world, and
35:06
then you put it back in, image
35:08
to image, text to text
35:11
and you start to like refine. I think
35:13
that's where we're gonna see things
35:15
that do amaze us still. I think as
35:17
we move into making art
35:20
more and more that has
35:22
an AI component it's still
35:24
gonna be about like, okay, you didn't just prompt
35:27
something, make an image, as Ben was saying, and then
35:29
that's your NFT that you're selling. That wow's
35:31
gone. Like no one cares. And
35:34
not because it's not amazing, like when you
35:36
look at what's happened with those pixels, it's incredible.
35:39
But it's just, it's that... Midjourney
35:42
did that work, not the artist. I
35:44
think that's becoming clearer and
35:46
the inputs for that art
35:49
were done by an artist, no question. What
35:51
it was trained on. But, so
35:53
I think now it just becomes this thing more
35:55
and more of like, how do you iterate?
35:57
How do you create that
35:59
interesting feedback loop that moves
36:01
things in and out of the digital
36:03
realm into the real world and back again
36:06
and, get control
36:08
over what you're doing and do something
36:10
new.
36:10
Yeah, people still, they
36:12
love the story behind the art creation.
36:16
We're like, oh wow, this man drew
36:18
this free hand without breaking the
36:20
line. I remember this Korean artist
36:22
a few years ago that went viral because,
36:24
he did these amazing freehand line
36:26
drawings the size of a mural
36:28
without breaking any of the lines. And these are some
36:30
of the most detailed things you've ever seen. No
36:33
text-to-image thing could make these right now,
36:35
and we still love that stuff and that's always
36:37
going to be valued. Somebody's talent,
36:39
somebody's story behind doing it. With Doug
36:41
and I, when we look at a movie, we're like, oh my God,
36:43
how did they do that shot? And
36:45
if the shot is, oh, we wrote that
36:47
into Stable Diffusion and like it
36:50
kinda painted out something, and people are gonna be like, oh,
36:52
who cares? But if it's oh, we
36:54
had us and a handful of friends carrying
36:56
a plank through a river while
36:58
we like, shot this we're gonna be like, that's awesome.
37:01
So the story of making art
37:03
is still very interesting to people. That's
37:05
not gonna go away. These tools
37:07
are just gonna make it so that we can do like new
37:09
things in new amazing ways.
37:12
That's awesome.
37:13
Absolutely. And we're already seeing, like in
37:16
Everything Everywhere All at Once, there
37:18
was this incredible sequence in it towards the
37:20
end where like the image, one image
37:23
just kept changing. And we were seeing the
37:25
actor in a different scenario and
37:27
they had used generative AI
37:30
to do that. And when I looked at that, I was like, oh
37:32
my God, that represents an
37:34
insane amount of work to
37:36
make all those still images and put together
37:39
that sequence. And then I found out it was AI
37:41
and I was like, ah, okay. Yeah, that's easy,
37:44
but, so I think, this is where we're, and that movie
37:46
was like winning a bunch of Oscars and
37:48
legitimately is a great film, and it,
37:51
that one little sequence in there was
37:54
incredible. And it was, it was great and
37:56
like a great use of AI, but you can't,
37:58
we're not gonna be able to, now everyone can't
38:00
just go and do that same thing. So we have
38:02
to invent new and interesting ways of using it.
38:04
So I think that's what it
38:05
Yeah. It just becomes like a Snapchat filter,
38:08
if it's a Snapchat filter and everybody
38:10
can just lay it over, then I don't think
38:12
there's a lot of artistic value
38:14
to it, yeah, But
38:16
if you then take that Snapchat filter
38:18
and do your own things to it or do something
38:21
crazy with it that's outside
38:23
of the bounds of what it is, then that
38:25
becomes interesting again.
38:27
Nice. So this has been awesome.
38:29
Where can the audience follow up
38:31
with you and see the projects
38:33
that you're working on? And your podcast.
38:35
The best, yeah, the best place is
38:38
on Spotify, the Art of AI. There's
38:40
links to all our other work
38:42
there and that way you get to
38:45
hear what we're up to. We're interviewing all kinds
38:47
of interesting people and have a lot of discussion
38:49
about all this crazy stuff as it's happening
38:51
and changing day to day. And
38:53
then my production company is Pieface Pictures.
38:56
You can check out a bunch of my work there. Yeah.
38:58
Yep. And our interactive
39:01
company Colors in Motion. ColorsInMotion.com,
39:04
the American spelling, not the English
39:06
spelling.
39:09
Good clarification. Good clarification.
39:11
This is a global audience, so you do have
39:13
to actually communicate that.
39:15
Yeah.
39:16
Yeah.
39:17
Awesome.
39:18
Thanks so much, Greg. It's been such a pleasure chatting
39:20
Yeah. it's been a lot of fun. Thank you.
39:22
Hopefully we can have you on our podcast soon.
39:25
That would be fun. I would love to.