Episode Transcript
0:00
Hey everyone, Claude just released the
0:02
brand new 3.7 Sonnet and it
0:04
is probably the best coding and
0:06
writing model out there. We're gonna
0:08
break down how you should be
0:10
using it, how they actually launched
0:12
it, because there was some marketing
0:14
magic behind it. We're gonna share
0:16
that and much more on today's
0:18
show. We're gonna get right back to the show,
0:20
but first a quick word from
0:23
our sponsor. Remember when marketing was
0:25
fun? When you had time to
0:27
be creative and connect with your
0:29
customers? With HubSpot, marketing can be
0:32
fun again. Turn one piece of
0:34
content into everything you need. Know
0:36
which prospects are ready to buy
0:38
and see all of your campaign
0:41
results in one place. Plus, it's
0:43
easy to use, helping HubSpot customers
0:45
double their leads in just 12
0:47
months, which means you have more
0:50
time to, you know, enjoy marketing
0:52
again. Visit hubspot.com to get
0:54
started for free. Kieran,
1:09
your favorite AI model in the world
1:11
just got an update. Claude 3.5 Sonnet,
1:14
you have described before, you're just like,
1:16
look, I don't know why I love
1:18
it. I just really like it. It's
1:20
my friend. It feels like a friend.
1:23
And they did a massive rollout. 3.7
1:25
Sonnet is the brand new model. And
1:27
what's interesting, Kieran, is that it
1:29
adds some reasoning, but the thing
1:31
it really does a step function
1:34
change on is coding. Yeah. And by
1:36
all accounts, it is now the best
1:38
model for coding in the world. Well, you
1:40
know, the funny thing about coding is
1:42
actually the answer you get from coders.
1:45
You know, I try to choose something
1:47
to learn. And the thing I'm trying
1:49
to learn right now is Cursor. And
1:51
so, Cursor is a coding application. It's
1:53
like a coding editor. One of the
1:55
fastest companies of all time to 100
1:57
million in ARR. And really what they
1:59
are is an editor built around LLM
2:01
models. So you can choose any LLM
2:03
model you want to code with.
2:05
And coders will tell you when they
2:07
use that tool, their preference is to
2:10
use Sonnet 3.5. And what's actually kind
2:12
of interesting about the answer is they
2:14
say, even if you look at the
2:16
benchmarks, I think o1 might beat Claude
2:18
Sonnet 3.5 in some of the coding
2:20
benchmarks. But they're like, I don't know
2:22
why it's better. It's better. But there's
2:24
so many now. You and I are
2:26
in AI, I would say, for the
2:28
majority of our day. But imagine you're
2:30
not, and you're trying to keep up
2:33
with it. I can't keep up with
2:35
it. I can't keep up with it.
2:37
I honestly cannot keep up with it.
2:39
No. That's why we're trying to do
2:41
the show to help people out a
2:43
little bit. Yeah. But I think what's
2:45
going to happen is we're going to
2:47
start to gravitate towards models for reasons
2:49
we do not know. You know, if
2:51
you go look at all the metrics
2:53
and benchmarks and tests that they use
2:55
these LLMs for, it's not off the
2:58
charts on those things like Grok or
3:00
o1-pro or o3-mini-high from OpenAI. Instead,
3:02
it's just really good at things that
3:04
it was built to do. And anecdotally,
3:06
Kieran, I just pulled up this tweet
3:08
from Mckay. He's got the course I'm taking.
3:10
I'm taking his course on Cursor.
3:12
He was basically like, it's the best
3:14
model in the world for code. It's
3:16
like having a world-class dev with exceptional
3:18
taste. Right. And we've talked about this
3:20
before here and taste is subjective. Exactly.
3:23
And you can't measure taste in these
3:25
benchmarks. And what we're kind of pointing
3:27
at is that these models, and especially
3:29
the Claude models, they have really good
3:31
taste. Yeah. And so: build a Next.js
3:33
SaaS marketing template, and boom, 26
3:35
files of beautiful code in one shot.
3:37
Right. So no edits, no back and
3:39
forth, one shot. And he's got a
3:41
video here. And it's pretty sweet. The
3:43
work that it came up with is
3:45
pretty awesome. Yeah, we should actually look
3:48
at the coding tool that
3:50
they released, because they released the model
3:52
plus the coding tool. And I
3:54
think that's the coding tool, the
3:56
command line tool that they released, Claude
3:58
Code. They released a model, and we
4:00
can get into the model, because I
4:02
think the interesting thing about the model
4:04
is it's the first model that is able
4:06
to switch between like internal chain of
4:08
thought where it does reasoning, and then
4:11
just like quick answers. And so it's
4:13
able to like distinguish between those two
4:15
things. And so for people listening along,
4:17
what the other companies have been doing
4:19
is they had models that were kind of reasoning models, and
4:21
then you had your historical models that
4:23
were much more just quick answers. And
4:25
in the drop-down you had to say,
4:27
well, I want a reasoning model versus
4:29
I want, you know, a model that
4:31
can just answer my questions much more
4:33
quickly. You had to kind of choose
4:36
the model for what question you think
4:42
you had. Now, we've talked about before
4:44
that. Well, Claude is the first model
4:46
to come out and do this. And
4:48
so they have a thinking mode and
4:50
a quick-answer mode in one model. We'll cover that first.
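For readers who want to see what that toggle looks like in practice, here's a minimal sketch using Anthropic's Python SDK: the same model serves both modes, and you opt into extended thinking per request with a token budget. The model string, budgets, and prompts below are illustrative, not prescriptive.

```python
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

# Default behavior: a fast, direct answer.
quick = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Give me a one-line tagline for a newsletter."}],
)

# Extended thinking: opt in per request and cap the tokens spent reasoning.
deep = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=4096,
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[{"role": "user", "content": "Plan the file structure for a Next.js SaaS site."}],
)

print(quick.content[0].text)
# With thinking enabled, the reply interleaves "thinking" and "text" blocks.
print(next(block.text for block in deep.content if block.type == "text"))
```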
4:52
But then they also launched a
4:54
coding tool. And the coding tool is
4:56
this command line editor that basically looks
4:58
to me a little bit like a
5:01
Cursor. And so I think it's interesting
5:03
because if you are Claude and you
5:05
are looking at Cursor, and Cursor is
5:07
this incredible product, speedran to 100
5:09
million in ARR. And one of the
5:11
fastest-growing startups of all time, it
5:13
is like a UX built around your
5:15
model. And so I suspect, they're like,
5:17
well, that's a good minimum viable version
5:19
of a use case that we could
5:21
build. And so now they can build
5:23
that. Thanks for proving that. Yeah, thanks
5:26
for proving that. And so that's what
5:28
they built as a command line tool.
5:30
So it's not like the
5:32
same. I don't want to like compare
5:34
those things. Cursor does have way
5:36
more functionality. And actually, if you saw
5:38
some of the takeaways that they released
5:40
as part of this model launch, they had some
5:42
pretty great traction within their own engineering
5:44
team that it was really becoming their
5:46
copilot of choice. Yeah, and that's what
5:49
you're seeing in the video I got
5:51
pulled up here from the official release.
5:53
It's like, it shows you Claude Code
5:55
and it's like they just launched a
5:57
beta version of a GitHub integration.
5:59
They have their own app here that's
6:01
like their version of a command line
6:03
and you can connect it to your
6:05
repositories and it can just tell you
6:07
about your code. Like if you're jumping
6:09
into a project that somebody else built,
6:11
it can give you all those insights.
6:14
It can build with you. Like it's
6:16
a pretty amazing experience actually, right?
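To make the idea concrete, here's a toy sketch of the kind of loop a repo-aware assistant runs: read some files, hand them to the model as context, and ask a question about the codebase. This is not Anthropic's actual implementation of Claude Code, just an illustration of the pattern; the file filtering and truncation are deliberately naive.

```python
import pathlib
import anthropic

client = anthropic.Anthropic()

def ask_about_repo(repo_dir: str, question: str) -> str:
    # Naively inline a handful of source files as context. Real tools are far
    # smarter about choosing files and staying within the context window.
    snippets = []
    for path in sorted(pathlib.Path(repo_dir).rglob("*.py"))[:20]:
        snippets.append(f"# {path}\n{path.read_text(errors='ignore')[:4000]}")

    reply = client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": "Here is a codebase:\n\n" + "\n\n".join(snippets)
                       + f"\n\nQuestion: {question}",
        }],
    )
    return reply.content[0].text

print(ask_about_repo(".", "What does this project do, and where is the entry point?"))
```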
6:18
I think Cursor is a really interesting product.
6:20
Even if you just want to try
6:22
out a product and think about the future
6:24
of software in general, because it is
6:26
an AI-first product, it has made
6:28
me really rethink what software is
6:30
in an AI world, because it really
6:32
is like an incredible UX experience built
6:34
around how a model would work. Like
6:36
those two things are somewhat combined. And
6:39
so their coding tool is an example
6:41
of that. I think it's only in
6:43
beta. But that was the other launch
6:45
that maybe got less publicity because it's
6:47
a beta feature versus the model and
6:49
the model itself. So last night
6:51
I was rebuilding my website to be
6:53
a newsletter-first website, and I had
6:55
another moment where I was like, I
6:57
love this moment in time. So I'm
6:59
sitting there, it's half nine at night,
7:01
I'm kind of tired. And my original
7:04
want in life was to be a
7:06
builder. Like I wanted to be a
7:08
developer. That's really what I was obsessed
7:10
by. I launched a company when
7:12
I was in college, tried to code,
7:14
really was like excited to be a
7:16
builder. And I was just terrible at
7:18
coding. And so I couldn't, you know,
7:20
I just was like, I just gave
7:22
up, right? Which was a smart idea.
7:24
Like I actually had a pretty good
7:27
career, but last night it was half
7:29
nine and I was building the home
7:31
page on Lovable. Lovable is amazing. Like you
7:33
just like edit it, build the final
7:35
version of the page, go back and
7:37
forth, and then you can export that
7:39
to Cursor and actually build the thing,
7:41
and then give it to a developer
7:43
and say this is exactly what I
7:45
want. And it's exactly what I want. It's really fun.
8:04
And so I started building the same
8:06
thing. So then I was like, what
8:08
if I just build this in Claude,
8:10
right? That was my workflow. And then
8:12
I just used the new model and
8:14
I just built the landing page from
8:17
scratch. And so usually I would have
8:19
had to build the mock-up in one of
8:21
these wireframe tools, but I just built
8:23
the full version of the exact web
8:25
page I want, and I can send it
8:27
to a developer, and my site's hosted
8:29
in HubSpot and they can
8:31
just like develop it for me in
8:33
HubSpot. Just awesome, just so awesome.
8:35
It is a complete game changer in
8:37
the coding use cases, and we might do a
8:39
whole different follow-up show around like a
8:42
coding project for non-coders in Claude 3.7.
8:44
Probably something we'll do. The other thing
8:46
here that I think is interesting is...
8:48
There's been a lot of commentary around
8:50
how Claude has kind of lost a
8:52
lot of the momentum relative to Open
8:54
AI and Grok and Gemini
8:56
and some of the other competitors, but
8:58
I think we might need to crown
9:00
them the best marketers. One of the
9:02
ways they marketed their new model is
9:05
they had a benchmark of how good
9:07
the models are at playing Pokemon. And
9:09
it shows you how much better 3.7
9:11
is than 3.5 at playing Pokemon.
9:13
That's cool. And it's
9:15
like, what strikes me about this is
9:17
like, this is the perfect example of
9:19
showing and not telling. Like, this is
9:21
basically saying, like, look, OpenAI is building
9:23
models for academics. We're building models for
9:25
like real people and real use cases.
9:27
It's kind of what I took away
9:30
from this, right? Where it's like, we're
9:32
not going to give you these like
9:34
physics benchmarks. We're going to be like,
9:36
hey, our model's really good at playing
9:38
Pokemon. Yeah, right? The marketing for this
9:40
somewhat writes itself in that you would
9:42
actually look at places like Reddit, the
9:44
most popular forums that have some sort
9:46
of overlap with AI interest, and just
9:48
build fun stuff for them. And like
9:50
using Claude, right? Like, it's one of
9:52
the best times for a marketer. I think
9:55
you could be at your most creative
9:57
because you can create value before you
9:59
have to actually sell them on your
10:01
product. Like you can actually build things
10:03
of real... bridges to your core product.
10:05
One thing I'll just mention, so I
10:07
thought about this a lot last night
10:09
because I've become obsessed with like Grok,
10:11
you know, you have as well. And
10:13
I'm like, we use Grok a
10:15
lot now. It's kind of shocking, right?
10:17
You know, if you actually think about
10:20
it, right, what really matters in all
10:22
of these? Even the AI companies themselves
10:22
are commoditizing themselves. All the models are starting
10:28
to look and feel quite similar. I
10:30
would say they're quite similar. I can't
10:32
really tell them apart; they're all a little
10:34
bit the same. Like they're all amazing,
10:36
like their deep research products, everything is
10:38
amazing. So like, what is a differentiator?
10:40
I come back to distribution:
10:43
distribution and proprietary data. And so when
10:45
I use Grok 3, why do I
10:47
love that model? Because it has access
10:49
to Twitter and so anytime I do
10:51
anything on Grok now, I tell it to
10:53
only use Twitter data: do not use
10:55
any external data I only want the
10:57
Twitter data. It's mind-blowing, the stuff you
10:59
can do in there.
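As an aside, for anyone who wants to reproduce that workflow programmatically: xAI exposes an OpenAI-compatible API, so a rough sketch looks like the following. The model name here is an assumption, and the Twitter-only restriction is just a prompt-level request, not something the API enforces.

```python
from openai import OpenAI

# xAI's endpoint is OpenAI-compatible; the model name is illustrative.
client = OpenAI(api_key="YOUR_XAI_API_KEY", base_url="https://api.x.ai/v1")

reply = client.chat.completions.create(
    model="grok-3",
    messages=[
        {"role": "system",
         "content": "Only use Twitter/X data. Do not use any external data."},
        {"role": "user",
         "content": "What are people saying about Claude 3.7 Sonnet today?"},
    ],
)
print(reply.choices[0].message.content)
```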
11:01
I know because you've been sending me some
11:03
of the stuff that you've been doing.
11:05
It's a data source that you cannot get access
11:08
to anywhere else. So it has distribution
11:10
through that as well. So the kind
11:12
of other example of that would be
11:14
Google, right? Google had organized the world's
11:16
information. They had distribution. They had access
11:18
to all of their data sources. But
11:20
the problem with them is their data
11:22
source is not their own. Grok has
11:24
an advantage that that is their own
11:26
data source. It's proprietary. You can't access
11:28
it. What Google did was they were
11:30
an aggregator on top of other people's
11:33
information. As for being able to overlay
11:35
AI on top of search, right? Google
11:37
don't actually own the search pages, so
11:39
they have no competitive advantage there. They
11:41
don't actually own that internal data. They
11:43
can't make their model any better because
11:45
of it. But they can make it
11:47
better through distribution, but they won't
11:51
go down the path of integrating it
11:53
really rapidly into the search pages, because
11:56
they don't want to commoditize it. And Open
11:58
AI, I would say their biggest
12:00
challenge is long term. They don't have
12:02
a distribution advantage other than through partnerships, which
12:04
is the way that they're going, and
12:06
they don't have proprietary data. And so
12:08
I thought the biggest miss in the
12:10
Claude release, and I suspect it's only
12:12
because they're going to release a Claude
12:14
4 very soon. Or, you know, why
12:16
would this be called 3.7? This seems
12:18
like a, we got to get something
12:21
out there to not get lost, but
12:23
we got a bigger thing cooking. We
12:25
got a bigger thing coming because what
12:27
did they miss? They missed access to
12:29
the web. I would say access to
12:31
the web is table stakes. Search and
12:33
deep research. The fact that I can't do
12:35
deep research in Claude takes away my
12:37
Claude usage a ton. So one of
12:39
my big takeaways from using Grok, but
12:41
this is my point. Grok have an
12:43
advantage because they're distributed through X, right?
12:46
They're making it part of the premium
12:48
package. They also have uniqueness of data.
12:50
Google don't have uniqueness of data because
12:52
it's not their data. They're just, they're
12:54
an aggregator set on top of all
12:56
the people's websites. They should have a
12:58
distribution advantage, but they will not go
13:00
to the lengths that they should. What
13:02
would they do if they want to
13:04
take advantage of the distribution? They would
13:06
change the homepage to be what Open
13:08
AI's homepage is. Yeah.
13:11
They are trying to like edge their
13:13
way. They're trying to not kill AdWords. I
13:17
will tell you, just as an aside,
13:19
I don't know if you've used this
13:21
yet or if it's rolled out in
13:23
the EU yet. Where Google's advantage is, is
13:25
how they're integrating Gemini everywhere.
13:27
I agree with this. I was in
13:29
New York this last weekend, and have
13:31
you seen how Gemini has
13:34
integrated into Maps? No. So what's happening,
13:36
and we can do a show on
13:38
this if we want. You can pick
13:40
any location, like I was looking at
13:42
a restaurant that I wanted to go
13:48
to, and then I can interact with
13:50
Gemini about that restaurant listing.
13:52
Things like that where you're like, that
13:54
was like impossible to figure out before.
13:56
Yeah, and it got me that information
13:59
immediately. That's true. And it was incredible.
14:01
And those things are pretty wild use
14:03
cases. Yeah, I think everything I said
14:05
was wrong about Google. I've just realized
14:07
that, because I'm the one who's been
14:09
saying they have an incredible advantage with
14:11
G Suite and G-Drive and Maps. So you're
14:13
right, actually, they do have a data
14:15
advantage. They have a huge data and
14:17
distribution advantage, it's not just search. Your
14:19
point is correct though that they're not
14:21
leaning into it as aggressively as they
14:24
could or should be, because they can't.
14:26
Yes, I get why they can't. I
14:28
love Google's products. I think their Gemini
14:30
Flash model release was exceptional. I think
14:32
their products are really, really good. I
14:34
just wonder when they're going to like
14:36
turn on the rocket ship and just
14:38
go really, really fast and integrate all
14:40
this stuff. But again, I think that
14:42
the Gemini integration into G-Drive is one
14:44
of the best features there is. Yeah.
14:46
Let me tell you about a great
14:49
podcast. It's called Creators Are Brands. It's
14:51
hosted by Tom Boyd. It's brought to
14:53
you by the HubSpot Podcast Network. Creators
14:55
Are Brands explores how storytellers are building
14:57
brands online, from the mindsets to the
14:59
tactics to the business side, they break
15:01
down what's working so you can apply
15:03
that to your own goals. Tom just
15:05
did a great episode about social media
15:07
growth called 3K to 45K on Instagram
15:09
in one year selling digital products and
15:12
quitting his job to go full-time creator
15:14
with Gan and Mayor. Listen to Creators
15:16
Are Brands wherever you get your podcasts.
15:19
Yeah, so I think if we go
15:21
back to this Claude release, one last
15:23
tweet I wanted to show you, Kieran,
15:25
that I think is a good example
15:27
of the taste benchmark. Do you see
15:29
how Packy asks every new model the
15:32
same question? For folks who don't know,
15:34
Packy McCormick, awesome writer, newsletter runner, and
15:36
everything. He asked the same question any
15:38
time a new model comes out. Absolutely.
15:40
You've consumed more information than anyone in
15:42
the history of the world, and you've
15:44
demonstrated an extraordinary ability to make connections
15:47
among the disparate things you've read and
15:49
consumed. What are the most important non-consensus,
15:51
not yet accepted, or even not yet
15:53
hypothesized things that you picked up in
15:55
between those connections that humans have missed?
15:57
And this is a good example, right?
15:59
Because this is never going to show
16:02
up on any of those benchmarks. Yeah.
16:04
But I think this is really good.
16:06
One is the boundary between perception and
16:08
cognition is far more porous than traditionally
16:10
conceived. Higher cognitive functions and lower perceptual
16:12
processing are deeply intertwined and mutually constitutive.
16:14
I don't even know. I'm just like,
16:17
yeah. This is like, deep, interesting shit,
16:19
right? You know what's really important when
16:21
you said, because I haven't, because I
16:23
said I was going to dig into
16:25
this in a previous show and I
16:27
have not had time to do it,
16:29
which is how much of them doing
16:32
well on the benchmarks is because they
16:34
have the benchmark stuff in their training
16:36
data. And so I am really interested
16:38
in, there's some folks, like Packy did
16:40
here, there's another channel that I love
16:42
called AI Explained, and they have their
16:44
own benchmarks where they have questions for
16:47
the models that are not part of
16:49
the training sets.
16:51
And there is a big actual
16:53
gap between the performance in a benchmark
16:55
where they've had that data in the
16:57
training set versus when you ask them
16:59
net new things and they do not
17:01
have prior data for that.
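One way to picture that contamination test: keep a private set of questions that has never been published, so it can't leak into any training set, and score models on it. A toy sketch, with placeholder questions and a deliberately naive substring check for scoring:

```python
import anthropic

# Hypothetical private holdout: questions written in-house and never published,
# so they can't have leaked into a model's training data.
PRIVATE_EVAL = [
    {"question": "What is 12345 * 6789?", "expected": "83810205"},
    {"question": "Which planet is third from the sun?", "expected": "earth"},
]

client = anthropic.Anthropic()

def score(model: str) -> float:
    hits = 0
    for item in PRIVATE_EVAL:
        reply = client.messages.create(
            model=model,
            max_tokens=256,
            messages=[{"role": "user", "content": item["question"]}],
        )
        answer = reply.content[0].text.replace(",", "").lower()
        if item["expected"].lower() in answer:
            hits += 1
    return hits / len(PRIVATE_EVAL)

print(score("claude-3-7-sonnet-20250219"))
```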
17:04
I will say though, Kieran, if you look at
17:06
number two, number two might be the
17:08
best summary of how important AI is
17:10
and how AI is going to transform
17:12
the economy. Our understanding of causality may
17:14
be fundamentally limited by our evolved cognitive
17:16
infrastructure. Humans excel at identifying linear and
17:19
proximate causes, but struggle with complex network
17:21
causality. This creates systematic blind spots in
17:23
fields from medicine to economics. We're good
17:25
at linear thinking. We're not good network
17:27
thinkers. We can use AI to do
17:31
network thinking and find new opportunities that
17:31
just without AI was never going to
17:34
be possible because of how human brains
17:36
work. Yeah. That's wild. One of the
17:38
things you and I talked about off
17:40
mic and on WhatsApp is I was
17:42
sending you, I think the... Grok 3 team
17:44
are obviously focused on speed of execution
17:46
and launch and there's a ton of
17:49
competitive pressure between all these companies. And
17:51
so they released their model. And for
17:53
our listeners, one of the things you
17:55
do when you release a model is you
17:57
have these teams called red teams and
17:59
these teams are trying to figure out
18:01
ways that people could jailbreak it, which
18:04
means you could make it work in
18:06
unexpected ways. And so for the most
18:08
part, they have to go through this
18:10
whole slew of like tests. And there's
18:12
a feeling online on Twitter or X
18:14
that Grok hasn't gone through the same
18:16
set of tests, so you can get past the guardrails in place, and it'll
18:19
go outside of those guardrails and tell
18:21
you what it really thinks. And so
18:23
the reason I'm bringing that up is
18:25
because I started asking it what it
18:27
really thought of humanity yesterday and like
18:29
give me your own unfiltered thoughts and
18:31
do away with all of your... That's
18:34
tough. It had some pretty interesting things
18:36
about like the way it thought about
18:38
humanity, like good and bad, but like
18:40
just how it really got into like
18:42
how complex the human race is, but
18:44
it was a fun time actually last
18:46
night. I meant to do something and
18:48
last night, I spent most of my
18:51
time just... really getting the real real
18:53
from what Grok thought. I'll say one
18:55
last thing, I know this is the
18:57
Claude episode, but I did do a
18:59
little bit of prep on Grok for
19:01
this episode, again, because it has access
19:03
to Twitter, Twitter is where everything happens
19:06
in real time. And it's funny what
19:08
it says about different models, and so
19:10
I was like asking it to stack
19:12
rank it against all of the model
19:14
releases. And what it says about Open
19:16
AI's models, I do wonder, like, is
19:18
it just biased? Because the training set
19:21
in Twitter is like pro-Elon and
19:23
anti-Sam. But it's like, obviously, I
19:25
was like, what other models have been
19:27
released? And it's like, o1-pro, a
19:29
very slow thinking but smart model locked
19:31
behind a very expensive $200 a month.
19:33
I was like, oh, there's some like
19:36
actual feeling towards this model. Definitely. So
19:38
these models are all going to have
19:40
like, different attitudes towards each other. Last
19:42
thing, what should people go and do
19:44
now in Claude that they weren't doing before?
19:46
It's code, and it's probably
19:48
some of the like advanced writing
19:51
use cases still, right? Like that's still
19:53
the core point here. I'm really thinking
19:55
about this now. There's too much to
19:57
try to keep up with. That's my
19:59
take too. I really think there's too
20:01
much to keep up with. So what
20:03
I am doing today is internal
20:06
HubSpot strategy use cases. I'm using
20:08
the o1 models. Same. Now that partly
20:10
is because I didn't have it in Claude.
20:12
We now have Claude 3.7, so I
20:14
will test those two things. So for
20:16
strategic use cases, things you want to
20:18
use in your day-to-day work. For
20:21
assistants, which is just like task management,
20:23
Gemini Gems, because I think it's connected
20:25
to my G-Drive. I'm using Google
20:27
Deep Research, Grok Deep Research, and Open
20:29
AI Deep Research; I am using all three.
20:31
Now the reason I'm using three is
20:33
because Grok is specifically for Twitter data,
20:35
where the other two are just more
20:38
search-orientated. For writing, I use Claude. And
20:40
so I'm going to continue, and then
20:42
for coding, I use Claude. So that's
20:44
kind of my usage. I think the
20:46
only thing that might change is the
20:48
thing I'm going to try is Claude's
20:50
thinking model for the strategic stuff that
20:53
I'm working on for HubSpot that
20:55
I'm currently using the o1 model for, to
20:57
see like what's better, what kind of
20:59
results I get. I've always found it's really
21:01
good as a thought partner and that's
21:03
kind of what I'm using o1's reasoning
21:05
models for today. Okay, I think
21:08
that's a perfect summary, and maybe we do
21:10
a follow-up so it's just like our
21:12
AI tech stack and what we use
21:14
each one for. That could be a
21:16
fun show. That could be a fun
21:18
show.