My First Thoughts on the New OpenAI Strawberry Model (OpenAI o1-preview)

Published Sep 19, 2024, 8:57 PM

Here are my first thoughts after using OpenAI's new Strawberry model for a couple of hours.

Subscribe to the newsletter at: 
https://danielmiessler.com/subscribe

Join the UL community at:
https://danielmiessler.com/upgrade

Follow on X:
https://twitter.com/danielmiessler

Follow on LinkedIn:
https://www.linkedin.com/in/danielmiessler

See you in the next one!

All right, welcome to Unsupervised Learning. This is Daniel. I'm going to start off with something that just happened: Strawberry just launched. It's being called o1, and I assume the o might stand for Orion, because people were saying it might be called Orion. So this is the new model from OpenAI, and I've been messing with it for a couple of hours already.

First thing: I gave it the task of building a business plan for something I'm working on, and it produced output that was far better than GPT-4o or Claude 3.5 Sonnet. It was really quite good and very detailed. It took quite a while, though, and there's no streaming in the API, so it feels a little rough compared to the current models. But that will come with time.

It's also quite expensive. I did a couple of conversation analyses by passing in transcripts from podcasts, and after 2 or 3 of those it was almost a dollar. There's also a mini version that's way less expensive, but I'm trying to test the capabilities, so I'm using the full model. A few requests for a dollar, whereas normally a few dollars buys you many dozens or a couple hundred requests. So it's many factors more expensive. Just something to consider: as with most models, you don't need the biggest, best, or latest. (This is from a tweet I just put out, so I'm going through it.)

This model does one particular thing well, better than anything else: pausing to think and actually going step by step. That's the magic sauce here, the chain-of-thought reasoning. If you don't need that for what you're trying to do, you definitely shouldn't use this model, because it's more expensive and takes longer to run. And this type of model, and similar ones going forward, is going to massively benefit from high-quality prompting, like the patterns we use with Fabric, which is open source on GitHub if you're not familiar (though you probably are if you're listening to this). Essentially, the more you know what you want and the better you can articulate it, the better this model will perform, because the whole thing is built on chain of thought. The more you give it to work with, the better.

(Okay, sorry about that. I was just checking to make sure I wasn't doxxing anyone by showing you my messages. I was not, so I don't have to rerecord.)

So, continuing on: the better you can articulate exactly what you want, the better your results. That's the bottom line here.
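Since all of my testing is through the raw API, here's roughly what a call looks like, along with the cost math behind that dollar-for-a-few-requests observation. This is a minimal sketch using the OpenAI Python SDK; the launch-time restrictions and list prices are my understanding at the time of writing, so double-check the current docs, and the transcript file name is just a placeholder:

```python
# Minimal sketch: one o1-preview request via the OpenAI Python SDK.
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

transcript_text = open("podcast_transcript.txt").read()  # placeholder input

# At launch, o1-preview supported neither streaming nor system messages,
# so the entire prompt goes into a single user message.
response = client.chat.completions.create(
    model="o1-preview",
    messages=[{
        "role": "user",
        "content": "Analyze this conversation and summarize the key claims:\n\n"
                   + transcript_text,
    }],
)
print(response.choices[0].message.content)

# Rough cost math. Launch list prices per 1M tokens (verify current rates);
# note that o1's billed output includes its hidden reasoning tokens.
INPUT_PER_M, OUTPUT_PER_M = 15.00, 60.00
usage = response.usage
cost = (usage.prompt_tokens * INPUT_PER_M
        + usage.completion_tokens * OUTPUT_PER_M) / 1_000_000
print(f"{usage.prompt_tokens} in / {usage.completion_tokens} out, ~${cost:.3f}")

# Newer SDK versions break the thinking tokens out separately.
details = getattr(usage, "completion_tokens_details", None)
if details:
    print("reasoning tokens:", details.reasoning_tokens)
```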
Now, a lot of people are going to ask: is this AGI? Sam Altman already responded that it absolutely is not, and that should settle it, coming from the actual creator of the thing. I don't think it is either, for whatever that matters. But here's my request to the internet: anyone claiming something is or is not AGI should also provide a concise and achievable definition of what AGI means.

And I have one, of course, which I've talked about before: the ability of an AI, whether a model or a product or a system, to perform the work of an average US-based knowledge worker in 2022. I say 2022 because that's pre-GPT-4, so basically pre-AI in these terms. So yeah, anyone who's talking about AGI, make sure they have a definition. Otherwise you're just wasting your time, because the entire conversation will actually be about definitions, and you might not even figure that out until fucking two hours later. Sorry for the cussing.

All right. One of the most important changes with this model, and this is massive, is that it's the first model of its kind to actually spend tokens to think. Before, you had input and you had output, and the amount of work being done, and billed, was based on the number of tokens coming in and the number of tokens going out. That was the extent of it. Now you still have tokens coming in and tokens coming out, but there are also tokens being spent while it thinks: it's actually reasoning through how to solve the problem.

What's really fascinating is that you now have multiple levers. You can do better prompting (that's the next piece, number seven in the tweet). You can use a smarter model. Or you can have the model think harder about the problem. These are all levers and knobs we get to use for better results from AI, and this is the first time we've had that third lever: more effort spent at inference time. OpenAI even says in the blog post, essentially: right now it thinks for a few seconds and gets back great results, but what if it thinks for minutes? Hours? Days or weeks? And what if we also give it more compute to think with? The example they gave, I think in an OpenAI post, was: how much do you want to solve cancer? What if you had one data center just for working on cancer, and another just for working on aging, and so on?

So you have models that scale with the difficulty of the inference. Then on top of that you have the smartness and scalability of the neural net itself, so maybe that's GPT-5, GPT-6, whatever, combined with good prompting, combined with this thinking capability, combined with the giant infrastructure to run it all. That's insane. And it scales all the way down to the smallest, simplest problem, where it's like, whatever, GPT-3 gets you the answer almost instantaneously. In fact, forget GPT-3: it's some local model that only does one thing well, spending almost no resources, running right on your phone, bouncing back immediately, barely costing any GPU or CPU cycles, because it's an easy thing to answer. So now we're talking about AI that scales with the difficulty of the problem.
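As a toy illustration of that idea at the application layer, here's a hypothetical sketch that routes easy questions to a small, cheap model and hard ones to the reasoning model. The difficulty heuristic is completely made up for illustration; a real router would use a classifier or a cheap model's own judgment:

```python
# Hypothetical difficulty-based router (illustrative only).
from openai import OpenAI

client = OpenAI()

def answer(question: str) -> str:
    # Toy heuristic (an assumption, not a real technique): long or
    # explicitly multi-step questions get the expensive reasoning model;
    # everything else goes to a small, fast, cheap model.
    hard = len(question) > 400 or "step by step" in question.lower()
    model = "o1-preview" if hard else "gpt-4o-mini"

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

print(answer("What's the capital of France?"))             # cheap model
print(answer("Work step by step through this proof..."))   # reasoning model
```

Swap the cheap branch for an on-device model and you get the phone scenario above.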
Think cancer, aging, getting out of the solar system, escaping the sun as it expands, and ultimately the heat death of the universe. That's a big one, because entropy kills everything, so at some point we're going to need a way out of here, assuming we survive that long. Not happening anytime soon; I wouldn't worry about it. But these are the types of problems that are really exciting: the size of the problem becomes a factor, something you point AI at, with lots and lots of knobs and levers controlling how much capability you spend on it. I think that's really cool.

Another important thing to mention: this innovation seems independent of what we were waiting for with GPT-5. Based on everything I've read in OpenAI's releases, all the rumors, and conversations with a bunch of people who've been speculating about this, it seems completely independent of whether this is GPT-4o or an early version of GPT-5. It doesn't really matter, because it's a separate axis. Thinking capability is one axis, and how big or smart the neural net is sits on another. That's really cool to think about, because if GPT-5 is still coming later this year or early next year, whatever they end up calling it, imagine GPT-5 with this thinking capability. Presumably this is a feature you can add onto any model.

And this is really crucial. I've been talking for a long time about slack in the rope: tricks we're going to use to jump ahead in AI advancement. A lot of people are saying we're running into a data wall, that neural nets can only get so good, that we've already hit the ceiling. So many people are saying things like this, and it sounds absolutely ridiculous to me. First of all, they're the same people who said we wouldn't be here. Now we are here, everyone's surprised, and they're saying: well, here's what we know for sure, it's not going to get any better. How can I believe you if you didn't predict any of this, were absolutely certain back then, and are now absolutely certain it won't jump ahead again?

Leopold talks about this in his paper. There are lots of different ways to get better. I forget exactly which levers he listed, but it was roughly the architecture of the model, the size of the model, and I believe "unhobbling" was the other one, which is what I was calling slack in the rope, or tricks, about a year ago. This is what I told a friend of mine who's really smart on this stuff. I said: watch, we're going to find multiple tricks. We'll be grinding away at percentage points, and then we'll find one thing that jumps us 2 or 3 or 5 or 10 or 100x ahead. And actually, I learned this from him in the first place. He was the one who told me there are things that jump you ahead, and I think he gave me an example of one of those big jumps from some public paper.
My natural intuition was that there are going to be a lot more of those, and they won't come from grinding along the hard axis; they'll be hanging off to the side. It's like: oh, did you know if you just changed the color of this? Did you know if you just oriented the data backward instead of forward? Did you know if you pruned the data in this particular way, or added this particular dataset? I'm making up these examples, but the point is simple things you wouldn't think would work. And this is why Leopold talks about what happens if you automate an AI engineer, or an AI researcher as he called it. That's when it gets completely silly, because then you have the ability to go try a whole bunch of these things, including these tricks, automatically.

All this to say: the slack in the rope, this series of tricks, is going to keep multiplying our advances. And that's at the same time we're working on the algorithms (that was the other factor I was forgetting: algorithms). And at the same time we're working on the size of the neural net, and its quality and structure; everything about the neural net is going to get bigger and more powerful, mostly as a matter of size and parameter count. All of those things are changing at the same time as we're finding all these tricks. So this has just begun. That's what people don't realize: this is just now starting. We're going to look back in two years and say, what was that? That was silly. And so I really want to warn people against thinking we're hitting some kind of wall.

Think of it this way. We just found alien technology. We have no idea how it works, we're poking it with a stick, and it's already spitting out amazing things. We've got a glowy ball. We don't know how it floats, how it's doing anti-gravity, how it's reflecting its surface, how it's coming up with these answers, or how it got here from another solar system. We don't know anything about it. You poke it with a stick and it tells you magic stuff, and you go, holy crap, that's amazing. Then somebody walks up, sees you poke it with a stick, and says: yeah, that's all it's ever going to be able to do. I've seen you poke it with a stick twice, and it gave you kind of a similar answer, which means that's all we can ever learn from this alien ball. That's their conclusion: I am certain, since you poked it with a stick three times while I was standing here and it gave you kind of a similar answer, that one, it must be stupid; two, it's not as smart as us; and three, this is as smart as it's ever going to be. This is the most it has to offer. That is the claim being made by these denialists, in my view.

Now, that doesn't mean the current shiny ball is better than humans, or that it should replace humans, or that it can do everything we can do. This is not a competition. And here's a better way to think about it, because this is not like a rock that we've animated. Someone said to me, hey, this is not thinking, this is processing. And I'm like, come on. Let's assume we know how our own brains work.
An alien comes here and we look at its brain, or it shows us its brain, and it looks different from ours. We'd go, oh, you do neurons and synapses differently than us. Who's going to walk over and say: well, since they do neurons and synapses differently than us, they're not thinking, because only humans can think? They got here, didn't they? It's a little shiny ball, and it got here from whatever part of the galaxy or universe it came from. They're obviously doing something right. And AI is obviously doing something right too. So I think it's a little bit specious, if that's the right word, to just magically assume that we are the best, that only we are thinking, that only we are special, instead of considering that we might have a nascent alien intelligence going on here that's doing things very much analogous to what we do.

It reminds me of the first time I clicked around inside of Linux. This was the late 90s, must have been '97 or '98. I had started with Windows, and I'm clicking around in Linux and thinking, oh, it opens windows, it opens things I can click and navigate; it's just like Windows Explorer. And that absolutely blew me away: this was just a different way of doing the same thing, and underneath it there's something universal, the need to browse files, open windows, close windows. That clicked for me: all operating systems are going to do this, just differently. It's the same with aliens. They might think differently, but whatever, they have to think. So why would we expect this synthetic intelligence we've birthed to do it exactly the same way we do? We shouldn't. We got here accidentally, stumbling through time via evolution, and we've got the version we have, which is awesome, obviously. But that's way different from a thing we invented with transformers in 2017, and I know the field goes back much further than that, but you know what I'm saying.

All right, this is becoming a long one, but we'll go with it. Basically, we have no idea how early all of this is. We're likely to find ten, twenty, or two hundred more of these holy-crap optimizations, like this thinking thing, before we start hitting any limits of neural network architecture or the transformer. Plus, we could just find something better than the transformer. Do you realize how lucky we were to find the transformer? The people who wrote that paper thought they had a cool way of doing something; they didn't know what they had. You should watch Karpathy talk about the transformer: this thing is a general-purpose computer, insanely good at learning, and he talks about different ways it's better than humans at learning. Some people more or less stumbled onto this thing, and it shot us off.

So check this out: this is another example of finding tricks, or slack in the rope, just lying on the ground. We stumble through AI for decades and decades, and then someone says, hey, there's something kind of cool about this attention mechanism.
Hey, what do you think about this architecture for a neural net? Boom! Now we have this takeoff. And there's nothing saying somebody isn't going to come along and say: I like what you did with that transformer architecture, but what if it looked like this instead? It might be 20 times better. It might be 2,000 times better. It might be 4% better. It doesn't matter: we have only just begun. I can absolutely guarantee you that, assuming we don't kill ourselves off as a result of this, which would set things back. I'm trying to get you to think about things in this way, because it's insane what's about to happen. And I'm going to have more examples soon; I'm working on one right now on this other screen, a pretty cool thing I'm building with it. Okay. So that was that.
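P.S. For anyone curious what that attention mechanism actually computes, here's a minimal sketch of scaled dot-product attention, the core operation from the 2017 "Attention Is All You Need" paper, in plain NumPy. It's illustrative only: a real transformer adds learned projections, multiple heads, and masking.

```python
# Minimal scaled dot-product attention (the core of a transformer).
import numpy as np

def attention(Q, K, V):
    # Each query scores every key; scaling keeps the softmax well-behaved.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax over keys: how much each position "attends" to the others.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output is a weighted mix of the values.
    return weights @ V

# Toy example: 4 tokens, 8-dimensional embeddings, self-attention.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)  # (4, 8)
```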