OpenAI's Video Generating AI Is Dead On Arrival

Published May 15, 2024, 4:00 AM

Earlier in the year, OpenAI debuted Sora, an AI that can generate videos that almost look realistic. In this episode, Ed walks through why generating video with AI is a near-impossible task, and speaks with Walter Woodman of Shy Kids, who made a movie called "Air Head" using the tool. LINKS: Shy Kids' Air Head - https://www.youtube.com/watch?v=G4wJ4WeJrz4 Mira Murati Interview with Wall Street Journal: https://www.wsj.com/video/series/joanna-stern-personal-technology/openai-made-me-crazy-videosthen-the-cto-answered-most-of-my-questions/C2188768-D570-4456-8574-9941D4F9D7E2 

Cool Zone Media. Hello and welcome to Better Offline. As usual, I'm your host, Ed Zitron. A few months ago, OpenAI showed off Sora, a product that can generate videos based on a short text prompt, kind of like ChatGPT does for text or DALL-E does for images. These videos, which are usually no more than sixty seconds long, can at times seem impressive until you notice a little detail that breaks the entire facade, like in a video where a cat wakes up its owner, but the owner's arm appears to be part of the cushion and the cat's paw explodes out of its arm like an amoeba. Reactions to Sora's AI-generated videos, and indeed the existence of the model itself, have ranged from a kind of breathless hype to genuine fear that this will be used to replace video producers, in that it can create reality-adjacent videos that for a few seconds kind of seem real, especially in the case of some of OpenAI's handpicked demo videos. Yet even in these handpicked Sora outputs, you'll find these weird little things that immediately shatter the illusion, like one where a woman's legs awkwardly shuffle, then somehow switch sides as she walks around, or blobs of people merging in the background of images. These are, on some level, genuinely remarkable technological achievements, until you consider what they are and what they might do, and that there are problems in them that run through the entire fabric of artificial intelligence. A little over a month after Sora was announced, OpenAI would debut a series of short films, including one called Air Head, where filmmakers Shy Kids told the story of a man with a balloon for a head, and because this is AI, said balloon changes sizes twenty-three, twenty-four, twenty-six, twenty-seven, twenty-nine, thirty-two, thirty-four, thirty-nine, forty-one, forty-two, forty-three, and forty-five seconds into the piece, at which point I stopped counting because it got boring and I really don't want to be mean to Shy Kids, as this really isn't their fault.
The very nature of filmmaking is that you take different shots of the same thing, something that I anticipated Sora was incapable of doing, as each shot is generated fresh, and Sora itself, much like all generative AI, does not actually know anything. When one asks for a man with a yellow balloon as his head, Sora must then look at the parameters spawned during its training process and create an output, guessing what a man looks like, what a balloon looks like, what a man's features are on his body, what color yellow is, what the man's doing, and so on and so forth. This becomes extremely problematic when you're working in film or television, where viewers are far more likely to see when something just doesn't look right, a problem exacerbated by moving images, high resolution footage, and big television screens, which are now ubiquitous. Yet the press, as usual, credulously accepted Sora's, quote, stunning videos that were amazing and scary, suggesting to the public that we were on the verge of some sort of artificial intelligence takeover of the film industry, helping buoy Sam Altman, OpenAI's CEO, and his dumbass attempts to convince Hollywood that Sora won't destroy the movie business. These stories only serve to help Sam Altman, who desperately needs you to believe that Hollywood is scared of Sora and even more scared of generative AI, because the more you talk about fear and lost jobs and the machines taking over, the less you ask a very, very simple question: does any of this shit actually work? The answer, it turns out, is not very well. In a piece for fxguide, Mike Seymour sat down with Shy Kids, the people behind Air Head, and revealed how Sora is in many ways a little bit useless for making films. Sora takes ten to twenty minutes to generate a single three to twenty second shot, something that isn't really a problem until you realize that until the shot is rendered, you really have absolutely no idea what the hell it's going to spit out.
Sora has no mechanism to connect one shot to another, even with hyper-descriptive prompts. It hallucinates extra features when you haven't asked for them. And Shy Kids were shocked by how surprised OpenAI's researchers were when they requested the ability to use a prompt to request a particular angle in a shot, a feature that was initially unavailable. This is what kind of drives me crazy here, and you'll hear this in the interview with him later. These are OpenAI people, and they were making this tool for making moving images, and they didn't think that people might want different shots. I'm so glad these are the people who are in control of the future. Anyway, to quote the piece, it took hundreds of generations at ten to twenty seconds apiece to make a minute and nineteen second long film. And what's really fun about this is that the movie's fine. It was kind of fine. I just have nothing really to say about it. It's a minute and twenty seconds long, but it kind of works. But also, the balloon looks different in every other shot. This isn't Shy Kids' fault. But also this isn't gonna get better, and I will get into why as we go along. These tiny little problems I've mentioned, though, all lead to one overwhelming issue: that Sora isn't so much a tool to make movies as it is a big, fat slot machine that spits out footage that may or may not be of any use at all. Almost all of the footage in Air Head was graded, treated, stabilized, and upscaled, and that ten to twenty minute lead time on generations was for 480p resolution footage, meaning that even useful footage needed significant post-production work to look good enough. And just to give you an idea, for the non-technical members of the audience, and this is fair: the video you see on YouTube is usually somewhere between 720p, 1080p, or 4K. The TV shows you watch are usually 1080p, 4K, or upscaled 1080p.
These are all lots of numbers. What I'm saying is the stuff that Sora spits out, that takes burning a small zoo to spit out, is incredibly low resolution, on top of not being specific. Look, to put it as plainly as possible, every single time that Shy Kids wanted to generate a shot, even a three second long shot, they would give Sora a text prompt and then they would wait at least ten minutes to find out if it was right, and they'd have to accept footage that was subprime or inaccurate. And there's a really good example of this. If you watch Air Head, a lot of the shots are in slow motion, and you may think, oh, this is a cinematic choice, right, because you're kind of just admiring this man with a balloon for a head going about his business. No, no, no, no, no. They found that this was just what Sora wanted to give them when they asked for it. This was, in and of itself, a hallucination. In the same way that ChatGPT will authoritatively tell you that something is true that is not, Sora will spit out a man running in slow motion despite you not asking for that. And it's so weird. They had to, to quote them, do quite a bit of adjusting to keep the whole thing from feeling like a big slo-mo project, and it still kind of does. And that's rough. That's really rough. But you know, I'm a curious little critter, so I decided to sit down with Shy Kids' Walter Woodman to talk about his experience with Sora and have him delve a little deeper into his experience with the product. And I'd say he had a far more utopian experience and perspective on the whole thing than I expected. Now, some of you might critique Walter for being so positive about it, but I'd actually caution you to just listen to what he's saying, because Walter's perspective is interesting. He sees this as a tool, he doesn't see it as a replacement, and I think it's a valid perspective to come at Sora with.
I also think it's a perspective that kind of accepts a conceit of OpenAI's marketing strategy: that these things will get better. If they do, perhaps Walter is right, perhaps this will be an essential tool in filmmaking, even though he didn't say essential. I don't want to put words in the man's mouth. But I don't think that's the case. Let me talk to him. You decide for yourself. All right, so how did the relationship between Shy Kids and OpenAI actually begin?

The relationship between Shy Kids and OpenAI began when we made an installation for a film called Dalíland, which was premiering at the Toronto International Film Festival, and we were the only people that our friends at Pressman Film knew in Toronto. And so we made an installation that looked like Salvador Dalí's, like, studio inside of the basement of the St. Regis, which is where he lived and made work out of. And inside of that installation we made a, like, you could make your own surrealist painting, and the way that you could make that was using DALL-E, the OpenAI program. And so the OpenAI people came to visit and check out what we were working on, making sure that it was something that they wanted to be a part of. And so they met our producer Sydney, who they loved. She's easy to love. And we sent them our previous work, and so from there they asked us to join this artist group. And then when Sora came out, we saw it at the same time as everyone else, and, yeah, we got tapped on the shoulder and they said, hey, would you like to check this out and try this out? And we said, of course. That's how it came to be.

So how did you onboard? Were you just given access? Did they give you instructions? Did they physically come to you? What was that like?

It was top secret. They gave us a briefcase in a cloudy room. No, it was... yeah, there was a very simple onboarding process where they walked us through the technology as well as some of its features, and yeah, it was pretty... it was pretty. And then from there they gave us access to begin using it and making things.

And you were allowed to use it without their presence? You had direct access?

Yep, yep.

So, okay, did you get instructions on how to write effective prompts, or did you just kind of do trial and error?

No, nothing like that. I mean, in the artist group itself, there's a lot of really amazing and thoughtful creative people who kind of show their work and show how they got to make the things that they did. But no, there was no real engineering of our prompts. They were very much just: play, kind of see what comes out of you. You're creative people that we trust. Why don't you just see what works, throw spaghetti at the wall?

That's cool. So in the piece at fxguide, in the interview, someone from Shy Kids said that OpenAI's researchers were surprised when they were asked about being able to specify particular shots. What happened there? Was it just that you tried to ask Sora to do specific shots and it didn't work, or was it just not a feature?

I think that's maybe taken a little bit out of context. I think more so it's just people come from different disciplines. And when I say a wide shot on a one hundred and thirty millimeter lens, people from my area of expertise know sort of immediately what I'm talking about, whereas the researchers, they are more invested in sort of other things. And so it's not so much that they didn't understand. It's more so just there's all these terms in film, like a zolly or like a Hitchcock zoom or all of these different things that are very understandable, but even when you go from set to set, they mean something different. So I think it's about trying to create a lingua franca between all of these very different people and very different ways of using a tool. What I may call a zoom, you may call a dolly shot, et cetera, et cetera.

So that feels like a training data challenge.

Yeah, I think it's about trying to figure out how and, yeah, exactly what to train on.

Yeah. So tell me, what was the interface like? Was it a chat box? Did you have... like, just tell me about what it actually looked like.

Sure, there's limitations on what I can say about things like that, but I think the way that I've described it to people without giving too much away is, I think, if you're familiar with using something like the Adobe suite, I think that there's some commonalities. Whether you're using After Effects or Premiere or whatever, Illustrator, there's like commonalities, and if you can use one, you can sort of find your way around the others. I would say it's very similar with OpenAI's tools and models: if you are used to things like ChatGPT and DALL-E and those types of models, I think you will find an ease of use in using Sora.

So within that article they mentioned that there was like a three hundred to one shooting ratio, which, correct me if I'm wrong, means like three hundred seconds of material for each second of usable material. How does that compare to conventional filmmaking in your experience?

It would be even more seconds than that. I would say just three hundred shots at probably ten to twenty seconds apiece. So whatever the math is on that, I would say that that's pretty common with shooting. You know, when you are shooting a fiction film, or like even a documentary is even crazier for that, you shoot all day and all day. We shot a documentary recently and I actually had to go back and watch all the dailies. We counted about ninety hours of footage that we had, and from that ninety hours, you're making an hour and a half movie. So, you know, you are really trimming things down. And I think also it's like you are getting the five seconds that work, or, you know, the section of that shot that works. And I would say that's pretty common to filmmaking.
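
The arithmetic Woodman waves at here ("whatever the math is on that") can be sketched in a few lines of Python. This is a back-of-the-envelope estimate, not a figure from Shy Kids or fxguide. It assumes exactly three hundred generations, clips of ten to twenty seconds each, the one minute and nineteen second runtime mentioned earlier in the episode, and the ninety hours of dailies for an hour and a half documentary that Woodman cites. The function and variable names are made up for illustration.

```python
# Rough shooting-ratio arithmetic using the figures quoted in the episode.
# All names here are illustrative; none come from Sora or OpenAI tooling.

SHOTS_GENERATED = 300        # "three hundred shots" (assumed to be exact)
CLIP_MIN_SECONDS = 10        # "ten to twenty seconds apiece"
CLIP_MAX_SECONDS = 20
FILM_SECONDS = 60 + 19       # "a minute and nineteen second long film"


def shooting_ratio(raw_seconds: float, final_seconds: float) -> float:
    """Seconds of raw footage per second of finished film."""
    return raw_seconds / final_seconds


# Low and high estimates for the generated footage behind Air Head.
sora_low = shooting_ratio(SHOTS_GENERATED * CLIP_MIN_SECONDS, FILM_SECONDS)
sora_high = shooting_ratio(SHOTS_GENERATED * CLIP_MAX_SECONDS, FILM_SECONDS)

# Woodman's documentary comparison: ninety hours cut down to ninety minutes.
documentary = shooting_ratio(90 * 3600, 1.5 * 3600)

print(f"Air Head: roughly {sora_low:.0f}:1 to {sora_high:.0f}:1")
print(f"Documentary: {documentary:.0f}:1")
```

On those assumptions, the generated footage works out to somewhere between about 38 and 76 seconds of material per second of finished film, in the same range as the 60:1 documentary example, and well short of 300:1 if that figure is read as seconds of footage per second of film.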

How about narrative filmmaking? Because I know with documentary you have a lot of stuff, but I'm just wondering what the burden of selection is like compared to the amount of shots you take in just a regular movie, or a regular short film even.

Again, I would say, at least, I can only speak for the way that I shoot films. You know, it's subjective. It's subjective for sure. If you're David Fincher, you're shooting eight hundred takes of like someone picking up a pencil, or Stanley Kubrick, you know, is like famous for a thousand takes. I would say that the burn rate was very similar. I would say that the challenges with Sora are like, it's unbelievable at making these images that are unbelievable and so interesting to look at, but at its current state, it can sometimes be difficult to do things that in traditional shooting would be much easier, where you say, hey, can that guy go over here? Or can that person move from one side of the screen to the other? Things like that are more difficult. But again, this is baby steps. We are in like the toddler phase, so I assume that those things will get better.

So you mentioned, well, Shy Kids mentioned in the interview, that by default it tries to prevent you from creating videos that violate existing copyrights. Did you accidentally bump into this regularly, or was this something that just didn't really bother you?

No, you couldn't generate things that... So when I was mentioning like a Hitchcock zoom, you couldn't mention Hitchcock, so you had to find a different way to describe that, as opposed to like using public figures. Anything that would have a public figure or a title you would not be allowed to generate. From my experience, there weren't too many logos or brands or anything like that in any of the things that I generated, and...

But something copyrighted? Did you generate anything that looked copyrighted?

No, not to my eye.

That's fine. So, well, I know you don't know how much Sora will cost, and we don't even know when it will launch. Can you talk about how much you'd be willing to pay for it? What do you think it's worth? And I realize that this is a vague question.

For sure. I think that there is this illusion that Sora will be this solution to all problems, and I don't think that that is the case. I think Sora is a tool amongst many tools, and for certain things it will be very valuable. And so in terms of value, it's like, well, how much is a glass of water? Well, yes, if a glass of water is just like right now in my kitchen, I wouldn't like to pay that high for it. If a glass of water is for a person in the desert who desperately needs that glass of water, you can really name your price. And I would say that for some projects, I think that the usage of Sora would be absolutely invaluable, and I don't know how much exactly that would be. It would depend on the budget, it would depend on the limits and the scales. But I would say that there's other projects where I think it would be like totally inappropriate or like just not worth it. When I think of Studio Ghibli films that are hand drawn, I think the reason that those films work is because of the way that they're made. Or when you think of Aardman animation, it's like I feel that you can feel the fingerprints in that clay. And so I don't think maybe for those types of films that it would be appropriate, but I think for other types of films, like Air Head or others, it would be extremely appropriate. I think it's up to the artist's sort of discretion how much they think that that tool is needed.

Doesn't the inconsistency of shots make this deeply impractical? Because that's the thing I kept coming back to.

Yeah, I mean, it depends on what project you're working on. And again, I think that this is like early days. I think that these are kinks and bugs that are going to be changed, and already, from day one where we started using it to where we are today, massive improvements have happened, and actually improvements where they've listened to things that we have suggested and things that we'd like to see and tools we'd like to see. So I think that, for example, for Air Head, the inconsistency of having a protagonist that stays true through all these different shots, that's the reason why we put a balloon in front of their head. Because while different bodies can sort of be accepted, a different face and a different head is going to be a little bit difficult. And so we turned the limitation into our sort of main attribute. And I would say that, again, that works for that story. But I don't think that all stories are going to find this valuable. And I also don't think every single shot needs to come from Sora. I think that there's a world where it can be an addition, or it can be the start of a story, where instead of just brainstorming and just having a script, you make a sort of moving mood board or a trailer. So I think that there's like tons of stages along the pipeline where it would be extremely valuable and help elucidate concepts and bring them to life.

So, thematic question. So you avoided filming locations and all of this, but you spent a lot of time writing prompts and waiting for Sora to generate clips, then upscaling and all that. Do you think you could make Air Head, assuming you could get around the balloon head thing, do you think you could make it quicker in real life? Or was Sora kind of essential to get it done in the timeline you did? Because it's like a week and a half, two weeks, I think.

Yeah, I don't know, that's an interesting question. I mean, we definitely wouldn't be able to fly around the world and, yes, get the shots at the car race and all of those things, so I think it would probably be shorter. But I think in general, the conversations about like time and money are like super reductive in a way, in that I think that without Sora, this wouldn't exist, and I think that that is the more interesting conversation. As a director, most directors I know have a folder of unrealized ideas, and my hope is that Sora will allow us to dust off those folders and breathe new life into concepts, and when people see what those concepts could be, my hope is that it gives a lot more people opportunities to have their ideas illuminated. And whether that means to go and shoot it now traditionally or some hybrid, I think that that, to me, is what's most exciting.

So where do you see Sora going? I know you're looking at it as kind of a complementary tool, but do you think that that's its use case, or do you think it'll ever do end-to-end filmmaking?

I think, let a thousand flowers bloom, you know. I think that there are people who are going to just use it for small complementary things, to maybe help with, in the same way we use stock footage now. I think some people are going to use it as a way, say you are from a community that has maybe a little bit of a less established film community, and it's a way to have you compete with the big boys in terms of special effects and usage. And again, I don't just think it's as easy as bleep, bloop, blop, type in the prompt, here comes the thing, but rather it allows you to just have a really powerful collaborator that can help you make maybe larger concepts and bigger ideas. And then, yeah, I think that there's some people, end to end, who are going to make things that are completely generated, or most of the shots in it are generated, or things like that. In general, the thing that feels interesting to me is like helping to deepen humanity, whereas the more you sort of simplify the process, I think that that is like, I don't know, it's never a simple process. So anytime you hear about something that is going to make it all easy and make all your troubles go away, I'd be very wary of that. I think film is going to always be difficult and a challenge, and I think the benefit of Sora will be to help lead us into new paths and lead us into new directions. If I were to tell you, hey, we made this film called Lord of the Rings and it uses CGI orcs and it makes massive orc fights, you know, if I told you that in the nineteen thirties, you'd probably gasp. Or if I told you that CGI is going to be a predominant way in which we make films in twenty twenty-four, I think you would go, ah, that's not real filmmaking.

And I don't think... I think you kind of saw that in the nineties.

Really. Yeah, I don't think history is too kind to those people that go, this is not gonna work, this is not art, this technology is not the way. I just think it depends on the artist, and it depends what they want to bring to it. I think that's the key X factor here.

One final question. With that all in mind, do you think that Sora is going to hurt filmmakers? Do you think it's going to replace people?

I mean, I hope not. I mean, that's my job, so I would very much hope not. No, I very much understand people's fears, and I think that, you know, I'm a student of history, so when I look back in history and the camera obscura comes out, painters are talking about how we aren't going to need painters anymore, because now we can capture reality, so why do you need a painter to go and paint it? And it's a very valid point, but painters didn't go away. And then there was this whole new industry called photography, and then after photography, there was this whole new industry called film. And then after film, there was this whole new industry called home video. And then after home video, there was this whole new industry called cell phone video. And then there was this whole new industry called TikToks and Vines. And I just think that when people come into contact with things, their immediate... as humans, our immediate reaction is fear, and we're worried about things that are new because we do not yet understand them. And I think that for us, we like to face those things head on. And I think that the other side of that coin is that there's some kid right now in rural Bangladesh who has this amazing, big idea and maybe doesn't have all the resources that everyone else has, and with these types of technologies, it may level the playing field for kids like that to compete with the Avatars of the world, compete with the Marvels of the world. And then I think we're going to all be on this level playing field, and what's going to matter is not just who has the highest budgets and who has the most resources, but who has the best stories. And for me, that's the exciting part. We work with groups of collaborators that we love and respect, and our hope is never, let's work with them less. Our hope is always, let's enrich those relationships and hopefully grow them and hopefully bring more people into our collective and more people into our process. So that's our hope.
Maybe I'm utopic, maybe I'm wrong, but that's the choice, that's the way we're choosing to look at this.

In Woodman's mind, Sora is a tool, an extension of creatives' methods rather than a replacement of filmographers or actors, what have you. And that very much lines up with Sam Altman and OpenAI's sales pitch for Sora. His utopian perspective (his words, not mine) is predicated on both film studios acting with integrity, something they've proven to never do, and OpenAI being able to make Sora a significantly better tool, something that's going to require masses more training data and compute than I think is actually in existence. Paul Trillo, an LA-based artist and filmmaker, speaking to Business Insider in April, described Sora as a research project in alpha, mentioning that it was a little confusing who the market was for the service, and I think that gels with another problem that Woodman raised: that what might be a zoom-out shot for you would be a completely different term for someone else, which in turn would require OpenAI to have both the right training data of a zoom shot, and many, many, many of them, to be clear, but also to know the multitudes of different terminologies that go into filmmaking. Now, if they don't give a shit, maybe that's a completely different story. In short, Sora faces both the intractable problems of AI that I mentioned in the previous episode, Peak AI (go and listen to it), but also a few of its own, namely that generating moving images isn't just about ingesting a bunch of footage, but about understanding said footage well enough to generate something else based on a multitude of different perspectives, descriptions, and cultural contexts. I'm not sure that OpenAI, or really most people, realize how complex even the simplest movie is, how much work goes into making a film, and I think that that's actually what excites people about this, because making films can be inefficient, it can be extremely taxing, it can be extremely expensive.
But the problem here, and I'll get into the other ones as well, is that Sora is being sold to film studios. That is who Sam Altman is going to, and thus it's going to be built for people who don't make movies. I'm actually really happy to hear that Shy Kids and other artists are involved, so it'll actually be tuned to be somewhat useful. But I don't think people realize how gigantic the task is that Sora is going after, and how impossible I think it is that it can go any further. But I digress. I just don't believe that Sora actually works if you're making a movie. While Pixar movies may take years to render, they've got supercomputers and specialized hardware, and more importantly, the ability to actually design and move characters in 3D space. If you are putting something in Sora, what are you designing? If you put a character in this, again, you cannot have consistency between these things. That is a problem across all generative AI. You cannot do that unless, of course, you're using copyrighted footage, Mister Altman. But seriously, though, with no consistency across shots, what the hell are you doing? While there are unexpected things that might happen in a 3D animated movie or a CGI situation, you still have complete control over the thing you are putting on there, the thing you are animating. You can make subtle tweaks to them. That doesn't seem to be the case with Sora. You can't adjust what's on the screen. Even though this is AI generated, it doesn't have the benefits of regular generative stuff like CGI, which stands, of course, for computer generated image, I believe, and if I'm wrong, you're gonna yell at me in the emails. But seriously, though, the practical use cases for Sora, they're just kind of not there.
Sora's attempts to replace filmmakers, if that is OpenAI's goal, and I really believe it is, are dead on arrival, because it's an impractical and ineffective solution, and the problems it's solving are really only ones created by Hollywood executives. The AI hype bubble, as I have noted repeatedly, is one entirely reliant on us accepting the idea of what these companies will do, rather than interrogating their ability to actually do it. Sora, much like all generative AI, suffers from an imprecision and an unreliability caused by hallucinations, an unavoidable result of using mathematics to generate things, and the massive power and compute requirements are just prohibitively expensive. If this is going to end up as a VFX tool, or a productivity tool, or as a fill-in tool, it's going to need to be a lot cheaper than it is to run. Generative AI is already unprofitable to make, so for Sora to be any kind of useful, OpenAI will have to find a way to dramatically increase the precision of the prompts, reduce hallucinations to pretty much nothing, and vastly increase processing power across the board. Sora hasn't even been launched, save for, of course, these handpicked companies that got to test it, meaning that this ten to twenty minute wait between generations of moving images is likely to increase once people use the product. And that's before you consider how expensive it's going to be to run the bloody thing. This is a significantly more complex model than ChatGPT, which is already unprofitable. Sam Altman can make money, but can he make profit? I severely bloody doubt it. He hasn't before, and I don't think he's going to in the future. He's still begging Daddy Satya over at Microsoft to give him a supercomputer so his things can fart out things more profitably. It just drives me a little insane. And these things I've talked about, they're intractable problems that OpenAI has failed to solve.
They failed to make a more efficient model for Microsoft last year, in twenty twenty-three, their Arrakis model, Jesus Christ. And while GPT-5 is meant to be materially better, to quote Mister Altman, it isn't obvious what better means when GPT-4 performs worse at some tasks than its predecessor. I do believe Sam Altman is telling the truth when he says that the future of AI requires an energy breakthrough. But the thing I think he's leaving out is that it may take an energy breakthrough, and indeed more chips, for generative AI to approach any level of usefulness. And he's hoping that people will buy the hype without asking too many annoying questions like, what does this stuff actually do? Or, is this useful? Or, does this actually help me? Or, will this be around in ten years? To be clear, Sam Altman is the single most well connected and well funded man in AI, with a direct connection to Microsoft, a multi-trillion dollar tech company, and a Rolodex that includes effectively every major founder of the last decade, and he still can't get past any of these problems, partly because he is not technical and thus can't really solve the problems himself, and partly because the problems he's facing are burdened by the laws of maths and physics. Generative AI hallucinates because it doesn't have a consciousness or any ability to learn or know anything. It's extremely expensive because even the simplest prompts require GPT-4 to run highly complex mathematical equations on graphics processing units that cost upwards of ten thousand dollars apiece. Even if generative AI were cheaper or more efficient or required less power, it would still be a process that generates answers based on the extremely complex process of ingesting an increasingly dwindling amount of training data. These problems are significantly compounded when you consider the complexity, size, and massive legal ramifications of training a model on videos.
A problem that nobody has seen fit to push Altman or Murati or anyone else at OpenAI about, which is a piss-take really, seems like an obvious one. Like, hey man, you need a bunch of training data to train ChatGPT, which does words. How are you getting all these videos again? Big credit to Joanna Stern, who asked Mira Murati, CTO of OpenAI, whether Sora was trained on YouTube videos, and then Mira Murati of course made that incredible face. Go look up that video. I'll link it in the notes. That's ultimately the problem with the current bubble. So much of its success requires us to tolerate and applaud these half-assed, half-finished tools that only sort of kind of do the things they're meant to do, and we're meant to nod and smile and clap and say great job, Sammy, like we're talking to a bloody child rather than a startup with thirteen billion dollars in funding, with a CEO that has the backing of goddamn Microsoft. And Sora is the ugliest, messiest problem of them all. Its videos, while superficially impressive, are still deeply, deeply flawed. They take way too long to generate, a problem that's only going to get worse, and they're just far too inconsistent, which is a problem created by the nature of how generative AI works and its approach to generating things using mathematics. And if it's planning to be a VFX tool, if it's planning to be a sidearm for filmographers, it's going to have to be a lot cheaper than it's really practical to make it. Again, nothing OpenAI makes is profitable. They may make over a billion dollars of revenue, but everything is burning money. It's just very frustrating. It's all very frustrating. Sora seems kind of cool, but when you take away the cool side and you just look at it for what it is, it's just another con from Sam Altman. It's just another unfinished product that is not able to fit the task. It's just another thing that you look at and you say, oh, if that was just a bit better, it'd be really good.
Except in this case it would have to be a lot better. Yet all the press writes about is that it's incredible, it's amazing. And you can separate out the technological achievement of using maths to generate a visual moving image. That's genuinely cool. But you've got to stop for a second and say, as cool as this is, the people in the back of the shot, they're melding into each other. It's like The Thing. It's disgusting. Hey, that monkey's got like five arms. That's weird. I don't know. I just feel like normal people don't get this much leniency. You and I don't get people saying great job when we do kind of a shitty job. And if we brought something to someone that was insanely expensive, only really did ten percent of the job you needed it to, and also the things it created took forever and looked horrifying, I don't think we'd get told great job. I think we'd be told we'd wasted a lot of money and that someone was quite mad at us. I'm tired of this. I'm tired of these companies announcing these half-completed products and having the media dance around and act like they've delivered something truly incredible. I'm tired of the public being expected to do the mental and emotional labor for Sam Altman and other AI companies, saying it's remarkable that they're even able to do this, and assume and give them credit for some inevitable future where all of these problems are gone, despite little proof that such a thing is possible and plenty of proof that it isn't. And as I've suggested, I really don't think it is. I think Sora is dead on arrival. I think it's too expensive, too imprecise, and there is no fixing those problems. You can iterate on them, you can improve them, but without some kind of energy or chips breakthrough, they're not even going to have the compute, or really the money, to build this thing into anything even half functional. And I'm calling on the press to push back on these companies. I'm calling on them to refuse to declare this quasi-functional software as complete.
I'm tired of seeing the media back these companies and do marketing work for them when they're not done. They don't deserve the credit, and I'm demanding that people like Sam Altman actually change the world before anyone says that they're doing so.

Thank you for listening to Better Offline. The editor and composer of the Better Offline theme song is Mattosowski. You can check out more of his music and audio projects at mattosowski.com, M-A-T-T-O-S-O-W-S-K-I dot com. You can email me at ez@betteroffline.com, or visit betteroffline.com to find more podcast links and, of course, my newsletter. I also really recommend you go to chat.wheresyoured.at to visit the Discord, and go to r/BetterOffline to check out our Reddit. Thank you so much for listening. Better Offline is a production of Cool Zone Media. For more from Cool Zone Media, visit our website, coolzonemedia.com, or check us out on the iHeartRadio app, Apple Podcasts, or wherever you get your podcasts.
