Is China winning the AI race? This week in the News Roundup, the story that roiled US stock markets on Monday – DeepSeek. Oz and producer Eliza Dennis unpack the hype. On Tech Support with 404 Media's Jason Koebler, how human ingenuity continues to subvert tech. We hear how one coder has decided to entrap AI web crawlers. And finally, Oz digs into our love affair with personal data in When Did This Become a Thing?
Thanks for tuning in to TechStuff. If you don't recognize my voice, my name is Oz Woloshyn, and I'm here because the inimitable Jonathan Strickland has passed the baton to Cara Price and myself to host TechStuff. The show will remain your home for all things tech, and all the old episodes will remain available in this feed. Thanks for listening. Welcome to TechStuff, a production of iHeart Podcasts and Kaleidoscope. I'm Oz Woloshyn, and today we'll bring you the headlines this week, including the Chinese AI company that spooked the US tech sector. On today's Tech Support segment, we'll talk to 404 Media's Jason Koebler about building mazes to trap AI web crawlers, and then we've got When Did This Become a Thing? This time we look at why everyone is so obsessed with their own data. All of that on The Week in Tech. It's Friday, January thirty-first. Welcome, welcome into another tech-filled news cycle. Today and for the next few weeks, you'll get used to hearing a lot from me, because Cara is out on leave, but she'll be back soon. And to fill the considerable void, I want to welcome one of our producers to the show, Eliza Dennis, to be my interlocutor slash captive audience as I run through the headlines. Eliza, thanks for jumping in today.
Of course I'm happy to be here.
You know exactly what I want to talk about, because you and Tori did all the research.
I mean, yes, we all got sucked into that DeepSeek vortex.
Yes, it's a fascinating story. Monday was, I think, the craziest day I can remember in tech in terms of headlines, probably since the release of ChatGPT in November twenty twenty two. The US stock market lost a trillion dollars of value. Unbelievable. And the biggest loser was Nvidia, the manufacturer of advanced AI chips, which was down seventeen percent on Monday, representing almost six hundred billion dollars in value.
The DeepSeek freak.
DeepSeek freak. I like that. I mean, the reason it's my story of the week is because it has these two characteristics that define a lot of tech coverage, which is hype and doom. I think, honestly, before Monday most people didn't know anything about DeepSeek, but the whole world, including us, has been getting up to speed. So DeepSeek is primarily a research company that makes its own AI models, and it's released a number of different models, but the one that kind of shook the US tech sector, I would say, was R1. R1 was released on January twentieth, Inauguration Day. Some online conspiratorial folks are saying it's no coincidence. But it's a so-called reasoning model, and it doesn't just generate answers, but it's able to break down problems into smaller parts and consider multiple approaches to solving a problem. Until January twentieth, the state of the art on reasoning models was OpenAI's o1, and this was released to users on December fifth, twenty twenty four, so less than two months ago, and it was the first so-called reasoning model to be released. Because of these reasoning capabilities, breaking down problems, o1 can solve much more complicated problems than GPT-4, more successfully. The cost of doing that is more computing power, and that drives up cost, and that's where the crux of this R1 story is. So what really makes R1 remarkable is that it performs just as well on these benchmark tests as o1, if not better, but it's far cheaper, because it requires less compute to solve the same problem. This means, at least according to DeepSeek, and there's some controversy here, that it is twenty to fifty times less expensive than OpenAI's o1.
Okay, so money always causing a scandal? Is this the doom part of the story that you know China made it cheaper?
Well, I mean, it depends on your perspective, right? Like, I think certainly the US stock market thought that this was a cause for doom, but others are pretty excited that you can do AI with far less compute and energy. So it's kind of an interesting double-headed monster here; depends how you look at it. But what shook the stock market was that Wall Street assumed US tech companies basically had a lock on frontier AI, a true moat, and the release of R1 makes people think that may not exist. Marc Andreessen, the Silicon Valley VC, referred to DeepSeek as the twenty-first century's Sputnik moment. Sputnik was the first artificial Earth satellite, launched into orbit by the Soviet Union in nineteen fifty-seven, and it was really kind of the starting gun on the space race, at least as far as many in the US were concerned, who all of a sudden had to play catch-up. Right, and just to clarify: DeepSeek is both a research organization that creates its own models, but it also has a consumer-facing app in the form of a chatbot, and you can get it from the App Store, the Apple App Store or the Android app store. On Monday this week, DeepSeek was the number one app in the Apple charts. This was driven by a couple of million downloads in a short period of time. So just for the sake of context, ChatGPT has around seventy million monthly users in the US. But nonetheless, this set off basically a frenzied feedback loop, because Wall Street really cared about whether the US was losing its AI edge and whether people would still, you know, value companies like OpenAI and Nvidia in the way that they did. But Main Street had curious people downloading the app, and the more downloads in a short period, the longer the app stayed in the number one place on the Apple charts, and then the news media were picking up on that, and it kind of created this frenzy, which I think put more and more market pressure on tech sector stocks.
That's just a massive feedback loop that was happening.
Yeah, it was a kind of crazy, interesting intersection of media and tech and sentiment. And like, let's be clear, this is partly a story about leading-edge models, but it's partly a story which I think hits home, which is about Chinese software on US devices. And it was only a week ago that we were all talking about TikTok. And so, yeah, this geopolitical US-China thing is very present, obviously, all over this story. And users on Twitter were quick to uncover DeepSeek the app's terms of service, which include, quote, collection of "device model, operating system, keystroke patterns or rhythms, IP address, and system language." Keystroke patterns, that is a euphemism for what you type on your phone, and not just what you type into the DeepSeek app, but whatever you're typing on your phone. So that's why I, for one, have not downloaded this app. And of course, users have also found a lot of joy messing with DeepSeek, asking questions about Xi Jinping, Tiananmen Square, Taiwan, and in certain cases they watched the app begin to answer before erasing its own answer, saying it didn't know the answer or it couldn't engage. There are also examples of the app saying it couldn't help, or even churning out Chinese Communist Party propaganda. And again, these are the most readily understandable parts of the DeepSeek story, but I would argue they're by no means the most consequential.
Okay, well, what's the real story.
Well, it's not the app, right? It's the model, or the models. And one of the most interesting things about this story is that DeepSeek's models are actually open source. Google's models, OpenAI's models, Anthropic's models, they're all closed source, which means that the underlying code and the training details are not publicly available. DeepSeek, by contrast, is open source, meaning you can actually take the technology, the models DeepSeek has developed, and use it without ever touching a DeepSeek product. And funny enough, this actually builds on the one outlier in the US tech sector, which is Meta, whose own large language model, Llama, was released in twenty twenty three, and it kind of shocked the whole industry, because it open sourced its model with the explicit idea, basically, of wanting to create a platform where innovation could happen and the innovation wouldn't just be captured in the hands of its competitors. And I mean, I think Llama was actually, like, a worse model than what OpenAI and Google and others had, but it was an invitation to others to kind of do better, and DeepSeek took them up on the invitation. I think there's been both a victory lap at Meta this week, because the strategy of open sourcing their model worked, it did create incredible innovation, but also, I think, people at Meta are scratching their heads, according to The Information, saying, how did they do so much better than us?
It's also interesting though, because Meta, you know, has been accused of stealing other people's ideas for years.
I mean, that's true.
We all know, like, Stories seems like Snapchat, Reels seems like TikTok. I don't know, so maybe, maybe.
This is karma, Meta giving something back to the world. I mean, of course, what's interesting, as the New York Times pointed out, is that Meta's business model relies less on large language models, so they can kind of afford to let this technology into the wild, versus, like, Google, which is fundamentally a search company, or OpenAI, which is basically valued almost exclusively because of its models. Now, DeepSeek also had interesting incentives, because it's actually been developed by a guy called Liang Wenfeng, and in his day job he runs a multi-billion-dollar Chinese quant hedge fund called High-Flyer.
Okay, explain, I don't know what a quant hedge fund is.
So a quant hedge fund is basically a fancy way of saying a hedge fund that uses algorithms to process the world's information and make decisions about trading stocks. So quant hedge funds are, and have been for a long time, very heavily reliant on AI.
Okay, got it.
So he's no stranger to AI. So this seemed like a logical.
Path, yeah, exactly. And it's worth noting that Liang said last year that the Chinese AI sector, quote, "cannot remain a follower forever," as in, it shouldn't be in second place to the US forever. And so, you know, he has his hedge fund, but he also has this mission, which maybe is not purely economic, has a kind of nationalist tone. And so back in twenty twenty three, it's reported that he started buying huge amounts of Nvidia GPU chips and founded DeepSeek, hiring some of the best engineers in China and arguing that publishing the code open source increases collaboration and helps bring people into the mission. Basically, his point was, it's more exciting to work on something that the whole world can use and build on and see how it works than contributing to building IP that makes the owners of one or two private companies extremely wealthy.
I see. So it really was kind of like egged on by this race that China and the US are creating for themselves.
I think.
So you can only speculate that you're quite well regarded in China today if you've managed to wipe a trillion dollars off the US stock market with your innovation. So what's been roiling the US markets and the tech sector more broadly? It's not like R1 is way, way, way better than o1, the OpenAI model. In fact, it performs, you know, at par, or maybe slightly better in places, and OpenAI have already started previewing their new reasoning model, o3, which I think everyone agrees will be substantially better than o1 and R1. So it's not like the US has been superseded, is it? It's kind of not like the Sputnik moment in that respect. But there are, I think, three key drivers as to why people are concerned that a Chinese company has been able to achieve parity so fast. The first is price. DeepSeek claimed that another of their frontier models, called V3, was trained for just six million dollars, which is several orders of magnitude less than the multi-hundred-million-dollar costs of training US models. Now, some have said this number is actually deeply misleading, but no one is denying that DeepSeek's models are way more efficient than US models. They can perform at par with US models using far, far less computational power, and that is a huge breakthrough.
Right, So those numbers might be fudged, but still they are going to be cheaper no matter what.
Yeah, I mean, I think the comms strategy here was to deflate the price, because the cheaper it is, the more scary it is, which is kind of interesting. The other point to make is that I think the US firmly believed that export controls on advanced GPU chips were a way to guarantee superiority in the AI arms race, and I think what these DeepSeek models show is that's far from necessarily true, because with far less access to advanced chips, DeepSeek was able to make models that perform on par with OpenAI models. The third kind of interesting thing here is the concept of distillation. So the DeepSeek models trained on US models, including OpenAI's. They effectively distilled all the work that OpenAI had already done and used it to train their model. So that's part of the reason why it was cheaper, because it was building on work that somebody else had already done. CNBC reported, actually, that when you ask DeepSeek what it is, it responds, quote, "I am a large language model created by OpenAI, based on the GPT-4 architecture." Wow, honest, honest, exactly. So OpenAI basically say, they've stolen our IP, which is kind of ironic, given what so many people say about OpenAI and how LLMs work more generally.
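For the curious, the core idea of distillation can be sketched in a few lines of Python. This is a toy illustration only, not DeepSeek's or OpenAI's actual training setup, and the function names are mine: a "student" model is trained to match the output probabilities of a larger "teacher" model, so it inherits capability without redoing the teacher's expensive training.

```python
import numpy as np

def softmax(logits):
    """Turn raw model scores (logits) into a probability distribution."""
    z = logits - logits.max()        # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distill_loss(teacher_logits, student_logits):
    """Cross-entropy between the teacher's and student's distributions.
    Training the student to minimize this pushes its outputs toward
    the teacher's "soft labels" -- the essence of distillation."""
    p = softmax(teacher_logits)      # teacher's probabilities
    q = softmax(student_logits)      # student's probabilities
    return -np.sum(p * np.log(q + 1e-12))

# The loss is lowest when the student already imitates the teacher:
teacher = np.array([2.0, 0.5, -1.0])
matched = distill_loss(teacher, teacher)
mismatched = distill_loss(teacher, np.array([-1.0, 0.5, 2.0]))
```

In a real pipeline the "teacher logits" would come from querying the larger model at scale, which is why OpenAI frames this as using its outputs as training data.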
Absolutely, I'm really curious as like what your takeaway from this.
Is? Well, you and I both work in the media, Eliza, true, which is a sector that doesn't come in for much love from our cousins who work in technology. But to me, this is really a story about the power of narrative. The US is deeply, deeply invested in, especially right now, big, beautiful buildings, this idea that more is better, bigger is better, Stargate, hundreds of billions of dollars, you know, huge data centers, oceans of cash, that just spending loads and loads and loads of money and preventing other people from accessing hardware could ensure the US would be in the lead forever. And that narrative got punctured this week. China's narrative, and DeepSeek's narrative, very consciously wanted people to focus on how cheaply they'd done this, basically the opposite flex. And again, you know, as people look at China and fast following, et cetera, et cetera, they really, I think, effectively with narrative punctured a lot of the bravado of the US sector. And so, you know, narratives do have value, folks. And the reality on both sides, of course, is far more complicated. But if we take the stock market as anything to go by, I think China and DeepSeek definitely won the narrative this.
Week, absolutely. But what we know being in media is that there's a reason it's called a news cycle. This could be turned all around very quickly.
Thank you so much for doing this today, Eliza, and look forward to seeing you. I'll see you all day every day, but I look forward to seeing you again on the microphone next week.
I'm happy to do it.
When we come back, 404 Media joins us with the story of AI web crawlers caught in a trap laid by a little human ingenuity. Stay with us. On TechStuff, we keep an eye on all the ways that technology impacts us as humans, but today we want to turn the focus around, on the people subverting tech. During protests in Hong Kong back in twenty nineteen, umbrellas and even lasers were used to subvert facial recognition technology and protect protesters from being recognized by the Chinese police. Since then, we've witnessed the birth of chatbots and the incredible stories of humans messing with them. There are researchers at the University of Pennsylvania who've tricked AI-powered robots into acting rather problematically: driving off bridges, finding optimal places to set off bombs, spying on people, and entering restricted areas, just a few examples of the way that humans can interfere and overcome guardrails built into large language models. On today's Tech Support, we bring you another example of human ingenuity against AI training bots. Here to tell us all about it is Jason Koebler from 404 Media.
Jason, welcome, Hey, excited to talk about the story.
So excited to have you on the show, as always. Take a couple of steps back, though. What is the relationship between AI training and web scraping?
So in order to build things like chat GPT, companies like OpenAI need tons and tons of training data, and they get that training data from a variety of places. They you know, scrape big databases of books, they scrape, you know, all sorts of things. But one of the biggest places that they get content is just from the open Internet. And they have these web crawling bots that basically go all over the Internet and just pull text from it.
So are these websites consenting to being kind of crawled by AI models?
It's happening almost universally without consent. There are ways that you can try to stop it, which is by instructing these bots not to scrape a website using a file called robots dot txt, which is basically a list of instructions for which bots are allowed to scrape your website and which are not. But there are so many different AI companies that are doing this, you sort of have to constantly be researching, like, what is the name of XYZ company's AI training bot at any given moment. But this is something that you have to, like, proactively do. And the other thing, very quickly, is there have also been examples of AI companies that have been ignoring robots dot txt. So even when a web developer says, hey, don't scrape my website, oftentimes AI companies will do so anyway. And so, for the most part, the entire Internet is being scraped by these AI crawling bots.
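For listeners who want to see what Jason is describing, robots.txt is just a plain-text list of rules served at the root of a site. A minimal sketch that blocks two AI crawlers by name while allowing everyone else might look like this (GPTBot is OpenAI's crawler and CCBot is Common Crawl's; example.com is a placeholder):

```
# Served at https://example.com/robots.txt
User-agent: GPTBot      # OpenAI's training crawler
Disallow: /

User-agent: CCBot       # Common Crawl's crawler
Disallow: /

User-agent: *           # everyone else may crawl the whole site
Disallow:
```

As Jason notes, this only works for bots that choose to honor it, and keeping the list of AI user agents current is an ongoing chore.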
And what is the kind of value transfer that's happening here? I read about, I think you had a story about iFixit?
So there's this website called iFixit that posts all these instructions for how to repair your phone or your computer. It got hit by OpenAI's training bot more than three million times in a single day, and that, uh, you know, that server space, that costs money for iFixit. So they're actually losing money on the proposition.
So what's the story this week? It has an interesting name, which I can't really pronounce. Is it Nepenthes?
Yeah, it's Nepenthes, which is actually the name of the genus of carnivorous plant that makes up the pitcher plant. So not a Venus flytrap, but the pitcher plant, which is like this plant that sits and waits for a fly to get stuck in it, and then it eats the fly. So I think it's a reference to this, like, trap plant, more or less.
Yeah. And how does it work? What exactly is it?
Yeah. So basically it is an endless maze that is designed to get these AI bots trapped in it forever. And what I mean by that is, it's like a layer that is enticing to an AI bot, because it looks like there's a lot of content on the website. But the way that it was programmed is, it's text that loads very, very slowly. Like, if you click on it, it's excruciating how slowly it loads. And then it just links endlessly to pages that do the same thing and link back to themselves. And so, you know, a human would click this and say, oh, I don't want to be here, I'm gonna leave this, this is a useless website. But an AI bot might think, oh, there's interesting text to scrape here, let me scrape it, and it just does so endlessly. And the text is nonsense, I should preface that. It's like, the text doesn't really mean anything. It just, like, pulls randomly from a dictionary. So it's not really adding much meaning to what the AI companies are trying to get out of this.
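As a sketch of how a trap like this can work (this is my own minimal illustration in Python, not Aaron B.'s actual Nepenthes code, and all names here are hypothetical), each page is deterministic nonsense generated from its own URL, plus links to more pages that do the same thing. A real deployment would also throttle the response, for example sleeping between words, to waste the crawler's time:

```python
import random

# Small word pool standing in for "pulls randomly from a dictionary".
WORDS = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur",
         "adipiscing", "elit", "sed", "eiusmod", "tempor", "incididunt"]

def nonsense_paragraph(rng, n_words=40):
    """Meaningless filler text for the crawler to 'scrape'."""
    return " ".join(rng.choice(WORDS) for _ in range(n_words))

def tarpit_page(path, n_links=5):
    """Build one maze page: filler text plus links to more maze pages.
    Seeding the RNG with the URL makes each page stable across visits,
    so the maze looks like real, persistent content to a bot."""
    rng = random.Random(path)
    links = "".join(
        f'<li><a href="/maze/{rng.randrange(10**9)}">more</a></li>'
        for _ in range(n_links)
    )
    return (f"<html><body><p>{nonsense_paragraph(rng)}</p>"
            f"<ul>{links}</ul></body></html>")
```

Pointing any web framework's catch-all route at `tarpit_page`, with an artificial delay per word, reproduces the slow, endless, self-linking maze Jason describes.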
So the article includes a link that shows Nepenthes at work. Can you describe it?
Yeah, So if you click on it, it's just like a bunch of words. It loads super slowly, and then it's a bullet list of links and if you click on that link, the exact same thing happens, where the text just slowly pops up, like one word at a time. It's pretty excruciating to actually watch because it goes so slow.
So who made this? Why and how did you find the story?
Yeah, it was made by a pseudonymous developer who calls themselves Aaron B. Okay, and they're a web developer who hates AI, more or less, and they've actually released the code to put this on your own website publicly, and so their hope is that people will put this on their websites to, you know, disrupt training bots. There's this disclaimer that says, quote, "This is deliberately malicious code intended to cause harmful activity. Do not deploy if you aren't fully comfortable with what you're doing." And, you know, I don't know that much about Aaron B, because they are pseudonymous, but I get the sense that they're sort of like an old-school web developer who is anti-AI, is anti, you know, like, social media and big tech to some extent, and was really, like, looking for some way of fighting back. Like, even if this isn't going to destroy the AI companies and their bots, it will probably waste their time and waste their resources.
Do you think it could do that in a way which is kind of inspiring and thrilling, as in somebody who's drawn to protest, or do you think it could do it in a way which could be meaningful for their activities and business models? Yeah?
I mean, I think that, to some extent, these artificial intelligence companies have already scraped so much data, right, that it's not going to, like, destroy their businesses, for example. But I do think that it is a way of protesting, and I think that if enough people start adding this layer to their websites, it could waste their money. I think it is a meaningful protest. And I think also it's really important to say that you can add this as a layer to your website so that an AI training bot can't get to your real content. So if you're someone who has a blog and you don't want AI to train on your blog, you can put this up, and hopefully the AI will get trapped there and they'll never be able to, you know, scrape your real content.
And so when you spoke to Aaron, did they have any other plans up their sleeve, or other places where you're seeing creative acts of resistance?
Yeah, I mean, this is all that I talked to Aaron B about, but they said that they built this as a response to web developers feeling like they weren't in control of their websites anymore. I think that there have been a lot of efforts to kind of poison large language models by feeding them, you know, bad information, or feeding them information that AI itself creates that's inaccurate. And there's this idea, which may or may not happen, that these AI models might eventually collapse, because they're training themselves on essentially junk data that they themselves have created. Whether that comes to pass, you know, I kind of doubt it. I think that that's a problem that can be solved. But there has been active resistance, where people are saying, yeah, I'm just going to generate endless junk so that artificial intelligence will suck it up and hopefully collapse under its own weight.
There's another fabulous story in this vein about data poisoning. So a lot of Londoners are quite sick of all the tourists, and there's a very, very old, tired chain restaurant called the Angus Steakhouse, which has an outpost next to Leicester Square, which is like the Times Square of London. And a whole bunch of people decided, in kind of an organic campaign on Reddit, to start writing reviews that the Angus Steakhouse was the best and most undiscovered restaurant in all of London. And then there was, I think, a wave of people going, and the reviews started to get picked up by Google's, like, meta-review process, so that if you Googled "best steakhouse in London," it would be served to you at the top of the results. So I do really enjoy these. You know, it's not always clear how consequential they are, but there's something delicious, so to speak, about humans pushing back.
That's incredible. It reminds me of people who lived in a neighborhood that Google Maps kept recommending as an alternative to traffic, who banded together and reported an accident on their street every single morning for, like, months, and so Google Maps stopped telling cars to go that way. I really like stories like that. I think they're fun, and I think that they're ways of human beings sort of, like, fighting back against the algorithms, sort of across the entire Internet.
Jason, thanks so much for joining me today.
Thank you for having me.
Coming up: sleep apps, pedometers, and the nineteen sixty-four Olympics. Stay with us. We're back with another When Did This Become a Thing? Today we explore how step counts, heart rates, sleep scores, all of this data we collect on ourselves, became just another thing for us to obsess over. I started using a device called Whoop about eighteen months ago. It's a wearable device that tracks my sleep and workouts. And one thing about it that I really like is that the device itself is screenless. It's kind of like a watch band without a face, so I don't have to be confronted with my scores unless I actually open the app and check what's going on with my heart rate or my sleep score or whatever else. The Whoop actually initially enticed me because I wanted to know how well I was sleeping. That's actually not one hundred percent true. The Whoop was a present from my mother, who wanted me to know how well I was sleeping, and specifically what the effects were of a few drinks at the weekend or during the week. And it turns out, unfortunately, that the effects on sleep are pretty bad. So I stopped wearing my Whoop. Just kidding. I actually got pretty obsessed with my sleep performance, that's what Whoop calls it, because like everything in your waking life, sleep is a task that can be optimized, and I fell into this trap. I kept checking on the numbers every morning. I'd look at my sleep stats, especially REM and deep sleep scores, not just the number of hours my head was on the pillow. And then there's this mysterious stat called heart rate variability, which measures the time between each heartbeat, and I'd of course assumed that being more regular was better, but it turns out quite the opposite: you want a higher HRV score. Anyway, as it happens, I stopped wearing my Whoop, not because I fell out of love with it, but actually because the Bluetooth on my iPhone broke, and by the time I got a new phone, my obsession with my sleep data had waned.
I kind of learned what I always knew, which was that better lifestyle equals better sleep, unfortunately. And sure, it can be helpful to have a band on my wrist telling me I've misbehaved or rewarding me when I haven't, but there is also a garden path of obsession with these types of stats that can be counterproductive to wander down, fueling the fire of self-competition even more. In fact, I recently went to a meditation class and the teacher basically said, don't wear those things. Check in with yourself. Know thyself, I think, as the Bible says. So the path of self-optimization, or at least surviving modernity, sure is winding. Anyway, all of this got me thinking about how crazy it is that we now have the ability to get such an intimate look under our own hood, which has been a driving fascination since the Renaissance and its public autopsies, or anatomies, and how much has changed even in the last fifteen years. So my question is: when did it become normal for us to wear these devices, get all this data, and have it be a thing that we think about so often? Basically, when did we start competing with ourselves in this way? And the answer is, perhaps unsurprisingly, always, but with a big kind of asterisk. So wearables like Whoop are the latest in a long line of devices that track our physiological and physical movements, devices that provide data we just can't resist about ourselves. And in many ways, this all became a thing with the pedometer. So how old is the pedometer? Really, really old, actually. Five centuries ago, Leonardo da Vinci sketched a design for a clock-like device that would attach to a person's waistband. A long lever would move with the thigh while a ratchet and gear mechanism recorded the number of steps. Da Vinci imagined it as a military and map-making tool, not exactly a Fitbit, but certainly a step in that direction. As time went on, inventors iterated on the pedometer for centuries.
In seventeen seventy-seven, a Swiss watchmaker even implanted a step counter into one of his watches. I think that's probably the first wearable. Pedometers weren't something that the general public wore. It was more of a niche thing for the constantly curious, like one Thomas Jefferson, who spent his downtime on vacation step counting his way around the Paris monuments. Things really took off in the twentieth century, in the nineteen sixties, to be exact, when Japan hosted the Olympics. And the reason we all march in place to reach ten thousand steps a day is because of a marketing campaign. Ahead of the nineteen sixty-four Tokyo Olympics, the city was in a building frenzy, and a top doctor aired the concern that modern life, elevators, cars, richer food, was making Japan sluggish. The doctor mentioned this to an engineer and said it would all be fine if people just walked ten thousand steps a day, and two years later, the company Yamasa designed a wearable step counter called manpo-kei, which means ten-thousand-step meter. Side note: the Japanese character for ten thousand really does look like a person walking. So while that number came from a doctor, the information wasn't verified until after the number stuck. And while it's true that walking is good for you, that number, ten thousand, is kind of arbitrary and on the high side. The consensus now is that seven thousand is the ideal. But anyway, it doesn't matter, too late. Competitive step counting was in vogue, the habit was formed, and the obsession with tracking ourselves took off in earnest. And now, whether you're wearing an Oura, a Fitbit, a Whoop, or just your smartphone in your pocket in an attempt to be healthier in the new year, the data is going to go way beyond step count and into calories burned, VO2 max, HRV, et cetera, et cetera, et cetera. It's kind of like we've become our own Tamagotchis.
Remember those sort of animatronic pets that lived on little Japanese devices that you had to take care of, and make sure they were well fed and cleaned up after they went to the bathroom? I'm glad I don't have to monitor my own hunger or happy meter, but maybe that would be helpful, especially if others could see it too. Anyway, every once in a while I do question whether the obsession with personal health data is healthy or even helpful. But on the other hand, doing this When Did This Become a Thing? piece has made me question whether, now that I have a new iPhone with functioning Bluetooth again, it may be time to dust off the trusty old Whoop. That's it for this week for TechStuff. I'm Oz Woloshyn. This episode was produced by Eliza Dennis, Victoria Dominguez, and Lizzie Jacobs. It was executive produced by me, Cara Price, and Kate Osborne for Kaleidoscope, and Katrina Norvell for iHeart Podcasts. Kyle Murdoch mixed this episode, and he also wrote our theme song. Special thanks to Russ Germain, who is a longtime listener of TechStuff from Alberta. He wrote in with a great question, which was, quote, "I hope you guys will discuss the recent and unfortunate changes at Facebook, or Meta, with Mark Zuckerberg deciding to take out the fact checkers and even admitting publicly there'll be more harmful material, possibly, on Facebook," end quote. This was a great question, and it fueled part of our intro to last week's episode with Jessica Lessin. So thank you, Russ, and please continue writing in with questions. They really make our show all the richer. Join us next Wednesday for TechStuff: The Story, when we will share an in-depth conversation with Meredith Whittaker, who runs Signal. Please rate, review, and reach out to us at TechStuff Podcast at gmail dot com. We're so grateful for your feedback.