Emily Leproust is the co-founder and CEO of Twist Bioscience. Her problem: How do you store data in DNA, and make it cheap enough to work in the real world?
The cells in our bodies contain an incredible data storage system: DNA. Now, scientists have figured out how to use DNA as a digital storage device that is stable and incredibly compact. If you stored all the data on the Internet in DNA, it would fit in a shoebox.
But there's a problem: It's still too expensive to work in the real world. On today's show, Emily Leproust explains how DNA storage works, and what it will take to bring it to market.
Pushkin.
Every cell in your body is an incredible data storage device. Inside the nucleus of almost every cell is your entire genetic code, all of your DNA arranged just so, providing the complete blueprint of you in every cell. Unbelievable. A few decades ago, some scientists considered this fact and thought: DNA has stored the data of pretty much every living thing on Earth for billions of years. It's incredibly efficient, reliable. We know it works. What if we could use it to store other kinds of data? For certain things, it could be profoundly better than the hard drives and tapes we're currently using for data storage. And now they've figured out how to do it.
I'm Jacob Goldstein, and this is What's Your Problem, the show where engineers and entrepreneurs talk about how they're going to change the world once they solve a few problems. My guest today is Emily Leproust. She's the co-founder and CEO of Twist Bioscience, one of the leading companies working on storing data in DNA. Sounds like science fiction; it's just science. Emily's problem: how do you make storing data in DNA as cheap as storing it on a hard drive?
Emily's company, Twist, is in the DNA synthesis business. They make and sell custom DNA to researchers and healthcare companies, and Twist is also part of a group of companies, including Microsoft and the data storage company Western Digital, that are trying to bring DNA storage to market. It's clear that it can work in the lab. The trick now is to make it work in the real world and have the economics make sense.
We started our conversation by talking about a key problem with how data storage works today. Hard drives and flash drives and even old-fashioned tape, which is still used, are all based on magnetization, and magnetization just does not work for long-term storage.
It's unreliable, it degrades, and so if you want to archive data for a very long time, what you have to do is, every five to seven years, you have to take the data from one hard drive to the next, or from one tape to the next, or from one flash to the next, every five years, because you just can't trust that the data will stay there more than five years.
And so the system we have now, obviously it's great, right? Like, data storage is this amazing miracle of the modern world. But you're saying a flaw with it is it doesn't work forever. In fact, it becomes unreliable after five years.
And one of the big efforts that's going on there is exactly that: constantly replacing hard drives that are broken with new ones. And so the migration, the maintenance, and the energy that you need to do that is a big issue for archiving, and archiving is more than sixty percent of the market of data. And so that's where DNA totally changes things: you could archive data for a long time with no migration, no maintenance, no energy, and that could be a game changer.
So the key use case is, like, it's not my phone, it's not my laptop, it's not anything I'm sort of personally doing. It's maybe if I'm saving something to the cloud, you know, my, whatever, kids' baby pictures, and presumably, more significantly, big institutional, sort of corporate or governmental kinds of files. I mean, is that the real opportunity that you're talking about here?
Yeah, exactly. DNA will never be on your computer. It's not for what's called hot data, the data that gets in and out all the time. It's more for cold data. That's the data that you read infrequently, but that is the majority of the data out there, and that is what DNA can solve. If you're a bank, if you are an insurance company, you have to store the data for a very long time. If you're a government, you need to store that data for a long time.
So the longer you're going to need to keep the data, the more it makes sense to store it in DNA.
Exactly. And that's why the first product we're going to launch is, we call it a Century Archive, so it's basically, you purchase one hundred years of storage.
I like that. That's good branding. Did a marketing person come up with that? Did you come up with that?
I did not come up with that, but yes, definitely a very clever marketing person came up with that. And then after that we'll sell the Millennium Archive.
The Millennium, the Millennium Archive. But well, let's get to one hundred first. I can have you on in a year, and we can talk about the Millennium Archive. But so, are you right now selling a product where, if a company wants to save something for a hundred years, they can give you money and you will store their data in DNA? Does that exist?
Yeah, we've done it before. Right now, we are in early access, so it's not fully broadly available. But if someone comes today and they have money, we'll definitely do it for them. The one thing, though: today it is still quite expensive, and we're working to make it affordable. It will cost a little bit more up front, not too much, because people don't really want to pay a premium, and this market is elastic. So it's kind of ironic, but the lower the price of what we sell, the more we're going to sell.
Sure. No, that is a very intuitive relationship and a classic issue in technology, right? You have this new thing and it's really expensive, so nobody's buying it, and the way it gets cheaper is lots of people buy it and then you figure out the economies of scale and it gets cheaper. I mean, is that the dynamic you're talking about here?
That's exactly the dynamic. And DNA is amazing for that, because DNA is extremely tiny. Yeah, it's extremely dense.
You could take all the data on the Internet, which is YouTube, I mean everything, it's the whole world, and it will fit in a shoebox if you put it in DNA.
So I want to understand a little bit what that means, because it's so not intuitive. So all the data on the Internet means, like, every video that's on YouTube, everything everybody has ever posted to the Internet, every tweet, every Facebook post, everything.
Everything.
So that all exists now on hard drives, right, in giant rooms full of computers all over the world, cooled by fans. Right? Like, all the data on the Internet exists now sort of distributed across many, many warehouses. Do you know how much space it takes up now? I don't have any idea.
I don't know, but it's millions and millions of square feet. What I do know is that one Facebook data center in Texas uses two percent of the Texas electricity.
When we talk about storing data in DNA, can you walk me through what actually happens? What happens when somebody wants to save, whatever, a movie, a Netflix movie, and instead of saving it on a little chip, they save it in DNA?
So storage of data is a bunch of zeros and ones, right? It's the binary code. DNA is four letters: A, C, G, T.
And just to be super clear, when you say four letters, I mean, all DNA is made up of basically four types of molecules, and only four.
That's right. If you look at any DNA of any living organism, from the biggest to the smallest one, that DNA is made of four letters, A, C, G, and T, and the function of that organism is based on the order in which the As, Cs, Gs, and Ts are linked together.
Emily says, to use DNA to store a movie, or to store any kind of data, you let each of the four letters represent a binary pair. So, for example, you could say zero zero is A, zero one is C, one one is T, and one zero is G. Once you've done that, you can take any digital file.
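The two-bits-per-letter mapping described above can be sketched in a few lines of Python. This is a minimal illustration, assuming exactly the example pairing given in the conversation (00 is A, 01 is C, 11 is T, 10 is G); the function names `encode` and `decode` are hypothetical, and nothing here reflects Twist's actual encoding, which would also need error correction and other safeguards not discussed in the episode.

```python
# Example pairing from the conversation: 00 -> A, 01 -> C, 11 -> T, 10 -> G.
BITS_TO_BASE = {"00": "A", "01": "C", "11": "T", "10": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn raw bytes into a string of DNA letters, two bits per letter."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse the mapping: DNA letters back to the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

Each byte becomes four letters, so the letter string is always four times as long as the file in bytes. Any other one-to-one assignment of bit pairs to letters would work the same way.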
You can take that movie and, on a computer, convert the zeros and ones that represent the movie into a string of the letters A, C, T, G. That string of letters is your blueprint. And then the next step is kind of amazing, because we are so used to everything happening, you know, digitally, on the computer. The next step is a machine about the size of an SUV actually starts printing strands of DNA based on that digital blueprint.
What we're going to actually physically synthesize, we're going to make it from scratch, is a bunch of little pieces of DNA that have two to three hundred letters.
Okay, so is that the next step? It's still on the computer, but now the computer tells some DNA synthesis machine what molecules to synthesize: where the As go, where the Ts go, where the Cs go, where the Gs go.
Yeah, exactly. So at Twist, we built a DNA 3D printer, basically. And we have a piece of silicon. It's a silicon chip, and on our silicon chip, the first generation that we made had one million pieces of DNA. The current generation we're working on has two hundred fifty-six million. And so we have a silicon chip, and then we have a printer.
So I want to keep going with our sort of how-does-it-work narrative here. So we got the movie that was already ones and zeros. The ones and zeros became A, T, C, and G; those got split up into little packets.
Yeah.
The computer told the 3D printer, basically, what actual A, T, C, and G to print, physically, the actual chemicals, on a silicon chip, hundreds of millions of them.
And then we've got to deliver that A, C, G, T at the right location at the right time. And so we start from the silicon chip. There is no DNA on it. And then we come in and we print the first layer. So we put the first A, C, G, T on the surface, and then we come in again to put the second layer of A, C, G, T, and again the third.
We do that two to three hundred times, and so we are building DNA up to three hundred letters, and we do that millions of times on the silicon chip.
Okay, so now you've got a silicon chip with DNA on it. What happens next?
Yeah. So at that point the density is not that great. It's kind of like a hard drive. But then the beauty is, you can remove the DNA from the silicon chip.
I want to keep track of that DNA you just synthesized. Where does it go?
You just dry it down, and it's so tiny, what you've made. It's a speck of dust. You can't see it.
So now it goes into a tiny vessel.
It's called the DNA shell.
Yeah, it's a good name.
It's not ours. We buy it, and it's a piece of stainless steel and there's a glass coating inside. You put the DNA in it, you dry down the water, you seal it, and you put in helium, so there's no oxygen, no water. And in that DNA shell you can put hundreds of Google data centers, like every movie on Netflix. You could pack a lot of DNA in it. And then you seal it, and that is stable for thousands of years.
And how big is it?
So, you know, I live in Montana. Everybody has guns here, so it's the size of a small-caliber bullet.
That's nice. Okay. So I live in Brooklyn, where fewer people have guns, at least as far as I know. So would it be about the size of, like, a bean?
It would be the size of two black beans.
Okay. You can put every movie on Netflix in there.
That's right, and then you can store it at room temperature, in the sun. You know, it's extremely stable forever. And then when you want the data back...
So now it's fifty years later and I want to watch that movie again, because I forgot that it wasn't very good.
Well, or, let's say, geez, you know, something happened to the Netflix data center, and, you know, that movie is gone. We need to get it back.
That's the real practical use for this.
I would imagine it's that, like, at least for now, it's like a backup. It's like a super safe backup, even if all of the data centers get hacked or blown up or whatever. You have this vault.
Yeah, yeah, it's the vault, exactly. That's the ultimate vault. And the beauty is, with DNA, you can make a copy very easily.
Let's go back. So we have all of Netflix in a little cylinder the size of two beans. Well, now we want to get the data. How do we get the data off there?
So you crack open the shell and you put it in what's called a sequencer. So it's a machine that reads DNA. And the sequencer is not fast. It's twelve hours. So twelve hours later you'll get your data. And now it's still A, C, G, T in the computer, and you have to do the reverse decoding. You have to turn the A, C, G, T into zeros and ones, and that's the end of the story.
Right, and then you can watch the movie that you encoded today.
Exactly, exactly.
Sounds amazing: DNA, all of Netflix in a cylinder the size of two black beans. But DNA storage is not yet a thing that is really happening in the world at scale. In a minute, the problem Emily has to solve for that to happen, for DNA storage to become a real thing in the world.
Now, let's get back to the show. So it sounds amazing. I'm sold. The Internet in a shoebox. It sounds incredible. Why isn't it really a thing yet? Why aren't people doing it?
Yeah, it goes back to economics. You can do it today, but today it's too expensive. And it's too expensive because today we only pack a million oligos on the silicon chip. It works. We've done dozens of demonstrations, with, again, Netflix, Microsoft, University of Washington, with the Montreux Jazz Festival, with the United Nations, with the Olympic Games. Right, so it works, but it is expensive, and that's why at Twist we are pushing the technology further. We're packing more and more pieces of DNA on the same silicon chip.
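To see why the number of DNA pieces per chip matters so much, here is a rough back-of-the-envelope sketch in Python. It assumes two bits per letter (the example encoding from earlier in the conversation), 300 letters per piece (the upper figure quoted), and chip sizes of one million and 256 million pieces (the two generations mentioned). It ignores addressing, redundancy, and error correction, which a real system would need, so the results are upper bounds on raw capacity, not Twist's actual figures. The function names are hypothetical.

```python
BITS_PER_LETTER = 2  # assumption: the two-bits-per-letter example encoding


def raw_bytes_per_chip(oligos_per_chip: int, letters_per_oligo: int = 300) -> int:
    """Upper bound on raw bytes one chip run could hold, with no overhead."""
    return oligos_per_chip * letters_per_oligo * BITS_PER_LETTER // 8


def oligos_needed(num_bytes: int, letters_per_oligo: int = 300) -> int:
    """Number of DNA pieces needed for a file, rounding up."""
    letters = num_bytes * 8 // BITS_PER_LETTER
    return -(-letters // letters_per_oligo)  # integer ceiling division


def chips_needed(num_bytes: int, oligos_per_chip: int,
                 letters_per_oligo: int = 300) -> int:
    """Number of chip runs needed for a file, rounding up."""
    pieces = oligos_needed(num_bytes, letters_per_oligo)
    return -(-pieces // oligos_per_chip)


if __name__ == "__main__":
    one_gigabyte = 1_000_000_000
    for per_chip in (1_000_000, 256_000_000):
        print(per_chip,
              raw_bytes_per_chip(per_chip),
              chips_needed(one_gigabyte, per_chip))
```

Under these assumptions a one-million-piece chip holds at most about 75 MB of raw data, while a 256-million-piece chip holds about 19.2 GB, so a 1 GB file drops from 14 chip runs to one.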
Again, the state of the art used to be one hundred pieces of DNA. The first Twist product was a million pieces of DNA. Right now we're working on a chip where we have two hundred and fifty-six million.
So, okay, so you're packing more and more DNA on a chip to make DNA storage cheaper. Right? That's the fundamental problem you have to solve. How much cheaper have you made it so far? And how much cheaper does it have to get?
We've already done a thousand times less, and we're in need of another thousand times cost reduction, and that has to happen by packing more sequences. You do that by packing more pieces of DNA on the silicon chip.
And so it's weird. I mean, we're used to thinking about this kind of thing, I think, because of Moore's law, right? Because this thing has happened with computers for, whatever it is, fifty years now, that's exactly what you're talking about in DNA, right? Every two years you get twice as many transistors on a chip or whatever. But it's not really a law. It didn't happen naturally, right? It happened because, like, many, many, many really, really, really smart people kept coming up with clever ways of getting more transistors on a chip. Right? It doesn't have to keep getting better, getting more efficient. So how do you do that? Like, is there some way, without getting too technical, that we can talk about some of the problems you have to solve to keep doubling, to get a thousand times cheaper?
Yeah, it's a very difficult engineering problem. And I love that it's difficult, because if it was easy, every idiot would be doing it. And so it is really, really hard, and you have to be very good at semiconductors, at silicon engineering; you have to be extremely good at electrical engineering, at mechanical engineering, at chemical engineering, at computation.
What you have to do is build a team where everybody is the best at what they do in their own field, and have those people be extremely great team players, and then you have to demand performance. So in some ways you have to run this like a sports team. I always say, at Twist, we're not a family. Right? In a family, you tolerate bad behavior. We're a sports team. We hire the best athlete for each position, you demand performance, and you demand that they work as a team, and if you give them an amazing, tough problem, they'll find a way.
So you have this team. How do they actually solve the problem of, you know, building denser chips to make DNA storage cheaper?
Yeah, so first you have to design the silicon chip, where it's a design on a computer that costs millions of dollars just to do the design, right? And then you press the button and you say, get me that chip made. That's another set of millions of dollars. And then you get the chip in and you literally hook it up to a giant set of electronics, and you literally boot the chip like you boot your computer, and, you know, the chip doesn't work. It's like, oh, now I have to go back and either redesign it, which, you know, is money and time that you don't want to spend, or you have to debug it, see what happens. And when you finally get the chip to work, that's just a silicon chip. Now you have to bring the A, C, G, T on it.
Okay.
Yeah, you have to build the DNA, and that's very hard as well. And that's why we do one experiment, and we have to follow the engineering principle of design, build, test. You design an experiment, you build the DNA, you test it, you learn a little bit, you go back and you do it again and again, and every cycle of learning is time and money. And again, if you have the right team, that cycle of learning will be faster than the competition, will be cheaper than the competition, and that's how we win. It is extremely, extremely difficult, and again, thankfully it's difficult.
That means we have a shot at greatness.
So yes, the good news is it's super hard and nobody knows how to do it. So the goal ultimately is to be able to store things in DNA for a price that's comparable to what you could put it on a hard drive for now, and in that universe it would be cheaper in the long run in DNA. When do you think that'll happen?
So I know the answer, and I get that question from all our investors all the time. Unfortunately, I can't tell you, but it's very soon. And the chip we're working on now, that is the chip that we're going to go commercial with. And so the chip is designed, the chip is built, and now we're in the debugging phase, and when that is finished, then we'll be able to launch commercially.
And so does that mean in a year? Does that mean in five years?
So if it's not done in five years, I'll definitely get a lot of love letters from our investors. So it's definitely a lot less than five years.
In a minute, it's the lightning round, including Emily's advice for solving hard problems and what she actually does to solve hard problems.
Now let's get back to the show. The last thing is a lightning round, a bunch of questions, but we'll do them fast. What is one piece of advice you'd give to someone trying to solve a hard problem?
If at first you don't succeed, try again. Right? You have a choice. When you don't succeed, you can call mom, you can order a pizza, you can drink a beer. You try again.
Do you have a favorite gene?
I do not have a favorite gene. I love all the genes. They're like children, right?
I guess you're their children. We're their children. As a scientist, as a geneticist, do you feel like you understand something about the world or about people that most people don't really understand?
I don't think so. Yeah, I'm not very philosophical as a person. I'm probably somewhere on some spectrum. So I know how to make DNA, to write, you know, to sell, to do fundraising.
But those are probably the four things.
Do you think you'll ever leave Twist? You think there'll come a time when you want to go do something else?
I will be in the seat until I get fired by the board. So this is probably my last job.
I've read that you play the piano when you're thinking about work. Is there something in particular you've been enjoying playing lately? A favorite piece these days?
So I practice the piano every day. I don't know if I'm really playing it. Some people may call it butchering or murdering the piano, but I totally enjoy it. What I love about the piano is that, you know, the left hand is easy, the right hand is easy, but putting the two hands together, that's a total mind-bending experience. And you have to use one hundred percent of your conscious mind. And what I find is, if I have a hard problem at work and I can't figure it out, I go up to the piano. I'm using one hundred percent of my brain, and somehow, in the background somewhere, I think of a solution, and that is quite exciting. And yeah, all my tough problems, I never get them resolved sitting at my desk, working on my computer. I get them resolved either playing piano or taking the dogs out for a walk.
So maybe that's your real advice for how to solve a hard problem: don't think about the problem, or do something else.
Yeah, take your problem out of the building, take a hike.
Emily Leproust is the co-founder and CEO of Twist Bioscience. Today's show was produced by Edith Russlo, edited by Robert Smith, and engineered by Amanda K. Wong. You can reach us at problem at pushkin dot fm, or you can find me on Twitter at Jacob Goldstein. I'm Jacob Goldstein, and I'll be back next week with another episode of What's Your Problem.