Plugging Into DNA for Digital Data Storage
DNA is the storage system for the biological code of the human genome. Now, engineers are tapping into this natural code to store digital data. For instance, Georg Seelig, an engineer from the University of Washington, and his team were able to store and retrieve digitized photos on strands of DNA. Seelig discusses how to translate binary code into the four nucleotide bases.
Georg Seelig is an associate professor in Electrical Engineering and Computer Science & Engineering at the University of Washington in Seattle, Washington.
IRA FLATOW: Speaking of DNA, DNA is the storage locker for our genetic code. And why not convert the ones and zeroes of a digital code into the AGTCs of the genetic code and store that digital information in our DNA? A team of engineers did just that. The engineers were able to turn digitized photos and videos into DNA code. And then they were able to retrieve the info out of DNA, which is very important if you’re going to use it as a storage medium, with 100% accuracy. Very important step.
So why not create a hard drive out of DNA? My next guest is here to tell us how to do that. Georg Seelig is part of that team. And he’s an Associate Professor of Electrical Engineering and Computer Science at the University of Washington. He joins us from KUOW. Welcome.
GEORG SEELIG: Welcome, it’s good to be here.
IRA FLATOW: So the first question is why do you do this? We have thumb drives that can store terabytes of information. Is DNA a great storage medium?
GEORG SEELIG: I think it is. There’s three main reasons why that’s true. The first one is it’s really small. It’s really dense. So you can just cram a lot of information into a very small space. The second is that it lasts for a long time. I mean, we can retrieve DNA from very old, 100,000 year old, fossils. And the third one is that DNA, as you’ve just talked about, is our genetic material. And so we’ll always be interested in reading DNA. So it’s not like a floppy disk that 10 years from now nobody will be able to look at.
IRA FLATOW: In your experimentation, tell us what you do. What kind of photos? How were you able to digitize the DNA?
GEORG SEELIG: OK, yes, so the process is actually quite simple. So anytime you store an image or video or something on a computer, you’ve already digitized it into zeroes and ones. And then what we need to do is translate these zeroes and ones to A’s and C’s and G’s and T’s. So you could do that by just saying, an A is 0-0. T is 0-1. G is 1-0. And C is 1-1.
And so you just translate zeros and ones to letters. Then you essentially ask a DNA synthesis company to make strands of DNA for you that have the right series of letters. You get the DNA in the mail. Actually, you store it in a fridge as of now, but in different settings in the future. And then you sequence it back to read the information in the test tube.
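The two-bits-per-letter translation described above can be sketched in a few lines of Python. This is only an illustration of the simple mapping quoted in the interview (A is 00, T is 01, G is 10, C is 11); the function names `encode` and `decode` are hypothetical, and practical schemes typically add error correction and avoid long runs of the same base, none of which is modeled here.

```python
# Mapping quoted in the interview: A=00, T=01, G=10, C=11.
BITS_TO_BASE = {"00": "A", "01": "T", "10": "G", "11": "C"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # TAGATGGT
print(decode(strand))  # b'Hi'
```

Each byte becomes four letters, so a file of n bytes needs 4n bases before any redundancy or addressing is added.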
IRA FLATOW: Wow, and it will stay for how long?
GEORG SEELIG: It really depends on how you treat it. DNA can be very, very stable, like thousands of years if it’s kept away from water. If it’s in water, it won’t last very long.
IRA FLATOW: I’m Ira Flatow. This is Science Friday from PRI, Public Radio International, talking with Georg Seelig at the University of Washington in Seattle. How do you then get the information back out? What kind of mechanisms can you use to retrieve it from the DNA?
GEORG SEELIG: OK, that’s a 2-step process. First, you use a technique called PCR, which allows you to essentially pick out, if you have in your pool of DNA many different files and you just want to read one. You can use PCR to do random access, and pick out just the specific file you like and amplify that. And then you take what you amplify and you put it on the DNA sequencer, which is exactly the same type of device that Dr. Venter is using. And so then that allows you to read the information back.
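The two-step retrieval described above, select one file and then read it, can be mimicked with a toy model. The sketch assumes that every strand belonging to a file starts with a file-specific primer sequence; the primer strings, the function names `pcr_select` and `read_file`, and the doubling-per-cycle rule are all illustrative assumptions, not the team's actual protocol. Real PCR works chemically with primer pairs, and real sequencing produces noisy reads.

```python
# Hypothetical primers that tag which file a strand belongs to.
PRIMER_PHOTO = "ACGTAC"
PRIMER_VIDEO = "TTGCAA"


def pcr_select(pool: list[str], primer: str, cycles: int = 3) -> list[str]:
    """Toy stand-in for PCR: keep strands that start with the primer
    and double the number of copies on every cycle."""
    matches = [strand for strand in pool if strand.startswith(primer)]
    return matches * (2 ** cycles)


def read_file(pool: list[str], primer: str) -> list[str]:
    """Toy stand-in for sequencing: strip the primer from the amplified
    strands and collapse the duplicate copies, keeping first-seen order."""
    amplified = pcr_select(pool, primer)
    payloads = dict.fromkeys(strand[len(primer):] for strand in amplified)
    return list(payloads)


# A mixed pool holding strands from two different files.
pool = [
    PRIMER_PHOTO + "TAGA",
    PRIMER_VIDEO + "CCGG",
    PRIMER_PHOTO + "TGGT",
]

print(read_file(pool, PRIMER_PHOTO))  # ['TAGA', 'TGGT']
```

The point of the model is that the pool never has to be read in full: only strands carrying the chosen primer are copied and passed on to the reading step.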
IRA FLATOW: And give me an idea how much data you can store in DNA.
GEORG SEELIG: So in our paper, we just stored about 150 kilobytes, which is not a lot, obviously. I think it was four images. But I think that’s really changing very rapidly. I think just like 10 years ago, 20 years ago, people were able to synthesize maybe a few strands of DNA corresponding to a few characters that you could store. And I think in just a few years, we’ll be able to store orders of magnitude more data.
IRA FLATOW: Give me an idea what you mean. How many pictures, photos, videos? Is it terabytes we’re talking about?
GEORG SEELIG: Oh, I think eventually for sure. I mean, I think eventually exabytes are realistic. So I think we can store, not next year, but maybe 10 years from now or so, I think we can store the data center sized amounts of information in DNA.
IRA FLATOW: A whole data center size, how much space would that take?
GEORG SEELIG: An exabyte, if you just look at the DNA alone with a little bit of packaging, you could argue that it would be like a sugar cube. I think realistically, you’ll have to build infrastructure around that sugar cube. So the effective density will be less than that. But it could be really small. It’s definitely denser than any storage material that’s out there, including magnetic tape, which is currently the gold standard.
IRA FLATOW: So you could take a whole data center and put it in a sugar cube sized thing.
GEORG SEELIG: I think eventually that will happen. I think that the potential is there. The key thing that needs to happen to get there is that you need to make writing DNA, so DNA synthesis, much, much cheaper, like a million times cheaper. And there’s a lot of questions as to how to do that, but I think it can be done. And if we do it, then this becomes realistic.
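The sugar cube estimate for an exabyte can be checked with rough arithmetic. The sketch below assumes two bits per base, as in the simple mapping from earlier in the interview, and an average mass of about 330 grams per mole of single-stranded DNA nucleotide; that figure is a common approximation and an assumption here, not a number from the interview. It counts a single copy of the data with no redundancy, addressing, or packaging.

```python
AVOGADRO = 6.022e23             # molecules per mole
BITS_PER_BASE = 2               # simple mapping: one base holds two bits
GRAMS_PER_MOLE_OF_BASES = 330.0 # assumption: approximate average for ssDNA

EXABYTE = 1e18                  # bytes


def dna_mass_grams(num_bytes: float) -> float:
    """Mass of a single copy of the data written as DNA, in grams."""
    bases = num_bytes * 8 / BITS_PER_BASE
    return bases / AVOGADRO * GRAMS_PER_MOLE_OF_BASES


mass = dna_mass_grams(EXABYTE)
print(f"{mass * 1000:.2f} mg")  # 2.19 mg
```

Under these assumptions a single bare copy of an exabyte comes out to a few milligrams. Many copies of each strand, plus the packaging and surrounding infrastructure mentioned above, would raise that figure, which is consistent with the caveat that the effective density will be lower.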
IRA FLATOW: And you’re working with Microsoft on this, are you not?
GEORG SEELIG: Exactly. I don’t actually work– yeah.
IRA FLATOW: They’ve got so much business in the cloud now, they must be interested in data storage.
GEORG SEELIG: Exactly. So we’re actually a team. My expertise is on the DNA side. I’m actually a synthetic biologist by training. And then I’m working with people at Microsoft, Karin Strauss, Doug Carmean, and in my own department in computer science, with Luis Ceze, and they’re computer architects. And really, what they are thinking about is how to improve storage. So I think there’s a real interest commercially in making storage cheaper and denser.
IRA FLATOW: Well, Georg Seelig, thank you very much for taking the time to be with us today.
GEORG SEELIG: It’s my pleasure.
IRA FLATOW: Associate Professor of Electrical Engineering and Computer Science at the University of Washington in Seattle.