Author Topic: Researchers Store Digital Data in Synthetic DNA (Read 2051 times)

Paul2 · « **on:** November 03, 2018, 04:37:51 PM »

Quote

Microsoft researchers have been exploring the role of biotechnology in IT via an end-to-end system that stores digital data in DNA.

Dr. Karin Strauss, a Senior Researcher at Microsoft Research in Redmond, explains how the unique properties of DNA could eventually enable us to store really big data in really small places for a really long time...

...The idea of storing data in DNA actually dates back to the 60s, right after the structure of DNA started to be better understood. The question was whether DNA would be able to carry any kind of information besides the information about life. So, one could use DNA for that except at that time there was no technology available to fabricate DNA or to read DNA… not at reasonable speeds.

According to Dr. Strauss, DNA has some exciting properties.

The first one is density. So instead of really storing the bits into devices that we have to manufacture, we are really looking at a molecule, storing data in a molecule itself. And so, a molecule can be a lot smaller than the devices we’re making.

"Just to give you an example, you could store the information, today stored in a datacenter, one exabyte of data, into a cubic inch of DNA. So that’s quite tiny. Durability is the next interesting property of DNA. And so, DNA, if preserved under the right conditions, can keep for a very long time, which is not necessarily possible with media that’s commercial today. DNA, if encapsulated in the right conditions, has been shown to survive thousands of years. And so, it’s very interesting from a data preservation perspective as well. And then, one other property is that, now that we know how to read DNA, we’ll always have the technology to read it," she said.

If we go back to the structure of DNA, it’s the chain of the different bases, A, T, C and G. And so, the way to think about them is, they’re a sequence of these bases. And the way to think about bits is, digital data is essentially a sequence of bits. And so, the science behind it starts with translating those bits into bases. So, a very simple way to think about that is A corresponds to zero, zero. C corresponds to zero, one. G to one, zero, and T to one, one. "And so, if we have a sequence of bits, we’ll take every two bits and translate it into a base. We use a lot more sophisticated methods, but that’s the first step," Dr. Strauss added.
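The simple mapping described above can be sketched in a few lines of Python. This is only the illustrative first step from the quote (A=00, C=01, G=10, T=11), not the more sophisticated encoding the team actually uses; the function name `bits_to_bases` is made up for this example.

```python
# Illustrative two-bits-per-base table taken from the quote above.
# Real DNA storage systems use more sophisticated codes.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def bits_to_bases(bits: str) -> str:
    """Translate a string of '0'/'1' characters into a base sequence."""
    if set(bits) - {"0", "1"}:
        raise ValueError("input must contain only '0' and '1'")
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even (two bits per base)")
    # Walk the bit string two characters at a time and look up each pair.
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


print(bits_to_bases("00011011"))  # ACGT
```

Every pair of bits becomes exactly one base, so a sequence of n bits turns into n/2 bases under this scheme.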

So, once we've got the binary code translated into DNA code, there is a process to manufacture the DNA and there’s also a process where multiple chemicals are flowed and the DNA sort of grows.

We know which sequences need to be grown and those sequences are grown from a surface. "Once we grow the DNA, we’ll remove it from where it was grown, and we’ll encapsulate it," Dr. Strauss said.

The DNA can be encapsulated in glass (it’s actually silicon dioxide) using a dedicated type of chemistry. Researchers have developed nanoparticles that the DNA gets attached to, and then a layer of glass is grown around them. That keeps the DNA away from water and UV light, both of which degrade it, and protects it from degrading when the temperature goes higher.

But how do we access the data stored in DNA?

Dr. Strauss said that data is stored in a certain location, organized in a spatial way, so there’s some way to retrieve the actual encapsulated molecules. The glass that was added for stability has to be removed and the DNA extracted. That is the first part of a random-access process, which is hierarchical.

"First, you physically find the smaller set of DNA molecules you are interested in, but then, within that, there are many molecules that may belong to different movies that you’ve stored, and you’re just interested in one movie, you don’t want to read the whole collection. Right? And reading the whole collection would actually be wasteful. And so, we would like the ability to further select that particular movie you want to watch. And it turns out that there’s a process to do that. We do it chemically. And so, it’s just another reaction that comes from nature, actually, and is repurposed for this goal, for this purpose," according to Dr. Strauss. The chemical process is borrowed from the biotech industry. It’s a pretty standard process called Polymerase Chain Reaction and it’s the process that copies DNA.
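As a rough mental model of the selection step, here is a toy Python sketch. It assumes each strand starts with a short file-specific address tag (a stand-in for a PCR primer site, which the article does not spell out) and that every matching strand doubles perfectly on each cycle. Real PCR uses primer pairs and is not perfectly efficient, so the function `pcr_select`, the tags and the strands below are all hypothetical.

```python
from collections import Counter


def pcr_select(pool, primer, cycles=10):
    """Toy model: on each cycle, only strands starting with `primer` are copied."""
    counts = Counter(pool)
    for _ in range(cycles):
        for strand in list(counts):
            if strand.startswith(primer):
                counts[strand] *= 2  # idealized doubling per cycle
    return counts


# Hypothetical pool: two strands tagged "ACGT" (movie 1), one tagged "TTGA" (movie 2).
pool = ["ACGTCGGACGCC", "ACGTCGTACGTA", "TTGAAACCGGTT"]
counts = pcr_select(pool, primer="ACGT", cycles=10)

for strand, n in counts.items():
    print(strand, n)
```

After ten cycles the tagged strands outnumber the untagged one by about a thousand to one, so reading a sample of the pool mostly yields the movie that was asked for.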

In her research, Dr. Strauss made headlines when she managed to store 200 megabytes on strands of DNA. The challenges in moving further include raising throughput and lowering costs.

"DNA manufacturing today is still quite costly. But for both of these challenges, they sort of go hand in hand: if you get the speed up, you also get the cost down. We see no fundamental, physical reason why you couldn’t really scale it to the level of being acceptable or being suitable for DNA data storage," Dr. Strauss said.

Algorithms also play a big role here, and a team of coding theorists at Microsoft Research is working on this problem and on the project itself. They developed algorithms that greatly reduced the effort needed to recover the data from DNA.

"One of the big contributions there was to encode the data in a way that, once we read it on the way out, we need to process minimal amounts of information to really recover the data," Dr. Strauss explained.

"Success would look like everyone in the world has access to DNA data storage. And so, really at Microsoft, our mission is to empower every person and organization to achieve more. With DNA, we would empower every person and organization to store more!" she added.

https://cdrinfo.com/d7/content/researchers-store-digital-data-synthetic-dna

pretty cool i suppose. storing one exabyte in one cubic inch of DNA, that is a lot of data being stored in such a small space. i think after terabyte is petabyte, and after petabyte is exabyte. wow. Another interesting thing is that data stored in synthetic DNA can last for thousands of years, which is really cool too.

Titan · « **Reply #1 on:** November 04, 2018, 08:22:03 AM »

Can you make new animals this way? I remember reading at one point that they are storing the DNA data of animals so when the technology improves, they can take an extinct species and remake it and introduce it into the wild.

Paul2 · « **Reply #2 on:** November 04, 2018, 09:17:26 AM »

i don't know. never heard of it until you mention it, but that does sound cool and scary at the same time that they can do that.

Titan · « **Reply #3 on:** November 11, 2018, 08:30:40 AM »

They can't do it yet. But in the future I'm sure technology can do it. But I have extremely mixed feelings about it. There are a lot of ethical concerns. The only way I'd support reintroducing an extinct species is if it was a species whose extinction our species is responsible for, like the Passenger Pigeon or the Dodo bird (two best examples). Otherwise, nature selected it for extinction for a reason. It's not our place to bring it back.

Paul2 · « **Reply #4 on:** November 11, 2018, 11:23:43 AM »

i see. it still sounds scary that in the future they can bring extinct species back to life, doesn't matter if the extinction was caused by us or by natural selection.

Paul2 · « **Reply #5 on:** March 29, 2019, 01:46:17 AM »

Quote

Researchers Manufacture DNA to Store Data

Researchers from Microsoft and the University of Washington have demonstrated a fully automated system to store and retrieve data in manufactured DNA, a key step in moving the technology out of the research lab and into commercial datacenters.

In a proof-of-concept test, the team successfully encoded the word “hello” in snippets of fabricated DNA and converted it back to digital data using a fully automated end-to-end system.

DNA can store digital information in a space that is orders of magnitude smaller than datacenters use today. It’s one promising solution for storing the exploding amount of data the world generates each day.

Microsoft is exploring ways to close a looming gap between the amount of data we are producing that needs to be preserved and our capacity to store it. That includes developing algorithms and molecular computing technologies to encode and retrieve data in fabricated DNA, which could fit all the information currently stored in a warehouse-sized datacenter into a space roughly the size of a few board game dice.

“Our ultimate goal is to put a system into production that, to the end user, looks very much like any other cloud storage service — bits are sent to a datacenter and stored there and then they just appear when the customer wants them,” said Microsoft principal researcher Karin Strauss. “To do that, we needed to prove that this is practical from an automation perspective.”

Information is stored in synthetic DNA molecules created in a lab, and can be encrypted before it is sent to the system. While sophisticated machines such as synthesizers and sequencers already perform key parts of the process, many of the intermediate steps until now have required manual labor in the research lab. But that wouldn’t be viable in a commercial setting, said Chris Takahashi, senior research scientist at the UW’s Paul G. Allen School of Computer Science & Engineering.

“You can’t have a bunch of people running around a datacenter with pipettes — it’s too prone to human error, it’s too costly and the footprint would be too large,” Takahashi said.

For the technique to make sense as a commercial storage solution, costs need to decrease for both synthesizing DNA — essentially custom building strands with meaningful sequences — and the sequencing process that extracts the stored information. Trends are moving rapidly in that direction, researchers say.

Automation is another key piece of that puzzle, as it would enable storage at a commercial scale and make it more affordable, Microsoft researchers say.

Under the right conditions, DNA can last much longer than current archival storage technologies that degrade in a matter of decades.

The automated DNA data storage system uses software developed by the Microsoft and UW team that converts the ones and zeros of digital data into the As, Ts, Cs and Gs that make up the building blocks of DNA. Then it uses inexpensive, largely off-the-shelf lab equipment to flow the necessary liquids and chemicals into a synthesizer that builds manufactured snippets of DNA and then pushes them into a storage vessel.

When the system needs to retrieve the information, it adds other chemicals to properly prepare the DNA and uses microfluidic pumps to push the liquids into a machine that “reads” the DNA sequences and converts them back to information that a computer can understand. The goal of the project was not to prove how fast or inexpensively the system could work, researchers say, but simply to demonstrate that automation is possible.

One immediate benefit of having an automated DNA storage system is that it frees researchers up to probe deeper questions, instead of spending time searching for bottles of reagents or repetitively squeezing drops of liquids into test tubes.

“Having an automated system to do the repetitive work allows those of us working in the lab to take a higher view and begin to assemble new strategies — to essentially innovate much faster,” said Microsoft researcher Bichlien Nguyen.

The team from the Molecular Information Systems Lab has already demonstrated that it can store cat photographs, great literary works, pop videos and archival recordings in DNA, and retrieve those files without errors in a research setting. To date they’ve been able to store 1 gigabyte of data in DNA, besting their previous world record of 200 MB.

To store data in DNA, algorithms convert the 1s and 0s in digital data to ACTG sequences in DNA. Microsoft and University of Washington researchers stored and retrieved the word “hello” using the first fully automated system for DNA storage.
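To make the “hello” round trip concrete, here is a minimal Python sketch. It assumes the simple two-bits-per-base mapping from the first article in this thread (A=00, C=01, G=10, T=11) and plain UTF-8 text. The real system's encoding is not described in the article and is certainly more involved, and the function names `text_to_dna` and `dna_to_text` are invented for this example.

```python
BASES = "ACGT"  # position in this string = two-bit value: A=00, C=01, G=10, T=11


def text_to_dna(text: str) -> str:
    """Encode UTF-8 text as bases, four bases per byte (assumed simple mapping)."""
    out = []
    for byte in text.encode("utf-8"):
        # Take the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_text(dna: str) -> str:
    """Decode a base sequence produced by text_to_dna back into text."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4 bases")
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)  # raises ValueError on non-ACGT
        data.append(byte)
    return data.decode("utf-8")


strand = text_to_dna("hello")
print(strand)               # CGGACGCCCGTACGTACGTT
print(dna_to_text(strand))  # hello
```

Under this assumed mapping the five letters of “hello” come out as 20 bases, four per byte.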

The researchers have also developed techniques to perform meaningful computation — like searching for and retrieving only images that contain an apple or a green bicycle — using the molecules themselves and without having to convert the files back into a digital format.

“We are definitely seeing a new kind of computer system being born here where you are using molecules to store data and electronics for control and processing. Putting them together holds some really interesting possibilities for the future,” said UW Allen School professor Luis Ceze.

Unlike silicon-based computing systems, DNA-based storage and computing systems have to use liquids to move molecules around. But fluids are inherently different than electrons and require entirely new engineering solutions.

The UW team, in collaboration with Microsoft, is also developing a programmable system that automates lab experiments by harnessing the properties of electricity and water to move droplets around on a grid of electrodes. The full stack of software and hardware, nicknamed “Puddle” and “PurpleDrop,” can mix, separate, heat or cool different liquids and run lab protocols.

The goal is to automate lab experiments that are currently being done by hand or by expensive liquid handling robots — but for a fraction of the cost.

Next steps for the MISL team include integrating the simple end-to-end automated system with technologies such as PurpleDrop and those that enable searching with DNA molecules. The researchers specifically designed the automated system to be modular, allowing it to evolve as new technologies emerge for synthesizing, sequencing or working with DNA.

“What’s great about this system is that if we wanted to replace one of the parts with something new or better or faster, we can just plug that in,” Nguyen said. “It gives us a lot of flexibility for the future.”

https://cdrinfo.com/d7/content/researchers-manufacture-dna-store-data

it's nice that now they can store up to 1 Gbyte of data in synthetic DNA, up from 200 Mbytes in their previous record. Now they are using something like an automated system that moves liquids around to read the dna data and decode it to 1s and 0s or something like that, instead of it being done manually by humans or robots. it's way beyond my head but i am guessing that is what they are saying. cool.
