Published Thursday in the journal Science, the experiment aimed to demonstrate the viability of storing large amounts of data on DNA molecules. Since the data is recorded on individual nucleobase pairs in the DNA strand (those adenine-thymine and guanine-cytosine pairs you may be straining to remember from high school biology), DNA can actually store more information per cubic millimeter than flash memory or even some experimental storage technologies, IEEE Spectrum reports.
The difficulty is in the translation, both to DNA and back again (summarized in the diagram below). The researchers started with the book’s content, which included the text as well as 11 images and a JavaScript program, and converted it to binary code. Then they assigned every 0 and 1 a nucleobase.
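To make the translation concrete, here is a minimal Python sketch of the round trip from text to bases and back. The specific mapping (0 written as A, 1 written as G, with C and T also accepted as 0 and 1 when reading) is an assumption for illustration, as are all the function names; the article only says that each bit was assigned a nucleobase.

```python
# Assumed mapping, not taken from the article: one bit per base.
# Writing uses A for 0 and G for 1; reading also accepts C as 0 and T as 1.
BIT_TO_BASE = {"0": "A", "1": "G"}
BASE_TO_BIT = {"A": "0", "C": "0", "G": "1", "T": "1"}


def text_to_bits(text: str) -> str:
    """Convert text to a string of 0s and 1s (8 bits per UTF-8 byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Convert a string of 0s and 1s back to text."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def encode(text: str) -> str:
    """Text -> binary -> DNA bases."""
    return "".join(BIT_TO_BASE[bit] for bit in text_to_bits(text))


def decode(dna: str) -> str:
    """DNA bases -> binary -> text."""
    return bits_to_text("".join(BASE_TO_BIT[base] for base in dna))


if __name__ == "__main__":
    strand = encode("Hi")
    print(strand)          # AGAAGAAAAGGAGAAG
    print(decode(strand))  # Hi
```

At one bit per base, every character of plain text costs eight bases, which is why even a single book runs to millions of bases.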
After that came the heavy lifting: synthesizing the DNA strand, which would be 5.27 million bases long. They got there in baby steps, splitting the sequence into segments that were each 96 bases long. When they were done, the book was a tiny speck of synthesized DNA that had about one-millionth the weight of a grain of sand. That’s got to look pretty attractive to anyone with a Big Data problem.
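The splitting step can be sketched in a few lines of Python. The segment length of 96 and the total of 5.27 million bases come from the figures above; tagging each segment with an index so the pieces can be put back in order is a hypothetical bookkeeping choice for this sketch, not a description of how the researchers did it, and the function names are made up for illustration.

```python
import math

SEGMENT_LENGTH = 96  # bases per segment, as reported in the article


def split_into_segments(dna: str, size: int = SEGMENT_LENGTH):
    """Cut a base sequence into (index, segment) pairs of at most `size` bases."""
    return [(i // size, dna[i:i + size]) for i in range(0, len(dna), size)]


def reassemble(segments):
    """Sort segments by index and join them back into one sequence."""
    return "".join(seq for _, seq in sorted(segments))


def segments_needed(total_bases: int, size: int = SEGMENT_LENGTH) -> int:
    """Number of segments required to cover `total_bases` bases."""
    return math.ceil(total_bases / size)


if __name__ == "__main__":
    # 5.27 million is a rounded figure, so this count is approximate.
    print(segments_needed(5_270_000))  # 54896
```

In other words, the book works out to roughly 55,000 short segments, and reading it back means sequencing them and stitching them together in the right order.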