Two hundred petabytes in a single gramme of DNA

  • Catalog aims to build a machine that encodes data a million times cheaper and 100,000 times faster
  • Data encoded in DNA can last for thousands of years



"WE'RE taking the tools of biology to store digital data within DNA molecules," explained Hyunjun Park (pic), Catalog's chief executive. "The intersection between silicon and carbon, between digital and biology... that's exactly where we are."

Park was describing his company's vision at Emtech Asia 2019, held in Singapore earlier this year. Digital data is conventionally stored on hard disks, but Park believes they will not be able to cope with the ever-growing production of, and demand for, data.

"In 2025, (the world is) on track to generate 160 zettabytes of data - that's about 10 times the number of stars in the observable universe," he said, noting that about 40% of it will be important for enterprise operations. "The problem with that is that we're only estimated to be able to store about 12.5% of it using conventional media."

"We've basically reached the physical limits of data centres and data storage warehouses," he concluded. As it is, data centres already generate more carbon emissions per year than the entire airline industry.

Another issue is longevity. "Hard drives need to be replaced about once every five years and magnetic tape can last for maybe thirty," Park said. In contrast, the DNA encoding technology he is touting can last for thousands of years.

The final point that Park made is that DNA packs more into a smaller space. DNA has a million times the information density of current solid-state drives or, as Park puts it, "Two hundred petabytes of information in a gramme of DNA."
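
To put that density in perspective, the sketch below works out how much DNA the 160 zettabytes Park forecasts for 2025 would weigh. It assumes decimal prefixes (a petabyte is 10^15 bytes, a zettabyte 10^21), a single copy of the data, and no overhead for redundancy or addressing, none of which the talk specified.

```python
# Decimal SI prefixes are an assumption; the talk did not specify them.
PETABYTE = 10**15   # bytes
ZETTABYTE = 10**21  # bytes

data_bytes = 160 * ZETTABYTE              # Park's 2025 forecast
density_bytes_per_gram = 200 * PETABYTE   # Park's quoted density

grams_needed = data_bytes // density_bytes_per_gram
print(f"{grams_needed:,} g, or {grams_needed // 1000} kg of DNA")
```

Under those assumptions the entire forecast would fit in roughly 800 kg of DNA.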

Using prefabricated DNA units to speed up the process

Park and Catalog are not the only players in the game. Just last month, Microsoft demonstrated an end-to-end automated system to store and retrieve data using manufactured DNA, on a system that looks like a garage lab experiment replete with bottles and tubes. When Microsoft first encoded 200 megabytes of data as DNA in 2016, the cost was about US$800,000 (RM3.3 million).

Park wants to do it more slickly and cheaply. The key is to progress from converting each bit of data into an equivalent DNA base pair (letter by letter, as it were) to assembling data from pre-built "words".
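
For readers who want to picture the letter-by-letter approach, here is a minimal sketch. It uses a common textbook mapping of two bits to each of the four bases (A, C, G, T); that mapping and the function name are assumptions for illustration, not the scheme used by Microsoft or Catalog.

```python
# Illustrative mapping only: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"

def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    strand = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)

print(encode(b"Hi"))  # CAGACGGC
```

Every new file needs its own strand synthesised base by base, which is the slow and costly step that pre-built "words" are meant to avoid.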

"The technique that we landed on for storing information in DNA involved synthesising a bunch of prefabricated units," explained Park, where bits are arranged in multidimensional matrices, and sets of molecules represent their locations in each matrix. "Then we have our building blocks in a bunch of different wells and the robot is able to pick from those different wells and combine the right building blocks."

"The next thing we add is an enzyme and this is a biological molecule that then takes the pieces of DNA and sticks them together," he continued. "It's like going from copying down a book using pen and paper to using a printing press with movable type pieces.

"Now you have all of your data stored in one tiny little tube," he concluded. "It looks like a droplet but it actually contains millions if not billions if not trillions of copies of your information."

To read the data back into digital format, you take the DNA out of storage and put it through a sequencing machine.
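
Because the tube holds many copies of the same information, a sequencer returns many reads that can be cross-checked. The sketch below shows that idea in its simplest form: a per-position majority vote followed by decoding. It assumes an illustrative two-bits-per-base mapping and reads of equal length with substitution errors only, whereas real sequencing pipelines also handle insertions and deletions.

```python
from collections import Counter

# Illustrative mapping only: A -> 00, C -> 01, G -> 10, T -> 11.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def consensus(reads):
    """Majority vote at each position across equal-length reads."""
    if not reads or any(len(r) != len(reads[0]) for r in reads):
        raise ValueError("need at least one read, all of equal length")
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*reads))

def decode(strand: str) -> bytes:
    """Turn every four bases back into one byte."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of four")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)

# The third read carries a single error (T instead of G).
reads = ["CAGACGGC", "CAGACGGC", "CATACGGC"]
print(decode(consensus(reads)))  # b'Hi'
```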

Purpose-built machines: A million times cheaper, a hundred thousand times faster

Park also disclosed that Catalog is working with a partner on a purpose-built machine, due some time this year.

"When this machine is completed we'll be able to encode information at a million times cheaper than what's been possible before and about a hundred thousand times faster."

The hope is that potential customers will at least find the idea appealing. "We'll be launching some pilot projects with pilot customers where we take their information and put it into DNA," said Park. "This is to build the software later around the entire stack, as well as to figure out if DNA-based origins can really fit into their existing ecosystem."

As for future applications, apart from the overall advantages of high-density, easily portable storage, Park also mentioned one ironic way DNA storage could be put to good use.

"Think about a future where you have all of your medical information in DNA molecules," he said, a future where DNA will be both the building blocks of life, and the storage of its meta data.

