Efficient information coding in living organisms

DNA can clearly serve as a storage medium: each nucleotide carries up to two bits of information, and DNA can store vast amounts of data for very long periods with high reliability; over time, humans remain human just as cats remain cats. However, it is also a very complex storage medium. Evolutionary processes introduce various data corruption mechanisms and impose complex constraints on the stored information. Even a slight change in the nucleotide sequence of the DNA may profoundly affect the whole organism’s behavior: it may decrease the growth rate, increase the mutation rate, or even cause the death of the organism. All of these processes must be considered.
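The two-bits-per-nucleotide figure follows from the four-letter alphabet (log2 of 4 is 2). As a minimal sketch only, assuming a hypothetical fixed mapping A=00, C=01, G=10, T=11 that is not taken from the framework described here, raw bytes can be packed into a nucleotide string as follows. This naive mapping ignores every biological constraint and corruption mechanism mentioned above; it only illustrates the upper bound on storage density.

```python
# Hypothetical mapping for illustration: index in BASES is the 2-bit value
# (A=00, C=01, G=10, T=11). Any fixed assignment of the four bases would do.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_dna; expects a length that is a multiple of four."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # BASES.index raises ValueError on any character outside ACGT.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    encoded = bytes_to_dna(b"Hi")
    print(encoded)                # CAGACGGC
    print(dna_to_bytes(encoded))  # b'Hi'
```

A single substituted base in this scheme silently changes two bits of the payload, which is one reason a practical scheme needs error correction and sequence constraints on top of the raw mapping.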

Encoding information in the genome of living organisms is a fundamental problem with various applications in synthetic biology, such as biosensors, biological treatments, and very long-term storage. In particular, any engineered cellular system based on a living organism whose objective is to monitor the human body (e.g., oncolytic viruses and bacteria) or any other environment should be able to store information efficiently and reliably. Today hundreds of projects are developing such systems, and many of them are expected to become commercial in the near future.

Our solution includes a framework for efficient information coding in living organisms, which is expected to be more robust than other approaches. The design goals of the coding scheme are efficiency (in terms of storage density), flexibility (in terms of scalability and minimal dependence on the environment), and decoder simplicity (in terms of the minimal side information needed). The approach combines tools from disciplines such as information theory, coding theory, molecular evolution, and synthetic biology.

US Provisional Patent.
