Abstract: While there has been tremendous progress in sequencing and editing DNA, synthesizing (writing) DNA from scratch has remained terribly slow and expensive. The inability to write DNA faster and more cheaply has been a major bottleneck in synbio applications. Chemical DNA synthesis has been around for four decades, but it is slow, expensive, limited in capacity, laborious and environmentally unsustainable. The enzymatic approach to DNA synthesis has the potential to circumvent the limitations of chemical synthesis and accelerate synthetic biology.
DNA is fundamental to both understanding evolution and shaping the future of biological applications. At the most basic level, all living organisms share the same genetic material, deoxyribonucleic acid (DNA). On one hand, the genetic material reflects the shared ancestry of life, and on the other, it’s what makes each individual unique.
While Charles Darwin came up with the original theory of evolution, it was Gregor Mendel, in the 1860s, who first suggested that characteristics are passed down from generation to generation. Then, in 1869, Friedrich Miescher isolated a molecule from cell nuclei that he called “nuclein”, which later became known as DNA.
It took the better part of a century before James Watson, Francis Crick and Rosalind Franklin discovered the double-helix structure of DNA, fundamentally revolutionizing biology. DNA is a polymer made of monomers called nucleotides, and there are four nucleotide bases: A (adenine), C (cytosine), G (guanine), and T (thymine).
Towards the end of the 20th century, scientists embarked on one of the greatest scientific undertakings, the Human Genome Project, which aimed to sequence and map the complete set of human DNA, aka the genome. The 13-year project, which began in 1990 and culminated in 2003, helped researchers conclude that the human genome contains about 3 billion base pairs, roughly 20,000 genes, and 23 pairs of chromosomes.
Credits: Microbe Notes
The process of reading DNA and determining the order (sequence) of nucleotide bases (A, G, C, T) is called DNA sequencing. The order is then reported as a text string, called a read.
Credits: Labster Theory
These bases provide the underlying genetic basis (the genotype) for telling a cell what to do, where to go and what kind of cell to become (the phenotype) in order for an organism to grow, develop, and reproduce. The instructions are read 3 nucleotide bases at a time which correspond to specific amino acids, the building blocks of protein. The DNA sequence that encapsulates the instructions to make a protein is called a gene.
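To make the "three bases at a time" idea concrete, here is a minimal Python sketch that translates a DNA read into amino acids. It assumes the standard genetic code (NCBI translation table 1) and a sequence that starts in frame; the names `CODON_TABLE` and `translate` are illustrative, not taken from any particular library.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in TCAG order.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (written 5'->3') into amino acids.

    Reading stops at the first stop codon; trailing bases that do not
    fill a whole codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# ATG -> M (methionine), GCC -> A (alanine), AAG -> K (lysine), TAA -> stop
print(translate("ATGGCCAAGTAA"))
```

Each letter in the result is the one-letter code of an amino acid, so the twelve-base read above encodes a three-residue peptide followed by a stop signal.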
Through DNA sequencing, diseases and medical conditions can be interpreted and understood. For example, harmful gene mutations can have adverse effects on the health and well-being of an individual, and inherited disorders are passed on through defective DNA. All of these can be detected through DNA sequencing so that the appropriate measures can be taken.
The cost of sequencing a human genome has dropped drastically since the Human Genome Project, from $2.7 billion to a few hundred dollars, and may continue to drop to $100 or less. There are different approaches to sequencing DNA. Some DNA sequencers read the DNA by analyzing light signals originating from fluorochromes attached to nucleotides.
A few key definitions before we dive further:
Each nucleotide is made up of three parts:
i. a nitrogen-containing ring structure called a nitrogenous base,
ii. a five-carbon sugar, and
iii. at least one phosphate group.
Credits: Khan Academy/OpenStax College
Enzyme: A substance that acts as a catalyst in living organisms. Most enzymes are proteins.
Polymerase: An enzyme that catalyzes the successive addition of nucleotides to a growing nucleic acid strand, starting from a primer
dsDNA is the double-stranded DNA whereas ssDNA is the single-stranded DNA
Primer: a short, single-stranded DNA sequence used in the polymerase chain reaction (PCR) technique.
DNA sequence 5' to 3' directionality: The carbon atoms of a nucleotide’s sugar molecule are numbered 1′, 2′, 3′, 4′, and 5′ (1′ is read as “one prime”). There is a particular directionality to how DNA is read. The phosphate group is attached at the 5' end, the beginning of the chain, and the hydroxyl group is attached at the 3' end, the end of the chain.
Coupling efficiency: A way of measuring how efficiently the DNA synthesizer is adding new bases to the growing DNA chain
Oligonucleotides: Short DNA or RNA molecules
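The directionality and dsDNA definitions above can be tied together with a short sketch. It assumes the usual base-pairing rules (A with T, G with C) and that sequences are written 5' to 3'; the function name `reverse_complement` is just an illustrative choice.

```python
# Base-pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the partner strand of `seq`, also written 5'->3'.

    The two strands of dsDNA run in opposite directions, so the partner
    is the complement of every base, read backwards.
    """
    return seq.upper().translate(COMPLEMENT)[::-1]


top_strand = "ATGC"                          # 5'-ATGC-3'
bottom_strand = reverse_complement(top_strand)
print(top_strand, bottom_strand)
```

Applying the function twice gives back the original strand, which is a quick sanity check that the two strands of dsDNA carry the same information.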
One of the earliest and most successful sequencing techniques is Sanger sequencing. In this process, fragments of different lengths are copied from the target DNA. Fluorescent “chain terminator” nucleotides at the ends of the fragments allow the sequence to be determined. The Sanger method allows sequencing of only a single DNA fragment at a time. Since the 2000s, different Next Generation Sequencing (NGS) methods have been developed that allow massively parallel sequencing of millions of fragments per run. Such a high-throughput process allows thousands of genes to be sequenced at once.
NGS technology leverages amplification of fragments and sequencing by synthesis (SBS) chemistry to enable rapid sequencing with high accuracy. The process identifies DNA bases as they are incorporated into a growing nucleic acid chain. Recent third-generation DNA sequencers such as SMRT and Oxford Nanopore read single DNA molecules in real time. (To learn more about the different companies in the NGS space, check this amazing blog.)
We discussed how biology is the best way we’ve got to rearrange atoms and transform the hardware world. From recombinant proteins to plastic eaters, the field of synthetic biology promises to be able to custom-design organisms using synthetic DNA as a building block to assemble entire genes and even synthetic genomes. Beyond synbio, DNA synthesis could also accelerate DNA-based data storage services, production of nucleic-acid vaccines and applications for CRISPR gene editing. In addition, DNA synthesis is useful for an astonishing variety of research use cases in the agricultural and industrial sectors, and the potential is huge if DNA synthesis gets even cheaper.
But we’re still not good at engineering synthetic DNA.
While there has been tremendous progress in sequencing and editing DNA, synthesizing (writing) DNA from scratch has been terribly slow and expensive. DNA synthesis is the process, natural or artificial, of linking together deoxyribonucleotides (carrying the bases adenine, thymine, cytosine, and guanine) to form DNA.
The first process was developed in 1981 and hasn’t changed much in about four decades. Known as phosphoramidite chemistry, it was for a long time the only commercially available synthesis method, and it allows oligonucleotides of up to about 200 bases to be synthesized, which isn’t even as long as many genes.
In chemical synthesis, a computer controls the addition of nucleotides (A, G, C, or T) one drop at a time through the following process:
i. Deblocking: The dimethoxytrityl protecting group (DMT; no, not the psychedelic) is removed from the initial nucleotide, which produces a free 5’ hydroxyl group to react with the next nucleotide.
ii. Coupling: A new nucleotide, activated at its 3′ phosphoramidite group, is linked (coupled) to the free 5′ hydroxyl group of the existing nucleotide
(Remember the directionality of DNA? In nature, a strand of DNA or RNA grows at its 3′ end, with the 5′ phosphate of an incoming nucleotide attaching to the hydroxyl group at the 3′ end of the chain. Chemical synthesis runs the other way, extending the chain from its 5′ end.)
iii. Capping: Any chains that failed to couple in the previous step are capped so that they cannot take part in later cycles and produce erroneous sequences
iv. Oxidation: To stabilize the growing chain, the newly formed linkage between the existing nucleotide and the successfully coupled nucleotide is oxidized into a more stable phosphate linkage.
The newly added nucleotide is then deblocked, and the cycle repeats to continue the synthesis.
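Since a computer drives these four steps, the cycle can be written down as a simple instruction list. The sketch below is only an illustration of the order of operations: the step names and the `synthesis_program` function are hypothetical, it assumes the first (3′-most) base comes pre-attached to a solid support, and it leaves out final cleavage and deprotection.

```python
def synthesis_program(sequence: str) -> list:
    """List the steps a synthesizer would run for `sequence` (written 5'->3').

    Chemical synthesis extends the chain at its 5' end, so bases are
    coupled in the reverse of the written order.
    """
    if not sequence:
        raise ValueError("sequence must contain at least one base")
    order = sequence.upper()[::-1]            # 3'-most base first
    program = [("load", order[0])]            # assumed pre-attached to the support
    for base in order[1:]:
        program.append(("deblock", None))     # i.   remove DMT, free the 5' hydroxyl
        program.append(("couple", base))      # ii.  attach the next nucleotide
        program.append(("cap", None))         # iii. block strands that failed to couple
        program.append(("oxidize", None))     # iv.  stabilize the new linkage
    return program


for step, base in synthesis_program("ACGT"):
    print(step, base or "")
```

For the four-base sequence ACGT, the program loads T and then runs three full cycles that couple G, C and A in that order.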
In addition to being expensive, slow, laborious and limited in capacity (about 200 bases max), the chemistry involved in adding the nucleotides in this process can also start to degrade the already-synthesized portion.
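The length limit follows from the coupling efficiency defined earlier: every cycle loses a small fraction of strands, and the losses compound. A rough sketch, assuming each of the n − 1 couplings for an n-base oligo succeeds independently with the same efficiency (the 99% and 99.5% figures below are illustrative assumptions, not measured values):

```python
def full_length_yield(length: int, coupling_efficiency: float) -> float:
    """Fraction of strands that reach full length.

    An oligo of `length` bases needs `length - 1` coupling steps, and a
    strand is full length only if every one of them succeeds.
    """
    return coupling_efficiency ** (length - 1)


for n in (20, 100, 200, 1000):
    at_99 = full_length_yield(n, 0.99)
    at_995 = full_length_yield(n, 0.995)
    print(f"{n:>5} bases: {at_99:.1%} at 99% efficiency, {at_995:.1%} at 99.5%")
```

Under these assumptions only about one strand in seven is full length at 200 bases with 99% efficiency, and almost none at 1,000 bases, which is why even small gains in per-step efficiency matter so much for long DNA.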
Since DNA polymerases are nature’s way of copying DNA, they’re probably pretty good at it. Enzymatic synthesis involves replicating nature’s process.
The enzymatic approach to DNA synthesis has the potential to circumvent the limitations of chemical synthesis and accelerate the progress of synthetic biology. Enzymatic synthesis is a novel mechanism based on proprietary monomers and proprietary enzymes in an aqueous (water-based) environment.
The enzymatic approach works as follows:
i. Enzyme and modified nucleotides are added
a. DNA synthesis enzyme, an enzyme that is engineered to accept modified nucleotides and
b. Modified nucleotides: nucleotides that are modified to ensure only a single addition in each cycle and prevent the addition of several random nucleotides
ii. The enzyme and the free, unincorporated nucleotides (dNTPs) are washed away
After this, the system is primed for the next round of single-nucleotide addition. Wash, rinse, repeat.
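The single-addition idea can be sketched as a toy model. This is not any company's actual chemistry: it assumes the modification is a removable blocking group on the 3′ end, that "priming" the system means removing that group, and that synthesis starts from a short starter strand. The `GrowingStrand` class, the `initiator` argument and the method names are all hypothetical.

```python
class GrowingStrand:
    """Toy model of a strand extended one modified nucleotide at a time."""

    def __init__(self, initiator: str):
        self.sequence = initiator   # written 5'->3'
        self.blocked = False        # True while the 3' end carries a blocking group

    def add_modified_nucleotide(self, base: str) -> bool:
        """Step i: the enzyme adds one nucleotide if the 3' end is free."""
        if self.blocked:
            return False            # the modification prevents a second addition
        self.sequence += base
        self.blocked = True
        return True

    def deblock(self) -> None:
        """Assumed priming step: remove the block so the next cycle can run."""
        self.blocked = False


def enzymatic_synthesis(initiator: str, target: str) -> str:
    strand = GrowingStrand(initiator)
    for base in target:
        strand.add_modified_nucleotide(base)   # step i: enzyme + modified nucleotide
        # step ii: wash away enzyme and free dNTPs (no change to the strand)
        strand.deblock()                       # prime for the next round
    return strand.sequence


print(enzymatic_synthesis("AC", "GGT"))
```

Unlike the chemical route, the strand here grows at its 3′ end, so bases are added in the same order in which the sequence is written.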
A key ingredient in enzymatic synthesis is terminal deoxynucleotidyl transferase (TdT), a DNA polymerase. Most polymerases can only copy DNA from a complementary template strand. TdT has the unique ability to add nucleotides to the 3' end of a single strand without a template strand. However, companies such as Camena Bioscience use a proprietary combination of enzymes instead of TdT to achieve template-free DNA synthesis.
Compared to chemical synthesis, the enzymatic approach promises to be a faster, simpler, more environmentally friendly, highly efficient, highly scalable and widely applicable process.
DNA Script is a French company with $112M in funding that harnesses enzymes to synthesize DNA. They became the first company to sell enzymatic DNA bio-printers.
Molecular Assemblies, a US startup with $30M in funding so far, is developing an enzymatic, platform-independent synthesis technology to produce long, high-quality, sequence-specific DNA reliably and affordably.
Ansa Biotechnologies, a US startup with $9.2M in seed funding, has taken a slightly different path in their enzymatic DNA synthesis approach. Instead of coaxing TdT to accept modified nucleotides, they modify the enzyme by conjugating each TdT to a single deoxyribonucleoside triphosphate (dNTP) molecule.
Camena Bioscience is a UK-based startup building tools to enable enzymatic DNA synthesis. Just like Ansa Biotechnologies, they aim to be a service provider in the synbio space, with applications spanning multiple industries from data storage to agriculture.
Evonetix is a UK-based startup developing a desktop DNA writer enabled by a proprietary silicon chip that will act as a ‘plug and play’ instrument allowing large-scale DNA synthesis to occur in parallel.
Nuclera is another UK-based startup, with $11.3M in funding, developing desktop DNA printers. Their water-based DNA synthesis and automation platform enables the production of gene and genome libraries to synthesize long DNAs.
Ribbon Biolabs, an Austrian seed-stage startup, is leveraging enzymatic DNA synthesis to manufacture DNA of lengths up to 10K base pairs in a few hours.
Kern Systems is a pre-seed US-based startup that leverages the power of enzymatic DNA synthesis technology to create a scalable and durable data storage infrastructure.
Helixworks, an Irish seed-stage startup, is also targeting data storage solutions through their proprietary platform that combines programmable DNA sequences, a modular assembly process and enzymatic DNA synthesis.
SynHelix is a French startup aiming to enable robust, scalable, high-yield DNA synthesis by employing a specifically engineered enzyme.
DNA synthesis can open a whole new world of opportunities and transform our lives across healthcare, climate change and new industrial applications. However, the risks are also significant. Progress with caution should be the way forward.
An amazing interactive simulation developed by LabXchange at Harvard outlines the nature of nucleotide base-pairing and the structure of DNA.
For a basic or expert understanding of Next Generation Sequencing, Youseq has some kickass videos.