INTRODUCTION: EVOLUTION OF SEQUENCING TECHNOLOGY
The term “next-generation sequencing” (NGS) raises the questions of what “first-generation sequencing” was and how NGS is both similar to and different from its predecessor. Sanger developed the first generation of DNA sequencing in the 1970s [1,2]. His eponymous sequencing approach is an in vitro adaptation of the cellular replication machinery that cleverly leverages unextendable DNA bases. These modified bases are introduced at low concentration in a reaction minimally containing (1) a high concentration of extendable bases, (2) the single-stranded DNA template to be sequenced, (3) a short oligonucleotide primer complementary to the template onto which new bases can be synthesized, and (4) DNA polymerase enzymes that execute the extension reaction. Early Sanger sequencing experiments involved four independent reactions, each containing a single type of unextendable base (A, T, G, or C). Whenever a polymerase randomly incorporates one of the unextendable bases into the nascent DNA molecule (e.g., an unextendable G in the nascent strand incorporated opposite a C in the template), further synthesis terminates, yielding a truncated copy of the template. Critically, since all nascent strands are anchored at the same oligonucleotide primer, the position of extension termination, and hence the length of the nascent DNA strand, is a direct proxy for the position of the base at the 3' end of the molecule. By using gel electrophoresis to resolve the respective lengths of terminated molecules in each of the four reactions, it is possible to infer the sequence of the entire template.
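The four-reaction logic described above can be illustrated with a small simulation. This is a minimal sketch rather than a model of real chemistry: the function names (`run_lane`, `read_gel`), the termination probability `p_stop`, the molecule count, and the example sequence are all assumptions chosen for illustration. The sketch tracks the nascent strand, which is complementary to the template, and assumes every position is sampled by at least one terminated molecule.

```python
import random

BASES = "ATGC"

def run_lane(nascent, terminator, n_molecules=2000, p_stop=0.1, rng=None):
    """Simulate one reaction that contains a single type of unextendable base.

    Returns the set of fragment lengths (bases added past the primer)
    observed among the molecules that terminated.
    """
    if rng is None:
        rng = random.Random(0)
    lengths = set()
    for _ in range(n_molecules):
        for position, base in enumerate(nascent, start=1):
            # Only this lane's base can stop extension, and only occasionally,
            # because the unextendable form is present at low concentration.
            if base == terminator and rng.random() < p_stop:
                lengths.add(position)
                break
    return lengths

def read_gel(lanes):
    """Infer the sequence by reading bands from shortest to longest."""
    band_to_base = {}
    for base, lengths in lanes.items():
        for length in lengths:
            band_to_base[length] = base
    return "".join(band_to_base[length] for length in sorted(band_to_base))

nascent = "GATTACAGC"  # hypothetical strand synthesized from the primer
rng = random.Random(42)
lanes = {base: run_lane(nascent, base, rng=rng) for base in BASES}
inferred = read_gel(lanes)
```

Each lane contributes only the fragment lengths at which its own base occurs, so merging the four lanes and sorting by length recovers the order of bases. If too few molecules were simulated, some lengths could be missing and the inferred sequence would have gaps.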
Sanger sequencing became slightly more scalable with the introduction of unextendable bases that were uniquely dyed (Fig. 1). Instead of partitioning the sample into four reactions to obtain base-specific information, a capillary electrophoresis instrument coupled with a fluorescent dye detector could resolve both the relative sizes of fragments and the identity of their terminating bases [3-5]. To criticize these machines as unscalable neglects one of their unmitigated triumphs: they were the workhorses that sequenced the very first human genome in the 1990s [6-9]. However, with a cost in the billions of dollars and a timeline on the order of years, genome sequencing would remain prohibitive in a clinical setting without a major technological leap.
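The single-reaction, dye-based readout can be sketched in a few lines. The dye-to-base assignment and the `(fragment_length, dye)` detection format below are hypothetical and stand in for whatever a given instrument reports. Because shorter fragments migrate faster, sorting by length mimics the order in which fragments pass the detector.

```python
# Hypothetical dye assignment, for illustration only.
DYE_TO_BASE = {"green": "A", "red": "T", "yellow": "G", "blue": "C"}

def call_bases(detections):
    """Order (fragment_length, dye) detections by length and map each dye to its base."""
    return "".join(DYE_TO_BASE[dye] for _, dye in sorted(detections))

# Detections listed in arbitrary order; sorting restores the order of migration.
detections = [(3, "red"), (1, "yellow"), (4, "blue"), (2, "green")]
sequence = call_bases(detections)
```

Here the color of each fragment carries the information that lane identity carried in the four-reaction format, so one reaction suffices.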
NGS revolutionized genome sequencing by overcoming many of the limitations of the Sanger technique [10], yet the most pervasive NGS methodology shares much in common with its predecessor. As described in further detail later, NGS also leverages extension termination and fluorescent bases, and it relies upon DNA polymerases that append a single base at a time to a nascent DNA molecule. Indeed, in many respects, an NGS experiment is comparable to performing millions or billions of Sanger reactions in parallel (hence the NGS moniker “massively parallel sequencing”). This explosive increase in throughput shattered some of the barriers (e.g., cost and turnaround time) that had largely prevented the use of genomics in routine clinical care, and it paved the way for prenatal testing based on cell-free DNA (cfDNA).
[FIG1]: Overview of Sanger sequencing. (A) The Sanger synthesis reaction contains a nucleotide mix with both extendable bases and a low concentration of unextendable, dyed bases (a DNA primer, DNA polymerase, and buffer are also required and are not shown). (B) Extended molecules terminate at different locations, with each molecule having the color of its last incorporated base. (C) Arrangement of the terminated molecules in an electric field (DNA is negatively charged) and passage through a capillary equipped with a dye detector yield fluorescence peaks that reveal the original template's sequence.