What is whole genome sequencing?

Whole genome sequencing (WGS) is the process of determining the complete DNA sequence of an individual organism. When a reference genome already exists for the species, sequencing individuals or populations against it is called whole genome resequencing. The resequencing results are compared with the reference genome to detect genome-wide single nucleotide polymorphisms (SNPs), insertions and deletions (indels), copy number variations (CNVs) and structural variations (SVs), which together describe the molecular genetic characteristics of the individual or population. Whole genome resequencing is widely used in genetic variation detection, trait gene mapping, genetic map construction and genetic evolution research, including candidate gene prediction and genetic evolution analysis for economically important traits in animals. The most critical step in analyzing resequencing data is sequence alignment, in which the sequenced reads are compared with the existing reference genome sequence. Alignment is generally performed in two steps: first, the reads or the reference genome sequence are sorted (indexed); then the reads are aligned and located on the reference with an appropriate algorithm.
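The two-step alignment described above can be illustrated with a toy Python sketch. It is not how production aligners work internally; it only assumes a simple k-mer lookup table for the sorting/indexing step and a mismatch count for the locating step. The function names, the example sequences and the `max_mismatches` parameter are all hypothetical.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Step 1: index the reference by recording where each k-mer starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def align_read(read, reference, index, k, max_mismatches=2):
    """Step 2: seed with the read's first k-mer, then verify the whole read.

    Returns (start_position, mismatch_positions) for the best hit, or None.
    """
    best = None
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run past the end of the reference
        mismatches = [pos + i for i, (a, b) in enumerate(zip(read, window)) if a != b]
        if len(mismatches) > max_mismatches:
            continue
        if best is None or len(mismatches) < len(best[1]):
            best = (pos, mismatches)
    return best


# Made-up sequences for illustration only.
reference = "ACGTTGCAAGGCTTACGATCGGATACCGT"
read = "GGCTTACGTTCGG"  # differs from the reference at a single base

index = build_kmer_index(reference, k=5)
hit = align_read(read, reference, index, k=5)
print(hit)  # (9, [17])
```

The mismatch at reference position 17 (reference base A, read base T) is the kind of difference that variant calling would later evaluate as a candidate SNP, once enough reads support it.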

What are the sequencing indicators?

Sequencing depth and sequencing coverage are two important indicators for evaluating a sequencing run. Sequencing depth measures the amount of sequencing performed on a sample: it is the ratio of the total number of bases obtained by sequencing to the size of the genome. The greater the sequencing depth, the lower the probability of a false positive result. If the individual is sequenced with paired-end reads, the sequencing depth should be kept at 50X to 100X or above to ensure genome coverage and to control the sequencing error rate. Sequencing coverage is the proportion of the genome covered by the sequenced bases; it reflects the randomness of sequencing and is positively correlated with sequencing depth. The specific relationship between sequencing depth and coverage can be described by the Lander-Waterman model.
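As a rough illustration, the depth calculation and the Lander-Waterman relationship can be written in a few lines of Python. The sketch assumes the idealized form of the model, in which reads land uniformly at random on the genome, so that the expected fraction of the genome covered at least once is 1 - e^(-c) for depth c. Real data deviate from this because of GC bias, repeats and unmappable regions. The run parameters below are hypothetical.

```python
import math


def sequencing_depth(total_bases, genome_size):
    """Depth = total sequenced bases divided by genome size."""
    return total_bases / genome_size


def expected_coverage(depth):
    """Expected fraction of the genome covered at least once,
    under the idealized Lander-Waterman (Poisson) assumption."""
    return 1.0 - math.exp(-depth)


# Hypothetical run: 600 million reads of 150 bp on a 3 Gb genome.
num_reads = 600_000_000
read_length = 150
genome_size = 3_000_000_000

depth = sequencing_depth(num_reads * read_length, genome_size)
print(f"depth: {depth:.0f}X")  # depth: 30X

for c in (1, 5, 30):
    print(f"{c}X -> expected coverage {expected_coverage(c):.6f}")
```

Under this model, coverage rises quickly with depth: about 63% of the genome is covered at 1X and more than 99% at 5X, which is why depth and coverage are positively correlated but not proportional.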

How has the technology developed?

Whole genome sequencing mainly uses next-generation sequencing (NGS) and third-generation sequencing technologies.

1). First-generation sequencing technology, also known as Sanger sequencing (the dideoxy chain-termination method), was in wide use in the early 1990s. It adopts a cycle sequencing mode, but early versions were prone to errors when reading base information, and the amount of data acquired is small. With continuous development, the accuracy of Sanger sequencing has reached almost 100%, and read lengths have reached about 1,000 bp.

2). Second-generation sequencing technology is high-throughput sequencing. The central idea is sequencing by synthesis: the four bases A, T, C and G are labeled with fluorescent dyes of different colors, and the DNA sequence is determined from the label detected at the newly synthesized end. The main platforms are the Illumina/Solexa Genome Analyzer, Roche/454 FLX and the Applied Biosystems SOLiD system, each with its own advantages. The general procedure is: sequencing library construction → adapter ligation → pre-amplification → single-base extension sequencing → data analysis. Second-generation sequencing has enabled rapid, low-cost detection of whole genome sequences and has greatly increased the amount of data obtained.

3). Third-generation sequencing technology was officially launched in 2011. It does not require PCR amplification and can sequence each DNA molecule individually, so it is also called single-molecule sequencing, and the cost of sequencing is lower. There are two major approaches: single-molecule fluorescence sequencing (SMRT technology) and nanopore sequencing (an electrophoresis-based technology). The platforms include the HeliScope/Helicos Genetic Analysis System, SMRT and nanopore single-molecule sequencers. Nanopore sequencing eliminates the elution and PCR amplification steps entirely, achieving ultra-long read lengths, high throughput, shorter sequencing time and simpler data analysis.
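The sequencing-by-synthesis idea described under 2), where each base carries a different fluorescent color and the sequence is read from the signal at the newly synthesized end, can be sketched as a toy base caller in Python. This is only an illustration under simplifying assumptions: one intensity value per base channel per cycle, and the brightest channel wins. Real platform software also corrects for channel crosstalk, phasing and signal decay. The channel order and intensity values are hypothetical.

```python
# Assumed channel order for this sketch; real instruments differ.
CHANNELS = ("A", "C", "G", "T")


def call_bases(cycles):
    """Call one base per sequencing cycle by picking the brightest channel."""
    called = []
    for intensities in cycles:
        brightest = max(range(len(CHANNELS)), key=lambda i: intensities[i])
        called.append(CHANNELS[brightest])
    return "".join(called)


# Made-up fluorescence intensities (A, C, G, T) for four cycles.
cycles = [
    (0.91, 0.03, 0.04, 0.02),
    (0.05, 0.02, 0.88, 0.05),
    (0.04, 0.07, 0.03, 0.86),
    (0.02, 0.90, 0.05, 0.03),
]

read = call_bases(cycles)
print(read)  # AGTC
```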

Each generation of sequencing technology played an important role in its time and has since been improved. Researchers, however, are increasingly demanding: they hope to obtain more genomic information at lower cost and in less time, and they believe that the latest sequencing technology can change the world. As the technology advances, detection methods keep improving, and sequencing is becoming more convenient and efficient.

Author's Bio: 

CD Genomics was established in 2004 with the aim of providing the research community with high-quality next-generation sequencing, PacBio SMRT sequencing and microarray services. As demand for our services has grown, CD Genomics has updated its technology platform to mainstream NGS and microarray instruments. To date, our senior bioinformaticians have reviewed more than ten thousand trace files and accumulated extensive experience with our Illumina HiSeq 2500, HiSeq 4000, MiSeq Benchtop Sequencer, PacBio Sequel, PacBio RS II, Ion Torrent PGM and ABI 3730/3730XL analyzers. We continue to work hard to offer the same dependable services to pharmaceutical and biotech companies, as well as academia and government agencies, to satisfy all your sequencing and array needs.