Methods and systems for genomic analysis
Inventors
Harris, Jason • Pratt, Mark R. • West, John • Chen, Richard • Li, Ming
Assignees
Abstract
A computer-implemented method for processing and/or analyzing nucleic acid sequencing data comprises receiving a first data input and a second data input. The first data input comprises untargeted sequencing data generated from a first nucleic acid sample obtained from a subject. The second data input comprises target-specific sequencing data generated from a second nucleic acid sample obtained from the subject. Next, with the aid of a computer processor, the first data input and the second data input are combined to produce a combined data set. Next, an output derived from the combined data set is generated. The output is indicative of the presence or absence of one or more polymorphisms of the first nucleic acid sample and/or the second nucleic acid sample.
Core Innovation
The invention provides a computer-implemented method for genetic analysis of a subject. A first nucleic acid sample is subjected to an untargeted sequencing reaction to generate a first set of sequencing data, which covers at least a portion of the subject's genome. A second nucleic acid sample is subjected to a target-specific sequencing reaction to generate a second set of sequencing data comprising RNA sequencing data.
A computer generates a combined output from the first set of sequencing data and the second set of sequencing data. The combined output is indicative of a presence or absence of one or more polymorphisms in at least a portion of a genome of the subject. The polymorphisms include single nucleotide variants, insertions/deletions, copy number variations, structural variants, and haplotypes.
The sequencing data are processed to support variant detection. The processing steps include alignment mapping to a reference sequence, removal of redundant sequences, and genomic binning. Optional statistical modeling, including Hidden Markov Models, is described for copy number variation detection and filtering. The systems include a computer processor, memory, and a display or electronic report for presenting results indicative of polymorphisms.
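As an illustration of two of the processing steps named above (removal of redundant sequences and genomic binning), the following is a minimal Python sketch. It is not the patent's implementation. The read tuples, the duplicate key (chromosome, start, strand), the function names, and the 1,000-base bin size are all assumptions chosen for the example.

```python
from collections import Counter


def deduplicate(reads):
    """Keep the first read for each (chrom, start, strand) key, in input order."""
    seen = set()
    unique = []
    for chrom, start, strand in reads:
        key = (chrom, start, strand)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def bin_counts(reads, bin_size):
    """Count reads per fixed-width genomic bin, keyed by (chrom, bin_index)."""
    return Counter((chrom, start // bin_size) for chrom, start, _ in reads)


# Hypothetical aligned reads: (chromosome, 0-based start, strand).
reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),   # redundant copy of the first read
    ("chr1", 950, "-"),
    ("chr1", 1200, "+"),
    ("chr2", 50, "+"),
]

unique_reads = deduplicate(reads)
counts = bin_counts(unique_reads, bin_size=1000)  # bin size is an assumption
```

Per-bin counts of this kind are a common input to copy number analysis; the choice of duplicate key and bin width here is illustrative only.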
Claims Coverage
The independent claim describes three inventive features: untargeted sequencing data generation, target-specific sequencing data generation including RNA sequencing data, and computer generation of a combined output indicative of polymorphisms.
Untargeted sequencing to generate first set of sequencing data
Subjecting a first nucleic acid sample of the subject to an untargeted sequencing reaction to generate a first set of sequencing data.
Target-specific sequencing to generate RNA sequencing data
Subjecting a second nucleic acid sample of the subject to a target-specific sequencing reaction to generate a second set of sequencing data comprising RNA sequencing data.
Computer-generated combined output indicative of polymorphisms
Using a computer to generate a combined output from the first set of sequencing data and the second set of sequencing data, where the combined output is indicative of a presence or absence of one or more polymorphisms in at least a portion of a genome of the subject.
Overall, the claim coverage centers on combining untargeted sequencing data with target-specific sequencing data that includes RNA sequencing data to generate a computer-produced output indicative of one or more genome polymorphisms.
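The combining step can be pictured with a small Python sketch. It assumes, purely for illustration, that each data set has already been reduced to variant calls keyed by (chromosome, position, reference allele, alternate allele) with a supporting read depth. The function names, the data, and the merge rule (union of calls with summed depth) are hypothetical and are not taken from the claims.

```python
def combine_calls(untargeted, targeted):
    """Merge two call sets, recording which source(s) support each variant."""
    combined = {}
    for source, calls in (("untargeted", untargeted), ("targeted", targeted)):
        for variant, depth in calls.items():
            entry = combined.setdefault(variant, {"sources": [], "depth": 0})
            entry["sources"].append(source)
            entry["depth"] += depth
    return combined


def presence_report(combined, sites):
    """Return True/False for each queried site: present in the combined set?"""
    return {site: site in combined for site in sites}


# Hypothetical calls: (chrom, pos, ref, alt) -> supporting read depth.
untargeted_calls = {
    ("chr1", 12345, "A", "G"): 4,
    ("chr7", 555, "C", "T"): 3,
}
targeted_calls = {
    ("chr1", 12345, "A", "G"): 60,
    ("chr17", 999, "G", "GA"): 45,
}

combined = combine_calls(untargeted_calls, targeted_calls)
report = presence_report(
    combined,
    [("chr1", 12345, "A", "G"), ("chr2", 1, "T", "C")],
)
```

In this sketch a variant seen in both inputs keeps both source labels, so a downstream report can distinguish calls supported by one data set from calls supported by both.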
Stated Advantages
Improved performance and sensitivity/specificity relative to exome-only and/or high-coverage genome sequencing.
Support for variant detection sensitivity via processing steps and optional statistical modeling, including Hidden Markov Models for CNV detection and filtering.
Presentation of results via electronic display and electronic reports.
Reduced turnaround time and reduced cost/reagent requirements compared with high-coverage whole genome sequencing.
Clinical outputs including diagnosis, prognosis, genetic risk, and pharmacogenetic/therapy guidance based on combined outputs.
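The Hidden Markov Model approach to CNV detection mentioned above can be sketched generically in Python. The summary does not give the model's states or parameters, so everything below is an assumption: three states (loss, neutral, gain), Gaussian emissions over per-bin log2 coverage ratios, a fixed probability of staying in the same state, and Viterbi decoding to find the most likely state path.

```python
import math

STATES = ("loss", "neutral", "gain")
# Assumed mean log2 ratios for one-copy loss, two copies, one-copy gain.
MEANS = {"loss": -1.0, "neutral": 0.0, "gain": 0.58}
SIGMA = 0.2   # assumed emission standard deviation
STAY = 0.9    # assumed probability of remaining in the same state


def log_emission(state, x):
    """Log density of a Gaussian centered on the state's mean."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2 * math.pi))


def log_transition(prev, cur):
    """Log probability of moving from prev to cur."""
    if prev == cur:
        return math.log(STAY)
    return math.log((1 - STAY) / (len(STATES) - 1))


def viterbi(log2_ratios):
    """Return the most likely state for each bin under the assumed model."""
    if not log2_ratios:
        return []
    score = {
        s: math.log(1 / len(STATES)) + log_emission(s, log2_ratios[0])
        for s in STATES
    }
    backpointers = []
    for x in log2_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(
                STATES, key=lambda p: score[p] + log_transition(p, cur)
            )
            new_score[cur] = (
                score[best_prev]
                + log_transition(best_prev, cur)
                + log_emission(cur, x)
            )
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical per-bin log2 coverage ratios (sample versus reference).
ratios = [0.05, -0.02, 0.6, 0.55, 0.62, 0.01, 0.04, -0.95, -1.1, 0.03]
calls = viterbi(ratios)
```

With these assumed parameters, the three bins near 0.6 decode as a gain segment and the two bins near -1 decode as a loss segment, while the remaining bins stay neutral. The transition penalty is what keeps a single noisy bin from being called as its own segment.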
Documented Applications
Genetic analysis outputs directed to diagnosis, prognosis, genetic risk, and pharmacogenetic/therapy guidance.
Detection and characterization of genomic regions including copy number variation, polymorphisms, single nucleotide variants, and haplotypes.
Fetal diagnostics based on genomic regions on chromosomes 13/18/21/X/Y.
Handling of off-target exome outcomes using Hidden Markov Model processing.
Analysis and processing of pathogen-derived nucleic acids.