Methods and systems for genomic analysis

Inventors

Harris, Jason; Pratt, Mark R.; West, John; Chen, Richard; Li, Ming

Assignees

Personalis Inc


Publication Number

US-11935625-B2

Patent

Publication Date

2024-03-19

Expiration Date


Abstract

A computer-implemented method for processing and/or analyzing nucleic acid sequencing data comprises receiving a first data input and a second data input. The first data input comprises untargeted sequencing data generated from a first nucleic acid sample obtained from a subject. The second data input comprises target-specific sequencing data generated from a second nucleic acid sample obtained from the subject. Next, with the aid of a computer processor, the first data input and the second data input are combined to produce a combined data set. Next, an output derived from the combined data set is generated. The output is indicative of the presence or absence of one or more polymorphisms of the first nucleic acid sample and/or the second nucleic acid sample.

Core Innovation

The invention provides a computer-implemented method for genetic analysis of a subject. A first nucleic acid sample is subjected to an untargeted sequencing reaction to generate a first set of sequencing data, which covers at least a portion of the subject's genome. A second nucleic acid sample is subjected to a target-specific sequencing reaction to generate a second set of sequencing data comprising RNA sequencing data.

A computer generates a combined output from the first set of sequencing data and the second set of sequencing data. The combined output is indicative of a presence or absence of one or more polymorphisms in at least a portion of a genome of the subject, including single nucleotide variants, insertions/deletions, copy number variations, structural variants, and haplotypes.
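As a rough illustration of what a combined output might look like, the sketch below merges variant calls from an untargeted data set and a target-specific data set into one report. It is not the patented method: the `Variant` record, the `combine_calls` function, the read-depth inputs, and the example coordinates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant key: chromosome, 1-based position, ref and alt alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def combine_calls(untargeted, targeted):
    """Merge two call sets into one report keyed by variant.

    Each input maps Variant -> number of reads supporting the alternate allele.
    The output records which data set(s) supported each variant and the
    summed supporting depth (an assumed, simplistic way to pool evidence).
    """
    combined = {}
    for source, calls in (("untargeted", untargeted), ("targeted", targeted)):
        for variant, depth in calls.items():
            entry = combined.setdefault(variant, {"sources": [], "alt_depth": 0})
            entry["sources"].append(source)
            entry["alt_depth"] += depth
    return combined


# Made-up example inputs: one variant seen in both data sets, one seen only
# in the target-specific data.
untargeted_calls = {Variant("chr1", 1000, "A", "G"): 4}
targeted_calls = {
    Variant("chr1", 1000, "A", "G"): 30,
    Variant("chr2", 500, "C", "T"): 12,
}
report = combine_calls(untargeted_calls, targeted_calls)
```

In this sketch a variant absent from `report` would be read as "not detected" in either data set; a real pipeline would also need to track whether each site was covered at all.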

The sequencing data are processed to support variant detection; the processing steps include alignment and mapping to a reference sequence, removal of redundant sequences, and genomic binning. Optional statistical modeling, including Hidden Markov Models, is described for copy number variation detection and filtering. The described systems include a computer processor, memory, and a display or electronic report for presenting results indicative of polymorphisms.
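Two of the processing steps named above, removal of redundant sequences and genomic binning, can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure in the patent: reads are represented as plain dictionaries that are assumed to be already aligned, duplicates are assumed to be reads sharing chromosome, start position, and strand, and the function names and bin size are hypothetical.

```python
from collections import Counter


def remove_duplicates(reads):
    """Keep the first read seen for each (chrom, start, strand) key."""
    seen = set()
    unique = []
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"])
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


def bin_counts(reads, bin_size):
    """Count reads per (chrom, bin index), where bin index = start // bin_size."""
    return Counter((r["chrom"], r["start"] // bin_size) for r in reads)


# Made-up aligned reads; the second entry duplicates the first.
reads = [
    {"chrom": "chr1", "start": 100, "strand": "+"},
    {"chrom": "chr1", "start": 100, "strand": "+"},
    {"chrom": "chr1", "start": 950, "strand": "-"},
    {"chrom": "chr1", "start": 1200, "strand": "+"},
]
unique_reads = remove_duplicates(reads)
counts = bin_counts(unique_reads, bin_size=1000)
```

Per-bin counts of this kind are a common input to copy number analysis, where each bin's depth is compared against an expected value.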

Claims Coverage

The independent claim describes three inventive features: untargeted sequencing data generation, target-specific sequencing data generation including RNA sequencing data, and computer generation of a combined output indicative of polymorphisms.

Untargeted sequencing to generate first set of sequencing data

Subjecting a first nucleic acid sample of the subject to an untargeted sequencing reaction to generate a first set of sequencing data.

Target-specific sequencing to generate RNA sequencing data

Subjecting a second nucleic acid sample of the subject to a target-specific sequencing reaction to generate a second set of sequencing data comprising RNA sequencing data.

Computer-generated combined output indicative of polymorphisms

Using a computer to generate a combined output from the first set of sequencing data and the second set of sequencing data, where the combined output is indicative of a presence or absence of one or more polymorphisms in at least a portion of a genome of the subject.

Overall, the claim coverage centers on combining untargeted sequencing data with target-specific sequencing data that includes RNA sequencing data to generate a computer-produced output indicative of one or more genome polymorphisms.

Stated Advantages

Improved performance and sensitivity/specificity relative to exome-only sequencing and/or high-coverage whole genome sequencing.

Support for variant detection sensitivity via processing steps and optional statistical modeling, including Hidden Markov Models for CNV detection and filtering.

Presentation of results via electronic display and electronic reports.

Reduced turnaround time and reduced cost/reagent requirements compared with high-coverage whole genome sequencing.

Clinical outputs including diagnosis, prognosis, genetic risk, and pharmacogenetic/therapy guidance based on combined outputs.

Documented Applications

Genetic analysis outputs directed to diagnosis, prognosis, genetic risk, and pharmacogenetic/therapy guidance.

Detection and characterization of genomic regions including copy number variation, polymorphisms, single nucleotide variants, and haplotypes.

Fetal diagnostics based on genomic regions on chromosomes 13/18/21/X/Y.

Handling of off-target exome outcomes using Hidden Markov Model processing.

Analysis and processing of pathogen-derived nucleic acids.
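The Hidden Markov Model processing mentioned above for copy number variation can be illustrated with a small Viterbi decoder over per-bin log2 depth ratios. This is a generic textbook sketch, not the model described in the patent: the three states, their expected log2 ratios, the Gaussian emission width, and the transition probability are all assumed values chosen for illustration.

```python
import math

STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -1.0, "neutral": 0.0, "gain": 0.58}  # assumed log2 ratios
SIGMA = 0.3   # assumed emission standard deviation
STAY = 0.98   # assumed probability of remaining in the same state


def log_emission(state, x):
    """Log density of a Gaussian centred on the state's expected log2 ratio."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2 * math.pi))


def log_transition(prev, cur):
    """Log probability of moving from prev to cur (uniform over switches)."""
    switch = (1.0 - STAY) / (len(STATES) - 1)
    return math.log(STAY if prev == cur else switch)


def viterbi(log2_ratios):
    """Return the most likely copy number state for each bin."""
    if not log2_ratios:
        return []
    # Uniform prior over the starting state.
    score = {
        s: math.log(1.0 / len(STATES)) + log_emission(s, log2_ratios[0])
        for s in STATES
    }
    back = []
    for x in log2_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            new_score[cur] = (
                score[best_prev] + log_transition(best_prev, cur) + log_emission(cur, x)
            )
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Made-up per-bin log2 ratios with a three-bin dip in the middle.
ratios = [0.05, -0.1, 0.0, -0.9, -1.1, -1.0, 0.1, 0.0]
calls = viterbi(ratios)
```

The high self-transition probability is what lets a model of this kind smooth over noisy single bins while still calling a sustained run of low ratios as a loss.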
