Methods and systems for determination of the number of contributors to a DNA mixture
Inventors
Marciano, Michael • Adelman, Jonathan D.
Assignees
Publication Number
US-12073923-B2
Publication Date
2024-08-27
Expiration Date
2036-12-02
Abstract
A system configured to characterize a number of contributors to a DNA mixture within a sample, the system comprising: a sample preparation module configured to generate initial data about the DNA mixture within the sample; a processor comprising a number of contributors determination module comprising a machine-learning algorithm configured to: (i) receive the generated initial data; (ii) analyze the generated initial data to determine the number of contributors to the DNA mixture within the sample; and an output device configured to receive the determined number of contributors from the processor, and further configured to output information about the received determined number of contributors.
Core Innovation
The invention provides methods and systems for determining the number of contributors to a DNA mixture in a sample, utilizing a machine learning approach. The system features a sample preparation module to generate initial data regarding the DNA mixture, followed by a processor running a number of contributors determination module that contains a machine learning algorithm. This algorithm receives and analyzes the generated data to estimate the number of contributors, and an output device then provides this result.
The problem addressed is the challenge in forensic and clinical settings of accurately interpreting DNA mixtures, particularly in identifying the number of individual contributors. Prior methods require assumptions about contributor counts that, if incorrect, can severely affect mixture interpretation, sometimes resulting in inaccurate analyses or time-consuming processes.
The disclosed solution computes the number of DNA contributors by probabilistically analyzing a combination of categorical and quantitative features from the DNA data. The system is described as computationally inexpensive and capable of delivering results within seconds on standard computer hardware. The machine learning approach is adaptable: it can be trained on a range of input features that describe the DNA mixture, and it can incorporate different types of data, such as peak counts and probability of dropout, in its determinations.
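To make the idea of a probabilistic determination concrete, the following is a minimal sketch of a classifier that turns mixture features into a probability for each candidate contributor count. It uses a small Gaussian naive Bayes model as a stand-in; this summary does not say which machine-learning algorithm the patent uses, so the choice of model is an assumption. The function names, the two-feature layout, and all training values are hypothetical and invented purely for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate class priors and per-feature means and variances."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small floor keeps every variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict_proba(model, row):
    """Return a posterior probability for each candidate contributor count."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        log_p = math.log(prior)
        for x, m, v in zip(row, means, variances):
            log_p += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        log_scores[label] = log_p
    # Subtract the largest log score before exponentiating for stability.
    top = max(log_scores.values())
    weights = {k: math.exp(s - top) for k, s in log_scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Invented toy rows: [sample-wide peak count, largest peak count at one locus].
train_rows = [[30, 2], [32, 2], [48, 4], [50, 4], [63, 6], [66, 5]]
train_labels = [1, 1, 2, 2, 3, 3]  # known number of contributors per row

model = fit_gaussian_nb(train_rows, train_labels)
posterior = predict_proba(model, [49, 4])
best = max(posterior, key=posterior.get)
print(best, posterior)
```

The output is a distribution over contributor counts rather than a single hard label, which is one way to read "probabilistically analyzing" in the paragraph above. A real model would be trained on laboratory data with many more features and samples.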
Claims Coverage
There are two independent claims, each defining inventive features related to a system for determining the number of contributors in a DNA mixture using machine learning with specified features.
System for DNA mixture contributor characterization using machine learning and specified features
A system comprising:
- A sample preparation module that generates initial data about the DNA mixture in a sample.
- A processor with a number of contributors determination module containing a machine-learning algorithm.
- The machine-learning algorithm is trained to evaluate initial data to determine the number of contributors based on all of the following candidate features:
  - Sample-wide peak count
  - Maximum number of contributors
  - Minimum number of contributors
  - Locus-specific peak count
  - Probability of dropout
  - Minimum observed peak height
  - Maximum observed peak height
- An output device that receives and outputs information regarding the determined number of contributors.
System with processor analyzing DNA mixture contributors using machine learning and defined features
A system comprising:
- A processor configured to receive data about the DNA within a sample.
- The processor is further configured to analyze the data using a machine-learning algorithm trained to determine the number of contributors.
- The machine-learning algorithm uses a set of candidate features including all of:
  - Sample-wide peak count
  - Maximum number of contributors
  - Minimum number of contributors
  - Locus-specific peak count
  - Probability of dropout
  - Minimum observed peak height
  - Maximum observed peak height
The claims collectively cover systems employing machine-learning algorithms trained on a specific set of features to accurately determine the number of contributors in a DNA mixture, operating through modules that generate, process, and output data relevant to contributor determination.
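Several of the candidate features named in the claims can be computed directly from a list of called peaks. The sketch below shows one plausible way to do so. The `extract_features` function, the profile layout, and the peak values are hypothetical. The minimum number of contributors is computed here with the common maximum allele count rule (a diploid contributor adds at most two alleles per locus), which is an assumption about how that feature is defined. The maximum number of contributors and the probability of dropout are left out because this summary does not define them.

```python
import math


def extract_features(profile):
    """Compute a subset of the claimed candidate features.

    `profile` maps a locus name to a list of (allele, peak_height_rfu) pairs.
    """
    peaks_per_locus = {locus: len(peaks) for locus, peaks in profile.items()}
    heights = [height for peaks in profile.values() for _, height in peaks]
    most_peaks = max(peaks_per_locus.values())
    return {
        "sample_wide_peak_count": sum(peaks_per_locus.values()),
        "locus_specific_peak_count": peaks_per_locus,
        # Assumed definition: maximum allele count rule, two alleles per person.
        "minimum_number_of_contributors": math.ceil(most_peaks / 2),
        "minimum_observed_peak_height": min(heights),
        "maximum_observed_peak_height": max(heights),
    }


# Invented three-locus profile; alleles and heights are illustrative only.
toy_profile = {
    "D3S1358": [("15", 820), ("16", 640), ("17", 210)],
    "vWA": [("14", 900), ("17", 450), ("18", 300), ("19", 120)],
    "FGA": [("21", 700), ("24", 380)],
}

features = extract_features(toy_profile)
print(features)
```

In this toy profile the locus with four peaks sets the lower bound at two contributors. A feature dictionary of this kind is what a trained model would take as input.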
Stated Advantages
The method is computationally inexpensive and delivers results within seconds on standard desktop or laptop computers.
It achieves high accuracy, with over 98% correct identification of the number of contributors in DNA mixtures.
The approach provides improved accuracy, especially for mixtures with three or four contributors, compared to prior methods.
The method is robust and reproducible across data from multiple laboratories and various capillary electrophoresis instruments.
The model is feature-agnostic and can be easily adapted to different data sources or DNA amplification systems.
The method practically eliminates the lengthy processing times required by existing methods, significantly increasing throughput in forensic and clinical settings.
Documented Applications
Interpretation and deconvolution of DNA mixtures in forensic investigations.
Use in clinical and medical research for DNA mixture analysis.