DNA construct for sequencing and method for preparing the same
Abstract
A DNA construct comprises multiple units sequentially attached one to the other, wherein a unit comprises: a segment; an index attached to one end of the segment; an identifier attached to another end of the segment; an introducer attached to a 5′-end of either the index or the identifier; and a closure attached to a 3′-end of a remaining either identifier or index. A method for preparing the DNA construct and a method for analyzing a sequence of the DNA construct, as well as various embodiments thereof, are disclosed herein.
Core Innovation
The invention describes a DNA construct comprising multiple units sequentially attached one to the other, where each unit includes a segment comprising a target nucleic acid sequence to be sequenced and analyzed. Within each unit, an index is attached to one end of the segment and an identifier is attached to the other end, enabling origin-level labeling and copy-level tracking while preserving origin-specific information.
Each unit further includes an introducer attached to a 5′-end of either the index or the identifier and a closure attached to a 3′-end of whichever of the identifier or index remains. The introducer and closure define the unit borders and provide the PCR primer binding sites used to generate mature double-stranded units for assembly. The amplified mature units are then ligated sequentially to form a long DNA construct suitable for sequencing on long-fragment platforms.
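The unit layout described above can be modeled as a minimal Python sketch. The `INTRODUCER` and `CLOSURE` sequences, the `Unit` class, and `build_construct` are hypothetical names and placeholder values chosen for illustration; the summary does not give actual sequences. The sketch uses one of the two permitted layouts (introducer on the index side, closure on the identifier side) and treats ligation as plain string concatenation.

```python
from dataclasses import dataclass

# Placeholder border sequences (assumptions, not taken from the patent).
INTRODUCER = "ACGTACGTAC"
CLOSURE = "TGCATGCATG"


@dataclass(frozen=True)
class Unit:
    index: str       # origin-level label
    segment: str     # target nucleic acid sequence
    identifier: str  # copy-level label

    def sequence(self) -> str:
        # 5'-introducer-index-segment-identifier-closure-3'
        return INTRODUCER + self.index + self.segment + self.identifier + CLOSURE


def build_construct(units):
    """Model sequential ligation of mature units as concatenation."""
    return "".join(unit.sequence() for unit in units)


units = [
    Unit(index="AAAA", segment="GGGGCCCC", identifier="TTTT"),
    Unit(index="AAAA", segment="GGGGCCCC", identifier="CCCC"),
]
construct = build_construct(units)
```

Here the two units share an index (same origin) but carry different identifiers (different copies), and the resulting construct is simply the two unit sequences end to end.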
The invention also provides analysis of sequencing results by separating unit reads and grouping them by index, then by segment sequence, and then by identifier sequence. Multiple segment sequences within each identifier group are collapsed into a single representative sequence to eliminate procedure and sequencing errors, and the resulting sequences can be compared to known target sequences to identify variants and/or mutations.
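The first analysis step, separating unit reads out of one long read, might look like the following sketch. It assumes fixed-length indices and identifiers, exact (error-free) introducer and closure matches, and reads in the forward orientation; real long-read data would likely need tolerant matching and strand handling. All constants and the `split_units` helper are hypothetical.

```python
import re

# Placeholder border sequences and label lengths (assumptions for illustration).
INTRODUCER = "ACGTACGTAC"
CLOSURE = "TGCATGCATG"
INDEX_LEN = 4
IDENTIFIER_LEN = 4

# introducer, fixed-length index, segment (lazy), fixed-length identifier, closure
UNIT_PATTERN = re.compile(
    re.escape(INTRODUCER)
    + rf"([ACGT]{{{INDEX_LEN}}})"
    + r"([ACGT]+?)"
    + rf"([ACGT]{{{IDENTIFIER_LEN}}})"
    + re.escape(CLOSURE)
)


def split_units(read):
    """Return (index, segment, identifier) tuples for each unit found in a read."""
    return [match.groups() for match in UNIT_PATTERN.finditer(read)]


read = (
    INTRODUCER + "AAAA" + "GGGGCCCC" + "TTTT" + CLOSURE
    + INTRODUCER + "CCCC" + "GATTACA" + "GGTT" + CLOSURE
)
unit_reads = split_units(read)
```

Each tuple in `unit_reads` can then be passed on to the grouping stages (index, segment sequence, identifier sequence).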
Claims Coverage
Independent claim clm-00001 defines the DNA construct structure and the uniqueness semantics of the index and identifier, with dependent claims further adding quantitative constraints and a specific analysis pipeline including collapsing and optional comparison to known target sequences. The inventive features focus on the sequential multi-unit DNA construct with introducer/closure-defined borders, origin-unique indexing, copy-unique identifiers, and hierarchical read grouping followed by collapsing to reduce procedure/sequencing errors.
Sequentially attached multi-unit DNA construct with target segment elements
A DNA construct comprising multiple units sequentially attached one to the other, wherein each unit comprises a segment comprising a target nucleic acid sequence to be sequenced and analyzed, an index attached to one end of the segment, an identifier attached to another end of the segment, an introducer attached to a 5′-end of either the index or the identifier, and a closure attached to a 3′-end of a remaining either identifier or index.
Origin-unique index and copy-unique identifier
The index is a DNA sequence that is unique to the origin of the segment and the identifier is a DNA sequence that is unique for every copy of the segment.
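The uniqueness semantics can be illustrated with a small sketch: every copy from one origin shares the same index, while each copy receives its own identifier. The `label_copies` helper and the use of random identifiers are assumptions made for illustration; the summary does not say how identifier sequences are generated.

```python
import random


def label_copies(index, segment, n_copies, identifier_len, seed=0):
    """Give every copy of a segment the same index and a distinct identifier."""
    # There are only 4**identifier_len distinct DNA strings of that length.
    if n_copies > 4 ** identifier_len:
        raise ValueError("identifier too short to give every copy a distinct label")
    rng = random.Random(seed)
    identifiers = set()
    while len(identifiers) < n_copies:
        identifiers.add("".join(rng.choice("ACGT") for _ in range(identifier_len)))
    return [(index, segment, identifier) for identifier in sorted(identifiers)]


copies = label_copies(index="AAAA", segment="GATTACA", n_copies=5, identifier_len=8)
```

The length check reflects a simple capacity bound: an identifier of length n can distinguish at most 4^n copies.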
Hierarchical grouping of sequenced units by index, segment sequence, and identifier sequence followed by collapsing
A method comprising sequencing a DNA construct and then separating and grouping the resulting sequenced units by index, by segment sequence within each index group, and by identifier sequence within each segment group, before collapsing multiple segment sequences in each identifier group into a single sequence representing the target sequence for the segment.
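The grouping and collapsing steps can be sketched as follows. This is an illustration under stated assumptions, not the claimed method itself: `segment_key` is a hypothetical caller-supplied function that assigns a segment read to its segment group (for example, the nearest known target), and collapsing is approximated by a per-position majority vote over reads of the most common length, which ignores insertions and deletions.

```python
from collections import Counter, defaultdict


def collapse(segments):
    """Per-position majority vote over the reads of the most common length."""
    length = Counter(len(s) for s in segments).most_common(1)[0][0]
    usable = [s for s in segments if len(s) == length]
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*usable))


def group_and_collapse(unit_reads, segment_key):
    """Group by index, then segment group, then identifier, and collapse each group."""
    groups = defaultdict(list)
    for index, segment, identifier in unit_reads:
        # A tuple key is equivalent to three nested levels of grouping.
        groups[(index, segment_key(segment), identifier)].append(segment)
    return {key: collapse(segments) for key, segments in groups.items()}


unit_reads = [
    ("AAAA", "GATTACA", "TTTT"),
    ("AAAA", "GATTACA", "TTTT"),
    ("AAAA", "GATTGCA", "TTTT"),  # one read with an error at position 5
    ("AAAA", "GATTACA", "CCCC"),
]
collapsed = group_and_collapse(unit_reads, segment_key=lambda segment: "target1")
```

In this toy input the single discordant read in the `TTTT` identifier group is outvoted, so both identifier groups collapse to `GATTACA`.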
Comparing collapsed sequences to known target sequences to identify variants and/or mutations
After collapsing multiple segment sequences in each identifier group into a single sequence, comparing the resulting collapsed sequences of the target sequences with known target sequences to identify variants and/or mutations.
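The comparison step could be sketched as below. The `call_variants` helper is hypothetical and handles substitutions only, assuming the collapsed sequence and the known target have equal length; detecting insertions or deletions would need a proper alignment.

```python
def call_variants(collapsed, reference):
    """List (position, reference_base, observed_base) for each substitution, 1-based."""
    if len(collapsed) != len(reference):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (position, ref_base, obs_base)
        for position, (ref_base, obs_base) in enumerate(zip(reference, collapsed), start=1)
        if ref_base != obs_base
    ]


variants = call_variants(collapsed="GATTGCA", reference="GATTACA")
```

Because the input is a collapsed sequence, a difference reported here has already survived the within-identifier error filtering.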
Across the claim set, the core coverage is the sequential multi-unit DNA construct with introducer/closure-defined unit structure, paired origin-unique indexing and copy-unique identifiers, and a read analysis approach that groups by index/segment/identifier and collapses within identifier groups, optionally followed by comparison of collapsed sequences to known target sequences to identify variants and/or mutations.
Stated Advantages
Distinguishes true mutations from procedure and sequencing errors by collapsing sequences within identifier groups.
Enables simultaneous multi-origin and multi-target sequencing by using indices unique to origins and identifiers unique to segment copies.
Documented Applications
Sequencing and analysis of target nucleic acid sequences using long-fragment platforms, including nanopore sequencing (e.g., MinION/Oxford Nanopore), by using a long DNA construct composed of sequentially attached units.
Identifying variants and/or mutations by comparing collapsed sequences of target sequences with known target sequences after hierarchical grouping and collapsing.