Generating anti-infective design spaces for selecting drug candidates
Inventors
Lee, Francis • Steckbeck, Jonathan D. • Holste, Hannes
Assignees
Publication Number
US-12087404-B2
Publication Date
2024-09-10
Expiration Date
2041-05-13
Abstract
In one aspect, a method includes generating a design space for a peptide for an application. The generating includes identifying sequences for the peptide, and updating the sequences by determining, for each of the sequences, a respective set of activities pertaining to the application. The updating produces updated sequences each having updated respective activities. The method includes generating, based on the updated sequences, a solution space within the design space. The solution space includes a target subset of the updated sequences. The method includes performing, using a machine learning model to process the solution space, trials to identify a candidate drug compound that represents a sequence having a level of activity that exceeds a threshold level, and determining metrics pertaining to the machine learning model and a second machine learning model that performs the trials.
Core Innovation
The invention introduces a method for generating a design space for a peptide for an application, which involves identifying a plurality of sequences for the peptide and updating these sequences by determining a respective set of activities for each, relevant to the intended application. These activities encompass biomedical or biochemical attributes. The process produces updated sequences, each associated with updated activity profiles.
Based on these updated sequences, a solution space is generated within the design space that comprises a target subset of the updated sequences with their respective activities. Machine learning models are employed to process this solution space, performing trials to identify a candidate drug compound—specifically, a sequence that exhibits an activity level exceeding one or more threshold levels. Metrics pertaining to the performance of the machine learning models are determined, and comparisons between multiple machine learning models can be made using these metrics.
The invention addresses the inefficiency, limited applicability, and computational challenges of conventional drug discovery techniques, which often search only constrained design spaces based on known facts or assumptions about drugs. It aims to overcome these obstacles by enlarging the design space to incorporate multifaceted information (such as sequence, structure, semantic, chemical, and activity data) and by employing advanced AI architectures and machine learning models to efficiently discover, design, and select candidate drugs with improved properties.
Claims Coverage
The claims comprise several inventive features centered on generating and processing a peptide design space using machine learning for candidate drug compound selection and performance benchmarking.
Generating and updating peptide design space using activity determination
A method that generates a design space for a peptide applicable to a specific function by:
- Identifying multiple sequences for the peptide.
- Updating each sequence by determining a set of activities (biomedical, biochemical, or combinations thereof) for the intended application.
- Producing an updated group of sequences, each with an updated activity profile based on this determination.
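The identify-then-update steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Candidate` class, the function names, and the two toy "activities" (a crude net-charge count and a hydrophobic fraction) are assumptions chosen only so the example runs without external data. A real system would determine activities with trained models or assays.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical residue grouping, used only for this toy example.
HYDROPHOBIC = set("AVILMFWY")


@dataclass
class Candidate:
    """One point in the design space: a sequence plus its activity profile."""
    sequence: str
    activities: Dict[str, float] = field(default_factory=dict)


def net_charge(seq: str) -> float:
    """Crude charge proxy: count of K/R minus count of D/E."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return float(positive - negative)


def hydrophobic_fraction(seq: str) -> float:
    """Share of residues that fall in the hydrophobic set above."""
    return sum(1 for aa in seq if aa in HYDROPHOBIC) / len(seq)


# Stand-ins for the application-specific activity determinations.
ACTIVITY_FUNCS: Dict[str, Callable[[str], float]] = {
    "net_charge": net_charge,
    "hydrophobic_fraction": hydrophobic_fraction,
}


def identify_sequences(raw: List[str]) -> List[Candidate]:
    """Step 1: collect candidate sequences (empty strings are skipped)."""
    return [Candidate(s.upper()) for s in raw if s]


def update_sequences(space: List[Candidate]) -> List[Candidate]:
    """Steps 2-3: attach an updated activity profile to every sequence."""
    return [
        Candidate(c.sequence, {name: fn(c.sequence) for name, fn in ACTIVITY_FUNCS.items()})
        for c in space
    ]


design_space = update_sequences(identify_sequences(["KKLLKK", "DDEEAA", "KLAKLAK"]))
for candidate in design_space:
    print(candidate.sequence, candidate.activities)
```

Keeping the activity functions in a dictionary means further activities can be added without changing the update step, which is one plausible way to support the multi-activity profiles the claim describes.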
Creation of solution space and identification of candidate drug compounds via machine learning
From the updated sequences, a solution space is generated within the larger design space, comprising a target subset of the updated sequences with their respective activities. This solution space is processed using a first machine learning model, which performs one or more trials to identify candidate drug compounds—sequences demonstrating at least one activity level above one or more predefined threshold levels.
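A rough sketch of this step follows, again under stated assumptions. The linear `toy_model` stands in for the first machine learning model, and the activity names, weights, and threshold are hypothetical values picked for illustration. The patent summary does not specify the model or how the target subset is chosen.

```python
from typing import Callable, Dict, List, Tuple

Activities = Dict[str, float]

# Hypothetical updated sequences with precomputed activity profiles.
design_space: Dict[str, Activities] = {
    "KKLLKK": {"net_charge": 4.0, "hydrophobic_fraction": 2 / 6},
    "DDEEAA": {"net_charge": -4.0, "hydrophobic_fraction": 2 / 6},
    "KLAKLAK": {"net_charge": 3.0, "hydrophobic_fraction": 4 / 7},
}


def select_solution_space(space: Dict[str, Activities], min_charge: float) -> Dict[str, Activities]:
    """Keep the target subset: sequences whose net charge meets a floor."""
    return {seq: acts for seq, acts in space.items() if acts["net_charge"] >= min_charge}


def toy_model(acts: Activities) -> float:
    """Assumed stand-in for a trained model: a fixed linear score."""
    return 0.2 * acts["net_charge"] + 1.0 * acts["hydrophobic_fraction"]


def run_trials(
    solution_space: Dict[str, Activities],
    model: Callable[[Activities], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Score every sequence once and keep those strictly above the threshold."""
    hits = [(seq, model(acts)) for seq, acts in solution_space.items()]
    hits = [(seq, score) for seq, score in hits if score > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


solution_space = select_solution_space(design_space, min_charge=1.0)
candidates = run_trials(solution_space, toy_model, threshold=1.15)
print(candidates)
```

With these toy numbers the negatively charged sequence is excluded from the solution space, and only one of the two remaining sequences scores above the threshold.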
Benchmarking machine learning model performance using resource usage metrics
The performance of the first machine learning model conducting the trials is quantified by determining metrics such as memory usage, graphics processing unit (GPU) temperature, power usage, processor usage, central processing unit (CPU) temperature, or combinations thereof. These metrics are compared to corresponding metrics from at least one second machine learning model performing similar trials, enabling selection or optimization based on resource efficiency.
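The comparison described above might be approximated as follows. This sketch uses only the Python standard library, so it covers wall-clock time, CPU time, and peak traced memory. GPU temperature, CPU temperature, and power usage require vendor or operating-system tooling and are not shown. The two "models" and the `benchmark` and `pick_more_efficient` helpers are hypothetical placeholders, not anything named in the patent.

```python
import time
import tracemalloc
from typing import Callable, Dict, List


def benchmark(trial_fn: Callable[[str], float], inputs: List[str]) -> Dict[str, float]:
    """Run one trial per input and record resource metrics for the whole run."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    results = [trial_fn(x) for x in inputs]
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()  # also clears traces, so the next run starts from zero
    return {
        "n_trials": len(results),
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "peak_bytes": float(peak_bytes),
    }


def model_a(seq: str) -> float:
    """Placeholder model: cheap count of positively charged residues."""
    return float(sum(seq.count(aa) for aa in "KR"))


def model_b(seq: str) -> float:
    """Placeholder model: same result, deliberately wasteful with memory."""
    table = [[1.0 if aa in "KR" else 0.0] * 1000 for aa in seq]
    return sum(row[0] for row in table)


def pick_more_efficient(a: Dict[str, float], b: Dict[str, float], key: str) -> str:
    """Return the label of the model with the lower value for one metric."""
    return "model_a" if a[key] <= b[key] else "model_b"


sequences = ["KKLLKK", "DDEEAA", "KLAKLAK"] * 20
metrics_a = benchmark(model_a, sequences)
metrics_b = benchmark(model_b, sequences)
print(metrics_a)
print(metrics_b)
print("lower peak memory:", pick_more_efficient(metrics_a, metrics_b, "peak_bytes"))
```

Timing figures vary from run to run, so a practical comparison would repeat each benchmark several times and compare summary statistics instead of single measurements.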
Processing solution space based on query parameters using machine learning
The generation of the solution space within the design space can be handled by a third machine learning model trained to evaluate, based on a query parameter (including sequence parameters), the activity levels of updated sequences. This enables targeted selection and dimensionality reduction (e.g., via UMAP, PCA, autoencoding) of the solution space based on user-specified or application-specific criteria.
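As a loose illustration of query-driven selection followed by dimensionality reduction, the sketch below filters sequences by a length query and then projects their activity vectors onto a first principal component found by power iteration. It is written in plain Python for self-containment. The query key `max_length`, the feature values, and the helper names are assumptions, and a trained third model, UMAP, or an autoencoder would replace these pieces in practice.

```python
import math
from typing import Dict, List

# Hypothetical activity vectors: [net_charge, hydrophobic_fraction].
space: Dict[str, List[float]] = {
    "KKLLKK": [4.0, 0.33],
    "KLAKLAK": [3.0, 0.57],
    "DDEEAA": [-4.0, 0.33],
    "KKLLKKLLKKLL": [6.0, 0.50],
}


def filter_by_query(space: Dict[str, List[float]], query: Dict[str, int]) -> Dict[str, List[float]]:
    """Apply a sequence parameter from the query (here: maximum length)."""
    return {s: v for s, v in space.items() if len(s) <= query["max_length"]}


def center(rows: List[List[float]]) -> List[List[float]]:
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered: List[List[float]]) -> List[List[float]]:
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)] for i in range(d)]


def first_component(cov: List[List[float]], iters: int = 200) -> List[float]:
    """Power iteration for the leading eigenvector.

    Assumes the all-ones start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(centered: List[List[float]], v: List[float]) -> List[float]:
    """Reduce each row to one coordinate along the component."""
    return [sum(a * b for a, b in zip(row, v)) for row in centered]


subset = filter_by_query(space, {"max_length": 7})
centered = center(list(subset.values()))
component = first_component(covariance(centered))
coords = dict(zip(subset, project(centered, component)))
print(coords)
```

The one-dimensional coordinates could then feed a plot or a ranking view. Reducing to two or three components would need deflation or a linear algebra library, which is omitted here.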
The claims collectively cover methods, systems, and media for generating, analyzing, and reducing a multidimensional peptide sequence design space using machine learning, for efficient candidate drug discovery, selection, and benchmarking based on multidimensional activity profiles and computational resource usage.
Stated Advantages
The artificial intelligence engine enables efficient searching of enlarged design spaces that incorporate a broad combination of drug information, thus overcoming the constraints and inefficiency of conventional techniques.
The system reduces computational complexity and resource consumption (such as time, processing, and memory) when operating in large design spaces.
Enhanced discovery and selection of candidate drug compounds with improved or desired properties by using diverse machine learning models and advanced encoding/data structures.
Ability to tailor and optimize machine learning model packages for third-party needs based on data and desired performance parameters.
Production of algorithmically designed drug compounds that have demonstrated broad-spectrum activity, resistance profile improvements, and effectiveness across varied disease models.
Continuous monitoring and optimization of machine learning performance and computational metrics, improving resource efficiency and process effectiveness.
Superior user interfaces for visualizing, filtering, and selecting sequence solutions, enabling enhanced decision-making and data comprehension for users.
Documented Applications
Anti-infective drug discovery, including treatment for diseases such as prosthetic joint infections, urinary tract infections, intra-abdominal or peritoneal infections, otitis media, cardiac infections, respiratory infections, neurological infections, dental infections, digestive and intestinal infections, wound and soft tissue infections, and other physiological system infections.
Veterinary and animal health applications, including treatment of animal diseases such as bovine mastitis.
Industrial applications, such as anti-biofouling measures and generation of optimized control sequences for machinery.
Therapeutic indications for diseases including eczema, inflammatory bowel disease, Crohn's Disease, rheumatoid arthritis, asthma, autoimmune diseases, inflammatory disease processes, and oncology treatments and palliatives.
Optimizing non-pharmaceutical sequences involving decisions or actions in areas such as the video game industry (e.g., for AI-controlled non-player character decision sequencing) and integrated circuit or chip design (e.g., mask works generation and routing for efficiency and performance in hardware).