Local-global alignment for finding 3D similarities in protein structures
Inventors
Assignees
US Department of Energy • Lawrence Livermore National Security LLC
Publication Number
US-8024127-B2
Publication Date
2011-09-20
Expiration Date
2024-02-18
Interested in licensing this patent?
MTEC can help explore whether this patent might be available for licensing for your application.
Abstract
A method of finding 3D similarities in protein structures of a first molecule and a second molecule. The method comprises providing preselected information regarding the first molecule and the second molecule; comparing the first molecule and the second molecule using Longest Continuous Segments (LCS) analysis; comparing them using Global Distance Test (GDT) analysis; comparing them using Local Global Alignment Scoring function (LGA_S) analysis; and verifying the constructed alignment and repeating the steps to find the regions of 3D similarities in protein structures.
Core Innovation
The invention provides a Local-Global Alignment (LGA) method that finds similarities between two protein structures or fragments of protein structures. It enables identification and analysis of structural similarities in proteins even when they do not have significant amino acid sequence similarity. The method allows clustering of similar structural fragments, which can be used to identify sequence patterns representing local structural motifs in proteins. This approach enhances fold recognition and the detection of distant homologs, and it improves the quality and accuracy of predicted 3D protein models, especially when modeling small fragments such as loops, deletions, insertions, and signature regions.
The method includes providing preselected information on two protein molecules and comparing them using Longest Continuous Segments (LCS) analysis, Global Distance Test (GDT) analysis, and a Local Global Alignment Scoring function (LGA_S). The process verifies each constructed alignment and repeats these steps to find all regions of 3D similarity between the structures considered, generating detailed information on both local and global similarities.
The invention addresses two related problems. First, determining protein structures experimentally is labor-intensive, costly, and slow, so many protein sequences remain structurally uncharacterized. Second, computational methods face challenges in accurately aligning and comparing protein structures, particularly when sequence similarity is low or absent: existing methods struggle to optimize structural similarity measures globally while still recognizing local similarities. The invention addresses these challenges by combining local and global alignment strategies to analyze protein structural similarities comprehensively, facilitating better functional understanding and modeling.
Claims Coverage
The patent contains one independent claim detailing a computer-implemented method for generating a local-global alignment score that indicates global and local similarity between two protein structures. Its main inventive features are summarized below.
Generating a local-global alignment score based on structural correspondences
The method receives a protein structure correspondence indicating pairs of residues between two protein structures and determines root mean square deviations (RMSD) for multiple sets of contiguous residue pairs. It selects the longest contiguous segment based on these RMSDs, calculates a global distance test value from distance scores reflecting residue pairs within predefined distances, and generates a composite local-global alignment score that represents both local and global similarity.
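The scoring steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the two structures are already superimposed (no rotation or translation is fitted), and the function names, the cutoff values (in ångströms), and the weight `w` are hypothetical choices made for illustration only.

```python
import math

def longest_continuous_segment(pairs, rmsd_cutoff):
    """Return (start, length) of the longest contiguous run of residue pairs
    whose RMSD is <= rmsd_cutoff. Assumes pre-superimposed coordinates."""
    # Prefix sums of squared distances give each segment's RMSD in O(1).
    prefix = [0.0]
    for a, b in pairs:
        prefix.append(prefix[-1] + math.dist(a, b) ** 2)
    best_start, best_len = 0, 0
    n = len(pairs)
    for start in range(n):
        # Only segments longer than the current best are worth checking.
        for end in range(start + best_len + 1, n + 1):
            seg_rmsd = math.sqrt((prefix[end] - prefix[start]) / (end - start))
            if seg_rmsd <= rmsd_cutoff:
                best_start, best_len = start, end - start
    return best_start, best_len

def gdt_fraction(pairs, dist_cutoff):
    """Fraction of residue pairs lying within dist_cutoff of each other."""
    return sum(1 for a, b in pairs if math.dist(a, b) <= dist_cutoff) / len(pairs)

def local_global_score(pairs,
                       rmsd_cutoffs=(1.0, 2.0, 5.0),      # assumed values
                       dist_cutoffs=(1.0, 2.0, 4.0, 8.0),  # assumed values
                       w=0.75):                            # assumed weight
    """Composite score in [0, 100]: a weighted mix of a global (GDT-like)
    term and a local (LCS-like) term. The weighting scheme is an assumption."""
    if not pairs:
        raise ValueError("need at least one residue pair")
    n = len(pairs)
    lcs_term = sum(longest_continuous_segment(pairs, c)[1] / n
                   for c in rmsd_cutoffs) / len(rmsd_cutoffs)
    gdt_term = sum(gdt_fraction(pairs, c)
                   for c in dist_cutoffs) / len(dist_cutoffs)
    return 100.0 * (w * gdt_term + (1.0 - w) * lcs_term)
```

Here `pairs` is a list of `(coord_a, coord_b)` tuples, one per corresponding residue pair, for example built with `list(zip(coords_a, coords_b))` from two equal-length lists of `(x, y, z)` coordinates.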
Generating protein structure correspondences from coordinate data
The method includes receiving coordinate data sets of two protein structures, either at a server or the computer system, and generating the initial protein structure correspondence based on these coordinates to serve as input for alignment scoring.
Refining protein structure correspondence based on local-global alignment score
The method iteratively modifies the set of coordinates specifying at least one of the protein structures using the global distance test values and longest continuous segment information derived from the local-global alignment score, thereby generating an improved correspondence between the protein structures.
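The idea of iteratively modifying one structure's coordinates to improve the correspondence can be sketched as follows. This is a deliberately simplified stand-in, not the claimed procedure: it fits a translation only (a real superposition would also fit a rotation), and the function name, the distance cutoff, and the iteration limit are hypothetical.

```python
import math

def refine_by_translation(coords_a, coords_b, dist_cutoff=5.0, max_iter=20):
    """Iteratively shift structure B onto structure A.

    Each round computes the mean offset over the currently selected residue
    pairs, moves B by that offset, and then re-selects the pairs that fall
    within dist_cutoff. Translation-only simplification (assumption).
    Returns (moved_coords_b, indices_of_pairs_within_cutoff)."""
    n = len(coords_a)
    moved = [tuple(p) for p in coords_b]
    selected = list(range(n))  # seed with every residue pair
    within = selected
    for _ in range(max_iter):
        # Mean displacement from B to A over the selected pairs.
        shift = [sum(coords_a[i][k] - moved[i][k] for i in selected) / len(selected)
                 for k in range(3)]
        moved = [tuple(p[k] + shift[k] for k in range(3)) for p in moved]
        # Keep only the pairs that now agree to within the cutoff.
        within = [i for i in range(n)
                  if math.dist(coords_a[i], moved[i]) <= dist_cutoff]
        if not within or within == selected:
            break  # nothing fits, or the selection has stabilized
        selected = within
    return moved, within
```

Re-fitting on the well-matched subset only keeps outlying residues from dragging the fit away from the region where the two structures genuinely agree.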
Providing graphical representations of alignment results
The method provides graphical outputs depicting at least one of the protein structures, with some residues colored based on their distances to corresponding residues in the compared structure. The graphical forms may include bar plots or three-dimensional protein structure visualizations to illustrate alignment quality and structural similarity.
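Coloring residues by their distance to the corresponding residue can be sketched in a few lines of Python. The distance bins, color names, and text symbols below are hypothetical choices for illustration; the patent summary does not specify them, and the sketch assumes the two structures are already superimposed.

```python
import math

# Hypothetical distance bins (upper bound in ångströms, color name).
COLOR_BINS = [(2.0, "green"), (4.0, "yellow"), (6.0, "orange")]
FAR_COLOR = "red"  # anything at or beyond the last bin

def color_for_distance(d):
    """Map a residue-pair distance to a color name."""
    for upper, color in COLOR_BINS:
        if d < upper:
            return color
    return FAR_COLOR

def residue_colors(coords_a, coords_b):
    """One color per corresponding residue pair."""
    return [color_for_distance(math.dist(a, b))
            for a, b in zip(coords_a, coords_b)]

def text_bar(colors):
    """Render the colors as a one-line text bar, one symbol per residue."""
    symbols = {"green": "#", "yellow": "+", "orange": "-", "red": "."}
    return "".join(symbols[c] for c in colors)
```

The resulting color list could be passed to a plotting or molecular-viewer tool to draw a bar plot or a colored 3D structure; the text bar is only a quick terminal preview.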
The independent claim describes a computer-implemented process that generates and refines a local-global alignment score representing both local and global 3D structural similarities between two proteins based on residue correspondences and coordinate data, with capabilities for graphical visualization of the results.
Stated Advantages
Allows identification and analysis of protein structural similarities even without significant amino acid sequence similarity.
Enables clustering of similar structural fragments to identify sequence patterns representing local structural motifs.
Improves the process of fold recognition and detection of distant homologs.
Enhances the quality and accuracy of final 3D protein models, particularly for small fragments like loops, deletions, insertions, and signature regions.
Generates detailed information about both local and global regions of similarity, facilitating better structural comparison and modeling.
Documented Applications
Structural comparison and superposition of proteins or fragments thereof.
Finding similarities between protein structures or their fragments.
Clustering similar fragments of protein structures.
Creating databases of similar protein fragments linked with corresponding amino acid sequence patterns.
Analysis of protein structure for homology modeling and fold recognition.
Modeling of small protein fragments such as loops, insertions, deletions, and signature regions.