Quality-aware keyword query suggestion and evaluation
Inventors
Namaki, Mohammad Hossein • Wu, Yinghui • Zhang, Xin • Ding, Yonghua
Assignees
Huawei Technologies Co Ltd • Washington State University WSU
Publication Number
US-11514049-B2
Publication Date
2022-11-29
Expiration Date
2039-06-06
Interested in licensing this patent?
MTEC can help explore whether this patent might be available for licensing for your application.
Abstract
A query suggestion to expand an initial query is calculated whereby the cost of the expanded initial query is bounded in both time and quality. The user validates a subset of the top-n answers Q(G) to a query Q and provides adjusted configuration parameters. The top-n diversified δ-expansion terms Q′ are calculated from the validated subset of answers Q(G) to the query Q and are provided to an interactive user interface for selection. Answers Q′(G) for the top-n diversified δ-expansion terms Q′ are cost bounded by cost threshold δ and exploration range r specified by the user. The user selects a new term of terms Q′ and an incremental query evaluation of the new term is invoked to compute expanded query answers Q′(G) by incrementally updating the validated subset of answers Q(G), without re-evaluating an expanded query Q′ including the new term from scratch.
Core Innovation
The invention provides an interactive keyword-based graph exploration system and method that enable quality-aware query suggestion and evaluation over large networks, such as knowledge graphs, social media, and e-commerce networks. The system interleaves query refinement (expansion) and incremental query evaluation, offering users control over both the quality and time cost of the exploratory search. By introducing a cost threshold (δ) and exploration range (r), users can specify the acceptable quality loss and the subgraph region to be explored during each query refinement.
The core process begins with the user issuing an initial keyword query Q on a data graph G, receiving top-n answers Q(G), and validating a subset of these answers. Based on these validated answers, the system calculates the top-n diversified δ-expansion terms Q′, guaranteeing that each suggested refined query's answer will not exceed the user-specified cost threshold. The user interface enables selection of new expansion terms, after which the system incrementally computes expanded query answers Q′(G) by updating only the validated subset of answers, rather than reevaluating the entire expanded query from scratch.
This technology addresses the difficulties users face when exploring large, heterogeneous graphs without prior schema knowledge, where query refinement can easily produce irrelevant or low-quality results and standard evaluation is computationally expensive. By bounding both the quality loss and the computational cost for each search session and supporting multiple established keyword search query types (distinct root, Steiner tree, Steiner graph), the invention allows efficient, tunable, and interpretable graph exploration with guarantees on answer quality and system responsiveness.
Claims Coverage
The patent claims cover a series of inventive features centered on a system, method, and computer-readable medium for quality-aware keyword query suggestion and incremental evaluation, featuring bounded quality and time cost for query expansion and answer computation.
Apparatus for quality and time-bounded query suggestion
The apparatus includes a user interface that receives graph configuration, keywords, exploration parameters, and validated query answers. It incorporates memory and processors to execute instructions for: - A quality-aware query expansion module that receives validated answers Q(G), computes top-n diversified δ-expansion terms Q′, and provides these to the interface, guaranteeing answer costs within threshold δ and range r. - An incremental query evaluation module that, upon user selection of a new term, computes expanded query answers Q′(G) by incrementally updating validated answers Q(G) without reevaluating from scratch, presenting updated answers to the user. - A memory and indexing device that stores graph data and supports required distance and traversal operations.
K times traversal for expansion term discovery
The quality-aware query expansion module receives k validated answers Q(G) and performs k traversals of the data graph, each starting from a source node in the i-th answer, visiting bounded hops of neighbors using a single source shortest path (SSSP) iterator, to identify candidate expansion terms.
Calculation and pruning based on additional cost for expanded queries
For each candidate expansion term identified through SSSP traversals, the module computes the total additional cost of extending the initial answer to the expanded answer. If this cost exceeds a preset threshold, the expanded query is pruned; otherwise, the term is included as a δ-expansion term.
Selection of top-n diversified and relevant expansion terms
After discovering all δ-expansion terms, the module selects the top-n terms that are both most relevant and diversified, using built-in or configurable metrics, and provides them for user selection.
Query class-specific expansion using special source nodes
δ-expansion terms for different keyword search query classes are found using specific source nodes for SSSP iterations: - For distinct-rooted tree queries, the source is the root of each answer. - For Steiner tree and r-clique queries, the source is the node set of each answer.
Incremental query evaluation with optimality guarantees
The incremental query evaluation module implements incremental approximation algorithms to compute expanded query answers Q′(G) for different keyword search query classes, ensuring bounded cost and various optimality guarantees, without evaluating the expanded query from scratch.
Computer-implemented method for interactive, quality-bounded query suggestion
The method comprises a workflow including configuration reception, keyword search invocation, display of top-n answers, user validation, parameter adjustment, quality-aware expansion to compute top-n diversified δ-expansion terms, incremental answer evaluation based on selected terms, and interactive display of answers, all with bounded time and answer quality.
Non-transitory computer-readable medium storing instructions for quality-aware query suggestion
Computer instructions are provided to execute the entire process: receiving configuration, performing keyword search, obtaining validated answers, adjusting parameters, computing top-n diversified δ-expansion terms, enabling user selection, incrementally evaluating expanded answers, and presenting results—ensuring bounded quality loss and computational cost.
In summary, the claims cover a system, method, and computer-readable medium for interactive keyword-based graph exploration, featuring quality- and time-bounded query expansion and incremental evaluation tailored to validated answers, with extensible support for various keyword search query classes and user-adjustable parameters.
Stated Advantages
Guarantees that query suggestions and answers have bounded quality loss and bounded evaluation cost, as specified by user-defined thresholds.
Enables efficient, interactive, and interpretable exploration of large graphs without re-evaluating expanded queries from scratch, thus minimizing computational expense.
Supports user configuration to flexibly trade off exploration openness and answer quality, allowing both conservative and open-ended search strategies.
Outperforms standard keyword search algorithms and query-by-example/co-occurrence methods in both answer quality and computational efficiency.
Allows exploration with strong, meaningful connections among keyword matches, resulting in more relevant and interpretable answers.
Provides a user interface for validating answers and tuning exploration parameters in real time, supporting iterative, user-driven graph exploration.
Documented Applications
Exploratory search and graph exploration in knowledge graphs, enabling users to refine queries and discover new relevant information with quality guarantees.
Information retrieval and search engines over networks, including applications using schema-free, multi-labeled graphs from sources such as DBpedia, IMDB, and citation networks.
Interactive search systems for social media, e-commerce networks, or any large heterogeneous networks requiring quality-sensitive keyword-driven exploration.
Adaptation to databases, documents, and web pages represented as heterogeneous networks for advanced exploratory search operations.
Interested in licensing this patent?