Word flow annotation
Inventors
Sommers, Jeffrey Scott • Devine, Jennifer M. R. • Seuck, Joseph Wayne • Kaehler, Adrian
Assignees
Interested in licensing this patent?
MTEC can help explore whether this patent might be available for licensing for your application.
Abstract
An augmented reality (AR) device can be configured to monitor ambient audio data. The AR device can detect speech in the ambient audio data, convert the detected speech into text, or detect keywords such as rare words in the speech. When a rare word is detected, the AR device can retrieve auxiliary information (e.g., a definition) related to the rare word from a public or private source. The AR device can display the auxiliary information for a user to help the user better understand the speech. The AR device may perform translation of foreign speech, may display text (or the translation) of a speaker's speech to the user, or display statistical or other information associated with the speech.
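The rare-word step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names COMMON_WORDS, GLOSSARY, find_rare_words, and annotate are hypothetical, the common-word list stands in for whatever frequency data a real device would use, and the glossary stands in for the public or private source of auxiliary information.

```python
import re

# Hypothetical stand-in for a word-frequency list; a real system would rely on
# a far larger vocabulary or on corpus statistics.
COMMON_WORDS = {
    "the", "a", "is", "of", "and", "to", "in",
    "patient", "has", "with", "was", "diagnosed",
}

# Hypothetical private glossary acting as the auxiliary-information source.
GLOSSARY = {
    "ossification": "the process of bone formation",
}


def find_rare_words(text, common_words=COMMON_WORDS):
    """Return words not in the common-word list, in order, without duplicates."""
    rare = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in common_words and word not in rare:
            rare.append(word)
    return rare


def annotate(text, glossary=GLOSSARY):
    """Map each rare word to auxiliary information when the glossary has an entry."""
    return {w: glossary[w] for w in find_rare_words(text) if w in glossary}


transcript = "The patient was diagnosed with ossification of the hip"
annotations = annotate(transcript)
```

Words that are rare but have no glossary entry are simply left unannotated in this sketch; a device could instead fall back to a second source.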
Core Innovation
The invention relates to an augmented reality device that identifies a thread in a text stream by converting multiple audio streams into text streams, detecting one or more keywords associated with a topic, and generating threads that correspond to grouped topics. The device generates a first text stream from a first audio stream and a second text stream from a second audio stream, and identifies one or more keywords associated with both text streams, including detecting a verbal repeating of a word by a first user and designating that word as a keyword.
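One way to picture keyword designation by verbal repetition is the sketch below. It assumes the two audio streams have already been transcribed, treats a word as repeated when the first speaker says it at least twice, and keeps only repeated words that also occur in the second stream. The stopword list, the threshold, and the function names are illustrative assumptions, not details taken from the patent.

```python
import re
from collections import Counter

# Hypothetical stopword list so that function words are never keywords.
STOPWORDS = {
    "the", "a", "an", "is", "it", "to", "of", "and", "we", "i",
    "that", "in", "on", "for", "should", "about",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def repeated_words(text, min_count=2, stopwords=STOPWORDS):
    """Return non-stopwords that the speaker says at least `min_count` times."""
    counts = Counter(w for w in tokenize(text) if w not in stopwords)
    return {w for w, c in counts.items() if c >= min_count}


def shared_keywords(first_stream, second_stream):
    """Words the first speaker repeats that also occur in the second stream."""
    return repeated_words(first_stream) & set(tokenize(second_stream))


first = "We should review the budget. The budget is tight, and the schedule is short."
second = "I agree that the budget needs another look."
keywords = shared_keywords(first, second)
```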
The invention further determines context of keywords based on an identity of a speaker of the second audio stream, and retrieves auxiliary information associated with the keywords based on the respective context. In response to identifying topics associated with the keywords, the system groups topics based at least in part on whether the topics are related, and generates a thread associated with each group of topics. The device causes at least one generated thread to be rendered on an augmented reality display and renders the auxiliary information with the thread for the corresponding keyword and topic.
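The grouping step can be illustrated with a small union-find sketch, in which topics that are declared related end up in the same group and each group becomes one thread. How relatedness is decided is not specified here, so the related_pairs input is an assumption, as are the function names and the dictionary shape used for a thread.

```python
def group_topics(topics, related_pairs):
    """Group topics so that topics linked by a related pair share a group."""
    parent = {t: t for t in topics}

    def find(t):
        # Follow parent links to the group representative, halving the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in related_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for t in topics:
        groups.setdefault(find(t), []).append(t)
    return list(groups.values())


def build_threads(keyword_topics, related_pairs):
    """Build one thread per topic group from a keyword -> topic mapping."""
    topics = list(dict.fromkeys(keyword_topics.values()))  # unique, in order
    threads = []
    for group in group_topics(topics, related_pairs):
        members = set(group)
        threads.append({
            "topics": group,
            "keywords": [k for k, t in keyword_topics.items() if t in members],
        })
    return threads


keyword_topics = {"budget": "finance", "invoice": "billing", "deadline": "schedule"}
threads = build_threads(keyword_topics, related_pairs=[("finance", "billing")])
```

Here "finance" and "billing" are merged into one thread because they were marked as related, while "schedule" forms its own thread.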
The invention includes speaker identity and contextual parsing across first and second text streams, including parsing the first and second text streams to identify a first keyword associated with a first topic and a second keyword associated with a second topic. Additional behaviors include rendering threads in distinct locations, including left and right sides of a user's field of view, and rendering threads with visual differentiation, including a first color and a second color. Eye tracking sensors on the augmented reality device can be used to selectively deemphasize or dismiss generated threads or auxiliary information based on eye tracking information.
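The rendering behaviors above might be modeled as in the following sketch. It assigns threads alternately to the left and right of the field of view with different colors, and hides a thread once the user has not looked at its side for a set time. The dismissal rule, the five-second threshold, the color names, and all identifiers are assumptions made for illustration; the source says only that eye tracking information can be used to deemphasize or dismiss threads.

```python
from dataclasses import dataclass


@dataclass
class RenderedThread:
    title: str
    side: str
    color: str
    visible: bool = True
    seconds_since_gaze: float = 0.0


# Hypothetical display slots: (side of the field of view, color).
SLOTS = [("left", "blue"), ("right", "green")]


def layout_threads(titles):
    """Assign threads to alternating slots so neighbors differ in side and color."""
    return [RenderedThread(t, *SLOTS[i % len(SLOTS)]) for i, t in enumerate(titles)]


def update_visibility(threads, gazed_side, dt, dismiss_after=5.0):
    """Advance gaze timers by `dt` seconds and hide threads left unlooked-at."""
    for thread in threads:
        if thread.side == gazed_side:
            thread.seconds_since_gaze = 0.0
        else:
            thread.seconds_since_gaze += dt
        if thread.seconds_since_gaze >= dismiss_after:
            thread.visible = False
    return threads


threads = layout_threads(["finance", "schedule"])
update_visibility(threads, gazed_side="left", dt=6.0)
```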
Claims Coverage
The partial content includes four independent claims. Across the independent claims, the inventive features center on converting multiple audio streams to text streams, keyword and topic identification including verbal repetition, speaker-identity-based context determination, auxiliary information retrieval, and augmented reality rendering of generated threads together with auxiliary information; the claims further specify thread grouping, distinct rendering, and eye-tracking-based selection.
Shared-keyword topic grouping for AR thread generation
Identifies a first audio stream associated with a first user and a second audio stream at an augmented reality device, generates a first text stream and a second text stream, identifies one or more keywords associated with both text streams including detecting a verbal repeating of a first word by the first user, identifies a speaker identity of the second audio stream, identifies a topic associated with each keyword, determines one or more groups of topics based at least in part on whether topics are related, generates a thread associated with each group of topics, and causes at least one generated thread to be rendered.
Speaker-identity-based contextual auxiliary information rendering with threads
Determines a context of each of the plurality of keywords based at least in part on the identity of the speaker of the second audio stream, retrieves auxiliary information associated with the keywords based on the respective context, and renders the auxiliary information with at least one of the generated threads for the keyword associated with the topic of that thread.
AR parsing into topic keywords and rendering of auxiliary information
Identifies a first audio stream and a second audio stream at an augmented reality device, generates a first text stream from the first audio stream and a second text stream from the second audio stream, parses the first text stream and the second text stream to identify a first keyword associated with a first topic and a second keyword associated with a second topic including detecting a verbal repeating of a first word by a first user, generates a first thread associated with the first topic and a second thread associated with the second topic, causes at least one of the first thread or the second thread to be rendered, determines a context of each of the first and second keywords based at least in part on the identity of the speaker of the second audio stream, retrieves auxiliary information associated with the first and second keywords based on the respective context, and renders the auxiliary information with the at least one rendered thread for the keyword associated with the topic of that thread.
An augmented reality device programmed to generate and render contextual threads with auxiliary information
An augmented reality device with a hardware processor programmed to identify a first audio stream, detect ambient sounds with an audio sensor, identify a second audio stream within the ambient sounds, determine the identity of a speaker of the second audio stream, generate a first text stream and a second text stream from the first and second audio streams, parse the first and second text streams to identify a first keyword associated with a first topic and a second keyword associated with a second topic including detecting a verbal repeating of a first word by a first user, generate a first thread associated with the first topic and a second thread associated with the second topic, determine a context of the first keyword based at least in part on the identity of the speaker of the second audio stream, retrieve auxiliary information associated with the first keyword based on the context, and render the threads and the auxiliary information on the augmented reality display.
Across the independent claims, the core coverage is the end-to-end augmented reality pipeline that identifies keywords from multiple audio-to-text streams, assigns topics and groups them, uses speaker identity to determine keyword context, retrieves auxiliary information based on that context, and renders generated threads together with the corresponding auxiliary information on an augmented reality display. The claims also support distinct rendering and eye-tracking-based deemphasis or dismissal as further, narrower features.
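The speaker-identity step in this pipeline can be sketched as a lookup in which the same keyword resolves to different auxiliary information depending on who is speaking. The speaker directory, the context labels, the sample definitions, and the function names below are all hypothetical; the claims say only that context is determined at least in part from the speaker's identity.

```python
# Hypothetical directory mapping a recognized speaker to a context label.
SPEAKER_CONTEXT = {"dr_lee": "medicine", "sam": "finance"}

# Hypothetical auxiliary information keyed by (keyword, context).
AUXILIARY = {
    ("stent", "medicine"): "a small tube placed in a vessel to keep it open",
    ("margin", "finance"): "revenue minus cost, as a share of revenue",
    ("margin", "medicine"): "the edge of tissue removed around a tumor",
}


def keyword_context(speaker_id, default="general"):
    """Return the context label for a speaker, or a default for unknown speakers."""
    return SPEAKER_CONTEXT.get(speaker_id, default)


def auxiliary_for(keywords, speaker_id):
    """Look up auxiliary information for each keyword in the speaker's context."""
    context = keyword_context(speaker_id)
    return {
        k: AUXILIARY[(k, context)]
        for k in keywords
        if (k, context) in AUXILIARY
    }


finance_view = auxiliary_for(["margin"], "sam")
medical_view = auxiliary_for(["margin", "stent"], "dr_lee")
```

In this sketch an unrecognized speaker falls back to a "general" context, for which no entries exist, so no auxiliary information is returned.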
Stated Advantages
Documented Applications
No documented applications found