Duplicate/combined word (lists) when using the response 'SpeechRecognitionResult' from longRunningRecognizeAsync #3903

@JeroenAppel

Description

When looping through the 'SpeechRecognitionResult' objects, I noticed that the transcript attribute and the 'words_' list do not match for (at least) the last result. In our app, we always use the first and only alternative. For the first few results, the word list matches the transcript, but for the last result the word list contains the words of every result, including the preceding ones. I would expect the last result to contain only the words that belong to its own transcript.

I assume this is a bug. If not, please advise.

Environment details

  • OS: Windows 10
  • Java version: 1.8.0_102
  • google-cloud-java version(s): google-cloud-speech-0.67.0-beta

Code snippet

To clarify this, here is a simplified code snippet.

List<SpeechRecognitionResult> results = response.getResultsList();

for (SpeechRecognitionResult result : results) {
    // We always use the first (and only) alternative.
    SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
    String transcript = alternative.getTranscript();
    // Expected: only the words of this transcript. Observed: all words for the last result.
    for (WordInfo wordInfo : alternative.getWordsList()) {
        String word = wordInfo.getWord();
    }
}

In my current example, we have 3 results. The number of words is correct for the first two, but the third (last) result is incorrect: its word list includes all words from the whole text.

Result 0: Only the word Jaguar (both the transcript and the only word)

Result 1: A longer transcript, with 102 (correct) words in total

Result 2: A short transcript with only 23 words, yet the list of words includes all 126 words (1 + 102 + 23).
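Until the behaviour is clarified, a possible client-side workaround is to drop the leading words that belong to earlier results. The sketch below is only an illustration under assumptions: `WordListTrimmer` and `wordsForResult` are hypothetical names (not part of google-cloud-speech), the words are passed in as plain strings (for example collected via `wordInfo.getWord()`), and the number of words in a transcript is assumed to equal its number of whitespace-separated tokens, which may not hold for every language or formatting option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Speech client library.
final class WordListTrimmer {

    /**
     * Returns the words that belong to one result.
     *
     * @param words              the word list reported for this result
     * @param precedingWordCount total number of words in all earlier results
     * @param transcript         the transcript reported for this result
     */
    static List<String> wordsForResult(List<String> words, int precedingWordCount, String transcript) {
        String trimmed = transcript.trim();
        // Assumption: one word per whitespace-separated token of the transcript.
        int expected = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;

        // The list looks cumulative when it holds exactly the earlier words
        // followed by this result's words; keep only the tail in that case.
        if (precedingWordCount >= 0 && words.size() == precedingWordCount + expected) {
            return new ArrayList<>(words.subList(precedingWordCount, words.size()));
        }
        // Otherwise leave the list untouched rather than guess.
        return new ArrayList<>(words);
    }

    public static void main(String[] args) {
        List<String> cumulative = Arrays.asList("Jaguar", "is", "a", "car", "it", "is", "fast");
        // Four words came from earlier results; this result's transcript has three.
        System.out.println(wordsForResult(cumulative, 4, "it is fast")); // prints [it, is, fast]
    }
}
```

With the numbers from this issue, the last result would be called with a list of 126 words, a preceding count of 103 (1 + 102) and its 23-word transcript, which would leave the final 23 words. The helper deliberately returns the list unchanged when the sizes do not add up, so results that are already correct are not altered.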

Labels

api: speech (Issues related to the Speech-to-Text API.)
type: question (Request for information or clarification. Not an issue.)