Whisper
E20825
Whisper is an open-source automatic speech recognition system by OpenAI that transcribes and translates spoken language with high accuracy across many languages.
All labels observed (2)
| Label | Occurrences |
|---|---|
| Whisper canonical | 2 |
| Whisper model | 1 |
How this entity was disambiguated
This entity first appeared as the object of triple T146321; resolving that mention is where its identity was fixed. The disambiguator weighed the candidate entities listed below and picked one of them (marked "chosen"), or "None of above", which mints a new entity. This is how homonymy is resolved: the same surface form can point to different entities.
Target entity: Whisper
Context triple: [OpenAI, developed, Whisper]
- A. The Concert: The Concert is an unfinished and now missing painting by Dutch Golden Age master Johannes Vermeer, depicting an intimate scene of three figures making music in a domestic interior.
- B. The Great Unknown: The Great Unknown is a section of Washington Irving’s 1824 story collection "Tales of a Traveller," comprising a group of tales centered on mystery and the supernatural.
- C. Once Again: Once Again is John Legend's Grammy-winning second studio album, known for its soulful blend of R&B, pop, and neo-soul.
- D. Savior: Savior is a Christian title for Jesus Christ, emphasizing his role as the divine redeemer who delivers humanity from sin and spiritual death.
- E. Yell: Yell is one of the larger inhabited islands in the Shetland archipelago of Scotland, known for its rugged coastline, wildlife, and remote rural communities.
- F. None of above. (chosen)
- G. Unsure - the case is ambiguous/there is not enough information to decide.
Target entity: Whisper
Target entity description: Whisper is an open-source automatic speech recognition system by OpenAI that transcribes and translates spoken language with high accuracy across many languages.
- A. The Concert: The Concert is an unfinished and now missing painting by Dutch Golden Age master Johannes Vermeer, depicting an intimate scene of three figures making music in a domestic interior.
- B. The Great Unknown: The Great Unknown is a section of Washington Irving’s 1824 story collection "Tales of a Traveller," comprising a group of tales centered on mystery and the supernatural.
- C. Once Again: Once Again is John Legend's Grammy-winning second studio album, known for its soulful blend of R&B, pop, and neo-soul.
- D. Savior: Savior is a Christian title for Jesus Christ, emphasizing his role as the divine redeemer who delivers humanity from sin and spiritual death.
- E. Yell: Yell is one of the larger inhabited islands in the Shetland archipelago of Scotland, known for its rugged coastline, wildlife, and remote rural communities.
- F. None of above. (chosen)
Statements (69)
| Predicate | Object |
|---|---|
| instanceOf | automatic speech recognition system; open-source software |
| developer | OpenAI |
| feature | robustness to accents; robustness to background noise; robustness to technical language |
| hasModelVariant | base; large; medium; small; tiny |
| input | audio |
| modelType | encoder-decoder transformer |
| openSourceRepository | https://github.com/openai/whisper |
| organization | OpenAI |
| output | text |
| programmingLanguage | C++; Python |
| provides | pretrained models of multiple sizes |
| releaseDate | September 2022 |
| softwareLicense | MIT License |
| supports | multilingual speech recognition; multilingual speech translation; segment-level language detection; timestamped transcription; transcription of spoken language; translation of spoken language |
| supportsLanguage | Arabic; Bulgarian; Cantonese; Chinese; Czech; Danish; Dutch; English; Finnish; French; German; Greek; Hebrew; Hindi; Hungarian; Indonesian; Italian; Japanese; Korean; Malay; Norwegian language (surface form: Norwegian); Polish; Portuguese; Romanian language (surface form: Romanian); Russian; Spanish; Swahili language (surface form: Swahili); Swedish; Tagalog; Thai; Turkish language (surface form: Turkish); Ukrainian; Vietnamese |
| task | language identification; speech recognition; speech translation; speech-to-text transcription |
| trainingData | multilingual and multitask supervised data collected from the web |
| useCase | building voice interfaces; captioning videos; transcribing meetings; transcribing podcasts |
How these facts were elicited
The pipeline generated the facts above by prompting gpt-5.1 with this entity's name + description and the instruction below.
You are a knowledge base construction expert. Given a subject entity and a description of it, return factual statements that you know for the subject as a JSON list of dictionaries(triples), where keys must be "subject", "predicate" and "object". The number of facts may be very high, between 25 to 50 or more, for very popular subjects. For less popular subjects, the number of facts can be very low, like 5 or 10.
# Requirements
- If you don't know the subject at all, return an empty list.
- If the subject is not a named entity, return an empty list.
- Include at least one triple where predicate is "instanceOf".
- Do not get too wordy.
- Separate several objects into multiple triples with one object.
Subject: Whisper
Description of subject: Whisper is an open-source automatic speech recognition system by OpenAI that transcribes and translates spoken language with high accuracy across many languages.
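The output format the instruction asks for (a JSON list of triples with the keys "subject", "predicate" and "object", at least one "instanceOf" triple, and an empty list when the subject is unknown) can be checked mechanically. The following is a minimal sketch of such a check, not the pipeline's actual code: the function names `validate_triples` and `group_by_predicate` are hypothetical, and the grouping step only illustrates how one-object-per-triple output could be folded into a Predicate/Object view like the Statements table.

```python
import json

# Keys the instruction requires on every triple.
REQUIRED_KEYS = {"subject", "predicate", "object"}


def validate_triples(raw: str) -> list:
    """Parse a model response and check it against the stated requirements.

    Hypothetical helper: the real pipeline's validation is not documented here.
    """
    triples = json.loads(raw)
    if not isinstance(triples, list):
        raise ValueError("response must be a JSON list")
    for triple in triples:
        if not isinstance(triple, dict) or set(triple) != REQUIRED_KEYS:
            raise ValueError(f"malformed triple: {triple!r}")
    # An empty list is allowed (unknown subject, or not a named entity).
    # A non-empty list must contain at least one instanceOf triple.
    if triples and not any(t["predicate"] == "instanceOf" for t in triples):
        raise ValueError("missing instanceOf triple")
    return triples


def group_by_predicate(triples: list) -> dict:
    """Collect the objects of each predicate, preserving response order."""
    grouped = {}
    for triple in triples:
        grouped.setdefault(triple["predicate"], []).append(triple["object"])
    return grouped


if __name__ == "__main__":
    response = json.dumps([
        {"subject": "Whisper", "predicate": "instanceOf", "object": "open-source software"},
        {"subject": "Whisper", "predicate": "developer", "object": "OpenAI"},
    ])
    for predicate, objects in group_by_predicate(validate_triples(response)).items():
        print(predicate, "|", "; ".join(objects))
```

Rejecting a non-empty response that lacks an "instanceOf" triple mirrors the instruction's third requirement; a stricter check (for example, that every "subject" equals the prompted subject) would be a further assumption.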
Referenced by (3)
Full triples — surface form annotated when it differs from this entity's canonical label.