WaveGlow
E200567
autoregressive-free vocoder
deep learning model
flow-based generative model
neural network model
speech synthesis system
text-to-speech model
WaveGlow is a flow-based generative neural network that synthesizes high-quality speech audio from mel-spectrograms, sampling in parallel rather than autoregressively.
All labels observed (2)
| Label | Occurrences |
|---|---|
| WaveGlow (canonical) | 2 |
| WaveGlow: A Flow-based Generative Network for Speech Synthesis | 1 |
Statements (48)
| Predicate | Object |
|---|---|
| instanceOf | autoregressive-free vocoder; deep learning model; flow-based generative model; neural network model; speech synthesis system; text-to-speech model |
| advantageOverAutoregressiveModels | lower inference latency; parallel sampling |
| architectureComponent | series of flow steps; upsampling network for conditioning |
| audioQuality | near state-of-the-art at time of publication |
| basedOn | Glow; WaveNet |
| codeRepository | GitHub |
| comparedWith | ClariNet; Parallel WaveNet; WaveNet |
| designedTo | enable real-time TTS; replace autoregressive vocoders |
| developedBy | NVIDIA Corporation (surface form: NVIDIA) |
| distributionAssumption | simple prior distribution on latent space |
| domain | audio generation; speech processing |
| framework | PyTorch |
| input | mel-spectrograms |
| introducedAt | 2018 |
| language | Python |
| output | time-domain audio waveform |
| paperTitle | WaveGlow (self-link; surface form: WaveGlow: A Flow-based Generative Network for Speech Synthesis) |
| probabilityModel | exact likelihood model |
| property | fast parallel audio generation; fully convolutional architecture; high-quality speech synthesis; single-network architecture |
| publisher | NVIDIA Corporation (surface form: NVIDIA) |
| releasedAs | open source |
| supports | GPU acceleration |
| task | neural vocoding; text-to-speech synthesis |
| trainingDataType | paired text and speech corpora |
| trainingObjective | log-likelihood maximization; maximum likelihood |
| usedFor | neural TTS systems; speech synthesis research; voice assistants |
| uses | affine coupling layers; invertible 1x1 convolutions; normalizing flows |
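
The statements above name affine coupling layers, invertible 1x1 convolutions, an exact likelihood model, and a simple prior on the latent space. A minimal sketch of how these pieces could fit together in a single flow step is given below. It is not NVIDIA's implementation: it uses only two channels, plain Python lists, and a hypothetical `conditioner` function standing in for the learned network that maps one half of the signal plus the mel-spectrogram conditioning to a scale and shift. All function names and constants here are illustrative assumptions.

```python
import math

def conditioner(x_a, mel):
    """Hypothetical stand-in for the learned conditioning network.
    Returns a log-scale and a shift for every time step."""
    log_s = [math.tanh(0.5 * a + 0.1 * m) for a, m in zip(x_a, mel)]
    t = [0.3 * a - 0.2 * m for a, m in zip(x_a, mel)]
    return log_s, t

def coupling_forward(x_a, x_b, mel):
    """Affine coupling: x_a passes through, x_b is scaled and shifted."""
    log_s, t = conditioner(x_a, mel)
    z_b = [math.exp(ls) * b + ti for ls, b, ti in zip(log_s, x_b, t)]
    return x_a, z_b, sum(log_s)  # log|det J| is the sum of the log-scales

def coupling_inverse(z_a, z_b, mel):
    """Exact inverse: the conditioner sees the same untouched half."""
    log_s, t = conditioner(z_a, mel)
    x_b = [(zb - ti) * math.exp(-ls) for ls, zb, ti in zip(log_s, z_b, t)]
    return z_a, x_b

def conv1x1_forward(x_a, x_b, W):
    """Invertible 1x1 convolution: mix the two channels with a 2x2 matrix W
    at every time step."""
    z_a = [W[0][0] * a + W[0][1] * b for a, b in zip(x_a, x_b)]
    z_b = [W[1][0] * a + W[1][1] * b for a, b in zip(x_a, x_b)]
    det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
    return z_a, z_b, len(x_a) * math.log(abs(det))

def conv1x1_inverse(z_a, z_b, W):
    """Undo the channel mixing with the closed-form 2x2 inverse of W."""
    det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
    x_a = [(W[1][1] * a - W[0][1] * b) / det for a, b in zip(z_a, z_b)]
    x_b = [(-W[1][0] * a + W[0][0] * b) / det for a, b in zip(z_a, z_b)]
    return x_a, x_b

def gaussian_log_prob(z, sigma=1.0):
    """Log-density of z under a zero-mean isotropic Gaussian prior."""
    const = math.log(sigma) + 0.5 * math.log(2 * math.pi)
    return sum(-0.5 * (v / sigma) ** 2 - const for v in z)

def log_likelihood(x_a, x_b, mel, W):
    """Exact log-likelihood via the change-of-variables formula."""
    h_a, h_b, ld_conv = conv1x1_forward(x_a, x_b, W)
    z_a, z_b, ld_coup = coupling_forward(h_a, h_b, mel)
    return gaussian_log_prob(z_a + z_b) + ld_conv + ld_coup

def generate(z_a, z_b, mel, W):
    """Map latent values back to audio samples by inverting the flow step."""
    h_a, h_b = coupling_inverse(z_a, z_b, mel)
    return conv1x1_inverse(h_a, h_b, W)

if __name__ == "__main__":
    x_a, x_b = [0.1, -0.4, 0.7], [0.5, 0.2, -0.3]  # toy audio halves
    mel = [1.0, 0.0, -1.0]                          # toy conditioning values
    W = [[0.8, 0.6], [-0.6, 0.8]]                   # a rotation, so |det W| = 1
    print("log-likelihood:", log_likelihood(x_a, x_b, mel, W))
```

Training by maximum likelihood would adjust `W` and the conditioner's parameters to raise `log_likelihood` on real audio. Generation runs `generate` on values drawn from the Gaussian prior, and because no output sample depends on a previously generated one, every time step can be computed in parallel.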
Referenced by (3)
Full triples — surface form annotated when it differs from this entity's canonical label.
this entity surface form:
WaveGlow: A Flow-based Generative Network for Speech Synthesis