linguistic corpus
C15286
concept
A linguistic corpus is a large, structured collection of authentic texts or transcribed speech used for analyzing language patterns, usage, and structure.
All labels observed (17)
| Label | Occurrences |
|---|---|
| linguistic corpus (canonical) | 9 |
| text corpus | 4 |
| linguistic resource | 2 |
| poetic corpus | 2 |
| American English corpus | 1 |
| Arabic text corpus | 1 |
| English–French corpus | 1 |
| German language corpus | 1 |
| LinguisticResource | 1 |
| Shakespearean textual corpus subset | 1 |
| ancient Chinese text corpus | 1 |
| diachronic corpus | 1 |
| historical language corpus | 1 |
| language model training dataset | 1 |
| multilingual dataset | 1 |
| sign language corpus project | 1 |
| speech corpus | 1 |
Instances (23)
| Instance | Via concept surface |
|---|---|
| Orphic Hymns | poetic corpus |
| Chu bamboo slips | ancient Chinese text corpus |
| Pavier quartos | Shakespearean textual corpus subset |
| South Picene inscriptions | — |
| “Yana Texts” by Edward Sapir (surface form: Yana Texts) | — |
| CORPES XXI | — |
| Corpus del Español del Siglo XXI | — |
| CREA | — |
| CORDE corpus | historical language corpus |
| Common Voice dataset | speech corpus |
| Corpus de Referencia del Español Actual | — |
| Jabirian corpus | Arabic text corpus |
| FrenchPlaceNames | LinguisticResource |
| Oxford Dictionaries American English corpus | text corpus |
| Auslan Signbank | linguistic resource |
| NederlandstaligeKronieken | text corpus |
| CORPES XXI corpus | — |
| CREA corpus | — |
| WMT English-French dataset | English–French corpus |
| Irish Sign Language Corpus Project | sign language corpus project |
| Horatian corpus | poetic corpus |
| Deutsches Referenzkorpus | text corpus |
| WebText dataset (surface form: WebText) | text corpus |