vision-language dataset

C63730

concept

A vision-language dataset is a curated collection of paired visual data (such as images or videos) and corresponding textual annotations designed to train and evaluate models that jointly understand and generate visual and linguistic information.


All labels observed (3)

Label	Occurrences
vision-language dataset (canonical)	2
image captioning dataset	1
image-captioning dataset	1

Description generation (CDg)

The one-sentence description above was generated by prompting gpt-5.1 with the class name and this instruction.

Instruction

generate a one-sentence description for a given conceptual class.
# Response Format
Return only the sentence: "Description: [one-sentence description of the conceptional class]"

Input

Class: vision-language dataset

Generated description

A vision-language dataset is a curated collection of paired visual data (such as images or videos) and corresponding textual annotations designed to train and evaluate models that jointly understand and generate visual and linguistic information.
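The generation step described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used for this page: the function names `build_prompt` and `parse_description` are hypothetical, the way the instruction and the input are joined (a blank line between them) is an assumption, and no model call is made. The instruction text is reproduced verbatim from this page, including its original wording.

```python
# Instruction text copied verbatim from this page.
INSTRUCTION = (
    "generate a one-sentence description for a given conceptual class.\n"
    "# Response Format\n"
    'Return only the sentence: "Description: '
    '[one-sentence description of the conceptional class]"'
)


def build_prompt(class_name: str) -> str:
    """Join the instruction and the 'Class: ...' input (layout is an assumption)."""
    return f"{INSTRUCTION}\n\nClass: {class_name}"


def parse_description(response: str) -> str:
    """Strip the 'Description:' prefix requested by the response format, if present."""
    text = response.strip()
    prefix = "Description:"
    if text.startswith(prefix):
        text = text[len(prefix):]
    return text.strip()


if __name__ == "__main__":
    # Show the prompt that would be sent for this concept.
    print(build_prompt("vision-language dataset"))
```

A response of the form `Description: <sentence>` would be passed through `parse_description` to recover the bare sentence shown under "Generated description".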

Instances (3)

Instance	Via concept surface
MSCOCO	image captioning dataset
Flickr8k	image-captioning dataset
Flickr30k	—