Flickr8k

E899058

Flickr8k is a benchmark image-captioning dataset of about 8,000 photographs sourced from Flickr, each paired with five human-written English captions. It is widely used for training and evaluating image-captioning and vision-language models.

Statements (46)

Predicate Object
instanceOf benchmark dataset
image-captioning dataset
vision-language dataset
hasAnnotationType image captions
natural language descriptions
hasApproximateNumberOfImages 8000
hasBenchmarkRole baseline dataset for image captioning
hasCaptionSource human annotators
hasCaptionsPerImage 5
hasDataModality images
text captions
hasDataType photographic images
hasDescriptionQuality human-written captions
hasDomain computer vision
natural language processing
hasEvaluationMetrics BLEU
CIDEr
METEOR
ROUGE
hasImageSource Flickr
hasLanguage English
hasLicenseType research use
hasName Flickr8k
hasNumberOfImages 8000
hasScale small-scale image-captioning dataset
hasSource Flickr
hasTask multimodal learning
vision-language grounding
hasTypicalSplit test set
training set
validation set
isBenchmarkFor automatic image description
multimodal representation learning
isConsidered standard benchmark in image captioning
isSmallerThan Flickr30k
MS COCO
isUsedIn vision-and-language research
isUsedTo compare image-captioning algorithms
evaluate caption generation quality
isWidelyUsedFor benchmarking captioning models
isWidelyUsedIn academic research
usedFor evaluating image-captioning models
evaluating vision-language models
image captioning
training image-captioning models
training vision-language models

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.