DETR
E652054
DETR (Detection Transformer) is a deep learning model that applies transformer architectures to end-to-end object detection in images, eliminating the need for traditional hand-designed detection components.
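The core idea behind this end-to-end formulation is set prediction: the model emits a fixed-size set of predictions, and training assigns each ground-truth object to exactly one prediction. A minimal, hypothetical sketch of that matching step (real DETR uses the Hungarian algorithm over a combined class, L1, and generalized-IoU cost; this toy version brute-forces the assignment with a plain L1 box cost, and all names are illustrative):

```python
# Toy stand-in for DETR-style bipartite matching: each ground-truth box is
# assigned to exactly one predicted box so that the total pairwise cost is
# minimal. Brute force is fine for tiny inputs; real DETR runs the Hungarian
# algorithm over its ~100 prediction slots.
from itertools import permutations

def l1_cost(a, b):
    # L1 distance between two (x1, y1, x2, y2) boxes
    return sum(abs(x - y) for x, y in zip(a, b))

def match(pred_boxes, gt_boxes):
    """Return (prediction_index, ground_truth_index) pairs with minimal total cost."""
    best_pairs, best_cost = None, float("inf")
    for perm in permutations(range(len(pred_boxes)), len(gt_boxes)):
        cost = sum(l1_cost(pred_boxes[p], gt_boxes[g]) for g, p in enumerate(perm))
        if cost < best_cost:
            best_pairs = sorted((p, g) for g, p in enumerate(perm))
            best_cost = cost
    return best_pairs

preds = [[0.0, 0.0, 1.0, 1.0], [5.0, 5.0, 6.0, 6.0], [2.0, 2.0, 3.0, 3.0]]
gts = [[2.0, 2.0, 3.0, 3.0], [0.0, 0.0, 1.0, 1.0]]
print(match(preds, gts))  # -> [(0, 1), (2, 0)]: each ground truth gets one unique prediction
```

Because this matching is one-to-one, duplicate predictions are penalized by the loss itself, which is what lets DETR drop hand-designed components such as non-maximum suppression.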
Statements (59)
| Predicate | Object |
|---|---|
| instanceOf | deep learning model, object detection model |
| advantage | global reasoning via attention, removal of hand-designed detection components, simplified detection pipeline |
| approach | end-to-end object detection |
| availableAs | open-source implementation |
| basedOn | Transformer architecture |
| benchmarkDataset | COCO |
| comparedTo | Faster R-CNN, RetinaNet |
| developedAt | Facebook AI Research |
| domain | computer vision |
| eliminates | anchor boxes, non-maximum suppression, region proposal network |
| fullName | Detection Transformer |
| handles | variable number of objects |
| hasVariant | Conditional DETR, DAB-DETR, DN-DETR, Deformable DETR |
| implementedIn | PyTorch |
| inputType | image |
| inspiredBy | Attention Is All You Need |
| introducedBy | Alexander Kirillov, Francisco Massa, Gabriel Synnaeve, Nicolas Carion, Nicolas Usunier, Sergey Zagoruyko |
| introducedInPaper | End-to-End Object Detection with Transformers |
| limitation | slow convergence on small objects |
| outputType | bounding boxes, class labels, objectness scores, set of detected objects |
| predictionParadigm | one-to-one matching between predictions and ground truth, set prediction |
| publicationYear | 2020 |
| publishedAtConference | ECCV 2020 |
| requires | large-scale training data, longer training schedule than traditional detectors |
| supports | instance segmentation (with extensions), panoptic segmentation (with extensions) |
| task | image recognition, object detection |
| trainingObjective | L1 bounding box regression loss, bipartite matching loss, cross-entropy classification loss, generalized IoU loss |
| usesArchitecture | transformer |
| usesComponent | Hungarian matching, cross-attention, encoder-decoder transformer, feed-forward network, multi-head self-attention, object queries, set-based loss |
Referenced by (1)