zero-shot-object-detection
See object-detection.
Hugging Face lists this as a separate task, but the open-vocabulary
detectors (Grounding DINO, OWLv2) appear in the
object-detection model table because
they share infrastructure. The only difference is whether
labels come from a fixed taxonomy or a runtime text query.
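
To make the distinction concrete, here is a minimal sketch using the Hugging Face `transformers` pipelines. The model checkpoints (`facebook/detr-resnet-50`, `google/owlv2-base-patch16-ensemble`), the function names, and the 0.3 score cutoff are illustrative assumptions, not choices made by this document. Both pipelines return a list of dicts with `score`, `label`, and `box` keys, so the same post-processing can be applied to either.

```python
from typing import Any


def detect_fixed_taxonomy(
    image: str, model: str = "facebook/detr-resnet-50"
) -> list[dict[str, Any]]:
    """Closed-set detection: labels come from the checkpoint's own taxonomy."""
    # Imported lazily so the helper below works without transformers installed.
    # Assumed checkpoint; it may need extra dependencies (for example timm).
    from transformers import pipeline

    detector = pipeline("object-detection", model=model)
    return detector(image)


def detect_text_query(
    image: str,
    labels: list[str],
    model: str = "google/owlv2-base-patch16-ensemble",
) -> list[dict[str, Any]]:
    """Open-vocabulary detection: labels are supplied as text at call time."""
    from transformers import pipeline

    detector = pipeline("zero-shot-object-detection", model=model)
    return detector(image, candidate_labels=labels)


def keep_confident(
    detections: list[dict[str, Any]], min_score: float = 0.3
) -> list[dict[str, Any]]:
    """Shared post-processing: drop detections below a score cutoff."""
    return [d for d in detections if d["score"] >= min_score]
```

Under these assumptions, `keep_confident(detect_text_query("photo.jpg", ["a red bicycle"]))` and `keep_confident(detect_fixed_taxonomy("photo.jpg"))` differ only in where the label set comes from; `photo.jpg` is a hypothetical local path.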