`keypoint-detection`

HF group: Computer Vision · Status: ❌ not built

What it is

Image / video → keypoint locations (pose): body joints, hand and face landmarks. Useful for AR, fitness apps, sign-language recognition.

Model	Params	Released	License	Quality	Notes
RTMPose	2-90 M	2023	Apache-2.0	Real-time SOTA	OpenMMLab.
ViTPose	90-660 M	2022	Apache-2.0	High quality, slower	Transformer.
MediaPipe Pose / Hands / Face	n/a	n/a	Apache-2.0	Real-time on CPU/mobile	Battle-tested at Google.
Apple Vision (`VNDetectHumanBodyPoseRequest`)	n/a	2020	Proprietary (Apple SDK)	Native, fast	macOS 11+ / iOS 14+. Free to use.

Same as the other CV modalities: image input, overlay UI, encoder-only inference path.
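
To make the overlay step concrete, here is a minimal sketch of turning model output into points the UI can draw. It assumes the model returns keypoints as normalized `(x, y)` coordinates with a confidence score and a top-left origin. The `Keypoint` type, the `to_overlay_points` helper, and the `min_score` threshold are hypothetical names for illustration, not part of any of the libraries listed above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Keypoint:
    """One detected landmark (hypothetical container, not a library type)."""
    name: str     # label such as "nose" or "left_wrist"
    x: float      # normalized horizontal position in [0, 1]
    y: float      # normalized vertical position in [0, 1], origin at top-left (assumed)
    score: float  # model confidence in [0, 1]


def to_overlay_points(
    keypoints: List[Keypoint],
    width: int,
    height: int,
    min_score: float = 0.5,  # assumed threshold; tune per model
) -> List[Tuple[str, int, int]]:
    """Drop low-confidence keypoints and scale the rest to pixel coordinates."""
    points = []
    for kp in keypoints:
        if kp.score < min_score:
            continue  # skip landmarks the model is unsure about
        # Scale to pixels and clamp so the point always lies inside the image.
        px = min(max(round(kp.x * width), 0), width - 1)
        py = min(max(round(kp.y * height), 0), height - 1)
        points.append((kp.name, px, py))
    return points


if __name__ == "__main__":
    detected = [
        Keypoint("nose", 0.5, 0.25, 0.9),
        Keypoint("left_wrist", 0.1, 0.9, 0.2),  # below threshold, dropped
    ]
    for name, px, py in to_overlay_points(detected, width=640, height=480):
        print(name, px, py)
```

Coordinate conventions differ between backends. Apple Vision, for example, reports normalized points with a lower-left origin, so `y` would need flipping (`1 - y`) before this helper is applied.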