`video-to-video`

HF group: Computer Vision · Status: ❌ not built

What it is

Video → transformed video (style transfer, frame interpolation, super-resolution, slow-mo).

Open-weight models

Model	Params	Released	License	Task	Notes
RIFE / FILM	5-50 M	2022-23	MIT (RIFE) / Apache-2.0 (FILM)	Frame interpolation	Real-time.
Real-ESRGAN-Anime / -Video	17 M	2021	BSD-3-Clause	Video upscale	Standard tool.
AnimateDiff	~1 B	2024	Apache-2.0	Style transfer over animation	Diffusion-based.

Infrastructure required

Inference

❌ Frame-by-frame inference for non-diffusion variants (Real-ESRGAN-Video on single frames, RIFE on pairs of adjacent frames).
❌ Diffusion runtime for AnimateDiff.
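
A minimal sketch of the two non-diffusion loop shapes, assuming frames are already decoded into memory. The `Frame` type, `mapFrames`, and `interpolate2x` are hypothetical names for illustration, and the model call is passed in as a plain function rather than tied to any specific runtime. A real pipeline would likely stream frames instead of holding arrays.

```typescript
// Hypothetical frame type: raw pixel bytes plus dimensions.
interface Frame {
  width: number;
  height: number;
  data: Uint8Array;
}

// Per-frame op (e.g. super-resolution): one frame in, one frame out.
type FrameOp = (frame: Frame) => Frame;

// Pairwise op (e.g. interpolation): two neighbours in, one in-between frame out.
type PairOp = (a: Frame, b: Frame) => Frame;

// Apply a per-frame model to every frame, preserving order.
function mapFrames(frames: Frame[], op: FrameOp): Frame[] {
  const out: Frame[] = [];
  for (const f of frames) {
    out.push(op(f));
  }
  return out;
}

// Insert one synthesized frame between each adjacent pair.
// N input frames produce 2N - 1 output frames (for N >= 1).
function interpolate2x(frames: Frame[], mid: PairOp): Frame[] {
  const out: Frame[] = [];
  for (let i = 0; i < frames.length; i++) {
    out.push(frames[i]);
    if (i + 1 < frames.length) {
      out.push(mid(frames[i], frames[i + 1]));
    }
  }
  return out;
}
```

Slow-mo falls out of the same loop: playing the output of `interpolate2x` at the original frame rate roughly halves the playback speed.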

Input

❌ Video input pipeline (decode frames, demux audio).
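
One way the decode and demux steps could look, sketched under the assumption that an external `ffmpeg` binary does the work and hands back raw RGB24 frames. The note itself does not name a decoder, so the choice of `ffmpeg` and all function names here are assumptions. The helpers only build argument lists and slice a byte buffer; spawning the process is left out.

```typescript
// Assumption: decoding is delegated to an external `ffmpeg` process.

// Arguments to decode the video stream to raw RGB24 on stdout, dropping audio.
function decodeFramesArgs(input: string): string[] {
  return ["-i", input, "-an", "-f", "rawvideo", "-pix_fmt", "rgb24", "pipe:1"];
}

// Arguments to copy the audio stream, without re-encoding, into a side file.
// The container of `audioOut` must accept the source codec (e.g. a .mka file).
function demuxAudioArgs(input: string, audioOut: string): string[] {
  return ["-i", input, "-vn", "-c:a", "copy", audioOut];
}

// RGB24 stores 3 bytes per pixel, so one frame is width * height * 3 bytes.
function frameByteLength(width: number, height: number): number {
  return width * height * 3;
}

// Split a buffer of concatenated raw frames into per-frame views (no copy).
// Trailing bytes that do not form a whole frame are ignored.
function splitFrames(buf: Uint8Array, width: number, height: number): Uint8Array[] {
  const size = frameByteLength(width, height);
  if (size <= 0) {
    throw new Error("width and height must be positive");
  }
  const frames: Uint8Array[] = [];
  for (let off = 0; off + size <= buf.length; off += size) {
    frames.push(buf.subarray(off, off + size));
  }
  return frames;
}
```

Keeping the audio as an untouched side file means it can be muxed back into the output later without a second lossy encode.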

Output

❌ Video file save (re-encoded).
Long-running task with progress.
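
The save step and its progress reporting could be sketched as below. This again assumes an external `ffmpeg` encoder fed raw RGB24 frames on stdin; the H.264 codec choice, the `Progress` shape, and the function names are illustrative assumptions, not something the note specifies.

```typescript
// Hypothetical progress snapshot for a long-running transform.
interface Progress {
  done: number;              // frames written so far
  total: number;             // frames expected overall
  fraction: number;          // done / total, clamped to [0, 1]
  etaSeconds: number | null; // null until at least one frame is done
}

// Derive fraction and a linear ETA from the frame count and elapsed time.
function progressAt(done: number, total: number, elapsedSeconds: number): Progress {
  const fraction = total > 0 ? Math.min(done / total, 1) : 0;
  const remaining = Math.max(total - done, 0);
  const etaSeconds = done > 0 ? (elapsedSeconds * remaining) / done : null;
  return { done, total, fraction, etaSeconds };
}

// Arguments to re-encode raw RGB24 frames from stdin, optionally muxing
// a previously demuxed audio file back in without re-encoding it.
// Assumption: libx264 with yuv420p output, which needs even dimensions.
function encodeArgs(
  width: number,
  height: number,
  fps: number,
  audioIn: string | null,
  output: string
): string[] {
  const args = [
    "-f", "rawvideo",
    "-pixel_format", "rgb24",
    "-video_size", width + "x" + height,
    "-framerate", String(fps),
    "-i", "pipe:0",
  ];
  if (audioIn !== null) {
    args.push("-i", audioIn);
  }
  args.push("-c:v", "libx264", "-pix_fmt", "yuv420p");
  if (audioIn !== null) {
    args.push("-map", "0:v:0", "-map", "1:a:0", "-c:a", "copy");
  }
  args.push(output);
  return args;
}
```

For interpolation ops, `total` should count output frames rather than input frames, otherwise the fraction would exceed its range before clamping and the ETA would be too short.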

Storage

❌ Weights cache.
Output: fs.user-folder.

Interaction (IPC + SDK)

❌ video.transform({ path, op }) IPC with progress.
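
A possible shape for this call, sketched as TypeScript types plus a request validator. Only `path` and `op` come from the signature above; the op names (taken from the "What it is" list), the event union, and `runtimeFor` are assumptions made for illustration.

```typescript
// Hypothetical op names, mirroring the transforms listed under "What it is".
type VideoOp = "style-transfer" | "frame-interpolation" | "super-resolution" | "slow-mo";

interface TransformRequest {
  path: string; // user-selected input video
  op: VideoOp;
}

// Hypothetical events streamed back to the caller while the task runs.
type TransformEvent =
  | { kind: "progress"; fraction: number }
  | { kind: "done"; outputPath: string }
  | { kind: "error"; message: string };

const VIDEO_OPS: VideoOp[] = [
  "style-transfer",
  "frame-interpolation",
  "super-resolution",
  "slow-mo",
];

// Validate an untrusted IPC payload before any work starts.
function parseTransformRequest(raw: unknown): TransformRequest {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("request must be an object");
  }
  const r = raw as { path?: unknown; op?: unknown };
  if (typeof r.path !== "string" || r.path.length === 0) {
    throw new Error("path must be a non-empty string");
  }
  if (typeof r.op !== "string" || VIDEO_OPS.indexOf(r.op as VideoOp) === -1) {
    throw new Error("unknown op");
  }
  return { path: r.path, op: r.op as VideoOp };
}

// Pick the inference path: diffusion for style transfer, frame loop otherwise.
function runtimeFor(op: VideoOp): "diffusion" | "frame" {
  return op === "style-transfer" ? "diffusion" : "frame";
}
```

Modelling progress, completion, and failure as one event union keeps the channel to a single stream, which suits a task that may run for minutes.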

Capabilities (manifest)

capabilities.fs.user-selected, capabilities.fs.user-folder.
capabilities.models[].

Gaps

Needs the same shared rails as the other video modalities: a video I/O pipeline, long-running task IPC, and an optional diffusion runtime (only for AnimateDiff-style ops).

See also

text-to-video
image-to-image — frame-by-frame variants share infra
Index: ../modalities-and-models-survey.md