Dec 8, 2024
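Since a video is a sequence of images plus an audio track, you could write custom logic to do this, for example by running an image model on a sample of frames and combining the per-frame results.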
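A rough sketch of what that custom logic might look like, assuming the frames are already decoded into a list and that you have some per-frame classifier. Both `sample_frame_indices` and `classify_video` are made-up names for illustration, `classify_frame` stands in for whatever image model you use, and the audio track is ignored here:

```python
from collections import Counter
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def sample_frame_indices(num_frames: int, num_samples: int) -> List[int]:
    """Pick up to num_samples evenly spaced indices in [0, num_frames)."""
    if num_frames <= 0 or num_samples <= 0:
        return []
    num_samples = min(num_samples, num_frames)
    step = num_frames / num_samples
    # Take the middle of each of the num_samples equal-width segments.
    return [int(i * step + step / 2) for i in range(num_samples)]


def classify_video(
    frames: Sequence[Frame],
    classify_frame: Callable[[Frame], str],  # hypothetical per-image model
    num_samples: int = 8,
) -> str:
    """Label a video by majority vote over a sample of its frames."""
    indices = sample_frame_indices(len(frames), num_samples)
    if not indices:
        raise ValueError("video has no frames to classify")
    votes = Counter(classify_frame(frames[i]) for i in indices)
    # most_common(1) returns [(label, count)] for the top label.
    return votes.most_common(1)[0][0]
```

Majority voting is only one way to combine frames; averaging per-frame scores is another option if your model returns probabilities.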
Alternatively, there are models on the Hugging Face Hub that operate on videos directly: https://huggingface.co/models?pipeline_tag=video-classification&sort=downloads