TimeSformer is one of the first purely transformer-based video classification architectures, and the paper has inspired many follow-up works.
paper: buff.ly/3qpDAZS
github: buff.ly/3RW33rv
Bonus: I am currently integrating TimeSformer into Hugging Face: buff.ly/3TWFZdM


