TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding Paper • 2511.16595 • Published Nov 20 • 9
REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding Paper • 2511.13026 • Published Nov 17 • 25