Problem
Research Build
Nepali Cultural Video Understanding
Built a multimodal research system that understands Nepali cultural videos through visual-language modeling, caption generation, and question answering grounded in visual content.
Approach
I collected Nepali cultural video data, processed clips into frame-based inputs, and evaluated vision-language pipelines that could produce captions and answer questions from multimodal context.
Outcome