Demo: On-Device Video Analysis with LLMs
Vishnu Jaganathan, Deepak Gouda, Kriti Arora, and 2 more authors
In Proceedings of the 25th International Workshop on Mobile Computing Systems and Applications, San Diego, CA, USA, Feb 2024
We present a new on-device pipeline that efficiently summarizes lecture videos and answers questions about them directly on a smartphone. We use widely accessible tools such as OCR and Vosk speech-to-text, coupled with large language models (LLMs), to identify crucial sentences and generate summaries. We fine-tune and quantize BERT and GPT-2 so that lecture video summarization and question answering run efficiently on consumer-grade smartphones such as the Pixel 8 Pro. Notably, this approach eliminates the need for cloud APIs, which enhances user privacy and keeps mobile data usage minimal. Video: https://www.youtube.com/shorts/zwGdONlKays
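As a rough illustration of the "identify crucial sentences" step, the sketch below ranks transcript sentences and keeps the top few in their original order. It is not the authors' method: the abstract says a fine-tuned, quantized BERT does this job, whereas here a simple word-frequency score stands in for the model so the example runs with the standard library alone. The function names, the `ocr_text` argument, and the length-based stop-word filter are all assumptions made for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    # Lowercased words longer than 3 characters (a crude stop-word filter).
    return [w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 3]


def select_key_sentences(transcript, ocr_text="", k=2):
    """Return the k highest-scoring transcript sentences in original order.

    Hypothetical stand-in scorer: average corpus frequency of a sentence's
    content words, counted over the transcript plus any OCR'd slide text.
    """
    sentences = split_sentences(transcript)
    freq = Counter(content_words(transcript + " " + ocr_text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]),
                 reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    transcript = ("Gradient descent minimizes the loss. Okay so um yeah. "
                  "The loss gradient points uphill.")
    for line in select_key_sentences(transcript, "gradient descent loss", k=2):
        print(line)
```

In a pipeline like the one described, the transcript would come from the speech-to-text stage and `ocr_text` from the slide frames; the selected sentences would then be passed to the generative model for summarization.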