Gemini Embedding 2: A Native Multimodal Embedding Model
→
One model that maps text, images, video, and audio, alone or in any combination, into a single shared vector space. It beats specialized models at their own game and shows that embedding audio natively outperforms transcribe-then-embed.
May 2026
Google DeepMind
PersonaLive: Real-Time Diffusion Portrait Animation
→
How to make a diffusion model animate portraits at 15–20 FPS live — by distilling 20 denoising steps into 4, and replacing batch generation with a sliding window that streams frames continuously.
December 2025
University of Macau & Dzine.ai & Great Bay University
How Much Do Language Models Memorize?
→
GPT-style models store roughly 3.6 bits per parameter. This paper estimates how much models memorize, explains double descent, and predicts when privacy attacks fail.
June 2025
Meta FAIR + Google DeepMind + Cornell + NVIDIA
Avatar V: Scaling Video-Reference Avatar Generation
→
How HeyGen generates talking avatar videos that preserve identity, talking style, and micro-expressions from a short reference video.
April 2026
HeyGen Research