Mission
Own the generative video pipeline that powers Albright Studios: Hallo2 for talking-head video, automatic speech recognition (ASR), text-to-speech (TTS), and the production rendering stack that turns scripts into broadcast-ready content.
Responsibilities
- Own the Hallo2 / talking-head video generation pipeline in production
- Build and operate ASR and TTS pipelines (Whisper, XTTS, or successors)
- Optimize GPU utilization across the video rendering stack
- Partner with product on creative-tool UX informed by model capabilities
- Establish quality metrics for lip-sync accuracy, voice naturalness, and render speed
- Stay current with video-diffusion research (e.g., Stable Video Diffusion, Pyramid Flow, Open-Sora)
- Mentor ML engineers and contribute to model-serving standards
Required qualifications
- 5+ years of ML engineering experience; 2+ years working on video or audio generative models
- Hands-on experience with PyTorch and modern diffusion models
- Strong production-deployment background (Docker, Kubernetes with GPU workloads)
- Comfort reading research papers and reproducing results
Preferred qualifications
- Prior work on talking-head, lip-sync, or face-animation systems
- Experience with TTS fine-tuning and familiarity with the ethics of voice cloning
- Background in broadcast media or video-tech companies
- Open-source contributions to generative-media tooling