Live production is evolving fast—moving from SDI to IP, from manual control to intelligent orchestration, and from “video-only” to richer, real-time data and accessibility layers. This session explores next-generation broadcast architectures that synchronize video, audio, and metadata at ultra-low latency, shows how real-time orchestration automates live workflows, and examines how AI is reshaping audio description for audiences who are blind or have low vision. Together, these presentations reveal how smarter systems can improve speed, reliability, efficiency, and audience experience—while raising critical questions about where automation helps and where human expertise remains essential.
Monday, April 20 | 1:30 – 1:50 p.m. | N256
Sebastian Franke, Matthias Schnöll
The IP-Broadcast research project investigates a high-performance, low-latency IP live production architecture as an advancement of traditional broadcast workflows. The focus is on a live event scenario that demands minimal delay and seamless synchronization of multiple media sources. A fully IP-based approach was chosen for integration into existing systems. The goal is an IP-based production workflow extended through AI-driven metadata detection. Video, audio, and metadata streams are transmitted separately yet precisely synchronized using packet-based architectures built on IP standards. Precision Time Protocol (PTP) enables accurate time synchronization across all media streams.

The ongoing development takes place in a controlled, reproducible test environment. At its core lies a custom-developed, web-based playout system serving as a platform for connecting and testing proprietary AI-supported analysis modules.

A key component of the ongoing research is a newly designed generic metadata model for the structured description of event-based information using AI. This model allows for the distinct classification of event types (e.g., motion, object interaction, positional change) with exact time references and hierarchical structuring. Metadata is intended to be embedded as separate IP-broadcast–compliant streams using standardized formats such as JSON or XML. The model is designed to support automatic event recognition based on previously annotated training data and aims to establish a scalable metadata-processing foundation for real-time applications.

The modeling and intended implementation follow a modular architecture that separates raw data acquisition, semantic interpretation, and synchronized playback. This creates a continuous link between live video and event interpretation to support time-critical production processes through automation.

As part of the ongoing project, the developed infrastructure comprises a functional playout server, an AI framework for event-based analysis, and successfully conducted tests of time-synchronized multicast transmission. These components form the basis for a scalable, modular live production architecture. In the future, visually lossless low-delay codecs such as JPEG XS will be evaluated with respect to processing speed, image quality, and system integration. Additionally, real-time optimization of AI processing on CPU/GPU systems is planned.
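To make the event-based metadata model more concrete, here is a minimal Python sketch of what one hierarchically structured, time-referenced event record might look like when serialized to JSON for transmission as a separate metadata stream. The field names (event_type, ptp_timestamp_ns, and so on) are illustrative assumptions, not the project's actual schema.

```python
import json
import time
from dataclasses import dataclass, field, asdict
from typing import List

@dataclass
class MediaEvent:
    """One event record with an exact time reference and optional sub-events."""
    event_type: str          # e.g. "motion", "object_interaction", "positional_change"
    ptp_timestamp_ns: int    # capture time in nanoseconds; hypothetically PTP-derived
    confidence: float        # detector confidence in [0.0, 1.0]
    source_stream: str       # identifier of the media stream the event was detected in
    children: List["MediaEvent"] = field(default_factory=list)  # hierarchical structuring

# Example: a motion event containing a nested object-interaction sub-event.
event = MediaEvent(
    event_type="motion",
    ptp_timestamp_ns=time.time_ns(),   # stand-in; a real system would read a PTP-disciplined clock
    confidence=0.94,
    source_stream="camera-1",
    children=[
        MediaEvent("object_interaction", time.time_ns(), 0.87, "camera-1"),
    ],
)

# Serialize to JSON for embedding as a separate, IP-broadcast-compliant metadata stream.
print(json.dumps(asdict(event), indent=2))
```

Because each record carries its own timestamp, a downstream consumer can align metadata against the separately transmitted video and audio essence without the streams sharing a transport.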
Monday, April 20 | 1:50 – 2:10 p.m. | N256
Joel Snyder
Audio Description (AD) is a secondary audio track that provides access to the visual elements of film and television, primarily for the benefit of people who are blind or have low vision. When I teach AD at sessions around the world, I focus on the crafting of the language used; most AD is written to be heard. Description writers and voice talents trained in AD best practices are critical to the success of the effort to translate a visual image to the spoken word. We are on the cusp of AI dominance in all manner of endeavor. Speech synthesis is already employed by some companies producing AD for broadcast television. But the writing of AD depends on an understanding and thorough analysis of the work to be described. Similarly, the appropriate voicing of AD requires nuance, attention to the images on screen, an understanding of the phrasing used in AD writing, and the intent of the content creator. In 2021, the American Council of the Blind passed a resolution noting its "full support for … the use of human voices in … audio description for cinema and narrative video or streaming". Can the spread of speech synthesis or AI development of AD scripts be stopped or forestalled? Should it be? Can its use be a time-saver for AD production? AI apps are inevitable and will surely bring great advances to humanity. But, at least for the foreseeable future, experienced human writers and voice talents are key to effective AD.
Monday, April 20 | 2:10 – 2:30 p.m. | N256
James Bloomfield
Live sports production continues to grow in scale, complexity and commercial pressure. Rights holders and broadcasters must deliver more feeds, more versions, more platforms and more data-driven enhancements, often with fewer resources, tighter timelines and increasing demands for operational efficiency. This session explores how real-time orchestration is reshaping the modern sports broadcast chain, replacing fragmented manual processes with intelligent, centralised control. Using real-world workflows and benchmarks, we will demonstrate how MNC Software’s Tapestry platform brings order to previously chaotic environments by automating decision making, coordinating multi-vendor workflows and enabling true end-to-end visibility across live events.

Attendees will learn how orchestration can:
- Reduce operational costs through automated signal routing, event-aware triggers (sketched after this abstract) and streamlined multi-site coordination.
- Improve reliability and resilience in live sports environments where seconds matter and failure is not an option.
- Enhance technical efficiency through consistent, repeatable workflows that remove human error and eliminate redundant steps.
- Accelerate delivery for REMI, cloud and hybrid SDI–IP workflows without compromising quality.
- Create measurable ROI by lowering OPEX, reducing staffing peaks and optimising infrastructure utilisation.

The session will include practical examples drawn from major sporting events illustrating how orchestrated broadcast environments can scale from a single venue to global, multi-event operations. By the end, attendees will walk away with a clear understanding of how orchestration is becoming the cornerstone of modern sports broadcasting and how adopting platforms like Tapestry can unlock both immediate savings and long-term competitive advantages.
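Tapestry's actual interfaces are not described here; purely as a generic illustration of the event-aware triggering pattern behind orchestrated control, here is a minimal Python sketch in which every name (Orchestrator, route_clean_feed, failover_to_backup and the event kinds) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Event:
    """A live production event, e.g. from a scoreboard, scheduler or signal monitor."""
    kind: str            # e.g. "match_start", "feed_loss"
    venue: str
    payload: Dict[str, str]

class Orchestrator:
    """Dispatches incoming events to registered actions: the core of event-aware triggering."""
    def __init__(self) -> None:
        self._rules: Dict[str, List[Callable[[Event], None]]] = {}

    def on(self, kind: str, action: Callable[[Event], None]) -> None:
        """Register an action to run whenever an event of the given kind arrives."""
        self._rules.setdefault(kind, []).append(action)

    def dispatch(self, event: Event) -> None:
        """Run every action registered for this event kind, in registration order."""
        for action in self._rules.get(event.kind, []):
            action(event)   # e.g. route a signal, start a recorder, fire a graphic

# Hypothetical actions standing in for real router/playout control calls.
def route_clean_feed(event: Event) -> None:
    print(f"Routing clean feed from {event.venue} to distribution")

def failover_to_backup(event: Event) -> None:
    print(f"Feed loss at {event.venue}: switching to backup path")

orchestrator = Orchestrator()
orchestrator.on("match_start", route_clean_feed)
orchestrator.on("feed_loss", failover_to_backup)
orchestrator.dispatch(Event("match_start", "Stadium A", {}))
```

The design choice this illustrates is replacing per-vendor manual steps with a single rule table: the same dispatch path handles routine routing and failure recovery, which is what makes the workflow repeatable and auditable.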
Speakers
Chad Wiggins, Associate Vice President, Innovation Solutions and Platforms, Shure (Moderator)
James Bloomfield, CTO, MNC Software (Speaker)
Joel Snyder, President / Founding Director Emeritus, Audio Description Associates, LLC / Audio Description Project of the American Council of the Blind (Speaker)
Matthias Schnöll, Professor, Department of Media Technology, Anhalt University of Applied Sciences (Speaker)
Sebastian Franke, Research Associate, Anhalt University of Applied Sciences (Speaker)