AI is transforming broadcast operations from end to end—making content more accessible, more reusable, and more consistent in quality. These papers explore real-world applications of speech translation and subtitling for local communities, AI-governed rights automation that unlocks archive value, and next-generation QoE monitoring using perceptual metrics and AI-based quality estimation.
Practical lessons, technical realities, and measurable impact—across the full delivery chain.
Tuesday, April 21 | 1:30 – 1:50 p.m. | N256
Robin Herin, Robert Maas
Historically, ATSC has been one of the main standards for terrestrial television. Its impact on local communities, through the carriage of localized news and vital information via EAS, is undeniable. Today, most viewers of OTA content fall into two categories: cord-cutters, looking to cut costs without compromising their access to information, and local communities, who follow their local TV station for the content its crew creates for them. Through the use of LLMs (Large Language Models), we can now add audio and subtitle tracks, via speech-to-speech and speech-to-text, that correspond to the linguistic needs of the local communities around an ATSC 3.0 station. In this paper, we present our tests performed at a local ATSC station, examining the results along with the legal and technical challenges of using AI-based speech-to-speech and speech-to-text technology for ATSC 3.0.
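As a minimal sketch of the subtitling half of such a pipeline, the snippet below renders timed, machine-translated transcript segments into a WebVTT subtitle track. The segment tuples are assumed inputs from an upstream speech-to-text and translation step; the function names are illustrative, not from the paper.

```python
# Hypothetical sketch: turning timed, translated transcript segments
# into a WebVTT subtitle track for an additional language service.

def to_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{int(h):02d}:{int(m):02d}:{s:06.3f}"

def segments_to_webvtt(segments) -> str:
    """Render (start, end, text) tuples as a WebVTT document."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{to_timestamp(start)} --> {to_timestamp(end)}")
        lines.append(text)
        lines.append("")  # blank line terminates each cue
    return "\n".join(lines)

# Example: two translated segments (Spanish) for a local news clip.
vtt = segments_to_webvtt([
    (0.0, 2.5, "Buenas tardes y bienvenidos."),
    (2.5, 5.0, "Estas son las noticias locales."),
])
```

In a real deployment the resulting track would still need to be multiplexed into the ATSC 3.0 service alongside the original program audio and captions.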
Tuesday, April 21 | 1:50 – 2:10 p.m. | N256
Yo Narita, Tadashi Yura
The secondary use of broadcast archive assets is essential for maximizing content value. However, a significant barrier is the high workload of manually logging rights information. This process is not only labor-intensive but also suffers from a fundamental lack of governance: existing archive rights records are often of inconsistent quality due to operator-dependent data entry. Without clear oversight or standardized processes, this human-maintained database contains redundant entries, fragmented information, and ambiguous data. The result is assets with unclear rights status, preventing their reuse and leaving valuable content locked away. To solve this critical challenge, we report on GEAR (Governance Engine for Archives Rights), a multi-AI-agent system designed not just to automate, but to introduce essential governance into the rights logging workflow. The system employs a Human-in-the-Loop (HITL) approach, balancing AI-driven efficiency with the critical accuracy required for broadcast operations. GEAR's architecture is built on core AI agent functions that enforce governance:

Task Orchestration Agent: Acts as the central controller, managing information from the front end and other agents. It enforces standardized workflows by routing requests to the appropriate agent and managing sessions so that complex processes are handled reliably.

Rights Detection Agent: Uses a Large Multimodal Model (LMM) to analyze video content, identifying potential copyrighted materials with timestamps. It then queries the existing, human-maintained rights database via an MCP (Model Context Protocol) server to cross-reference the detected items, searches for any associated contact/liaison information already logged by operators, and scores these potential matches for relevance.

Output Formatting Agent: Processes the raw, scored data retrieved from the database. It enforces data quality by filtering, de-duplicating, and standardizing the often-inconsistent information, ensuring that all data presented to the operator meets a consistent quality standard before review.

Content-Rights Linking Agent: Facilitates the crucial HITL validation, which acts as the final governance checkpoint. It presents formatted data to the operator via a checklist for final "adoption." Once confirmed, this agent creates a link between the content and its verified rights in the archive database.

By introducing GEAR, we transform rights logging from a manual, inconsistent task into a governed, standardized, and efficient workflow. The system drastically reduces the logging burden while simultaneously enhancing data integrity, consistency, and reliability. This presentation will detail the GEAR architecture, its assistive agentic workflows, and the impact of AI-driven governance on maximizing the true value of broadcast archives.
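To make the Output Formatting Agent's role concrete, here is an illustrative sketch (not NHK's implementation) of the filtering, de-duplication, and standardization step it performs. The field names and the relevance-score threshold are assumptions for the example.

```python
# Illustrative sketch of the filter / de-duplicate / standardize step
# described for the Output Formatting Agent. Field names and the
# score threshold are assumptions, not details from the paper.

def format_rights_candidates(raw, min_score=0.5):
    """Filter low-relevance matches, de-duplicate by normalized title,
    and standardize fields before operator review."""
    seen = {}
    for item in raw:
        if item.get("score", 0.0) < min_score:
            continue  # drop low-relevance matches
        key = item["title"].strip().lower()  # normalize for de-duplication
        # keep the highest-scoring record per title
        if key not in seen or item["score"] > seen[key]["score"]:
            seen[key] = {
                "title": item["title"].strip(),
                "holder": item.get("holder", "unknown"),
                "score": round(item["score"], 2),
            }
    # best matches first in the operator's checklist
    return sorted(seen.values(), key=lambda r: r["score"], reverse=True)

candidates = format_rights_candidates([
    {"title": "Theme Song A ", "holder": "Label X", "score": 0.91},
    {"title": "theme song a", "holder": "Label X", "score": 0.62},
    {"title": "Stock Footage B", "score": 0.31},
])
# → one standardized entry for "Theme Song A"; the 0.31 match is filtered out
```

The de-duplicated, consistently formatted list is what the operator would then review in the HITL checklist before any content-rights link is created.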
Tuesday, April 21 | 2:10 – 2:30 p.m. | N256
Chaitanya Mahanthi, Nikolay Lebedev
Ensuring consistent video Quality of Experience (QoE) across modern distribution pipelines has become increasingly challenging as content passes through multiple transformations—from acquisition to encoding, transcoding, packaging, CDN delivery, and final device playback. This proposal presents a unified framework for measuring and monitoring video quality at each stage of the end-to-end workflow using a hybrid combination of full-reference, reduced-reference, and no-reference methods. The approach integrates industry metrics such as VMAF, VMAF NEG, and UVQ for perceptual scoring, along with other open-source AI/ML-based no-reference models that detect degradation without requiring a source feed. The system identifies compression artifacts, frame freezes, macroblocking, texture loss, audio issues, and representation inconsistencies across ABR ladders and devices. By correlating these objective measures with predicted QoE, the framework isolates the exact stage where degradation occurs—whether during ingest, encoder/transcoder processing, CDN propagation, or player rendering. The proposal highlights how combining traditional metrics with AI-driven quality estimation creates a more reliable and scalable methodology for Live, Linear, VOD, and FAST workflows. It introduces the concept of a “Quality Fingerprint,” enabling cross-stage attribution and trend analysis across the delivery chain. The goal is to offer broadcasters, streamers, and service providers a practical, unified solution to measure, compare, and optimize video quality from acquisition to device playback, improving overall viewing experience and operational efficiency.
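A minimal sketch of the "Quality Fingerprint" idea: if each stage of the chain records a perceptual score (e.g., VMAF) for the same content, comparing adjacent stages attributes degradation to a specific transition. The stage names and scores below are illustrative, not measurements from the paper.

```python
# Hypothetical sketch of cross-stage attribution from a per-stage
# "Quality Fingerprint". Scores are illustrative VMAF-like values.

def attribute_degradation(fingerprint):
    """Given ordered (stage, score) pairs along the delivery chain,
    return the stage transition with the largest score drop."""
    worst_drop, worst_edge = 0.0, None
    for (prev_stage, prev), (stage, score) in zip(fingerprint, fingerprint[1:]):
        drop = prev - score
        if drop > worst_drop:
            worst_drop, worst_edge = drop, (prev_stage, stage)
    return worst_edge, worst_drop

edge, drop = attribute_degradation([
    ("ingest", 96.1),
    ("transcode", 93.4),
    ("cdn", 93.2),
    ("player", 85.0),
])
# → the cdn → player transition shows the largest drop
```

In practice the fingerprint would carry a vector of metrics per stage (VMAF, freeze counts, audio flags, per-rendition scores), with the same adjacent-stage comparison applied per dimension.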
Speakers
Sun Sachs, Senior Vice President, Digital Products, Townsquare Media (Moderator)
Nikolay Lebedev, Architecture Lead, YouTube, Google (Speaker)
Robert Maas, Systems Engineer, Heartland Video Systems (Speaker)
Robin Herin, Director of Standardization, CTO Office, Ateme (Speaker)
Tadashi Yura, Engineer, NHK (Japan Broadcasting Corporation) (Speaker)
Yo Narita, Principal Engineer, NHK (Japan Broadcasting Corporation) (Speaker)