As UHD HDR becomes the default for premium live sports and entertainment, quality depends on more than capture—it depends on consistent, measurable delivery across complex conversion and QC chains. This session connects three timely challenges: interoperable, metadata-driven HDR master workflows to ensure reliable HDR/SDR conversions; new HDR-native metrics and visualization tools (including HSAM and the “Stop Waveform”) designed for modern monitoring and QC; and the rising importance of dialog intelligibility as the next audience-facing audio quality benchmark. Together, these papers offer practical methods to standardize quality and protect creative intent across today’s broadcast and streaming ecosystems.
Tuesday, April 21 | 9:30 – 9:50 a.m. | N256
David Touze
High Dynamic Range (HDR) production is fast becoming the standard for major live sports broadcasts, with global events such as the Olympics, FIFA World Cup, Formula 1 races, and the Roland Garros tennis tournament now delivered in UHD HDR. This shift to HDR brings a remarkable improvement in visual quality, but the media and entertainment industry faces ongoing challenges in fully harnessing its potential. In sports production, varying lighting conditions and sports disciplines demand unique calibrations for elements like skin tones, grass, or clay, requiring precisely set and maintained reference levels. Ensuring consistent reference levels and stable graphics across the production chain is crucial, especially in workflows that rely on a single HDR master and depend on regular conversions between HDR and SDR formats. After content capture, network operators distribute either the HDR or the SDR version depending on the platform, so maintaining consistent conversions is essential. Static 3D Look-Up Tables (LUTs) are preferred for their simplicity and reliability, preserving stable graphics and natural mid-tones, like realistic grass on a soccer field. However, the need for different LUTs for each event, or even multiple LUTs within a single event, can result in inconsistencies. The latest dynamic conversion techniques, built on content-adaptation paradigms, introduce control-point adjustments to guarantee that key reference levels, especially mid-tones, remain consistent through every conversion. Looking ahead, future AI-based conversion tools will also have to address the challenge of ensuring reference-level consistency. At present, no signaling mechanism indicates how to drive the right conversion by selecting the most appropriate configuration at any point in the streaming chain, and these different conversion techniques are not interoperable without specific signaling. This paper proposes an innovative metadata-driven method, formalized in the SMPTE Dynamic Range Conversion Characterization Metadata project (ST 2094-60), that characterizes all the necessary control points. The paper describes experimental results that demonstrate how to create, deliver, and apply this metadata in production and delivery workflows. The proposed solution ensures conversion consistency throughout the infrastructure and guarantees interoperability between conversion techniques. By using metadata to automate conversion processes and improve reliability and quality, the solution will greatly enhance creative freedom for productions. In summary, the evolution of conversion techniques and metadata-driven workflows marks a substantial leap forward. By adopting these innovations, HDR producers can deliver stunning, reliable content that continues to meet and exceed global audience expectations.
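To make the control-point idea concrete, here is a minimal sketch of how a downstream converter could pin key reference levels from carried metadata. The anchor values and the log-domain piecewise-linear curve are illustrative assumptions for this sketch, not the ST 2094-60 syntax or the authors' actual method.

```python
import numpy as np

# Hypothetical per-event control points carried as conversion metadata:
# (HDR luminance in nits, target SDR signal level 0..1). Anchors for
# mid-grey and reference white keep mid-tones stable across converters.
CONTROL_POINTS = [
    (0.01, 0.00),    # near-black
    (26.0, 0.40),    # ~18% grey anchor (illustrative value)
    (203.0, 0.95),   # HDR reference white per ITU-R BT.2408 (203 cd/m^2)
    (1000.0, 1.00),  # highlights roll off to SDR peak
]

def hdr_to_sdr(hdr_nits, points=CONTROL_POINTS):
    """Monotone piecewise-linear tone curve through the metadata anchors,
    interpolated in log-luminance so mid-tones keep their precision."""
    x = np.log10([p[0] for p in points])
    y = [p[1] for p in points]
    q = np.log10(np.clip(hdr_nits, points[0][0], points[-1][0]))
    return np.interp(q, x, y)

# Any converter (static LUT, dynamic, or a future AI-based tool) that
# honors the same anchors reproduces the same key levels:
print(hdr_to_sdr(np.array([26.0, 203.0, 600.0])))
```

The point of the sketch is interoperability: the metadata constrains the curve at the levels that matter (mid-grey, reference white), while each vendor's engine remains free to shape the curve between anchors.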
Tuesday, April 21 | 9:50 – 10:10 a.m. | N256
Lakshmanan Gopishankar
High Dynamic Range (HDR) video enhances the transfer of artistic intent, but its integration into existing workflows introduces significant complexity. This can lead to inconsistent content reproduction and a suboptimal end-user experience. While the industry has adapted tools for HDR, wider adoption has revealed the limitations of current monitoring, measurement, and QC techniques, highlighting a need for new methodologies specifically designed for HDR. This paper first examines the common HDR metadata metrics, MaxFALL and MaxCLL, analyzing their origins, applications, and significant limitations. We present use cases where these measurements prove inadequate or add counterproductive complexity. As a solution, we introduce a new set of HDR-native metrics, termed HDR Screen Area Measurements (HSAM), which are designed to map content creator intent to end-user display capabilities. We will demonstrate the application of HSAM in live production, editing, and QC. Beyond new metrics, HDR also necessitates new visualization tools. The traditional linear waveform, a staple for decades, becomes compressed and difficult to interpret with HDR signals, obscuring critical detail in both shadows and highlights. This paper, therefore, also introduces a novel logarithmic display, the 'Stop Waveform.' This technique allows operators to analyze the signal in a way that aligns more naturally with human perception of linear light, providing a clear, comprehensive view of the entire dynamic range in a single display. Together, these new metrics and visualization techniques provide a more robust framework for creating, monitoring, and delivering high-quality HDR content.
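As a rough illustration of both ideas, the sketch below decodes PQ code values with the published SMPTE ST 2084 EOTF, then derives an HSAM-style screen-area readout and a stops axis for a waveform. The abstract does not define HSAM's exact measurement or the Stop Waveform's reference level, so `hsam_like`, its thresholds, and the 18-nit anchor are assumptions, not the paper's definitions.

```python
import numpy as np

# SMPTE ST 2084 (PQ) EOTF constants
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_to_nits(code):
    """Normalized PQ code value (0..1) -> absolute luminance in cd/m^2."""
    p = np.power(np.clip(code, 0.0, 1.0), 1.0 / M2)
    return 10000.0 * np.power(np.maximum(p - C1, 0.0) / (C2 - C3 * p), 1.0 / M1)

def hsam_like(frame_nits, thresholds=(100.0, 203.0, 1000.0)):
    """Hypothetical HSAM-style metric: % of screen area at or above each
    luminance threshold (the paper's actual definition may differ)."""
    return {t: 100.0 * float(np.mean(frame_nits >= t)) for t in thresholds}

def to_stops(frame_nits, ref_nits=18.0):
    """'Stop Waveform'-style vertical axis: stops above/below a reference,
    i.e. log2(luminance / ref), so each stop occupies equal height."""
    return np.log2(np.maximum(frame_nits, 1e-4) / ref_nits)

# Example on a synthetic PQ-coded frame
frame = pq_to_nits(np.random.default_rng(0).uniform(0.0, 0.8, (1080, 1920)))
print(hsam_like(frame))
print(to_stops(frame).min(), to_stops(frame).max())
```

Note how the stops transform spreads shadows and highlights evenly: on a linear waveform, everything below 100 nits collapses into the bottom few percent of a 10,000-nit scale, whereas in stops each doubling of luminance gets equal display height.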
Tuesday, April 21 | 10:10 – 10:30 a.m. | N256
Paul Tapper
Around 2004, the primary audio-related complaints to the FCC concerned sudden loudness changes, which drove viewers to switch channels and reduced broadcaster revenue. In response, the ATSC, ITU, and EBU developed standards to measure and regulate loudness. These practices were adopted into QC workflows and ultimately became legal requirements in many regions. Today, the leading audio complaint for broadcasters and streaming platforms is poor dialog intelligibility. Studies show that unclear dialog reduces viewer engagement and program “stickiness,” directly impacting revenue models. Unlike loudness, there is no single industry-standard method for measuring dialog intelligibility. However, a promising approach developed by Fraunhofer IDMT, the “Listening Effort Meter” (LE Meter), is gaining adoption. The LE Meter uses a deep neural network to detect phonemes in an audio signal, and the confidence of detection serves as a proxy for listening effort: higher confidence indicates lower effort and better intelligibility. This method is largely language-agnostic and has been validated against human listening tests. Several manufacturers, including Steinberg, NUGEN Audio, and RTW, have implemented or are integrating the LE Meter into commercial products. A key question is whether dialog-intelligibility metrics should become part of formal QC requirements or remain a recommended creative-stage tool. Either way, meaningful presentation of LE Meter data remains a challenge. The algorithm generates readings every 30 ms, creating significant short-term variability. As with loudness measurement, time-windowing, smoothing, or ballistic behavior may be needed to produce usable information. For QC purposes, there is interest in creating a single program-level value or pass/fail indicator to answer: “Is dialog intelligibility acceptable?” Determining how to derive such a score from raw or processed LE data is not straightforward. Options include mean or median values or quartile-based measures, taken on raw or smoothed data. However, each method has limitations, and certain content scenarios may not be fully represented by simple statistical summaries. Identifying the most reliable and practical approach remains an open issue for the industry.
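Since the LE Meter itself is a commercial Fraunhofer IDMT component, the sketch below only illustrates the open question the paper raises: smoothing the 30 ms readings and comparing candidate program-level roll-ups. The window length, the scale direction (higher score = greater effort), the summary choices, and the pass/fail threshold are all assumptions for illustration.

```python
import numpy as np

def smooth_le(le_raw, hop_s=0.03, window_s=3.0):
    """Moving average over raw 30 ms listening-effort readings,
    analogous to short-term windowing in loudness metering."""
    n = max(1, int(round(window_s / hop_s)))
    return np.convolve(le_raw, np.ones(n) / n, mode="valid")

def program_rollups(le):
    """Candidate single-figure summaries the paper weighs; each captures
    a different failure mode, and none is complete on its own."""
    return {
        "mean": float(np.mean(le)),
        "median": float(np.median(le)),
        # "worst" decile, assuming higher score = greater listening effort
        "p90": float(np.percentile(le, 90)),
    }

# Example: one hour of synthetic 30 ms readings with one difficult passage
rng = np.random.default_rng(1)
raw = rng.normal(0.3, 0.1, 120_000)
raw[40_000:42_000] += 0.5  # a brief hard-to-follow scene (~1 minute)
scores = program_rollups(smooth_le(raw))
print(scores, "PASS" if scores["p90"] < 0.6 else "REVIEW")
```

The synthetic example shows the dilemma the abstract describes: a one-minute unintelligible passage barely moves the mean or median, while a percentile-based figure catches it but may over-penalize deliberately noisy creative content.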