On March 16, 2026, Jensen Huang walked onto the stage at SAP Center in San Jose in front of 30,000 people and said something that landed less like an announcement and more like a mandate.

"Every company in the world today needs to have an OpenClaw strategy, an agentic system strategy. This is the new computer."

He did not hedge. He did not say "eventually" or "companies should consider." He called it as significant as HTML and Linux, and he built an entire section of NVIDIA's flagship keynote around it.

The audience absorbed the headline. What most coverage missed is the follow-on question that actually matters for anyone building real operations: if AI agents are going to handle business workflows at scale, what infrastructure do they actually run on?

That is not a philosophical question. It is an engineering question. And the answer determines whether your agentic strategy is something that works in production or something that works in a boardroom presentation.

At RTC League, we build and operate the real-time communication infrastructure that AI voice agents run on. Our stack powers TelEcho, our AI voice agent platform, and serves clients across Pakistan, the UAE, and South and Southeast Asia. This piece reflects what we have seen in production deployments, not in demos.

What Jensen Actually Said at GTC 2026 and Why the Infrastructure Question Follows

NVIDIA held its flagship GTC Conference on March 16, 2026, in San Jose, California. The centerpiece software announcement was NemoClaw, NVIDIA's enterprise-grade agentic AI platform built on OpenClaw, the open-source AI agent framework that became the fastest-growing open source project in the history of computing within weeks of its launch.

Jensen Huang suggested this shift will transform software itself, predicting a move from SaaS to what he called "Agentic AI as a Service." For companies with voice-based customer operations, this shift is not abstract. A persistent AI agent connects to your tools, data, and communication channels, then acts on your behalf. It has I/O. It has memory. It has scheduling. It has tool access.

The distinction between a chatbot and an actual agent matters enormously for infrastructure decisions. A chatbot generates a response. An agent receives a goal, executes a sequence of actions, and closes the loop without a human in the middle, whether that loop involves scheduling an appointment, resolving a billing dispute, or routing a patient inquiry.
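That distinction can be sketched in a few lines. Everything below is illustrative, not any specific framework's API: a chatbot maps one input to one response, while an agent iterates toward a goal through tool calls until the loop closes.

```python
# Minimal sketch of the chatbot-vs-agent distinction. All names are
# illustrative, not a real framework's API.

def chatbot(message: str) -> str:
    """One input, one response, no loop."""
    return f"Here is some information about: {message}"

def agent(goal: str, tools: dict, max_steps: int = 10) -> list[str]:
    """Receives a goal, executes actions via tools, and closes the loop."""
    actions_taken: list[str] = []
    for _ in range(max_steps):
        # In production this decision comes from an LLM planning step;
        # here a stub simply picks the next unfinished tool.
        next_action = next((name for name in tools if name not in actions_taken), None)
        if next_action is None:        # goal satisfied: loop closes without a human
            break
        tools[next_action]()           # act on the user's behalf
        actions_taken.append(next_action)
    return actions_taken

# Example: an appointment-scheduling goal broken into tool calls.
steps = agent(
    "schedule follow-up appointment",
    {"check_calendar": lambda: None, "book_slot": lambda: None, "send_confirmation": lambda: None},
)
```

The loop, not the language model, is what makes it an agent; the rest of this piece is about what that loop runs on.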

For any agent that interacts through voice, the infrastructure question is immediate. Every word a user speaks has to travel, get processed, trigger a response, and return as audio, all within a timeframe that makes the conversation feel natural. That requires a communication stack built specifically for this purpose.


The Market Context: Why This Is Happening Now, Not Later

The numbers behind the agentic shift are not projections. They are the current reality.

| Metric | Stat | Source |
| --- | --- | --- |
| Global Voice AI Agents market value (2026) | $2.4B, growing to $47.5B by 2034 | Market.us |
| CAGR of Voice AI Agents market | 34.8% | Market.us |
| Businesses planning AI voice for customer service by 2026 | 80% | Nextiva |
| Production voice agent deployments growth (YoY) | 340% | AI Voice Research |
| Fortune 500 companies running production voice AI | 67% | AI Voice Research |
| Gartner forecast for contact center labor cost savings from conversational AI in 2026 | $80 billion | Gartner |
| Cost per AI voice agent call vs. human agent call | $0.40 vs. $7-12 | Teneo.ai |
| 3-year ROI for companies using voice AI | 331% to 391% | Forrester/PolyAI |
| Enterprises already piloting or scaling agentic AI | 62% | AssemblyAI 2026 |

Despite 79% of organizations reporting some level of AI agent adoption, 50% of agentic AI projects remain stuck in pilot stages. The gap between pilot and production is almost always an infrastructure problem, not a model problem.

Demand for inference infrastructure is expected to exceed $1 trillion by 2027, driven by the need to serve millions of users simultaneously. The compute layer is expanding. The question is whether the communication layer underneath AI agents can keep up.

WebRTC in the Age of AI Agents: Why Latency Is the Product

WebRTC is the open standard for real-time audio and video communication in browsers and applications. It has been in production since 2011 and underpins Google Meet, Zoom's browser client, telehealth platforms, and browser-based enterprise calling tools globally.

The global WebRTC market was valued at approximately $4.23 billion in 2022 and is projected to expand at a compound annual growth rate of around 30% through 2030.

For most of that history, WebRTC was a human communication protocol. Two endpoints, a person at each one. The shift happening now is that one of those endpoints is an AI agent. This changes the latency requirements completely.

The Latency Reality of Production AI Voice

Research shows human conversation operates on a 200-300ms response window, hardwired across all cultures. Exceeding this threshold triggers neurological stress responses that break conversational flow.

Users never complain about "latency." They report agents that "feel slow," "keep getting interrupted," or "don't understand when I'm done talking."

Here is what the production latency pipeline actually looks like for a cascaded AI voice agent:

| Pipeline Stage | Target Latency | Notes |
| --- | --- | --- |
| Audio transmission to media server | 50-100ms | Depends on geographic proximity |
| Speech-to-Text (STT) transcription | 100-200ms | Best-in-class: NVIDIA Parakeet at 72ms TTFT |
| LLM inference (first token) | 200-400ms | Varies by model size and load |
| Text-to-Speech synthesis | 75-200ms | ElevenLabs Turbo: 138ms TTFB |
| Audio return to caller | 50-100ms | Depends on routing hops |
| Total end-to-end target | Under 800ms | Sub-500ms for premium deployments |

The ideal turn-taking delay is about 200ms according to human conversational benchmarks. Infrastructure co-locating GPUs and telephony networks in global Points of Presence reduces round-trip time between speech and inference to sub-200ms, delivering faster responses and more natural conversations.

Component latencies are cumulative and sequential. STT at 200ms, LLM at 400ms, and TTS at 200ms look respectable individually, yet they already sum to 800ms, and network overhead, queuing delays, and turn detection can add another 200-400ms on top.

This is why infrastructure architecture decisions matter at the foundation level. Every additional hop between the media server and the AI processing layer compounds. At RTC League, the WebRTC stack we operate runs AI processing co-located with the media server rather than routing audio through external API endpoints. That architectural decision is what separates TelEcho's response latency from deployments built on general-purpose cloud infrastructure.
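The budget arithmetic is worth making concrete. A sketch, using the midpoints of the target ranges above and an assumed 100ms round trip per extra network crossing (an illustrative figure, not a measurement):

```python
# Back-of-envelope latency budget for one cascaded voice agent turn, using
# midpoints of the target ranges above. The 100ms-per-hop penalty for routing
# audio to an external AI API is an illustrative assumption.

PIPELINE_MS = {
    "audio_to_media_server": 75,   # 50-100ms
    "stt": 150,                    # 100-200ms
    "llm_first_token": 300,        # 200-400ms
    "tts": 140,                    # 75-200ms
    "audio_return": 75,            # 50-100ms
}

def turn_latency_ms(extra_hops: int = 0, hop_penalty_ms: int = 100) -> int:
    """Stages are sequential, so latencies add; every extra network crossing
    between the media server and the AI layer adds a round trip on top."""
    return sum(PIPELINE_MS.values()) + extra_hops * hop_penalty_ms

co_located = turn_latency_ms(extra_hops=0)   # AI runs next to the media server
remote_api = turn_latency_ms(extra_hops=3)   # STT, LLM, TTS each via external API
```

Under these assumptions, co-location lands inside the 800ms target while the remote-API path overshoots it before any queuing delay is counted, which is the architectural point in miniature.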

In the communications context, LiveKit is the closest thing to an operating system for the agentic computer. It is the leading open-source WebRTC media server for AI agent workloads, providing the session management, real-time audio processing hooks, and AI SDK integration that production deployments require. RTC League operates managed LiveKit infrastructure for organisations that need production-grade AI voice capability without the operational overhead of running and scaling media servers in-house.
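The shape of those hooks can be sketched structurally. Every class and callback name below is illustrative, not the LiveKit API; it shows only the pattern a media server exposes to an agent: streamed audio frames, turn detection, and a place to plug in STT, LLM, and TTS.

```python
# Structural sketch of media-server hooks for an AI agent. All names are
# illustrative; consult the actual LiveKit agents SDK for the real API.
from dataclasses import dataclass, field

@dataclass
class AgentSessionSketch:
    transcripts: list = field(default_factory=list)
    replies: list = field(default_factory=list)

    def on_audio_frame(self, frame: bytes, speech_ended: bool) -> None:
        """The media server pushes decoded audio; the agent streams it onward
        instead of waiting for the complete utterance."""
        self.transcripts.append(self._stt(frame))
        if speech_ended:                       # turn detection fires
            text = " ".join(self.transcripts)
            self.replies.append(self._tts(self._llm(text)))
            self.transcripts.clear()

    # Stubs standing in for streaming STT / LLM / TTS services.
    def _stt(self, frame: bytes) -> str: return frame.decode()
    def _llm(self, text: str) -> str: return f"reply to: {text}"
    def _tts(self, text: str) -> str: return text

session = AgentSessionSketch()
session.on_audio_frame(b"book me", speech_ended=False)
session.on_audio_frame(b"an appointment", speech_ended=True)
```

The design point is that transcription starts on the first frame, not the last, which is where streaming infrastructure earns back hundreds of milliseconds per turn.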

SIP Trunking: Where Most Agentic Voice Deployments Actually Break

Most businesses do not route customer calls through browser-based WebRTC sessions. Customers call a phone number. That call travels over the PSTN via SIP. For an AI voice agent to answer calls on standard phone numbers, the infrastructure needs a SIP trunk that bridges traditional telephony to the WebRTC and AI processing stack cleanly.

This is one of the most common points of failure in AI voice deployments, and it receives the least attention in most architecture discussions.

| SIP-to-WebRTC Failure Point | What Happens | Production Impact |
| --- | --- | --- |
| Codec mismatch at transcoding | G.711/G.729 negotiation failure | Audio artefacts, partial dropout |
| External SIP provider + separate WebRTC stack | Additional network crossing between providers | 80-150ms added latency per crossing |
| Jitter accumulation at the bridge | Buffer mismanagement between PSTN and WebRTC | Choppy audio, misfires on Voice Activity Detection |
| Unoptimised SIP trunk configuration | Default settings not tuned for AI workloads | Elevated latency on every call |

Enterprise SIP trunking for AI agent deployments needs to handle codec negotiation cleanly, maintain low jitter at the transcoding layer, and pass audio to the WebRTC media infrastructure without unnecessary routing hops.
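Codec negotiation happens in the SDP exchange, and it is checkable. A sketch of the inspection a bridge performs on an offer: the static payload-type numbers (PCMU=0, PCMA=8, G729=18) come from RFC 3551 and are standard; the bridge policy around them is an illustrative assumption.

```python
# Sketch of the codec check a SIP-to-WebRTC bridge runs on an SDP offer.
# Static payload types per RFC 3551; dynamic ones arrive via a=rtpmap lines.

def negotiated_codecs(sdp: str) -> list[str]:
    """Return the codec names offered on the audio m-line of an SDP body."""
    static = {"0": "PCMU", "8": "PCMA", "18": "G729"}  # RFC 3551 static PTs
    rtpmap: dict[str, str] = {}
    payloads: list[str] = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m=audio"):
            payloads = line.split()[3:]        # PTs follow m=audio <port> <proto>
        elif line.startswith("a=rtpmap:"):
            pt, name = line[len("a=rtpmap:"):].split(" ", 1)
            rtpmap[pt] = name.split("/")[0]
    return [rtpmap.get(pt, static.get(pt, f"PT{pt}")) for pt in payloads]

offer = """\
v=0
m=audio 49170 RTP/AVP 0 8 111
a=rtpmap:111 opus/48000/2
"""
codecs = negotiated_codecs(offer)
```

If the PSTN leg offers only G.711 and the WebRTC leg wants Opus, the bridge must transcode, and that transcoding step is where the jitter and artefacts in the table above originate.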


RTC League provides enterprise SIP trunking integrated directly with the managed LiveKit infrastructure we operate. The SIP trunk and the media server are part of the same stack, not separate vendor relationships with a network crossing between them. For businesses running legacy PBX systems, this is also the migration path that allows AI agents to be layered into existing telephony infrastructure without requiring a full system replacement.

HIPAA-Compliant AI Voice Agents: What Healthcare Organisations Need to Get Right Before Go-Live

Healthcare is one of the highest-ROI verticals for AI voice agents. The volume of routine administrative interactions is enormous, and a large proportion is predictable enough for an agent to handle: appointment scheduling, prescription refill routing, insurance verification, post-discharge follow-up, lab result notifications.

The compliance constraint is HIPAA. Here is the current state of the risk environment:

| Healthcare AI + Data Breach Stat | Figure | Source |
| --- | --- | --- |
| Average healthcare data breach cost (2025) | $7.42 million | IBM / HIPAA Journal |
| U.S.-specific average healthcare breach cost (2025) | $10.22 million | IBM |
| Healthcare breaches involving a business associate or vendor (doubled in one year) | 30% of all incidents | Verizon 2025 DBIR |
| PHI records stolen from third-party vendors vs. directly from hospitals | Over 80% | HIPAA Journal |
| Healthcare organisations hit by a cyberattack in the past 12 months | 93% | Ponemon 2025 |
| Shadow AI adding to breach costs | $670,000 average increase | IBM 2025 |
| HIPAA penalty range per violation per year | $100 to $50,000 per category | HHS |
| OCR breach notification deadline | 60 days from discovery | HIPAA |

The most important trend in healthcare breaches is not the total number. It is where breaches originate. Breaches involving a business associate or vendor doubled in one year, from 15% to 30% of all incidents. Over 80% of stolen PHI records were stolen from third-party vendors and software services, not directly from hospitals.

This has a direct implication for AI voice infrastructure. Every component in the stack that touches patient audio is a potential third-party breach point.

Here is what HIPAA compliance actually requires for an AI voice agent deployment, translated to infrastructure decisions:

End-to-end encryption throughout the audio pipeline. WebRTC handles transport-layer encryption via DTLS and SRTP natively. The gap is the processing layer. If a media server decrypts audio, ships it to an external AI API, and returns synthesised speech unencrypted or through an uncovered vendor, there is a PHI exposure gap regardless of what happens at the transport layer.

BAAs with every vendor in the pipeline. Every third-party API integrated into a healthcare AI application, including LLM evaluation providers, speech services, and infrastructure operators, must have a signed Business Associate Agreement. AWS, Azure, and Google Cloud offer HIPAA BAAs. General-purpose consumer AI APIs typically do not. Never send PHI to an API without a signed BAA.
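The "never without a BAA" rule can be made mechanical rather than procedural. A sketch of a dispatch guard that fails closed: the vendor names and the registry are hypothetical, not a statement about any real provider.

```python
# Illustrative guard enforcing the BAA rule at the dispatch layer: PHI never
# leaves the stack toward a vendor not recorded as BAA-covered. Vendor names
# and the registry itself are hypothetical.

BAA_SIGNED = {
    "stt-vendor-a": True,       # signed Business Associate Agreement on file
    "tts-vendor-b": True,
    "consumer-llm-api": False,  # consumer API, no BAA available
}

class BAAViolation(Exception):
    """Raised before any PHI crosses to an uncovered vendor."""

def send_phi(vendor: str, payload: bytes) -> bytes:
    if not BAA_SIGNED.get(vendor, False):   # unknown vendors fail closed
        raise BAAViolation(f"No signed BAA on file for {vendor!r}; PHI blocked")
    return payload                          # stand-in for the real API call

cleared = send_phi("stt-vendor-a", b"caller audio frame")
```

Failing closed matters: a vendor missing from the registry is treated as uncovered, which mirrors how OCR treats an unsigned agreement.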

Access controls and audit logging. The 2025 HIPAA Security Rule updates require mandatory encryption for all ePHI in storage and transit, continuous monitoring through automated systems for real-time risk assessments and audit logs, and inflation-adjusted penalties that can exceed $100,000 per violation annually. 67% of healthcare organisations admit they are not ready for these stricter standards.


Breach notification architecture in place before go-live. HIPAA requires notification of affected individuals and HHS without unreasonable delay and no later than 60 days after the breach is discovered. The monitoring and incident detection layer needs to be operational at launch.
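The 60-day clock is simple arithmetic, which is exactly why the detection layer has to exist at launch: the clock starts at discovery, not at remediation or go-live.

```python
# HIPAA Breach Notification Rule: notify no later than 60 days after the
# breach is discovered. The function is trivial; having a discovery date to
# feed it is the hard part, and that requires monitoring in place at launch.
from datetime import date, timedelta

def hipaa_notification_deadline(discovered: date) -> date:
    """Latest permissible notification date under the 60-day rule."""
    return discovered + timedelta(days=60)

deadline = hipaa_notification_deadline(date(2026, 3, 16))  # -> 2026-05-15
```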

RTC League's infrastructure supports HIPAA-compliant deployments. The WebRTC and media server stack is configurable for encrypted media handling throughout the pipeline, and we work with healthcare clients specifically on the architecture decisions that determine compliance before deployment, not as a retrofit after go-live.

The Complete Infrastructure Stack for a Production Agentic Strategy

Here is what the infrastructure layer of a real agentic voice strategy looks like:

| Infrastructure Layer | What It Does | What Breaks Without It |
| --- | --- | --- |
| WebRTC media server (managed LiveKit) | Carries audio between users and AI agents; provides session management and AI SDK integration | High latency, session instability, no AI processing hooks |
| Enterprise SIP trunking | Connects PSTN phone numbers to the AI stack cleanly | Codec failures, added latency at the bridge, audio artefacts |
| Co-located AI processing | Runs AI inference close to the media server, not via remote API calls | Compounding latency from remote API round trips |
| Voice Activity Detection + streaming audio | Processes audio in real time rather than waiting for complete utterances | Slower turn detection, higher perceived latency |
| Session management and observability | Monitors session health, latency metrics, audio quality | No visibility into production failures until customers complain |
| Compliance architecture | Encrypted media handling, BAA-covered vendor chain, audit logs | HIPAA exposure, regulatory liability, unsellable to regulated industries |

The average implementation cost of agentic AI runs to $890,000, producing 171% average ROI in organisations that have done the foundational work, and very different outcomes in organisations that have not. The infrastructure layer is the foundational work. The model layer gets almost all the attention. The communication stack is where production deployments succeed or fail.

RTC League is built around this full stack: managed WebRTC infrastructure, enterprise SIP trunking, AI voice through TelEcho, and the operational capability to run this in production at the reliability and compliance levels that customer-facing deployments require.

The question Jensen Huang put in front of every company at GTC 2026 is not a new one. It is the same question that has decided every major technology transition of the last thirty years. The companies that win the agentic transition are not the ones with the best model. They are the ones that get the infrastructure right before they scale agents on top of it.