Thursday, October 30, 2025

The missing data link in enterprise AI: Why agents need streaming context, not just better prompts

Enterprise AI agents today face a fundamental timing problem: They can't easily act on critical business events because they aren't always aware of them in real time.

The challenge is infrastructure. Most enterprise data lives in databases fed by extract-transform-load (ETL) jobs that run hourly or daily, which is ultimately too slow for agents that must respond in real time.

One potential way to tackle that challenge is to have agents interface directly with streaming data systems. Among the primary approaches in use today are the open-source Apache Kafka and Apache Flink technologies. There are several commercial implementations based on those technologies, too. Confluent, which is led by the original creators of Kafka, is one of them.

Today, Confluent is introducing a real-time context engine designed to solve this latency problem. The technology builds on Apache Kafka, the distributed event streaming platform that captures data as events occur, and open-source Apache Flink, the stream processing engine that transforms those events in real time.

The company is also releasing an open-source framework, Flink Agents, developed in collaboration with Alibaba Cloud, LinkedIn and Ververica. The framework brings event-driven AI agent capabilities directly to Apache Flink, allowing organizations to build agents that monitor data streams and trigger automatically based on conditions without committing to Confluent’s managed platform.

“Today, most enterprise AI systems can't respond automatically to important events in a business without someone prompting them first,” Sean Falconer, Confluent’s head of AI, told VentureBeat. “This leads to lost revenue, unhappy customers or added risk when a payment fails or a network malfunctions.”

The significance extends beyond Confluent’s specific products. The industry is recognizing that AI agents require different data infrastructure than traditional applications. Agents don't just retrieve information when asked. They need to observe continuous streams of business events and act automatically when conditions warrant. This requires streaming architecture, not batch pipelines.

Batch versus streaming: Why RAG alone isn't enough

To understand the problem, it's important to distinguish between the different approaches to moving data through enterprise systems and how they can connect to agentic AI.

In batch processing, data accumulates in source systems until a scheduled job runs. That job extracts the data, transforms it and loads it into a target database or data warehouse. This might occur hourly, daily or even weekly. The approach works well for analytical workloads, but it creates latency between when something happens in the business and when systems can act on it.

Data streaming inverts this model. Instead of waiting for scheduled jobs, streaming platforms like Apache Kafka capture events as they occur. Each database update, user action, transaction or sensor reading becomes an event published to a stream. Apache Flink then processes these streams to join, filter and aggregate data in real time. The result is processed data that reflects the current state of the business, updating continuously as new events arrive.
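To make the join, filter and aggregate steps concrete, here is a minimal sketch in plain Python. It is not Kafka or Flink code: the `Event` type, its field names and the `process` function are assumptions invented for illustration, and an in-memory list stands in for a stream.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One record on a stream. All field names here are hypothetical."""
    kind: str            # "payment" or "profile_update"
    customer_id: str
    amount: float = 0.0
    tier: str = ""


def process(events):
    """Consume events one at a time and yield updated per-customer state.

    filter:    drop payments with a non-positive amount
    join:      attach the latest known customer tier to each payment
    aggregate: keep a running spend total per customer
    """
    tiers = {}                    # latest profile per customer (join side)
    totals = defaultdict(float)   # running aggregate per customer
    for ev in events:
        if ev.kind == "profile_update":
            tiers[ev.customer_id] = ev.tier
        elif ev.kind == "payment" and ev.amount > 0:
            totals[ev.customer_id] += ev.amount
            yield {
                "customer_id": ev.customer_id,
                "tier": tiers.get(ev.customer_id, "unknown"),
                "total_spend": totals[ev.customer_id],
            }


# A list stands in for an unbounded stream of events.
stream = [
    Event("profile_update", "c1", tier="gold"),
    Event("payment", "c1", amount=40.0),
    Event("payment", "c2", amount=-5.0),   # removed by the filter
    Event("payment", "c1", amount=10.0),
]
results = list(process(stream))
```

Each result is emitted as soon as its triggering event arrives, which is the property that distinguishes this model from a scheduled batch job.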

This distinction becomes critical when you consider what kinds of context AI agents actually need. Much of the current enterprise AI discussion focuses on retrieval-augmented generation (RAG), which handles semantic search over knowledge bases to find relevant documentation, policies or historical information. RAG works well for questions like “What’s our refund policy?” where the answer exists in static documents.

But many enterprise use cases require what Falconer calls “structural context”: precise, up-to-date information from multiple operational systems stitched together in real time. Consider a job recommendation agent that requires user profile data from the HR database, browsing behavior from the last hour, search queries from minutes ago and current open positions across multiple systems.

“The part that we’re unlocking for businesses is the ability to essentially serve that structural context needed to deliver the freshest version,” Falconer said.

The MCP connection problem: Stale data and fragmented context

The challenge isn't simply connecting AI to enterprise data. Model Context Protocol (MCP), introduced by Anthropic in November 2024, already standardized how agents access data sources. The problem is what happens after the connection is made.

In most enterprise architectures today, AI agents connect via MCP to data lakes or warehouses fed by batch ETL pipelines. This creates two critical failures: The data is stale, reflecting yesterday’s reality rather than current events, and it's fragmented across multiple systems, requiring significant preprocessing before an agent can reason about it effectively.

The alternative, putting MCP servers directly in front of operational databases and APIs, creates different problems. Those endpoints weren’t designed for agent consumption, which can lead to high token costs as agents process excessive raw data and multiple inference loops as they try to make sense of unstructured responses.

“Enterprises have the data, but it’s often stale, fragmented or locked in formats that AI can't use effectively,” Falconer explained. “The real-time context engine solves this by unifying data processing, reprocessing and serving, turning continuous data streams into live context for smarter, faster and more reliable AI decisions.”

The technical architecture: Three layers for real-time agent context

Confluent’s platform encompasses three elements that can work together or be adopted separately.

The real-time context engine is the managed data infrastructure layer on Confluent Cloud. Connectors pull data into Kafka topics as events occur. Flink jobs process these streams into “derived datasets”: materialized views joining historical and real-time signals. For customer support, this might combine account history, current session behavior and inventory status into one unified context object. The engine exposes this through a managed MCP server.
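The idea of a derived dataset can be sketched as a small in-memory materialized view that is updated on every event and read on demand. This is an illustrative sketch only, not Confluent's implementation or API: the `ContextView` class, the event types and every field name are assumptions.

```python
class ContextView:
    """A toy materialized view over three hypothetical event types.

    State is updated incrementally as each event arrives, so a read
    never has to re-scan the source systems.
    """

    def __init__(self):
        self._history = {}     # customer_id -> list of past order ids
        self._session = {}     # customer_id -> SKU currently being viewed
        self._inventory = {}   # sku -> units in stock

    def apply(self, event):
        """Fold one event into the view."""
        kind = event["type"]
        if kind == "order_placed":
            self._history.setdefault(event["customer_id"], []).append(event["order_id"])
        elif kind == "page_view":
            self._session[event["customer_id"]] = event["sku"]
        elif kind == "inventory_update":
            self._inventory[event["sku"]] = event["units"]

    def get(self, customer_id):
        """Return one unified context object for a customer."""
        sku = self._session.get(customer_id)
        return {
            "customer_id": customer_id,
            "past_orders": list(self._history.get(customer_id, [])),
            "viewing_sku": sku,
            "units_in_stock": self._inventory.get(sku),
        }


view = ContextView()
for ev in [
    {"type": "order_placed", "customer_id": "c1", "order_id": "o-17"},
    {"type": "inventory_update", "sku": "bus-seat", "units": 4},
    {"type": "page_view", "customer_id": "c1", "sku": "bus-seat"},
    {"type": "inventory_update", "sku": "bus-seat", "units": 3},
]:
    view.apply(ev)

context = view.get("c1")
```

An agent reading `context` sees the account history, the live session and the latest stock level in a single lookup, rather than querying three systems and reconciling the answers itself.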

Streaming Agents is Confluent’s proprietary framework for building AI agents that run natively on Flink. These agents monitor data streams and trigger automatically based on conditions rather than waiting for prompts. The framework includes simplified agent definitions, built-in observability and native Claude integration from Anthropic. It's available in open preview on Confluent’s platform.
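What "trigger automatically based on conditions" might look like can be sketched without any framework. The example below is a hypothetical illustration, not the Streaming Agents or Flink Agents API: the `FailedPaymentTrigger` class, the event fields and the thresholds are all assumptions, and the returned dictionary stands in for whatever action a real agent would take.

```python
from collections import defaultdict, deque


class FailedPaymentTrigger:
    """Fire when a customer has `threshold` failed payments within a time window."""

    def __init__(self, threshold=3, window_seconds=60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)   # customer_id -> recent failure timestamps

    def observe(self, event):
        """Inspect one event; return an action dict if the condition is met, else None."""
        if event["type"] != "payment_failed":
            return None
        recent = self._failures[event["customer_id"]]
        recent.append(event["ts"])
        # Evict failures that have fallen out of the sliding window.
        while recent and event["ts"] - recent[0] > self.window_seconds:
            recent.popleft()
        if len(recent) >= self.threshold:
            count = len(recent)
            recent.clear()   # avoid re-firing on the same burst
            return {"action": "escalate", "customer_id": event["customer_id"], "failures": count}
        return None


trigger = FailedPaymentTrigger(threshold=3, window_seconds=60.0)
actions = []
for ts in [0, 10, 100, 110, 120]:
    result = trigger.observe({"type": "payment_failed", "customer_id": "c1", "ts": ts})
    if result is not None:
        actions.append((ts, result))
```

No prompt is involved: the two early failures age out of the window, and the action fires only when three failures land within 60 seconds of each other.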

Flink Agents is the open-source framework developed with Alibaba Cloud, LinkedIn and Ververica. It brings event-driven agent capabilities directly to Apache Flink, allowing organizations to build streaming agents without committing to Confluent’s managed platform. They handle operational complexity themselves but avoid vendor lock-in.

Competition heats up for agent-ready data infrastructure

Confluent isn't alone in recognizing that AI agents need different data infrastructure.

The day before Confluent’s announcement, rival Redpanda launched its own Agentic Data Plane, combining streaming, SQL and governance specifically for AI agents. Redpanda acquired Oxla’s distributed SQL engine to give agents standard SQL endpoints for querying data in motion or at rest. The platform emphasizes MCP-aware connectivity, full observability of agent interactions and what it calls “agentic access control” with fine-grained, short-lived tokens.

The architectural approaches differ. Confluent emphasizes stream processing with Flink to create derived datasets optimized for agents. Redpanda emphasizes federated SQL querying across disparate sources. Both recognize that agents need real-time context with governance and observability.

Beyond direct streaming competitors, Databricks and Snowflake are fundamentally analytical platforms adding streaming capabilities. Their strength is complex queries over large datasets, with streaming as an enhancement. Confluent and Redpanda invert this: Streaming is the foundation, with analytical and AI workloads built on top of data in motion.

How streaming context works in practice

Among the users of Confluent’s system is transportation vendor Busie. The company is building a modern operating system for charter bus companies that helps them manage quotes, trips, payments and drivers in real time.

“Data streaming is what makes that possible,” Louis Bookoff, Busie co-founder and CEO, told VentureBeat. “Using Confluent, we move data instantly between different parts of our system instead of waiting for overnight updates or batch reports. That keeps everything in sync and helps us ship new features faster.”

Bookoff noted that the same foundation is what will make gen AI useful for his customers.

“In our case, every action like a quote sent or a driver assigned becomes an event that streams through the system immediately,” Bookoff said. “That live feed of data is what will let our AI tools respond in real time with low latency rather than just summarize what already happened.”

The challenge, however, is how to understand context. When thousands of live events flow through the system every minute, AI models need relevant, accurate data without getting overwhelmed.

“If the data isn't grounded in what is happening in the real world, AI can easily make wrong assumptions and in turn take wrong actions,” Bookoff said. “Stream processing solves that by continuously validating and reconciling live data against activity in Busie.”

What this means for enterprise AI strategy

Streaming context architecture signals a fundamental shift in how AI agents consume enterprise data.

AI agents require continuous context that blends historical understanding with real-time awareness. They need to know what happened, what’s happening and what might happen next, all at once.

For enterprises evaluating this approach, start by identifying use cases where data staleness breaks the agent. Fraud detection, anomaly investigation and real-time customer intervention fail with batch pipelines that refresh hourly or daily. If your agents need to act on events within seconds or minutes of them occurring, streaming context becomes necessary rather than optional.
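One rough way to run that evaluation is to compare each use case's required reaction time with the worst-case staleness of the pipeline feeding it. The sketch below is a back-of-the-envelope aid, not a sizing tool: the function names, the assumption that worst-case staleness equals the refresh interval plus job duration, and the example reaction times are all illustrative.

```python
from datetime import timedelta


def worst_case_staleness(refresh_interval, job_duration=timedelta(0)):
    """An event arriving just after a batch run starts waits a full interval,
    then the duration of the next job, before it becomes visible."""
    return refresh_interval + job_duration


def needs_streaming(required_reaction, refresh_interval, job_duration=timedelta(0)):
    """True when the batch pipeline cannot meet the required reaction time."""
    return worst_case_staleness(refresh_interval, job_duration) > required_reaction


# Hypothetical reaction-time requirements per use case.
use_cases = {
    "fraud detection": timedelta(seconds=5),
    "customer intervention": timedelta(minutes=2),
    "weekly sales report": timedelta(days=7),
}

hourly_refresh = timedelta(hours=1)
job_runtime = timedelta(minutes=10)

decisions = {
    name: needs_streaming(required, hourly_refresh, job_runtime)
    for name, required in use_cases.items()
}
```

With an hourly refresh and a 10-minute job, data can be up to 70 minutes old, so the first two use cases fall outside what the batch pipeline can serve while the weekly report does not.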

“When you’re building applications on top of foundation models, because they’re inherently probabilistic, you use data and context to steer the model in a direction where you want to get some kind of outcome,” Falconer said. “The better you can do that, the more reliable and better the outcome.”