Inside the Research Shaping Agentic Data Environments

Artificial intelligence (AI) agents are rapidly reshaping how people interact with data, software, and digital systems. As these technologies become more capable, researchers are exploring what it means to design environments where agents can effectively discover, access, manage, and act on data. Agentic Data Environments are data systems designed for AI agents rather than human users, aiming to make agents both more capable and more trustworthy. This post serves as a guide to a series on Agentic Data Environments, based on a position paper developed by the Data, Agents, and Processes Lab (DAPLab).

The series explores a research agenda for Agentic Data Environments, examining the systems, safeguards, and data management techniques needed to support increasingly capable AI agents. The posts will cover approaches to enabling safe exploration through branching and data-flow control, as well as methods to expand agent capabilities through improved information management, retrieval, and data elicitation. Together, these topics address a central challenge in agentic systems: how to provide agents with the information they need while ensuring their actions remain transparent, controlled, and reliable. The goal is not simply to build more powerful agents, but to create environments that allow them to operate effectively and safely at scale.

The Need for Agentic Data Environments
As AI agents become increasingly capable of taking actions on behalf of users, researchers are beginning to ask whether today’s data systems are equipped to support them. While many current AI applications operate in a read-only capacity, future agents will need to interact directly with shared systems, make changes, and carry out consequential tasks. This shift introduces new challenges around reliability, accountability, and risk. In The Need for Agentic Data Environments, researchers argue that the next frontier is not simply building smarter models but creating data environments specifically designed for agents: environments that both expand what agents can accomplish and provide stronger guarantees around safety and control, supporting trustworthy agentic systems at scale.

Branchable Databases Aren’t Ready for Agentic Workloads
Database branching is emerging as a critical capability for allowing AI agents to safely explore potential actions before making changes to production systems. In this paper, the authors examine whether existing branchable databases are ready to support the scale of speculative execution that future agents will require. To evaluate current systems, they introduce BranchBench, a benchmark designed to test how databases handle large numbers of concurrent branches and complex exploration workloads. Their findings reveal significant trade-offs between branch management and query performance, with no existing system able to efficiently support both at the scale demanded by agentic applications. The paper highlights the need for new database architectures that can provide fast, low-cost branching while maintaining strong performance, laying the groundwork for the next generation of AI infrastructure.

StateFork: Give Agents a Rewind Button
As AI agents take on increasingly complex tasks, they need better ways to explore potential solutions without risking irreversible mistakes. Researchers introduce StateFork and Waypoint, two open-source tools that add branching and rollback capabilities to agent environments. Together, they allow agents to capture, restore, and explore different system states, enabling them to test alternative actions without repeatedly starting from scratch. The researchers argue that existing approaches, such as containers and virtual machines, are not well-suited to the rapid, iterative workflows required by autonomous agents. By providing lightweight, high-performance branching for entire computing environments, StateFork and Waypoint offer a foundation for safer, faster, and more effective agent exploration. The research presents the design of these systems, evaluates their performance, and discusses how branching may become a core capability of future agent infrastructure.
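To make the capture-and-restore idea concrete, here is a minimal Python sketch of an agent trying several actions from the same starting state and rolling back after each one. It is an illustration only and does not reflect the actual StateFork or Waypoint interfaces: the class `SnapshotEnvironment`, its `capture` and `restore` methods, and the `explore` helper are all hypothetical names, and the state is a plain in-memory dictionary rather than an entire computing environment. The sketch also relies on naive deep copies, which is the kind of cost that lightweight branching is meant to avoid.

```python
import copy


class SnapshotEnvironment:
    """Toy in-memory environment with capture/restore.

    Hypothetical illustration; not the StateFork or Waypoint API.
    """

    def __init__(self, state=None):
        self.state = dict(state or {})
        self._snapshots = {}
        self._next_id = 0

    def capture(self):
        """Save a deep copy of the current state and return its id."""
        snap_id = self._next_id
        self._next_id += 1
        self._snapshots[snap_id] = copy.deepcopy(self.state)
        return snap_id

    def restore(self, snap_id):
        """Replace the current state with a copy of a saved snapshot."""
        self.state = copy.deepcopy(self._snapshots[snap_id])


def explore(env, actions, score):
    """Try each action from the same starting state and keep the best one.

    The environment is rolled back after every attempt, so no action
    leaves a lasting effect on env.state.
    """
    start = env.capture()
    best_action, best_score = None, float("-inf")
    for name, action in actions.items():
        action(env.state)          # mutate the live state
        result = score(env.state)  # evaluate the outcome
        if result > best_score:
            best_action, best_score = name, result
        env.restore(start)         # rewind before the next attempt
    return best_action, best_score


env = SnapshotEnvironment({"balance": 100})
actions = {
    "spend_30": lambda s: s.update(balance=s["balance"] - 30),
    "earn_20": lambda s: s.update(balance=s["balance"] + 20),
}
best = explore(env, actions, score=lambda s: s["balance"])
```

After the loop, `env.state` is back to its starting value, so the agent can commit only the action it judged best instead of restarting from scratch for every alternative.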
