August 14, 2025
Securing the future of intelligent agents: Insights from the Agentic AI Summit
Reflections from the Agentic AI Summit 2025 on advancing safety, benchmarks, and resilience in the next generation of AI agents
This summer has been a busy season for the Cognizant AI Lab, filled with conversations that are shaping the future of artificial intelligence. From advancing responsible AI at the AI for Good Global Summit in Geneva, to exploring the boundaries of machine learning at ICML in Vancouver, to scaling evolutionary systems at GECCO in Málaga, each event added a new dimension to the dialogue.
We closed out the season at the Agentic AI Summit at UC Berkeley, hosted by the Berkeley Center for Responsible Decentralized Intelligence. The summit brought together researchers, entrepreneurs, policymakers, and industry leaders to discuss the rapidly evolving world of AI agents and multi-agent systems. While the agenda spanned topics from orchestration frameworks to scientific discovery, one theme stood out across the main stage, breakout sessions, and lightning talks: agentic safety.
Agentic safety takes center stage
As autonomous agents take on more responsibility in making decisions, executing complex workflows, and collaborating with other agents, the stakes for securing them, aligning them with user goals, and ensuring they operate responsibly continue to rise. The discussions made clear that building capable agents is only half the challenge. The other half is ensuring those agents are safe, trustworthy, and resilient enough for real-world use.
One of the most practical frameworks came from Nicole Nichols of Palo Alto Networks, who outlined the pillars of agent security. These included securing agents from external compromise, safeguarding entrusted assets, defending against malicious agents, and ensuring actions always align with user intent. She highlighted principles like security-first design, safe interactions between agents and environments, and robust recovery protocols. Techniques such as activity tracing, dynamic authorization, and containment strategies gave the audience a clear playbook for trustworthy deployment.
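Two of those techniques, activity tracing and dynamic authorization, can be illustrated with a short sketch. This is a hypothetical illustration of the principles, not code from any framework discussed at the summit; the class and method names are invented for the example.

```python
# Hypothetical sketch of activity tracing, dynamic authorization, and
# containment. Every action is checked against the agent's current
# scopes, every attempt is logged, and containment revokes access
# without destroying the audit trail.
from dataclasses import dataclass, field

@dataclass
class AgentSession:
    agent_id: str
    granted_scopes: set
    trace: list = field(default_factory=list)
    contained: bool = False

    def authorize(self, action: str, required_scope: str) -> bool:
        """Check each action against current scopes; log every attempt."""
        allowed = (not self.contained) and required_scope in self.granted_scopes
        self.trace.append((action, required_scope, allowed))
        return allowed

    def contain(self) -> None:
        """Containment: stop the agent while preserving its trace."""
        self.contained = True

session = AgentSession("report-writer", {"files:read"})
assert session.authorize("read quarterly report", "files:read")
assert not session.authorize("wire funds", "payments:execute")  # denied
session.contain()
assert not session.authorize("read quarterly report", "files:read")
```

The point of the sketch is that authorization is evaluated per action rather than granted once at startup, so a compromised or misbehaving agent can be contained mid-session with its full activity trace intact for review.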
These priorities were reflected in other sessions as well. Cisco’s AGNTCY initiative for an “Internet of Agents” and Horizon3’s red-team security testing both underscored the importance of resilience and defense. In Horizon3’s case, their agents were able to compromise a system in under an hour, a striking reminder that security measures must evolve as quickly as agent capabilities.
Our Lab reinforced this point in our lightning talk on the Cognizant Neuro® AI Multi-Agent Accelerator. Built for scale and resilience, the open-source solution pairs isolated request handling and the ability to coordinate thousands of agents with data-security features like sly-data that keep sensitive information safe. By integrating safety into the architecture from the start, the Accelerator demonstrates how scalability and security can work hand in hand. To get started, explore the project on GitHub.
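The core idea behind sly-data is that sensitive values travel alongside a request but never appear in the text a language model sees, while tools that genuinely need them can still read them. The following is a minimal sketch of that pattern with hypothetical function names, not the Accelerator's actual API.

```python
# Minimal sketch of the sly-data pattern: secrets ride alongside a
# request but are excluded from anything sent to an LLM. Tools that
# need a credential read it from the side channel instead.
def build_llm_prompt(user_message: str, sly_data: dict) -> str:
    # Only the user-visible message reaches the model; sly_data is
    # deliberately ignored here, so secrets cannot leak into prompts.
    return f"User request: {user_message}"

def call_tool(tool_name: str, args: dict, sly_data: dict) -> str:
    # Tools, unlike the model, may use the credentials they need.
    token = sly_data.get("api_token", "")
    return f"{tool_name}({args}) auth={'yes' if token else 'no'}"

request = {"message": "Summarize my account activity",
           "sly_data": {"api_token": "s3cret", "account_id": "A-123"}}

prompt = build_llm_prompt(request["message"], request["sly_data"])
assert "s3cret" not in prompt  # the model never sees the secret
result = call_tool("fetch_activity", {"account": "A-123"}, request["sly_data"])
```

The design choice is separation of channels: prompt construction and tool invocation take the same request, but only the tool path is permitted to touch the secret.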
The message across these discussions was clear: expanding agent capabilities must be matched by equally rigorous safety engineering.
Measuring what matters: Agentic benchmarks
If safety defines how we should build agents, benchmarking helps define whether they’re actually effective, robust, and trustworthy. Several speakers addressed the need for standard benchmarks that reflect real-world conditions.
Subhrangshu Nandi of Amazon introduced SOP-Bench, a benchmark for evaluating agents that automate complex industrial standard operating procedures. Its design focuses on measuring accuracy, reliability, and adaptability — metrics that matter when agents move from lab settings into production environments.
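To make the distinction between those metrics concrete, here is a hypothetical scoring sketch, not SOP-Bench's actual harness: accuracy as the mean success rate across tasks, and reliability as the fraction of tasks an agent completes on every repeated run.

```python
# Hypothetical sketch of benchmark-style metrics for agent evaluation.
# run_results maps a task id to a list of booleans, one per repeated run.
def evaluate(run_results: dict) -> dict:
    per_task = {t: sum(r) / len(r) for t, r in run_results.items()}
    accuracy = sum(per_task.values()) / len(per_task)   # mean success rate
    # Reliability: fraction of tasks that succeed on *every* run.
    reliability = sum(all(r) for r in run_results.values()) / len(run_results)
    return {"accuracy": accuracy, "reliability": reliability}

results = {"sop_invoice": [True, True, True],
           "sop_returns": [True, False, True]}
metrics = evaluate(results)  # reliability < accuracy when runs are flaky
```

The gap between the two numbers is exactly what matters in production: an agent that passes a procedure two times out of three looks decent on accuracy but fails the consistency bar that real operations require.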
Other sessions, such as Tapan Shah’s work on healthcare scenarios and stress testing, and ResearchTown’s simulation of a research community using LLM agents, showed how evaluation frameworks can reveal both the strengths and weaknesses of agent systems. Without shared and transparent metrics, it will be difficult to compare approaches, inspire trust, or accelerate adoption.
Looking ahead
The Agentic AI Summit was less about showcasing what is possible today and more about establishing the guardrails for what comes next. The path to widespread adoption will be shaped by two parallel priorities: advancing capabilities and defining standards for safety and evaluation.
For our Lab, these are not abstract ideas but active priorities. From architectures that protect sensitive data to scalable multi-agent systems and robust evaluation methods, we are committed to ensuring that the next generation of AI agents is as trustworthy as it is capable. The conversations in Berkeley were not the conclusion of a season of events, but a starting point for the work ahead.
Daniel Fink is an AI engineering expert with 15+ years in tech and 30+ years in software — spanning CGI, audio, consumer devices, and AI.
Risto Miikkulainen is VP of AI Research at Cognizant AI Lab and a Professor of Computer Science at the University of Texas at Austin.