Researchers are directing significant effort toward creating powerful general-purpose agents by leveraging Foundation Models (FMs) like GPT and Claude[1]. Rather than relying on a single monolithic model call, these agents are often built as compound systems that integrate components such as chain-of-thought planning, tool usage, and self-reflection[1]. Yet these designs usually require extensive manual design and tuning by researchers and engineers[1].
However, history shows that hand-crafted solutions in machine learning are eventually replaced by more efficient, learned solutions[1]. Building on this premise, this work introduces a new research area called Automated Design of Agentic Systems (ADAS). ADAS aims to automatically invent novel building blocks and optimize entire agentic system designs[1]. The ultimate goal is to create increasingly powerful agents that outperform state-of-the-art hand-designed solutions[1].
ADAS automates the creation of agentic systems by using meta agents, which iteratively program better agents in code[1]. Because programming languages like Python are Turing complete, this approach can in principle learn and discover any agentic system, including novel prompts, control flows, and tool use[1]. The Meta Agent Search algorithm is one demonstration of this idea[1].
To operationalize ADAS, three components are essential:
Search Space: This defines which agentic systems can be represented. For example, some works optimize text prompts, while others explore graph structures or feed-forward networks[1].
Search Algorithm: This specifies how ADAS explores the search space. Effective algorithms balance rapid discovery of high-performance systems against the risk of getting stuck in local optima[1]. Variants include reinforcement learning and iterative generation by FMs.
Evaluation Function: Depending on ADAS's application, this function assesses candidate agents based on various criteria like performance, cost, and latency[1].
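The three components can be sketched in a few lines of Python. This is only an illustrative toy, not the paper's implementation: the names `Agent`, `Score`, `evaluate`, and `search` are hypothetical, the "agents" are plain string functions rather than FM-backed systems, and the search algorithm is a simple exhaustive scan over a finite search space.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical type: an agent maps a task input to an answer.
Agent = Callable[[str], str]


@dataclass(frozen=True)
class Score:
    accuracy: float  # fraction of tasks answered correctly
    cost: float      # assumed cost model: a fixed cost per agent call


def evaluate(agent: Agent, tasks: List[Tuple[str, str]], cost_per_call: float) -> Score:
    """Evaluation function: score one candidate agent on (input, answer) pairs."""
    correct = sum(agent(question) == answer for question, answer in tasks)
    return Score(accuracy=correct / len(tasks), cost=cost_per_call * len(tasks))


def search(space: Dict[str, Agent], tasks: List[Tuple[str, str]],
           cost_per_call: float = 1.0, cost_weight: float = 0.0) -> str:
    """Search algorithm: exhaustively evaluate the space and return the best agent's name."""
    def objective(name: str) -> float:
        s = evaluate(space[name], tasks, cost_per_call)
        return s.accuracy - cost_weight * s.cost
    return max(space, key=objective)


# Search space: a toy, finite set of stand-in agents.
space: Dict[str, Agent] = {
    "echo": lambda q: q,
    "upper": lambda q: q.upper(),
}
tasks = [("abc", "ABC"), ("x", "X")]
best = search(space, tasks)
```

Here `best` is `"upper"`, since only that stand-in agent solves the toy tasks. Real ADAS search spaces are far too large to enumerate, which is why the choice of search algorithm matters.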
Meta Agent Search is one of the initial algorithms within ADAS that operates entirely in a code space. The meta agent iteratively creates new agents, evaluates their performance, adds them to an archive, and uses this archive for subsequent iterations[1]. By continuously incorporating feedback and refining its approach, the meta agent can build progressively more effective agents. Initial evaluation has shown Meta Agent Search's ability to greatly outperform hand-designed agents across multiple domains, including coding, science, and math[1].
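The propose, evaluate, and archive loop described above can be sketched as follows. This is a minimal sketch under stated assumptions: `propose` stands in for the FM-based meta agent and `evaluate` for running a candidate on validation tasks; both names, and the toy stand-ins below, are hypothetical and not taken from the paper's code.

```python
from typing import Callable, List, Tuple

# Each archive entry pairs an agent's source code with its measured score.
Archive = List[Tuple[str, float]]


def meta_agent_search(
    propose: Callable[[Archive], str],   # stand-in for the FM-based meta agent
    evaluate: Callable[[str], float],    # stand-in for running the agent on tasks
    seed_agents: List[str],
    iterations: int,
) -> Archive:
    # Start the archive from a set of already-known agents.
    archive: Archive = [(code, evaluate(code)) for code in seed_agents]
    for _ in range(iterations):
        candidate = propose(archive)     # the meta agent sees the whole archive
        score = evaluate(candidate)
        archive.append((candidate, score))
    return archive


# Toy stand-ins so the loop runs without any model calls.
def toy_propose(archive: Archive) -> str:
    best_code, _ = max(archive, key=lambda entry: entry[1])
    return best_code + "+"               # "improve" the best agent so far


def toy_evaluate(code: str) -> float:
    return float(len(code))              # pretend longer programs score higher


archive = meta_agent_search(toy_propose, toy_evaluate, ["a"], iterations=3)
```

In this sketch every candidate is kept in the archive regardless of its score, so later proposals can build on any earlier design, not only the current best.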
One demonstration of Meta Agent Search's efficacy is the ARC (Abstraction and Reasoning Corpus) challenge. This task evaluates AI systems' general intelligence by requiring them to learn transformation rules from a few examples and apply them to new inputs[1].
To address ARC's challenges, the agent writes code that implements the transformation rule instead of outputting answers directly. The experiment compared Meta Agent Search against five state-of-the-art hand-designed agents[1]:
Chain-of-Thought (COT)
Self-Consistency with Chain-of-Thought (COT-SC)
Self-Refine
LLM Debate
Quality-Diversity
The best agent discovered in these Meta Agent Search runs used a multi-step feedback mechanism, repeatedly reviewing and refining its candidate answers. This process improved accuracy significantly compared to the baselines[1].
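To make the "write code, then refine it with feedback" idea concrete, here is a small sketch of how a candidate transformation rule might be checked against ARC-style example pairs. It is an assumption-laden illustration, not the discovered agent: `check_rule`, `flip_horizontal`, and `identity` are hypothetical names, and grids are represented as plain lists of integer rows.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]
Example = Tuple[Grid, Grid]  # (input grid, expected output grid)


def check_rule(transform: Callable[[Grid], Grid], examples: List[Example]) -> List[str]:
    """Run a candidate rule on every example and collect feedback on failures."""
    feedback: List[str] = []
    for i, (grid, expected) in enumerate(examples):
        try:
            got = transform(grid)
        except Exception as exc:  # generated code may crash; report it as feedback
            feedback.append(f"example {i}: raised {type(exc).__name__}: {exc}")
            continue
        if got != expected:
            feedback.append(f"example {i}: expected {expected}, got {got}")
    return feedback


# Two hand-written candidate rules standing in for FM-generated code.
def flip_horizontal(grid: Grid) -> Grid:
    return [row[::-1] for row in grid]


def identity(grid: Grid) -> Grid:
    return [row[:] for row in grid]


examples: List[Example] = [([[1, 0], [2, 0]], [[0, 1], [0, 2]])]
good_feedback = check_rule(flip_horizontal, examples)  # empty list: rule fits
bad_feedback = check_rule(identity, examples)          # one failure message
```

An empty feedback list means the rule reproduces every example; a non-empty list could be passed back to the model as input for the next refinement step.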
Meta Agent Search was also tested on four popular benchmarks: DROP for reading comprehension, MGSM for multilingual math, MMLU for multi-task problem-solving, and GPQA for advanced science questions[1]. It consistently discovered high-performing agents in all tested domains, outperforming prior state-of-the-art hand-designed agents by substantial margins[1].
For example, in reading comprehension tasks the algorithm improved F1 scores by 13.6 points out of 100, and in math tasks it increased accuracy by 14.4%[1]. The discovered agents also proved robust, maintaining superior performance when transferred across models and domains[1].
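For readers unfamiliar with the F1 metric used in reading comprehension, the following is a simplified token-overlap F1. It is an assumption for illustration: the official DROP scorer applies additional answer normalization and handles multi-span answers, which this sketch omits, and `token_f1` is a hypothetical helper name.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Simplified bag-of-tokens F1 between a predicted and a reference answer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


score = token_f1("the red car", "red car")
```

In this example precision is 2/3 and recall is 1, so the F1 score is 0.8 (80 on a 100-point scale).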
An important aspect of Meta Agent Search is the generalizability of the discovered agents. Experiments showed that agents optimized on one FM, like GPT-3.5, performed well when transferred to other models such as Claude-Sonnet and GPT-4[1]. This transferability illustrates these agents' robustness and their potential applicability to a wide array of tasks and environments.
Moreover, agents developed in specific domains, such as math, generalized well to non-math domains like reading comprehension and multi-task problem-solving. This ability to adapt and perform across varied areas underscores the broad utility and effective design of ADAS-generated agents[1].
While ADAS promises a fast track to developing advanced agentic systems, it also raises significant safety concerns. There is a pressing need to run untrusted code safely and ensure that the generated agents are honest, helpful, and harmless. Developing sandbox environments and incorporating principles from Constitutional AI might be crucial future steps[1].
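As a very rough illustration of the first concern, generated agent code can at least be run in a separate interpreter process with a time limit. This is a sketch, not a real sandbox: process isolation with a timeout does not restrict file, network, or memory access, and a production setup would need container- or VM-level isolation. The helper name `run_untrusted` is hypothetical.

```python
import subprocess
import sys
from typing import Optional, Tuple


def run_untrusted(code: str, timeout: float = 5.0) -> Tuple[Optional[int], str]:
    """Run code in a child interpreter.

    Returns (exit code, stdout), or (None, "") if the time limit was exceeded.
    NOTE: this limits run time only; it is NOT a security boundary.
    """
    try:
        result = subprocess.run(
            # -I runs Python in isolated mode (ignores env vars and user site-packages).
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None, ""
    return result.returncode, result.stdout


exit_code, output = run_untrusted("print(1 + 1)")
timed_out_code, _ = run_untrusted("while True: pass", timeout=0.5)
```

The first call returns the child's exit code and captured output; the second shows a runaway loop being terminated once the time limit expires.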
Potential future research areas in ADAS include higher-order ADAS for self-improving meta agents, introducing more existing building blocks into the search space, integrating multi-objective optimization, and developing more sophisticated evaluation functions[1]. Extending this approach to more complex real-world applications and understanding the emergence of complexity in human organizations are additional avenues[1].
The Automated Design of Agentic Systems (ADAS) represents an exciting new frontier in AI research. By automating the creation of complex, powerful agents, ADAS offers a promising path toward increasingly efficient and effective AI systems that can significantly advance various real-world applications[1].
By demonstrating superior performance across diverse domains and robust transferability across models, Meta Agent Search supports the case that learned agent designs can replace hand-designed ones[1].