With the continuous advancement of artificial intelligence (AI) technology, businesses are beginning to weigh whether to rely on a single AI agent or to build a multi-agent network that covers more functions. Recently, LangChain, the company behind the orchestration framework of the same name, conducted experiments aimed at investigating the performance limits of AI agents when faced with excessive instructions and tools.
In a blog post, LangChain detailed its experimental process, focusing on the core question: "Under what circumstances does the performance of a ReAct agent decline when asked to handle too many instructions and tools?" To answer this question, the research team chose the ReAct agent framework, as it is considered "one of the most fundamental agent architectures."
In the experiment, LangChain aimed to evaluate the performance of an internal email assistant on two specific tasks: responding to customer inquiries and scheduling meetings. The researchers used a series of prebuilt ReAct agents built on LangGraph and tested them with several language models, including Anthropic's Claude 3.5 Sonnet, Meta's Llama-3.3-70B, and several OpenAI models such as GPT-4o.
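For reference, the prebuilt ReAct agents described here correspond to LangGraph's `create_react_agent` helper. The sketch below shows how such an agent can be wired up; the model identifier, placeholder tool, and prompt are illustrative assumptions, not LangChain's actual test configuration.

```python
# Minimal sketch: building a prebuilt ReAct agent with LangGraph.
# The model identifier and the placeholder tool are illustrative assumptions,
# not the exact configuration used in LangChain's experiments.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent


@tool
def lookup_customer(email: str) -> str:
    """Look up a customer record by email address (placeholder implementation)."""
    return f"Customer record for {email}"


model = ChatOpenAI(model="gpt-4o")  # any supported chat model can be swapped in
agent = create_react_agent(model, tools=[lookup_customer])

result = agent.invoke(
    {"messages": [("user", "Who is the customer behind jane@example.com?")]}
)
print(result["messages"][-1].content)
```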
The first step of the experiment was to test the email assistant's customer support capabilities: whether the agent could take in a customer email and produce an appropriate reply. LangChain also paid particular attention to the agent's performance on calendar scheduling, checking that it could accurately remember and follow specific instructions.
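Each of the two task domains implies its own tools. A hypothetical pair of tool definitions, whose names, signatures, and behavior are assumptions for illustration rather than the ones used in the experiment, might look like this:

```python
# Hypothetical tools for the two evaluated domains.
# Names, signatures, and behavior are illustrative assumptions only.
from langchain_core.tools import tool


@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send a reply email to a customer inquiry."""
    return f"Email sent to {to} with subject '{subject}'"


@tool
def schedule_meeting(attendees: list[str], start_time: str, duration_minutes: int) -> str:
    """Book a meeting on the calendar at the requested time."""
    return f"Meeting booked at {start_time} for {len(attendees)} attendees"
```

Handing both sets of tools, plus the instructions for each domain, to a single ReAct agent is exactly the setup whose limits the stress test probes.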
The researchers set up a stress test with 30 tasks in each of the two domains, customer support and calendar scheduling. The results showed that as agents were given more instructions and tools, their performance degraded sharply, and they often failed to call the necessary tools at all. For example, when instructions spanned up to seven domains, GPT-4o's performance dropped to 2%, while Llama-3.3-70B made frequent mistakes in the task tests, repeatedly failing to invoke the tool for sending emails.
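One simple way such failures can be detected automatically is to scan the agent's returned message history for the expected tool call. The helper below is an illustrative assumption about how a scoring rule might look, not LangChain's actual evaluation harness.

```python
# Illustrative check: did the agent ever call the required tool?
# This is a sketch of one possible scoring rule, not LangChain's harness.
from langchain_core.messages import AIMessage


def called_tool(result: dict, tool_name: str) -> bool:
    """Return True if any assistant turn in the run called `tool_name`."""
    for message in result["messages"]:
        if isinstance(message, AIMessage):
            if any(call["name"] == tool_name for call in message.tool_calls):
                return True
    return False


# Example: a customer-support task counts as failed if the agent
# never invoked the email-sending tool.
# passed = called_tool(agent.invoke({"messages": [("user", task_prompt)]}), "send_email")
```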
LangChain discovered that as the amount of context provided increased, the agents' ability to execute instructions significantly declined. Although Claude 3.5 Sonnet and several other models performed relatively well in multi-domain tasks, their performance gradually decreased as task complexity increased. The company stated that it will further explore how to evaluate multi-agent architectures in order to improve agent performance in the future.