Researchers at the Massachusetts Institute of Technology’s Computer Science and Artificial Intelligence Laboratory (CSAIL), in collaboration with Asari AI, have developed a framework called EnCompass to improve how AI agents work with large language models (LLMs). Many AI systems today use LLMs to perform complex tasks, from translating code to brainstorming ideas, but challenges arise when an LLM produces an imperfect result. EnCompass helps these agents automatically backtrack and explore multiple paths in search of the most effective solution, reducing the manual effort previously required to program such logic.
The core idea behind EnCompass is to separate an AI agent’s core workflow from its search strategy. Traditional agent programming often burdens developers with explicitly coding how an agent should revise plans when an LLM output isn’t optimal. EnCompass abstracts this search logic so programmers can simply mark “branchpoints” where decisions might vary, and the framework handles backtracking or parallel attempts to find better outcomes. This approach lets developers experiment with different search strategies, such as beam search or Monte Carlo methods, without deep modifications to their original code.
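To make the separation concrete, here is a minimal Python sketch of the general idea. It is not the EnCompass API: every name in it (`Branches`, `choose`, `first_success`, `best_of_all`, `toy_agent`) is hypothetical, and the fixed option lists merely stand in for multiple sampled LLM outputs. The agent only marks its decision points, and two interchangeable search routines decide how those points are explored.

```python
from typing import Any, Callable, Iterator, Sequence, Tuple


class NeedChoice(Exception):
    """Raised when a run reaches a branchpoint with no recorded decision."""


class Branches:
    """Replays a recorded list of decisions (hypothetical helper, not EnCompass)."""

    def __init__(self, path: Tuple[int, ...]):
        self.path = path   # indices already chosen at earlier branchpoints
        self.pos = 0       # how many branchpoints this run has passed
        self.width = 0     # number of options at the first undecided branchpoint

    def choose(self, options: Sequence[Any]) -> Any:
        if self.pos == len(self.path):
            self.width = len(options)
            raise NeedChoice
        pick = options[self.path[self.pos]]
        self.pos += 1
        return pick


def completed_runs(agent: Callable[[Branches], Any]) -> Iterator[Any]:
    """Depth-first enumeration of every complete run, produced lazily."""
    stack = [()]
    while stack:
        path = stack.pop()
        b = Branches(path)
        try:
            yield agent(b)  # the run finished without needing a new decision
        except NeedChoice:
            # Backtrack later into each alternative; reversed() keeps option order.
            stack.extend(path + (i,) for i in reversed(range(b.width)))


def first_success(agent, accept):
    """Strategy 1: stop at the first run whose result passes `accept`."""
    return next((r for r in completed_runs(agent) if accept(r)), None)


def best_of_all(agent, score):
    """Strategy 2: explore every run and keep the highest-scoring result."""
    return max(completed_runs(agent), key=score)


def toy_agent(b: Branches):
    # Each choose() stands in for an LLM call with several candidate outputs.
    a = b.choose([1, 2, 3])
    c = b.choose([3, 4, 5])
    return (a, c)


print(first_success(toy_agent, lambda r: sum(r) == 7))
print(best_of_all(toy_agent, lambda r: r[0] * r[1]))
```

Note that `toy_agent` is identical under both strategies; only the search routine wrapped around it changes. This sketch re-runs the agent from the start for each path, which is simple but wasteful, and a real framework would presumably avoid repeating expensive LLM calls.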
In practical tests, EnCompass substantially reduced the amount of code developers had to write. For example, when applied to a Python agent designed to translate an entire software repository from one language to another, it cut the extra code needed for search logic by about 82% compared to manually written search routines. Trying different search strategies also improved accuracy substantially, with the agents identifying more successful paths in tasks involving code translation and digital grid transformation.
Looking ahead, the MIT team plans to extend EnCompass to broader real-world applications, such as assisting in scientific experiment design, managing large code bases, and facilitating human-AI collaboration on complex problems. By modularizing the interplay between agents and LLM calls, EnCompass promises a more systematic and scalable way to harness AI agents in both research and industry settings.