MIT researchers have introduced an innovative framework that significantly improves the ability of large language models (LLMs) to tackle complex planning problems. This approach enables LLMs to function as “smart assistants,” effectively translating intricate planning challenges into optimization problems solvable by specialized software.
The researchers call their framework as LLM-Based Formalized Programming (LLMFP)
Imagine a coffee company trying to optimize its supply chain. The company sources beans from three suppliers, roasts them at two facilities into either dark or light coffee, and then ships the roasted coffee to three retail locations. The suppliers have different fixed capacity, and roasting costs and shipping costs vary from place to place.
The company seeks to minimize costs while meeting a 23 percent increase in demand.
Wouldn’t it be easier for the company to just ask ChatGPT to come up with an optimal plan? In fact, for all their incredible capabilities, large language models (LLMs) often perform poorly when tasked with directly solving such complicated planning problems on their own.
Rather than trying to change the model to make an LLM a better planner, MIT researchers took a different approach. They introduced a framework that guides an LLM to break down the problem like a human would, and then automatically solve it using a powerful software tool.

A user only needs to describe the problem in natural language — no task-specific examples are needed to train or prompt the LLM. The model encodes a user’s text prompt into a format that can be unraveled by an optimization solver designed to efficiently crack extremely tough planning challenges.
During the formulation process, the LLM checks its work at multiple intermediate steps to make sure the plan is described correctly to the solver. If it spots an error, rather than giving up, the LLM tries to fix the broken part of the formulation.
When the researchers tested their framework on nine complex challenges, such as minimizing the distance warehouse robots must travel to complete tasks, it achieved an 85 percent success rate, whereas the best baseline only achieved a 39 percent success rate.
The versatile framework could be applied to a range of multistep planning tasks, such as scheduling airline crews or managing machine time in a factory.
Source: MIT News