Optimal Termination in LLM-Based Decision Support

By Grace Liu & Norman Gottron

The integration of Large Language Models (LLMs) into high-stakes decision-making architectures—such as disaster response simulations and maternal health chatbots—requires a sophisticated balance between exploration (information seeking) and exploitation (task execution). In real-world scenarios, information is often underspecified; an agent that terminates its inquiry too early risks making decisions based on incomplete data, while one that asks excessive questions compromises communicative and computational efficiency. Our foundational research at AI-SDM addresses this by treating the decision to "stop asking and start acting" as an optimal termination problem. We provide a principled framework for LLMs to quantify their own information sufficiency.
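The trade-off described above can be framed as a one-step value-of-information comparison: terminate when the utility of acting on the current belief is at least the expected utility of asking one more question, net of its cost. The following is a minimal sketch of that rule; the belief representation, the `answer_model` interface, and the question cost are illustrative assumptions, not the AI-SDM implementation.

```python
# Termination as an expected-utility comparison (illustrative sketch).
# A "belief" here is a dict mapping candidate decisions to probabilities;
# an "answer_model" maps a belief to (answer probability, posterior) pairs.

def expected_utility_act(belief):
    """Utility of committing to the best decision under the current belief."""
    return max(belief.values())

def expected_utility_ask(belief, answer_model, question_cost):
    """Expected utility after one more clarifying question, minus its cost."""
    return sum(
        p_answer * expected_utility_act(posterior)
        for p_answer, posterior in answer_model(belief)
    ) - question_cost

def should_terminate(belief, answer_model, question_cost=0.05):
    """Stop asking once another question is not expected to pay for itself."""
    return expected_utility_act(belief) >= expected_utility_ask(
        belief, answer_model, question_cost
    )
```

Under this rule an agent keeps asking while questions are informative enough to shift the belief, and stops as soon as the expected gain drops below the cost of another turn.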

Figure: a schematic illustration of the termination behavior of models with and without our proposed approach. Off-the-shelf LLMs typically fail to recognize the best point to stop reasoning or questioning, either overshooting or undershooting the amount of information needed (a, b); our approach, CaRT, imbues them with the ability to identify this point correctly.

To overcome the limitations of off-the-shelf models, which frequently struggle to recognize when they lack the necessary context to solve a problem, we have developed CaRT (Categorical Rational Termination), a scalable, automated methodology for dataset augmentation. Rather than relying on expensive human annotations, our approach automatically labels existing reasoning datasets for information sufficiency, allowing us to train models with Supervised Fine-Tuning (SFT) to terminate strategically. The resulting termination behavior lets the agent weigh the expected future utility of continuing a dialogue against terminating to provide a final recommendation, effectively teaching the model to "know when it knows enough." We demonstrate the efficacy of this approach in simulated medical diagnosis conversations.
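One way to picture the automated labeling step: walk over each prefix of an existing reasoning dialogue and mark it "terminate" once the task is already solvable from that prefix, "ask" otherwise. The sketch below assumes a pluggable `can_solve` checker (in practice this could be a query to a base LLM); the exact labeling recipe and example format are illustrative assumptions, not the CaRT specification.

```python
# Illustrative sketch of automatic information-sufficiency labeling for SFT.
# `turns` is a list of dialogue turns; `can_solve(prefix)` returns True when
# the ground-truth answer is derivable from that prefix alone.

def label_sufficiency(turns, can_solve):
    """Produce one SFT example per dialogue prefix, labeled ask/terminate."""
    examples = []
    for i in range(1, len(turns) + 1):
        prefix = turns[:i]
        examples.append({
            "context": prefix,
            # Terminate as soon as the information is sufficient;
            # otherwise the supervised target is to keep asking.
            "target": "terminate" if can_solve(prefix) else "ask",
        })
    return examples
```

Fine-tuning on such labels supervises the model toward the earliest point at which acting is justified, rather than leaving termination to its default heuristics.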

The impact of this work is particularly evident in our Decision Language Agents designed for public health resource allocation. For instance, in maternal health programs, agents must translate nuanced human language commands into specific reward objectives for AI algorithms. A model equipped with optimal termination logic can engage in informed dialogue with health officials to resolve ambiguities in policy priorities before committing to a resource distribution plan. This ensures that the resulting AI-guided policies are not only efficient but also precisely aligned with human intent and societal goals.
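To make the command-to-objective step concrete, here is a deliberately simplified sketch of turning a health official's directive into reward weights, with ambiguity surfaced as a clarifying question rather than a silent default. The keyword table, group names, and equal-weighting scheme are all hypothetical assumptions for illustration; a real agent would use an LLM, not keyword matching.

```python
# Hypothetical mapping from policy language to reward-objective weights.
# When no priority can be extracted, the agent asks instead of guessing.

PRIORITY_KEYWORDS = {
    "first-time": "first_time_mothers",
    "high-risk": "high_risk_pregnancies",
    "rural": "rural_households",
}

def parse_directive(directive):
    """Return (reward_weights, clarifying_question); exactly one is None."""
    matched = [
        group for keyword, group in PRIORITY_KEYWORDS.items()
        if keyword in directive.lower()
    ]
    if not matched:
        # Ambiguous directive: resolve it in dialogue before committing.
        return None, "Which beneficiary group should the program prioritize?"
    weight = 1.0 / len(matched)
    return {group: weight for group in matched}, None
```

The design choice worth noting is the return contract: the agent either commits to a fully specified objective or returns a question, so an underspecified policy can never flow silently into the allocation algorithm.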

Moving forward, our efforts are focused on refining these termination behaviors within increasingly complex, multi-step information-seeking environments. By identifying the inherent trade-offs between downstream task performance and communication overhead, AI-SDM is establishing the theoretical and empirical groundwork necessary for deploying autonomous agents in dynamic, resource-constrained environments. These advancements ensure that as AI becomes more integrated into societal decision-making, it remains a reliable and efficient partner for human experts.