Enabling Agentic Workflows in LLMs: Tool Augmentation, Function Calling, and Automated Task Completion

To tackle complex, multi-step tasks like building a flight booking system, generating a presentation, or analyzing a personal GitHub repository, we can design multi-agent systems. These systems model a collaborative human environment where different specialists work together. For example, a task might be broken down and assigned to different agents, each with a specific role, such as:

1. Project Manager Agent: Responsible for planning the overall task, breaking it down into sub-tasks, and coordinating other agents.
2. Worker/Developer Agent: Executes specific sub-tasks, such as writing code, searching for information, or making calculations.
3. Tester/Verifier Agent: Reviews the output of other agents, checks for correctness, and provides feedback for revision.
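The division of roles described above can be sketched as a small control loop. This is only an illustration under stated assumptions: the names `run_pipeline`, `plan`, `work`, `verify`, and `Review` are hypothetical, and the three agents are plain stub functions standing in for LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    """Verdict returned by the (hypothetical) verifier agent."""
    approved: bool
    feedback: str = ""


def run_pipeline(
    task: str,
    plan: Callable[[str], List[str]],
    work: Callable[[str, str], str],
    verify: Callable[[str, str], Review],
    max_revisions: int = 2,
) -> List[str]:
    """Plan the task, execute each sub-task, and revise on verifier feedback."""
    results = []
    for sub_task in plan(task):  # manager: break the task into sub-tasks
        feedback = ""
        for _ in range(max_revisions + 1):
            output = work(sub_task, feedback)   # worker: execute the sub-task
            review = verify(sub_task, output)   # verifier: check the output
            if review.approved:
                break
            feedback = review.feedback          # send feedback back to the worker
        results.append(output)
    return results


# Stub agents (assumptions for illustration; real agents would call an LLM).
def plan(task: str) -> List[str]:
    return [f"{task}: step {i}" for i in (1, 2)]


def work(sub_task: str, feedback: str) -> str:
    suffix = " (revised)" if feedback else ""
    return f"done {sub_task}{suffix}"


def verify(sub_task: str, output: str) -> Review:
    # Reject the first attempt at step 2 to exercise the revision loop.
    if "step 2" in sub_task and "(revised)" not in output:
        return Review(approved=False, feedback="please revise")
    return Review(approved=True)


results = run_pipeline("book flight", plan, work, verify)
print(results)
```

Capping revisions with `max_revisions` is a design choice in this sketch, so that a worker and verifier that never agree cannot loop forever.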

The foundation for building these effective multi-agent systems is enabling Large Language Models (LLMs) to overcome their inherent limitations by skillfully using a variety of external tools.

Questions

1. How can we teach an LLM to use tools to complete a given task?
2. How can we make tools useful to LLMs?
3. How can we integrate or expose APIs to LLMs so that they can make API function calls and use the responses to complete complex tasks?
4. How can LLMs be taught to use any API, such as those for booking flights or transferring money?
5. How can LLMs solve complex, multi-tool tasks, like booking a flight or creating a presentation, by planning and executing a series of actions?

Objective

- Incorporate tools like calculators or search engines during the fine-tuning of LLMs, so that the model learns how to leverage specific tools.
- Enable LLMs to use any API, such as those for booking flights or transferring money.
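One common way to expose an API to a model is to describe each function with a name, a description, and a parameter schema, and then execute whatever structured call the model emits. The sketch below assumes a JSON call format with `name` and `arguments` keys; `search_flights`, `TOOLS`, and `dispatch` are hypothetical names, and the flight data is a hard-coded stand-in rather than a real API.

```python
import json


def search_flights(origin: str, destination: str, date: str) -> list:
    """Hypothetical stand-in for a real flight-search API."""
    return [{"flight": "XY123", "origin": origin,
             "destination": destination, "date": date}]


# Tool registry: the description and schema are what the model would be shown.
TOOLS = {
    "search_flights": {
        "function": search_flights,
        "description": "Search for flights between two airports on a date.",
        "parameters": {
            "type": "object",
            "properties": {
                "origin": {"type": "string"},
                "destination": {"type": "string"},
                "date": {"type": "string"},
            },
            "required": ["origin", "destination", "date"],
        },
    }
}


def dispatch(call_json: str) -> str:
    """Run a model-emitted call and return a JSON string for the model to read."""
    call = json.loads(call_json)
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        return json.dumps({"error": f"unknown tool: {call.get('name')}"})
    args = call.get("arguments", {})
    missing = [p for p in tool["parameters"]["required"] if p not in args]
    if missing:
        return json.dumps({"error": f"missing arguments: {missing}"})
    return json.dumps({"result": tool["function"](**args)})


# Assumed model output: a structured call instead of free text.
model_output = json.dumps({
    "name": "search_flights",
    "arguments": {"origin": "DEL", "destination": "BLR", "date": "2025-01-15"},
})
response = dispatch(model_output)
print(response)
```

Returning errors as data instead of raising gives the model a chance to correct its own call on the next turn.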

What is an LLM?

An LLM is a model built on the Transformer architecture and trained on vast amounts of text to learn language patterns, meaning, and context. This enables it to perform NLP tasks such as answering questions, summarizing information, translating languages, and generating new text.

Decoder-Only Transformer: GPT-family models

Encoder-Only Transformer: BERT

Encoder-Decoder Transformer: T5

What is a Tool?

A tool is any external module that an LLM can interact with to acquire new information for a given input or call. The model interacts with the tool by generating an API call and receives the tool's output, which it then incorporates into its subsequent text generation.

Examples: Calculator, Search Engine, Code Interpreter, Calendar API, Database API, Knowledge Base.
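The generate-call, receive-output, continue-generating cycle can be illustrated with a toy example. The bracket syntax `[ToolName(argument)]`, the `fill_tool_calls` helper, and the tiny in-memory knowledge base are all assumptions made for this sketch, not a format prescribed by any particular model.

```python
import re
from typing import Callable, Dict

# Toy knowledge base standing in for a real external tool.
KNOWLEDGE_BASE = {"capital of France": "Paris"}

# Registry mapping tool names to callables that take and return text.
TOOLS: Dict[str, Callable[[str], str]] = {
    "KnowledgeBase": lambda query: KNOWLEDGE_BASE.get(query, "unknown"),
}

# Matches calls written as [ToolName(argument)] inside generated text.
CALL_PATTERN = re.compile(r"\[(\w+)\((.*?)\)\]")


def fill_tool_calls(text: str) -> str:
    """Replace each embedded tool call with the output of the named tool."""
    def run(match: re.Match) -> str:
        name, argument = match.group(1), match.group(2)
        tool = TOOLS.get(name)
        # Leave calls to unregistered tools untouched.
        return tool(argument) if tool else match.group(0)
    return CALL_PATTERN.sub(run, text)


generated = "The capital of France is [KnowledgeBase(capital of France)]."
print(fill_tool_calls(generated))
```

In a real system the tool output would be fed back into the model's context so that later tokens can depend on it; here it is simply spliced into the text.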

Why do we need to augment large language models with tools?

Large Language Models (LLMs), despite their impressive capabilities, suffer from several limitations.

1. Inability to Access Real-Time Information: LLMs are trained on a fixed dataset. Their knowledge is frozen at the time of training, so they cannot access real-time information or events that occurred after their training cut-off, and they may hallucinate when asked about them. Solution: By integrating a search engine, an LLM can query the internet in real time to find current information, verify facts, and provide up-to-date, evidence-backed answers.

2. Lack of Precise Mathematical and Logical Reasoning: LLMs often struggle with tasks that require precise symbolic or mathematical reasoning, such as arithmetic, logic puzzles, or multi-step calculations. Their "reasoning" is pattern matching, not a formal computational process. Solution: Given access to a calculator or, more powerfully, a Python interpreter, an LLM can offload mathematical and logical computation to a tool designed for precision. The LLM's role becomes identifying the need for a calculation, formulating it correctly for the tool, and integrating the precise result back into its response.
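As a rough illustration of the calculator tool mentioned above, the sketch below evaluates arithmetic expressions by walking Python's `ast` tree instead of calling `eval`, so a model-written expression can only use basic arithmetic. The function name `calculate` and the set of supported operators are assumptions for this example.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Only these operators are allowed; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculate(expression: str) -> Number:
    """Evaluate an arithmetic expression made of numbers, + - * / and parentheses."""
    def _eval(node: ast.AST) -> Number:
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


# The model formulates the expression; the tool computes the exact result.
print(calculate("1234 * 5678"))  # 7006652
```

Function calls, names, and attribute access all fall through to the `ValueError`, which is the reason for walking the tree by hand in this sketch.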

How can LLMs be aligned with tools using fine-tuning?

References and Articles

This article is based on my learning from the following lectures and references.

1. Tool Augmentation: Incorporating Tools during finetuning
2. Function Calling: Teaching LLMs to Use any APIs
3. Agentic Workflows: Automating Complex, Multi-step Tasks