Module 1 (Day 2) — From LLMs to Agents | AnyCompany Leader Workshop

The Question Every Finance Leader Should Answer

Your team uses Claude or AnyCompany GPT every day to draft variance commentary, summarise contracts, and classify expense categories. That is chatbot work. When the work involves checking a system, deciding what to do, and taking action across multiple steps, chatbots fall over. That gap is what agents are built to close.

💬

What chatbots are great at

Reading text, drafting text, transforming text. Single-turn, single-pass, single-output tasks where the human reads the answer and decides what to do next.

🚧

Where chatbots run out of road

The moment the task needs fresh data ("what's the current FX rate?"), system access ("look up this PO"), or action ("flag this invoice for approval") — the chatbot can only describe what should happen, not make it happen.

🤖

What an agent adds

An agent is an LLM plus tools, plus a reasoning loop. It can check live systems, decide which step is next, take that step, observe the result, and decide what to do next — all without re-prompting the human at each step.
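To make the loop concrete, here is a minimal sketch of "decide, act, observe, repeat". It is an illustration under stated assumptions, not a production design: `stub_model` is a hand-written stand-in for the LLM, the two tool functions return canned strings instead of calling real systems, and the step limit of 5 is an arbitrary safety cap.

```python
from typing import Callable

# Hypothetical tools: in practice these would wrap real system calls.
def check_payment_ledger(invoice_id: str) -> str:
    return f"{invoice_id}: refund issued, awaiting settlement"

def query_bank_rail_status(rail: str) -> str:
    return f"{rail}: operating normally"

TOOLS: dict[str, Callable[[str], str]] = {
    "check_payment_ledger": check_payment_ledger,
    "query_bank_rail_status": query_bank_rail_status,
}

def stub_model(goal: str, observations: list[str]) -> tuple[str, str]:
    """Stand-in for the LLM: chooses the next step from what it has seen so far."""
    if len(observations) == 0:
        return ("check_payment_ledger", "INV-4521")
    if len(observations) == 1:
        return ("query_bank_rail_status", "SGD-RTGS")
    return ("finish", "Refund issued; rail is healthy; settlement pending.")

def run_agent(goal: str, max_steps: int = 5) -> tuple[str, list[str]]:
    """Run the loop until the model says it is finished or the step cap is hit."""
    observations: list[str] = []
    for _ in range(max_steps):
        action, argument = stub_model(goal, observations)  # decide
        if action == "finish":
            return argument, observations
        result = TOOLS[action](argument)                   # act
        observations.append(result)                        # observe
    return "Stopped: step limit reached.", observations

answer, trace = run_agent("Where is the refund for INV-4521?")
print(answer)
for line in trace:
    print(" -", line)
```

The human asks once; the loop, not the human, carries the intermediate results from one step to the next.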

A Concrete Failure — "Where's My Refund?"

Imagine a finance ops analyst asks an LLM: "Vendor #4521 says we processed an invoice for SGD 23,800 last Tuesday but the refund hasn't landed. The vendor wants confirmation today. What should I tell them?"

A standalone LLM can write a polite reply — but it cannot check the payment ledger, query the bank rail, verify the chargeback queue, or trigger a status update. Watch the failure cascade:

// Vendor escalation received...
check_payment_ledger("INV-4521") → ❌ NO API ACCESS
query_bank_rail_status("SGD-RTGS") → ❌ NO REAL-TIME DATA
post_status_update("vendor_4521") → ❌ NO ACTION CAPABILITY
// Best the LLM can offer: "I'm sorry for the inconvenience. Please contact support..."

Why "Just Ask the LLM Harder" Doesn't Work

The temptation is to write a longer, more detailed prompt. That fixes nothing. The chatbot still has no eyes on your systems, no hands on your tools, and no working memory between calls. The wall is structural, not prompt-quality.

No real-time data: Can't see the current FX rate, today's chargeback queue, or last hour's transactions.

No actions: Can't post a journal entry, flag a transaction, or send an SLA-bound notification.

No memory between calls: Forgets what it just did, so the context has to be re-supplied on every turn, which adds up at scale.

Hallucinations under pressure: When uncertain, invents plausible-but-wrong numbers, citations, or policy references.

One-shot reasoning: Can't reliably follow "check this, then if X, do Y, otherwise do Z" across many steps.

Cost at scale: Long prompts × many concurrent users = a budget incident waiting to happen.
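The "long prompts × many users" multiplication is easy to work through. The sketch below counts input tokens only, and every number in it is a made-up placeholder (prompt sizes, call volume, headcount, and the price per million tokens are assumptions, not any vendor's actual rate). Swap in your own figures.

```python
def monthly_prompt_cost(
    tokens_per_prompt: int,
    calls_per_user_per_day: int,
    users: int,
    price_per_million_tokens: float,
    working_days: int = 22,
) -> float:
    """Input-token cost per month; output tokens are ignored for simplicity."""
    total_tokens = tokens_per_prompt * calls_per_user_per_day * users * working_days
    return total_tokens / 1_000_000 * price_per_million_tokens

# Placeholder scenario: 300 users, 20 calls a day, 3.00 per million input tokens.
short_bill = monthly_prompt_cost(500, 20, 300, 3.0)    # concise prompt
long_bill = monthly_prompt_cost(8_000, 20, 300, 3.0)   # "ask harder" prompt

print(f"Short prompts: {short_bill:,.2f} per month")
print(f"Long prompts:  {long_bill:,.2f} per month")
print(f"Multiplier:    {long_bill / short_bill:.0f}x")
```

Under these placeholder inputs the bill scales linearly with prompt length: a prompt 16 times longer costs 16 times more, for every user, every call.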

📈 The leadership read: Chatbots are a productivity tool — they make humans faster at single tasks. Agents are an operational tool — they take work off the human's plate entirely. The shift isn't about "better AI". It's about which jobs you can hand to a system that runs without you.

The Four Ideas That Turn an LLM Into an Agent

Agents didn't appear from a single breakthrough. They emerged from four innovations that, combined, broke through the chatbot wall.

🔌

1. Tool Use (Function Calling)

The LLM can call your APIs and read the result.

🧠

2. Chain-of-Thought Reasoning

Break a goal into steps, then run them.

👁️

3. Multimodal Understanding

Read PDFs, photos, screenshots — not just text.

🏗️

4. Agent Frameworks

The plumbing that runs the loop for you.

🔌 Tool Use — The Core Unlock

By default an LLM only produces text. Tool use changes the contract: the model is told "these functions exist — get_invoice(id), check_po(id), post_journal(entry) — call any of them when you need to". The LLM responds with a structured tool call; the framework executes it and feeds the result back. The loop continues until the goal is met.

For finance: Your existing systems — ERP queries, bank rail status, the chargeback ledger, the spend cube — become things the agent can use. The LLM does the reasoning ("which invoices need follow-up?"), your code does the action (the actual database query, the actual post). You keep control of the action layer.
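The contract described above can be sketched in a few lines. This is a simplified illustration, not any vendor's wire format: the JSON shape (`name` plus `arguments`), the `REGISTRY` allow-list, and the canned return values are all assumptions made for the example. The point it shows is the control boundary: the model can only request a call, and your code decides whether it runs.

```python
import json

# Hypothetical read-only tools backed by canned data.
def get_invoice(invoice_id: str) -> dict:
    return {"id": invoice_id, "amount_sgd": 23800, "status": "open"}

def check_po(po_id: str) -> dict:
    return {"id": po_id, "approved": True}

# The allow-list. post_journal is deliberately absent, so the model cannot trigger it.
REGISTRY = {"get_invoice": get_invoice, "check_po": check_po}

def execute_tool_call(raw: str) -> str:
    """Parse a structured tool call from the model, run it if allowed, return JSON."""
    call = json.loads(raw)
    name = call.get("name")
    if name not in REGISTRY:
        return json.dumps({"error": f"tool '{name}' is not allowed"})
    result = REGISTRY[name](**call.get("arguments", {}))
    return json.dumps(result)

# A call the framework will run, and one it will refuse.
print(execute_tool_call('{"name": "get_invoice", "arguments": {"invoice_id": "INV-4521"}}'))
print(execute_tool_call('{"name": "post_journal", "arguments": {"entry": "..."}}'))
```

Whatever string comes back is what gets fed to the model on the next turn, including the refusal.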

The Four Phases — From Answering to Operating

AI for the enterprise has moved through four distinct phases. Each one keeps everything from the previous, then adds a capability.

📝

LLMs

Text in → Text out

💬

GenAI Assistants

+ Memory & RAG

🤖

GenAI Agents

+ Tools & Reasoning

🌐

Agentic Systems

+ Multi-Agent

📝 Pure LLMs — Text Generation Only

The starting point. The model takes a prompt and returns a completion based on patterns it learned during training. Fast, cheap, and powerful for transformations like summarisation, classification, and drafting. But entirely isolated — no eyes on your systems, no hands.

AnyCompany finance reality: A pure LLM can draft an MD&A paragraph from numbers you paste, classify whether an expense is "T&E" or "Marketing", or rewrite a vendor letter for tone. Useful — but the numbers themselves still come from a human pulling reports.

A Single Direction — Increasing Autonomy

Each phase pushes more of the work onto the system and less onto the human. That's the only axis that matters: how much of the loop is the system running, and how much do you still have to do yourself?

Capability	📝 LLM	💬 Assistant	🤖 Agent	🌐 Agentic System
Reads your prompt	✅	✅	✅	✅
Remembers conversation	—	✅	✅	✅
Searches your documents	—	✅	✅	✅
Calls your systems	—	—	✅	✅
Plans multi-step work	—	—	✅	✅
Coordinates with other agents	—	—	—	✅
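One way to read the table is as a cumulative list: each phase inherits every capability of the phase before it and adds its own. The sketch below encodes that reading; the structure and the function name are illustrative choices, and the capability labels are copied from the table.

```python
# Each entry lists only what that phase ADDS on top of the previous one.
PHASES = [
    ("LLM", ["reads your prompt"]),
    ("Assistant", ["remembers conversation", "searches your documents"]),
    ("Agent", ["calls your systems", "plans multi-step work"]),
    ("Agentic System", ["coordinates with other agents"]),
]

def capabilities(phase_name: str) -> list[str]:
    """Everything a phase can do: its own additions plus all earlier ones."""
    collected: list[str] = []
    for name, added in PHASES:
        collected.extend(added)
        if name == phase_name:
            return collected
    raise ValueError(f"unknown phase: {phase_name}")

for name, _ in PHASES:
    print(f"{name}: {len(capabilities(name))} capabilities")
```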

What This Means for AnyCompany Finance

Every limitation of the LLM-only world maps to something finance teams already do by hand. Agents turn each one into a candidate for automation — with humans staying in the approval loop.

Before vs After — Vendor Escalation

Scenario	LLM only	Agent
"Where's our refund for INV-4521?"	Drafts a polite holding reply	Checks payment ledger + bank rail status, calculates revised settlement window, posts confirmation back to vendor portal
"Variance vs forecast — Q3 EBIT"	Summarises numbers you paste	Pulls actuals from the cube, joins to last forecast, writes the commentary, flags the three drivers worth a meeting
"Is this expense GST-claimable?"	Quotes general principles	Reads the receipt, cross-checks against your tax policy library, drafts the position memo with citations
"Approve invoice batch for AP run"	Tells you what to look for	Validates 200 invoices against POs, surfaces the 7 exceptions for human approval, queues the rest for payment

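The AP-run row in the table above follows a simple pattern: check every invoice against its PO, pay the clean ones, and route the rest to a person. Here is a minimal sketch of that pattern. The `Invoice` fields, the matching rule (amount within a small tolerance), and the sample data are assumptions for illustration; a real three-way match would check more than the amount.

```python
from dataclasses import dataclass

@dataclass
class Invoice:
    invoice_id: str
    po_id: str
    amount: float

def validate_batch(invoices, po_amounts, tolerance=0.01):
    """Split a batch into invoices safe to queue and exceptions for human review."""
    to_pay, exceptions = [], []
    for inv in invoices:
        po_amount = po_amounts.get(inv.po_id)
        if po_amount is None:
            exceptions.append((inv, "no matching PO"))
        elif abs(inv.amount - po_amount) > tolerance:
            exceptions.append((inv, f"amount differs from PO ({po_amount:.2f})"))
        else:
            to_pay.append(inv)
    return to_pay, exceptions

# Made-up sample data.
po_amounts = {"PO-1": 1000.00, "PO-2": 250.00}
batch = [
    Invoice("INV-A", "PO-1", 1000.00),  # matches its PO
    Invoice("INV-B", "PO-2", 275.00),   # amount mismatch
    Invoice("INV-C", "PO-9", 80.00),    # PO not found
]

to_pay, exceptions = validate_batch(batch, po_amounts)
print("Queued for payment:", [inv.invoice_id for inv in to_pay])
for inv, reason in exceptions:
    print(f"Needs approval: {inv.invoice_id} ({reason})")
```

Nothing in the exception list is paid automatically; the agent's job ends at surfacing it.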
Eight Agentic Use Cases Across the AnyCompany Cohort

Mapped to the eight functions in the room. Don't worry about which to pick yet — Day 2 ends with you choosing one for your team. These are the candidates worth knowing about.

📋Procurement — Contract Clause Reviewer

Reads incoming MSAs, flags clauses that deviate from your standard template, drafts the redline summary with severity ratings.

Workflow agent | Quick win

🔍Audit — Control-Test Memo Drafter

Given the control + the evidence sample, drafts the design-effectiveness and operating-effectiveness memo, grounded in your control library.

Workflow agent | Quick win

📊FP&A — Variance Commentary Writer

Pulls actuals + forecast + last quarter's commentary, writes the new month's narrative in house style, flags the three drivers worth a deeper look.

Hybrid agent | Flagship

📑Reporting — Disclosure Note First-Drafter

From this period's movement schedules, drafts IFRS / local-GAAP notes consistent with prior periods, ready for the human signer.

Workflow agent | Flagship

🧾Tax — Internal Tax Helpdesk

Answers "is this GST-claimable?" / "withholding rate for SG → VN service payment?" against your tax policy library, with citations.

Workflow agent | RAG-grounded

🏦Controllership — Period-Close Narrative

Auto-drafts the month-end commentary from the close pack. Checks intercompany balances, surfaces unreconciled items, suggests root cause.

Hybrid agent | Flagship

💰Treasury — Weekly Cash Narrative

Pulls cash positions across entities, drafts the week's narrative (opening, in/out by category, exceptions), highlights covenant-relevant moves.

Hybrid agent | Quick win

🗂️Data & Analytics — Ad-Hoc Query Translator

Finance user asks "top 10 vendors by Q2 spend" — agent converts to SQL, runs against the warehouse, narrates the result back. Reduces the data-request queue.

Workflow agent | Flagship
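As a taste of what one of these looks like under the hood, here is a rough sketch of the Ad-Hoc Query Translator's "run the SQL, narrate the result" half, using an in-memory SQLite database. Assumptions are flagged in the comments: the `spend` table, its columns, and the vendor names are invented for the example, and the SQL is written by hand where a real agent would have the model generate it.

```python
import sqlite3

# Throwaway in-memory warehouse with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spend (vendor TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO spend VALUES (?, ?, ?)",
    [
        ("Acme Logistics", "Q2", 120000.0),
        ("Acme Logistics", "Q2", 30000.0),
        ("Borealis IT", "Q2", 90000.0),
        ("Cedar Travel", "Q1", 500000.0),
    ],
)

# In a real agent the model would produce this query; here it is fixed by hand.
GENERATED_SQL = """
    SELECT vendor, SUM(amount) AS total
    FROM spend
    WHERE quarter = ?
    GROUP BY vendor
    ORDER BY total DESC
    LIMIT ?
"""

def top_vendors(quarter: str, limit: int) -> list[tuple[str, float]]:
    """Run the query with bound parameters rather than string formatting."""
    return conn.execute(GENERATED_SQL, (quarter, limit)).fetchall()

def narrate(rows: list[tuple[str, float]], quarter: str) -> str:
    """Turn result rows into a one-line summary for the finance user."""
    parts = [f"{vendor} ({total:,.0f})" for vendor, total in rows]
    return f"Top vendors by {quarter} spend: " + ", ".join(parts) + "."

rows = top_vendors("Q2", 10)
print(narrate(rows, "Q2"))
```

In practice you would likely point this at a read-only connection and validate the generated SQL before running it, since the model is writing queries against your warehouse.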

🎯 Next: Module 2 zooms in on what kind of agent each of these is. Workflow vs autonomous vs hybrid vs multi-agent — different shapes, different trade-offs, different costs.
