Module 2 · Day 1

📘 Where AI Wins (and Where It Doesn't) for Finance

Not every finance task needs Generative AI. Pick the wrong use case and you'll waste budget, erode trust, and slow your team. This module gives you a decision framework, finance-specific use cases by function, and a priority matrix you can take to your team this week.

Day 1: Foundation · Decision Framework · 32 Finance Use Cases · By Function

The Question Every Finance Leader Should Answer First

Your team has 200 ideas for "where we could use AI." Most of them won't survive a 90-day pilot. The ones that succeed share three traits: they tackle unstructured work, they have a human in the loop for the final call, and they replace repeated cognitive effort — not one-off creative work.

⏱️

The cost of picking wrong

A pilot that runs for 6 months on the wrong use case doesn't just cost the licensing — it costs credibility. Once a finance team has seen "AI didn't work here," getting buy-in for the next attempt is twice as hard.

🎯

The shape of a winning use case

High volume, repeated cognitive work, judgement-required (not pure math), tolerates a human review step, has bounded scope. Variance commentary fits. Calculating tax liability does not.

🚦

Lead with use cases, not technology

"We need to use AI" is the wrong starting point. "We spend 6 hours a week writing month-end variance commentary" is the right one. The use case picks the technology, never the other way around.

📈 What good looks like for finance: ExxonMobil saved 30,000 hours on a major capital project through AI document review. McKinsey estimates supply-chain cost reduction potential at $290B–$550B across industries. AWS reports engineers spend ~60% of their time searching for data, and most of that is recoverable. These aren't AI numbers; they're the numbers you get from finding the right use case.

The "Lead with Use Cases" Test

Before any new AI initiative, run it through these four questions. If you can't answer all four, the use case isn't ready.

# | Question | What a good answer looks like
1 | What is the specific business problem? | "Our AP team takes 5–10 minutes per invoice on extraction. We process 8,000/month. We want to cut that by 60%."
2 | Who benefits, and how do we measure success? | "AP analysts get hours back. Success = avg time per invoice ≤ 90 sec by month 3, with ≤ 2% error rate."
3 | Is the data available and reliable? | "Yes. We have 12 months of historical invoices and PO data in our ERP. Quality is good."
4 | What's the ethical / safety / regulatory angle? | "Approval threshold is SGD 50K; anything above is human-only. Audit trail required for SOX. No PII in prompts."
💡 Reality check for finance leaders: Question 4 is where most finance pilots stall in audit review. Answer it before building, not after. The Day 1 Governance & Trust module (Module 7) covers this in depth.
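
The sample answers to questions 1 and 2 can be sanity-checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the table (5–10 minutes per invoice, 8,000 invoices a month, a 60% cut, a 90-second target); the function and constant names are illustrative and not part of any tool.

```python
from fractions import Fraction

INVOICES_PER_MONTH = 8_000       # from the sample answer to question 1
BASELINE_MINUTES = (5, 10)       # low and high end of current handling time
TARGET_CUT = Fraction(60, 100)   # "cut that by 60%"
TARGET_SECONDS = 90              # success metric from question 2


def monthly_hours(minutes_per_invoice, invoices=INVOICES_PER_MONTH):
    """Total analyst hours per month at a given handling time."""
    return Fraction(minutes_per_invoice) * invoices / 60


def implied_cut(baseline_minutes, target_seconds=TARGET_SECONDS):
    """Fractional reduction needed to reach the target from a baseline."""
    baseline_seconds = baseline_minutes * 60
    return 1 - Fraction(target_seconds, baseline_seconds)


for minutes in BASELINE_MINUTES:
    before = monthly_hours(minutes)
    after = before * (1 - TARGET_CUT)
    print(
        f"{minutes} min/invoice: {float(before):.0f} h/month now, "
        f"{float(after):.0f} h/month after a 60% cut; "
        f"hitting 90 sec needs a {float(implied_cut(minutes)):.0%} cut"
    )
```

On these numbers the 90-second target implies a 70–85% reduction, which is tighter than the stated 60% goal. That kind of mismatch is worth reconciling before the pilot starts.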

GenAI vs Traditional ML vs Rules vs Manual

The most expensive mistake in AI adoption is using the wrong technique. GenAI is not a hammer for every nail. Here's the decision framework finance teams actually need.

Criteria | ✨ Generative AI | 📊 Traditional ML | 📋 Rules / Scripts | 👤 Manual
Best for | Narrative, summarization, classification with judgement, Q&A over documents, drafting | Numeric prediction, anomaly detection, pattern recognition from structured data | Deterministic calculations, lookups, threshold checks, standard procedures | One-off creative judgement, novel decisions, anything irreversible
Data type | Unstructured: text, PDFs, emails, contracts, transcripts | Structured: tabular, time-series, labelled training data | Structured: lookup tables, formulas, decision trees | Whatever the human consumes
Output | Generated text, summaries, classifications with reasoning, drafts | Predictions, scores, classifications (no reasoning text) | Exact deterministic results, pass/fail flags | The judgement itself
Finance example | "Draft variance commentary for the May P&L vs forecast" | "Score this transaction 0–1 for fraud risk based on historical patterns" | "If invoice > SGD 50K, route to Head of Finance" | "Should we acquire this competitor?"
Accuracy posture | Good draft, human reviews | High precision (95%+ on the trained task) | Exact & auditable | The human is the audit trail
Explainability | Can cite sources (RAG); reasoning is probabilistic | Feature importance, partial | Fully transparent: every step traceable | Whatever the human writes down
Audit defensibility | Medium: needs human sign-off + source attribution | Medium: model card + drift monitoring required | High: every rule is in code | High: the human signs

Decision shortcuts

Use GenAI when…

  • Large volumes of unstructured documents to read or summarize
  • Repetitive narrative work (commentary, disclosures, briefs)
  • Classification that needs context (this is RED because…)
  • Q&A over a body of policies, contracts, or regulations
  • Multi-step tasks involving judgement under ambiguity
📊

Use Traditional ML when…

  • Predicting future events from labelled historical data
  • Detecting anomalies in time-series (unusual spend, fraud patterns)
  • Need high precision with measurable confidence scores
  • The output is a number, score, or class — not text
  • Have ≥10K labelled examples to train on
📋

Use Rules / Scripts when…

  • Same input always produces same output (currency conversion, GST)
  • Threshold-based routing or escalation
  • Regulatory compliance checks with binary pass/fail
  • Volume too low to justify model spend
  • Audit trail must be 100% deterministic
👤

Keep Manual when…

  • The decision is irreversible (M&A, write-offs, executive sign-off)
  • No precedent or training data exists
  • Stakes are too high for any failure mode
  • Personal accountability is the point (CFO certification)
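
The shortcuts above can be read as an ordered checklist. Here is a rough sketch in Python. The `UseCase` fields, the order of the checks, and the fallback to Manual are all assumptions made for illustration; the 10,000-example cut-off comes from the Traditional ML list.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    irreversible: bool = False     # e.g. M&A, write-offs, CFO certification
    deterministic: bool = False    # same input must always give same output
    output_is_text: bool = False   # narrative, summary, reasoned classification
    labelled_examples: int = 0     # size of labelled history available


def pick_technique(uc: UseCase) -> str:
    """Apply the shortcuts from strictest to loosest (an assumed ordering)."""
    if uc.irreversible:
        return "Manual"
    if uc.deterministic:
        return "Rules / Scripts"
    if not uc.output_is_text and uc.labelled_examples >= 10_000:
        return "Traditional ML"
    if uc.output_is_text:
        return "Generative AI"
    return "Manual"  # nothing fits cleanly: keep a human on it for now


print(pick_technique(UseCase("Variance commentary", output_is_text=True)))
print(pick_technique(UseCase("GST calculation", deterministic=True)))
print(pick_technique(UseCase("Fraud score", labelled_examples=50_000)))
print(pick_technique(UseCase("Acquisition decision", irreversible=True)))
```

A real triage would weigh more factors (volume, data quality, governance), so treat this as a conversation starter rather than a rule.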
🤝 The hybrid is usually the answer. The strongest finance solutions combine techniques. ML detects an anomalous transaction → GenAI explains it in plain English. Rules enforce the SGD 50K threshold → GenAI drafts the escalation memo. Think of GenAI as the communication layer on top of precise computational systems.
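
One way to picture the "rules decide, GenAI communicates" pattern is the sketch below. The SGD 50K threshold and the Head of Finance route come from this module. The names `route_invoice` and `template_draft` and the "standard approval" route are hypothetical, and the drafting step is a plain string template standing in for a model call.

```python
from decimal import Decimal
from typing import Callable

APPROVAL_THRESHOLD_SGD = Decimal("50000")  # threshold quoted in this module


def template_draft(vendor: str, amount: Decimal) -> str:
    """Placeholder for a GenAI call; a real system would prompt a model here."""
    return (
        f"Escalation: invoice from {vendor} for SGD {amount:,.2f} exceeds "
        f"the SGD {APPROVAL_THRESHOLD_SGD:,.0f} approval threshold."
    )


def route_invoice(
    vendor: str,
    amount,
    draft: Callable[[str, Decimal], str] = template_draft,
) -> dict:
    """The rule picks the route; the drafting step only writes the memo."""
    amount = Decimal(str(amount))
    if amount > APPROVAL_THRESHOLD_SGD:  # deterministic and auditable
        return {"route": "Head of Finance", "memo": draft(vendor, amount)}
    return {"route": "standard approval", "memo": None}


print(route_invoice("Vendor X", "62000"))
print(route_invoice("Vendor X", "50000"))
```

The drafting function never sees the threshold decision as something it can change. Swapping in a model-backed `draft` would alter the memo wording but not the routing.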

Use Cases by Function

Tailored to the cohort. Each function below lists its highest-impact GenAI use cases, drawn from real finance org transformations and matched to the daily reality of your team.

Persona: Cluster Procurement Heads, Regional Category Heads, Category Managers (SG/MY/PH). The largest cohort in this workshop — 7 of 28 participants. Daily reality: vendor onboarding, contract review, supplier scorecards, RFP evaluation, spend analysis.

📑Contract clause review
Read a 40-page MSA and flag deviations from your standard template — payment terms, liability caps, IP clauses, indemnities. Generates a redline summary with severity ratings. Human signs final.
GenAI + RAG · Quick Win
🏷️Supplier risk scorecards
Synthesize public-source signals (news, sanctions lists, financial filings) + your internal data into a GREEN/AMBER/RED rating with reasoning. Quarterly refresh, exception alerts in between.
Hybrid (ML + GenAI) · High Impact
📊RFP / RFQ response evaluation
Score 12 vendor proposals against your evaluation criteria. Extract pricing, SLAs, certifications. Highlight gaps and outliers. Reduces 2-day evaluation to 2 hours of human review.
GenAI + RAG · Quick Win
💬Negotiation prep briefs
"Brief me for tomorrow's call with Vendor X" — combines historical spend, current contract terms, recent service issues, market benchmarks. Replaces 3 hours of pre-meeting research.
GenAI · Daily Use

Persona: Heads / Managers, Finance Controllership across SG/TH/VN + Regional Controllership (Deliveries & Mobility). 4 participants. Daily reality: period-close, intercompany matching, journal review, reconciliation, statutory accounts.

🔄Intercompany reconciliation explainer
When intercompany balances don't tie, GenAI reads both sides' GL postings + descriptions and proposes the most likely reason — timing diff, FX, missed accrual, classification mismatch. Speeds up dispute resolution.
Hybrid (Rules + GenAI) · Quick Win
📝Period-close narrative
Auto-draft the month-end commentary: "Revenue +4.2% MoM driven by SG market expansion; opex flat; one-off SGD 800K legal accrual reversed in May." Pulls from your close pack data; human reviews and signs.
GenAI + Data Plugin · High Impact
🧾Journal entry review assistant
Check journal descriptions for completeness, flag entries with vague memos ("adjustment per K"), suggest the audit-trail-grade rewrite. SOX-friendly. Catches what a tired reviewer misses at hour 9 of close.
GenAI · Quick Win
📚Statutory accounts disclosure draft
Draft notes-to-accounts disclosures from underlying movement schedules. Knows the IFRS / local-GAAP language. Speeds up the most tedious part of statutory filing season.
GenAI + RAG · High Impact

Persona: Senior Manager + Managers, Financial Planning & Analysis (Singapore + Regional). 4 participants. Daily reality: variance analysis, forecast updates, board pack prep, scenario modelling, business-partner conversations.

💬Variance commentary
Auto-draft the "why did revenue miss by 6%" commentary. Combines actuals vs forecast deltas + driver-tree decomposition + qualitative context from Slack/email threads. The flagship FP&A use case.
GenAI + Data Plugin · Flagship
🎲Scenario narrative drafting
"Given +10% wage inflation in PH, model the impact and draft the board summary." Runs the model (Excel/Python) → GenAI writes the executive narrative. Hybrid use case showing the deterministic + generative pattern.
Hybrid (Script + GenAI) · High Impact
📈Board pack first draft
Convert your monthly KPI dashboard into a 3-page CFO deck — exec summary, key drivers, risks & opportunities, ask of the board. Saves 4–6 hours of layout + first-pass writing per cycle.
GenAI · Quick Win
🤝Business-partner Q&A prep
Before your business review with the BU lead, ask: "What 5 questions will they push back on?" GenAI mines historical reviews + current variance and gives you the likely challenges with backing data.
GenAI · Daily Use

Persona: Heads + Senior Managers, Audit Innovation & Analytics, Technology Audit, Insure Audit, AnyCompany Integrity Unit. 5 participants. Will pressure-test hallucination examples — care most about defensibility, controls, and audit trail.

🔍Control testing memo drafting
Given a control description + sample evidence, draft the test memo: design effectiveness, operating effectiveness, exceptions, conclusion. Human signs. Speeds up routine SOX cycle work.
GenAI + RAG · Quick Win
⚠️Anomaly explanation
ML model flags an unusual transaction → GenAI explains in plain English why it's unusual (vs the population), what controls should have caught it, and what evidence to request. The "translator" pattern.
Hybrid (ML + GenAI) · High Impact
📋Sample selection narrative
Document the rationale for your sample selection (risk-based, attribute-based, monetary-unit). Explains the methodology to reviewers. Audit-defensible by design.
GenAI · Quick Win
🚨Whistleblower / integrity case triage
First-pass triage of integrity reports: classify by category, summarize the allegation, suggest evidence to gather, route to the right investigator. Human always reviews — high-stakes domain.
GenAI · Sensitive: Human Always Reviews

Persona: Head of Tax, Senior Tax Manager, Assistant Manager Tax Management. 3 participants. The single most "RAG-shaped" persona — daily work is reading regulations and writing positions. Tax bench is small but every member benefits.

📜Regulation tracking & impact summary
"Singapore IRAS just released a new GST circular — what changes for our merchant settlement flow?" GenAI reads the circular against your current tax positions and drafts the impact memo. Cross-jurisdictional version applies SEA-wide.
GenAI + RAG · Flagship
🌏Treaty & transfer pricing research
"What's our withholding rate for a SG → VN service payment under the DTA?" Q&A grounded in your library of treaties, OECD guidelines, and historical positions. Cites the source clause every time.
GenAI + RAG · Flagship
📝Tax position memo drafting
Convert your conclusion + supporting analysis into a structured tax memo (background, position, alternatives considered, citation, conclusion). The tax-firm format leaders recognise. Defensible & review-ready.
GenAI + RAG · Quick Win
Internal tax helpdesk
Internal teams ask "is this expense GST-claimable?" or "do I need a withholding cert?" — GenAI answers from your tax policy library + recent rulings, with citations. Triages routine queries; escalates novel ones.
GenAI + RAG · Daily Use

Persona: Head of Corporate Reporting, Group Reporting Managers (incl. Regional). 3 participants. Daily reality: disclosure drafting, MD&A, analyst-facing materials, reporting cycles for board / auditors / regulators.

📰MD&A first draft
From your KPI book + forecast variance, draft the "Management Discussion & Analysis" section in your house style. Addresses the questions analysts ask. Human revises & signs.
GenAI + RAG · Flagship
🗂️Disclosure note drafting
Draft IFRS / local-GAAP notes from underlying movement schedules. Maintains consistency across periods. Speeds up the most format-heavy work in any reporting cycle.
GenAI + RAG · Quick Win
🎤Analyst Q&A prep
"What will analysts ask on Tuesday's call?" Draws on past transcripts + this quarter's actuals + competitor performance. Generates likely questions and your strongest factual answer for each.
GenAI · Daily Use
📑Cross-period consistency check
Compare this quarter's draft against the last 4 quarters' filed versions — flag tone shifts, numerical inconsistencies, vanished disclosures. The "auditor's first question" check.
GenAI + RAG · Quick Win

Persona: Senior Manager, Finance & Treasury. 1 participant. Smaller bench but high-leverage role — cash forecasting, FX exposure, banking relationships, debt servicing.

💵Cash forecast commentary
Draft the weekly cash narrative — opening position, inflows/outflows by category, forecast vs actual variance, exception items requiring attention. Pairs with your existing cash model.
Hybrid (Script + GenAI) · Quick Win
💱FX exposure briefings
"What's our SGD/IDR exposure this month, and what are the hedge implications if IDR moves ±5%?" Combines exposure data + market commentary into an executive brief.
Hybrid (ML + GenAI) · High Impact
🏦Bank covenant compliance check
Read the loan agreement → check current ratios & metrics against covenants → flag headroom and breach risk → draft the compliance certificate language for the CFO to sign.
GenAI + RAG · Quick Win
💼Bank pitch deck prep
Pulling together a refinancing or new-facility pitch — current capital structure, financial highlights, ratings rationale. GenAI structures the first draft from your standard data sources.
GenAI · Project

Persona: Assistant Manager, Finance Data Solutions. 1 participant. The bridge persona between finance & IT — translates business questions into data work and back.

🗂️Ad-hoc query translator
Finance user asks "show me top 10 vendors by spend in Q2" — GenAI converts to SQL against your data warehouse, runs it, narrates the result. Reduces the queue of low-priority data requests.
GenAI + Data Plugin · Flagship
📊Dashboard narrative generation
Auto-generate the "what does this dashboard say" commentary that executives actually read. Updates as numbers refresh. Replaces the email with the screenshot.
Hybrid (Script + GenAI) · Quick Win
🧰Data quality issue triage
When a metric jumps, GenAI cross-checks ETL logs, source-system changes, and recent dimension updates to suggest the most likely root cause. Speeds up the "why is this number wrong?" investigation.
Hybrid (Logs + GenAI) · Quick Win
📚Data dictionary & lineage Q&A
"Where does 'net revenue' come from in our P&L?" Q&A over your data dictionary, lineage docs, and ELT definitions. Onboards new analysts in days, not weeks.
GenAI + RAG · Onboarding
📌 Pattern across all 8 functions: Almost every flagship use case involves RAG (grounding to your documents) or hybrid (deterministic + GenAI). Pure GenAI alone is rare in finance — your data and your rules are too valuable to leave on the table. Module 9 (RAG) and the Day 2 agent build cover both patterns.

What GenAI Doesn't Do Well — Yet — for Finance

Knowing the failure modes is more useful than knowing the wins. Here's where finance leaders should not reach for GenAI, and what to do instead.

Don't use GenAI for… | Why it fails | What to use instead
Calculating tax liability | Requires exact arithmetic + interpretation of statute. GenAI may hallucinate a section number or miscompute. Tax authorities won't accept "the model said so." | Tax engine (rules) for the calculation; GenAI for the explanatory memo.
Reconciling balances to the cent | Floating-point summation, currency conversion, rounding rules. Exactness is the point. GenAI's "almost right" is wrong. | Recon engine (rules); GenAI to explain the resulting break.
Approval decisions above threshold | Material exposure means accountability stays with a named human. Audit and regulators expect it. | Hybrid: AI drafts the recommendation + reasoning; human signs.
Predicting fraud probability scores | Better solved by a trained classifier on labelled fraud history. GenAI doesn't know your fraud signature. | Traditional ML for the score; GenAI to explain why the score is high.
Anything novel without precedent | GenAI works from patterns it has seen. New regulation? New product? New jurisdiction? It will improvise, and improvisation in finance is risk. | Human first, AI second. Once you have 10–20 examples, revisit.
Real-time financial data feeds | The model's training data is static; it doesn't know today's rate, today's balance, today's posting. Without grounding, it makes up plausible numbers. | Connect to live data via plugins/MCP. Always cite the data timestamp.
Pure summarization of numbers | "Summarize this P&L": a chart does this better. GenAI text adds nothing if the numbers themselves are the message. | Visualization. Use GenAI for the narrative around the numbers.
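
The "to the cent" row is easy to demonstrate. The sketch below contrasts binary floating-point with Python's `decimal` module, which is one common way to keep monetary sums exact; the `recon_break` helper and the sample amounts are made up for illustration.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so even a three-line sum can miss its expected total.
float_total = sum([0.10, 0.20, 0.30])
print(float_total == 0.60)  # False on IEEE 754 doubles


def recon_break(ledger_a, ledger_b):
    """Exact difference between two ledgers given as amount strings."""
    total_a = sum((Decimal(x) for x in ledger_a), Decimal("0"))
    total_b = sum((Decimal(x) for x in ledger_b), Decimal("0"))
    return total_a - total_b


print(recon_break(["0.10", "0.20", "0.30"], ["0.60"]))     # 0.00
print(recon_break(["1000.00", "250.05"], ["1250.00"]))     # 0.05
```

Amounts are passed as strings so that no float rounding creeps in before the `Decimal` is built. The script finds the break; explaining it is where GenAI can help.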

The six common failure modes: name them in your team

🎭

1. Confident hallucination

The model invents a regulation, a section number, a historical figure. Sounds right. Catch with: source attribution, RAG, cross-check.

📅

2. Stale knowledge

Model's training cutoff was months ago. Doesn't know about the new IRAS circular issued last week. Fix with: RAG over your current document library.

🔢

3. Arithmetic drift

Long multiplications, percentages, currency conversions — the model gets close but not exact. Always verify numbers; route them through a script.

💼

4. Tone & style drift

Without explicit style guidance, output sounds generic. Audit committee disclosures don't read like a marketing email. Fix: persona prompts + style examples.

🪞

5. Sycophancy

Models often agree with the framing of the question. "Is this control adequate?" gets a more positive answer than "Audit this control for adequacy and weakness." Frame for challenge, not agreement.

🔄

6. Inconsistency between runs

Same prompt, different answers. Acceptable for drafting; problematic for regulated outputs. Fix: low temperature, RAG grounding, structured-output schemas.
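
A structured-output schema does not make a model deterministic by itself, but it does let you reject answers that drift outside an agreed shape. A minimal sketch using only the standard library follows. The field names (`rating`, `reason`, `sources`) are hypothetical; the GREEN/AMBER/RED values echo the supplier scorecard use case. Low temperature would be a setting on the model call, which is not shown here.

```python
import json

ALLOWED_RATINGS = {"GREEN", "AMBER", "RED"}
REQUIRED_FIELDS = {"rating": str, "reason": str, "sources": list}


def parse_rating(raw: str) -> dict:
    """Parse model output and reject anything outside the agreed schema."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on non-JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"missing or mistyped field: {field}")
    if data["rating"] not in ALLOWED_RATINGS:
        raise ValueError(f"rating not allowed: {data['rating']}")
    if not data["sources"]:
        raise ValueError("at least one source is required")
    return data


sample = '{"rating": "AMBER", "reason": "late filings", "sources": ["filing-2024"]}'
print(parse_rating(sample)["rating"])
```

Outputs that fail the check can be retried or routed to a human instead of flowing into a regulated document.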

🛡️ The verification posture for finance. Every GenAI output that reaches a human or a system needs three things: (1) a citation to the source data, (2) a confidence statement when uncertainty is material, (3) a path back to the input the model received. If you can't supply those three things, the output isn't ready for finance use. Module 7 (Governance & Trust) builds this in detail.
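
The three requirements can be turned into a simple gate. The sketch below is one possible shape, not a prescribed design: the `ReviewableOutput` fields are hypothetical, and it assumes the "path back to the input" is kept as a SHA-256 fingerprint of the prompt, with the full prompt stored elsewhere under that key.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReviewableOutput:
    text: str
    citations: list                  # (1) references to the source data
    uncertainty_material: bool       # set by the reviewer or the workflow
    confidence_note: Optional[str]   # (2) needed when uncertainty is material
    prompt_sha256: Optional[str]     # (3) fingerprint of the exact input sent


def fingerprint(prompt: str) -> str:
    """Stable identifier for the prompt so it can be looked up later."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def missing_items(out: ReviewableOutput) -> list:
    """List which of the three requirements the output fails to meet."""
    gaps = []
    if not out.citations:
        gaps.append("citation")
    if out.uncertainty_material and not out.confidence_note:
        gaps.append("confidence statement")
    if not out.prompt_sha256:
        gaps.append("path back to input")
    return gaps


draft = ReviewableOutput("Opex flat vs forecast.", [], True, None, None)
print(missing_items(draft))
```

An empty list means the output clears the gate; anything else goes back before it reaches a reviewer or a downstream system.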

Priority Matrix — Where to Start

You've got 32 use cases (across 8 functions) in the Use Cases by Function section. You can't pursue all of them at once. The Value × Effort matrix below is the standard way leaders pick the first three.

[Figure: Value × Effort priority matrix. Horizontal axis: Effort (lower to higher). Vertical axis: Value. Quadrants: ⭐ DO FIRST (high value · low effort); ⏳ EVALUATE (high value · high effort); 📚 LEARNING (low value · low effort, upskill); ⏸ AVOID (low value · high effort). Use cases plotted: Variance commentary (FP&A), Tax regulation tracking, Period-close narrative, Contract clause review, Supplier risk scorecards, MD&A first draft, Anomaly explanation, Negotiation prep briefs, Internal tax helpdesk, Q&A prep for analyst calls, Calc tax liability via GenAI, Pure number summarization.]
Do first — flagship wins
Evaluate — high value, needs investment
Learning — try them to build team confidence
Avoid — wrong tool for the job

How to score your own use case

For each candidate, score Value (1–5) and Effort (1–5). Map onto the quadrants. Pick 1–3 from Do First for your first 90 days; 1 from Evaluate as a structured pilot; ignore Avoid.

Score this | 1 (low) | 5 (high)
Value: hours saved/week | < 2 hours | > 20 hours
Value: error reduction | Marginal | Material risk reduction
Value: strategic fit | Tangential | Critical to org priority
Effort: data readiness | Clean & available | Needs extraction & cleansing
Effort: process change | Drop-in tool | New SOP + training + change mgmt
Effort: governance | Low risk, no PII | Sensitive data, audit-grade trail required
🎯 The 90-day rule for finance: Pick one "Do First" use case. Run it for 90 days with a small team. Measure hard outcomes (time, error rate, satisfaction). Only after the first one ships do you start the second. Running parallel pilots before you have a first win is the fastest way to lose credibility.
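
The scoring step can be sketched in a few lines. The module does not say how to combine the three Value rows or the three Effort rows, nor where "high" begins, so this example assumes a simple average and a cut-off of 3 on the 1–5 scale. The scores in the usage lines are invented.

```python
from statistics import mean

HIGH = 3  # assumed cut-off on the 1-5 scale; the module does not fix one


def quadrant(value_scores, effort_scores, high=HIGH):
    """Map averaged Value and Effort scores onto the four quadrants."""
    value, effort = mean(value_scores), mean(effort_scores)
    if value >= high:
        return "Do First" if effort < high else "Evaluate"
    return "Learning" if effort < high else "Avoid"


# Each list holds the three sub-scores from the table, in row order.
print(quadrant([5, 4, 5], [1, 2, 2]))  # high value, low effort
print(quadrant([5, 4, 4], [4, 4, 5]))  # high value, high effort
print(quadrant([1, 2, 2], [1, 1, 2]))  # low value, low effort
print(quadrant([1, 2, 1], [5, 4, 4]))  # low value, high effort
```

Teams that weight governance effort more heavily than the other rows could replace `mean` with a weighted average without changing the rest.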

Implementation Phases — Realistic Timeline

Adoption isn't a switch. It's a maturity curve. Here's what good progress looks like for a finance team going from zero to running multiple agents — phased to manage risk and build confidence.

Phase 1

🌱 Quick Wins

Months 1–3

Replace repetitive narrative work with prompt templates. No agents, no automation, no IT involvement. Each individual saves 2–4 hours/week.

  • Variance commentary template (FP&A)
  • Period-close narrative template (Controllership)
  • Tax memo first-draft template (Tax)
  • Contract clause review template (Procurement)
  • Outcome: one prompt template per person, used daily
Phase 2

🔁 Operational Use

Months 3–6

Convert templates to Skills in Claude Cowork. Add Project Instructions for governance. Connect to one source of truth (e.g., your data warehouse via Plugins). Team-wide adoption.

  • Saved Skills replace ad-hoc prompting
  • Project Instructions enforce house rules (SGD default, no PII, escalation thresholds)
  • One agent runs daily on a single use case (e.g., invoice processing)
  • Outcome: 1 saved Skill per use case, 1 agent in production at L1–L2
Phase 3

🚀 Transformation

Months 6–12

Multiple agents in parallel. Scheduled Tasks running overnight. Human review by exception only. ML + GenAI hybrids on flagship workflows. Governance baked in from day one.

  • 3–5 agents in production, each owning a workflow
  • Scheduled overnight runs reduce daily backlog
  • ML + GenAI hybrid on the highest-volume workflow (e.g., audit anomaly triage)
  • Outcome: team scope expanded; AI handles routine work; humans focus on judgement

What you should be doing differently in 12 months

Today (baseline) | 12 months from now
Each analyst writes their own variance commentary from scratch | FP&A team uses a shared Skill; output is consistent and reviewed by exception
Tax answers come from individual research + Word memos | Tax helpdesk Skill answers 60% of routine queries with citations; tax bench focuses on novel positions
Procurement reviews contracts manually one at a time | Contract review Skill flags deviations in seconds; analysts work the exceptions
Audit's anomaly investigation starts at "look at the data" | ML flags it; GenAI explains it; auditor starts at "is the explanation reasonable?"
Period-close narrative drafted manually each cycle | Auto-drafted from close pack; reviewer adjusts and signs

What good adoption looks like — leading indicators

📊

Usage signals

People log into Cowork at least 3 days/week. Skills are activated daily. Project Instructions are updated when policy changes. Activity is the leading indicator of value.

⏱️

Outcome signals

Time-to-draft on flagship outputs (variance commentary, MD&A, tax memos) drops 40%+ within 6 months. Error rate stays flat or improves. Reviewer effort shifts from drafting to challenging.

🛡️

Governance signals

Audit trail captures inputs + outputs + reviewer. Policy violations are flagged, not silently fixed. Quarterly review of active Skills catches stale ones. This is the difference between AI adoption and AI risk.

🎓 Where the rest of Day 1 + Day 2 fit. The remaining Day 1 modules give you the tools (M3–M6: how LLMs work + costs, M8–M9: prompt engineering + RAG). Module 7 (Governance) gives you the safety. Day 2 gives you the execution (build your first agent in Cowork). By the end of Day 2 you'll have a Phase 1 quick win running and a Phase 2 plan on paper.