Multimodal Refund Audit System

A Generative AI proof-of-concept that automated image-based refund claim auditing at HungerStation — validated internally, then adopted as a global standard by the parent company, Delivery Hero.

Internal project — architecture and outcomes shared at a high level. Model details, datasets, and operational thresholds are confidential to Delivery Hero.

The problem

Food delivery platforms process thousands of refund claims daily. A significant portion of those claims involve image evidence — photos of incorrect, missing, or damaged items submitted by customers. Manually reviewing each image to determine claim validity is slow, inconsistent, and doesn't scale.

The challenge: build a system that could look at an image, understand its content in context, and make a reliable audit decision — matching or exceeding human reviewer accuracy, at scale.

How it works

Refund claim audit pipeline

Claim submitted (customer image + metadata) → Multimodal LLM (analyses image + claim context) → Evidence scoring (validity, consistency, confidence) → Decision routing

Decision routing, above threshold → auto-resolve → Valid, approve (refund processed automatically)

Decision routing, low confidence → human queue → Flagged, review

The model evaluates the image against the claimed issue, checking whether what the photo shows is consistent with the order contents and the reported problem. Claims scored above a confidence threshold are resolved automatically; ambiguous cases are routed to human reviewers with the model's reasoning attached, which shortens review time even in the manual queue.
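The routing step can be illustrated with a minimal sketch. Everything named here (`Assessment`, `Route`, `route`, and the 0.9 threshold) is an assumption for illustration; the real model outputs and operational thresholds are confidential and are not reproduced.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_RESOLVE = "auto_resolve"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class Assessment:
    """Hypothetical model output for one claim."""
    claim_id: str
    supports_claim: bool  # does the image support the reported issue?
    confidence: float     # model confidence in [0, 1]
    reasoning: str        # short explanation, attached for reviewers


# Placeholder value only; the real threshold is not public.
AUTO_RESOLVE_THRESHOLD = 0.9


def route(assessment: Assessment, threshold: float = AUTO_RESOLVE_THRESHOLD) -> Route:
    """Auto-resolve confident assessments; send the rest to the human queue."""
    if not 0.0 <= assessment.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if assessment.confidence >= threshold:
        return Route.AUTO_RESOLVE
    return Route.HUMAN_REVIEW
```

In this sketch the reasoning string travels with the assessment, so a claim sent to `HUMAN_REVIEW` reaches the reviewer with the model's explanation already attached. How an auto-resolved claim is settled when the model finds the evidence unsupportive is not described on this page and is left out.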

Global adoption

The system started as a local PoC. Its validation at HungerStation gave Delivery Hero enough confidence to invest in a production-grade version. The core design decisions (multimodal evidence analysis, confidence-based routing, and human-in-the-loop fallback) were preserved in the global implementation.

Design decisions

Multimodal reasoning

The model jointly processes the image and the textual claim context rather than the image in isolation, which reduces false positives from ambiguous photos.
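As a rough illustration of joint input, the sketch below bundles a photo and its claim context into a single request payload. The `Claim` fields, the payload keys, and the prompt wording are all hypothetical; the actual model interface is confidential and no specific vendor API is implied.

```python
import base64
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record."""
    claim_id: str
    reported_issue: str     # e.g. "missing item"
    order_items: list[str]  # what the order should contain
    image_bytes: bytes      # customer photo


def build_model_input(claim: Claim) -> dict:
    """Bundle the photo and the claim context into one payload (assumed schema)."""
    context = (
        f"Reported issue: {claim.reported_issue}\n"
        f"Order contents: {', '.join(claim.order_items)}\n"
        "Does the photo support the reported issue?"
    )
    return {
        "claim_id": claim.claim_id,
        "text": context,
        "image_base64": base64.b64encode(claim.image_bytes).decode("ascii"),
    }
```

Sending the text and the image together lets the model judge the photo against what was ordered and what was reported, instead of classifying the photo on its own.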

Confidence-based routing

High-confidence decisions are resolved automatically. Low-confidence cases go to human reviewers with model reasoning — accelerating manual review rather than replacing it.

Curated training data

Built and maintained a dataset of labelled claim images to improve model reliability — shared upstream with Delivery Hero for their global rollout.

Audit trail

Every model decision is logged with its reasoning, which maintains accountability and supports continued model improvement.
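One simple way to keep such a trail is an append-only log with one JSON record per decision. The sketch below assumes that approach; the field names and the file-based storage are illustrative, not the production design.

```python
import json
from datetime import datetime, timezone


def audit_record(claim_id: str, decision: str, confidence: float, reasoning: str) -> str:
    """Serialise one decision as a JSON line (hypothetical field names)."""
    record = {
        "claim_id": claim_id,
        "decision": decision,
        "confidence": confidence,
        "reasoning": reasoning,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record, sort_keys=True)


def append_audit_log(path: str, line: str) -> None:
    """Append-only write, so earlier decisions are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```

Because each line is self-contained JSON, the log can later be filtered by decision or confidence to find cases worth relabelling.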
