HungerStation · Delivery Hero · 2024–25 · Multimodal LLM · Adopted Globally

Multimodal Refund Audit System

A Generative AI proof-of-concept that automated image-based refund claim auditing at HungerStation — validated internally, then adopted as a global standard by the parent company, Delivery Hero.

Internal project — architecture and outcomes shared at a high level. Model details, datasets, and operational thresholds are confidential to Delivery Hero.
Year: 2024 – 2025
Organisation: HungerStation
Scope: PoC → Global standard
Parent company: Delivery Hero

The problem

Food delivery platforms process thousands of refund claims daily. A significant portion of those claims involve image evidence — photos of incorrect, missing, or damaged items submitted by customers. Manually reviewing each image to determine claim validity is slow, inconsistent, and doesn't scale.

The challenge: build a system that could look at an image, understand its content in context, and make a reliable audit decision — matching or exceeding human reviewer accuracy, at scale.

How it works

Refund claim audit pipeline
1. Claim submitted: customer image + metadata
2. Multimodal LLM: analyses image + claim context
3. Evidence scoring: validity, consistency, confidence
4. Decision routing: above threshold → auto-resolve
   a. Valid (approve): refund processed automatically
   b. Flagged (review): low confidence → human queue

The model evaluates the image against the claimed issue, checking for visual consistency between the order contents and the reported problem. Claims above a confidence threshold are resolved automatically; ambiguous cases are routed to human reviewers with model reasoning attached — significantly reducing review time even in the manual queue.
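The routing step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production logic: the `EvidenceScore` fields mirror the "validity, consistency, confidence" labels in the pipeline, while the class names, the `route_claim` helper, and the 0.9 threshold are hypothetical (the real operational thresholds are confidential).

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_RESOLVE = "auto_resolve"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class EvidenceScore:
    """Hypothetical shape of the model's output for one claim (scores in 0..1)."""
    validity: float      # how well the image supports the claimed issue
    consistency: float   # agreement between the image and the order contents
    confidence: float    # how sure the model is of its own assessment
    reasoning: str       # free-text explanation produced by the model


@dataclass(frozen=True)
class RoutingDecision:
    route: Route
    reasoning: str       # carried along so human reviewers can see it


# Placeholder value for illustration only; the real threshold is confidential.
AUTO_RESOLVE_THRESHOLD = 0.9


def route_claim(score: EvidenceScore,
                threshold: float = AUTO_RESOLVE_THRESHOLD) -> RoutingDecision:
    """Auto-resolve confident claims; send everything else to the human queue."""
    if score.confidence >= threshold:
        route = Route.AUTO_RESOLVE
    else:
        route = Route.HUMAN_REVIEW
    return RoutingDecision(route=route, reasoning=score.reasoning)


clear = EvidenceScore(0.95, 0.93, 0.97, "Photo shows a burger; order lists a pizza.")
blurry = EvidenceScore(0.60, 0.55, 0.40, "Image too dark to identify the item.")

print(route_claim(clear).route.value)   # auto_resolve
print(route_claim(blurry).route.value)  # human_review
```

Keeping the reasoning on the decision object, rather than only on the score, reflects the point that ambiguous cases reach reviewers with the model's explanation attached.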

Global adoption

Adopted as a Delivery Hero global standard
The architectural blueprint and curated datasets developed at HungerStation were shared with Delivery Hero. They served as the foundation and inspiration for the company's final global solution — scaling the approach across the international organisation.

The system started as a local PoC, and its validation at HungerStation gave Delivery Hero enough confidence to invest in a production-grade version. The core design decisions (multimodal evidence analysis, confidence-based routing, and human-in-the-loop fallback) were preserved in the global implementation.

Design decisions

Multimodal reasoning
The model jointly processes the image and the textual claim context, not just the image in isolation. This reduces false positives from ambiguous photos.
Confidence-based routing
High-confidence decisions are resolved automatically. Low-confidence cases go to human reviewers with model reasoning — accelerating manual review rather than replacing it.
Curated training data
A dataset of labelled claim images was built and maintained to improve model reliability, then shared upstream with Delivery Hero for its global rollout.
Audit trail
Every model decision is logged with its reasoning — maintaining accountability and enabling continuous model improvement over time.
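As an illustration of the audit-trail idea, the sketch below serialises one decision, together with its reasoning, as a single JSON line. The record fields, the function names, and the JSON-lines format are assumptions made for this example; the source does not describe the actual log schema or where the logs are stored.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry; the real schema is not public."""
    claim_id: str
    decision: str       # e.g. "auto_resolve" or "human_review"
    confidence: float
    reasoning: str      # the model's explanation, stored verbatim
    logged_at: str      # ISO 8601 timestamp in UTC


def make_audit_record(claim_id: str, decision: str, confidence: float,
                      reasoning: str,
                      now: Optional[datetime] = None) -> AuditRecord:
    """Build a record; `now` can be injected to keep tests deterministic."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return AuditRecord(claim_id, decision, confidence, reasoning, timestamp)


def to_json_line(record: AuditRecord) -> str:
    """Serialise a record as one JSON line, suitable for append-only logs."""
    return json.dumps(asdict(record), sort_keys=True)


record = make_audit_record(
    claim_id="claim-001",  # made-up identifier
    decision="human_review",
    confidence=0.42,
    reasoning="Image too dark to identify the item.",
    now=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
print(to_json_line(record))
```

Storing the reasoning verbatim next to the decision is what would allow later review of individual outcomes and reuse of the logs for model improvement.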

Built with

Multimodal LLM · Computer Vision · Python · GCP · BigQuery · REST API · Custom Dataset