The problem
Food delivery platforms process thousands of refund claims daily. A significant portion of those claims involve image evidence — photos of incorrect, missing, or damaged items submitted by customers. Manually reviewing each image to determine claim validity is slow, inconsistent, and doesn't scale.
The challenge: build a system that can look at an image, understand its content in context, and make a reliable audit decision that matches or exceeds human reviewer accuracy, at scale.
How it works
The model evaluates the image against the claimed issue, checking for visual consistency between the order contents and the reported problem. Claims for which the model's confidence exceeds a threshold are resolved automatically; ambiguous cases are routed to human reviewers with the model's reasoning attached, which significantly reduces review time even in the manual queue.
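The routing step described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production implementation: the names `ModelVerdict`, `RoutingDecision`, and `route_claim`, the queue labels, and the 0.90 threshold are all hypothetical, and the model call that would produce the verdict is left out.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the write-up does not state the real threshold.
AUTO_RESOLVE_THRESHOLD = 0.90


@dataclass
class ModelVerdict:
    """Assumed shape of the model's output for one claim."""
    claim_id: str
    supports_claim: bool  # is the image consistent with the reported issue?
    confidence: float     # model confidence in that verdict, in [0, 1]
    reasoning: str        # free-text explanation produced by the model


@dataclass
class RoutingDecision:
    claim_id: str
    queue: str            # "auto_approve", "auto_reject", or "human_review"
    reasoning: str        # carried along so reviewers can see it


def route_claim(verdict: ModelVerdict,
                threshold: float = AUTO_RESOLVE_THRESHOLD) -> RoutingDecision:
    """Resolve confident verdicts automatically; send the rest to a human."""
    if verdict.confidence > threshold:
        queue = "auto_approve" if verdict.supports_claim else "auto_reject"
    else:
        # Ambiguous case: a human decides, with the model's reasoning attached.
        queue = "human_review"
    return RoutingDecision(verdict.claim_id, queue, verdict.reasoning)


if __name__ == "__main__":
    confident = ModelVerdict("c-1", True, 0.97, "Photo shows a crushed box.")
    unsure = ModelVerdict("c-2", True, 0.55, "Image is too dark to judge.")
    print(route_claim(confident).queue)
    print(route_claim(unsure).queue)
```

Keeping the threshold as a parameter makes the trade-off explicit: raising it sends more claims to the manual queue, lowering it resolves more of them automatically.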
Global adoption
The system started as a local PoC at HungerStation, and its validation there gave Delivery Hero enough confidence to invest in a production-grade version. The core design decisions (multimodal evidence analysis, confidence-based routing, and human-in-the-loop fallback) were preserved in the global implementation.