---
title: "AI Week: Your AI Models Aren't Learning From Production Data"
description: "Unlike traditional ML, GenAI apps don't learn from production data. See how Opik and Zuplo AI Gateway enable continuous optimization for LLM applications."
canonicalUrl: "https://zuplo.com/blog/2025/10/02/comet-ml-opik"
pageType: "blog"
date: "2025-10-02"
authors: "josh"
tags: "AI"
image: "https://zuplo.com/og?text=Your%20AI%20Models%20Aren%27t%20Learning%20From%20Production%20Data"
---
There's a fundamental difference between traditional ML and GenAI development:
GenAI applications don't get better on their own over time. Whether you have one
user or a billion, your LLM stays static, your prompts stay static, and your
system learns nothing from production data unless you manually intervene.

This realization drove Gideon Mendels, CEO of [Comet ML](https://comet.com), to
build solutions for this problem. "Every ML team I ever spoke to retrains their
models with new production data," he explains in our latest conversation for AI
Week.

"But with these GenAI systems, there's no mechanism to learn from additional
data."

<YouTubeVideo videoId="M_rWY2qMEyk" />

## From Spreadsheets to Automated Optimization

Gideon's path to solving this problem started 12 years ago when he moved from
software engineering to ML at Google. Working on hate speech detection, he found
ML teams managing everything with spreadsheets and emails, a sharp contrast to
the tooling software developers had access to.

That observation led to Comet in 2018, and more recently, to
[Opik](https://www.comet.com/site/products/opik/), their open-source platform
for building production-ready LLM applications. In nine months, it's grown to
nearly [15,000 GitHub stars](https://github.com/comet-ml/opik).

## The Three-Step Path to Production AI

Gideon's suggested framework for successful GenAI applications has three
components:

**1. Instrument Everything**  
Wrap your OpenAI client or add a callback to whatever AI/LLM framework you're
using. In 30 seconds, you get complete observability: every input, output, tool
call, and execution trace. That gives you real data to start working with.

**2. Build Evaluation Datasets**  
Create a test suite of sample questions and correct answers. This isn't
traditional unit testing (semantic meaning matters, not exact strings), but it
serves the same purpose: confidence that changes _improve_ rather than break
your application.

**3. Automate Optimization**  
Opik's Agent Optimizer uses reinforcement learning to automatically generate and
test prompt candidates, turning manual prompt engineering into an automated
optimization loop.
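
To make the first two steps concrete, here is a minimal sketch in plain Python. It does not use Opik's SDK: the `traced` decorator, the in-memory `TRACES` list, the canned `answer` function, and the token-overlap `similarity` metric are all hypothetical stand-ins, chosen so the example runs without an API key. A real setup would send traces to a tracing backend and would likely score answers with a stronger semantic metric.

```python
import functools
import re
import time

# Hypothetical in-memory trace store; a real setup would ship these elsewhere.
TRACES = []


def traced(fn):
    """Step 1: record the input, output, and latency of every call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        output = fn(*args, **kwargs)
        TRACES.append({
            "name": fn.__name__,
            "input": {"args": args, "kwargs": kwargs},
            "output": output,
            "latency_s": time.perf_counter() - start,
        })
        return output
    return wrapper


@traced
def answer(question: str) -> str:
    """Stand-in for a real LLM call; returns canned text."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
    }
    return canned.get(question, "I don't know.")


# Step 2: an evaluation dataset of sample questions and correct answers.
DATASET = [
    {"question": "What is the capital of France?",
     "expected": "Paris is the capital of France."},
    {"question": "What is the capital of Japan?",
     "expected": "Tokyo is the capital of Japan."},
]


def similarity(a: str, b: str) -> float:
    """Crude proxy for semantic similarity: Jaccard overlap of word sets."""
    words_a = set(re.findall(r"[a-z']+", a.lower()))
    words_b = set(re.findall(r"[a-z']+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def evaluate(app, dataset, threshold: float = 0.8) -> float:
    """Return the fraction of examples whose answer scores at or above threshold."""
    passed = sum(
        similarity(app(example["question"]), example["expected"]) >= threshold
        for example in dataset
    )
    return passed / len(dataset)


print(evaluate(answer, DATASET))  # 0.5: the France answer passes, the Japan one fails
```

The France answer passes even though its wording differs from the expected string, which is the point of scoring on meaning rather than exact matches. Each call made during evaluation also lands in `TRACES`, so the same instrumentation serves both debugging and testing.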

## The Continuous Improvement Loop

When you connect these pieces, production failures get added to your evaluation
dataset. Automated optimization runs generate new prompt candidates. A/B tests
validate improvements against live traffic. The result is something that
resembles traditional ML retraining, but for the LLM era.
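
One pass through that loop might look like the sketch below. It is an illustration under stated assumptions, not Opik's Agent Optimizer: `fake_model` is a hypothetical stand-in for an LLM, the candidate prompts are written by hand rather than generated, scoring is a simple keyword check, and the A/B test against live traffic is left out.

```python
def fake_model(prompt: str, question: str) -> str:
    """Hypothetical LLM stand-in: matches country names case-sensitively
    unless the prompt tells it to ignore case."""
    facts = {"France": "Paris", "Japan": "Tokyo"}
    ignore_case = "ignore case" in prompt.lower()
    for country, capital in facts.items():
        if ignore_case:
            found = country.lower() in question.lower()
        else:
            found = country in question
        if found:
            return capital
    return "I don't know."


def score(model, prompt: str, dataset: list) -> float:
    """Fraction of examples whose expected keyword appears in the answer."""
    hits = sum(
        example["expected"].lower() in model(prompt, example["question"]).lower()
        for example in dataset
    )
    return hits / len(dataset)


def improvement_cycle(model, current_prompt, candidates, dataset, failures):
    """Fold production failures into the dataset, then pick the best prompt."""
    grown = dataset + failures  # failures become permanent regression tests
    scored = {p: score(model, p, grown) for p in [current_prompt, *candidates]}
    # On ties, max() keeps the first key, so the current prompt is only
    # replaced by a candidate that scores strictly higher.
    best = max(scored, key=scored.get)
    return best, grown, scored


dataset = [{"question": "What is the capital of France?", "expected": "Paris"}]
failures = [{"question": "capital of japan?", "expected": "Tokyo"}]  # seen in production
current = "Answer the question."
candidates = [
    "Answer the question in one word.",
    "Answer the question. Ignore case and typos in the input.",
]

best, grown, scored = improvement_cycle(fake_model, current, candidates, dataset, failures)
print(best)
```

Because the failure stays in the dataset after this cycle, every later prompt change is tested against it, which is what keeps a fixed problem from quietly coming back.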

You can see this in action in the video above, where we integrate Opik with
Zuplo's AI Gateway in under a minute. The combination provides centralized
governance, cost controls, and comprehensive observability, all available for
free.

## Why This Matters

The teams moving successfully from POC to production follow methodologies like
this. They don't just build; they measure, test, and continuously improve. They
treat AI development like software engineering, with evaluation suites,
regression testing, and automated optimization.

The alternative is static prompts, manual tuning, and applications that never
improve despite having thousands of users generating valuable feedback data.

## Try It Yourself

Opik is fully open source. You can find everything you need to get started on
[GitHub](https://github.com/comet-ml/opik) or at
[comet.com](https://www.comet.com/site/products/opik/).

Zuplo's AI Gateway is also
[available for free](https://portal.zuplo.com/signup?utm_source=comet-blog&utm_campaign=ai-week&utm_source=web&ref=aiweek-comet-blog),
and features a
[specific policy for working with Opik](https://zuplo.com/docs/ai-gateway/policies/comet-opik-tracing).

## More from AI Week

This article is part of Zuplo's AI Week, a week dedicated to AI, LLMs, and, of
course, APIs, centered around the release of our
[AI Gateway](https://zuplo.com/ai-gateway).

You can find the other articles and videos from this week below:

- Day 1: [AI Gateway Overview](/blog/zuplo-ai-gateway) with Zuplo CEO, Josh
  Twist
- Day 2:
  [Is Spec-Driven AI Development the Future?](/blog/spec-driven-ai-development)
  with Guy Podjarny, CEO & Founder of Tessl
- Day 2:
  [Using AI Gateway with LangChain & OpenAI](/blog/ai-gateway-with-langchain)
  with John McBride, Staff Software Engineer at Zuplo
- Day 3:
  [Your AI Models Aren't Learning From Production Data](/blog/comet-ml-opik)
  with Gideon Mendels, CEO & Co-Founder of Comet ML
- Day 3:
  [Using Claude Code with Zuplo's AI Gateway](/blog/ai-gateway-with-claude-code)
  with Martyn Davies, Developer Advocate at Zuplo
- Day 4:
  [What Autonomous Agents Actually Need from Your APIs](/blog/what-autonomous-agents-actually-need-from-your-apis)
  with Emmanuel Paraskakis, CEO of Level250
- Day 4: [Using AI Gateway with goose AI agent](/blog/ai-gateway-with-goose)
  with Martyn Davies, Developer Advocate at Zuplo