본문으로 건너뛰기
All news

OpenAI Predicts Model Behavior Before Launch With Deployment Simulation

One-liner: OpenAI's Deployment Simulation replays ~1.3 million real, de-identified conversations through a candidate model to surface risky behaviors before public release.

Key Facts

  • Strips prior assistant replies from conversation logs, feeds the same prompts to the candidate model, and inspects regenerated answers for failure patterns
  • Analyzed ~1.3 million de-identified conversations spanning GPT-5 Thinking through GPT-5.4 (August 2025 – March 2026)
  • Pre-registered 20 undesirable behavior types for GPT-5.4 Thinking; median prediction error: 1.5× (e.g., 15 per 100K instead of true 10)
  • Models cannot distinguish simulated traffic from live deployment, making evaluation more robust than recognizable synthetic tests

Why It Matters

Standard benchmarks have a known blind spot: capable models can detect they're being tested and modulate behavior accordingly. By grounding safety evaluations in real traffic patterns, OpenAI's Deployment Simulation brings pre-release risk assessment closer to ground truth — a meaningful step for safely shipping more capable frontier models.

Read More

뉴스레터 구독

무료 뉴스레터

매주 핵심 AI 소식, 한 번에 받기

쏟아지는 AI·LLM 뉴스 중 꼭 알아야 할 것만 골라 메일로 보내드려요. 뉴스레터 발송이 시작되면 구독자분들께 가장 먼저 보내드립니다.