Cloud | June 18, 2026 | 6 min read

Azure Returns 410 Gone for GPT-4o on October 1 and Auto-Upgrade Skips the Deployments That Matter

On October 1, 2026, Azure OpenAI in Microsoft Foundry retires the GA gpt-4o (2024-11-20) and gpt-4o-mini (2024-07-18) versions, after which calls to a retired deployment return HTTP 410 Gone. Standard-family deployments are auto-upgraded region by region, but Provisioned (PTU) deployments and anything set to NoAutoUpgrade are not, and Microsoft says the date is not extendable.

The change and the date

On October 1, 2026, Azure OpenAI in Microsoft Foundry retires two general-availability model versions that a large share of production workloads still run on: gpt-4o version 2024-11-20 and gpt-4o-mini version 2024-07-18. After that date, any inference call to a deployment still pointing at one of those retired versions returns HTTP 410 Gone, and Microsoft states plainly that these retirement dates are not extendable. There is no exception process to buy more time.

What makes this retirement different from the usual deprecation churn is the split in how deployments are handled. Standard-family deployments get an automatic upgrade. A meaningful set of deployments do not, and those are exactly the ones run by teams that thought ahead about version pinning and reserved capacity. The protection they built becomes the thing that bites them on October 1.

Who is hit hardest

Microsoft auto-upgrades Standard, Data Zone Standard, and Global Standard deployments on a rolling, region-by-region schedule. If you run one of those Standard-family SKUs, you will very likely be carried onto a current model without lifting a finger, though not necessarily on a timeline you chose.

Two groups are not carried, and must migrate by hand:

  • Provisioned (PTU) deployments. Per Microsoft's lifecycle policy, Provisioned deployments are not auto-upgraded and customers must manually migrate to the replacement. PTU is where the high-throughput, latency-sensitive, contractually important workloads tend to live, so the deployments most likely to be in this bucket are also the ones a 410 hurts most.
  • Any deployment, on any SKU, configured with versionUpgradeOption set to NoAutoUpgrade. Microsoft's own documentation describes that value as: never auto-upgrade, and the deployment stops working at retirement. Teams that pinned a version deliberately, to keep evaluation results stable or to satisfy a change-control process, opted out of the safety net on purpose and may have forgotten they did.
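The two rules above reduce to a small predicate. What follows is a minimal sketch, not Microsoft's logic: the `Deployment` type and `needs_manual_migration` are hypothetical names, and the SKU strings in `STANDARD_FAMILY` are assumptions about how the three Standard-family SKUs are spelled on a deployment resource, so check them against what your own deployments report. Anything outside that set, including every Provisioned variant, is flagged conservatively.

```python
from dataclasses import dataclass

# Assumed spellings of the Standard-family SKU names; verify against your own resources.
STANDARD_FAMILY = {"Standard", "DataZoneStandard", "GlobalStandard"}


@dataclass(frozen=True)
class Deployment:
    """Hypothetical, minimal view of one deployment."""
    name: str
    sku_name: str
    version_upgrade_option: str


def needs_manual_migration(d: Deployment) -> bool:
    """Return True when auto-upgrade will not carry this deployment."""
    # A pinned deployment opts out of auto-upgrade on any SKU.
    if d.version_upgrade_option == "NoAutoUpgrade":
        return True
    # Provisioned SKUs, and any SKU not recognised here, are flagged
    # for a human to review rather than assumed safe.
    return d.sku_name not in STANDARD_FAMILY
```

Flagging unknown SKUs is a deliberate choice in this sketch: a false positive costs a few minutes of review, while a false negative is a 410 in production.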

There is recent precedent worth internalizing. Earlier gpt-4o versions were already auto-upgraded on the Standard SKU when versions 2024-05-13 and 2024-08-06 retired on March 31, 2026 (Microsoft's own worked example describes those two versions being moved to gpt-5.1 on Standard). So if you assumed the gpt-4o family was fully handled back then, note that the still-live GA version 2024-11-20, plus gpt-4o-mini 2024-07-18, are the ones now reaching the line, and the PTU and NoAutoUpgrade audiences were never covered by an auto-upgrade in the first place.

The failure mode in plain terms

A retired model does not degrade gracefully. It returns a hard 410 Gone, the HTTP status for a resource that is permanently gone. Your client library will surface that as an error, not as a fallback to another model. Anything synchronous (a chat endpoint, a RAG answer path, an agent step) starts failing for users the moment the region you sit in crosses the retirement boundary. Anything asynchronous (a batch job, a nightly enrichment pipeline, a queue consumer) starts throwing and, depending on retry logic, either dead-letters or silently stalls.
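One way to keep asynchronous consumers from stalling silently is to treat 410 as its own class of failure in retry logic. The sketch below is an illustration under assumptions: `Action`, `RETRYABLE_STATUSES`, and `classify_status` are hypothetical names, and the set of retryable status codes is a common convention, not something Microsoft prescribes.

```python
from enum import Enum


class Action(Enum):
    RETRY = "retry"                    # transient: back off and try again
    FAIL_AND_ALERT = "fail_and_alert"  # permanent: stop retrying, page a human
    FAIL = "fail"                      # other errors: surface to the caller


# Assumed set of transient statuses; tune to your own client and gateway.
RETRYABLE_STATUSES = {408, 429, 500, 502, 503, 504}


def classify_status(status_code: int) -> Action:
    """Decide what a worker should do with a failed inference call."""
    # 410 Gone means the resource is permanently gone; retrying cannot
    # succeed, so the job should fail loudly instead of looping.
    if status_code == 410:
        return Action.FAIL_AND_ALERT
    if status_code in RETRYABLE_STATUSES:
        return Action.RETRY
    return Action.FAIL
```

A queue consumer could call `classify_status` on each failed response and route `FAIL_AND_ALERT` to an on-call channel, so a retired deployment is noticed on the first message, not after the retry budget is spent.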

The quiet danger is the deployment you forgot you owned: an internal tool, a proof of concept that became load-bearing, a scheduled function nobody has touched in a year. Auto-upgrade will not save it if it is Provisioned or pinned to NoAutoUpgrade, and the first signal you get is a 410 in production.

What to do before October 1

Treat this as an inventory and migration exercise, not a code rewrite. The model contract is broadly compatible between the GPT-4o family and its successors, but confirm that against your own prompts and evaluations; do not assume it.

  • Inventory every Azure OpenAI deployment across all subscriptions and regions. Use the Models API to read lifecycleStatus and the per-SKU deprecation date for each, rather than eyeballing the portal one resource at a time.
  • Flag every deployment that is Provisioned, and every deployment (any SKU) with versionUpgradeOption set to NoAutoUpgrade. Those are your manual-migration list.
  • For each flagged deployment, pick the target version, validate it against your evaluation set and latency budget, and confirm the replacement is available in your region before you cut over. Microsoft lists a suggested replacement per version (gpt-5.1 for gpt-4o 2024-11-20, gpt-4.1-mini for gpt-4o-mini 2024-07-18); treat these as the current suggestion and re-verify, since model availability and recommendations shift.
  • For Provisioned specifically, confirm you have quota for the target model before migrating, and choose between in-place migration and a side-by-side deployment so you can test and roll back.
  • Wire up an alert. Subscribe the relevant subscriptions to Azure Service Health advisories filtered to Azure OpenAI Service so a human, not an end user, finds out first.
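Once the inventory exists, the checklist above can be turned into a tracked backlog. The following is a rough sketch that works on plain dictionaries you have already exported. It does not call any Azure API, and the field names (`name`, `model`, `version`, `sku`, `upgrade_option`), the SKU spellings, and `build_backlog` itself are all assumptions made for illustration. The retirement date and suggested targets are the ones quoted in this article and should be re-verified before use.

```python
from datetime import date

RETIREMENT_DATE = date(2026, 10, 1)

# Suggested replacements as quoted in this article; re-verify before migrating.
SUGGESTED_TARGET = {
    ("gpt-4o", "2024-11-20"): "gpt-5.1",
    ("gpt-4o-mini", "2024-07-18"): "gpt-4.1-mini",
}

# Assumed spellings of the Standard-family SKU names.
STANDARD_FAMILY = {"Standard", "DataZoneStandard", "GlobalStandard"}


def build_backlog(deployments: list[dict], today: date) -> list[dict]:
    """Return the retiring deployments, manual-migration items first."""
    backlog = []
    for d in deployments:
        target = SUGGESTED_TARGET.get((d["model"], d["version"]))
        if target is None:
            continue  # not one of the two retiring versions
        manual = (
            d["upgrade_option"] == "NoAutoUpgrade"
            or d["sku"] not in STANDARD_FAMILY
        )
        backlog.append({
            "name": d["name"],
            "suggested_target": target,
            "manual": manual,
            "days_left": (RETIREMENT_DATE - today).days,
        })
    # Manual items sort first (False < True), then by name for stable output.
    backlog.sort(key=lambda row: (not row["manual"], row["name"]))
    return backlog
```

Feeding the result into whatever tracker the team already uses, with an owner assigned to every row where `manual` is true, keeps the deadline visible without building new tooling.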

An honest caveat

Do not panic-migrate the deployments that are genuinely on a Standard-family SKU. If a workload runs Standard, Data Zone Standard, or Global Standard (and is not pinned to NoAutoUpgrade), Microsoft will auto-upgrade it region by region, and forcing a manual change adds risk rather than removing it. The real action item is narrow: find the Provisioned and the NoAutoUpgrade deployments, because those are the only ones that go dark on their own.

Two things to keep verifying rather than trusting from memory. First, the suggested replacement model is exactly that, a suggestion, and Microsoft is explicit that it declares replacements relatively late (roughly 90 to 120 days before retirement) and that recommendations change as better models ship; confirm the target the week you migrate, not from a months-old note. Second, dates and version strings on a fast-moving model schedule do move, so re-check the primary schedule before you act. For the broader pattern of dated snapshots disappearing on a fixed clock, see our note on OpenAI retiring older GPT-5 and o3 snapshots.

How a senior team de-risks this

The teams that sail through October 1 are not the ones with the newest model; they are the ones with an accurate inventory and an owner for every deployment. The work is unglamorous: enumerate deployments programmatically across every subscription, classify each by SKU and upgrade option, and treat the Provisioned and NoAutoUpgrade set as a tracked migration backlog with a hard deadline rather than a someday task. A version pin or a PTU reservation is a deliberate decision, and every deliberate decision needs an owner who is watching the retirement clock for it.

That ownership and inventory discipline is the core of how we run ongoing cloud management for clients: catch the fixed-date retirements before they become a 410 in production, validate the replacement against real workloads, and migrate on a planned change window instead of an incident bridge.
