{"id":60093,"date":"2026-03-29T09:46:07","date_gmt":"2026-03-29T04:16:07","guid":{"rendered":"https:\/\/officechai.com\/?p=60093"},"modified":"2026-03-29T09:46:11","modified_gmt":"2026-03-29T04:16:11","slug":"gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74","status":"publish","type":"post","link":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/","title":{"rendered":"GPT 5.4 (xhigh) Scores 95% On 2026 US Math Olympiad, Gemini 3.1 Pro Second With 74%"},"content":{"rendered":"\n<p>AI models are continuing to make rapid strides in math and science.<\/p>\n\n\n\n<p><a href=\"https:\/\/officechai.com\/ai\/gpt-5-4-benchmarks\/\">GPT-5.4<\/a>, OpenAI&#8217;s current flagship, has scored 95.24% on the 2026 USA Math Olympiad (USAMO), according to a new evaluation by MathArena. <a href=\"https:\/\/officechai.com\/ai\/gemini-3-1-pro-benchmarks\/\">Gemini 3.1 Pro<\/a> finished second at 74.4%, followed by <a href=\"https:\/\/officechai.com\/ai\/claude-opus-4-6-benchmarks-released\/\">Claude Opus 4.6<\/a> at 47%, and open-source model Step-3.5-Flash at 44.6%.<\/p>\n\n\n\n<p>The jump is striking in context. A year ago, on USAMO 2025, the same class of models produced solutions riddled with circular arguments, unsupported guesses, and incoherent structure. 
In 2026, those failure modes are largely gone.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" fetchpriority=\"high\" decoding=\"async\" width=\"640\" height=\"271\" src=\"https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42-1024x434.png?resize=640%2C271&#038;ssl=1\" alt=\"\" class=\"wp-image-60095\" srcset=\"https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?resize=1024%2C434&amp;ssl=1 1024w, https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?resize=300%2C127&amp;ssl=1 300w, https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?resize=768%2C325&amp;ssl=1 768w, https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?resize=1536%2C651&amp;ssl=1 1536w, https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?resize=2048%2C868&amp;ssl=1 2048w, https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?w=1280 1280w, https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?w=1920 1920w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">What Changed<\/h2>\n\n\n\n<p>The improvement isn&#8217;t just in scores \u2014 it&#8217;s in the nature of the errors. In 2025, models frequently guessed rather than proved. 
In 2026, the remaining mistakes are subtler: open models occasionally slip back into chain-of-thought reasoning mid-proof without completing the argument, and Opus 4.6 ran out of its 128,000-token budget on 4 of 24 attempts, three of them on a single problem (Problem 2).<\/p>\n\n\n\n<p>GPT-5.4&#8217;s only notable error was on Problem 5, where one of its runs incorrectly argued the statement was false and produced an invalid counterexample \u2014 a surprising stumble for an otherwise dominant performance.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The Cost Gap<\/h2>\n\n\n\n<p>The benchmark also highlights a significant cost disparity. GPT-5.4 (xhigh) cost $5.15 per run. Gemini 3.1 Pro, which <a href=\"https:\/\/officechai.com\/ai\/google-gemini-3-1-pro-takes-top-spot-in-artificial-analysis-intelligence-index-at-price-half-that-of-opus-4-6-gpt-5-2\/\">has established itself as cost-efficient at the frontier<\/a>, cost just $2.20. Claude Opus 4.6 was the most expensive at $13.23 \u2014 nearly 2.6x the cost of GPT-5.4 \u2014 for a score less than half as high. Step-3.5-Flash, the strongest open model, ran at just $0.22.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Grading at Scale<\/h2>\n\n\n\n<p>MathArena built a semi-automated grading pipeline for the evaluation, using a jury of three models \u2014 GPT-5.4, Gemini 3.1 Pro, and Opus 4.6 \u2014 rather than a single judge. The jury approach was designed to counter two documented problems with LLM-based grading: self-bias (models scoring their own outputs more generously) and formatting bias (rewarding verbose or polished-looking solutions).<\/p>\n\n\n\n<p>The pipeline&#8217;s accuracy held up well under human review: final scores shifted by at most two points, and only for three solutions. 
Notably, GPT-5.4 was the most reliable judge, while Gemini 3.1 Pro and Opus 4.6 both significantly inflated scores for their own outputs.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What It Means<\/h2>\n\n\n\n<p>The USAMO result adds to a growing body of evidence that frontier AI is closing in on expert-level mathematical reasoning. GPT-5.4&#8217;s <a href=\"https:\/\/officechai.com\/ai\/gpt-5-4-tied-with-gemini-3-1-pro-on-artificial-analysis-intelligence-index-first-time-a-new-openai-model-hasnt-topped-index-outright\/\">benchmark dominance has been consistent across categories<\/a> since its release, and the USAMO score \u2014 near-saturation on one of the most rigorous high school math competitions in the world \u2014 underscores how rapidly the ceiling has moved. And if this scorching pace of development continues, the <a href=\"https:\/\/officechai.com\/ai\/we-predict-far-more-dramatic-ai-progress-in-next-two-years-anthropic\/\">big breakthroughs<\/a> in science that top voices in AI have been <a href=\"https:\/\/officechai.com\/ai\/ai-could-lead-to-50-years-of-scientific-progress-in-7-years-larry-summers\/\">promising<\/a> might soon come to fruition.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>AI models are continuing to make rapid strides in math and science. 
GPT-5.4, OpenAI&#8217;s current flagship, has scored 95.24% on the 2026 USA&#8230;<\/p>\n","protected":false},"author":1,"featured_media":60095,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1029],"tags":[],"class_list":["post-60093","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>GPT 5.4 (xhigh) Scores 95% On 2026 US Math Olympiad, Gemini 3.1 Pro Second With 74%<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"GPT 5.4 (xhigh) Scores 95% On 2026 US Math Olympiad, Gemini 3.1 Pro Second With 74%\" \/>\n<meta property=\"og:description\" content=\"AI models are continuing to make rapid strides in math and science. 
GPT-5.4, OpenAI&#8217;s current flagship, has scored 95.24% on the 2026 USA...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/\" \/>\n<meta property=\"og:site_name\" content=\"OfficeChai\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/OfficeChai\/\" \/>\n<meta property=\"article:published_time\" content=\"2026-03-29T04:16:07+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-03-29T04:16:11+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2532\" \/>\n\t<meta property=\"og:image:height\" content=\"1073\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"OfficeChai Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@OfficeChai\" \/>\n<meta name=\"twitter:site\" content=\"@OfficeChai\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"OfficeChai Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/\",\"url\":\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/\",\"name\":\"GPT 5.4 (xhigh) Scores 95% On 2026 US Math Olympiad, Gemini 3.1 Pro Second With 74%\",\"isPartOf\":{\"@id\":\"https:\/\/officechai.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?fit=2532%2C1073&ssl=1\",\"datePublished\":\"2026-03-29T04:16:07+00:00\",\"dateModified\":\"2026-03-29T04:16:11+00:00\",\"author\":{\"@id\":\"https:\/\/officechai.com\/#\/schema\/person\/5861f1134993293cc28905de7624d6b2\"},\"breadcrumb\":{\"@id\":\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#primaryimage\",\"url\":\"https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?fit=2532%2C1073&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/ima
ge-42.png?fit=2532%2C1073&ssl=1\",\"width\":2532,\"height\":1073,\"caption\":\"us math olympiad 2026\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/officechai.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"GPT 5.4 (xhigh) Scores 95% On 2026 US Math Olympiad, Gemini 3.1 Pro Second With 74%\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/officechai.com\/#website\",\"url\":\"https:\/\/officechai.com\/\",\"name\":\"OfficeChai\",\"description\":\"Startups, Businesses And Careers\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/officechai.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/officechai.com\/#\/schema\/person\/5861f1134993293cc28905de7624d6b2\",\"name\":\"OfficeChai Team\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/officechai.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/61d744733248dc647d505d0676bb425323413132ee5447e86aa8eecbbb7b27d5?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/61d744733248dc647d505d0676bb425323413132ee5447e86aa8eecbbb7b27d5?s=96&d=mm&r=g\",\"caption\":\"OfficeChai Team\"},\"description\":\"Dotting the i's, crossing the t's.\",\"url\":\"https:\/\/officechai.com\/author\/admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"GPT 5.4 (xhigh) Scores 95% On 2026 US Math Olympiad, Gemini 3.1 Pro Second With 74%","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/","og_locale":"en_US","og_type":"article","og_title":"GPT 5.4 (xhigh) Scores 95% On 2026 US Math Olympiad, Gemini 3.1 Pro Second With 74%","og_description":"AI models are continuing to make rapid strides in math and science. GPT-5.4, OpenAI&#8217;s current flagship, has scored 95.24% on the 2026 USA...","og_url":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/","og_site_name":"OfficeChai","article_publisher":"https:\/\/www.facebook.com\/OfficeChai\/","article_published_time":"2026-03-29T04:16:07+00:00","article_modified_time":"2026-03-29T04:16:11+00:00","og_image":[{"width":2532,"height":1073,"url":"http:\/\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png","type":"image\/png"}],"author":"OfficeChai Team","twitter_card":"summary_large_image","twitter_creator":"@OfficeChai","twitter_site":"@OfficeChai","twitter_misc":{"Written by":"OfficeChai Team","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/","url":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/","name":"GPT 5.4 (xhigh) Scores 95% On 2026 US Math Olympiad, Gemini 3.1 Pro Second With 74%","isPartOf":{"@id":"https:\/\/officechai.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#primaryimage"},"image":{"@id":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?fit=2532%2C1073&ssl=1","datePublished":"2026-03-29T04:16:07+00:00","dateModified":"2026-03-29T04:16:11+00:00","author":{"@id":"https:\/\/officechai.com\/#\/schema\/person\/5861f1134993293cc28905de7624d6b2"},"breadcrumb":{"@id":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#primaryimage","url":"https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?fit=2532%2C1073&ssl=1","contentUrl":"https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?fit=2532%2C1073&ssl=1","width":2532,"height":1073,"caption":"us math olympiad 
2026"},{"@type":"BreadcrumbList","@id":"https:\/\/officechai.com\/ai\/gpt-5-4-xhigh-scores-95-on-2026-us-math-olympiad-gemini-3-1-pro-second-with-74\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/officechai.com\/"},{"@type":"ListItem","position":2,"name":"GPT 5.4 (xhigh) Scores 95% On 2026 US Math Olympiad, Gemini 3.1 Pro Second With 74%"}]},{"@type":"WebSite","@id":"https:\/\/officechai.com\/#website","url":"https:\/\/officechai.com\/","name":"OfficeChai","description":"Startups, Businesses And Careers","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/officechai.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/officechai.com\/#\/schema\/person\/5861f1134993293cc28905de7624d6b2","name":"OfficeChai Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/officechai.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/61d744733248dc647d505d0676bb425323413132ee5447e86aa8eecbbb7b27d5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/61d744733248dc647d505d0676bb425323413132ee5447e86aa8eecbbb7b27d5?s=96&d=mm&r=g","caption":"OfficeChai Team"},"description":"Dotting the i's, crossing the 
t's.","url":"https:\/\/officechai.com\/author\/admin\/"}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/officechai.com\/wp-content\/uploads\/2026\/03\/image-42.png?fit=2532%2C1073&ssl=1","jetpack_shortlink":"https:\/\/wp.me\/p685C6-fDf","jetpack_likes_enabled":true,"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/posts\/60093","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/comments?post=60093"}],"version-history":[{"count":1,"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/posts\/60093\/revisions"}],"predecessor-version":[{"id":60096,"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/posts\/60093\/revisions\/60096"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/media\/60095"}],"wp:attachment":[{"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/media?parent=60093"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/categories?post=60093"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/officechai.com\/wp-json\/wp\/v2\/tags?post=60093"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}