Can you find performance benchmarks for models, especially OpenAI's GPT-5.4?
AI Models & Capabilities
OpenAI's GPT-5.4 performs strongly on several benchmarks, particularly in professional and knowledge-work tasks. On GDPval, which evaluates AI across 44 professions such as accounting, sales, and engineering, GPT-5.4 scored 83%, a 20% improvement over GPT-5.2 at similar cost [1][2][5]. It also set records on OSWorld-Verified, a desktop-navigation benchmark driven by screenshots and actions, where its 75.0% surpasses the human baseline of 72.4%, and on WebArena Verified, showing gains in reasoning and computer use [2][7]. In addition, GPT-5.4 generates output at 77.6 tokens per second through OpenAI's API, above the 57.6 t/s median for comparable reasoning models [10].
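To put the throughput and OSWorld-Verified figures in perspective, here is a minimal Python sketch of the arithmetic. The helper name `relative_gain` is hypothetical, and the only inputs are the numbers quoted above; nothing here calls a real API.

```python
def relative_gain(value: float, baseline: float) -> float:
    """Relative gain of `value` over `baseline`, as a percentage."""
    return (value - baseline) / baseline * 100.0


# Figures quoted in the answer above
gpt54_tps = 77.6      # GPT-5.4 output speed, tokens per second
median_tps = 57.6     # median for comparable reasoning models
osworld_gpt54 = 75.0  # OSWorld-Verified score, percent
osworld_human = 72.4  # human baseline, percent

# Relative speed advantage over the median model
speedup_pct = relative_gain(gpt54_tps, median_tps)

# Absolute gap to the human baseline, in percentage points
osworld_gap_pts = osworld_gpt54 - osworld_human

print(f"Throughput advantage: {speedup_pct:.1f}%")
print(f"OSWorld gap: {osworld_gap_pts:.1f} percentage points")
```

Under these figures, the throughput advantage works out to roughly 34.7% over the median, and the OSWorld-Verified lead over humans is 2.6 percentage points.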
Among other models, China's GLM-5, an open-source model reported at 744 to 754 billion parameters, reaches 80-90% of frontier-model performance and outperforms competitors such as Gemini 3 Pro and Grok 4 on benchmarks the sources do not specify, at a lower cost of $1 per million input tokens [6][12]. The sources offer no detailed cross-model comparisons beyond these highlights.
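As a rough illustration of what $1 per million input tokens means in practice, the sketch below estimates input cost for a given token count. The function name `input_cost_usd` and the 250,000-token workload are hypothetical. Output-token pricing is not given in the sources, so it is left out.

```python
def input_cost_usd(tokens: int, usd_per_million: float = 1.0) -> float:
    """Estimated input cost in USD at a flat per-million-token rate.

    The default rate is the GLM-5 input price quoted above.
    """
    return tokens / 1_000_000 * usd_per_million


# Hypothetical workload: 250,000 input tokens
example_cost = input_cost_usd(250_000)
print(f"Estimated input cost: ${example_cost:.2f}")
```

This covers input tokens only; a full estimate would also need the output-token price.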
Sources
- [1] OpenAI Reclaims Benchmark Lead with GPT-5.4 Release — techstrong.ai
- [2] OpenAI Launches GPT-5.4 With Expanded Context Window, Improved Reasoning, and Higher Benchmark Performance — AI Insider
- [3] GPT-5.4 is here — and OpenAI just made every other AI model look slow — Tom's Guide
- [4] OpenAI’s GPT-5.4 sets new records on professional benchmarks — The Next Web
- [5] New OpenAI GPT-5.4 AI Model : Everything You Need to Know — Geeky Gadgets
- [6] Chinese AI Model Challenges US Margins — GAI Insights Newsletter
- [7] GPT-5.4 Review: OpenAI's Best Model Yet (Full Breakdown) — The Neuron
- [8] GPT-5.4: What I Verified With Code | by Lakshmi narayana .U — medium.com
- [9] OpenAI dishes out its first model on a plate of Cerebras silicon — theregister
- [10] GPT-5.4 (xhigh) - Intelligence, Performance & Price Analysis — Artificial Analysis
- [11] Introducing GPT-5.4 — OpenAI
- [12] Z.ai's GLM-5 Challenges Western AI with Competitive Pricing — The Rundown AI
- OpenAI: GPT-5.4 Review — Pricing, Benchmarks & Capabilities (2026) — Design for Online
- GPT-5.4 - Pricing, Context Window Size, and Benchmark Data — automatio.ai
- OpenAI launches GPT-5.4: reasoning, coding, and computer use in one — Techzine Global
- OpenAI launches GPT-5.4 Thinking and Pro combining coding, reasoning, and computer use in one model — The Decoder
- OpenAI launches GPT-5.4 with Pro and Thinking versions — TechCrunch
- OpenAI's new GPT-5.4 clobbers humans on pro-level work in tests - by 83% — ZDNET
- OpenAI launches GPT-5.4 with native computer use mode, financial plugins for Microsoft Excel, Google Sheets — VentureBeat
- OpenAI, in Desperate Need of a Win, Launches GPT-5.4 — Gizmodo
- GPT-5.4 Targets Anthropic’s Claude With Premium Pricing and Coding Muscle — www.trendingtopics.eu
Related questions
- What is retrieval-augmented generation (RAG), and why is it important for enterprise AI deployment?
- How should non-technical executives evaluate and compare AI model performance benchmarks?
- What is multimodal AI, and why does it matter for practical business applications?
- How quickly are AI capabilities improving, and is there credible evidence that the pace of progress is slowing?