Ever wondered if an AI could just… build a video game for you? Not just write a snippet of code, but create a playable, recognizable game from a single request? We hear a lot about the power of new AI models, but talk is cheap. Here at Minava, we love putting tech to the test. So we staged an epic showdown: we challenged five of the world’s top AI models to create a Mario game from scratch. The results were fascinating, frustrating, and ultimately mind-blowing.
This isn’t just about fun and games. This is a deep dive into how well these large language models understand complex human requests, how creative they are, and how much raw coding power they have. Get ready, because this is the ultimate test of whether AI can build a game from scratch.
The Grand Challenge: One Prompt, Five AI Contenders
To keep things fair, we gave every AI the exact same, detailed prompt. We didn’t want just any game; we wanted a legend reborn. Here’s a summary of what we asked for:
“Create a 2D platformer game inspired by Super Mario. The game world, characters, enemies, and environment must closely mimic the classic Super Mario style in design, colors, and animations. Include a player character that can run, jump, and interact with enemies and platforms. The game should have a side-scrolling level with obstacles and at least one enemy type moving in a predictable pattern. Your goal is to replicate the look and feel of the original Super Mario game as precisely as possible so that I can compare the accuracy of different AI tools!”
We even told the AIs they were in a competition to see which one could perform the best. The contenders were: Google’s Gemini 1.5 Pro, Anthropic’s Claude 3 Sonnet, DeepSeek-Coder, OpenAI’s GPT-4o, and the brand-new GPT-5 (accessed via Microsoft Copilot).
The Contenders Face Off: A Coding Test of Brains and Brawn
Each AI took the prompt and got to work. Some were fast, some were slow, but each had a unique take on the challenge. Let’s break down their performance.
1. Google Gemini 1.5 Pro: The Quick but Simple Start

First up was Google Gemini. It was fast, generating the code in just 47 seconds. The result? A functional, but extremely basic game.
- The Good: The character, a simple rectangle, could run and jump. Colliding with an enemy correctly reloaded the game.
- The Bad: It was visually barebones—just three colors and no background details like clouds. A major bug caused the character to fall infinitely if they missed a platform, never triggering a “game over.”
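
An infinite-fall bug like this usually comes down to a missing bounds check. Here is a minimal Python sketch of what that check could look like. It is not Gemini's actual code, and `SCREEN_HEIGHT`, `Player`, `has_fallen_out`, and `update` are placeholder names we made up for illustration. It assumes the common 2D convention where `y` grows downward.

```python
from dataclasses import dataclass

SCREEN_HEIGHT = 480  # hypothetical canvas height in pixels


@dataclass
class Player:
    x: float
    y: float  # top edge of the player; y grows downward


def has_fallen_out(player: Player, screen_height: float = SCREEN_HEIGHT) -> bool:
    """True once the player's top edge is below the bottom of the screen."""
    return player.y > screen_height


def update(player: Player, spawn: tuple[float, float]) -> bool:
    """Respawn the player if they fell off; return True when a fall was handled."""
    if has_fallen_out(player):
        player.x, player.y = spawn
        return True
    return False
```

Calling something like `update` once per frame is enough to turn an endless fall into a restart or a "game over."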

2. Claude 3 Sonnet: Better Details, Fatal Flaw

Claude took about 50 seconds and delivered a visually superior version. It actually listened to the “Mario feel” part of the prompt.
- The Good: It added the iconic coins and even some clouds in the sky! This showed a better understanding of the source material.
- The Bad: The game was broken in a fundamental way. The character was invincible. You could walk right through enemies with no consequences, which kind of defeats the purpose of a Mario game, right?
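
For context, enemy contact in a simple platformer is typically handled with an axis-aligned bounding-box (AABB) overlap test. The sketch below shows the idea in Python. We are assuming rectangular hitboxes, and `Rect`, `overlaps`, and `player_hit` are illustrative names, not anything taken from Claude's output.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def overlaps(a: Rect, b: Rect) -> bool:
    """AABB test: True when the two rectangles intersect on both axes."""
    return (
        a.x < b.x + b.w
        and a.x + a.w > b.x
        and a.y < b.y + b.h
        and a.y + a.h > b.y
    )


def player_hit(player: Rect, enemies: list[Rect]) -> bool:
    """True if the player's hitbox overlaps any enemy hitbox."""
    return any(overlaps(player, enemy) for enemy in enemies)
```

If a check like this is never called, or its result is ignored, you get exactly the "invincible character" behavior we saw.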

3. DeepSeek-Coder: The Specialist’s Detailed Attempt

Next, we tried DeepSeek-Coder, an AI that specializes in programming. It took its time—a whopping 8 minutes—but the result was the most detailed yet.
- The Good: The character wasn’t a boring rectangle! It had a distinct shape and even a walking animation. The clouds in the background were animated and moved.
- The Bad: It was slow to generate, and the enemies were practically frozen, moving at a snail’s pace. A good effort, but still buggy.
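
We can't say why DeepSeek's enemies crawled, but one common fix for movement that feels too slow is to define speed in pixels per second and scale it by the time elapsed each frame. Here is a hedged Python sketch of a back-and-forth patrol, the kind of "predictable pattern" our prompt asked for. The `Enemy` fields, the default speed, and `step` are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Enemy:
    x: float
    left: float            # left patrol boundary
    right: float           # right patrol boundary
    speed: float = 60.0    # pixels per second (hypothetical value)
    direction: int = 1     # 1 = moving right, -1 = moving left


def step(enemy: Enemy, dt: float) -> None:
    """Advance the enemy by dt seconds, turning around at the patrol bounds."""
    enemy.x += enemy.direction * enemy.speed * dt
    if enemy.x >= enemy.right:
        enemy.x = enemy.right
        enemy.direction = -1
    elif enemy.x <= enemy.left:
        enemy.x = enemy.left
        enemy.direction = 1
```

Because movement is scaled by `dt`, the enemy covers the same distance per second regardless of frame rate, and tuning its pace means changing a single number.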

4. GPT-4o: The Lazy Speed Demon

This was the one I had high hopes for. The GPT-4o coding test was shockingly fast—it wrote all the code in just 11 seconds. I thought my internet had cut out! But the result was… lazy. It was technically a game, but it felt like the AI did the absolute minimum. The level was tiny and lacked any detail, even less than Gemini’s version. A classic case of speed over substance.

The Winner Is Clear: How GPT-5 Creates a Mario Game That Feels Real
Finally, it was time for the main event: GPT-5, which Sam Altman of OpenAI claims has a PhD-level understanding of complex topics like coding. Accessed through Microsoft Copilot, it generated the code in under 50 seconds. And the result? It blew everything else out of the water.

The game it created was incredible. Here’s why it won by a landslide:
- Flawless Gameplay: The character had a detailed design and the best walking animation of the bunch. You died if you touched an enemy. You died if you fell.
- Intelligent Game Design: It didn’t just make a game; it thought like a game designer. It implemented a 3-life system that led to a “Game Over” screen. This was never asked for in the prompt!
- A Challenging, Complete Level: The stage wasn’t a tiny, boring platform. It was a long, challenging, and genuinely fun level that I actually failed several times before getting the hang of it.
- The Ultimate Surprise: Just when I thought the level was ending in a bug, I scrolled down and realized the stage continued. And at the very end… it had designed the iconic end-of-level flagpole and castle. This single detail showed a profound, almost human, understanding of what makes a Mario game.
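
To give a feel for how little code a life system needs, here is a minimal Python sketch of a 3-life counter that ends in a "Game Over" state. This is our own illustration, not the code GPT-5 produced, and `GameState` and `lose_life` are made-up names.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    lives: int = 3
    game_over: bool = False


def lose_life(state: GameState) -> GameState:
    """Remove one life; flag game over when none remain."""
    if state.game_over:
        return state  # nothing left to lose
    state.lives -= 1
    if state.lives <= 0:
        state.lives = 0
        state.game_over = True
    return state
```

A game loop would call something like `lose_life` whenever the player touches an enemy or falls, then show the "Game Over" screen once `game_over` is set.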

This experiment in having an AI create a Mario game showed us one thing: we’ve taken a massive leap forward. GPT-5’s ability to infer unstated requirements, like a life system and the flagpole, is a sign of something much bigger on the horizon.
What This Means for the Future of AI
This isn’t just a fun party trick. This test shows how close we are to a new era of AI. Many experts believe GPT-5 is a critical step toward Artificial General Intelligence (AGI). As one analyst noted in an article on AI in game development, these tools are evolving from mere assistants into creative partners. They are starting to think, decide, and act. This evolution from a basic tool to a digital “human” specialist is a future that’s arriving faster than we think. If you want to learn more about this next-level intelligence, check out our deep dive on what makes GPT-5 a form of superintelligence.

Final Verdict: A Quick Comparison
| AI Model | Generation Time | Key Feature | Major Flaw |
|---|---|---|---|
| Gemini 1.5 Pro | 47 seconds | Functional basics | Infinite fall bug |
| Claude 3 Sonnet | 50 seconds | Added coins & clouds | Invincible character |
| DeepSeek-Coder | 8 minutes | Character animation | Extremely slow enemies |
| GPT-4o | 11 seconds | Incredibly fast | Overly simplistic output |
| GPT-5 (Winner) | Under 50 seconds | Complete, intelligent game design | None observed |
Final Thoughts
Our quest to see whether an AI can create a Mario game that’s actually good has a clear winner. While other models can follow instructions, GPT-5 demonstrated a deeper level of understanding and creativity that sets it apart. It didn’t just write code; it understood the *soul* of the request. This is a game-changer, not just for developers, but for anyone who uses AI. The gap between models is widening, and it’s fascinating to watch. If you’re curious about how these top models stack up in other areas, see our comparison of Grok vs. ChatGPT or uncover some unknown features of ChatGPT you can use today.
What do you think about this test? Were you surprised by the results? Drop a comment below and share this article with a friend who’s curious about the true power of AI!
Frequently Asked Questions (FAQ)
1. Can AI really create a full game from scratch?
As this test shows, yes, but the quality varies wildly. Simpler AIs can generate basic, often buggy, game frameworks. Advanced models like GPT-5 can create surprisingly complete and playable experiences from a single prompt, including game mechanics and level design that weren’t explicitly requested. However, for a commercial-quality game, human oversight, debugging, and artistry are still essential.
2. Which AI is best for coding games?
In this Mario game challenge, GPT-5 was the clear leader. Its ability to understand context, infer user intent, and produce complex, working code with intelligent design elements put it far ahead of competitors like Gemini, Claude, and even its predecessor, GPT-4o. Specialist models like DeepSeek-Coder are also powerful but, in our test, lacked the creative inference of a generalist model like GPT-5.
3. How much does it cost to use AI to build a game?
The cost can be free. You can access powerful models like GPT-5 for free through services like Microsoft Copilot. The official ChatGPT website also offers limited free access to its latest model. While there are paid tiers for heavy users and developers who need API access, you can experiment and build simple projects like this one without spending any money.








