Can LLMs Recreate Van Gogh's Starry Night Using Only SVG?

Here's an interesting experiment I recently did: Can AI models like ChatGPT recreate Van Gogh's Starry Night using only code?

I created a website to test this across multiple versions of large language models, from 2022 to 2025. The results show how much models have improved in just a few years.

The Original

First, let's look at the masterpiece we're trying to recreate:

GPT-3.5 (2022)

GPT-3.5's attempt at recreating Starry Night

The earliest model produces what looks like a surprised emoji with some yellow dots. It knows "night sky" and "yellow things" but has no concept of composition or structure.

GPT-4 (2023)

GPT-4's attempt at recreating Starry Night

GPT-4 gets some basic elements right: the hills and the stars. Simple but recognizable.

GPT-4o (2024)

GPT-4o's attempt at recreating Starry Night

The moon comes out, we see a black streak that is the cypress and there are two white swirls in the middle. Not pretty, but it contains the basics.

GPT-4.1 (2025)

GPT-4.1's attempt at recreating Starry Night

The town appears, the cypress is recognizable as a tree and we see nice swirly streaks.

GPT-5 (2025)

GPT-5's attempt at recreating Starry Night

All elements are there, and it's trying to capture the artistic style.

What This Reveals

What fascinates me isn't just the improvement. It's how models come up with interpretations that have a kind of beauty. It's almost like... art?

I tested this across multiple paintings:

You can compare every model side-by-side in the gallery.

Is this Creativity?

These models weren't trained to draw. They learned from text and code. Their ability to reconstruct spatial and compositional knowledge shows how they combine and synthesize what they know. When art meets artificial intelligence, there's always a debate: Do you think these interpretations show that LLMs have creativity and "understand" something visually? Or is it all just next token prediction?