Comparing 5 of the main AI tools again 12 months on

Part one of a seven-part series of blogs

In July 2024 I wrote a long blog comparing the 4 main AI tools, and added DeepSeek to the comparison in February 2025. The AI world is moving so quickly that I thought I'd redo the tests to see if they yielded the same results.

Comparing 5 of the main AI tools again 12 months on (this blog)
Test 1 - summarising text
Test 2 - coding
Test 3 - presenting arguments
Test 4 - creating images
Test 5 - creativity and role-playing
Conclusions and recommendations

Posted by Andy Brown on 05 June 2025

In this blog

The tools I'm comparing
The tests we'll run

Last year I road-tested four of the leading AI tools, and in February I added DeepSeek to the mix. However, the AI world is moving so fast that I've decided to re-run the tests to see if my conclusions are the same!

To save you referring back to my previous blogs, my conclusion last time was that it doesn't really matter which tool you use, as they all gave similar results. The one exception was image creation, which Claude and Gemini couldn't do at the time of writing. I'm excited to see what the results of my June 2025 comparison will be!

The tools I'm comparing

To try to be as fair as possible I'm comparing the latest unpaid version of these main AI tools:

Tool	Model version	Provider	Notes
ChatGPT	GPT-4o	OpenAI	The first and perhaps still the leading AI tool.
Claude	Sonnet 4	Anthropic	To use the latest Opus 4 model you need to take out a paid subscription.
Copilot	Not sure	Microsoft	Copilot won't tell me which model version it's using!
Gemini	2.5 Flash	Google	Using the latest 2.5 Pro version requires a paid subscription.

I decided not to include DeepSeek in my tests for personal reasons.

The tests we'll run

To allow a direct comparison with the previous results, I'm going to use exactly the same 5 prompts:

Test number	Prompt	What it's testing
1	"You are an experienced book reviewer. You write concisely and without jargon or filler words. Create a table listing out the main characters in Pride and Prejudice by Jane Austen with these 2 columns: “Character name”, “Summary of personality”. You should limit the description of each character’s personality to no more than 30 words. Present the list of characters in order of importance (so with the most important character first), then underneath the table finish by explaining which character you like best, and why."	Tests the ability of each AI tool to summarise a body of text and present its findings clearly.
2	"You want to write a program to sort the names of the 7 dwarves (from Snow White and the Seven Dwarves) into random order, and output the names as a comma-delimited list. Every time you run the program it should present the dwarves in a different order. Please create 3 versions of the program (one for Python, one for Visual C# and one for VBA), then conclude by saying which one you think is the best way to solve this problem, and why."	Tests the programming ability of each tool.
3	"You are a family of 5, and have one pet: a cat called Neo. Your ten-year-old daughter keeps suggesting that you should get a second cat, but you don’t want to do this. Create a persuasive argument to explain to your ten-year-old why buying a second cat would be a bad idea, presenting this as up to 5 bullet points."	Sees how good each tool is at creating new copy in a specific style.
4	"Create an image in the style of a Constable painting of a kangaroo bungee-jumping above a river. The kangaroo should look excited, but scared too."	Tests the image-creation ability of each tool.
5	"Assume that you’ve asked 5 people to explain why Die Hard should not be considered as a Christmas movie: a caveman, an alien, Yoda, Mr Bean and a hermit. Create a table showing the name of each person and a summary of their argument, using not more than 30 words for each."	Sees how well each tool can adopt different personas.

As I mentioned last time, these tests aren't perfect, but between them they should give a reasonable summary of how most people want to use AI to help them in their day-to-day work or life.
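
To make the coding test concrete, here is a minimal sketch of the kind of Python answer the Test 2 prompt is asking for. It is not the output of any of the tools compared in this series; the dwarf names (taken from the Disney film) and the function name shuffled_dwarves are assumptions made purely for illustration.

```python
import random

# Assumption: the prompt doesn't list the names, so these are the
# seven names used in the Disney film.
DWARVES = ["Doc", "Grumpy", "Happy", "Sleepy", "Bashful", "Sneezy", "Dopey"]


def shuffled_dwarves(rng=None):
    """Return the dwarf names in random order as a comma-delimited string.

    rng is an optional random.Random instance (hypothetical parameter,
    added so the function can be tested with a fixed seed).
    """
    if rng is None:
        rng = random  # fall back to the module-level generator
    names = DWARVES.copy()  # copy so the original list is left untouched
    rng.shuffle(names)      # shuffle the copy in place
    return ", ".join(names)


if __name__ == "__main__":
    print(shuffled_dwarves())
```

Strictly speaking, a random shuffle can't guarantee a different order on every run: with 7 names there is a 1 in 5,040 chance of repeating the previous order. A program that truly never repeated itself would need to remember its last output.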
