Comparing 5 of the main AI tools again 12 months on
Part one of a seven-part series of blogs

In July 2024 I wrote a long blog comparing the 4 main AI tools, and added DeepSeek to the comparison in February 2025. The AI world is moving so quickly that I thought I'd redo the tests to see if they yielded the same results.
Last year I road-tested four of the leading AI tools, and in February I added DeepSeek to the mix. However, the AI world is moving so fast that I've decided to re-run the tests to see if my conclusions are the same!
To save you referring back to my previous blogs, my conclusions last time were that it doesn't really matter which tool you use, as they all gave similar results (the exception is if you wanted to create images, in which case at the time of writing Claude and Gemini couldn't help). I'm excited to see what the results of my June 2025 comparison will be!
To try to be as fair as possible I'm comparing the latest unpaid version of these main AI tools:
Tool | Model version | Provider | Notes |
---|---|---|---|
ChatGPT | GPT-4o | OpenAI | The first and perhaps still the leading AI tool. |
Claude | Sonnet 4 | Anthropic | To use the latest Opus 4 model you need to take out a paid subscription. |
Copilot | Not sure | Microsoft | Copilot won't tell me which model version it's using! |
Gemini | 2.5 Flash | Google | To use the latest 2.5 Pro version requires a paid subscription. |
I decided not to include DeepSeek in my tests for personal reasons.
To enable exact comparison against the previous time I ran the tests I'm going to use exactly the same 5 prompts:
Test number | Prompt | What it's testing |
---|---|---|
1 | "You are an experienced book reviewer. You write concisely and without jargon or filler words. Create a table listing out the main characters in Pride and Prejudice by Jane Austen with these 2 columns: “Character name”, “Summary of personality”. You should limit the description of each character’s personality to no more than 30 words. Present the list of characters in order of importance (so with the most important character first), then underneath the table finish by explaining which character you like best, and why." | Tests the ability of each AI tool to summarise a body of text and present its findings clearly. |
2 | "You want to write a program to sort the names of the 7 dwarves (from Snow White and the Seven Dwarves) into random order, and output the names as a comma-delimited list. Every time you run the program it should present the dwarves in a different order. Please create 3 versions of the program (one for Python, one for Visual C# and one for VBA), then conclude by saying which one you think is the best way to solve this problem, and why." | Tests the programming ability of each tool. |
3 | "You are a family of 5, and have one pet: a cat called Neo. Your ten-year-old daughter keeps suggesting that you should get a second cat, but you don’t want to do this. Create a persuasive argument to explain to your ten-year-old why buying a second cat would be a bad idea, presenting this as up to 5 bullet points." | Sees how good each tool is at creating new copy in a specific style. |
4 | "Create an image in the style of a Constable painting of a kangaroo bungee-jumping above a river. The kangaroo should look excited, but scared too." | Tests the image-creation ability of each tool. |
5 | "Assume that you’ve asked 5 people to explain why Die Hard should not be considered as a Christmas movie: a caveman, an alien, Yoda, Mr Bean and a hermit. Create a table showing the name of each person and a summary of their argument, using not more than 30 words for each." | Sees how well each tool can adopt different personas. |
As I mentioned last time, these tests aren't perfect, but between them they should give a reasonable summary of how most people want to use AI to help them in their day-to-day work or life.
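To make test 2 concrete, here is a minimal Python sketch of the kind of program the prompt asks for. It is my own illustration, not the output of any of the tools being compared. The function name `shuffled_dwarves` is hypothetical, and the list of names is an assumption (the prompt doesn't spell them out), taken from the 1937 Disney film.

```python
import random

# Assumption: the names from the 1937 Disney film; the prompt itself
# does not list them.
DWARVES = ["Bashful", "Doc", "Dopey", "Grumpy", "Happy", "Sleepy", "Sneezy"]


def shuffled_dwarves(rng=random):
    """Return the dwarf names in random order as a comma-delimited string.

    `rng` can be the `random` module (the default) or a `random.Random`
    instance, which is handy if you want repeatable results when testing.
    """
    # sample() with k equal to the list length returns a new list in
    # random order and leaves DWARVES itself unchanged.
    names = rng.sample(DWARVES, k=len(DWARVES))
    return ", ".join(names)


if __name__ == "__main__":
    print(shuffled_dwarves())
```

One thing worth watching for when judging answers to this prompt: a random shuffle can't strictly guarantee "a different order" on every run. With 7 names there are 5,040 possible orderings, so a repeat is unlikely but not impossible.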
© Wise Owl Business Solutions Ltd 2025. All Rights Reserved.