Comparing the 5 leading AI tools for image generation from text Part four of an eight-part series of blogs |
---|
Having recently compared the 4 main AI tools for text prompts, I thought I'd do the same for image generation tools. In this blog series we compare Dall-E (via ChatGPT and separately via Copilot), Firefly, Midjourney and Stable Diffusion for 3 pre-defined tests to see which ones score most highly for cost, ease-of-use, speed, editing ability and above all for quality of image.
|
Here's a reminder of the first picture I want to create:
Create a photo-realistic image using 4k detail. This should be a side view of the following scene. On the left of the image should be a judge sitting in their chair in a UK courtroom. The judge should be wearing a legal wig and have a severe expression. Facing the judge in the dock on the right of the picture should be two mice with guilty-looking expressions.
Let's see how the different tools compare, in alphabetical order:
Tool | Picture |
---|---|
ChatGPT | |
Copilot | |
Firefly | |
Midjourney | |
Stable Diffusion | |
I thought I'd give each tool one chance to redeem itself, so I issued these prompts:
Tool | Thoughts | Editing prompt |
---|---|---|
ChatGPT | I was pretty happy with this - if only the judge was facing the mice! | "Please keep everything the same, but turn the judge round so that he is facing the mice, not facing away from them." |
Copilot | This was already near-perfect, but maybe the expressions could do with tweaking. | "For the top right image, make the judge look sterner and the mouse look more disreputable and shifty-looking." |
Firefly | Firefly can't count! There is only one mouse in each photo. | "Take the first image. There should be two mice facing the judge and they should look guilty." |
Midjourney | None of these was really what I was after, but the fourth image was closest, so I want to take this and make the judge human. | "Create a photo-realistic image using 4k detail. This should be a side view of the following scene. On the left of the image should be a human judge sitting in their chair in a UK courtroom. The judge should be wearing a legal wig and have a severe expression. Facing the judge in the dock on the right of the picture should be two mice with guilty-looking expressions." |
Stable Diffusion | Where to start? This is the only image which bears no relation to what I asked for. | "Please use a UK style court. There should be a human judge facing two real mice." |
Here are the results after these single changes:
Tool | Results |
---|---|
ChatGPT | |
Copilot | |
Firefly | |
Midjourney | |
Stable Diffusion | |
I'm not sure there were any improvements here!
For this test there were two clear winners (and one clear loser).
Position | AI tool | Notes |
---|---|---|
Winner | ChatGPT / Copilot | Copilot generated a near-perfect rendition of what I requested, but ChatGPT wasn't far behind. This means that the true winner was Dall-E, the underlying image generation tool. |
Loser | Stable Diffusion | This was the only tool which gave me no confidence that it could ever come up with the image I was looking for. Its original picture was nothing like the one I'd requested, and the edited version also ignored my instructions. |
Middle places | Firefly, Midjourney | Firefly gets bonus points for tweaking its algorithm to generate female judges too, but despite my original prompt its courtroom was very US-style, and I felt the edits moved me further away from what I was trying to achieve. Midjourney was reasonably close to what I wanted, but its image didn't improve after editing. |
In the interests of balance I think I should make one last point: I'm not convinced I've mastered editing images in Midjourney (although to be fair, part of the point of this test is that the tools should be reasonably easy to use).
© Wise Owl Business Solutions Ltd 2024. All Rights Reserved.