Do you see what I see?

Hummingbird defeats AI

So I took this picture of a cloud. To me, it looks like a hummingbird.

Clear blue sky with a white streak of cloud passing through, surrounded by green trees and a utility pole with power lines.

Don’t see it? Here’s a line drawing. (I left the watermark - it was not worth paying to remove it.)

Simple line drawing of a bird facing right, with a long beak and a minimal tail, on a plain background.

Of course, my cloudy friend is missing wings. I’d like him to look more like this:

Black and white line drawing of a hummingbird in flight.

Unfortunately, I am neither artist nor graphic designer. So I thought I would use this as an opportunity to try AI. The results were… poor. Arguably, that’s my fault for not knowing how to prompt properly. But, ultimately, my experiments demonstrate very limited image understanding by the models.

Let’s explore.

___________

This is what I got from imgtoimg.ai when I prompted with:

make the cloud look like a hummingbird

Cloud in the sky shaped like a bird flying, with trees and power lines in the foreground.
A wooden utility pole with two crossarms and power lines against a muted green sky. A digital composite image depicts a fluffy white bird with a long beak flying between the power lines, blending into the sky.
A blue sky with a cloud shaped like a hummingbird, above green trees and power lines.

Some nice hummingbirds, but both images:

  • While mottled white, are anything but cloudlike,

  • Have lost the artistic feel of the bird stretching to the right that’s in the original, and

  • Have made the bird fatter/rounder.

It’s not what I want. I made a few more attempts, but all of them failed on those points.

So, on to ChatGPT!

This time, I tried to compose bottom-up. That is, I asked it for a picture of a hummingbird as a starting point, and eventually got this far:

That’s pretty impressive for bottom-up, but it’s remarkably like the top-down one from imgtoimg. Which is to say, it’s still not right. You can read the whole session here.

Maybe Sora can help. Here’s a bottom-up…

A cloud formation resembling a hummingbird or bird with wings, viewed in the sky between utility poles and power lines.
Cloud shaped like a hummingbird with a flower, flying in a blue sky with power lines and utility poles in the foreground.

I give up.

Conclusion

I am very impressed with how well the AI produced images. I am disappointed, albeit not surprised, that the images were not what I wanted: they lacked both an artistic eye and the dynamism of a hummingbird in flight. Good enough for mock-ups, but I expect skilled graphic artists to stay employed for a while.