Using AI images as reference material

Making AI the start, not the end

Oct 17, 2022

Last week I released a comic strip introduction to my novel Tales from the Triverse. Today I’m going to talk about how I put it together, and my efforts to make it more human.

Here’s the comic in case you missed it:

A visual intro to Tales from the Triverse (Write More with Simon K Jones)

First off, I should say that I’m absolutely not an artist. I’m a writer who enjoys illustration but is very much a beginner. Creating the Triverse intro comic was an experiment to see if AI-generated images, courtesy of MidJourney, could result in something interesting.

My original plan was to compile MidJourney-generated images into a comic flow and otherwise leave them as-is. My artistic input would be in the storytelling and the layout, but not in the actual images. A discussion a few weeks ago about AI images then prompted me to rethink this and I determined to put in some more effort. My hope was to find a middle ground, with me adding human intent to the images and MidJourney providing the detail that I’m not yet capable of creating on my own.

So let’s break it down. I’m going to focus on a couple of specific panels, starting with the futuristic cityscape. It looks like this in the finished comic:

Let’s now rewind right back to the initial generation within MidJourney. The prompt I used was this:

a futuristic scifi version of London

Pretty simple, and without much in the way of specifics or stylistic direction. Here’s what I got:

All of those are interesting in their own way. Most of them were a bit too fantastical, and I wanted a realistic-feeling vision of the future. The first thumbnail, top-left, had promise, so I iterated on that one, resulting in this:

Much better! The composition is satisfying, with the pier/bridge projecting from bottom-right of frame to the centre, pointing at the skyscrapers in the background. I liked the top-left version, so I created an upscaled version:

Note the extreme gribblies[1] all over this, especially visible in the sky. Yuck. Very AI.

There’s an option in MidJourney to do a light upscale, which aims to upscale without introducing additional, unwanted detail. Here’s the result of that:

Still some gribblies but not as bad, and the image has a more painterly feel.

Here’s the image comped into the comic flow:

The framing works nicely. I’m using Clip Studio Paint here, incidentally. The first thing was to create some actual line art, using the AI image as a reference source. Using the ink pen I ended up with this:

The blue tint is a handy feature in Clip Studio Paint which you can toggle on and off per-layer, and it’s very useful for making a reference layer fainter while still being visible. Adding the line work immediately gives the image a more human-made feel, I think, and helps to pick out particular details and focal points in the otherwise slightly vague AI version.

Next up I used an india ink brush to create light and dark values across the image, taking the opportunity to really deepen the shadows and lean into a noirish vibe. I was quite tempted to leave the comic entirely black and white at this point:

While I really like the B&W version, I knew that the comic would have more impact in colour. I’m pretty terrible at colouring my own work, though, so I turned back to the AI source. The colours in the MidJourney image worked well enough - AI is especially good at lighting, it seems - but I didn’t want to use it as-is, so I applied various blur and blend tools to create a softer version that had a bit more of an oil painting look:

It’s subtle, but there’s a deliberateness to it which I think works. Compare the bright blue reflection in the water, for example.
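
For anyone who wants to try the softening step outside of Clip Studio Paint, here is a minimal sketch of the general idea: blur the image, then blend the blurred copy back over the original. It is not the exact set of tools used for the comic. The function names (`box_blur`, `blend`, `soften`), the `radius` and `amount` values, and the plain list-of-pixels image format are all assumptions made for illustration; a real workflow would more likely use a painting app or an image library.

```python
# Hypothetical sketch: an image is a list of rows, each row a list of (r, g, b) tuples.

def box_blur(image, radius=1):
    """Average each pixel with its neighbours inside a square window."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clamp the window so it never reaches outside the image.
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            window = [image[j][i] for j in ys for i in xs]
            row.append(tuple(sum(p[c] for p in window) // len(window) for c in range(3)))
        blurred.append(row)
    return blurred


def blend(original, softened, amount):
    """Mix two same-sized images: amount=0 keeps the original, amount=1 the softened one."""
    return [
        [
            tuple(round(o[c] * (1 - amount) + s[c] * amount) for c in range(3))
            for o, s in zip(orig_row, soft_row)
        ]
        for orig_row, soft_row in zip(original, softened)
    ]


def soften(image, radius=2, amount=0.7):
    """Blur the image, then blend the blurred copy over the original (assumed defaults)."""
    return blend(image, box_blur(image, radius), amount)
```

A lower `amount` keeps more of the original detail, while a larger `radius` smears fine texture into broader patches of colour, which is roughly where the painted feel comes from.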

Here it is with the shadows and lines back on:

It’s quite a shift from the original, but it was missing something. It took me longer than it probably should have to figure it out. Lights! These are skyscrapers at night, and they’d be covered in pinprick lights. Here we go:

It’s absolutely scrappier than the AI original. It’s possibly not as good, technically. But the key thing, I think, is that it’s now much more mine. There’s deliberate intent in the finished article. The AI image is a reference, a starting point, just as you might use stock imagery, your own photography or other references.
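
The layering described above (softened colour at the bottom, shadows and line art over it, lights added last) can be sketched with two standard blend formulas. This is a rough illustration only: the post doesn't say which blend modes were used in Clip Studio Paint, so treating the ink as a multiply layer and the lights as a screen layer is an assumption, and the function names are made up for the example.

```python
# Hypothetical sketch: each layer is a list of rows of (r, g, b) tuples, all the same size.

def multiply(base, top):
    """Multiply blend: white in `top` leaves `base` unchanged, black turns it black."""
    return tuple(b * t // 255 for b, t in zip(base, top))


def screen(base, top):
    """Screen blend: black in `top` leaves `base` unchanged, bright values lighten it."""
    return tuple(255 - (255 - b) * (255 - t) // 255 for b, t in zip(base, top))


def composite(colour, ink, lights):
    """Stack the layers: colour at the bottom, ink multiplied over it, lights screened on top."""
    return [
        [screen(multiply(c, i), l) for c, i, l in zip(c_row, i_row, l_row)]
        for c_row, i_row, l_row in zip(colour, ink, lights)
    ]
```

With this arrangement a white pixel on the ink layer lets the colour show through untouched, a black ink pixel hides it, and a bright dot on an otherwise black lights layer stays visible even over the deepest shadow.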

Here’s another example, with the next frame in the comic:

I love the moody silhouetted lighting in this. Here are the original thumbnails:

The prompt was:

18th century London, landscape, river thames, houses of parliament, circular black portal on the river bank, concept art, winter lighting

Lots more information in that prompt, you’ll notice. Here’s the upscaled version:

I generated this image while I was making the comic, so in the last couple of weeks. The futuristic city image I talked about earlier was generated several months back, and I think you can really see the progress MidJourney has made. Note the almost complete lack of gribblies.

It’s a pretty gorgeous image, if you ask me, full of atmosphere.

Here’s the inked and shaded version, without the AI colouring:

Like I say, I was very tempted to go with B&W.

The actual composition of the comic, with the panels and text, was done in Clip Studio Paint. All of the line art and shading was also done in CSP. It’s a lovely program for comics artists (or artists generally), and is unusually cheap as well for such a powerful tool. Highly recommended.

In terms of hardware, I did most of the art on my Lenovo laptop. It’s a super cheapo (for Windows) machine that has a flexible hinge and touchscreen, enabling it to go into a canvas mode and work with a stylus. The layout was done on my main PC, where I have two screens. I realised that I can actually turn my monitors 90 degrees to become vertical, something I’d never had need to do before. It was fantastically useful for this specific project, though:

Handy!

This is a small, tentative prod at using AI-generated images in a more artistic and original manner. There’s lots more I could be doing: I’m especially curious about going down a sort of photobash route, using multiple AI images and mashing them together to create something new.

Let me know if you’ve been experimenting with AI images in your own work!

[1] I have no idea if this is the correct term for the squiggly-wigglies that AI generation tends to produce (although less so with each passing month).

Winston Malone

Oct 18, 2022

Thanks for the breakdown. Super helpful. I did something like this on the second part of my Doge story where I imported a Midjourney image into Procreate and added lines and shading to really make it more cartoon looking. I think using AI art as a starting point is a great use case for a lot of artistic reasons.

Mike Miller

Oct 22, 2022

Ok. Have you considered Stable Diffusion instead of Midjourney?

Stable Diffusion is free. Eventually Midjourney charges you.

Stable Diffusion can be installed and run locally - now, you do need a beefy machine for this. Like an Nvidia GPU with 6GB or more VRAM, but my current laptop has a 12GB Nvidia 3080ti (mobile), so I should be OK (as I type I'm waiting for my machine to download the libraries). You also need about 100GB of storage space... Well, my laptop has 3TB of SSD on two drives. I'm also installing the GUI version to try... I might end up using the CLI version.

Now, even if you're staying with Midjourney, here's the real point of this comment. Image to Image. Instead of a text prompt, you start with Simon-scribbles and let Stable Diffusion glam it out for you. I think this will give you more consistent results if you start doing character art. And, as shown in that Corridor video I sent you, if you have an insanely good GPU (you don't... What, was it 64 GB of VRAM to train?), you can train Stable Diffusion. Maybe you create Clarke in MakeHuman, take him into Blender, animate a turnaround with some moving lights and render a PNG sequence to train? Boom, Stable Diffusion knows Clarke. With Blender and Mixamo and HitFilm you can import a rigged and animated Clarke and export poses to feed into your starting image. Use GIMP to quickly comp stock images from Pexels or other sites.

Just a thought.
