A Cover Story pt 2

The Man Behind the Curtain

You may have noticed that The Control Problem uses an image generated by an AI tool. As I explained in my last post, I paid the original cover artist in full for a cover that ultimately couldn't be used. Like many self-published authors, I either had to wing it on my own or rely on family. Luckily, I have a multi-talented husband. He studied film in college, worked as a camera assistant on things you have heard of, took photos you have seen, and has been working in tech for over 16 years. Here's my interview with Jacob Woodsey, the man who makes my tea every morning, about the hows and whys of the cover development process.

Who are you, exactly?

Your husband. And VP of UX Design at Twitch.

Would anyone who knows you be surprised that you designed a book cover?

Not really! I’ve always used design to bring attention to whatever I’m working on. I started designing websites for clients in high school, and I’ve printed books, posters, and other collateral for various film and digital projects I’ve worked on. For a long time I managed the (small) creative team at Justin.tv and later Twitch. One of the first projects we tackled as a group after we hired our first designers (David McLeod, Kat Nieh, and Thomas Reed-Muñoz) at Twitch was a complete overhaul of our branding, which covered everything from new-employee onboarding supplies and swag to our web and mobile applications, all the way to a booth at E3 in 2011.

What made you want to do The Control Problem's cover?

I’ve been on sabbatical for six months, and during that time I read the first version of The Control Problem and gave feedback on some of your ideas. I’ve had the chance to think about my interpretation of the story and what the critical elements are. When your previous arrangement fell through, it seemed like a good idea to see if I could create a cover that would both express some of the ideas I thought you were going for and not leave you cover-less!

Why did you choose to use Dall-e?

I am not really an illustrator. I’m a UI designer and I draw doodles and other silly pictures. The Control Problem needs something complex and dynamic. I wanted to see if any of the AI tools could create something unique that showed off what the book was about while creating some mystery and intrigue.

What was your process? What prompts did you use and why?

We discussed three or four ideas: a manipulation of a graphic describing an AI achieving “singularity”; a visual of a woman representing Vera while masking some of the elements in the story; and a typographic concept.

I made physical sketches of all three (editor's note: I recycled these before taking photos. oops) and decided to try searching for input or concept images on some stock sites. I found very little that worked - many of the images were too specific and complicated - so I moved to Stable Diffusion and Dall-E.

I began by trying to create elements of an image that I could compose into something else: an abandoned library, arms grasping each other, sinuous threads, etc. As I explored both tools I started to understand which prompts might offer usable images: specificity is hard, diffusion models are really bad at fingers (which apparently everyone knows), and faces can get creepy fast, with missing eyes, strange symmetry, and hair that doesn’t seem to grow from the right places.
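
(Editor's note: for readers curious what this “try a prompt, inspect, adjust” loop looks like outside of a web interface, here's a minimal sketch using the open-source diffusers library. The model checkpoint, prompts, and settings are illustrative assumptions, not the ones Jacob actually used.)

```python
# Editor's sketch (not Jacob's actual workflow): generating a small batch of
# candidate images from a text prompt with the open-source `diffusers` library.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint; any SD model works
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "abandoned library, sinuous translucent threads, concept art"  # example element prompt
negative_prompt = "extra fingers, deformed hands, missing eyes"  # nudge away from common failure modes

# Generate a few images per prompt so you can pick the least-unnerving result.
images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    num_images_per_prompt=4,
    guidance_scale=7.5,
    num_inference_steps=30,
).images

for i, img in enumerate(images):
    img.save(f"candidate_{i}.png")
```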

Eventually I realized that I could combine the key ideas I wanted into a single image - a mysterious figure (Vera doesn’t even know who she is), bright colors (she has strong vivid hopes for her own future), and intertwining threads (a gesture to AI singularity). I wanted Vera to be obviously feminine but not sexual - she wants to be a parent, not a lover. I wanted the threads to be translucent or stretchy so that they would feel more organic or like they could integrate with the human body. And finally I wanted there to be many colors, showing the many possible futures Vera might have or bring about as she learns who she is and what she can do.

How horrifying were the rejected images?

These models are pretty bad at hands. If you don’t get lucky you get appendages or body parts that appear to be human but are so obviously wrong that they’re unnerving. This can also happen with eyes and mouths when you generate full body figures in different contexts.

How many images did you generate until you found something you wanted to use?

About 470 images exploring concepts before we selected a base image, and then another 300 new images and variants after that.

An abandoned option based on Vera's city

The image that kicked off exploring threads

Thread options

Threads with color variations

Humans with threads

The first version of the concept that we liked - the prompt is “ultra sharp photograph of generational evolution, gelatinous translucent rainbow-colored strands, hybrid network topology, entering a female human head, concept art”

The four variants I generated that became the direction for the final image
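
(Editor's note: Jacob generated these variants in Dall-e's web interface. For readers who want to script the same step, a rough sketch against OpenAI's image-variation endpoint might look like the following; the file name, count, and size are placeholders.)

```python
# Editor's sketch: asking for variants of a chosen base image via OpenAI's
# image-variation endpoint. File name, count, and size are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

with open("base_concept.png", "rb") as f:
    response = client.images.create_variation(
        model="dall-e-2",
        image=f,
        n=4,                # four variants, as in the step described above
        size="1024x1024",
    )

for i, item in enumerate(response.data):
    print(i, item.url)  # download and compare before committing to a direction
```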

Once you had something that would work, did it give you a perfectly ready-to-go cover?

The original image we focused on was just the face. It was a hint at the idea and it wasn’t even clear if it would work.

Dall-e has its own image editor that lets you generate patchwork images that add onto your existing images (from Dall-e or uploaded from elsewhere). Using that small piece, I slowly composed a larger graphic to show the rest of the body and more of the threads so that I’d have enough artwork to design several different covers (different paperbacks, audiobook, digital book, and a dust jacket). This requires a lot of downloading pieces, manipulating them in Photoshop, and then adding them back into Dall-e or generating another image and compositing it in Photoshop. Even with a tool as powerful as Dall-e, there are still a lot of creative paths you can take with your approach.
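
(Editor's note: Jacob did this through Dall-e's built-in editor. The same “pad the canvas, mask the empty area, let the model fill it in” step can also be scripted with OpenAI's image edit endpoint; the sketch below is illustrative, and the file names, prompt, and size are placeholders rather than the actual project assets.)

```python
# Editor's sketch: extending an existing image via OpenAI's image edit
# endpoint. `base_padded.png` is the artwork placed on a larger transparent
# canvas; `mask.png` marks the empty region the model should paint.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.images.edit(
    model="dall-e-2",
    image=open("base_padded.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt=(
        "gelatinous translucent rainbow-colored strands flowing down from "
        "a female figure, concept art"  # placeholder prompt in the spirit of the original
    ),
    n=4,
    size="1024x1024",
)

for i, item in enumerate(response.data):
    print(i, item.url)  # download each patch and composite it in Photoshop
```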

My favorite part was, once I’d completed the main illustration, building a set of masks to thread Norah’s name into the image. This effect may be subtle but it changes the emphasis between the author name and the book title so they can be read independently.

How long did it take to reach final results?

I think it took about 8-10 hours to get from the concepts to an image that we thought would work for our purposes.

This included a lot of time spent just learning how to get basic images out of Stable Diffusion. I spent another block of time trying to understand how to safely increase the resolution of the image so it could be used in printing. The most difficult stretch was getting designs ready for each of the three different printers. Calculating spine widths and ensuring that the InDesign files were going to be compatible with their templates turned out to be a really difficult problem (for me).
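
(Editor's note: the print math Jacob is describing is mostly arithmetic once you know your printer's numbers. Here's an illustrative sketch; the trim size, page count, bleed, pages-per-inch figure, and 300 dpi target are assumptions that vary by printer and paper stock, so always use the values from your printer's template.)

```python
# Editor's sketch of the cover/print arithmetic. All constants are
# illustrative assumptions - substitute your printer's actual figures.
TRIM_WIDTH_IN = 5.5      # assumed trim size, inches
TRIM_HEIGHT_IN = 8.5
BLEED_IN = 0.125         # typical bleed on the outside edges
PAGE_COUNT = 320         # assumed page count
IN_PER_PAGE = 0.002252   # example thickness per page for one white paper stock
PRINT_DPI = 300          # a common minimum resolution for print

spine_in = PAGE_COUNT * IN_PER_PAGE
cover_width_in = 2 * (TRIM_WIDTH_IN + BLEED_IN) + spine_in   # back + spine + front
cover_height_in = TRIM_HEIGHT_IN + 2 * BLEED_IN

print(f"spine width: {spine_in:.3f} in")
print(f"wraparound cover: {cover_width_in:.3f} x {cover_height_in:.3f} in")
print(f"artwork needed: {round(cover_width_in * PRINT_DPI)} x "
      f"{round(cover_height_in * PRINT_DPI)} px at {PRINT_DPI} dpi")
```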

So all in all probably about 40 hours mostly due to my own lack of experience with some of these tools and processes.

Would you do this again, knowing how some people view AI generated images?

Yes. This technology is a real disruption to existing ways of creating art, and as we develop other tools in other domains it will raise questions for the people who have mastered those skills. But these are tools. Humans have a long history of making improvements to the way we work and create; those improvements have enabled more creators and more ideas to be expressed, which in turn has allowed others to build new tools and start the process all over again. Creating anything uses the inputs of your experiences, your skills, and your influences (among others). AI image generation can give a lot more people skills that some only dream of.

The danger is that they also bring influences that are hidden and the originators remain uncredited and unrecognized for their contributions. If we can figure out a way to solve that problem (artist opt-out or some form of attribution perhaps), it will certainly improve the result.

In the end, it’s really quite appropriate for a set of tools with (so far) unrealized impact on the world to be the primary method of creating the cover for a book about the same.

Is AI image generation going to destroy art?

I’m not sure I’m qualified to answer this question.

Generally, new technology leads to new creativity - when film started to give way to digital photography, we didn’t lose photographs or movies, but they changed significantly, both in who could create them and what the average result would be. Some might argue that it’s been awful, but in my experience the people who care about the process of creation - making deliberate decisions with a purpose - and who pour their hearts into the results will continue making wonderful things for us to enjoy.