Since the recent advent of AI image generation, we at Allied Global Marketing have looked for ways to incorporate this groundbreaking technology into our workflow. And while AI text generators have certainly made their mark on so many facets of our business, image generation has revealed new creative possibilities as well.
Benefits of AI image generation
Adobe has built AI into Photoshop, public platforms like Midjourney have arisen, and we have begun hosting our own sandboxed versions of DALL-E and Stable Diffusion. What has come from these advances is unprecedented. Painstaking retouching tasks that would have taken hours can now be done in seconds, and original images that once required stock photo sourcing, silhouetting and image manipulation (with limited success) can now be produced in a fraction of the time, often with spectacular outcomes. And with a bit of trial and error, prompts can be finessed to yield countless options and versions until the desired images are produced.
During the brainstorming stage of creative assignments, we are able to use AI to help "sketch" and "storyboard" ideas and get a real sense of how they may appear, without extensive research and comping time.
As the software behind image-generating AI continues to evolve and improve, the quality and accuracy of the results get better as well. And, like its text-only counterparts, it allows for extreme detail in the prompts it will accept. Not only can it understand descriptions of the image itself, including point of view, illustrative style, lighting conditions and mood, but it can also interpret camera lens, focal length, aperture, depth of field, aspect ratio and more.
Recently we were given a key art assignment for a stage production of a holiday-themed acrobatics show. Since we were working concurrently with the production development, photographic assets were limited, yet we had descriptions of the plot, key characters, the setting, etc. So with a few keystrokes, we were able to help the producers (and ourselves) imagine costume options, poses and other character attributes. Versions of the images not only made it into the final key art, but also informed the actual production and costume design.
In another instance, we were pitching a new client with a spec presentation for which we were given no assets whatsoever. Historically in these instances, we would present mood boards and references to the client, along with a verbal description of our ideas, or put together mock concepts featuring images cobbled together from stock, pencil sketches or marker comps. These options often fell short of describing our vision. This time, we were able to build fully finished-looking comps, complete with (seemingly) photographic assets, customized to the client's aesthetic and tone.
An imperfect world
However, with the positive often comes the negative. Despite the impressiveness of the images we were able to conjure, misunderstandings, visual aberrations and outright errors abounded.
In both projects described above, bodies often appeared with three legs, or nine or eleven fingers. In one image, an acrobat's entire torso was backwards. On occasion, the AI would misrepresent or "misunderstand" the prompts it was given. In one case, multiple attempts to produce an image of a seemingly simple background character for a piece of key art yielded poor to unusable results. The prompts used to describe the character's build and pose were simply ignored.
As AI imagery is largely an amalgamation drawn from an admittedly massive library of reference material and assembled to satisfy a prompt, sometimes it just doesn't have the intuition to understand nuance the way a human artist would.
Text support
Quite recently, Google released its AI powerhouse, Gemini. Capable of many tasks previously unachievable, it has added to the artist's toolbox in a significant way. Namely, typography. Usable, legible, customizable typography. Now, images can be generated with type incorporated seamlessly into a scene. And with a bit of trial and error with prompts, the results can be staggering.
Proprietary security
As with their text-based counterparts, online AI image generators are learning models. In other words, the prompts users enter and the images they generate can be retained and added to the software's "intelligence," becoming part of the repertoire from which it generates new content. Therefore, if we at Allied were to enter any client's privileged information, we'd effectively be releasing it to the public, albeit contained behind the AI. We cannot allow this. For that reason, Allied's Data & Technology team has devised proprietary versions of these programs built into our AI framework, Allied GPT. Encrypted and available only to Allied employees with valid credentials, these versions do not share inputs with the world. They can access outside information while protecting any content we input into them. This protects our clients' privacy and information, and allows us to harness the power of these groundbreaking tools and securely use them for our clients.
The value of AI
Of course, image generation is merely a tool, one in an ever-growing toolbox we as creative professionals use to solve creative challenges for our clients. Like the advent of digital photo manipulation and page layout software some 30 years ago, it enhances our capabilities, but does not replace our individual creativity or thought processes. It is for this reason that we can embrace these new tools, remaining cautiously optimistic that our human creativity will always be needed to drive the process, and continue to be in high demand.
Taken at face value, AI imagery has provided a quantum leap for creative advertising. What it has enabled us as artists to do has had a lasting effect on our work process, and on the creativity we can offer our clients. But it is not a solution to every creative challenge, and cannot solve every visual problem.
Yet.