When Creating Means Describing
“A crowded city street at dusk, soft rain, cinematic lighting.” And within seconds, the model will deliver unique images that match your input.
Generative artificial intelligence is no longer confined to language. In recent years, models capable of producing realistic images and videos from text descriptions have rapidly entered professional and everyday workflows. What once required cameras, studios or advanced design skills can now be achieved in seconds through a short written prompt: a description, sometimes a single sentence, is enough to produce a detailed visual result. Between intention and result, there is now a pause, brief and often unnoticed, in which the creator just… waits.
The Importance of Description
This shift is not only technological; it is changing how we think about images, creativity and visual communication. To generate an image, one must first decide how to describe it. Choices about style, mood, context and perspective are made in language before anything is seen, shaping the range of possible outcomes in advance. This reorders the creative process. Instead of looking at an image and adjusting it step by step, creators write a description, generate several images and then choose the one that works best. Visual decisions that once emerged during production are now embedded in description. The moment of waiting becomes integral to the process: a short interval in which authorship is suspended and the outcome is uncertain. Creativity unfolds not through continuous action, but through discrete acts of description, each followed by a short delay.
This has clear practical advantages. Language is accessible. Many people who cannot draw, design or take a high-quality photograph can still describe what they want to see. Ideas that might have remained abstract can quickly be turned into images that support communication, learning or planning. In this sense, AI image generation lowers barriers and expands participation. The benefits of this transformation are already visible. Designers use generative images to explore ideas more quickly. Educators visualize abstract concepts for students. Marketing teams test visual directions before committing resources.
But Complexity Is Hard to Describe
At the same time, language brings its own limits. Not everything is easy to describe. Some ideas are vague, emotional or embodied. Some visual experiences resist clear naming. When images are generated through text, what can be easily described is more likely to appear, while what lacks words may remain unseen. Prompts often rely on shared references and familiar categories. As a result, generated images tend to follow recognizable patterns, and diversity depends not only on the model, but also on the language used to guide it.
This shift helps explain the contrasting reactions to AI-generated images. For some, generative AI is a natural continuation of past technologies that expanded creative possibilities, increased productivity and lowered barriers to visual expression. From this perspective, it offers more people the ability to work with images, experiment with ideas and communicate visually without specialized skills.
Art Becomes the Word
For others, the same developments raise serious concerns. When images are generated quickly from language, creative labor can appear interchangeable, styles can be reproduced without clear consent and visual production risks becoming more industrial. What is framed as efficiency and democratization can also feel like a loss of authorship, value and recognition for those whose work has shaped the visual languages these systems now reproduce.
What is certain is that Ananda K. Coomaraswamy’s claim — “It is by a word conceived in intellect that the artist, whether human or divine, works” — has acquired a new, technical resonance. Today, the word is not only the beginning of creation; it is almost the entirety of the human contribution. After it is written, the rest unfolds elsewhere. The artist types, submits and waits… and the image arrives.