Digital Aura of Generative AI in the Built Environment: A Brief History from Generative Adversarial
Networks to Generative Pre-trained Transformers
Haoqing Xu
In the mid-2010s, the machine-learning revolution entered the field of visual arts. Artists later discovered that generative adversarial networks (GANs) were particularly suitable for image manipulation. The GAN is a mode of generative AI that demonstrates the capability of manipulating images at the level of pixels. The emergence of deep-learning-based generative artificial intelligence (AI) began in the summer of 2022, with groundbreaking Generative Pre-trained Transformers (GPTs) such as DALL-E that create pictures from language prompts.
In the architecture industry, the generation of digital AI images has become an indispensable practice, driven by the development of generative AI models. By tracing the evolution of painting tools and modes of production from a historical perspective, the thesis challenges traditional methods of creating art and calls for a new vision and understanding of generative AI as a creative tool in the fields of design and art.
The thesis traces the changing styles and methods of digital generative AI that have shaped architecture over the past decades. Generative AI in architecture is analysed in two ways, through its aesthetics and through its modes of production, following the historical timeline.
Painting tools have evolved from traditional painting equipment to digital design applications, an evolution rooted in changing relations of production. Karl Marx and Friedrich Engels discussed the term ‘reproduction’,[1] holding that the time and labour of humans can be reproduced in the form of capital. Walter Benjamin’s concept of ‘the work of art designed for reproducibility’[2] anticipates a future mode of artistic production in which ‘mechanical reproduction of art changes the reaction of the masses towards art.’[3] Turning to the mechanised mode of production, Sigfried Giedion argues that mechanisation contributes to an ‘anonymous history’, one concerned with the broader, often anonymous forces that shaped technological and social change. Technology, as a central discourse, forms historical memories and produces innovative perspectives.
In light of evolving forms of digital art and digital technologies, particularly AI and machine learning, an evaluation of the form of Benjamin’s ‘aura’[4] in the machine-learning production of art is necessary. Generative AI imitates artworks from originals, derives artistic styles from its dataset, and creatively processes content via transformers. In contrast to the era of industrialisation, in which indistinguishable copies were mass-produced, AI algorithms can produce art in seconds. Carpo’s essay ‘Imitation Games’ advances a critical awareness that invention by AI is predicated on some form of assimilation or imitation.[5] Drawing on Benjamin’s ‘aura’[6] is a means of understanding the phenomenon of mass reproduction in the era of mass customisation, in which each reproduced product can be identifiable and unique.
The motivation to explore generative models as a design technique for architecture may be the desire to find an alternative design methodology. Before the emergence of diffusion models, AttnGAN had already been used in an attempt to bring text-to-image neural networks into the architectural design process.[7] This approach to architectural design takes language rather than images as its starting point, as stated by Matias del Campo in Ontology of Diffusion Models, published in 2024.[8] Moreover, ControlNet, introduced in 2023, lifted the veil on the potential of diffusion models.[9] ControlNet can effectively control Stable Diffusion with single or multiple conditions, with or without prompts, which suggests that the diffusion model is a progressive model that can continue to evolve.
Tools challenge traditional dualism: they are the only cultural products designed to produce something else. The adoption of emerging techniques and technologies has been crucial to the progress of civilisation, as the historical discourse on technology shows. In the end, however, it is the human mind that decodes the various meanings and cultural products present in the imagery produced by diffusion models, because machines do not understand the underlying meanings in images.
To draw a parallel between cultural reproduction and mechanical reproduction, the thesis turns to Pierre Bourdieu’s concept of cultural reproduction. In Bourdieu’s discourse, art has become a form of cultural capital. He carries forward the historical-materialist viewpoint of Marx, also taken up by the Frankfurt School, in which artworks belong to the superstructure of a culture. Similarly, Benjamin examined the capacity of mass production, especially in the era of photography and film, to contribute to the aesthetics and politicisation of art. Benjamin’s The Work of Art in the Age of Mechanical Reproduction, which can be read as extending the ideas of the Frankfurt School, fills the gap by reminding us that there is still optimism among the masses and thus positivity in mass culture.
The concluding image test via ControlNet serves as an experiment to test the hypothesis that there is continuity from GAN to GPT, and from 2D to 3D, taking the user’s end as the starting point for lifting the veil on the algorithms. The team led by Ulyanov separated the digital ‘style’ and ‘content’ of an image and found an improved solution that successfully transferred styles.[10] Here the notion of ‘style’ recalls the styles of architecture, whether created by different architects or in different periods. Because the logic and scientific reasoning behind each image have been analysed in depth, there is a solid foundation for regarding the generated images as reliable and meaningful.
AI can handle the rhythm and poetic passages of video production. Sora uses a diffusion model to create realistic and imaginative scenes from text instructions, demonstrating processes at a higher level with consistent characters and realistic elements. The character in Air Head is a man with a balloon for a face, shown walking through a cactus store, who is created by imitating and modifying original balloons and suits. The transformation and mixing of the originals resemble surrealist painting techniques that create magnificent contrasts among the elements. The essential elements and their textures evoke resonance in an imagined scene. Referring to the gestures of modernism, Zumthor recalled that the feel of the door handle in his home is his most vivid memory. He asks ‘what the use of a particular material could mean in a specific architectural context’, a question whose answers can ‘[...] throw new light onto both the way in which the material is generally used and its own inherent sensuous qualities.’[11] The composition of textures and materials brings out a unique existence in space and time, which is the aura missing from mechanical reproduction.
The aesthetic evaluation of AI-generated artefacts in Tim Fu’s Studio influences the viewers’ perception. This may lead to less credit for the human artist and more credit for the creators of the technology. Future work is needed to quantify and diversify the outputs in order to understand generative AI’s influence on aesthetics. The value of generative AI lies not only in its ability to create artistic works but also in its potential to democratise video creation, making it accessible and inclusive for everyone.
The present research tested the hypothesis that digital style and content have been imitated by generative AI. Traditional artworks have an aura that was lost in the era of mechanical reproduction, in photographs and films, but the aura returns in generative AI images. One limitation of the present research is that an aura is already present in many architects’ drawings and creations, and the research lacks a comparison between the aura created by humans and that created by generative AI. However, the two kinds of aura differ only minimally, because they are like two artists using different painting tools.
Regarding the relationship between human labour and machines, there is a question of whether humans should adapt to tools or whether tools should be invented to meet humans’ needs. The answer is neither. Painting tools are tools of cultural production, part of the superstructure above the modes of production, and a significant counterpart to mechanical reproduction. Generative AI models such as GANs and GPTs will tend in this direction, boosting creativity and possibility in artworks. This helps generate identities and self-recognition, and reconnects the humanity lost in the previous era of manual digital drawing, in the light of AI and deep learning in almost every industry.
Endnotes
[1] Karl Marx, Capital Volume III: The Process of Capitalist Production as a Whole, ed. Friedrich Engels (New York: International Publishers, 1999), chap. 15.
[2] Walter Benjamin, ‘The Work of Art in the Age of Mechanical Reproduction,’ in Illuminations, ed. Hannah Arendt, trans. Harry Zohn (London: Pimlico, 1999), 218.
[3] Ibid., 227.
[4] Ibid., 214.
[5] Mario Carpo, ‘Imitation Games,’ Artforum 61, no. 10 (Summer 2023): 185.
[6] Benjamin, ‘The Work’, 218.
[7] See Matias Del Campo and Sandra Manninger, ‘Strange, But Familiar Enough: The Design Ecology of Neural Architecture’, Architectural Design 92, no. 3 (May 2022): 38–45, https://doi.org/10.1002/ad.2811.
[8] Matias Del Campo, ‘Ontology of Diffusion Model’, in Diffusions in Architecture: AI and Image Generators (London: Wiley, 2024), 44–54.
[9] Lvmin Zhang, Anyi Rao and Maneesh Agrawala, ‘Adding Conditional Control to Text-to-Image Diffusion Models’ (Preprint, submitted on 26 November 2023).
[10] Dmitry Ulyanov et al., ‘Texture Networks: Feed-forward Synthesis of Textures and Stylized Images’ (Preprint, submitted on 10 March 2016).
[11] Peter Zumthor, Thinking Architecture (Baden: Lars Müller, 1998), 11.
Figures
Figure 1. Created by Haoqing Xu. The Timeline of AI Development. 6 Sep 2024.
Figure 2. OpenAI. Air Head • Made by Shy Kids with Sora. 5 Apr 2024. YouTube video. https://www.youtube.com/watch?v=9oryIMNVtto.
Figure 3. OpenAI. Tim Fu • Sora Showcase. 18 Jul 2024. YouTube video. https://www.youtube.com/watch?v=y_4Kv_Xy7vs.