Multimodal Architecture: Applications of Language in a Machine Learning Aided Design Process

A conversation with George Guida on the relationship between language and architecture, and on his research developed at Harvard GSD, winner of the Harvard Digital Design Prize.

“Multimodal Architecture: Applications of Language in a Machine Learning Aided Design Process” explores how artificial intelligence can be used as a creative medium to reframe the practice of design itself. Developed within the context of Harvard GSD and winner of the Harvard Digital Design Prize, the project probes the imbricated ways in which machines read and transform words and images and how human designers might leverage such techniques to multiply imaginative possibilities. In this conversation with George Guida we discuss the relationship between language and architecture and how this often-secondary medium of design will play an integral role within design practices in the coming years through the use of machine-learning algorithms.

Image generation. ©George Guida

KOOZ What prompted your interest in AI?

GG My interest in AI began in 2017, while I was working as an architect at Foster + Partners in London. During this time I came to appreciate the value that colleagues with computational skills brought to each project: one person with these skills could produce the same or a better outcome than an entire team without them, in the same amount of time.

Designing large projects in an architectural practice requires a constant assessment of performance criteria (environmental, structural, floor plan efficiencies and so on), paired with a rapid optioneering design phase. While computational design offers these possibilities, I came to appreciate how machine learning and statistical neural networks take this to a new level.

My interest in AI, therefore, began from a more practical perspective and evolved more recently at the Harvard Graduate School of Design into an interest in changing design practices and creative applications of machine learning in architecture. With this, I became fascinated by expanding the field of collaborative human-machine design practices.

KOOZ By building from language and text, how does the project reframe the relationship between architecture and script?

GG Architecture is ultimately never just a building; it is also the discourse and process that defines it. It is interesting to see how within our current multimodal design processes - through the combined use of text, images, 3D models, videos, etc. - there has been an inherent denial that language has any significance. Adrian Forty describes how, throughout modernism, there was a belief that building and drawing were the only mediums of the architect, with architects such as Mies famously saying “Build, don’t talk”.

This project uses the recent convergence of text and image processing into multimodal machine learning models to challenge this issue. Many will have seen the recent explosion of text-generated images with OpenAI’s DALL-E 2, Midjourney or Stable Diffusion. What these provoke is a repositioning of the role of language within the creative process. Through the mediatization of form, text and verbal speech find new relevance as a key medium of design, directly implying materiality and form. The project brings forward machine-learning aided design methodologies in which architects maintain agency within this collaborative process, and ultimately applies them to a design case study: the recent MAXXI extension competition in Rome, Italy.

Writing MAXXI Museum Extension. ©George Guida

KOOZ How and to what extent might this affect the way we as architects approach and implement our vocabulary?

GG AI models today are imperfect. They learn and adapt in response to the datasets we feed them, which in turn reflect social and cultural norms. When you think of these recent models, they are all trained on millions of images and text captions “scraped” from many internet sources. This means that as we use them we must go through a reverse process of learning how to speak the language through which the model has learned to interpret images and recognize features.

We, therefore, find ourselves writing fragmented text prompts such as “A modern research building by Carlo Scarpa and Zaha Hadid, front elevation view, hyper-realistic, trending on artstation”. What becomes interesting is the interplay between the semantic specificity of text, where one has control over content, materiality, style, and mood, to name a few, and an intentional ambiguity in language. What this means is a play between our agency as architects and that of the machine.
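As an editorial aside, the fragmented prompt quoted above can be read as a small set of slots: a subject, named references, a view, and style modifiers. The sketch below is only an illustration of that reading; the function `build_prompt` and its parameter names are hypothetical and are not part of the project or of any image-generation tool.

```python
def build_prompt(subject, references=(), view=None, modifiers=()):
    """Join prompt fragments into one comma-separated text prompt.

    All names here are hypothetical; this only illustrates how the
    quoted prompt decomposes into content, reference, view and style.
    """
    head = subject
    if references:
        # Named architects are attached to the subject with "by ... and ...".
        head = f"{subject} by {' and '.join(references)}"
    parts = [head]
    if view:
        parts.append(view)
    # Style and mood modifiers are appended as loose fragments.
    parts.extend(modifiers)
    return ", ".join(parts)


prompt = build_prompt(
    "A modern research building",
    references=["Carlo Scarpa", "Zaha Hadid"],
    view="front elevation view",
    modifiers=["hyper-realistic", "trending on artstation"],
)
print(prompt)
```

Leaving a slot empty, for instance omitting the view, is one way to introduce the intentional ambiguity described above and hand more of the interpretation to the model.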

KOOZ By relying on a series of references, whether built or conceptual, how does the project challenge the power of making architecture through language?

GG This can most noticeably be seen in art, for example through the works of Sol LeWitt, where the role of the creative author is hands-off, yet art itself can be considered both the prompt and the artistic output. The idea of deferred authorship, as Kyle Steinfeld describes, does not make the author secondary but a principal driver and collaborator within the process. In architecture you can see this in the experimental CDLT House by Morphosis and Michael Rotondi, where only verbal instructions were provided daily to the contractor, resulting in an architecture of interpretation.

This, therefore, will gradually redefine the meaning of the architectural brief. As the first description of space, this text’s qualitative ambiguity enables an open design interpretation. I could see a future where a project's typical construction drawing becomes a set of written or verbal instructions where architects redesign descriptive briefs into architecture. Another interesting example is OMA’s Bibliothèque Nationale (1989) competition submission, where their interpretation of the brief’s description “to imagine a building where the most important parts would be absences of building” led to the full conceptual basis of their proposal, composed of voids within voids. Rem Koolhaas is unique in describing how at the beginning of each of his projects, “there is not writing but a definition in words – a text – a concept, ambition, or theme that is put into words, and only at the moment that it is put into words can we begin to proceed to think about architecture; the words unleash the design” (Koolhaas, 1993).

KOOZ The project explores how “designers might leverage AI to multiply imaginative possibilities.” What are for you the opportunities offered by such a multiplication?

GG We are only at the beginning of this new design paradigm. Just comparing the quality of the results I was generating more than a year ago with AttnGAN or CLIP to what is possible now reflects Moore’s Law (which is now slowing down!). Machine learning and in particular generative adversarial networks (GANs) are beneficial in giving architects new tools to tackle creative problems. Beyond the benefits of optimization, these permit us to classify, generate and navigate across large amounts of data. Being able to explore multidimensional “latent spaces” permits us, for example, to visually navigate between Gothic and Post-modern architectural imagery and see hybrid forms or styles. We are gradually unfolding Negroponte’s “The Architecture Machine” (1970), where machines are used to enhance the creative process of architectural production. With models such as NVIDIA’s Instant NeRF, for example, we are entering a phase of image-to-3D scene creation, where we will soon be able to navigate non-linearly from text, or simply thought, to image to 3D designs.
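As an editorial aside, navigating a latent space between two styles is often illustrated as interpolation between two latent vectors, each of which a trained generator would decode into an image. The sketch below assumes plain linear interpolation and uses random 8-dimensional stand-in vectors; the names `z_gothic` and `z_postmodern` are hypothetical, and no generator or trained model from the project is involved.

```python
import random


def lerp(z_a, z_b, t):
    """Linearly blend two latent vectors; t=0 gives z_a, t=1 gives z_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


def interpolation_path(z_a, z_b, steps):
    """Return `steps` evenly spaced latent vectors from z_a to z_b inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [lerp(z_a, z_b, i / (steps - 1)) for i in range(steps)]


# Stand-in latent codes (assumption): in practice these would be the codes
# a trained model associates with Gothic and Post-modern imagery.
rng = random.Random(0)
z_gothic = [rng.gauss(0, 1) for _ in range(8)]
z_postmodern = [rng.gauss(0, 1) for _ in range(8)]

path = interpolation_path(z_gothic, z_postmodern, steps=5)
print(len(path), "latent vectors from one style to the other")
```

Each intermediate vector in `path` would be passed to the generator to render a hybrid image; spherical interpolation is a common alternative when latent codes are drawn from a Gaussian prior.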

KOOZ What is for you the power of the architectural imaginary?

GG The democratization of many of these AI tools, away from computer science or Python-based software, is unleashing new exciting forms of participatory design. The speed at which we are increasingly able to navigate from a concept idea to a detailed design is unprecedented. Some fear it as the death of the illustrator, graphic designer, or soon the architect, but what these tools can also do is provide agency in creating new imaginary visions, narratives, and discourse. The many unbuilt images or forms give us a critical view of the biases and cultural norms ingrained in each training dataset. The architectural imaginary (AI) is the starting point, and as Alan Turing said: “This is only a foretaste of what is to come, and only the shadow of what is going to be” (1949).

Bio

George Guida is a research associate at the Harvard Laboratory for Design Technologies and co-founder of the design practice ArchiTAG. His research synthesizes design and technology into collaborative and participatory solutions through machine learning, human-computer interfaces, and immersive mixed realities. He completed his studies at the Architectural Association and Harvard Graduate School of Design and has worked for several years as an ARB RIBA architect at Foster + Partners in London and most recently at the MIT Media Lab.

Francesca Romana Forlini is an architect, Ph.D., editor, writer and educator whose research is located at the intersection of feminism, cultural sociology and architectural history and theory. She is an Adjunct Associate Professor at the New York Institute of Technology and Parsons The New School in New York. She worked as chief editor at KoozArch, where she is currently a contributor. She is a Fulbrighter and an alumna of Harvard Graduate School of Design (GSD) and the RCA.

Interviewee(s)

George Guida

Interviewer(s)

Francesca Romana Forlini x KoozArch

Published

22 Dec 2022

Reading time

10 minutes

Topics

Digital Language