Using AI on a live project - York Road
Artificial Intelligence (AI) software has been introduced to the masses and is being used for the generation of both text and images.
ChatGPT is based on a language model and supposedly helps users to create written work in response to a text prompt. Whilst this article doesn’t focus on the creation of text, using ChatGPT has been described as “AI collecting relevant information from the internet, placing it in a mixing bowl and spitting out a result.” We interpret this to mean that AI can understand the user’s text prompt, source relevant information from the internet and compose text that may appear convincing on the surface but, critically, cannot contextualise that text or present it with the correct narrative. The output may lack accurate sequencing, the weaving of storytelling or the emotional expression that a human would have been able to introduce.
Unity Architects decided to use two AI programmes for image generation, to understand how they may be used as design tools and whether they present the same “mixing bowl” output as ChatGPT. One is called Midjourney, which is a standalone AI programme aimed at the generation of imagery. The other is called Diffusion, which is an extension to SketchUp and is accessed directly within SketchUp.
There is no better way of testing a new design tool than to use it on a live project. Our York Road scheme in Leamington Spa is a private residential development and after navigating RIBA Stage 1 with the Client, it was time to explore what the scheme could be during RIBA Stage 2 – Concept Design.
Each architectural practice is different, but we like to understand the constraints and opportunities in plan initially, to give us a general project direction prior to the creation of 3D modelling and imagery. To this end, the concept plan below roughly captured the underlying architectural strategy that was to be developed further:
After agreeing the general principles in plan with the Client, we needed to explore the massing, volume and form of the proposal. This began by extruding the sketch plan to aid the Client’s understanding of the spaces, coupled with the embryonic creation of internal and external eye-level images to express some of the main architectural moves, such as key lines of sight:
Once the spatial relationships had been agreed, it was time to understand what the built form needed to be and how the architecture could support the way the Client wanted to use the spaces. This was the moment we introduced generative AI programmes to the workflow.
As mentioned earlier, Midjourney is a standalone AI programme, whereas Diffusion is directly embedded within SketchUp. This resulted in two approaches to how the programmes could be used and naturally this presented different outcomes.
DIFFUSION
We will look at SketchUp Diffusion initially, as the visual variations are easier to understand. Below you can see a sequence of images.
In the top-left corner, you can see the basic massing image we started with. The images that follow were created using Diffusion (SketchUp’s generative AI programme). They used Diffusion’s pre-set templates and therefore represent a ‘click of a button’ output. Starting with a basic massing model, you are able to create the other images rapidly (within seconds).
A benefit of using Diffusion is speed. It is incredibly simple to use and presents the designer and client with a different way of looking at the massing image almost instantaneously. This includes the potential play of light and material application.
Another advantage of using Diffusion is the ability to respect the original geometry. This aspect will become clearer when discussing Midjourney. One of Diffusion’s strengths is that you can still visually appreciate the intended architecture after AI has done its work. If you look back at the images created with Diffusion, they all look alike and can all be visually traced back to the original massing image.
Were any of the Diffusion AI generated images shown to the client? In this instance, no. It was purely an investigative process to understand how AI may contribute to the architectural workflow.
Did any of the images created by Diffusion help to develop the design during RIBA Stage 2? In this instance, no. However, some of the Diffusion images, such as the ‘Physical Model’ and ‘Photorealistic’ images, did add more life to the massing model and therefore helped validate the potential quality of such a scheme.
MIDJOURNEY
In the top-left corner, you can see the massing image that was uploaded to Midjourney, which was the basis for the AI programme to work upon. The original image was combined with text prompts, instructing Midjourney on how we wanted it to develop the image.
You will note that Midjourney is not as successful at respecting the original geometry, and it is difficult to trace the images back to the original architectural scheme.
A pro and a con of using Midjourney is that it does feel feral. It will bounce back images that are sometimes similar and sometimes wildly different to the original input. In a peculiar way, this allows the architect to visually appraise variations of a theme. If you are the creator of the architectural scheme and the direct user of Midjourney, inputting your purposely created text prompts, then you are able to extract certain qualities and characteristics from the Midjourney output.
Were any of the Midjourney AI generated images shown to the client? In this instance, no.
Did any of the images created by Midjourney help to develop the design during RIBA Stage 2? In this instance, yes. Midjourney churned out images that were abstract variations of the original architectural massing image. However, the images were sophisticated, including materials, daylight and interesting design variations of the original theme, so certain haphazard outputs could be opportunistically identified and reintroduced into the underlying architectural vision.
CONCLUSIONS
This is representative of Unity Architects’ first foray into generative AI images and is therefore far from an expert’s review. However, the introduction of Diffusion and Midjourney has positively contributed to this project. The speed and playful nature of the processes can support creative thinking and the identification of opportunistic principles.
Lessons learned from this process suggest that the introduction of generative AI during the early stages of RIBA Stage 2 – Concept Design can be beneficial to the architect’s workflow. Due to the lack of control over the output, especially when using Midjourney, it would be challenging to contemplate a use beyond RIBA Stage 2 at this moment in time.
Neither of the AI programmes altered the fundamental architectural principles of this project. It is safe to conclude that an architect is still necessary and generative AI images are in addition to the typical design processes and methodologies an architectural practice would perform.
With reference to the anti-contextual “mixing-bowl” output of ChatGPT, the AI-generated images are somewhat similar: they have a use but lack context and control. Just as ChatGPT output needs the user to review it, extract the useful portions of text and manually restructure them, AI image creation needs the same attention, to identify, adapt and reintroduce AI design principles to the original architectural scheme. Evidently, an iterative human and AI process is able to support the targeted and purposeful creation of architecture.
We are looking forward to the rapid evolution of this technology and its integration into future projects.