Journal of Computing and Information Science in Engineering
Do Large Language Models Produce Diverse Design Concepts?
A Comparative Study with Human-Crowdsourced Solutions
Our overall objective is to better understand LLMs' ability to generate diverse design solutions, tested across a range of design problems and LLM input parameters. For each design topic, we generated 800 design solutions using GPT-4 across different generative parameters and prompt engineering techniques, and retrieved a further 100 design solutions via crowdsourcing. All solutions were then converted into vector embeddings, which were used to measure diversity for quantitative comparison. Repeating this process across five design problems yielded a total of 4,000 LLM-generated design solutions and 500 crowdsourced solutions.
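The embedding-based diversity comparison can be sketched as follows. This is a minimal illustration, not the study's actual code: it assumes mean pairwise cosine distance as the diversity metric (one common choice; the paper's exact metrics may differ) and takes pre-computed embedding vectors as input.

```python
import numpy as np

def mean_pairwise_cosine_distance(embeddings: np.ndarray) -> float:
    """Average cosine distance over all unique pairs of embeddings.

    Higher values indicate a more diverse set of solutions. This is one
    common diversity metric, used here purely for illustration.
    """
    # Normalize each embedding to unit length.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    # Cosine similarity matrix, converted to distance.
    dist = 1.0 - unit @ unit.T
    # Average over the strict upper triangle (each pair counted once).
    iu = np.triu_indices(len(embeddings), k=1)
    return float(dist[iu].mean())

# Identical vectors have zero diversity; orthogonal vectors score 1.
same = np.tile(np.array([[1.0, 0.0]]), (3, 1))
ortho = np.array([[1.0, 0.0], [0.0, 1.0]])
```

With embeddings for each set of solutions (e.g., 800 LLM-generated versus 100 crowdsourced per topic), the same function scores both sets on a common scale.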
Access to large amounts of diverse design solutions can support designers during the early stage of the design process. In this article, we explored the efficacy of large language models (LLMs) in producing diverse design solutions, investigating the level of impact that parameter tuning and various prompt engineering techniques can have on the diversity of LLM-generated design solutions. Specifically, we used an LLM (GPT-4) to generate a total of 4,000 design solutions across five distinct design topics, eight combinations of parameters, and eight different types of prompt engineering techniques, leading to 50 LLM-generated solutions for each combination of method and design topic. Those LLM-generated design solutions were compared against 100 human-crowdsourced solutions in each design topic using the same set of diversity metrics. Results indicated that, across the five design topics tested, human-generated solutions consistently had greater diversity scores. Using a post hoc logistic regression analysis, we also found that there is a meaningful semantic divide between human- and LLM-generated solutions in some design topics, but not in others. Taken together, these results contribute to the understanding of LLMs' capabilities and limitations in generating a large volume of diverse design solutions and offer insights for future research that leverages LLMs to generate diverse design solutions for a broad range of design tasks (e.g., inspirational stimuli).
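The post hoc logistic regression analysis mentioned above can be illustrated with a small sketch: train a classifier to distinguish human from LLM embeddings, and read its cross-validated accuracy as evidence for (high accuracy) or against (chance-level accuracy) a semantic divide. The synthetic clusters below are hypothetical stand-ins for the study's real embedding sets.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the two embedding sets; the real study
# embedded crowdsourced and GPT-4-generated design solutions.
human_emb = rng.normal(loc=0.0, scale=1.0, size=(100, 16))
llm_emb = rng.normal(loc=2.0, scale=1.0, size=(100, 16))

X = np.vstack([human_emb, llm_emb])
y = np.array([0] * len(human_emb) + [1] * len(llm_emb))

# Cross-validated accuracy well above 0.5 suggests the two sets occupy
# separable regions of embedding space (a "semantic divide").
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
mean_accuracy = scores.mean()
```

For topics where the two sets overlap semantically, the same procedure would return accuracy near chance (0.5) instead.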
Associated Researchers
Christopher McComb
Carnegie Mellon University
Kosa Goucher-Lambert
University of California, Berkeley
Kevin Ma
University of California, Berkeley
Related Publications
2025
RECALL-MM: A Multimodal Dataset of Consumer Product Recalls for Risk Analysis using Computational Methods and Large Language Models
New multi-modal design dataset contains historical information about…
2025
Fundamental Investigation of the Interface Formation of Multi-material Additive Manufactured 316L-CuSn10 Structures
This study investigates the interfacial properties and bonding…
2024
Evaluating Large Language Models for Material Selection
This work evaluates the use of LLMs for predicting materials of…
2024
Elicitron: An LLM Agent-Based Simulation Framework for Design Requirements Elicitation
A novel framework that leverages Large Language Models (LLMs) to…
2024
DesignQA: A Multimodal Benchmark for Evaluating Large Language Models’ Understanding of Engineering Documentation
Novel benchmark aimed at evaluating the proficiency of multimodal…