Journal of Computing and Information Science in Engineering
Do Large Language Models Produce Diverse Design Concepts?
A Comparative Study with Human-Crowdsourced Solutions
Our overall objective is to better understand LLMs' ability to generate diverse design solutions, tested across a range of design problems and LLM input parameters. For each design topic, we generated 800 design solutions using GPT-4 across different generative parameters and prompt engineering techniques, and retrieved a further 100 design solutions via crowdsourcing. All solutions were then converted into vector embeddings, which were used to measure diversity for quantitative comparison. Repeating this process across five design problems yielded a total of 4,000 LLM-generated design solutions and 500 crowdsourced solutions.
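The embedding-based diversity comparison can be sketched as follows. This is a minimal illustration, not the study's actual code: it assumes mean pairwise cosine distance as the diversity metric (one common choice; the paper's exact metrics may differ) and takes pre-computed embedding vectors as input.

```python
import numpy as np

def mean_pairwise_cosine_distance(embeddings: np.ndarray) -> float:
    """Average cosine distance over all unique pairs of embeddings.

    Higher values indicate a more diverse set of solutions. This is one
    common diversity metric, used here purely for illustration.
    """
    # Normalize each embedding to unit length.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    # Cosine similarity matrix, converted to distance.
    dist = 1.0 - unit @ unit.T
    # Average over the strict upper triangle (each pair counted once).
    iu = np.triu_indices(len(embeddings), k=1)
    return float(dist[iu].mean())

# Identical vectors have zero diversity; orthogonal vectors score 1.
same = np.tile(np.array([[1.0, 0.0]]), (3, 1))
ortho = np.array([[1.0, 0.0], [0.0, 1.0]])
```

With embeddings for each set of solutions (e.g., 800 LLM-generated versus 100 crowdsourced per topic), the same function scores both sets on a common scale.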
Access to large amounts of diverse design solutions can support designers during the early stage of the design process. In this article, we explored the efficacy of large language models (LLMs) in producing diverse design solutions, investigating the level of impact that parameter tuning and various prompt engineering techniques can have on the diversity of LLM-generated design solutions. Specifically, we used an LLM (GPT-4) to generate a total of 4,000 design solutions across five distinct design topics, eight combinations of parameters, and eight different types of prompt engineering techniques, leading to 50 LLM-generated solutions for each combination of method and design topic. Those LLM-generated design solutions were compared against 100 human-crowdsourced solutions in each design topic using the same set of diversity metrics. Results indicated that, across the five design topics tested, human-generated solutions consistently had greater diversity scores. Using a post hoc logistic regression analysis, we also found that there is a meaningful semantic divide between human- and LLM-generated solutions in some design topics, but not in others. Taken together, these results contribute to the understanding of LLMs' capabilities and limitations in generating a large volume of diverse design solutions and offer insights for future research that leverages LLMs to generate diverse design solutions for a broad range of design tasks (e.g., inspirational stimuli).
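The post hoc logistic regression analysis mentioned above can be illustrated with a small sketch: train a classifier to distinguish human from LLM embeddings, and read its cross-validated accuracy as evidence for (high accuracy) or against (chance-level accuracy) a semantic divide. The synthetic clusters below are hypothetical stand-ins for the study's real embedding sets.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the two embedding sets; the real study
# embedded crowdsourced and GPT-4-generated design solutions.
human_emb = rng.normal(loc=0.0, scale=1.0, size=(100, 16))
llm_emb = rng.normal(loc=2.0, scale=1.0, size=(100, 16))

X = np.vstack([human_emb, llm_emb])
y = np.array([0] * len(human_emb) + [1] * len(llm_emb))

# Cross-validated accuracy well above 0.5 suggests the two sets occupy
# separable regions of embedding space (a "semantic divide").
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
mean_accuracy = scores.mean()
```

For topics where the two sets overlap semantically, the same procedure would return accuracy near chance (0.5) instead.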
Associated Researchers
Christopher McComb
Carnegie Mellon University
Kosa Goucher-Lambert
University of California, Berkeley
Kevin Ma
University of California, Berkeley
Related Publications
2025
RECALL-MM: A Multimodal Dataset of Consumer Product Recalls for Risk Analysis using Computational Methods and Large Language Models
New multi-modal design dataset contains historical information about…
2025
Fundamental Investigation of the Interface Formation of Multi-material Additive Manufactured 316L-CuSn10 Structures
This study investigates the interfacial properties and bonding…
2024
Evaluating Large Language Models for Material Selection
This work evaluates the use of LLMs for predicting materials of…
2024
Elicitron: An LLM Agent-Based Simulation Framework for Design Requirements Elicitation
A novel framework that leverages Large Language Models (LLMs) to…
2024
DesignQA: A Multimodal Benchmark for Evaluating Large Language Models’ Understanding of Engineering Documentation
Novel benchmark aimed at evaluating the proficiency of multimodal…