Abstract

A picture is worth a thousand words, and in design metric estimation, a word may be worth a thousand features. Pictures are awarded this worth because they can encode a plethora of information. When evaluating designs, we aim to capture a range of information as well, including the usefulness, uniqueness, and novelty of a design. The subjective nature of these concepts makes their evaluation difficult. Still, many attempts have been made and metrics developed to do so, because design evaluation is integral to the creation of novel solutions. The most commonly used metrics are the consensual assessment technique (CAT) and the Shah, Vargas-Hernandez, and Smith (SVS) method. While CAT is accurate and often regarded as the “gold standard,” it relies on expert ratings, making it expensive and time-consuming. By comparison, SVS is less resource-demanding, but it is often criticized as lacking sensitivity and accuracy. We combine the complementary strengths of both methods through machine learning. This study investigates the possibility of using machine learning to predict expert creativity assessments from more accessible nonexpert survey results. The SVS method results in a text-rich dataset about a design. We utilize these textual design representations and the deep semantic relationships that words and sentences encode to predict more desirable design metrics, including CAT metrics. We demonstrate the ability of machine learning models to predict design metrics from the design itself and SVS survey information. We show that incorporating natural language processing (NLP) improves prediction results across design metrics, and that clear distinctions in the predictability of certain metrics exist. Our code and additional information about our work are available on the MIT DeCoDE Lab website.
