International Conference on Learning Representations (ICLR)

Learned Visual Features to Textual Explanations

TExplain projects the learned visual representations of a frozen image classifier onto a space that an independently trained language model can interpret. Using a large number of generated sentence samples along with the visual representation, TExplain produces a word cloud for each visual representation. Blue and green denote frozen and trainable parameters, respectively. The category of the feature representation is highlighted in red, while other captured features are shown in gray. The font size of each word indicates the strength of its corresponding feature.

Saeid Asgari Taghanaki, Aliasghar Khani, Amir Khasahmadi, Aditya Sanghi, Karl D.D. Willis, Ali Mahdavi-Amiri

Abstract

Interpreting the learned features of vision models is a longstanding challenge in machine learning. To address it, we propose a novel method that leverages large language models (LLMs) to interpret the learned features of pre-trained image classifiers. Our method, called TExplain, trains a neural network to connect the feature space of image classifiers to that of LLMs. Then, during inference, it generates a large number of sentences to explain the features the classifier has learned for a given image. The most frequent words are extracted from these sentences, summarizing the features and patterns the classifier has captured. Our method is the first to use the frequent words corresponding to a visual representation to provide insight into the decision-making process of an independently trained classifier, enabling the detection of spurious correlations and biases and a deeper understanding of the classifier's behavior. To validate the effectiveness of our approach, we conduct experiments on diverse datasets, including ImageNet-9L and Waterbirds. The results demonstrate the potential of our method to enhance the interpretability and robustness of image classifiers.
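The word-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the function names `frequent_words` and `relative_strength`, and the hard-coded sample sentences are all assumptions made for illustration. In the described method, the sentences would be generated by the language model from the projected classifier features.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the source does not say how words are filtered.
STOP_WORDS = {"a", "an", "the", "of", "on", "in", "is", "and", "with", "at", "to"}


def frequent_words(sentences, top_k=5):
    """Return the top_k most frequent non-stop words across all sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_k)


def relative_strength(word_counts):
    """Scale counts so the most frequent word has weight 1.0.

    Such weights could drive the font size in a word cloud (an assumption;
    the source only says font size indicates feature strength).
    """
    if not word_counts:
        return {}
    top = max(count for _, count in word_counts)
    return {word: count / top for word, count in word_counts}


# Stand-in sentences; real samples would come from the language model.
samples = [
    "a bird standing on water",
    "a bird near the water",
    "a small bird on a branch",
]
top_words = frequent_words(samples, top_k=3)
weights = relative_strength(top_words)
```

In this toy input, "water" appears almost as often as "bird". For a bird classifier, a frequent background word of this kind is the sort of signal that might point to a spurious correlation.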


Associated Researchers

Saeid Asgari

Former Autodesk

Aliasghar Khani

Machine Learning Research Scientist

Amir Khasahmadi

Senior AI Research Scientist

Aditya Sanghi

Principal AI Research Scientist

Karl Willis

Senior Manager, Research Science

Ali Mahdavi-Amiri

School of Computing Science, Simon Fraser University
