International Conference on Learning Representations (ICLR) 2024
SLiMe: Segment Like Me
Aliasghar Khani, Saeid Asgari, Aditya Sanghi, Ali Mahdavi-Amiri, Ghassan Hamarneh
Abstract
Significant strides have been made in using large vision-language models, such as Stable Diffusion (SD), for a variety of downstream tasks, including image editing, image correspondence, and 3D shape generation. Inspired by these advancements, we explore leveraging these large vision-language models to segment images at any desired granularity using as few as one annotated sample, and we propose SLiMe for this purpose. SLiMe frames the problem as an optimization task. Specifically, given a single training image and its segmentation mask, we first extract attention maps, including our novel “weighted accumulated self-attention map”, from the SD prior. Then, using the extracted attention maps, the text embeddings of Stable Diffusion are optimized so that each of them learns a single segmented region from the training image. These learned embeddings then highlight the segmented regions in the attention maps, which in turn can be used to derive the segmentation map. This enables SLiMe to segment any real-world image during inference at the granularity of the segmented regions in the training image, using just one example. Moreover, leveraging additional training data when available, i.e., in the few-shot setting, improves the performance of SLiMe. We carried out an extensive set of experiments examining various design factors and showed that SLiMe outperforms other existing one-shot and few-shot segmentation methods.
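The optimization idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it is hypothetical. It assumes random 2-D vectors as a stand-in for frozen Stable Diffusion pixel features, a softmax over feature-embedding dot products as a stand-in for cross-attention maps, and a plain cross-entropy loss. The weighted accumulated self-attention map is omitted entirely. Only the embeddings are updated, one per region, and a new image is then segmented by taking the arg-max over the attention maps.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(features, embeddings):
    # features:   (N, D) per-pixel features (ASSUMPTION: stand-in for frozen SD features)
    # embeddings: (K, D) learnable embeddings, one per segmented region
    # Returns an (N, K) map: how strongly each pixel attends to each embedding.
    return softmax(features @ embeddings.T, axis=1)

def fit_embeddings(features, mask, num_regions, steps=200, lr=0.5, seed=0):
    # Optimize only the embeddings so that each one highlights its own region
    # of the single annotated training image (the features stay fixed).
    rng = np.random.default_rng(seed)
    emb = 0.01 * rng.standard_normal((num_regions, features.shape[1]))
    onehot = np.eye(num_regions)[mask]  # (N, K) target maps from the mask
    for _ in range(steps):
        attn = attention(features, emb)
        # Gradient of the mean cross-entropy loss with respect to the embeddings.
        grad = (attn - onehot).T @ features / len(features)
        emb -= lr * grad
    return emb

def segment(features, emb):
    # Inference: label each pixel with the embedding it attends to most.
    return attention(features, emb).argmax(axis=1)

def toy_image(rng, n=200):
    # Hypothetical "image": n pixels, 3 regions, each with a distinct feature center.
    labels = rng.integers(0, 3, size=n)
    centers = np.array([[4.0, 0.0], [0.0, 4.0], [-4.0, -4.0]])
    feats = centers[labels] + 0.3 * rng.standard_normal((n, 2))
    return feats, labels

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    train_feats, train_mask = toy_image(rng)   # the one annotated example
    emb = fit_embeddings(train_feats, train_mask, num_regions=3)
    test_feats, test_mask = toy_image(rng)     # an unseen "image"
    pred = segment(test_feats, emb)
    print("toy pixel accuracy:", (pred == test_mask).mean())
```

In this sketch the features never change, which mirrors the abstract's description of optimizing the text embeddings rather than the model itself. Adding more annotated images (the few-shot case) would amount to stacking their features and masks before calling `fit_embeddings`.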
Associated Researchers
Aliasghar Khani
School of Computing Science, Simon Fraser University
Ali Mahdavi-Amiri
School of Computing Science, Simon Fraser University
Ghassan Hamarneh
School of Computing Science, Simon Fraser University