Simple Contrastive User Representation Learning from Command Sequences

SimCURL learns user representations from a large corpus of unlabeled command sequences. These learned representations are then transferred to multiple downstream tasks that have only limited labels available.


User modeling is crucial to understanding user behavior and essential for improving user experience and personalized recommendations. When users interact with software, vast amounts of command sequences are generated through logging and analytics systems. These command sequences contain clues to the users' goals and intents. However, these data modalities are highly unstructured and unlabeled, making it difficult for standard predictive systems to learn from. We propose SimCURL, a simple yet effective contrastive self-supervised deep learning framework that learns user representation from unlabeled command sequences. Our method introduces a user-session network architecture, as well as session dropout as a novel way of data augmentation. We train and evaluate our method on a real-world command sequence dataset of more than half a billion commands. Our method shows significant improvement over existing methods when the learned representation is transferred to downstream tasks such as experience and expertise classification.
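To make the augmentation idea concrete, here is a minimal sketch of how session dropout could produce two "views" of the same user for contrastive training. This is an illustrative assumption, not the paper's implementation: the function name, dropout rate, and fallback rule are hypothetical, and only the general idea (randomly dropping whole sessions from a user's history) comes from the abstract.

```python
import random

def session_dropout(user_sessions, p_drop=0.3, seed=None):
    """Return an augmented view of a user by randomly dropping whole
    sessions (hypothetical sketch; the dropout probability and the
    non-empty fallback are illustrative choices, not SimCURL's)."""
    rng = random.Random(seed)
    kept = [s for s in user_sessions if rng.random() > p_drop]
    # Keep at least one session so the augmented view is never empty.
    return kept if kept else [rng.choice(user_sessions)]

# Each session is a command sequence; two independent augmentations of
# the same user form a positive pair for the contrastive objective.
user = [["open", "extrude"], ["sketch", "fillet"], ["render", "export"]]
view_a = session_dropout(user, seed=1)
view_b = session_dropout(user, seed=2)
```

In a contrastive setup such as this, `view_a` and `view_b` would be encoded by the user-session network and pulled together in representation space, while views of different users are pushed apart.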

