Yiming Xu

PhD Fellow working on Multimodal Document Understanding at the Computer Vision Center, Universitat Autònoma de Barcelona.

I work in the Vision & Language Group at CVC, UAB, advised by Dimosthenis Karatzas and Ernest Valveny, where I build agentic systems and foundation models for document understanding.

Before Barcelona, I earned an MSc at the University of Amsterdam (thesis on Multimodal RAG) and spent two years as a principal developer building e-commerce AI agents.

Selected project

All projects →

Recent writing

All writing →

Recent publications

All publications →

Awards