Yiming Xu
I am currently a PhD Fellow at the Computer Vision Center (CVC), Universitat Autònoma de Barcelona (UAB). I am part of the Vision & Language Group, supervised by Dimosthenis Karatzas and Ernest Valveny. My research focuses on Multimodal Document Understanding.
Before starting my PhD, I received my master’s degree from the University of Amsterdam, where I completed my thesis on Multimodal RAG under the supervision of Benno Kruit and Jan-Christoph Kalo.
After graduating from my MSc program, I spent two years in industry building AI agents for e-commerce scenarios. I served as the principal developer on our flagship agent project, which went on to raise close to USD 7 million in funding.
NEWS
May 2025: I started my PhD at the Computer Vision Center (CVC), Universitat Autònoma de Barcelona (UAB).
Jan 2025: I received the FPI Fellowship to support my PhD research at CVC, UAB.
Feb 2024: My master’s thesis work was accepted to COLING 2024.
PUBLICATIONS

Retrieval-based Question Answering with Passage Expansion Using a Knowledge Graph
A retrieval-based QA method that expands passages with knowledge graph context to improve recall and answer quality.

Fine-grained label learning via siamese network for cross-modal information retrieval
Mining hard negatives to improve cross-modal information retrieval.
AWARDS
FPI (Formación de Personal Investigador) Fellowship
Chinese Academy of Sciences Science and Innovation Program Scholarship