Yiming Xu
I am currently a PhD Fellow at the Computer Vision Center (CVC), Universitat Autònoma de Barcelona (UAB). I am part of the Vision & Language Group, supervised by Dimosthenis Karatzas and Ernest Valveny. My research focuses on Multimodal Document Understanding.
Before starting my PhD, I received my master's degree from the University of Amsterdam, where I completed my thesis on Multimodal RAG under the supervision of Benno Kruit and Jan-Christoph Kalo. After graduating, I worked in industry for two years, focusing on AI agent development for e-commerce scenarios.

NEWS
May 2025: I started my PhD at the Computer Vision Center (CVC), Universitat Autònoma de Barcelona (UAB).
Jan 2025: I received the FPI Fellowship to support my PhD research at CVC, UAB.
Feb 2024: My master’s thesis was accepted to COLING 2024.
PUBLICATIONS

Retrieval-based Question Answering with Passage Expansion Using a Knowledge Graph
A retrieval QA method that expands passages with KG context to improve recall and answer quality.

Fine-grained label learning via siamese network for cross-modal information retrieval
Mining hard negatives to improve cross-modal information retrieval.
AWARDS
FPI (Formación de Personal Investigador) Fellowship
Chinese Academy of Sciences Science and Innovation Program Scholarship