Yiming Xu

I am currently a PhD Fellow at the Computer Vision Center (CVC), Universitat Autònoma de Barcelona (UAB). I am part of the Vision & Language Group, supervised by Dimosthenis Karatzas and Ernest Valveny. My research focuses on Multimodal Document Understanding.

Before starting my PhD, I received my master’s degree from the University of Amsterdam, where I completed my thesis on Multimodal RAG under the supervision of Benno Kruit and Jan-Christoph Kalo.

After completing my MSc, I spent two years in industry building AI agents for e-commerce scenarios. I served as the principal developer on our flagship agent project, which went on to raise close to USD 7 million in funding.


NEWS

  • May 2025: I started my PhD at the Computer Vision Center (CVC), Universitat Autònoma de Barcelona (UAB).

  • Jan 2025: I received the FPI Fellowship to support my PhD research at CVC, UAB.

  • Feb 2024: My master’s thesis work was accepted to COLING 2024.


PUBLICATIONS


AWARDS

  • FPI (Formación de Personal Investigador) Fellowship

  • Chinese Academy of Sciences Science and Innovation Program Scholarship