Yiming Xu

I am currently a PhD Fellow at the Computer Vision Center (CVC), Universitat Autònoma de Barcelona (UAB). I am part of the Vision & Language Group, supervised by Dimosthenis Karatzas and Ernest Valveny. My research focuses on Multimodal Document Understanding.

Before starting my PhD, I received my master's degree from the University of Amsterdam, where I completed my thesis on Multimodal RAG under the supervision of Benno Kruit and Jan-Christoph Kalo. After graduation, I worked in industry for two years, focusing on AI agent development in e-commerce scenarios.


NEWS

  • May 2025: I started my PhD at the Computer Vision Center (CVC), Universitat Autònoma de Barcelona (UAB).

  • Jan 2025: I received the FPI Fellowship to support my PhD research at CVC, UAB.

  • Feb 2024: My master’s thesis was accepted to COLING 2024.


PUBLICATIONS


AWARDS

  • FPI (Formación de Personal Investigador) Fellowship

  • Chinese Academy of Sciences Science and Innovation Program Scholarship