cv
Basics
Name | Shangyu Xing |
Affiliation | School of Artificial Intelligence, Nanjing University |
Email | xsy@smail.nju.edu.cn |
Github | https://github.com/starreeze |
Google scholar | https://scholar.google.com/citations?user=u5pqxu0AAAAJ |
Education
Publications
-
2025 RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios
Submitted to EMNLP 2025
Construct a new multi-image benchmark for multimodal LLMs from real user inputs on social platforms.
-
2025 Maximizing the Effectiveness of Larger BERT Models for Compression
ACL 2025
Propose a new knowledge distillation (KD) method for BERT models that maximizes the linear difference between selected layers, enabling better performance with larger teachers.
-
2025 GePBench: Evaluating Fundamental Geometric Perception for Multimodal Large Language Models
Submitted to NeurIPS 2025
Propose a novel large-scale multimodal benchmark on geometric perception, identifying the limitations of current models and exploring the impact of geometric perception on high-level tasks.
-
2025 AnyPrefer: An Automatic Framework for Preference Data Synthesis
ICLR 2025
Design a framework that can automatically generate preference data for multiple scenarios without requiring manual annotation.
-
2024 AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability
Submitted to EMNLP 2025
Create distinct alignment vectors for differently aligned text-image pairs during pretraining, and allocate them to various subtasks in finetuning and inference.
-
2024 EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models
EMNLP 2024
Leverage external expert knowledge to reinforce the alignment between language and vision, thereby reducing multimodal hallucinations with no manually annotated data and minimal computational resources.
-
2023 DRIN: Dynamic Relation Interactive Network for Multimodal Entity Linking
ACM MM 2023
Explicitly model four types of alignment between multimodal mentions and entities, and use a dynamic Graph Convolutional Network to automatically select appropriate alignment relations for different input samples.
-
2022 Learning from Different text-image Pairs: A Relation-enhanced Graph Convolutional Network for Multimodal NER
ACM MM 2022
Utilize Graph Neural Networks to capture external matching relationships across different text-image pairs.
Awards
- 2020
National Scholarship
Ministry of Education of the People's Republic of China
- 2021
Tencent Scholarship
Tencent
- 2022
Outstanding Student Model of Nanjing University
Nanjing University
- 2023
Outstanding Graduate of Nanjing University
Nanjing University
- 2024
Tencent Scholarship
Tencent
Work
-
2024.09 - 2025.02 Teaching Assistant: Compilers - Principles, Techniques, and Tools
School of Artificial Intelligence, Nanjing University
Assisted in lectures, graded assignments and final exams.
-
2024.06 - 2025.03 Research Intern supervised by Huaxiu Yao (Online)
University of North Carolina, Chapel Hill
Implemented a data synthesis algorithm with feedback on the target vision-language model (AnyPrefer).
-
2023.07 - 2023.09 Internship: Software R&D Engineer
INFLY Tech (Shanghai) Co., Ltd.
Implemented the preference alignment algorithms RLHF/PPO and their variants DPO and RRHF; trained a BLOOM model with billion-level parameters using the DeepSpeed and Megatron-LM frameworks, experimenting with different algorithms.
-
2022.07 - 2022.09 Internship: Software R&D Engineer
Huawei Technologies Co., Ltd.
Integrated the open-source SOTA model Tacotron2 with a proprietary model optimized for handling the pitch and rhythm of spoken Chinese.
Languages
Chinese | Native speaker |
English | Fluent (TOEFL 105) |