Chaoya Jiang is a Research Fellow at the School of Control Science and Engineering, Shandong University. He received his B.S. from the Department of Computer Science and Technology at Nanjing University, his M.S. from the Wangxuan Institute of Computer Technology at Peking University, and his Ph.D. from the Knowledge Computing Lab of the National Engineering Research Center for Software Engineering at Peking University, advised by Prof. Shikun Zhang. He has led several research projects, including a 2023 National Natural Science Foundation of China (NSFC) Basic Research Project for Young Students (Ph.D. track), a 2024 Chinese Institute of Electronics–Tencent Doctoral Research Incentive Program project, and a 2025 Shandong Provincial Key R&D Program project. His honors include the 2023 Beijing Science and Technology Progress Award (First Prize) and the Peking University President's Award. He collaborates closely with leading research institutions such as Peking University, Alibaba's Tongyi Lab, and Tencent. His research focuses on multimodal large language models (MLLMs) and their applications in the new energy domain, specifically: (1) enhancing the reasoning capabilities of general-purpose multimodal large models; (2) agents built on multimodal large models; and (3) developing and applying domain-specific multimodal large models for new energy. As first or co-first author, he has published dozens of papers at top CCF-A conferences and journals in artificial intelligence (NeurIPS, CVPR, ICCV, ACL, AAAI, etc.), and his work is highly regarded by peers and industry.
Recruiting Interns and Research Assistants:
Students interested in artificial intelligence, large models, and AI for new energy, and who have their own ideas, are welcome to apply. I will guide you from the ground up, providing detailed research mentorship and ample compute resources so that we can publish at top conferences and journals together. I believe in discussion among equals and respect everyone's ideas. For outstanding students, I will do my best to recommend you for further study at top universities such as Peking University. I look forward to exploring research with you!
General Multimodal Large Models:
1. R1-style multimodal large models with strong reasoning, including the Thinking-with-Image reasoning paradigm, reinforcement-learning alignment algorithms, and evaluation of reasoning ability in specific scenarios.
2. Complex multimodal agents, including GUI-oriented multimodal agent construction and the design and optimization of agents for controlling wind, solar, storage, and hydrogen new-energy systems.
Multimodal Large Models for New Energy:
1. Intelligent operation-and-maintenance large models for complex equipment
2. Large models for integrated energy management and control
[1] Jiang, C.*, Jia, H.* (equal contribution), Ye, W., Xu, H., Yan, M., et al. (2024). MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model. *Proceedings of the Conference on Neural Information Processing Systems (NeurIPS 2024)*. (CCF-A)
[2] Jia, H.*, Jiang, C.* (equal contribution), Ye, W., et al. (2025). SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization. *Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2025)*. (CCF-A)
[3] Jiang, C., Ye, W., Dong, M., Jia, H., et al. (2024). Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models. *Proceedings of the ACM International Conference on Multimedia (MM 2024)*. (CCF-A)
[4] Jiang, C., Xu, H., Dong, M., Chen, J., Ye, W., et al. (2024). Hallucination Augmented Contrastive Learning for Multimodal Large Language Model. *Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)*. (CCF-A)
[5] Jiang, C., Ye, W., Xu, H., Ye, Q., Yan, M., Zhang, J., & Zhang, S. (2024). TiMix: Text-aware Image Mixing for Effective Vision-Language Pretraining. *Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2024)*. (CCF-A)
[6] Jiang, C., Xu, H., Ye, W., Ye, Q., et al. (2023). BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization. *Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2023)*. (CCF-A)
[7] Jiang, C., Xu, H., Ye, W., Ye, Q., et al. (2023). COPA: Efficient Vision-Language Pre-training through Collaborative Object- and Patch-Text Alignment. *Proceedings of the ACM International Conference on Multimedia (MM 2023)*. (CCF-A)
[8] Jiang, C., Ye, W., Xu, H., Yan, M., et al. (2023). Vision Language Pre-training by Contrastive Learning with Cross-Modal Similarity Regulation. *Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)*. (CCF-A)
[9] Jiang, C., Xie, R., Ye, W., Sun, J., & Zhang, S. (2023). Exploiting Pseudo Image Captions for Multimodal Summarization. *Findings of the Association for Computational Linguistics: ACL 2023*.
[10] Jiang, C., Xu, H., Li, C., Yan, M., et al. (2022). TRIPS: Efficient Vision-and-Language Pre-training with Text-relevant Patch Selection. *Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)*. (CCF-B)
[11] Jiang, C., Yang, D., & Chen, X. (2020). Similarity Learning For Cover Song Identification Using Cross-Similarity Matrices of Multi-Level Deep Sequences. *Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020)*. (CCF-B)
[12] Jiang, C., Yang, D., & Chen, X. (2020). Learn A Robust Representation For Cover Song Identification Via Aggregating Local And Global Music Temporal Context. *Proceedings of the IEEE International Conference on Multimedia and Expo (ICME 2020)*. (CCF-B)