Jingjing Wang
王晶晶
Associate Professor, Natural Language Processing Lab
About
I am an Associate Professor at the School of Computer Science and Technology, Soochow University. I am also a Senior Technical Consultant (Part-time) at Microsoft (Asia), China.
My research interests focus on Multimodal Computing (especially Multimodal Information Extraction, Visual-Language Understanding and Generation, and Embodied Intelligence), Affective Computing, Large Language Models, and AI for Medical Diagnosis.
I received my Ph.D. degree from Soochow University in 2019, advised by Prof. Guodong Zhou and Prof. Shoushan Li. I also collaborate with Prof. Min Zhang on advancing NLP and AI technology.
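Jingjing Wang is an Associate Professor at Soochow University, a Microsoft visiting scholar, and a Ph.D. graduate of the Natural Language Processing (NLP) Lab at Soochow University, and also serves as a Senior Technical Consultant at the Microsoft (Asia) Engineering Institute. Research in the field of artificial intelligence (AI) focuses mainly on Multimodal Computing (especially Multimodal Information Extraction, Visual-Language Understanding and Generation, and Embodied Intelligence), Affective Computing, Large Language Models, and AI for Medical Diagnosis. To date, dozens of AI and NLP papers have been published in CCF-A top conferences and journals such as ACL, AAAI, WWW, MM, IJCAI, and SCIS, together with leading and participating in multiple national projects and holding multiple granted patents. Service roles include Area Chair and PC member for top international AI and NLP conferences such as ACL, AAAI, WWW, MM, and IJCAI, editorial board member of CCCF, and reviewer for major academic journals in China and abroad, including TASLP, TAFFC, SCIS, Science China, and Journal of Software. Current industry collaborators include Microsoft, Alibaba DAMO Academy, and Ant Financial, and students in the group are gladly recommended to these companies for exchanges, internships, and jobs. Motto: "Knowledge is the beginning of action; action is the completion of knowledge."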
Highly motivated students are welcome to join my team. Prospective candidates are welcome to email me their CV and research interests for a detailed discussion. Regarding recommendation letters, please note that I prefer to provide substantive evaluations for candidates with whom I have already had a meaningful collaboration. Sustained engagement allows me to objectively assess your research competencies, scholarly contributions, and professional development.
Updates & News
- Jan. 14, 2026 🔥Two papers by Yujie and Jiamin in my team are accepted by WWW 2026 CCF A
- Nov. 08, 2025 🔥One paper (oral) by Jiawen in my team is accepted by AAAI 2026 CCF A
- Jul. 05, 2025 🚀Two papers by Junxiao and Jiamin in my team are accepted by ACM MM 2025 CCF A
- Jun. 21, 2025 🐂Four of my students (Yu Tan, Jianing, Yiding & Jipeng) received the Outstanding Graduating Postgraduate award
- Jun. 01, 2025 🔥Our paper "Boosting LLM's.." is accepted by TASLP 2025 JCR Q1
- May 15, 2025 🔥"Table-Critic: A Multi-Agent Framework.." is accepted by ACL 2025 main conference CCF A
- Jan. 21, 2025 🚀Two papers ("Sherlock" & "Omni-SILA") are accepted by WWW 2025 CCF A
- Dec. 28, 2024 🐂Two students (Yu Tan & Jianing Zhao) in my team were awarded the National Scholarship
- Dec. 10, 2024 🔥"SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing" is accepted by AAAI 2025 CCF A
- Dec. 01, 2024 🔥"TACL: A Trusted Action-enhanced Curriculum Learning Approach" is accepted by Neurocomputing JCR Q1
- Jul. 16, 2024 🚀Two papers ("Emotion-enriched Text-to-Motion Generation" & "Hawkeye") are accepted by ACM MM 2024 CCF A
- Feb. 20, 2024 🚀Three papers ("ChatASU" etc.) are accepted by COLING 2024 CCF B
Selected Research
Multi-Agent System for Structured Information Extraction and Generation
This research focuses on advancing Text-to-SQL parsing by addressing the critical challenge of semantic accuracy in LLM-generated SQL queries. While fine-tuned large language models excel at producing syntactically valid SQL, they often struggle with semantic consistency, leading to unreliable results in real-world applications.
To tackle this issue, we introduce SQLFixAgent (Cen et al., AAAI'2025), a novel multi-agent collaborative framework designed to detect and repair erroneous SQL queries. SQLFixAgent integrates three specialized agents:
1) SQLReviewer: Identifies semantic mismatches using rubber duck debugging principles.
2) QueryCrafter: Generates diverse candidate SQL repairs by perturbing user queries.
3) SQLRefiner: Selects the optimal repair through retrieval-augmented reflection and failure memory.
This framework achieves a 3% improvement in execution accuracy on the challenging BIRD benchmark while maintaining high token efficiency, making it practical for deployment. Beyond this, we investigate robust Text-to-SQL parsing across diverse scenarios, including domain knowledge integration (Spider-DK) and synonym robustness (Spider-Syn). Our work also explores the synergy between fine-tuned and foundation LLMs, demonstrating how agent collaboration can compensate for individual model limitations.
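As a rough illustration of the review, craft, and refine loop described above, here is a toy Python sketch. It is not the SQLFixAgent implementation: no LLM is involved, and the helper names (`run`, `review`, `repair`), the caller-supplied `looks_right` check, the hand-written candidate list, and the `papers` table are all hypothetical stand-ins chosen only to make the control flow concrete.

```python
import sqlite3
from typing import Callable, List, Optional, Set

Rows = List[tuple]


def run(conn: sqlite3.Connection, sql: str) -> Optional[Rows]:
    """Execute sql and return its rows, or None if SQLite rejects the query."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None


def review(conn: sqlite3.Connection, sql: str,
           looks_right: Callable[[Rows], bool]) -> bool:
    """Reviewer stand-in: a query passes if it runs and its rows satisfy the check."""
    rows = run(conn, sql)
    return rows is not None and looks_right(rows)


def repair(conn: sqlite3.Connection, sql: str, candidates: List[str],
           looks_right: Callable[[Rows], bool], failure_memory: Set[str]) -> str:
    """Return sql if it passes review; otherwise the first passing candidate
    that is not already recorded as a failure. Falls back to sql if none pass."""
    if review(conn, sql, looks_right):
        return sql
    failure_memory.add(sql)
    for cand in candidates:  # crafter stand-in: candidates come from the caller
        if cand in failure_memory:
            continue  # skip repairs already known to fail
        if review(conn, cand, looks_right):
            return cand
        failure_memory.add(cand)
    return sql


# Toy database (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (title TEXT, venue TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO papers VALUES (?, ?, ?)",
    [("A", "ACL", 2025), ("B", "AAAI", 2025), ("C", "ACL", 2020)],
)

# Question: "How many ACL papers are there?"
bad_sql = "SELECT COUNT(*) FROM papers WHERE venue = 'acl'"  # valid SQL, wrong literal
candidates = [
    "SELECT COUNT(*) FROM paper WHERE venue = 'ACL'",   # wrong table name
    "SELECT COUNT(*) FROM papers WHERE venue = 'ACL'",  # intended query
]
memory: Set[str] = set()
fixed = repair(conn, bad_sql, candidates, lambda rows: rows[0][0] > 0, memory)
print(fixed)
```

In this sketch the original query executes without error but fails the check, which mirrors the point that syntactically valid SQL can still be semantically wrong. The failure memory here is just a set of rejected query strings.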
In addition, we propose Table-Critic (Yu et al., ACL'2025), a novel multi-agent framework designed to enhance structured table reasoning through collaborative error detection and refinement. While large language models (LLMs) excel in many reasoning tasks, they often struggle with maintaining consistency in multi-step table-based reasoning, leading to cascading errors. Table-Critic demonstrates how structured collaboration among LLM-based agents can overcome inherent limitations in complex reasoning tasks, offering a scalable and interpretable approach for real-world applications in data analysis and decision support.
Multimodal Foundation Model for Video Understanding and Grounding
The goal is to establish a unified framework for video anomaly detection, advancing precision in identifying and localizing abnormal events across dynamic scenes while enabling interpretable analysis of complex visual patterns.
Starting from real-world applications in surveillance and social media analysis, we introduce Hawkeye (Zhao et al., ACM MM'24), the first scene-enhanced video-language model designed for anomaly detection. Hawkeye integrates multimodal context (visual-textual-temporal cues) to recognize subtle anomalies and pinpoint their temporal boundaries in untrimmed videos. This work lays a critical foundation for event typing and spatiotemporal localization in short video understanding.
Building on this, we investigate low-resource scenarios where annotated anomaly data is scarce. Our Continual Attention Modeling method (Zhang et al., JOS'23) enhances adaptability by capturing long-range dependencies in sparse anomaly signals. Further, we extend Hawkeye with self-supervised learning to uncover latent patterns across unlabeled videos, improving generalization to unseen anomaly types. To scale solutions, we construct a benchmark suite combining large-scale anomaly annotations and instruction-tuned datasets. This addresses the challenge of diverse event types (e.g., accidents, unusual behaviors) and supports downstream tasks like explainable reasoning.
Publications
-
CCF A
Table-Critic: A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning.
ACL 2025, Main Conference (CCF A, Corresponding Author)
-
CCF A
SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration.
AAAI 2025 (CCF A, Corresponding Author)
-
CCF A
Skynet-V1: Towards Early Warning of Video Abnormal Events via A Spatial-temporal Causal-enhanced MoE Framework.
ACM MM 2025 (CCF A, Corresponding Author)
-
CCF A
SOmniDoctor: Towards LLM-centric Lifelong Learning for New Emerging Medical VQA Tasks.
ACM MM 2025 (CCF A, Corresponding Author)
-
CCF A
Sherlock: Towards Multi-scene Video Abnormal Event Extraction and Localization via a Global-local Spatial-sensitive LLM.
WWW 2025 (CCF A, Corresponding Author)
-
CCF C
CPO-SQL: Boosting Small LLMs for Text-to-SQL via Efficient In-Context Learning and Preference Optimization.
NLPCC 2025 (CCF C, Corresponding Author)
-
CCF A
Omni-SILA: Towards Omni-scene Driven Visual Sentiment Identifying, Locating and Attributing in Videos.
WWW 2025 (CCF A, Corresponding Author)
-
CCF A
Multi-modal Reliability-aware Affective Computing. Ruan Jian Xue Bao/Journal of Software (JOS), 2025, 36(2):537-553.
Journal of Software (JOS) (CCF A in Chinese Journal, EI, Corresponding Author)
-
JCR Q1
Boosting LLM's Continual Sentiment Understanding for Low-resource Languages.
IEEE/ACM TASLP (SCI, JCR Q1, Corresponding Author)
-
JCR Q1
TACL: A Trusted Action-enhanced Curriculum Learning Approach to Multimodal Affective Computing. Neurocomputing, 2025, 620:129195.
Neurocomputing (SCI, JCR Q1, Corresponding Author)
-
CCF A
Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanced Video Large Language Model.
ACM MM 2024, 592-601 (CCF A, Corresponding Author)
-
CCF A
Towards Emotion-enriched Text-to-Motion Generation via LLM-guided Limb-level Emotion Manipulating.
ACM MM 2024, 612-621 (CCF A, Corresponding Author)
-
CCF A
Continual Attention Modeling for Successive Sentiment Analysis in Low-resource Scenarios. Journal of Software (JOS), 2024, 35(12):5470-5486.
Journal of Software (JOS) (CCF A in Chinese Journal, EI, Corresponding Author)
-
CCF B
ChatASU: Evoking LLM's Reflection to Truly Understand Aspect Sentiment in Dialogues.
COLING 2024, 3075-3085 (CCF B, Corresponding Author)
-
CCF B
How to Understand 'Support'? An Implicit-enhanced Causal Inference Approach for Weakly-supervised Phrase Grounding.
COLING 2024 (CCF B, Corresponding Author)
-
CCF B
TopicDiff: A Topic-enriched Diffusion Approach for Multimodal Conversational Emotion Detection.
COLING 2024 (CCF B, Corresponding Author)
-
CCF C
Topic-Enriched Variational Transformer for Conversational Emotion Detection.
NLPCC 2024, 3-15 (CCF C, Corresponding Author)
-
CCF A
LLM-Grounded Conversation Aspect Sentiment Understanding via Multi-Agent Consistency Reflection.
Journal of Software (JOS) (CCF A in Chinese Journal, EI, Corresponding Author)
-
CCF A
Implicit-enhanced Causal Modeling for Phrasal Visual Grounding.
Journal of Software (JOS) (CCF A in Chinese Journal, Corresponding Author)
-
CCF C
Fine-Grained Question-Answer Matching via Sentence-Aware Contrastive Self-supervised Transfer.
NLPCC 2023, 616-628 (CCF C)
-
CCF B
Cross-modal Speech Sentiment Classification Based on Knowledge Distillation.
Journal of Chinese Information Processing (CCF B in Chinese Journal, Corresponding Author)
-
CCF A
Cognition-driven multimodal personality classification. Science China Information Sciences (Sci China Inf Sci), 2022, 65(10).
Science China Information Sciences (SCIS) (CCF A, JCR Q1, Corresponding Author)
-
CCF B
Patent Matching with Multi-View Attentive Network.
Journal of Chinese Information Processing, 2022, 36(7) (CCF B in Chinese Journal, Corresponding Author)
-
CCF B
Multi-modal Emotion Recognition Based on Multi-LSTMs Fusion.
Journal of Chinese Information Processing, 2022, 36(5) (CCF B in Chinese Journal, Corresponding Author)
-
CCF B
Cognition-Driven Real-Time Personality Detection via Language-Guided Contrastive Visual Attention.
ICME 2021, 1-6 (CCF B, Corresponding Author)
-
CCF B
Multimodal Hierarchical Dynamic Routing for Depression Detection.
Journal of Chinese Information Processing (CCF B in Chinese Journal, Corresponding Author)
-
CCF B
Cross-lingual Aspect Sentiment Classification Based on Multi-channel BERT.
Journal of Chinese Information Processing (CCF B in Chinese Journal, Corresponding Author)
-
CCF A
Aspect Sentiment Classification with Document-level Sentiment Preference Modeling.
ACL 2020, 3667-3677 (CCF A, Corresponding Author)
-
CCF A
Sentiment classification in customer service dialogue with topic-aware multi-task learning.
AAAI 2020, 9177-9184 (CCF A, Corresponding Author)
-
CCF B
Multimodal topic-enriched auxiliary learning for depression detection.
COLING 2020, 1078-1089 (CCF B, Corresponding Author)
-
CCF B
Human-Like Decision Making: Document-level Aspect Sentiment Classification via Hierarchical Reinforcement Learning.
EMNLP 2019, 5585-5594 (CCF B)
-
CCF A
Aspect Sentiment Classification Towards Question-Answering with Reinforced Bidirectional Attention Network.
ACL 2019 (CCF A)
-
Multi-label Aspect Classification on Question-Answering Text with Contextualized Attention-Based Neural Network.
China Conference on Chinese Language Processing 2019, 479-491 (EI, Corresponding Author)
-
CCF A
Aspect Sentiment Classification with both Word-level and Clause-level Attention Networks.
IJCAI 2018, 4439-4445 (CCF A)
-
CCF B
Cross-media User Profiling with Joint Textual and Social User Embedding.
COLING 2018, 246-251 (CCF B)
-
CCF B
Sentiment Classification towards Question-Answering with Hierarchical Matching Network.
EMNLP 2018, 3654-3663 (CCF B)
-
CCF C
Semi-supervised Sentiment Classification Based on Auxiliary Task Learning.
NLPCC 2018 (CCF C)
-
Question-Answering Aspect Classification with Hierarchical Attention Network.
China Conference on Chinese Language Processing 2018, 225-237 (EI, Corresponding Author)
-
Question-Answering Aspect Classification with Multi-attention Representation.
China Conference on Information Retrieval 2018, 78-89 (EI)
-
CCF A
Joint Learning on Relevant User Attributes in Micro-blog.
IJCAI 2017, 4130-4136 (CCF A)
-
CCF B
Semi-supervised Question Classification with Jointly Learning Question and Answer Representations.
Journal of Chinese Information Processing, vol. 31(1), 2017 (CCF B in Chinese Journal)
-
CCF A
User age prediction by combining classification and regression.
Scientia Sinica Informationis (Sci Sin Inform), 2017, 47: 1095-1108 (CCF A in Chinese Journal)
-
CCF B
Leveraging Interactive Knowledge and Unlabeled Data in Gender Classification with Co-training.
DASFAA 2015, 246-251 (CCF B)
-
CCF A
Interactive Gender Inference with Integer Linear Programming.
IJCAI 2015, 2341-2347 (CCF A)
-
CCF A
Semi-Stacking for Semi-supervised Sentiment Classification.
ACL 2015, 27-31 (CCF A)
-
CCF B
User-Type Classification in Micro-Blog Based on Information of Authenticated User.
Journal of Frontiers of Computer Science and Technology, vol. 9(6), 2015 (CCF B in Chinese Journal)
-
CCF B
User Gender Classification in Chinese Microblog.
Journal of Chinese Information Processing, vol. 28(6), 2014 (CCF B in Chinese Journal)
-
CCF B
Interactive Gender Inference in Social Media.
DASFAA 2015, 252-258 (CCF B)
Awards & Honors
Academic Services
Technical Program Committee
- ACL: Annual Meeting of the Association for Computational Linguistics, Area Chair
- EMNLP: Conference on Empirical Methods in Natural Language Processing, Area Chair
- AAAI: Association for the Advancement of Artificial Intelligence, PC
- IJCAI: International Joint Conference on Artificial Intelligence, PC
- WWW, ACM MM: Reviewer
Journal Reviewer
- IEEE/ACM TASLP
- ACM TALLIP
- Science China Information Sciences
- Science China
- Acta Automatica Sinica
- Journal of Chinese Information Processing
Academic Presentations and Exchanges
- 2016-2021: Academic reports and exchanges at top conferences including ACL, AAAI, IJCAI
- 2019: Academic report and exchange at Zhejiang Tailong Commercial Bank, Suzhou Industrial Park Headquarters
- 2019: Invited talk at Ecovacs, Suzhou
- 2022: Academic report and exchange at Alibaba Ant Financial
- 2023: Academic report and exchange at the establishment of NLPAI-SCHOOL, Microsoft Asia Engineering Institute, Suzhou
Research Grants
As Principal Investigator
-
Key Technologies for Multimodal Implicit Sentiment Understanding and Editing Generation in Harmful Short-Video Content Governance
No. 62576234 · 500K RMB · 2026.01–2029.12
NSFC General Program
-
Key Technology Research on Attribute-level Sentiment Analysis for Conversational Texts
No. 62006166 · 240K RMB · 2021.01–2023.12
NSFC Young Scientist Fund Project
-
Research on Chinese Single-document Automatic Summarization Based on Discourse Structure Analysis
No. 61976146 · 560K RMB · 2020.01–2023.12
NSFC General Program
-
Resource Construction and Key Technology Research on Sentiment Information Extraction from Question-answer Texts
No. 2019M661930 · 80K RMB · 2020.01–2022.12
China Postdoctoral Science Foundation (CPSF)
As Co-investigator
-
Scene-based Knowledge Graph for Language Understanding and Generation
Sub-project No. 2020AAA0108604 · 6,650K RMB · 2020.11–2023.10
National Key Research and Development Program