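Main research interests: data mining, natural language processing, and social computing; recent work focuses on corpus construction and applications for the "Belt and Road" initiative. Has led 3 National Natural Science Foundation of China projects, 7 provincial- and ministerial-level projects, and 2 key projects of the provincial Department of Education. Has published more than 90 papers in academic journals including ACM Transactions on Asian and Low-Resource Language Information Processing, Expert Systems with Applications, Pattern Recognition Letters, Information Processing Letters, and Journal of Intelligent & Fuzzy Systems; has also published 1 monograph and 2 textbooks.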
(1) Multi-domain Sentiment Classification on Self-constructed Indonesian Dataset. NLPCC2020.
(2) An Indonesian sentiment classification based on multi-task learning. FSKD2020.
(3) IndoAbbr: A New Benchmark Dataset for Indonesian Abbreviation Identification. IALP2020.
(4) Fake News Detection in the Urdu Language using CharCNN-RoBERTa. FIRE2020.
(5) A Framework for Indonesian Grammar Error Correction. TALLIP. 2021.
(6) A Simple but Effective Method for Indonesian Automatic Text Summarization. Connection Science. 2021.
(7) Towards Corpus and Model: Hierarchical Structure-Attention-based Features for Indonesian Named Entity Recognition. Intelligent Data Analysis. 2021.
(8) Irony Detection in the Portuguese Language using BERT. CEUR. 2021.
(9) Pre-trained Language models for Tagalog with Multi-source data. NLPCC2021.
(10) Pre-trained Models and Evaluation Data for the Khmer Language. Tsinghua Science and Technology. 2021.