Journal of Liaoning Petrochemical University ›› 2026, Vol. 46 ›› Issue (2): 78-87. DOI: 10.12422/j.issn.1672-6952.2026.02.009

• Information and Control Engineering •

Performance Prediction of Catalysts for CO2 Hydrogenation to Methanol Based on Large Language Model and Deep Learning

Qinghui LIU1, Ziyi LI2, Hao YU2, Siyu YANG2

  1. Jieyang Qianzhan Wind Power Co., Ltd., Jieyang, Guangdong 522000, China
  2. School of Chemistry and Chemical Engineering, South China University of Technology, Guangzhou, Guangdong 510640, China
  • Received: 2025-09-26 Revised: 2025-10-23 Published: 2026-04-25 Online: 2026-04-21
  • Contact: Siyu YANG

基于大语言模型与深度学习的CO2加氢制甲醇催化剂性能筛选与预测

刘庆辉1, 李子怡2, 余皓2, 杨思宇2

  1. 揭阳前詹风电有限公司,广东 揭阳 522000
  2. 华南理工大学 化学与化工学院,广东 广州 510640
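  • Corresponding author: Siyu YANG
  • About the author: Qinghui LIU (born 1981), male, master's degree, engaged in research on engineering project management; E-mail: liuqinghui@spic.com.cn
  • Funding:
    Key Program of the National Natural Science Foundation of China (U22A20415); Key Program of the National Natural Science Foundation of China (22278151); Guangdong Basic and Applied Basic Research Foundation (2023A1515012071)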

Abstract:

To address the low efficiency of catalyst development for CO2 hydrogenation to methanol, this study constructs and validates an intelligent performance prediction model based on a large language model (LLM) and deep learning. First, structured prompts are designed for the LLM, enabling semi-automated, efficient extraction of multi-dimensional catalyst data from a large body of literature. Subsequently, a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) is employed to augment the sparse original dataset, mitigating the bottleneck of data scarcity. Following data cleaning, feature engineering, and dimensionality reduction, a hyperparameter-optimized Multi-Layer Perceptron (MLP) is constructed as the prediction model. On an independent test set, the optimized MLP achieves coefficients of determination (R²) of 0.9723 for CO2 conversion and 0.9693 for methanol selectivity. SHAP (SHapley Additive exPlanations) feature analysis indicates that BET (Brunauer-Emmett-Teller) surface area and Cu-based catalysts are the dominant factors affecting catalytic performance, and that In-based catalysts show a distinctive dependence on metal content. This data-driven model, integrating an LLM with WGAN-GP, provides a useful tool for the rapid screening and rational design of novel catalysts and demonstrates the potential of AI in catalysis research.
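The R² figures quoted in the abstract are coefficients of determination. As a minimal sketch of how such a score is computed from test-set predictions, the snippet below implements the standard definition R² = 1 - SS_res / SS_tot. The function name `r2_score` and the sample values are illustrative assumptions, not the authors' code or data.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (not data from the paper).
conversion_true = [10.0, 12.0, 15.0, 20.0]
conversion_pred = [10.5, 11.5, 15.5, 19.5]
print(round(r2_score(conversion_true, conversion_pred), 4))
```

A score of 1 means the predictions match the measurements exactly, while a score of 0 means the model does no better than predicting the mean of the measured values.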

Key words: CO2 hydrogenation, Methanol synthesis, Catalyst performance prediction, Large language model, Machine learning

摘要:

为解决CO2加氢制甲醇催化剂开发效率低的问题,构建并验证一种基于大语言模型 (Large Language Model,LLM)与深度学习的性能智能预测模型。首先,利用LLM设计结构化指令,实现了从大量文献中半自动化、高效率地提取多维催化剂数据;采用带梯度惩罚的Wasserstein生成对抗网络(WGAN⁃GP)对稀疏的原始数据集进行高质量增强,有效克服了数据量不足的瓶颈;在此基础上,经数据清洗、特征工程与降维处理后,采用超参数优化的多层感知机(Multi⁃layer Perceptron,MLP)构建了预测模型。结果表明,优化后的MLP模型在独立测试集上对CO2转化率与甲醇选择性的预测决定系数(R2)分别高达0.972 3和0.969 3。基于SHAP(SHapley Additive exPlanations)的特征分析结果表明,BET(Brunauer⁃Emmett⁃Teller)比表面积和铜基催化剂是影响催化性能的主导因素,且铟(In)基催化剂对金属质量分数具有独特依赖性。整合LLM与WGAN⁃GP的数据驱动模型可为新型催化剂的快速筛选与理性设计提供有力工具,展现了人工智能(Artificial Intelligence,AI)在催化研究中的巨大应用潜力。

关键词: CO2加氢, 甲醇合成, 催化剂性能预测, 大语言模型, 机器学习

Cite this article

Qinghui LIU, Ziyi LI, Hao YU, Siyu YANG. Performance Prediction of Catalysts for CO2 Hydrogenation to Methanol Based on Large Language Model and Deep Learning[J]. Journal of Liaoning Petrochemical University, 2026, 46(2): 78-87.

刘庆辉, 李子怡, 余皓, 杨思宇. 基于大语言模型与深度学习的CO2加氢制甲醇催化剂性能筛选与预测[J]. 辽宁石油化工大学学报, 2026, 46(2): 78-87.