• ISSN 0258-2724
  • CN 51-1277/U
  • EI Compendex
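  • Indexed in Scopus
  • Chinese Core Journal
  • Source journal for Chinese Scientific and Technical Papers and Citations
  • Source journal of the Chinese Science Citation Database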

Event Causality Identification Based on Large Language Model-Constructed Graph Networks

PAN Lei, YUAN Hongxiao, ZHONG Zhun, LIAO Hongzhou, YANG Ruijia

Citation: PAN Lei, YUAN Hongxiao, ZHONG Zhun, LIAO Hongzhou, YANG Ruijia. Event Causality Identification Based on Large Language Model-Constructed Graph Networks[J]. Journal of Southwest Jiaotong University. doi: 10.3969/j.issn.0258-2724.20240484

doi: 10.3969/j.issn.0258-2724.20240484
Funding: Natural Science Foundation of Sichuan Province (2024NSFSC0508)
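Details
    Author biography:

    PAN Lei (born 1986), male, senior engineer, Ph.D.; research interest: intelligent processing of multimodal information. E-mail: mapan.lei@163.com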

  • CLC number: TP391.1; TP18

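  • Abstract:

    To improve the accuracy of document-level event causality identification, the implicit knowledge contained in a large language model (LLM) is first used to select, from the document, the other events related to the target events, forming a candidate event set. Second, the candidate event set is assembled into a fully connected event relation graph, which is then reduced under conditional constraints to a constrained event relation graph, limiting the noise propagated from irrelevant events. Finally, a self-attention mechanism computes the influence of each node in the graph network on every other node, and the model is trained with a binary classification loss that incorporates focal loss, which further mitigates false positives in causality identification. The results show that, on the event causality identification task, the model reaches a precision of 77.3% for intra-sentence event pairs on the Causal-TimeBank dataset. On the EventStoryLine dataset, it reaches a precision of 75.2% for intra-sentence event pairs, an F1-score of 60.6% for inter-sentence event pairs, and an F1-score of 59.6% for document-level event pairs.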
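
    The last two steps of the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the attention below uses a single head with no learned projections, the 0/1 adjacency matrix merely stands in for the constrained event relation graph, and the focal loss uses the standard binary form with assumed defaults `alpha = 0.25` and `gamma = 2.0`, which this page does not state. The function names are hypothetical.

```python
import math


def masked_self_attention(features, adjacency):
    """Dot-product self-attention restricted to the edges of a graph.

    features:  list of n node vectors (lists of floats).
    adjacency: n x n matrix of 0/1 flags; adjacency[i][j] == 1 means
               node i may attend to node j (assumed constrained graph).
    Queries, keys and values are all the raw features (no projections).
    """
    n = len(features)
    dim = len(features[0])
    scale = math.sqrt(dim)
    output = []
    for i in range(n):
        # Node i attends to itself and to its neighbours only.
        allowed = [j for j in range(n) if j == i or adjacency[i][j]]
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / scale
            for j in allowed
        ]
        # Numerically stable softmax over the allowed nodes.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        output.append([
            sum(w / total * features[j][d] for w, j in zip(weights, allowed))
            for d in range(dim)
        ])
    return output


def binary_focal_loss(prob, label, alpha=0.25, gamma=2.0, eps=1e-7):
    """Standard binary focal loss for one predicted probability.

    prob:  predicted probability that the event pair is causal.
    label: 1 for a causal pair, 0 otherwise.
    alpha and gamma are assumed defaults, not values from the paper.
    """
    prob = min(max(prob, eps), 1.0 - eps)
    p_t = prob if label == 1 else 1.0 - prob
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


nodes = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
graph = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # node 2 is isolated
updated = masked_self_attention(nodes, graph)
confident_false_positive = binary_focal_loss(0.9, 0)
```

    In this sketch the isolated node keeps its own features, because it can only attend to itself, and a confident false positive receives a much larger loss than a confident true negative, which is the behaviour the abstract relies on to reduce false positives.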

     

  • Figure 1.  Excerpt from dataset document

    Figure 2.  Structure of LLM4GL model

    Figure 3.  Excerpt from dataset document

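    Table  1.   Procedure of the LLM-based event graph selection module

    Module: LLM-based event graph selection module
    Input: document $ D $, prompt Prompt, Llama
    Output: selected event set $ {E}_{1} $
    1 Obtain the initial sentence set $ {S}_{0} $ from document $ D $;
    2 Remove the sentences of $ {S}_{0} $ that contain no event to obtain the new sentence set $ S $;
    3 Extract all events in the document, $ {E}_{0} $, from $ S $;
    4 For $ {e}_{t} $, $ {e}_{n} $ in $ {E}_{0} $:
    5   Obtain $ {s}_{t} $ from $ {e}_{t} $ and $ {s}_{n} $ from $ {e}_{n} $;
    6   Feed ($ {s}_{t} $ + $ {s}_{n} $ + $ S $ + Prompt) into the Llama model;
    7   The Llama model generates the event-related sentences;
    8   Obtain the related events from the event-related sentences;
    9   Combine ($ {e}_{t} $ + $ {e}_{n} $ + related events) into $ {E}_{1} $;
    10 End;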
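
    The procedure in Table 1 can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the LLM call is replaced by a caller-supplied function `generate_related` that is assumed to return the indices of the event-related sentences, event pairs are assumed to be unordered, and one candidate set is kept per pair. All names (`select_event_sets`, `events_by_sentence`, `generate_related`) are hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, List, Set, Tuple


def select_event_sets(
    sentences: List[str],
    events_by_sentence: Dict[int, List[str]],
    prompt: str,
    generate_related: Callable[[str], List[int]],
) -> Dict[Tuple[str, str], Set[str]]:
    """Build one candidate event set per event pair (steps 1 to 10)."""
    # Step 2: keep only the sentences that contain at least one event.
    kept = [i for i in range(len(sentences)) if events_by_sentence.get(i)]
    # Step 3: collect every event together with its sentence index.
    all_events = [(e, i) for i in kept for e in events_by_sentence[i]]
    context = "\n".join(f"[{i}] {sentences[i]}" for i in kept)

    selected: Dict[Tuple[str, str], Set[str]] = {}
    # Step 4: loop over event pairs (unordered pairs are an assumption).
    for (e_t, i_t), (e_n, i_n) in combinations(all_events, 2):
        # Steps 5 and 6: both event sentences, all kept sentences, prompt.
        llm_input = "\n".join([sentences[i_t], sentences[i_n], context, prompt])
        # Step 7: the model names the event-related sentences.
        related_ids = generate_related(llm_input)
        # Step 8: map the related sentences back to their events.
        related = {
            e for i in related_ids for e in events_by_sentence.get(i, [])
        }
        # Step 9: the pair plus its related events forms the candidate set.
        selected[(e_t, e_n)] = {e_t, e_n} | related
    return selected


def fake_llm(_text: str) -> List[int]:
    """Stand-in for the Llama call; always points at sentence 2."""
    return [2]


doc = ["A gunman shot a man.", "The weather was mild.", "Police reported the attack."]
doc_events = {0: ["shot"], 2: ["attack", "reported"]}
candidate_sets = select_event_sets(doc, doc_events, "Which sentences are related?", fake_llm)
```

    Passing the model as a plain callable keeps the sketch independent of any particular LLM library; a real system would parse the generated text into sentence indices before step 8.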

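    Table  2.   Predicted and true value confusion matrix

    Prediction         Positive sample                               Negative sample
    Predicted true     True positive ($ {N}_{\mathrm{TP}} $)     False positive ($ {N}_{\mathrm{TN}} $)
    Predicted false    False negative ($ {N}_{\mathrm{FP}} $)    True negative ($ {N}_{\mathrm{FN}} $)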
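
    The P, R and F1 columns in the result tables are conventionally derived from the counts of such a confusion matrix. The sketch below uses the standard definitions of precision, recall and F1; the function name and the example counts are illustrative and are not taken from the paper.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) from confusion-matrix counts.

    Empty denominators yield 0.0 instead of raising ZeroDivisionError.
    """
    predicted = true_pos + false_pos   # pairs predicted as causal
    actual = true_pos + false_neg      # pairs that are truly causal
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts only: 6 true positives, 2 false positives, 4 false negatives.
p, r, f1 = precision_recall_f1(6, 2, 4)
```

    Fewer false positives raise precision directly, which is why a loss that penalises confident false positives tends to trade recall for precision.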

    Table  3.   Hyperparameter settings of model

    Hyperparameter     Value
    Learning rate      $ 2 \times 10^{-5} $
    Batch size         1
    Training epochs    20
    Dropout rate       0.1
    Weight decay       0.01
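
    For reproduction, the settings in Table 3 could be gathered into a single configuration object. The sketch below is one possible layout; the class and field names are hypothetical, and the optimizer itself is not specified on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters from Table 3; field names are illustrative."""
    learning_rate: float = 2e-5
    batch_size: int = 1
    num_epochs: int = 20
    dropout: float = 0.1
    weight_decay: float = 0.01


config = TrainingConfig()
```

    Freezing the dataclass prevents accidental changes to the values during a run.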

    Table  4.   Comparison of sentence-level event causality experiment results

    Model                EventStoryLine           Causal-TimeBank
                         P/%    R/%    F1/%       P/%    R/%    F1/%
    GPT-3.5-turbo[16]    27.6   80.2   41.0       7.0    82.6   12.8
    GPT-4.0[16]          27.2   94.7   42.2       6.1    97.4   11.5
    CHEER[8]             56.9   69.6   62.6       56.4   69.5   62.3
    ERGO[7]              57.5   72.0   63.9       62.1   61.3   61.7
    KADE[10]             62.1   68.8   65.3       67.9   64.6   66.2
    HOTECI[15]           66.1   72.3   69.1       71.1   65.9   68.4
    DFP[22]              55.9   69.8   62.1       63.7   64.2   58.5
    GenSORL[23]          65.6   63.3   64.6       60.1   53.3   56.3
    C3NET[24]            60.5   73.6   66.4       60.2   72.8   65.9
    LLM4GL               75.2   46.3   54.7       77.3   59.2   67.1

    Table  5.   Comparison of inter-sentence event causality experiment results on EventStoryLine dataset

    Model                Inter-sentence           Document-level
                         P/%    R/%    F1/%       P/%    R/%    F1/%
    BERT[25]             30.6   39.1   34.3       37.2   41.2   39.1
    ERGO[7]              51.6   43.3   47.1       48.6   53.4   51.4
    GPT-3.5-turbo[16]    15.2   61.8   24.4       20.7   67.5   31.7
    GPT-4.0[16]          16.9   64.7   26.8       20.9   73.4   32.5
    CHEER[8]             45.2   52.1   48.4       49.7   53.3   51.4
    KADE[10]             39.2   45.7   42.2       42.6   51.3   46.6
    HOTECI[15]           81.4   40.6   55.1       63.1   51.2   56.5
    LLM4GL               53.4   70.9   60.6       59.1   60.7   59.6

    Table  6.   Comparison of ablation experiment results on EventStoryLine dataset

    Model Longformer encoder LLM Option RECG P/% R/% F1/%
    LLM4GL1 × × 33.3 63.0 39.4
    LLM4GL2 × 40.9 45.7 44.1
    LLM4GL3 × 40.1 45.8 43.2
    LLM4GL4 × 47.8 57.2 52.1
    LLM4GL 59.1 60.7 59.6

    Table  7.   Comparison of qualitative experiment results on EventStoryLine dataset

    Event pair GT BERT LLM4GL
    (shot,shooting)
    (shielding,confessed)
    (shot,shielding)
    (attack,reports)
    (shooting,attack)
    (shielding,attack)
    (shooting,confessed)
    $\vdots $ $\vdots $ $\vdots $ $\vdots $

    Table  8.   Top k experiment results on EventStoryLine dataset

    k    P/%    R/%    F1/%
    0    43.4   55.2   48.6
    2    56.2   58.6   56.1
    3    59.1   60.7   59.6
    4    57.8   57.3   57.4
  • [1] Yang Junhui, Liu Zongtian, Liu Wei, et al. Identify causality relationships based on semantic event[J]. Journal of Chinese Computer Systems, 2016, 37(3): 433-437.
    [2] Zhu Qing, Li Maosu, Ding Yulin, et al. Multi-level semantic retrieval method for landslide disaster data[J]. Journal of Southwest Jiaotong University, 2020, 55(3): 467-475. doi: 10.3969/j.issn.0258-2724.20180695
    [3] Wang Shuying, Li Xue, Li Rong, et al. Knowledge fusion method of high-speed train based on knowledge graph[J]. Journal of Southwest Jiaotong University, 2024, 59(5): 1194-1203.
    [4] Zhou Xu, Wen Tao, Long Zhiqiang. Anomaly detection of suspension system in maglev train based on missed detection rate[J]. Journal of Southwest Jiaotong University, 2023, 58(4): 903-912.
    [5] Zheng Qiaoduo, Wu Zhendong, Zou Junying. Event causality extraction based on two-layer CNN-BiGRU-CRF[J]. Computer Engineering, 2021, 47(5): 58-64, 72.
    [6] Tran Phu M, Nguyen T H. Graph convolutional networks for event causality identification with rich document-level structures[C]//Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: Association for Computational Linguistics, 2021: 3480-3490.
    [7] Chen M Q, Cao Y X, Deng K Q, et al. ERGO: event relational graph transformer for document-level event causality identification[C]//Proceedings of the 29th International Conference on Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2022: 2118-2128.
    [8] Chen M Q, Cao Y X, Zhang Y, et al. CHEER: centrality-aware high-order event reasoning network for document-level event causality identification[C]// Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: Association for Computational Linguistics, 2023: 10804-10816.
    [9] Liu Y, Jiang X X, Zhao W Z, et al. Dual graph convolutional networks for document-level event causality identification[C]//Web and Big Data. Cham: Springer, 2023: 114-128.
    [10] Wu S F, Zhao R H, Zheng Y F, et al. Identify event causality with knowledge and analogy[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(11): 13745-13753.
    [11] Gao L, Choubey P K, Huang R H. Modeling document-level causal structures for event causal relation identification[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: Association for Computational Linguistics, 2019: 1808-1817.
    [12] Liu J, Chen Y B, Zhao J. Knowledge enhanced event causality identification with mention masking generalizations[C]//Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. Yokohama: International Joint Conferences on Artificial Intelligence Organization, 2020: 3608-3614.
    [13] Zuo X Y, Chen Y B, Liu K, et al. KnowDis: knowledge enhanced data augmentation for event causality detection via distant supervision[C]//Proceedings of the 28th International Conference on Computational Linguistics. Barcelona: International Committee on Computational Linguistics, 2020: 1544-1550.
    [14] Zuo X Y, Cao P F, Chen Y B, et al. Improving event causality identification via self-supervised representation learning on external causal statement[C]//Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Stroudsburg: Association for Computational Linguistics, 2021: 2162-2172.
    [15] Man H, Van Nguyen C, Ngo N T, et al. Hierarchical selection of important context for generative event causality identification with optimal transports[C]//Proceedings of the Language Resources and Evaluation Conference, 2024: 8122-8132.
    [16] Gao J L, Ding X, Qin B, et al. Is ChatGPT a good causal reasoner? a comprehensive evaluation[C]//Findings of the Association for Computational Linguistics: EMNLP 2023. Stroudsburg: Association for Computational Linguistics, 2023: 11111-11126.
    [17] Touvron H, Lavril T, Izacard G, et al. LLaMA: open and efficient foundation language models[PP/OL]. V1. arXiv (2023-02-27)[2024-06-10]. https://doi.org/10.48550/arXiv.2302.13971.
    [18] Beltagy I, Peters M E, Cohan A. Longformer: the long-document transformer[PP/OL]. V2. arXiv (2020-12-02)[2024-06-10]. https://doi.org/10.48550/arXiv.2004.05150.
    [19] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6000-6010.
    [20] Caselli T, Vossen P. The event StoryLine corpus: a new benchmark for causal and temporal relation extraction[C]//Proceedings of the Events and Stories in the News Workshop. Stroudsburg: Association for Computational Linguistics, 2017: 77-86.
    [21] Mirza P, Sprugnoli R, Tonelli S, et al. Annotating causality in the TempEval-3 corpus[C]//Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL). Gothenburg, Stroudsburg: Association for Computational Linguistics, 2014: 10-19.
    [22] Huang P X, Zhao X, Hu M H, et al. Distill, fuse, pre-train: towards effective event causality identification with commonsense-aware pre-trained model[C]//Proceedings of the Language Resources and Evaluation Conference, 2024: 5029-5040.
    [23] Chen M L, Yang W Z, Wei F Y, et al. Event causality identification via structure optimization and reinforcement learning[J]. Knowledge-Based Systems, 2024, 284: 111256.
    [24] Gao J L, Ding X, Li Z Y, et al. Event causality identification via competitive-cooperative cognition networks[J]. Knowledge-Based Systems, 2024, 300: 112139.
    [25] Devlin J, Chang M W, Lee K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: Association for Computational Linguistics, 2019: 4171-4186.
Publication history
  • Received:  2024-09-25
  • Revised:  2025-03-28
  • Published online:  2026-06-01
