半监督卷积神经网络的词义消歧

张春祥; 唐利波; 高雪瑶

doi:10.3969/j.issn.0258-2724.20200105

半监督卷积神经网络的词义消歧

doi: 10.3969/j.issn.0258-2724.20200105

哈尔滨理工大学计算机科学与技术学院, 黑龙江哈尔滨 150080

基金项目: 国家自然科学基金(61502124, 60903082)；中国博士后科学基金(2014M560249)；黑龙江省自然科学基金(F2015041, F201420)；黑龙江省普通高校基本科研业务费专项资金资助(LGYC2018JC014)

详细信息

作者简介:
张春祥（1974—），男，教授，博士，研究方向为自然语言处理与计算机图形学，E-mail：z6c6x666@163.com

通讯作者:
高雪瑶（1979—），女，教授，博士，研究方向为计算机图形学与自然语言处理，E-mail：xueyao_gao@163.com

中图分类号: TP391.2
计量
- 文章访问数: 449
- HTML全文浏览量: 254
- PDF下载量: 26
- 被引次数: 0
出版历程
- 收稿日期: 2020-03-05
- 修回日期: 2020-08-09
- 网络出版日期: 2021-11-13
- 刊出日期: 2020-09-15

Word Sense Disambiguation Based on Semi-Supervised Convolutional Neural Networks

School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China

摘要

摘要:
为了解决有标签语料获取困难的问题，提出了一种半监督学习的卷积神经网络(convolutional neural networks, CNN)汉语词义消歧方法. 首先，提取歧义词左右各2个词汇单元的词形、词性和语义类作为消歧特征，利用词向量工具将消歧特征向量化；然后，对有标签语料进行预处理，获取初始化聚类中心和阈值，同时，使用有标签语料对卷积神经网络消歧模型进行训练，利用优化后的卷积神经网络对无标签语料进行语义分类，选取满足阈值条件的高置信度语料添加到训练语料之中，不断重复上述过程，直到训练语料不再扩大为止；最后，使用SemEval-2007：Task#5作为有标签语料，使用哈尔滨工业大学无标注语料作为无标签语料进行实验. 实验结果表明：所提出方法使CNN的消歧准确率提高了3.1%.
- 半监督学习 /
- 卷积神经网络 /
- 词义消歧 /
- 消歧特征 /
- 词向量工具
Abstract:
In order to solve the difficulty of acquiring tagged corpus, a Chinese word sense disambiguation method is proposed on the basis of semi-supervised learning convolutional neural networks (CNN). Firstly, the word, part of speech and semantic category are extracted as discriminative features, which are acquired from 2 word units on the both left and right adjacent to ambiguous word. Word vector tool is used to denote discriminative features as vector. Secondly, tagged corpus is preprocessed to obtain initialized clustering centers and thresholds. At the same time, it is used to train convolutional neural networks. The optimized CNN is applied for determining the semantic categories of ambiguous words in the untagged corpus. Corpus with high confidence that meets threshold conditions is selected into the training corpus. The above process is repeated until the training corpus is no longer expanded. In the last, SemEval-2007: Task#5 is used as the tagged corpus, and the unannotated corpus from Harbin Institute of Technology is used as the untagged corpus. Experimental results show that the proposed method improve disambiguation accuracy of CNN by 3.1%.
- semi-supervised learning /
- convolutional neural networks /
- word sense disambiguation /
- discriminative features /
- word vector tool

HTML全文

图 1 特征提取

Figure 1. Feature extraction

下载: 全尺寸图片幻灯片

图 2 特征矩阵构建过程

Figure 2. Construction process of feature matrix

下载: 全尺寸图片幻灯片

图 3 softmax层

Figure 3. softmax layer

下载: 全尺寸图片幻灯片

图 4 不同阈值和类别数下的平均消歧准确率

Figure 4. Average disambiguation accuracy at different thresholds and category numbers

下载: 全尺寸图片幻灯片

图 5 不同比例和类别数下的平均消歧准确率

Figure 5. Average disambiguation accuracy at different ratios and category numbers

下载: 全尺寸图片幻灯片

表 1 不同阈值的平均消歧准确率

Table 1. Average disambiguation accuracy of different thresholds %

类别数/类	歧义词汇	T = T_max	T = T_min	T = T_med	T = T_avg
2	表面菜单位动摇儿女镜头开通气息气象使眼光	88.3 88.9 77.8 86.7 90.2 50.0 60.5 64.2 93.8 70.2 76.9	82.4 72.2 94.4 80.0 86.3 57.1 58.8 66.5 87.5 72.5 84.6	76.4 88.9 94.4 93.3 92.4 50.0 58.8 68.6 87.5 76.3 76.9	82.4 88.9 94.4 73.3 96.3 50.7 58.8 67.3 93.8 79.6 84.6
3	补成立赶旗帜日子长城	62.0 84.6 66.7 50.0 51.6 68.0	61.9 88.2 66.7 62.5 48.4 68.0	52.4 80.8 55.6 50.0 48.4 68.0	71.4 76.9 77.8 68.8 48.4 80.0
4	吃动叫	56.0 66.7 61.5	56.0 61.1 64.1	56.0 61.1 64.1	56.0 61.1 53.8
平均准确率		70.7	71.0	70.0	73.2

下载: 导出CSV

表 2 不同比率下的平均消歧准确率

Table 2. Average disambiguation accuracy of different rates %

类别数/类	歧义词汇	r = 1	r = 5	r = 10	r = 50	r = 100
2	表面菜单位动摇儿女镜头开通气息气象使眼光	82.4 88.9 94.5 86.7 82.2 64.3 58.8 64.2 93.8 72.5 84.6	82.4 77.8 94.5 80.0 88.9 50.0 58.8 68.6 87.5 72.5 76.9	76.0 83.3 88.9 80.0 84.6 57.1 58.8 66.5 87.5 70.2 76.9	82.4 88.9 83.3 80.0 90.7 50.0 58.8 66.5 87.5 74.9 84.6	88.3 88.9 94.5 86.7 92.4 57.1 58.8 68.0 93.8 75.4 88.0
3	补成立赶旗帜日子长城	71.4 76.9 72.2 60.5 48.4 60.0	61.9 84.6 61.1 56.3 48.4 68.0	52.4 76.9 72.2 56.3 51.6 68.0	61.9 76.9 61.1 62.5 48.4 68.0	71.4 80.8 66.7 62.5 51.6 64.0
4	吃动叫	60.0 55.6 66.7	64.0 72.2 64.1	60.0 67.1 61.5	60.0 50.0 71.8	64.0 66.7 64.0
平均准确率		72.2	70.9	69.8	70.4	74.2

下载: 导出CSV

表 3 3 组实验的平均消歧准确率

Table 3. Average disambiguation accuracy of three groups of experiments %

类别数/类	歧义词汇	DBN	CNN	本文方法
2	表面菜单位动摇儿女镜头开通气息气象使眼光	61.1 55.6 58.8 62.5 70.0 53.3 70.0 71.4 62.5 62.5 71.4	82.3 72.2 82.3 93.7 94.9 53.3 85.0 64.2 87.5 81.2 71.4	82.4 88.9 94.4 73.3 96.3 50.7 58.8 67.3 93.8 79.6 84.6
3	补成立赶旗帜日子长城	50.0 63.3 50.0 55.6 46.9 38.1	64.9 66.6 55.5 72.2 50.0 71.4	71.4 76.9 77.8 68.8 48.4 80.0
4	吃动叫	43.5 50.0 30.0	52.1 50.0 70.0	56.0 61.1 53.8
平均准确率		56.3	71.0	73.2

下载: 导出CSV

参考文献(20)

[1]	LESK M. Automatic sense disambiguation using machine readable dictionaries: how to tell a pine code from an ice cream[C]//The Figth Annual International Conference on Systems Documentation. Toronto: ACM Press, 1986: 24-26
[2]	杨安,李素建,李芸. 基于领域知识和词向量的词义消歧方法[J]. 北京大学学报(自然科学版),2017,53(2): 204-210. YANG An, LI Sujian, LI Yun. Word sense disambiguation based on domain knowledge and word vector model[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2017, 53(2): 204-210.
[3]	FRANCO R L, IVAN L A , PINTO D, et al. Context expansion for domain-specific word sense disambiguation[J]. IEEE Latin America Transactions, 2015, 13(3): 784-789. doi: 10.1109/TLA.2015.7069105
[4]	唐共波,于东,荀恩东. 基于知网义原词向量表示的无监督词义消歧方法[J]. 中文信息学报,2015,29(6): 23-29. doi: 10.3969/j.issn.1003-0077.2015.06.004 TANG Gongbo, YU Dong, XUN Endong. An unsupervised word sense disambiguation method based on sememe vector in HowNet[J]. Journal of Chinese Information Processing, 2015, 29(6): 23-29. doi: 10.3969/j.issn.1003-0077.2015.06.004
[5]	ARAB M, JAHROMI M Z, FAKHRAHMAD S M. A graph-based approach to word sense disambiguation. An unsupervised method based on semantic relatedness[C]//2016 24th Iranian Conference on Electrical Engineering (CEE). Shiraz: IEEE, 2016: 250-255.
[6]	孟禹光,周俏丽,张桂平,等. 引入词性标记的基于语境相似度的词义消歧[J]. 中文信息学报,2018,32(8): 9-18. doi: 10.3969/j.issn.1003-0077.2018.08.003 MENG Yuguang, ZHOU Qiaoli, ZHANG Guiping, et al. Word sense disambiguation based on context simila- rity with POS tagging[J]. Journal of Chinese Information Processing, 2018, 32(8): 9-18. doi: 10.3969/j.issn.1003-0077.2018.08.003
[7]	鹿文鹏,黄河燕,吴昊. 基于领域知识的图模型词义消歧方法[J]. 自动化学报,2014,40(12): 2836-2850. LU Wenpeng, HUANG Heyan, WU Hao. Word sense disambiguation with graph model based on domain knowledge[J]. Acta Automatica Sinica, 2014, 40(12): 2836-2850.
[8]	DUQUE A, STEVENSON M, MARTINEZ-ROMO J, et al. Co-occurrence graphs for word sense disambiguation in the biomedical domain[J]. Artificial Intelligence in Medicine, 2018, 87: 9-19. doi: 10.1016/j.artmed.2018.03.002
[9]	TRIPODI R, PELILLO M. A game-theoretic approach to word sense disambiguation[J]. Computational Linguistics, 2017, 43(1): 31-70. doi: 10.1162/COLI_a_00274
[10]	XU Xueping, YU Jianping, PIAO Xiaoyu. Contribution of governors to word sense disambiguation of English preposition[J]. ICIC Express Letters, 2015, 6(3): 723-730.
[11]	杨陟卓. 基于上下文翻译的有监督词义消歧研究[J]. 计算机科学,2017,44(4): 252-255, 280. doi: 10.11896/j.issn.1002-137X.2017.04.053 YANG Zhizhuo. Supervised WSD method based on context translation[J]. Computer Science, 2017, 44(4): 252-255, 280. doi: 10.11896/j.issn.1002-137X.2017.04.053
[12]	CARDELLINO C, ALONSO ALEMANY L. Exploring the impact of word embeddings for disjoint semisupervised Spanish verb sense disambiguation[J]. Inteligencia Artificial, 2018, 21(61): 67-81. doi: 10.4114/intartif.vol21iss61pp67-81
[13]	HUANG Z H, CHEN Y D, SHI X D. A novel word sense disambiguation algorithm based on semi-supervised statistical learning[J]. International Journal of Applied Mathematics and Statistics, 2013, 43(13): 452-458.
[14]	MAHMOODVAND M, HOURALI M. Semi-supervised approach for Persian word sense disambiguation[C]// 2017 7th International Conference on Computer and Knowledge Engineering (ICCKE). Mashhad: IEEE, 2017: 104-110.
[15]	刘子图,全紫薇,毛如柏,等. NT-EP:一种无拓扑结构的社交消息传播范围预测方法[J]. 计算机研究与发展,2020,57(6): 1312-1322. doi: 10.7544/issn1000-1239.2020.20190584 LIU Zitu, QUAN Ziwei, MAO Rubai, et al. NT-EP:a non-topology method for predicting the scope of social message propogation[J]. Journal of Computer Research and Development, 2020, 57(6): 1312-1322. doi: 10.7544/issn1000-1239.2020.20190584
[16]	刘勇,谢胜男,仲志伟,等. 社会网中基于主题兴趣的影响最大化算法[J]. 计算机研究与发展,2018,55(11): 2406-2418. doi: 10.7544/issn1000-1239.2018.20170672 LIU Yong, XIE Shengnan, ZHONG Zhiwei, et al. Topic-interest based influence maximization algorithm in social networks[J]. Journal of Computer Research and Development, 2018, 55(11): 2406-2418. doi: 10.7544/issn1000-1239.2018.20170672
[17]	薛涛,王雅玲,穆楠. 基于词义消歧的卷积神经网络文本分类模型[J]. 计算机应用研究,2018,35(10): 2898-2903. doi: 10.3969/j.issn.1001-3695.2018.10.004 XUE Tao, WANG Yaling, MU Nan. Convolutional neural network based on word sense disambiguation for text classification[J]. Application Research of Computers, 2018, 35(10): 2898-2903. doi: 10.3969/j.issn.1001-3695.2018.10.004
[18]	PESARANGHADER A, MATWIN S, SOKOLOVA M, et al. DeepBioWSD:effective deep neural word sense disambiguation of biomedical text data[J]. Journal of the American Medical Informatics Association, 2019, 26(5): 438-446. doi: 10.1093/jamia/ocy189
[19]	BORDES A, GLOROT X, WESTON J, et al. A semantic matching energy function for learning with multi-relational data[J]. Machine Learning, 2014, 94(2): 233-259. doi: 10.1007/s10994-013-5363-6
[20]	CHEN S J, HUNG C. Word sense disambiguation based sentiment lexicons for sentiment classification[J]. Knowledge-Based Systems, 2016, 110: 224-232. doi: 10.1016/j.knosys.2016.07.030