• ISSN 0258-2724
  • CN 51-1277/U
  • EI Compendex
  • Scopus
  • Indexed by Core Journals of China, Chinese S&T Journal Citation Reports
  • Chinese Science Citation Database
Volume 28 Issue 3
Jun.  2015
ZHAI Donghai, CUI Jingjing, NIE Hongyu, DU Jia. Topic Link Detection Method Based on Semantic Similarity[J]. Journal of Southwest Jiaotong University, 2015, 28(3): 517-522. doi: 10.3969/j.issn.0258-2724.2015.03.021

Topic Link Detection Method Based on Semantic Similarity

doi: 10.3969/j.issn.0258-2724.2015.03.021
  • Received Date: 30 Jun 2014
  • Publish Date: 25 Jun 2015
  • Abstract: To judge effectively whether any two stories discuss the same topic, a topic link detection method based on semantic similarity is proposed. First, the relative entropy between the feature words in the two stories is calculated and used as their semantic similarity. Next, the relevance between each feature word and the other story is obtained by averaging these semantic similarities. Finally, the relevance degree between the two stories is calculated by combining the TF-IDF (term frequency-inverse document frequency) weights of the feature words in the corpus with the semantic similarity, which completes link detection for the story pair. The proposed algorithm was compared with the VSM (vector space model) method and with average point-wise mutual information. Experimental results on the Chinese corpus of TDT4 show that the minimum DET (detection error tradeoff) cost of the proposed algorithm is reduced by about 3%, which demonstrates that the algorithm exploits context information effectively and improves the performance of the topic link detection system.
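
The three steps described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which distributions the relative entropy is taken over, nor how a divergence is turned into a similarity. Here each feature word is assumed to be represented by a smoothed distribution over the words that co-occur with it inside a small window, the symmetrised relative entropy D is mapped to 1 / (1 + D), and the TF-IDF weights are assumed to be precomputed on the corpus and passed in. All function names and the parameters `window` and `alpha` are hypothetical.

```python
import math
from collections import Counter


def context_distribution(word, tokens, vocab, window=2, alpha=0.1):
    """Smoothed distribution over `vocab` of the words seen near `word`.

    Assumption: a word's "meaning" is approximated by its window co-occurrences.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        counts.update(tokens[j] for j in range(lo, hi) if j != i)
    # Additive smoothing keeps every probability positive, so the log is defined.
    total = sum(counts.values()) + alpha * len(vocab)
    return {v: (counts[v] + alpha) / total for v in vocab}


def relative_entropy(p, q):
    """Relative entropy (KL divergence) D(p || q) over a shared key set."""
    return sum(p[v] * math.log(p[v] / q[v]) for v in p)


def word_similarity(p, q):
    """Assumed mapping: symmetrised relative entropy squashed into (0, 1]."""
    return 1.0 / (1.0 + relative_entropy(p, q) + relative_entropy(q, p))


def directed_relevance(tokens_a, tokens_b, weights_a, vocab):
    """Weighted mean, over words of A, of their average similarity to B's words."""
    dists_b = [context_distribution(u, tokens_b, vocab)
               for u in sorted(set(tokens_b))]
    num = den = 0.0
    for w in sorted(set(tokens_a)):
        p = context_distribution(w, tokens_a, vocab)
        avg = sum(word_similarity(p, q) for q in dists_b) / len(dists_b)
        weight = weights_a.get(w, 1.0)  # TF-IDF weight; 1.0 if none supplied
        num += weight * avg
        den += weight
    return num / den


def story_relevance(tokens_a, tokens_b, weights_a, weights_b):
    """Symmetric relevance degree between two tokenised stories."""
    vocab = set(tokens_a) | set(tokens_b)
    return 0.5 * (directed_relevance(tokens_a, tokens_b, weights_a, vocab)
                  + directed_relevance(tokens_b, tokens_a, weights_b, vocab))
```

In this sketch a story pair would be declared linked when `story_relevance` exceeds a threshold tuned on held-out data. Empty stories are not handled, and the feature words are simply all distinct tokens of each story.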

