The Science and Information (SAI) Organization

DOI: 10.14569/IJARAI.2016.050105

Bidirectional Extraction of Phrases for Expanding Queries in Academic Paper Retrieval

Author 1: Yuzana Win
Author 2: Tomonari Masada

International Journal of Advanced Research in Artificial Intelligence (IJARAI), Volume 5, Issue 1, 2016.

Abstract: This paper proposes a new method for query expansion based on bidirectional extraction of phrases, as word n-grams, from research paper titles. The method aims to extract information relevant to users’ needs and interests and thus to support technical paper retrieval. Its output is a set of trigrams that can be used as phrases for query expansion. First, word trigrams are extracted from research paper titles. Second, a co-occurrence graph of the extracted trigrams is constructed. Edges are directed in two ways, forward and reverse: in the forward co-occurrence graph, each trigram points to the trigrams that appear after it in a paper title; in the reverse co-occurrence graph, each trigram points to those that appear before it. Third, Jaccard similarity between trigrams is computed and used as the edge weight. Fourth, a weighted version of PageRank is applied. The trigrams with higher PageRank scores yield two types of phrases. Trigrams obtained from the forward co-occurrence graph can form a more specific query when users add one or more technical words before them. Trigrams obtained from the reverse co-occurrence graph can form a more specific query when users add one or more technical words after them. The extracted phrases are evaluated as additional features in a paper title classification task using an SVM. The experimental results show that classification accuracy improves over that achieved with standard TF-IDF text features alone. Moreover, the trigrams extracted by the proposed method can be used to expand query words in research paper retrieval.

Keywords: word n-grams; Jaccard similarity; PageRank; TF-IDF; query expansion; information retrieval; feature extraction
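
The four steps named in the abstract (trigram extraction, directed co-occurrence graph, Jaccard edge weights, weighted PageRank) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, and several details are assumptions the abstract does not state: titles are tokenized by lowercasing and splitting on whitespace, the Jaccard similarity of two trigrams is taken over the sets of titles that contain them, the damping factor is 0.85, and the rank of trigrams without outgoing edges is spread uniformly.

```python
from collections import defaultdict
from itertools import combinations


def word_trigrams(title):
    """Return the word trigrams of a title, in order of appearance."""
    words = title.lower().split()  # assumption: whitespace tokenization
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def build_graph(titles, reverse=False):
    """Build a weighted, directed co-occurrence graph of trigrams.

    Forward: each trigram points to trigrams appearing after it in a title.
    Reverse: each trigram points to trigrams appearing before it.
    Edge weight (assumption): Jaccard similarity of the sets of titles
    that contain the two trigrams.
    """
    occurs_in = defaultdict(set)  # trigram -> ids of titles containing it
    edges = set()
    for title_id, title in enumerate(titles):
        grams = word_trigrams(title)
        for gram in grams:
            occurs_in[gram].add(title_id)
        # combinations() preserves positional order: earlier precedes later.
        for earlier, later in combinations(grams, 2):
            if earlier != later:
                edges.add((later, earlier) if reverse else (earlier, later))
    graph = {gram: {} for gram in occurs_in}
    for src, dst in edges:
        a, b = occurs_in[src], occurs_in[dst]
        graph[src][dst] = len(a & b) / len(a | b)
    return graph


def weighted_pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank where rank flows in proportion to edge weight."""
    if not graph:
        return {}
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    out_weight = {node: sum(graph[node].values()) for node in nodes}
    for _ in range(iterations):
        # Assumption: rank held by nodes without out-edges is shared uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst, weight in graph[src].items():
                new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Toy titles, invented for illustration only.
titles = [
    "a new method for query expansion",
    "a new method for paper retrieval",
    "query expansion for paper retrieval",
]
forward_scores = weighted_pagerank(build_graph(titles))
reverse_scores = weighted_pagerank(build_graph(titles, reverse=True))
top_forward = sorted(forward_scores, key=forward_scores.get, reverse=True)[:3]
top_reverse = sorted(reverse_scores, key=reverse_scores.get, reverse=True)[:3]
```

In this sketch, `top_forward` holds trigrams that many other trigrams lead into, which matches the abstract's description of phrases that users extend by adding words before them; `top_reverse` holds the counterpart for words added after. On a real title collection, a proper tokenizer and stop-word handling would likely be needed.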

Yuzana Win and Tomonari Masada, “Bidirectional Extraction of Phrases for Expanding Queries in Academic Paper Retrieval,” International Journal of Advanced Research in Artificial Intelligence (IJARAI), 5(1), 2016. http://dx.doi.org/10.14569/IJARAI.2016.050105

@article{Win2016,
title = {Bidirectional Extraction of Phrases for Expanding Queries in Academic Paper Retrieval},
journal = {International Journal of Advanced Research in Artificial Intelligence},
doi = {10.14569/IJARAI.2016.050105},
url = {http://dx.doi.org/10.14569/IJARAI.2016.050105},
year = {2016},
publisher = {The Science and Information Organization},
volume = {5},
number = {1},
author = {Yuzana Win and Tomonari Masada}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
