A Malay Hadith translated document retrieval using parallel Latent Semantic Indexing (LSI)
Latent Semantic Indexing (LSI) is a well-known search technique in which documents are retrieved based on content similarity, that is, on the meaning of the documents. LSI is an effective method for improving retrieval performance; however, as the size of documents gets larger, a better technique i...
Published in: | 2016 3rd International Conference on Information Retrieval and Knowledge Management, CAMP 2016 - Conference Proceedings |
---|---|
Main Author: | |
Format: | Conference paper |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers Inc., 2017 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85015812907&doi=10.1109%2fINFRKM.2016.7806346&partnerID=40&md5=0a60b7dc337b0e95e27c9f012db9b38d |
id |
2-s2.0-85015812907 |
---|---|
author |
Amirah N.N.; Rahim T.M.; Mabni Z.; Hanum H.M.; Rahman N.A. |
publishDate |
2017 |
container_title |
2016 3rd International Conference on Information Retrieval and Knowledge Management, CAMP 2016 - Conference Proceedings |
doi_str_mv |
10.1109/INFRKM.2016.7806346 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85015812907&doi=10.1109%2fINFRKM.2016.7806346&partnerID=40&md5=0a60b7dc337b0e95e27c9f012db9b38d |
description |
Latent Semantic Indexing (LSI) is a well-known search technique in which documents are retrieved based on content similarity, that is, on the meaning of the documents. LSI is an effective method for improving retrieval performance; however, as the size of the document collection grows, a faster technique is needed to process the documents. In this paper, a new parallel LSI algorithm that runs on a standard multi-core personal computer (PC) is proposed to improve the performance of retrieving relevant documents. The parallel LSI algorithm uses parallel threads to perform the matrix computations automatically using the Fork-Join approach. A total of 2028 text documents extracted from four volumes of the Malay-translated book of Hadith known as Shahih Bukhari were used as the test collection. We compare the time needed to build the LSI space on the sequential and parallel systems. Recall, precision, and effectiveness, the standard Information Retrieval (IR) metrics, are also measured as percentages for both systems when retrieving relevant documents. The results show that the parallel system creates the LSI space faster than the sequential system. Based on the recall, precision, and effectiveness measures, the retrieval quality of the proposed parallel LSI system is comparable to that of the sequential LSI system. © 2016 IEEE. |
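The abstract names two ideas that a short sketch may help make concrete: splitting a matrix computation across threads in a fork-join style, and scoring a result list by precision and recall. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names (`multiply_rows`, `parallel_matmul`, `precision_recall`), the row-block split, and the use of `ThreadPoolExecutor` are all hypothetical choices made for this example. The effectiveness measure is left out because the record does not define it. The sketch shows the fork-join structure only; in CPython, threads running pure-Python arithmetic are limited by the global interpreter lock, so it should not be expected to reproduce the reported speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_rows(a_rows, b):
    """Multiply a block of rows of A by the full matrix B (sequential kernel)."""
    b_cols = list(zip(*b))  # columns of B, computed once per block
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a_rows]


def parallel_matmul(a, b, workers=4):
    """Fork: one task per row block of A. Join: concatenate blocks in order."""
    if not a:
        return []
    block = max(1, -(-len(a) // workers))  # ceiling division
    chunks = [a[i:i + block] for i in range(0, len(a), block)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the joined rows line up with A.
        parts = list(pool.map(multiply_rows, chunks, [b] * len(chunks)))
    return [row for part in parts for row in part]


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query as fractions in [0, 1]."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

As a usage example, `parallel_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], workers=2)` gives the same product as a sequential multiply, and `precision_recall([1, 2, 3, 4], [2, 4, 6])` scores a query where two of four retrieved documents are relevant out of three relevant documents overall.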
publisher |
Institute of Electrical and Electronics Engineers Inc. |
issn |
|
language |
English |
format |
Conference paper |
accesstype |
|
record_format |
scopus |
collection |
Scopus |
_version_ |
1809678160643489792 |