Summary: | Latent Semantic Indexing (LSI) is a well-known search technique in which documents are retrieved based on the similarity of their content or meaning. LSI is an effective method for improving retrieval performance; however, as document collections grow larger, a faster technique is needed to process the documents. In this paper, a new parallel LSI algorithm that runs on a standard multi-core personal computer (PC) is proposed to improve the performance of retrieving relevant documents. The parallel LSI algorithm uses parallel threads to perform the matrix computations automatically using the Fork-Join approach. A test collection of 2028 text documents, extracted from four volumes of the Malay-translated book of Hadith known as Shahih Bukhari, was used. We compare the time taken to build the LSI space on the sequential and parallel systems. Retrieval quality is also measured for both systems using the Information Retrieval (IR) metrics of recall, precision, and effectiveness. The results show that the parallel system creates the LSI space faster than the sequential system. In terms of recall, precision, and effectiveness, the proposed parallel LSI system is comparable to the sequential LSI system. © 2016 IEEE.
|