Visualizing stemming techniques on online news articles text analytics

Stemming is the process of converting words into their root forms by means of a stemming algorithm. It is one of the main steps in text analytics, where text data must go through stemming before further analysis. Text analytics is a very common practice nowadays that is practi...

Bibliographic Details
Published in: Bulletin of Electrical Engineering and Informatics
Main Authors: Razmi N.A.; Zamri M.Z.; Ghazalli S.S.S.; Seman N.
Format: Conference paper
Language: English
Published: Institute of Advanced Engineering and Science 2021
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85092360495&doi=10.11591%2feei.v10i1.2504&partnerID=40&md5=d9661186e450e40c7d53fb1fb3ed5b9f
id 2-s2.0-85092360495
author Razmi N.A.; Zamri M.Z.; Ghazalli S.S.S.; Seman N.
title Visualizing stemming techniques on online news articles text analytics
publishDate 2021
container_title Bulletin of Electrical Engineering and Informatics
container_volume 10
container_issue 1
doi_str_mv 10.11591/eei.v10i1.2504
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-85092360495&doi=10.11591%2feei.v10i1.2504&partnerID=40&md5=d9661186e450e40c7d53fb1fb3ed5b9f
description Stemming is the process of converting words into their root forms by means of a stemming algorithm. It is one of the main steps in text analytics, where text data must go through stemming before further analysis. Text analytics is a very common practice nowadays, used to analyze the contents of text data from various sources such as the mass media and social media. In this study, two stemming techniques, Porter and Lancaster, are evaluated. The differences between the outputs of the two techniques are discussed in terms of the stemming error and the resulting visualization. The findings show that Porter stemming performs better than Lancaster stemming, by 43%, based on the stemming error produced. The stemmed text data can still be visualized, but tool users need some understanding of the background of the text data to interpret the visualization outputs correctly. © 2020, Institute of Advanced Engineering and Science. All rights reserved.
publisher Institute of Advanced Engineering and Science
issn 20893191
language English
format Conference paper
accesstype All Open Access; Gold Open Access
record_format scopus
collection Scopus
_version_ 1809677894744539136
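
The abstract compares two stemmers by the stemming error they produce. The record does not say how that error was measured, so the following is only a minimal sketch of one possible reading: the share of words whose stem differs from an expected root. The two toy suffix strippers (`light_stem`, `aggressive_stem`), the word list and the expected roots are all hypothetical. They are not the Porter or Lancaster algorithms, and the numbers they yield have no relation to the 43% reported in the paper.

```python
from typing import Callable, Dict, Tuple


def strip_suffixes(word: str, suffixes: Tuple[str, ...], min_stem: int) -> str:
    """Remove the first matching suffix, keeping at least min_stem characters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


# Hypothetical toy rule sets (assumptions for illustration only;
# these are NOT the Porter or Lancaster rule tables).
LIGHT_SUFFIXES = ("ing", "ed", "s")
AGGRESSIVE_SUFFIXES = ("ation", "ing", "ed", "er", "al", "s", "e")


def light_stem(word: str) -> str:
    """A conservative stemmer: few rules, longer minimum stem."""
    return strip_suffixes(word.lower(), LIGHT_SUFFIXES, min_stem=3)


def aggressive_stem(word: str) -> str:
    """A more aggressive stemmer: more rules, shorter minimum stem."""
    return strip_suffixes(word.lower(), AGGRESSIVE_SUFFIXES, min_stem=2)


def stemming_error_rate(stem: Callable[[str], str], expected: Dict[str, str]) -> float:
    """Fraction of words whose stem does not equal the expected root."""
    errors = sum(1 for word, root in expected.items() if stem(word) != root)
    return errors / len(expected)


# Hypothetical evaluation set: word -> expected root.
EXPECTED = {
    "reports": "report",
    "reported": "report",
    "reporting": "report",
    "news": "news",
    "general": "general",
    "voter": "voter",
}

if __name__ == "__main__":
    for name, stem in (("light", light_stem), ("aggressive", aggressive_stem)):
        rate = stemming_error_rate(stem, EXPECTED)
        print(f"{name}: error rate = {rate:.2f}")
```

To compare real stemmers in the same way, the two toy functions could be swapped for any callables that map a word to its stem; `stemming_error_rate` itself does not depend on how the stems are produced.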