Volume 14 Number 10 (Oct. 2019)
JCP 2019 Vol.14(10): 596-614 ISSN: 1796-203X
doi: 10.17706/jcp.14.10.596-614

A Near Real-Time Approach for Sentiment Analysis Approach Using Arabic Tweets

Anis Zarrad1, Izzat Alsmadi2, Abdulaziz Aljaloud3
1School of Computer Science, University of Birmingham, Dubai, UAE.
2Department of Computing and Cyber Security, University of Texas A&M, San Antonio, TX 77005 USA.
3Prince Sultan University, Riyadh, 11586 Saudi Arabia.
….

Abstract—Big data storage and real-time data analysis are major challenges for IT researchers. The recent massive increase in data has not been matched by adequate storage technology and data processing algorithms. Understanding what people think about an idea, a product, a service, or a policy is important for individuals, companies, and governments. Sentiment analysis can be used to identify opinions expressed in text on a given subject, and the accuracy of its results directly affects decision making in both business and government. Our focus in this paper is first to identify the critical issues associated with real-time big data analysis and then to develop a new paradigm on the Hadoop ecosystem, with real-time stream data processing, to analyze the sentiment of Arabic tweets on Twitter. To perform real-time analytics, data collection is performed using Apache Flume, which moves and aggregates all tweets received online (near real-time) through a channel to sinks that write to pre-defined locations in the Hadoop Distributed File System (HDFS). In addition, because of the serious challenges posed by Arabic text and speech and the high speed at which tweets arrive, we designed a complex sentiment analysis (SA) module that processes each incoming tweet so that no tweet is lost without being analyzed. A sentiment analysis approach for Arabic text was also developed using multiple Hive User-Defined Functions (UDFs). Finally, to guarantee a varied data collection, we propose a Java MapReduce program for lexicon-based Arabic sentiment analysis that supports n-gram search in the lexicon. Our approach was applied to determine opinions about the MERS virus in the Kingdom of Saudi Arabia using the Twitter Public Stream API, and the results are discussed.
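
As an illustration of the lexicon-based n-gram search mentioned in the abstract, the following is a minimal single-machine sketch in Python. It is not the authors' Java MapReduce program or their Hive UDFs: the toy lexicon, the function names `score_tweet` and `label`, the whitespace tokenization, and the longest-match-first strategy are all assumptions made for illustration only.

```python
from typing import Dict

# Hypothetical toy lexicon: n-gram (space-joined tokens) -> polarity score.
# A real Arabic sentiment lexicon would be far larger and would need
# normalization (diacritics, letter variants) that is omitted here.
LEXICON: Dict[str, int] = {
    "جيد": 1,       # "good"
    "سيئ": -1,      # "bad"
    "غير جيد": -1,  # "not good": the bigram takes priority over "good"
}


def score_tweet(text: str, lexicon: Dict[str, int], max_n: int = 3) -> int:
    """Sum lexicon polarities, trying the longest n-gram first at each position."""
    tokens = text.split()  # assumption: plain whitespace tokenization
    total = 0
    i = 0
    while i < len(tokens):
        matched = False
        # Try n-grams from longest to shortest that start at token i.
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                total += lexicon[phrase]
                i += n  # skip the tokens consumed by this match
                matched = True
                break
        if not matched:
            i += 1  # no lexicon entry starts here; move on
    return total


def label(score: int) -> str:
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Matching the longest n-gram first lets a negated phrase such as "غير جيد" be scored as a unit rather than as the positive unigram it contains. In a MapReduce setting, a function like this would plausibly run inside the map step on each tweet, but that mapping is likewise an assumption.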

Index Terms—Big data, Hadoop, opinion mining, sentiment data analysis, MERS-CoV infection virus, social networks analysis.


Cite: Anis Zarrad, Izzat Alsmadi, Abdulaziz Aljaloud, "A Near Real-Time Approach for Sentiment Analysis Approach Using Arabic Tweets," Journal of Computers, vol. 14, no. 10, pp. 596-614, 2019.

General Information

ISSN: 1796-203X
Abbreviated Title: J.Comput.
Frequency: Bimonthly
Editor-in-Chief: Prof. Liansheng Tan
Executive Editor: Ms. Nina Lee
Abstracting/Indexing: DBLP, EBSCO, ProQuest, INSPEC, Ulrich's Periodicals Directory, WorldCat, etc.
E-mail: jcp@iap.org