TY - GEN
T1 - Feature selection methods in Persian sentiment analysis
AU - Saraee, Mohamad
AU - Bagheri, Ayoub
PY - 2013
Y1 - 2013
AB - With the enormous growth of digital content on the internet, online reviews such as product and movie reviews present a wealth of subjective information that can be very helpful for potential users. Sentiment analysis aims to use automated tools to detect subjective information in reviews. Little research has so far been conducted on feature selection in sentiment analysis, and work on Persian sentiment analysis is rarer still. This paper considers the problem of sentiment classification using different feature selection methods for online customer reviews in the Persian language. Three challenges of Persian text are the use of a wide variety of declensional suffixes, differences in word spacing, and many informal or colloquial words. We address these challenges by proposing a model for sentiment classification of Persian review documents. The proposed model is based on stemming and feature selection and employs the Naive Bayes algorithm for classification. We evaluate the performance of the model on a collection of cellphone reviews, and the results show the effectiveness of the proposed approaches.
KW - feature selection
KW - mutual information
KW - Naive Bayes algorithm
KW - Persian language
KW - sentiment analysis
KW - sentiment classification
UR - http://www.scopus.com/inward/record.url?scp=84884966737&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-38824-8_29
DO - 10.1007/978-3-642-38824-8_29
M3 - Conference contribution
AN - SCOPUS:84884966737
SN - 9783642388231
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 303
EP - 308
BT - Natural Language Processing and Information Systems - 18th International Conference on Applications of Natural Language to Information Systems, NLDB 2013, Proceedings
T2 - 18th International Conference on Applications of Natural Language to Information Systems, NLDB 2013
Y2 - 19 June 2013 through 21 June 2013
ER -