Computer Application in Arts and Humanities; Web-Based Services; Document Analysis
Issue Date:
2019
Publisher:
Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Citation:
Serdica Journal of Computing, Vol. 13, No 1-2, (2019), 081p-096p
Abstract:
In this paper we investigate the influence of emoticons, informal speech, and
lexical and other linguistic features on the sentiment expressed in SMS messages.
Using a dataset of approximately 6,000 samples, we trained a linear SVM classifier that
distinguishes positive, negative and neutral sentiment. The dataset consists mostly of
messages in Serbian, with some in English and German. The classifier achieved an average
accuracy of 92.3% in 5-fold cross-validation, and F1-scores of
92.1%, 74.0% and 93.3% for the positive, negative and neutral classes, respectively.
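
The abstract reports one accuracy figure alongside a separate F1-score for each class. The following is a minimal sketch of how such metrics are commonly computed, treating each class one-vs-rest. The function names and the toy labels are hypothetical and purely illustrative; they are not taken from the paper or its dataset, and the authors' actual evaluation code may differ.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}, treating each label one-vs-rest."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (assumed for illustration only, not data from the paper)
LABELS = ["positive", "negative", "neutral"]
y_true = ["positive", "positive", "negative", "neutral", "neutral", "neutral"]
y_pred = ["positive", "neutral", "negative", "neutral", "neutral", "positive"]

f1 = per_class_f1(y_true, y_pred, LABELS)
acc = accuracy(y_true, y_pred)
```

In a k-fold setting, these metrics would typically be computed on each held-out fold and then averaged. Reporting F1 per class makes differences between classes visible that a single accuracy figure would hide, as with the lower negative-class score quoted above.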