Computer Application in Arts and Humanities, Web-Based Services, Document Analysis
Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Serdica Journal of Computing, Vol. 13, No. 1-2 (2019), pp. 81-96
In this paper we investigate the influence of emoticons, informal speech,
and lexical and other linguistic features on the sentiment expressed in SMS messages.
Using a dataset of ~6,000 samples, we trained a linear SVM classifier that
distinguishes positive, negative and neutral sentiment. The dataset mostly contains
messages in Serbian, but also some in English and German. The classifier achieved an average
accuracy of 92.3% in a 5-fold cross-validation setting, and F1-scores of
92.1%, 74.0% and 93.3% for the positive, negative and neutral classes, respectively.
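For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy and per-class F1 can be computed for a three-class sentiment task. It is not the authors' evaluation code: the function names `per_class_f1` and `accuracy` and the toy label lists are assumptions made purely for illustration, and the numbers they produce are unrelated to the results above.

```python
from collections import Counter

# The three sentiment classes named in the abstract.
LABELS = ("positive", "negative", "neutral")


def per_class_f1(y_true, y_pred, labels=LABELS):
    """Return a dict mapping each label to its one-vs-rest F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels, NOT taken from the paper's dataset.
y_true = ["positive", "positive", "negative", "neutral", "neutral", "negative"]
y_pred = ["positive", "neutral", "negative", "neutral", "neutral", "positive"]

f1_scores = per_class_f1(y_true, y_pred)
acc = accuracy(y_true, y_pred)
```

In a 5-fold cross-validation setting, these quantities would typically be computed on each held-out fold and then averaged. A per-class F1 is reported alongside accuracy because accuracy alone can hide weak performance on a less frequent class.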