Please use this identifier to cite or link to this item: http://hdl.handle.net/10525/2508

Title: Classification of Texts' Authorship Using a Regression Model on Compressed Data
Authors: Dackova, Diana; Mateev, Plamen
Keywords: Text authorship identification; Classification; Compression; Linear Regression
Issue Date: 2013
Publisher: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Citation: Pliska Studia Mathematica Bulgarica, Vol. 22, No 1, (2013), pp. 25-32
Abstract: An algorithm for text authorship identification is proposed. The procedure is based on Kolmogorov complexity and fits regression models to the lengths of the compressed texts. The classification uses the estimates of the regression parameters. Different combinations of compressor parameters and of preliminary processing of the data are examined using prose texts of a few English classics.
Description: 2010 Mathematics Subject Classification: 68T50, 62H30, 62J05.
URI: http://hdl.handle.net/10525/2508
ISSN: 0204-9805
Appears in Collections: 2013 Volume 22
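
Illustrative sketch (not taken from the paper): the abstract describes regressing the length of compressed text and classifying by the estimated regression parameters, but this record gives no further detail. The Python below is one possible reading under stated assumptions: zlib stands in for the compressor, the regression is an ordinary least-squares line of compressed length against raw length over growing prefixes of a text, and classification picks the reference author whose fitted slope is closest. The function names, the choice of zlib, the number of prefixes, and the slope-only decision rule are all hypothetical.

```python
import zlib


def compressed_length(text, level=9):
    """Length in bytes of the zlib-compressed UTF-8 encoding of text.

    Assumption: zlib is only a stand-in; the paper's compressor is not
    named in this record.
    """
    return len(zlib.compress(text.encode("utf-8"), level))


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def regression_features(text, n_points=8):
    """Regress compressed length on raw length over growing prefixes.

    Assumption: n_points equally spaced prefixes; returns (intercept, slope).
    """
    if len(text) < 2 * n_points:
        raise ValueError("text too short for the requested number of prefixes")
    step = len(text) // n_points
    xs = [step * k for k in range(1, n_points + 1)]
    ys = [compressed_length(text[:x]) for x in xs]
    return fit_line(xs, ys)


def classify(text, references):
    """Return the key of the reference text whose slope is nearest.

    references maps an author label to a sample of that author's text.
    Assumption: only the slope is compared; the paper may use the
    parameter estimates differently.
    """
    _, slope = regression_features(text)

    def slope_gap(author):
        _, ref_slope = regression_features(references[author])
        return abs(ref_slope - slope)

    return min(references, key=slope_gap)
```

A call such as classify(unknown_text, {"Author 1": sample_1, "Author 2": sample_2}) returns the label with the closest slope. The slope here plays the role of an empirical compression rate, which is one common practical proxy for Kolmogorov complexity per character; whether it separates real authors depends on the compressor settings and preprocessing, which is what the paper reports examining.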

Files in This Item:

Pliska-22-2013-025-032.pdf (433.7 kB, Adobe PDF)

 




 
