

Title: Applying A Normalized Compression Metric To The Measurement Of Dialect Distance
Authors: Simov, Kiril
Osenova, Petya
Keywords: Kolmogorov Complexity
Compression Metric
Dialect Distance
Language Contacts
Issue Date: 2007
Publisher: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Citation: Serdica Journal of Computing, Vol. 1, No 1, (2007), pp. 73-86
Abstract: The paper discusses the application of a similarity metric based on compression to the measurement of the distance among Bulgarian dialects. The similarity metric is defined on the basis of the notion of Kolmogorov complexity of a file (or binary string). The application of Kolmogorov complexity in practice is not possible because its calculation over a file is an undecidable problem. Thus, the actual similarity metric is based on a real-life compressor which only approximates the Kolmogorov complexity. To use the metric for distance measurement of Bulgarian dialects we first represent the dialectological data in such a way that the metric is applicable. We propose two such representations which are compared to a baseline distance between dialects. Then we conclude the paper with an outline of our future work.
ISSN: 1312-6555
Appears in Collections: Volume 1 Number 1
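
The compression-based metric summarized in the abstract can be illustrated with a minimal sketch. It uses the commonly cited normalized compression distance formula, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. This record does not state the exact formula, the compressor, or the data representations used in the paper, so the choice of zlib, the function names, and the placeholder byte strings below are all assumptions made for illustration only.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input.

    Assumption: zlib stands in for the 'real-life compressor' that
    approximates Kolmogorov complexity; the paper may use another one.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


# Placeholder inputs (hypothetical, not dialect data from the paper).
sample_a = b"the quick brown fox jumps over the lazy dog " * 20
sample_b = bytes(range(256)) * 4

same = ncd(sample_a, sample_a)       # a string compared with itself
different = ncd(sample_a, sample_b)  # two unrelated strings
print(same, different)
```

Under these assumptions, a string compared with itself should score noticeably lower than two unrelated strings. To apply this to dialects, each dialect's data would first have to be serialized to bytes in some agreed representation, which is the step the abstract says the authors address.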

Files in This Item:

File: sjc031-vol1-num1-2007.pdf
Size: 159.72 kB
Format: Adobe PDF



