BulDML at Institute of Mathematics and Informatics: Automatic Identification of False Friends in Parallel Corpora: Statistical and Semantic Approach

Please use this identifier to cite or link to this item: http://hdl.handle.net/10525/366

Title:	Automatic Identification of False Friends in Parallel Corpora: Statistical and Semantic Approach
Authors:	Nakov, Svetlin
Keywords:	Cognates; False Friends; Identification of False Friends; Parallel Corpus; Cross-Lingual Semantic Similarity; Web as a Corpus
Issue Date:	2009
Publisher:	Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Citation:	Serdica Journal of Computing, Vol. 3, No. 2 (2009), pp. 133-158
Abstract:	False friends are pairs of words in two languages that are perceived as similar but have different meanings. We present an improved algorithm for acquiring false friends from a sentence-aligned parallel corpus, based on statistical observations of word occurrences and co-occurrences in the parallel sentences. The results are compared with a purely semantic measure of cross-lingual similarity between words that uses the Web as a corpus, analyzing the words’ local contexts extracted from the text snippets returned by Google searches. The statistical and semantic measures are then combined into an improved algorithm for identifying false friends that achieves results almost twice as good as those of previously known algorithms. The evaluation is performed for identifying cognates between Bulgarian and Russian, but the proposed methods could be adapted to other language pairs for which parallel corpora and bilingual glossaries are available.
URI:	http://hdl.handle.net/10525/366
ISSN:	1312-6555
Appears in Collections:	Volume 3 Number 2
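
The statistical idea mentioned in the abstract (counting word occurrences and co-occurrences in aligned sentence pairs) can be illustrated with a minimal sketch. This is not the formula from the paper: the Dice-style score, the function name `cooccurrence_score`, and the placeholder tokens in `toy_corpus` are all assumptions made for illustration. The intuition is only that two similar-looking words that rarely appear together in aligned sentences are more likely to be false friends.

```python
from typing import Iterable, Sequence, Tuple

# One aligned pair: (tokens of the sentence in language A, tokens in language B).
SentencePair = Tuple[Sequence[str], Sequence[str]]


def cooccurrence_score(pairs: Iterable[SentencePair], word_a: str, word_b: str) -> float:
    """Dice-style co-occurrence score in [0, 1] (an assumed measure, not the paper's).

    Counts the aligned pairs in which word_a occurs on side A, in which
    word_b occurs on side B, and in which both occur together.
    """
    count_a = count_b = count_both = 0
    for sent_a, sent_b in pairs:
        has_a = word_a in sent_a
        has_b = word_b in sent_b
        count_a += has_a
        count_b += has_b
        count_both += has_a and has_b
    if count_a + count_b == 0:
        return 0.0  # neither word occurs anywhere in the corpus
    return 2.0 * count_both / (count_a + count_b)


# Hypothetical toy corpus with placeholder tokens, not real Bulgarian or Russian data.
toy_corpus = [
    (["a", "b"], ["A", "B"]),
    (["a"], ["A"]),
    (["b"], ["C"]),
    (["c"], ["B"]),
]

print(cooccurrence_score(toy_corpus, "a", "A"))  # words always appear together
print(cooccurrence_score(toy_corpus, "b", "B"))  # words appear together only sometimes
```

A low score for a pair of orthographically similar words would be one signal that the pair is a false friend candidate. According to the abstract, the paper goes further and combines statistical evidence of this kind with a Web-based semantic similarity measure.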

Files in This Item:

File	Description	Size	Format
sjc083-vol3-num2-2009.pdf		246.97 kB	Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
