Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Citation:
Serdica Journal of Computing, Vol. 5, No. 3 (2011), pp. 207-236
Abstract:
In research, grid computing is an established way of providing
computer resources for information retrieval. However, e-science grids also
contain, process and produce documents, thereby acting as digital libraries
and requiring means for information discovery. In this paper, we discuss
how distributed information retrieval can be integrated into the Open Grid
Services Architecture (OGSA) to efficiently provide information retrieval for
e-science grids. We identify two fundamental ways of performing information
retrieval on the grid, as a batch job or as a distributed activity, and argue
the case for the latter for reasons of efficiency. We give an analysis of the
theoretical communication and computation complexity and demonstrate that
bandwidth limitations provide a decisive argument to support our case. We
describe further design decisions for our system architecture and give a brief
comparison with other designs reported in the literature. Lastly, we describe
how the statelessness and isolation of web services impede data-intensive,
distributed, cross-site activities in OGSA grids, and how to overcome these limitations.
Description:
This is an extended version of an article presented at the Second International Conference
on Software, Services and Semantic Technologies, Sofia, Bulgaria, 11–12 September 2010.