Data Placement in Widely Distributed Environments
Miron Livny
High-capacity networks present distributed resource management systems with a fundamental dilemma: should the data be moved to the application, or should the application be moved to the data? It is widely recognized that the way this tradeoff is evaluated and resolved plays a key role in the overall performance of a distributed system. It is therefore critical to treat the placement of data as a managed activity. Like processing activities, data placement jobs need to be scheduled and monitored, and the resources needed to facilitate such a placement (e.g., storage space and I/O bandwidth) must be controlled by an allocation policy. The talk will introduce an approach that treats data placement jobs as first-class citizens and present a framework for managing the resources consumed by such jobs.