Future Generation Computer Systems 18 (2002) 1061–1074
A computational economy for grid computing and its
implementation in the Nimrod-G resource broker
David Abramson a,∗, Rajkumar Buyya b, Jonathan Giddy c
a
School of Computer Science and Software Engineering, Monash University, Caulfield Campus, Melbourne, Australiab
Grid Computing and Distributed Systems (GRIDS) Laboratory, University of Melbourne, Australiac
Department of Computer Science, Welsh e-Science Centre, Cardiff University, Cardiff, UKAbstractComputational grids that couple geographically distributed resources such as PCs, workstations, clusters, and scientific
instruments, have emerged as a next generation computing platform for solving large-scale problems in science, engineering,
and commerce. However, application development, resource management, and scheduling in these environments continue to
be a complex undertaking. In this article, we discuss our efforts in developing a resource management system for scheduling
computations on resources distributed across the world with varying quality of service (QoS). Our service-oriented grid
computing system called Nimrod-G manages all operations associated with remote execution including resource discovery,
trading, scheduling based on economic principles and a user-defined QoS requirement. The Nimrod-G resource broker is
implemented by leveraging existing technologies such as Globus, and provides new services that are essential for constructing
industrial-strength grids. We present the results of experiments using the Nimrod-G resource broker for scheduling parametric
computations on the World Wide Grid (WWG) resources that span five continents.
© 2002 Elsevier Science B.V. All rights reserved.
Keywords: Grid computing; Computational economy; Nimrod-G broker; Grid scheduling; Resource management
1. Introductionundertaking. This is due to the geographic distribu-
tion of resources that are often owned by different
The accelerated development of grid computing
organizations having different usage policies and cost
systems has positioned them as promising next gener-
models, and varying loads and availability patterns.
ation computing platforms. They enable the sharing,
To address these resource management challenges,
selection, and aggregation of geographically dis-
we have proposed and developed a computational
tributed resources for solving large-scale problems in
economy framework for resource allocation and reg-
science, engineering, and commerce [1,7,30]. How-
ulation of supply and demand for resources. The new
ever, application composition, resource management
framework offers an incentive to resource owners for
and scheduling in these environments is a complex
being part of the grid and motivates resource users to
strike a balance between time for results delivery and
heir economic cost, i.e., deadline and budget [5].
∗ Corresponding author. Tel.: +61-3-9905-1183;
We are exploring the use of an economic paradigm
fax: +61-3-9905-5146.
for grid computing. We have developed an economy-
E-mail addresses: davida@csse.monash.edu.au (D. Abramson),
rajkumar@buyya.com (R. Buyya), j.p.giddy@cs.cf.ac.uk
driven grid resource broker within the Nimrod-G
(J. Giddy).
system that supports soft-deadline and budget-based
0167-739X/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S 0 1 6 7 - 7 3 9 X ( 0 2 ) 0 0 0 8 5 - 7
1062
D. Abramson et al. / Future Generation Computer Systems 18 (2002) 1061–1074Table 1
Economics models and examples of distributed computing scheduling systems that adopted them
Economic model
Adopted by
Commodity market
Mungi [15], MOSIX [16], and Nimrod-G [1,4]
Posted price
Nimrod-G
Bargaining
Mariposa [12] and Nimrod-G
Tendering or contract-net model
Mariposa [12]
Auction model
Spawn [17] and Popcorn [18]
Bid-based proportional resource sharing
Rexec and Anemone [19]
Community, coalition, and bartering
Condor, SETI@Home [20], and MojoNation [21]
Monopoly and oligopoly
Nimrod-G broker can be used to choose between resources offered at
different quality and prices
scheduling of applications on the computational grid
spanning across five continents. Whilst it is not the
[7]. Depending on users’ quality of service (QoS)
goal of the system to earn revenue for the resource
requirements, our resource broker dynamically leases
providers, this approach does provide an economic in-
grid services at runtime depending on their cost,
centive for resource owners to share their facilities on
quality, and availability. The scheduler allows mini-
the grid. Further, it encourages the emergence of a
mization of time or cost within specific deadline and
new service-oriented computing industry. Importantly,
budget constraints.
it provides mechanisms to trade-off QoS parameters
Resource management systems need to provide
such as deadline and computational cost, and offers
mechanisms and tools that realize the goals of
incentive for users to relax their requirements. For ex-
both service providers and consumers. The resource
ample, a user may be prepared to accept a later dead-
consumers need a
utility model, representing their re-
line if the computation can be achieved more cheaply.
source demand and preferences, and
brokers that au-
Current grid computing toolkits and applications do
tomatically generate strategies for choosing providers
not provide this functionality.
based on this model. Further, the brokers need to
manage all issues associated with the execution of the
underlying application. The service providers need
2. Nimrod-G: economics-driven grid computingprice generation schemes that increase system uti-
environmentlization, as well as economic
protocols that help them
to offer competitive services. For the market to be
2.1. Objectives and goalscompetitive and efficient, coordination mechanisms
are required that help the market reach an equilibrium
Nimrod-G is a tool for automated modeling and
price, that is, the market price at which the supply
execution of parameter sweep applications (param-
of a service equals the quantity demanded [13]. Nu-
eter studies) over global computational grids [1–3].
merous economic theories have been proposed in the
It provides a simple declarative parametric modeling
literature and many commonly used economic models
language for expressing parametric experiments. A
for selling goods and services can be employed as ne-
domain expert can easily create a plan for a paramet-
gotiation protocols in grid computing. Some of these
ric experiment and use the Nimrod system to submit
market- or social-driven economic models are shown
jobs for execution. It uses novel resource manage-
in Table 1 along with the identity of the distributed
ment and scheduling algorithms based on economic
system that adopted the approach [8].
principles. Specifically, it supports user-defined dead-
These economic models regulate the supply and de-
line and budget constraints for schedule optimizations
mand for resources in grid-based virtual enterprises.
and manages the supply and demand of resources
We demonstrate the power of these models in schedul-
in the grid using a set of resource trading services
ing computations using the Nimrod-G resource broker
called Grid Architecture for Computational Economy
on a grid testbed, called the World Wide Grid (WWG)
(GRACE) [5–7].
D. Abramson et al. / Future Generation Computer Systems 18 (2002) 1061–10741063
Nimrod-G provides a persistent and programmable
parameterization of application input data files, and a
task-farming engine (TFE) that enables “plugging”
ofgrid resource broker with programmable entities for
user-defined schedulers and customized applications
processing jobs on grid resources. The resource broker
or problem solving environments (e.g., ActiveSheets
is made of a number of components, namely a persis-
[23]) in place of default components. The TFE is a
tent and programmable TFE, a schedule advisor, and
coordination point for processes performing resource
a dispatcher, whose functionalities are discussed later.
trading, scheduling, data and executable staging, re-
It also provides job management services that can
mote execution, and result collation. In the past, the
be used for creating user-defined schedulers, steering
major focus of our project was on creating
tools that
and monitoring tools, and customized applications.
help domain experts to compose their legacy serial ap-
Therefore, the end users that benefit from Nimrod-G
plications for parameter studies and run them on com-
tools, protocols, and services are:
putational clusters and manually managed grids [2,3].
•
Domain experts: This group includes scientific, en-
Our current focus is on the use of economic principles
gineering, and commercial users with large-scale
in resource management and scheduling on the grid in
data-set processing requirements. Parameter appli-
order to provide some measurable QoS to the end user.
cations can use Nimrod-G tools to compose them as
Real-world economic methods provide incentives for
coarse-grained data parallel, parameter sweep ap-
owners to contribute their resources to markets, and
plications for executing on distributed resources.
it also provides consumers with a basis for trading
They can also take advantage of the Nimrod-G bro-
the QoS they receive against cost. That is, our focus
ker features to trade-off between a deadline and the
revolves within an intersection area of grid architec-
cost of computation while scheduling application
tures, economic principles, and scheduling optimiza-
execution on the grid. This QoS aspect is important
tions (see Fig. 1), which is essential for pushing the
to end users, because the results are only useful if
grid into the mainstream computing.
they are returned in a timely manner. Previous grid
work has largely ignored this aspect of running real
2.2. Services and end usersapplications.
•
Problem solving environments developers: Appli-
Nimrod-G provides a suite of tools and services for
cation developers can grid enable their applications
creating parameter sweep applications, performing
with their own mechanisms to submit jobs to the
resource management, and scheduling applications.
Nimrod-G resource broker at runtime depending on
They are, a simple declarative programming language
user requirements for processing on the grid. This
and associated GUI tools for creating scripts and
gives them the ability to create applications capable
Fig. 1. QoS-based resource management—intersection of economic, scheduling, and grid worlds.
1064
D. Abramson et al. / Future Generation Computer Systems 18 (2002) 1061–1074of directly using Nimrod-G tools and job manage-
◦ customized end user applications (e.g., Ac-
ment services, which in turn enables their applica-
tiveSheets [23], Nimrod-O [28]).
tions for grid execution.
• The Nimrod-G resource broker, that consists of
•
Task farming or master–worker programming en-◦ a TFE;
vironments designers: These users can focus on
◦ a scheduler that performs resource discovery,
designing and developing easy use and powerful
trading, and scheduling;
application creation primitives for task farming and
◦ a dispatcher and actuator;
master–worker style programming model; develop-
◦ agents for managing the execution of jobs on re-
ing translators and application execution environ-
sources.
ments by taking advantage of Nimrod-G runtime
The Nimrod-G broker architecture leverages ser-
machinery for executing jobs on distributed grid
vices provided by lower-level different grid middle-
resources.
ware solutions to perform resource discovery, trading,
•
Scheduling researchers: The scheduling policy de-
and deployment of jobs on grid resources.
velopers generally use simulation techniques and
tools such as GridSim [26] and Simgrid [27] for
2.3.1. Nimrod-G clientsevaluating performance of their algorithms. In sim-
ulation, it is very difficult to capture the complete
2.3.1.1. Tools for creating parameter sweep applica-property and behavior of a real-world system and
tions.Nimrod supports GUI tools and declarative
hence, evaluation results may be inaccurate. Ac-
programming language that assist in creation of pa-
cordingly, to prove the usefulness of scheduling
rameter sweep applications [2]. They allow user to (a)
algorithms on actual systems, researchers need to
parameterize input files, (b) prepare a plan file contain-
develop runtime machinery, which is a resource in-
ing the commands that define parameters and their val-
tensive and time-consuming task. This can be over-
ues, (c) generate a run file, which converts the generic
come by using Nimrod-G broker programmable
plan file to a detailed list of jobs and (d) control and
capability. Researchers can use Nimrod-G job
monitor execution of the jobs. The application execu-
management protocols and services to develop
tion environment handles online creation of input files
their own scheduler and associated scheduling al-
and command line arguments through parameter sub-
gorithms. The new scheduler can be used to run
stitution.
actual applications on distributed resources and
then evaluate ability of scheduling algorithms in
2.3.1.2. Steering and control monitors.These com-
optimally mapping jobs to resources.
ponents act as a user-interface for controlling and
monitoring a Nimrod-G experiment. The user can
2.3. Architecturevary constraints related to time and cost that influence
the direction the scheduler takes while selecting re-
The Nimrod-G toolkit and resource broker is devel-
sources. It serves as a monitoring console and lists the
oped by leveraging services provided by grid middle-
status of all jobs, which a user can view and control.
ware systems such as Globus, Legion, Condor/G, and
A Nimrod-G monitoring and steering client snapshot
the GRACE trading mechanisms. These middleware
is shown in Fig. 3. Another feature of the Nimrod-G
systems provide a set of low-level protocols for se-
client is that it is possible to run multiple instances
cure and uniform access to remote resources; and ser-
of the same client at different locations. That means
vices for accessing resources information and storage
the experiment can be started on one machine, mon-
management,. The modular and layered architecture
itored on another machine by the same or different
of Nimrod-G is shown in Fig. 2. The key components
user, and the experiment can be controlled from yet
of Nimrod-G resource broker consist of the following:
another location. We have used this feature to monitor
and control an experiment from Monash University
• Nimrod-G Clients, which can be
and Pittsburgh Supercomputing Centre at Carnegie
◦ tools for creating parameter sweep applications;
Melon University simultaneously during HPDC-2000
◦ steering and control monitors;
research demonstrations.
D. Abramson et al. / Future Generation Computer Systems 18 (2002) 1061–10741065
Fig. 2. Layered with an hourglass shape architecture of Nimrod-G system.
2.3.1.3. Customized end user applications.Special-
2.3.2. The Nimrod-G grid resource brokerized applications can be developed to create jobs at
The Nimrod-G resource broker is responsible for
runtime and add jobs to the Nimrod-G engine for pro-
determining the specific requirements that an exper-
cessing on the grid. These applications can use the
iment places on the grid and performing resource
Nimrod-G management APIs for adding and manag-
discovery, scheduling, dispatching jobs to remote grid
ing jobs. One such application is ActiveSheets [23],
nodes, starting and managing job execution, and gath-
an extended Microsoft Excel spreadsheet that submits
ering results back to the home node. The sub-modules
cell functions for parallel execution on computational
of our resource broker are, the TFE; the scheduler
grids using Nimrod-G services. Another example is
that consists of a grid explorer for resource discovery,
the Nimrod-O system, a tool that uses non-linear op-
a schedule advisor backed with scheduling algo-
timization algorithms to facilitate automatic optimal
rithms, and a resource trading manager; a dispatcher
design [28]. This tool has been used on a variety of
and actuators for deploying agents on grid resources;
case studies, including antenna design, smog model-
and agents for managing execution of Nimrod-G
ing, durability optimization, aerofoil design and com-
jobs on grid resources. The interaction between
putational fluid dynamics [29].
components of the Nimrod-G runtime machinery
1066
D. Abramson et al. / Future Generation Computer Systems 18 (2002) 1061–1074Fig. 3. A snapshot of Nimrod-G execution monitoring and steering client.
and grid services during runtime is shown in
The TFE maintains the state of an entire experiment
Fig. 4.
and ensures that it is recorded in persistent storage.
This allows the experiment to be restarted if root node
2.3.2.1. The TFE.The Nimrod-G TFE is a persis-
fails. The TFE exposes interfaces for job, resource, and
tent and programmable job control agent that manages
task management along with the job-to-resource map-
and controls an experiment. It consists of a database to
ping APIs. Accordingly, scheduling policy developers
provide persistence which is accessed through a thin
can use these interfaces to implement other schedulers
management interface. The farming engine is respon-
without concern for the complexity of low-level re-
sible for the parameterization of the experiment, the
mote execution mechanisms.
actual creation of jobs, the maintenance of job status,
and provides the means for interaction between the
2.3.2.2. The scheduler.The scheduler is responsible
clients, the schedule advisor, and the dispatcher. The
for resource discovery, resource trading, resource se-
TFE interacts with the scheduler and dispatcher in or-
lection, and job assignment. The resource discovery
der to process jobs. It manages the experiment under
algorithm interacts with an information service (the
the direction of schedule advisors, and then instructs
MDS in Globus), identifies the list of authorized and
the dispatcher to allocate an application task to the
available machines, trades for resource access cost,
selected resource.
and keeps track of resource status information. The
D. Abramson et al. / Future Generation Computer Systems 18 (2002) 1061–10741067
Fig. 4. The flow actions in the Nimrod-G runtime environment.
resource selection algorithm is responsible for select-
results back to the TFE. Since the agent operates on
ing those resources that meet the deadline and bud-
the “far side” of the middleware resource management
get constraints along with optimization requirements.
components, it provides error detection for the user’s
Nimrod-G provides three different scheduling algo-
task, sending the information back to the TFE.
rithms [6].
The Nimrod-G agent also records the amount of
resource consumed during job execution, such as the
2.3.2.3. The dispatcher and actuators.The dis-
CPU time and wall clock time. The online measure-
patcher triggers appropriate actuators to deploy agents
ment of the amount of resource consumed by the job
on grid resources and assign one of the resource-
during its execution helps the scheduler evaluate re-
mapped jobs for execution. Even though the schedule
source performance and change the schedule accord-
advisor creates a schedule for the entire duration based
ingly. Typically, there is only one type of agent for all
on user requirements, the dispatcher deploys jobs on
mechanisms, irrespective of whether they are fork or
resources periodically depending on load and number
queue nodes. However, different agents are required
of CPU s that are available. We have implemented
for different middleware systems.
different dispatchers and actuators for each different
middleware service. For example, a Globus-specific
2.4. Implementation and technologiesdispatcher is required for Globus resources, and a
Legion-specific component for Legion resources.
The Nimrod-G resource broker follows a mod-
ular, extensible, and layered architecture with an
2.3.2.4. Agents.Nimrod-G agents are deployed on
“hourglass” principle as applied in the Internet Pro-
grid resources dynamically at runtime depending on
tocol suite [11]. This architecture enables separation
the scheduler’s instructions. The agent is responsible
of different grid middleware systems
mechanismsfor setting up the execution environment on a given
for accessing remote resources from the end user
resource for a job. It is responsible for transporting
applications. The broker provides uniform access to
the code and data to the machine; starting the execu-
diverse implementations of low-level grid services.
tion of the task on the assigned resource and sending
The key components of Nimrod-G, the TFE, the
1068
D. Abramson et al. / Future Generation Computer Systems 18 (2002) 1061–1074Table 2
The resource broker functionality and grid services role
Nimrod-G functionality
Grid services used or performed by
Application model
Coarse-grained task farming, master–worker, and data
parallelism
Application composition
We support mechanism for application parameterization through
parameterization of input files and command-line inputs for coarse-grained data
parallelism. It basically supports coarse-grain, data parallel, task-farming
application model, which can be expressed using our declarative programming
language or GUI tools
Application interface
The Nimrod-G broker supports protocols and interfaces that help in job
management. Nimrod-G clients or problem solving environments can add,
remove, and inquire about job status. They can set user requirements such as
deadline and budget; start and stop application execution both at job level and
the entire application level
Scheduling interface
The Nimrod-G broker supports protocols and interfaces that help in developing
schedulers. The schedulers can interact with the TFE, inquire jobs, user
constraints, and develop a schedule that maps jobs to resources
Security
Secure access to resources and computations (identification, authentication,
computational delegation) is provided by low level middleware systems like
Globus
Resource discovery
Resource discovery involves discovering appropriate resources and their
properties that match with users requirements. We maintain resource listings for
Globus, Legion, and Condor and their static and dynamic properties are
discovered using grid information services. For example, in case of Globus
resources, we query Globus LDAP-based GRIS server for resource information
Resource trading and market models
The market-driven resource trading is performed using GRACE trading
services. The Nimrod-G broker architecture is generic enough to support
various economic models for price negotiation and using the same in
developing application schedules
Performance prediction
The Nimrod-G scheduler performs the user-level resource capability
measurement and load profiling by measuring and establishing the job
consumption rate
Scheduling algorithms
Deadline and budget-based constrained (DBC) scheduling performed by
Nimrod-G schedule advisor. Along with DBC scheduling, we support further
optimization of time, cost, or surplus-driven divide and conquer in scheduling
Remote job submission
The Nimrod-G dispatcher performs deployment of Nimrod-G agents using
Globus GRAM, Legion, or Condor commands. The agents are responsible for
managing all aspects of job execution
Staging programs and data on remote resources
In case of Legion and Condor it is handled by their I/O management systems.
On Globus resources, we use http protocols for fetching required files
Accounting (broker level)
Nimrod-G agents perform accounting tasks such as measuring resource
consumptions and the scheduler performs the entire application level accounting
Monitoring and steering tools
Nimrod-G monitoring and steering client
Problem solving environments
ActiveSheets and Nimrod-O are enabled to use Nimrod-G services
Execution testbed
The WWG having resources distributed across five continents
scheduler, and the dispatcher are loosely coupled.
implementation of Nimrod-G support for upcoming
The interaction among them takes place through
peer-to-peer computing infrastructures such as Jxta
network protocols. Apart from the dispatchers and
[24] and Web services [25]. To achieve this, it is only
the Grid Explorer, the Nimrod-G components are
necessary to implement two new components, a dis-
mechanism-independent. The modular and exten-
patcher and an enhanced Grid Explorer. The current
sible architecture of Nimrod-G facilitates a rapid
implementation of Nimrod-G broker uses low-level
D. Abramson et al. / Future Generation Computer Systems 18 (2002) 1061–10741069
grid services provided by Globus [9] and Legion [10]
3. Scheduling experiments on the WWG testbedsystems. We also support Nimrod-G dispatcher im-
plementation for Condor [22] resource management
We have performed deadline and budget con-
system. The role of various grid and commodity tech-
strained scheduling experiments at two different
nologies in implementing Nimrod-G functionality
times (Australian peak and off-peak hours) on re-
and components is presented in Table 2.
sources distributed in two major time zones [7] using
While submitting applications to the broker, user
a “cost-optimization scheduling algorithm” [6] on
requirements such as deadline and budget constraints
the WWG [14] testbed shown in Fig. 5. Currently,
need to be set and start application execution. These
the testbed has heterogeneous computational re-
constraints can be changed at any time during exe-
sources owned by different organizations distributed
cution. The complete details on application parame-
across five continents: Asia, Australia, Europe, North
terization and jobs management information, starting
America, and South America. This testbed contains
from submission to completion, is maintained in the
heterogeneous resources such as PCs, workstations,
database. In the past the database was implemented
SMPs, clusters, and vector supercomputers running
as a file-based hierarchical database. In the latest ver-
operating systems such as Linux, Sun Solaris, IBM
sion of Nimrod-G, the TFE database is implemented
AIX, SGI IRIX and True64. Further, the systems use
using a standard “relational” database management
a variety of job management systems such as Condor,
system.
RMS, PBS, and LSF. All these resources are grid
The commodity technologies and software tools
enabled using Globus services.
used in the Nimrod-G implementation include: the C
We have performed an experiment of 165 CPU-
and Python programming languages, the Perl script-
intensive jobs, each running approximately 5 min
ing language, SQL and Embedded C for database
duration. For a given deadline of 2 h (120 min)
management. The PostgreSQL database system is
and budget of 396,000 (G$ or tokens), we con-
used for the management of the TFE database and its
ducted experiments for two different optimization
interaction with other components.
strategies:
Fig. 5. The WWG testbed.
1070
D. Abramson et al. / Future Generation Computer Systems 18 (2002) 1061–1074Table 3
The WWG testbed resources used in scheduling experiments, job execution and costing
Resource type and size
Organization and location
Grid services and
Price (G$ per
Jobs executed on resources
(no. of nodes)
fabric
CPU s)
Time Opt
Cost Opt
Linux cluster (60 nodes)
Monash University, Australia
Globus, GTS, Condor
2
64
153
Solaris (Ultra-2)
Tokyo Institute of Technology,
Globus, GTS, Fork
3
9
1
Japan
Linux PC (Prosecco)
CNUCE, Pisa, Italy
Globus, GTS, Fork
3
7
1
Linux PC (Barbera)
CNUCE, Pisa, Italy
Globus, GTS, Fork
4
6
1
Sun (8 nodes)
ANL, Chicago, USA
Globus, GTS, Fork
7
42
4
SGI (10 nodes)
ISI, Los Angeles, USA
Globus, GTS, Fork
8
37
5
Total experiment cost (G$)
237000
115200
Time to complete
70
119
experiment (min)
1.
Optimize for time: This strategy produces results as
In these scheduling experiments, the Nimrod-G re-
early as possible, but before a deadline and within
source broker employed the
Commodity Market model
a budget limit.
for establishing a service access price [8]. It used grid
2.
Optimize for cost: This strategy produces results by
resource trading services for establishing connection
deadline, but reduces cost within a budget limit.
with the Grid Trader running on resource providers’
Fig. 6. Resource selection in deadline and budget-based scheduling for time optimization strategy.
Document Outline
- A computational economy for grid computing and its implementation in the Nimrod-G resource broker
- Introduction
- Nimrod-G: economics-driven grid computing environment
- Objectives and goals
- Services and end users
- Architecture
- Nimrod-G clients
- Tools for creating parameter sweep applications
- Steering and control monitors
- Customized end user applications
- The Nimrod-G grid resource broker
- The TFE
- The scheduler
- The dispatcher and actuators
- Agents
- Implementation and technologies
- Scheduling experiments on the WWG testbed
- Conclusions and future work
- References
Add New Comment