This is not the document you are looking for? Use the search form below to find more!

Report home > Computer / Internet

A Simulation Environment for Load Balancing on a Cloud

0.00 (0 votes)
Document Description
Distributing jobs of unknown size efficiently across a large number of machines aimed at minimizing cost and achieving optimal response time.
File Details
Submitter
Embed Code:

Add New Comment




Related Documents

Load Balancing Appliance-Benefits of a Load Balancing Appliance

by: array networks, 2 pages

A Load balancing appliance is an essential part of a network. So that you are able to maintain network performance, an active load balancing solution is essential.

Link Load Balancing for Optimizing Internet Bandwidth

by: array networks, 2 pages

Internet connectivity today, is the fundamental part of every business operation, and a critical component that needs to be highly reliable and always available. Link load balancing helps maintain ...

Benefits of Link load Balancing Made Simple

by: array networks, 2 pages

Link Load Balancing looks at balancing out the critical resources on data networks with unpredictable requests issued to a server. It is also concerned with communications channels themselves to ...

Features & Benefits of Load Balancing Appliances

by: array networks, 2 pages

Load balancing appliances are essential parts of a network. If you intend to maintain a stable network performance, you need to have a dynamic load balancing solution at place. It is also crucial for ...

Understanding the Benefits of Load Balancing

by: array networks, 2 pages

Load balancing has specific significance in busy networks, especially when it is difficult to foresee the number of requests that will be issued to a server. It not only increases the performance to ...

Load Balancing Appliance for Dedicated Severs

by: array networks, 2 pages

Mulling over computer terminologies and applications, though you might have encountered the word “dedicated servers”, but still might be unaware of its functions and features. It is best ...

Understanding the Importance of Server Load Balancing

by: array networks, 2 pages

It is a known fact that the IT infrastructure plays a vital part in the success of a business. With network servers hosting a multitude of applications, the availability of these applications can be ...

Create A Page For Your Business On Facebook

by: gusta, 12 pages

Create A Page For Your Business On Facebook

Hp Load Runner Tutorial 2 How To Design A Scenario For Load Testing Using Hp Load Runner

by: benito, 8 pages

HP - LoadRunner Tutorial - 2: How to design a Scenario for Load testing using HP - LoadRunner? Brief Introduction of Scenario: A scenario is a file containing all the…

Donor approaches to improving the business environment for small enterprises

by: monkey, 90 pages

The Working Group on Enabling Environment, established by the Committee of Donor Agencies for Small Enterprise Development in 1992, has taken a closer look at the work ...

Content Preview
Project Report
Cloud Computing Report1
December 22, 2010
Marcus Ljungblad
Navaneeth Rameshan
Wasif Malik
This report is prepared by
Marcus Ljungblad
Navaneeth Rameshan
Wasif Malik
1This report is a part of the cloud computing project.

Contents
1
Introduction
1
2
ProposedMethod
2
2.1
Attempted approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2
2.1.1
Web Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2
2.1.2
Single Task Mode . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
2.2
Proposed method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
2.2.1
Web Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
2.2.2
Single Task Mode . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
3
Implementation
5
4
Results
7
4.1
Single-Task mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7
4.1.1
Simulation summary . . . . . . . . . . . . . . . . . . . . . . . . .
9
4.2
Web-Task Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10
4.2.1
Simulation Summary . . . . . . . . . . . . . . . . . . . . . . . . .
11
4.3
Round-Robin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
12
4.3.1
Simulation Summary . . . . . . . . . . . . . . . . . . . . . . . . .
12
5
Conclusion
13
5.1
Scope for Improvement . . . . . . . . . . . . . . . . . . . . . . . . . . . .
13
5.1.1
Shifting jobs between workers . . . . . . . . . . . . . . . . . . . .
13
5.1.2
Load parameters . . . . . . . . . . . . . . . . . . . . . . . . . . .
13
5.1.3
Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
14
5.2
Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
14
i

List of Figures
2.1
High level flow of Scheduling algorithm . . . . . . . . . . . . . . . . . . .
3
3.1
UML diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5
4.1
Response and Queued jobs over time in single-task mode. . . . . . . . . .
8
4.2
Number of active, idle, and computing workers over time for single-task
mode.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9
4.3
Response time and queued jobs over time in web-task mode
. . . . . . .
10
4.4
Active, idle and computing workers over time in web-task mode. . . . . .
11
4.5
Response time and queued jobs using round-robin scheduling . . . . . . .
12
ii

Chapter 1
Introduction
Distributing jobs of unknown size efficiently across a large number of machines is one
of the greatest challenges today in cloud computing. The goals ranges from minimizing
cost to minimizing time to complete a set of jobs; two often contradicting requirements.
While the best way to complete all jobs as fast as possible may be to schedule one job per
machine, it is far from the most cost-efficient. In effect, one must always make a trade-off
between the two.
In this report we present three algorithms: one focuses on minimizing response time for
a job, one on minimizing cost, and one reference algorithm implemented using round-
robin. To test the algorithms a cloud simulator was implemented in C++. Results from
simulations with varying inputs are compared against the round-robin algorithm. Finally,
a set of improvements to the evaluated algorithms are proposed.
1

Chapter 2
ProposedMethod
2.1
Attempted approaches
The following subsections describe the initial attempted approaches for scheduling.
2.1.1
Web Mode
For the web mode, we intended to do an efficient distribution of jobs to worker nodes, to
minimize swapping costs and fit them in memory in the best possible way so as to ensure
load distribution. However, since the scheduler has no information of the job memory, the
only possible way to efficiently distribute jobs is to schedule jobs in round-robin or random
distribution initially and get information about the job's memory from the worker. Then
the cost of transferring jobs from one worker node to another is calculated to see if it is
feasible to do a transfer for efficient load distribution. It may not be feasible to transfer a
job if the job has already computed most of its instructions. Although the worker nodes
don't provide a feedback on the number of instructions, the worst case time taken for the
jobs completed is used to estimate the time remaining. We discarded this method, as the
estimate for cost was deviating significantly from the actual cost. Time taken for a job
to complete depends largely on the number of jobs that are already present in the worker
node, the swapping costs and also the number of instructions for the job. We believe that
these factors made it difficult to predict the completion time.
2

3
2.1.2
Single Task Mode
For the single task mode, we intended to submit jobs in a round robin manner and also
compute the job away time for all the jobs that have been submitted . The job away
time is computed every scheduler cycle after at least one job from a task completes. The
average time is used in this case to estimate the completion time for jobs that have been
submitted. If the job away time is lesser than the average time of completion for the jobs
in the task, then the estimated time of completion is the average time itself but if the job
away time is greater than the average time, the average completion time is recomputed
as a weighted average. However, here again the estimation of time to complete deviated
significantly from the actual values and as a result either more workers were started than
what was necessary or lesser in some cases.
2.2
Proposed method
Figure 2.1 shows the high level flow of the scheduling algorithm.
Figure 2.1: High level flow of Scheduling algorithm
2.2.1
Web Mode
In the web mode, the goal of the scheduling algorithm is to minimize the average response
time. Ideally, a practical solution would be to distribute load evenly and to start enough

4
number of worker nodes to minimize response time. The jobs are submitted in round-
robin manner until at least one job completes. This is a modified version of the normal
round robin algorithm; it only sends one job per active worker in each scheduling cycle.
This results in the scheduler holding back jobs and increases the chances of jobs finishing
quickly due to reduced swapping time. As jobs complete, we keep track of worst case
completion time. Based on the current time, we also keep a track of time until next
charging tick. As soon as we have at least one completed job, the future jobs are scheduled
to worker nodes based on their current load. In the implementation, load is the number of
jobs a worker node is currently working on. Lesser the load, more jobs can be sent to the
worker and vice versa. For each worker node, the worst case execution times for the jobs
to be sent are estimated based on the worst case completion time seen so far. If the worst
case execution time of the jobs at hand exceeds the time until the next charging tick,
then only that many jobs are sent that are estimated to complete within the charging
tick. These jobs are chosen randomly from the queue for a good distribution. Jobs that
arent sent are considered spilled and spilled jobs for each worker is accumulated in the
same cycle. Depending on the estimation of how long the spilled jobs take, new worker
nodes are started. The scheduler cannot sent the spilled jobs to the newly started workers
immediately because the workers take some time to boot up and cant accept jobs in that
time. So, the spilled jobs are saved to a hash map and the scheduler tries to send them
to the specified workers at the start of each scheduling cycle. As soon as the new worker
node(s) boot up, the jobs will be submitted to them.
2.2.2
Single Task Mode
In single task mode, a similar scheduling algorithm like above is used but with one
major difference; the decision to start new nodes is dependent on the percentage of waste
value. The more money the user is willing to waste, the more nodes will be started
by the scheduler. This would result in a very quick response time but the cost will be
significantly more. Similarly, the lower the percentage of waste value, the more strict is
the scheduler in starting new nodes, the lesser the cost, but increased average response
time. For estimating the time it would take to complete all the jobs in hand, the scheduler
uses the same approach used in web task scheduler i.e. calculates the completion time
by multiplying the number of jobs in queue with the worst time to complete one job.
Theoretically, It would have been better to consider the average completion time for each
task and then decide if new nodes should be started or not; but due to time constraints
and complexity, this approach was not implemented.

Chapter 3
Implementation
The implementation of all modules was done in C++. The UML diagram in figure 3.1
shows the relationship between modules and their key attributes.
Simulator
-currentTime: long
1
1
1
0..n
TaskGen
Scheduler
Worker
-tasks: list<Task>
-queuedJobs: list<Job>
-workerState: enum
-jobs: list<Job>
-runningJobs: list<Job>
+execute()
scheduler: Scheduler
-completedJobs: list<Job>
+startWorker()
submit jobs/tasks
-workers: list<Worker>
-sendTask()
+stopWorker()
1
1
-workerStats: list<WorkerStats>
-createTask()
+submitJobs()
+runScheduler()
has
+getState()
+submitJobs(list<Job>)
+getAvailableMemory()
1
0..n
+notifyJobCompletion()
+isAcceptingJobs()
-getSlowestJobTime()
+getTotalMemory()
1
1
-fetchJobsFromQueueRandomly()
+getCostPerHour()
-startWorkerNode()
+getInstructionsPerTime()
has
has
-runRoundRobinScheduler()
+getTotalExecutionTime()
-runWebScheduler()
+getTotalCPUTime()
-runSingleTaskScheduler()
+getAverageResponseTime()
+getTotalCost()

0..n
0..n
+getQueuedJobs()
+getJobsCompleted()

Task
Job
-taskid: long
-jobid: long
-jobrate
-taskid: long
has
-num_of_jobs
num_instructions: long
1
1..n
-jobs
mem_size: long
+getJobID()
+getTaskID()

Figure 3.1: UML diagram
The clock functionality was implemented in the Simulator class by having a while loop
calling the task generator, scheduler, and worker objects in every iteration. One iteration
5

6
is considered to be one millisecond by each module. But since the scheduler can only
work after a configurable interval, the scheduler ignores iterations and only does work
after the specified scheduling interval. The jobs and tasks are fed to the scheduler by the
task generator at a rate specified in the input file (input.conf). The number of workers
to start automatically at the start of the simulation can be configured in workers.conf
but they will still take time to start-up when the simulation starts. Till that time, the
jobs will be queued at the schedulers end. The initial workers objects are created by
simulator and passed on to the scheduler at the start of the simulation. A limitation in
this implementation is that the scheduler has to wait for workers to start, even though
an initial numbers of started workers is specified.
The worker node is implemented as a state machine with the following states: Initialising,
Idle, Computing, Swapping, or Offline. It can accept jobs in all states except Initialising
and Offline. It maintains two queues: jobs in memory, and jobs in hard drive. When
swapping occurs at a node, it is conducted in a round-robin fashion. A job is started only
if it fits in the memory, if not, an existing job is swapped out (i.e moved to the jobs in
hard drive queue) and the next job retried. Moreover, the worker maintains a public API
statistical use towards the scheduler.

Chapter 4
Results
In this section, three test runs are presented and evaluated. The following configuration
was used for all the test runs.
Table 4.1: Simulator Configuration
Scheduling interval
0.1 s
Worker node speed
300 instr/s
Worker node memory
8 Gb
Worker swapping cost
5 instr/gb
Worker quantum
0.1 s
Worker node start-up time
120 s
Worker node notification time
2 instr
Worker node cost
1 Euro/hour
Allowed waste
30%
Workers started
2
4.1
Single-Task mode
Table 4.2 shows the input configuration for the jobs.
It can be seen from figure4.1 that in the first 120 seconds no jobs are scheduled as the
workers are being started during this time. Then jobs are scheduled in a round-robin
fashion until a sufficiently accurate estimation of when the jobs will complete is attained.
At this point one more worker is started and the spilled jobs are sent to the designated
worker. However, as well see in the following graph 4.2, this new worker is not well used
7

Document Outline

  • ﾿
      • ﾿
      • ﾿
    • ﾿
      • ﾿
  • ﾿
      • ﾿

Download
A Simulation Environment for Load Balancing on a Cloud

 

 

Your download will begin in a moment.
If it doesn't, click here to try again.

Share A Simulation Environment for Load Balancing on a Cloud to:

Insert your wordpress URL:

example:

http://myblog.wordpress.com/
or
http://myblog.com/

Share A Simulation Environment for Load Balancing on a Cloud as:

From:

To:

Share A Simulation Environment for Load Balancing on a Cloud.

Enter two words as shown below. If you cannot read the words, click the refresh icon.

loading

Share A Simulation Environment for Load Balancing on a Cloud as:

Copy html code above and paste to your web page.

loading