Cloud Computing – Task Scheduling Based on Genetic Algorithms
Cloud Computing –
Task Scheduling based on Genetic Algorithms
Eleonora Maria Mocanu, Mihai Florea, Mugurel Ionut Andreica, Nicolae Tapus
Abstract— Cloud Computing is a cutting edge technology for managing and delivering services over the Internet. Map-Reduce is the programming model used in cloud computing for processing large data sets in parallel over huge clusters. In order to increase efficiency, a good task scheduling is needed. Genetic algorithms are very useful and accurate in finding solutions to large scale optimization problems, such as task scheduling. They have gained immense popularity over last few years as a robust and easily adaptable search technique. Hadoop, the open source implementation of Map-Reduce, has several task schedulers available (FIFO, Fair, Capacity Schedulers), but neither one of them is focused on minimizing the global execution time. The goal of this project is to improve Hadoop’s functionality by implementing a scheduler based on a genetic algorithm, solving the stated problem.
Keywords – Cloud Computing; Genetic Alghorithms; MapReduce; Hadoop
Cloud Computing is the next natural step in the evolution of on-demand information technology services. It offers a way to increase storage capacity, computation power and add capabilities on the fly without investing in new infrastructure, licensing new software or training new personnel. In order to do all that, it needs a good taskresource planning.
Task scheduling in distributed systems usually has the goal of distributing the load on processors, maximizing their utilization, while minimizing the global task execution time. Task scheduling can be defined into two kinds: static scheduling and dynamic scheduling.
Static scheduling uses so-called prescheduling technology to schedule known tasks in foregone environment, while dynamic scheduling depends on not only the foregone tasks, but also on the current system state to make scheduling plans. Dynamic scheduling is used in real life because a cloud system is dynamically changing over time. Task flow is uncertain and during the possible long time of execution, available resources with their quantity and form are changing all the way; resources capability, current load, interests and tasks’ requests are dynamic, too.
978-1-4673-0750-5/12/$31.00 ©2012 IEEE
Dynamic scheduling problem is an NP-complete problem, whose cost of precise computation will increase exponentially with the increase of the problem scale. In fact, we can just find out an approximate solution to NPcomplete problem . At present, three general algorithms become popular to deal with such problems: Simulated Annealing, Tabu Search and Genetic Algorithm. Among them, GA gains wide attention because of its simplicity, robustness and accuracy in finding the global optimum.