
Procedia Computer Science 68 (2015) 89 - 102

HOLACONF - Cloud Forward: From Distributed to Complete Computing

A model driven approach for supporting the Cloud target selection process

Aliki Kopaneli a,*, George Kousiouris a, Gorka Echevarría Velez b, Athanasia Evangelinou a, Theodora Varvarigou a

a National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece
b TECNALIA, Parque Tecnológico de Bizkaia, C/ Geldo, Edificio 700, E-48160 Derio, Spain

Abstract

The decision making process for the selection of one cloud target over another plays a major role during the migration to the Cloud, affecting not only the operational costs, functional characteristics and QoS, but also the development, monitoring and maintenance experience of the IT professionals. As the Cloud gains ground, a progressively growing number of cloud providers, services and technologies are exposed in the market, rendering the research and selection upon them complex and time consuming. Proposed efforts for automatic support fail to follow the quick pace of evolution, thus demanding even more effort for maintaining the supporting systems. In this paper the Cloud Target Selection (CTS) tool methodology and prototype implementation are presented, introducing a novel approach: the CloudML@artist modeling language is exploited as a representation of real-world cloud environments, becoming a source of information for an extensible decision making mechanism. The proposed work contributes towards the construction of an adaptive solution, which will follow the technological advances requiring the minimum of human intervention.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Peer-review under responsibility of Institute of Communication and Computer Systems.

Keywords: cloud computing; cloud modeling language; multi-criteria decision making; provider selection;

* Corresponding author. Tel.: +30-210-772-2546; E-mail address: alikikop@mail.ntua.gr

1877-0509 doi: 10.1016/j.procs.2015.09.226

1. Introduction

Cloud Computing, widely understood as a computing paradigm for managing and delivering services over the Internet [1], has been characterized by a continuous process of evolution. Both its conceptual context and the underlying technologies (e.g. virtualized resources and private networks) are subject to constant changes, thus setting up a race in search for expertise among different companies and IT workers. At the same time, more and more organizations decide to adopt cloud environments either by migrating their existing software or by creating brand new cloud compatible components. However, the decision making process for migrating the business logic to the Cloud can be quite puzzling considering all the factors that may affect the outcome. Estimating and identifying these factors usually leads to a cost versus benefit analysis [10]. This analysis may include human resources costs, time spent, operational costs related to the pricing policies, availability and reliability, cost minimization in terms of hardware infrastructure, power consumption and maintenance, disaster handling issues, elasticity and controlled responses to unexpected peaks [14, 11]. One of the most important decisions that must be taken in order to satisfy goals and constraints derived from these factors is which provider and/or which specific provider services will be adopted for the cloudified application.

An important difficulty in addressing this problem is the unfamiliarity with technical details of cloud environments among IT workers, despite the fact that Cloud computing is claimed to demand fewer IT skills [12] than hosting on-premise systems, and even though it presents itself as an evolution of older models that started to appear already in the late 60's [16]. A possible explanation is that the Cloud in its concrete form has only been made available to the enterprise world in 2006 [13], while numerous related research works, third party products and cloud platforms arise on a yearly or even monthly basis, up to this day. A vast number of public cloud providers are exposed to the market and the number of services and service types they offer is progressively augmenting. These conditions have intensified especially during the last years, when the frontiers between infrastructure (IaaS) and platform (PaaS) as a service have been broken, resulting in cloud platforms that can support and integrate features from both layers.

Under these circumstances, provider selection becomes an even more demanding task to carry through. The decision maker has to deal with two important issues. The first is the multi-objectivity of the migration itself, because of constraints and goals set by the application and the human factor interacting with it. The second is the diversity and constant mutability of Cloud computing environments, features, services and products. Considering these two aspects of the problem, the on-going work presented in this paper proposes a model-driven solution. It offers the end-user a high degree of control over the criteria which take part in the selection process, in order to exploit her overall experience and understanding of the software under development. At the same time it automates the process of identifying the features that could possibly be used as selection criteria, as dictated by the prevailing conditions in the field of Cloud computing. It also automates the process of identifying the providers that fulfill the criteria set by the user, as well as the process of combining the results towards solving the multi-criteria decision making (MCDM) problem and proposing the most beneficial solution.

Towards this direction, a structured source of knowledge (CloudML@artist [17]) has been exploited, which is designed to contain up-to-date information about real-world cloud providers and their offerings. On top of it, a prototype Eclipse plug-in called the Cloud Target Selection (CTS) tool has been designed and implemented. Through its interactive use, the end-user is presented with dynamically extracted information and provides input by expressing her preferences or demands (through a simple task of criteria selection and ranking). The input is analysed and a recommendation is produced about the most suitable cloud provider. The rest of this paper is organized as follows: Section 2 presents the requirements of the CTS tool and an overview of the methodology. Sections 3 and 4 present in detail every component of the methodology, each falling into one of two categories: model-driven and decision making. Section 5 is dedicated to the analysis of a prototype implementation and its evaluation. Finally, Section 6 describes a set of related works and Section 7 concludes the current work and discusses future plans.

2. CTS Requirements and overview

Functional requirements that can be extracted from the previous discussion include:

• F-Req1: Information extraction from an up-to-date source of knowledge (CloudML@artist metamodels).

• F-Req2: Ability to dynamically identify evolving services and cloud features. These services and features can potentially be used as criteria for the target selection process and from now on will be referred to as candidate selection criteria.

• F-Req3: Ability to group and present to the end-user the candidate selection criteria.

• F-Req4: Ability to capture the end-users' preferences over the candidate selection criteria. This requirement enables the identification of the actual selection criteria.

• F-Req5: Ability to capture the end-users' preferences over the importance of the groups of the candidate selection criteria. This requirement allows the user to have a higher degree of control over the selection process.

• F-Req6: Ability to allow the end-user to determine which cloud providers will take part in the selection process, among the ones which are supported by the tool. From now on, the providers selected by the user will be referred to as candidate cloud providers.

• F-Req7: Ability to evaluate each cloud provider according to the degree to which the criteria are fulfilled.

Furthermore, some non-functional requirements have been identified:

• NF-Req1: Minimization of the time needed for the provider selection.

• NF-Req2: High degree of usability (including learning curve).

• NF-Req3: High degree of extensibility.

• NF-Req4: High degree of adaptability by minimizing the degree of dependency between the CTS tool and the information held in the source of knowledge (request for generalization).

In order to satisfy these requirements, the tool is designed as presented in the overview provided by Figure 1. More precisely, Figure 1b presents the functionality of the user interface, which can be thought of as the steps which are visible to the end-user. The yellow boxes stand for the CloudML@artist input while the blue bubbles present all the actions that need to be performed in order to allow the interaction between the tool and the user. This functionality is realized by the methodology followed by the CTS tool, which is presented in the diagram of Figure 1a through three diagram components: the contribution of the source of knowledge (yellow boxes), the methodology steps (pink boxes) and the outputs presented to the end-user (blue boxes). In a few words, the methodology can be summed up as:

• Step1: Exploit the source of knowledge, in order to extract the candidate selection criteria.

• Step2: Transform the candidate selection criteria into a convenient data structure and present them to the user.

• Step3: Capture user's preference upon these criteria in order to extract the actual selection criteria.

• Step4: Transform the actual selection criteria into queries to be performed towards the source of knowledge.

• Step5: Exploit the source of knowledge in order to perform these queries.

• Step6: Combine the query results with the user's preferences in order to assign scores to the providers and make the final decision.
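The six steps above can be sketched end-to-end in code. This is an illustrative sketch only: the function names and the flat dict standing in for the CloudML@artist metamodels are hypothetical, and the real tool operates on UML models inside Eclipse.

```python
# Illustrative sketch of the six CTS methodology steps; all names and the
# dict-based "source of knowledge" are hypothetical stand-ins.

def extract_candidate_criteria(knowledge):                      # Step 1
    """Collect every feature name exposed by the source of knowledge."""
    return sorted({f for feats in knowledge.values() for f in feats})

def run_cts(knowledge, user_selected, candidate_providers):
    candidates = extract_candidate_criteria(knowledge)          # Step 2: presented to user
    actual = [c for c in candidates if c in user_selected]      # Step 3: user's picks
    queries = [("has_feature", c) for c in actual]              # Step 4: criteria -> queries
    scores = {}
    for p in candidate_providers:                               # Step 5: query each provider
        hits = sum(1 for _, c in queries if c in knowledge[p])
        scores[p] = hits / len(queries) if queries else 0.0     # Step 6: combine into score
    return max(scores, key=scores.get), scores

knowledge = {"ProviderA": {"scale-up", "monitoring"},
             "ProviderB": {"scale-up"}}
best, scores = run_cts(knowledge, {"scale-up", "monitoring"},
                       ["ProviderA", "ProviderB"])
```

Here ProviderA fulfils both selected criteria and ProviderB only one, so the recommendation goes to ProviderA.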

The methodology steps, the algorithms and the components of the tool, are subject to a conceptual categorization and thus will be presented in two different sections each one corresponding to one of the identified clusters:

• Model driven: As mentioned before, the source of knowledge exploited is the CloudML@artist modeling language, which is formed by a set of metamodels as will be discussed in subsection 3.1. As a result, every component that is related to information extraction, information processing, data structures and model querying will fall into this category.

• Decision making: This cluster includes all the components that contribute towards the direction of setting the actual selection criteria and ranking the candidate cloud providers in order to obtain a final recommendation for the most suitable cloud provider.

Figure 1 (a) Overview of the CTS methodology; (b) Model-user interaction

3. Model driven components and steps of the CTS methodology

3.1. Source of knowledge

First, a source of knowledge is needed as the basis from which the tool will obtain information. This role is served by CloudML@artist, a set of UML profiles (based on and extending CloudML) which, when applied to one another, compose a model hierarchy. Each profile contains a set of relevant stereotypes that help capture and describe a subset of features or typical characteristics. The resulting structure serves the purpose of describing a concrete cloud environment. Technically, the profiles defined in the CloudML@artist metamodel fall into three categories. This categorization (Table 1) has been one of the main considerations for their exploitation by the CTS tool, and includes the basic Core profile, containing main cloud platform characteristics; the supporting profiles, focused on specific aspects of a cloud offering or provider (in terms of measured performance, availability, cost etc.); and the concrete provider profiles, which inherit from the previous two categories and concretize the specific features. These concrete instances can each be considered a representation of a cloud provider expressed via the stereotypes of categories A and B. Profiles of type A provide basic information when applied on profiles of type C, meaning that they cover functional aspects of their description. In the same respect, each profile in category B defines stereotypes covering different aspects when compared with each other. For example, stereotypes of the benchmarking profile, when applied on an element of a provider's profile, will stand for the concluding results of a benchmarking process carried out on this exact provider. So, given that the profiles are kept up to date, these metamodels will be an easy-to-use source of knowledge which contains information that a user would otherwise have to extract after days of investigation through web sites and tutorials. This aspect contributes towards F-Reqs 1, 2, 3 and 7.
In addition, the exploitation of the CloudML@artist in the CTS methodology is essential for achieving all the non-functional goals set in section 2. Its importance will be highlighted throughout the paper and the details of its contribution will be made clearer through the description of the rest of the model driven components or steps.

Table 1. Profile Categories and Examples

Category A (Core Profile). The Core Profile is the base profile of CloudML@artist. It models the main cloud characteristics from the point of view of a cloud platform, containing stereotypes and datatypes which can be common among all cloud providers. In addition, it contains 3 sub-profiles which model the specific characteristics of IaaS, PaaS and SaaS environments, by defining the corresponding stereotypes and data types.
Example: the "IaaSInstanceType" stereotype (Category A) has been applied to "M1MediumInstance" (Category C). Following this stereotype application, specific values for properties such as memory and network performance are included in every "M1MediumInstance" instantiation.

Category B (Supporting profiles). Supporting profiles extend the core profile and define stereotypes for describing other cloud aspects. For the time being, four supporting profiles have been defined, covering the aspects of availability, benchmarking, pricing and security correspondingly. The contained stereotypes can be applied on stereotypes of the providers' profiles, thus contributing to the total process of modeling the cloud providers' specific characteristics.
Example: after the benchmarking process is finished, and if needed, the stereotype "YCSBResults" (Category B) can be applied to the "M1MediumInstance" (Category C). This way, concrete values for properties such as throughput and latency of the YCSB benchmark tests performed on this type of instance will be included in the description of Amazon WS.

Category C (Providers' profiles). Providers' profiles aim at modeling specific cloud providers. For the time being, three cloud providers are supported and thus three profiles have been created: Google App Engine, Windows Azure and Amazon EC2.
Example: the Amazon EC2 profile (Category C) contains the definition of a stereotype called "M1MediumInstance". This stereotype, upon instantiation, will stand for an actual VM instance of m1.medium type.
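The stereotype-application pattern of the example above can be mirrored in plain data structures. The sketch below is a hypothetical Python analogue, not the UML2 API the tool actually uses; the property values are illustrative.

```python
# Hypothetical mirror of the Table 1 example: a category-A stereotype and a
# category-B stereotype applied to an element of a category-C provider profile.
from dataclasses import dataclass, field

@dataclass
class StereotypeApplication:
    stereotype: str          # e.g. "IaaSInstanceType" (A) or "YCSBResults" (B)
    properties: dict

@dataclass
class ProviderElement:       # an element of a category-C profile
    name: str
    applied: list = field(default_factory=list)

    def apply(self, stereotype, **props):
        self.applied.append(StereotypeApplication(stereotype, props))

    def value(self, stereotype, prop):
        for app in self.applied:
            if app.stereotype == stereotype:
                return app.properties.get(prop)
        return None          # stereotype never applied to this element

m1_medium = ProviderElement("M1MediumInstance")
m1_medium.apply("IaaSInstanceType", memory_gb=3.75, network="moderate")
m1_medium.apply("YCSBResults", throughput_ops=4200, latency_ms=11.0)
```

The key point the sketch illustrates is that functional properties (category A) and evaluation results (category B) attach to the same concrete provider element.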

3.2. Candidate providers

Depending on the effort spent on the maintenance and extension of CloudML@artist, a large number of providers could be included as candidates for the selection process. As can be predicted, as the number of supported providers augments, the performance will decrease. Thus, the requirement of selecting the candidate providers has been added (F-Req6).

The fulfillment of this requirement is realized by a simple search over the source of knowledge for the available profiles of type C. This way, the supported providers are identified and presented to the end-user. At the same time, the user interface enables the user to make selections among the supported providers. These selections will form the set of the candidate cloud providers which will actually take part in the cloud target selection process. The candidate providers' selection can be seen as part of the functionality of the user interface as described in Section 2 (Figure 1b).
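The discovery-then-filter behaviour described above can be sketched briefly. The profile list below is a hypothetical stand-in for the model files the tool actually scans.

```python
# Sketch of F-Req6: find category-C profiles in the source of knowledge,
# then keep only the providers the user has selected.
profiles = [("CoreProfile", "A"), ("BenchmarkingProfile", "B"),
            ("GoogleAppEngine", "C"), ("WindowsAzure", "C"), ("AmazonEC2", "C")]

def supported_providers(profiles):
    """Supported providers are exactly the category-C profiles."""
    return [name for name, category in profiles if category == "C"]

def candidate_providers(profiles, user_selection):
    """Candidate providers are the user's picks among the supported ones."""
    return [p for p in supported_providers(profiles) if p in user_selection]

cands = candidate_providers(profiles, {"AmazonEC2", "WindowsAzure"})
```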

3.3. Groups of candidate selection criteria

The candidate selection criteria are grouped into categories as indicated by F-Req3. These categories are not produced by the tool itself. On the contrary, they are defined by and extracted from the source of knowledge. The way this grouping becomes feasible is described in the following sections.

3.4. Candidate selection criteria extraction

The use of the CloudML@artist metamodels as the source of knowledge begins from the very early phase of extracting the candidate selection criteria (Step 1 of the methodology). This becomes feasible through a mapping between the metamodel elements and the elements of a new data structure which will be referred to as the intermediate model. The name is derived from the fact that this data model is used as an exchange data format between the source of knowledge (only profiles of categories A and B) and the candidate selection criteria displayed to the end-user. The intermediate model can be viewed through the diagram of generalizations in Figure 2. Every element of the intermediate model is a Model Element through inheritance. All elements that represent candidate selection criteria are Leaf Elements and all the elements that must be kept as information but are not identified as candidate selection criteria are Helper Elements.

For the extraction process, the mapping between the CloudML@artist elements and the intermediate model is presented in Table 2. It is essential to mention that the mapping does not apply to all the elements of the CloudML@artist metamodel. An extraction example will clear things up:

Candidate selection criteria extraction example: Let's assume that we are working with the core profile (category A) and we are trying to extract any candidate selection criteria from it. The Core profile incorporates a stereotype called Common Features, which contains features that are common to all providers. This stereotype is marked as containing features that are eligible to be included in the set of candidate selection criteria. It is thus mapped to a Helper Element. This means that it is kept as an entity but will not be one of the leaves presented to the end-user for selection. In addition, it will serve as an individual group of criteria (as described in sub-section 3.3). As a next step, the Common Features stereotype is scanned for properties of types Enumeration, Boolean, or High Level Evaluation. The last type is defined in the context of the metamodel and takes three values: poor, average and extensive. Other types, such as numerical types, will not be included in the set of criteria because they cannot be handled similarly (for instance, numerical values need a definition of whether a high value is preferable to a low one, and this information is not contained in the metamodels). Advancing with our example, every property of the aforementioned three types is automatically set as a criterion, and is mapped to one of Enumeration Property, Leaf Property or Leaf HLE Property (Table 2). Boolean and HLE types are directly included in the set of leaves, as they stand for features that either exist or not for a provider (Boolean), or are marked as good or bad for a provider (HLE). Enumeration types, though, need further expansion in order to be presented to the user. For instance, the property "Availability Zones" is not a selection criterion by itself. If expanded, it can enumerate a set of availability zones, each of which can be set as a criterion by the user, e.g. "Set the criterion of the existence of an availability zone in Oceania".

Before moving on to the next components, two points must be highlighted. First, every class of the intermediate model keeps information that will be used in the next step of the reverse process. In this reverse process (methodology step 5), the profiles of types A and B are not re-read (mostly for memory optimization purposes). Second, the incorporation of the intermediate model into the CTS tool works towards satisfying NF-Reqs 2, 3 and 4. It is extensible itself (new mappings and model elements can be created without altering the existing ones) and it enables adaptability to the evolving cloud environments, as the criteria are extracted by their definition types. For instance, if the feature of availability zones is no longer present in cloud environments after some years, the fact will be captured by the source of knowledge maintainer and it will disappear from the candidate selection criteria of the CTS tool without any source code alterations. Finally, it enables the task of criteria grouping, thus increasing the degree of usability of the tool.
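The type-driven extraction rule of the example can be sketched as follows. The tuple-based property descriptions are hypothetical simplifications of the real UML property objects.

```python
# Sketch of the Step-1 extraction rule: only Boolean, High Level Evaluation
# (HLE) and Enumeration properties become leaves; other types (e.g. numeric)
# carry no preference ordering in the metamodel and are skipped.
def extract_leaves(stereotype_name, properties):
    """properties: list of (name, type, enum_literals) tuples (hypothetical)."""
    leaves = []
    for name, ptype, literals in properties:
        if ptype in ("Boolean", "HighLevelEvaluation"):
            leaves.append((stereotype_name, name))          # Leaf / Leaf HLE property
        elif ptype == "Enumeration":
            for lit in literals:                            # expand into Leaf Enumeration Values
                leaves.append((stereotype_name, f"{name}={lit}"))
        # numeric and other types: skipped, no ordering info in the metamodel
    return leaves

props = [("scale-up", "Boolean", None),
         ("monitoring", "HighLevelEvaluation", None),
         ("AvailabilityZones", "Enumeration", ["Europe", "Oceania"]),
         ("maxInstances", "Integer", None)]
leaves = extract_leaves("CommonFeatures", props)
```

Note how "AvailabilityZones" is not itself a leaf: it expands into one leaf per literal, matching the Oceania example in the text.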

Table 2. Mapping from CloudML@artist to the intermediate model

Intermediate model (every element is a Model Element): CloudML@artist
Helper Element: all profiles or stereotypes (plus non-model elements logically corresponding to groupings)
Enumeration Property: property of type "Enumeration"
Service Element: stereotype named "Service"
Leaf Element: (no entity; serves as generalization for the leaves of the tree)
Leaf Enumeration Value: Enumeration Literal
Leaf Property: property of type "Boolean Primitive Type"
Leaf High Level Evaluation Property (Leaf HLE): property of type "Enumeration" of type "High Level Evaluation"

Figure 2 Intermediate model

3.5. Actual selection criteria into model queries

The candidate selection criteria are presented to the end-user through the UI and she selects the actual selection criteria. After the selection has been made, the selected criteria must be translated into model queries which will be applied to the metamodels of category C (methodology step 4). As was previously discussed, these metamodels can be seen as representations of providers. Up to this point of the research, three types of queries have been identified:

• High Level Evaluation (HLE) Query: This type of query is performed when a Leaf HLE property of the intermediate model is set as a criterion, which in turn is a property of a stereotype or profile (Service Element or Helper Element). This query is executed in two steps. The first step is to certify that this type of element has been evaluated for each of the selected providers and the second is to get the results of the evaluation. For instance, if the user has set "monitoring" as a criterion, a query must be formed in order to check if the profiles of all the providers contain evaluation of the monitoring property and if so, to return the result of this evaluation (poor, average or extensive).

• Boolean Query: This type of query is performed when a Leaf Property of the intermediate model is set as a criterion. It corresponds to a search as to whether this property exists for this provider or not. For instance, if the user has set the scale-up property as a criterion, the query must be formed in order to provide information about the existence of this feature in the set of cloud features of each provider.

• Composite Query: This type of query is performed when a Leaf Enumeration Value is set as a criterion. This means that the user has selected one or more values of an Enumeration Property which is in turn a property of a stereotype (Service element or Helper Element). As a result, a structure is used in order to group all the requested (by the user) values under the same Enumeration Property so as to access the corresponding applied stereotype just once for each provider profile during the query execution.

The model query types can be extended for supporting other selection criteria as the maturity of the tool advances.
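The three query types can be sketched as functions over a provider description. The dict-based provider below is a hypothetical stand-in for a category-C UML profile; the real queries run against applied stereotypes.

```python
# Sketch of the three identified model-query types.
provider = {"monitoring": "average",               # HLE-evaluated property
            "features": {"scale-up"},              # Boolean features
            "AvailabilityZones": {"Europe", "US-East"}}

def hle_query(provider, prop):
    """Two steps: check the property was evaluated, then return the evaluation."""
    return provider.get(prop)                      # None if never evaluated

def boolean_query(provider, feature):
    """Does this feature exist for the provider?"""
    return feature in provider["features"]

def composite_query(provider, enum_prop, requested_values):
    """All requested literals of one Enumeration property, checked in one pass."""
    found = provider.get(enum_prop, set()) & set(requested_values)
    return len(found), len(requested_values)       # n found out of m requested
```

Grouping all requested literals of one Enumeration property into a single composite query mirrors the text's point that the applied stereotype is accessed just once per provider profile.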

3.6. Query execution

After the queries are formed and grouped, they must be executed on each one of the selected provider profiles (methodology step 5). The challenging aspect in this part of the algorithm revolves around the handling of memory, as the decision to use metamodels as the source of knowledge transforms NF-Req 1 for performance into a memory management problem. The two principles followed are: do not load more than one model resource at a time, and do not keep a whole model resource loaded in memory. The data structures which contribute to this direction are:

• The intermediate model described in subsection 3.4 is the most important of them, as it allows profiles of categories A and B not to remain loaded for long.

• The Target Profile structure allows each provider's profile (profiles of category C) to be loaded just once during the query execution and holds a representation of the resulting score assigned to each one of them.

• The request structure holds all the information required by a model query to be performed (e.g. applied stereotype, property of the stereotype, required value of the property in case of composite query).

• The response structure holds all the information required from each type of response (e.g. how many property values were found out of the selected ones).

So, the process followed is that every provider profile is loaded and transformed into a Target Profile structure. Then, every query is sent as a request, executed over the Target Profile and sent back as a response. The same process then takes place for the next provider profile.
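The one-profile-at-a-time loop can be sketched as below. The loader and the callable requests are hypothetical simplifications of the request/response structures described above.

```python
# Sketch of the memory-conscious execution loop: each provider profile is
# loaded once, all requests are answered against it, and the profile is
# released before the next one is touched.
def execute_queries(provider_names, requests, load_profile):
    results = {}
    for name in provider_names:
        target = load_profile(name)            # only one profile in memory at a time
        responses = [req(target) for req in requests]
        results[name] = responses              # keep the responses, drop the profile
        del target
    return results

# Toy usage: requests are callables over the loaded profile.
store = {"AmazonEC2": {"scale-up": True}, "WindowsAzure": {"scale-up": False}}
requests = [lambda t: t.get("scale-up", False)]
results = execute_queries(list(store), requests, lambda n: dict(store[n]))
```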

An issue which emerges while querying each provider's profile, and which has a strong impact on the performance of the mechanism, is the importance of performing all the queries over the same applied stereotype in a row. This means that applied stereotypes on a provider's profile do not have to be searched again and again by their names or their types. Instead, when the applied stereotype is detected, a reference is kept so as to be used for every related query.

The query execution is the last methodology step which deals with UML metamodels. The next steps contribute to the decision making part of the CTS tool by utilizing all the results produced by the usage of the tool up to this point.

4. Decision making mechanism

This is the last part of the CTS tool and is still under research; it is currently implemented in a naïve manner. The intention is, in the future, to deploy multi-criteria decision making algorithms in order to produce the final suggestion. The decision making process, for the purposes of this work, unfolds in two layers, which aim to combine different types of information.

4.1. Layer 1: Combine the results for one group of criteria

Each group of criteria (structured in a single tree structure and presented through a single view) must return an evaluation of each provider according to the user's preferences. Up to this point, the adopted strategies have managed to provide information about which criteria are fulfilled. Now, this information must be combined in order to produce a kind of "score" for every provider, based upon the results for one single group of criteria. The current approach assumes that every criterion has the same importance as all the others. As a result, a score is assigned to each provider according to the percentage of compatibility with the requirements. More specifically, each of the three categories of queries is given a unique way of producing score components. The score assignment is described in Table 3.
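The equal-importance scoring described here can be sketched directly: each query result contributes a component s_i, and the group score is their average (matching the "percentage of compatibility" reading; the list-of-tuples result encoding is an assumption of this sketch).

```python
# Sketch of the Layer-1 score: every criterion weighs the same, and the
# provider's group score is the mean of the per-criterion components s_i.
HLE_SCORES = {"poor": 0.0, "average": 0.5, "extensive": 1.0}

def layer1_score(query_results):
    """query_results: list of ('hle', value) | ('bool', exists) | ('composite', n, m)."""
    components = []
    for res in query_results:
        if res[0] == "hle":
            components.append(HLE_SCORES[res[1]])      # poor/average/extensive -> 0/0.5/1
        elif res[0] == "bool":
            components.append(1.0 if res[1] else 0.0)  # feature exists or not
        elif res[0] == "composite":
            _, n, m = res
            components.append(n / m)                   # n requested literals found out of m
    return sum(components) / len(components) if components else 0.0

score = layer1_score([("hle", "average"), ("bool", True), ("composite", 1, 2)])
```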

However, this approach will soon be revised and ultimately changed, together with the user interface, so that the user is able to select not only which criteria will be taken into account during the evaluation of the providers, but also how important each criterion will be for the final decision.

Table 3. Score assignment for one group of criteria

HLE query:        poor: si = 0;  average: si = 0.5;  extensive: si = 1
Boolean query:    exists: si = 1;  does not exist: si = 0
Composite query:  found n out of m (n ≤ m): si = n/m
Total score:      S = (Σ si) / N, where N is the number of selected criteria in the group

Table 4. Examples of conjoint analysis tables for two different users

User 1        SG Perfect   SG Medium   SG Low
CFG Perfect   1            2           3
CFG Medium    4            5           6
CFG Low       7            8           9

User 2        SG Perfect   SG Medium   SG Low
CFG Perfect   1            2           5
CFG Medium    3            4           7
CFG Low       6            8           9

4.2. Layer2: Combine the results of Layer 1

Having produced suggestions for each group of criteria separately is not enough for the final decision to be taken. As a result, an intelligent method should be adopted in order for the results of Layer 1 analysis to be combined and allow the extraction of safe conclusions.

4.2.1. The Weighting factors (Implemented)

This type of problem could be easily solved just by assigning weights to each of the groups of criteria. For instance, let's assume that, in the problem, there are three different groups of criteria namely presented as follows: Service Group (SG) is the GR1 and stands for the criteria related to the existence of services and offerings. Cloud Features Group (CFG) is the GR2 and stands for the criteria related to the existence of specific cloud features e.g. scale-up. Benchmark Features Group (BFG) is the GR3 and stands for the criteria related to benchmark results.

Let's also assume that the user has set weights for each group as WGR1, WGR2 and WGR3, and that the scores produced in Layer 1 for a specific user are S1GR1, S1GR2, S1GR3 for Provider1 and S2GR1, S2GR2, S2GR3 for Provider2 respectively. Then the final score for each provider would follow the equation:

FS1 = WGR1 · S1GR1 + WGR2 · S1GR2 + WGR3 · S1GR3 (1)
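Equation (1) is a plain weighted sum, sketched below with illustrative weights and Layer-1 scores (the numeric values are assumptions, not taken from the paper).

```python
# Equation (1) as code: the final score combines the Layer-1 group scores
# with the user-set group weights.
def final_score(weights, group_scores):
    """weights, group_scores: dicts keyed by group name (e.g. SG, CFG, BFG)."""
    return sum(weights[g] * group_scores[g] for g in weights)

weights   = {"SG": 0.5, "CFG": 0.3, "BFG": 0.2}   # illustrative user weights
provider1 = {"SG": 1.0, "CFG": 0.5, "BFG": 0.0}   # Layer-1 scores, Provider1
provider2 = {"SG": 0.6, "CFG": 1.0, "BFG": 1.0}   # Layer-1 scores, Provider2
fs1 = final_score(weights, provider1)
fs2 = final_score(weights, provider2)
```

With these weights, Provider2's strong CFG and BFG results outweigh Provider1's perfect SG score.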

4.2.2. Conjoint analysis (Not yet implemented)

The problem discussed in 4.2.1 can also be addressed with a variation of conjoint analysis, which will allow the user to express not only the degree of preference for an individual group, but also the relation between pairs of groups. This method will be explained through the previous example. First, some definitions are revised to match the conjoint analysis approach.

• Each group of criteria will be considered as a single criterion. The set of all the available criteria will be denoted as C which in the example would take the form: C = {SG, CFG, BFG}

• Score categories will be created, imitating fuzzy logic. The set of all the available score categories will be denoted as S, which in the example would take the form: S = {"perfect", "medium", "low"}

• Score categories will be considered as criteria alternatives, meaning that each criterion can have one of the three alternatives. From the above, a set of pairs is defined as:

A = C × S

After the alternatives have been set (there could be three or four, according to the deviation of the results), the user is asked to set priorities on the different combinations of scores, for every pair of criteria, by completing pair tables such as the ones shown in Table 4. These two tables represent the priorities set by two different users for the pair of the SG criterion and the CFG criterion. Such tables are called trade-off tables: they denote what the users are willing to sacrifice in one aspect of the problem in order to gain an advantage in another.

Applying conjoint analysis on these trade-off tables results in a score for each alternative of each criterion. So, according to the definitions given specifically for the conjoint analysis, the result would be a function G: A → N returning a natural number for each pair (a, b) ∈ A.

Finally, scores will be assigned to each of the providers. For instance, consider Amazon Web Services and Microsoft Azure as the two candidate providers. Each of them is assigned a subset of A; let those subsets be A1 and A2 respectively. For the specific example used throughout this section, each of these subsets will contain three elements, one for each of the three criteria. The final score assigned to the ith provider (i = 1, 2) will be calculated as:

FS_i = Σ_{j=1...|A_i|} G(a_j)    (3)
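The scoring step of equation (3) can be sketched as follows (a Python illustration; the part-worth utilities G are assumed to have already been derived from the trade-off tables, and all values below are made up for the example):

```python
# G: (criterion, score category) -> utility, as derived by conjoint analysis.
# The numbers here are invented; a real run would compute them from the
# trade-off tables filled in by the user.
G = {
    ("SG", "perfect"): 5, ("SG", "medium"): 3, ("SG", "low"): 1,
    ("CFG", "perfect"): 4, ("CFG", "medium"): 2, ("CFG", "low"): 1,
    ("BFG", "perfect"): 3, ("BFG", "medium"): 2, ("BFG", "low"): 0,
}

def provider_score(alternatives):
    """FS_i = sum of G(a_j) over the provider's alternatives A_i (eq. 3)."""
    return sum(G[a] for a in alternatives)

# A1, A2: one (criterion, category) pair per criterion for each provider
A1 = {("SG", "perfect"), ("CFG", "medium"), ("BFG", "low")}      # e.g. AWS
A2 = {("SG", "medium"), ("CFG", "perfect"), ("BFG", "medium")}   # e.g. Azure

fs1 = provider_score(A1)  # 5 + 2 + 0 = 7
fs2 = provider_score(A2)  # 3 + 4 + 2 = 9
```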

Figure 3 General Features view (checkbox selection of general cloud features, e.g. scaleUp, scaleOut, freeTier, loadBalancing, pricingPlan, certification, scope, availabilityZones, usageType)

5. Prototype implementation and Evaluation

The algorithms and methods described as levels of the CTS process in the previous section are implemented in a first prototype, including a user interface which allows human interaction and functions as a proof of concept for the algorithms, especially for testing the model compatibility and the querying ability of the designed tool. The tool is implemented in Java, as an Eclipse plug-in, thus exploiting both the Eclipse modeling advantages and the distribution of CloudML@artist as an Eclipse plug-in.

5.1. Discussion upon the evaluation strategies

Figure 4 Service Group view (checkbox selection of service-derived criteria grouped under IaaS, PaaS and SaaS, e.g. IaaSStorageService, IaaSProcessingService, RuntimeService, SoftwareService, ApplicationContainerService, SqlStorageService, NonSqlStorageService, InstancesService)

A decision support system must first be evaluated for its ability to make trustworthy decisions. This is usually achieved in two ways. The first is to determine the distance between the correct output and the one produced by the tool. In the case of the CTS tool this cannot be applied, as there is no single correct answer; on the contrary, cloud experts would strongly disagree upon the best solution. The second way is to define a set of criteria upon whose fulfillment the evaluation of the output will take place. For instance, when the decision is whether to sell product A instead of product B, the criterion could be a rise in profits after the decision is applied. This strategy does not apply to the CTS tool case either, because the only criteria that can be defined are the ones set by the user; however, these are exactly the criteria that in combination produced the outcome. So if the tool functions reliably (meaning that the results of the model queries are correct), the evaluation would be self-provable and would always return positive feedback. Alternatively, the evaluation of the decision could be based upon real-life usage statistics coming from the opinions of end-users who would use the tool, adopt the decision produced by it, and evaluate it after a significant amount of time. This, again, does not apply here, as the source of knowledge is at an experimental stage, meaning that commercial cloud environments have not yet been fully incorporated so as to lead to provider scores that can be effective in real-world environments.

Under these circumstances, the evaluation of the tool will revolve around three aspects:

• Evaluation of the extracted candidate selection criteria and their groupings

• Reliability of the query results

• Fulfillment of the non-functional properties set in section 2

5.2. Evaluation of the CTS tool

After applying the methodology described in section 2 up to the step of extracting the candidate selection criteria, the user is presented with two Eclipse views. The first view contains the group of candidate selection criteria derived from Service stereotypes (a process explained in section 3.4), thus forming the Service group (Figure 4). The second view contains the group of candidate selection criteria derived from the Common Features stereotype, thus forming the General features group (Figure 3). The extracted criteria constitute, apart from a proof of concept for the functionality described in section 2, a good insight into the potential of the tool for dynamically identifying features that can be helpful in the decision making process.

Furthermore, as discussed in subsection 5.1, the minimum requirement for the final decision to be considered trustworthy is the reliability of the query results. Reliable query results mean that the scores assigned to the providers are in accordance with the criteria set by the end-user for the selection, which in turn means that the final decision is as close as possible to the user's preferences. In this direction, a logging mechanism has been created in order to record queries and responses. The log file has a structure dictated by the query execution algorithm: there is a record for every cloud provider (represented in the query execution algorithm by the Target Profile structure) and each record contains one field for every query performed upon this cloud provider, together with the results. The results are represented by individual scores, as described in Table 3 for each query category. Several tests have been performed with different testing versions of Provider Profiles (profiles of category C), created by applying different stereotypes of the Core Profile to the Provider Profiles. The results showed that the recorded queries were the ones set by the actually selected criteria, while at the same time the response to every query was accurate.
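The log structure described above can be sketched as follows (a Python illustration with hypothetical field and query names; the tool itself is a Java Eclipse plug-in, so this is only a data-shape sketch):

```python
from dataclasses import dataclass, field

@dataclass
class ProviderLogRecord:
    """One record per cloud provider (Target Profile), holding the executed
    queries and their individual scores (as in Table 3)."""
    provider: str
    queries: dict = field(default_factory=dict)  # query description -> score

# One record per provider, keyed by provider name (hypothetical entries)
log = {}
name = "Amazon_Web_Services"
rec = log.setdefault(name, ProviderLogRecord(name))
rec.queries["supportsFederation (Boolean)"] = 1.0
rec.queries["pricingPlan (Composite, 2 of 3)"] = 2 / 3
```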

The final and most important step of the evaluation is to investigate aspects related to the non-functional requirements of the tool.

• Usability: To carry out the CTS process a user has to: (1) open the desired view or views; (2) check the boxes corresponding to the criteria of her selection; (3) open the view for the selection of the candidate cloud providers; (4) check the boxes corresponding to the providers that will take part in the CTS process; (5) press the evaluation button; (6) set the weights for each group of criteria; (7) view the results; and (8) repeat with different input if necessary. The only part of the process where errors have been encountered is when the user forgets to select among the supported cloud providers; this results in the execution of the evaluation process for every one of the available providers.

• Extensibility: The fact that the tool is organized in groups of criteria and Eclipse views allows new groups to be created, opening up the extraction of further candidate selection criteria. More specifically, the tool can be extended in terms of groups of criteria (by exploiting stereotypes other than the Service stereotype and the Common Features stereotype) and in terms of intermediate model features (by identifying other types of mappings and adding more leaves to the intermediate model). This means that in the future other types of properties can also be included in the family of selection criteria, which might in turn dictate extensions to the query types and the score categories; this is entirely possible without altering the implementation of the current structures.

• Adaptability: From the beginning of the tool's construction, the most important motivation has been the ability of the tool to adapt to new conditions in the field of cloud computing. As can be deduced from the way the model driven components utilize the source of knowledge (subsections 3.1 and 3.2), changes in the CloudML@artist metamodels will automatically alter the CTS tool without any intervention in the tool's source code. For instance, new services or new features might be introduced in the field of Cloud Computing. If these changes are captured by the source of knowledge, they will automatically be captured by the CTS without human intervention, as indicated by the following example: if the IaaSStorageService is removed from the CloudML@artist modeling language, it will also disappear (automatically) from the view presented in Figure 4.

• Performance: The most time consuming parts of the CTS process are the ones that need access to the source of knowledge, namely the candidate selection criteria extraction (methodology steps 1 and 2) and the query execution (methodology step 5). The following performance metrics can be obtained through experimental measurements:

o Average Query Execution time (QE), different for each Query type (subsection 3.5)

o Average Response Time for loading one Provider's Profile (PP)

o Average Response Time for Candidate Selection Criteria Extraction (TotalExT)

o Average Response Time for loading the Core Profile (CP)

It must be noted that the results for QE and PP, if combined, give the response time for methodology step 5, while the result for TotalExT is the response time of methodology steps 1 and 2. The average values of the experimental results are presented in Table 5. The experiments were conducted on a machine with 2 CPU cores at 1.70 GHz and 7.7 GiB of physical memory. As can be seen, the performance bottleneck is the model loading. This justifies our decision to load every model resource just once, as well as our decision to keep the important information in the intermediate model (stored in memory) instead of reloading the model resource every time the tool needs access to the already parsed information.

Finally, it must be noted that depending on the number of selection criteria the user has set, the response time of methodology step 5 can vary. The same applies to the TotalExT which varies depending on the version of CloudML@artist used (as mentioned before CloudML@artist must constantly evolve in order to be up-to-date).

Table 5 Performance time experimental results

              QE        PP        TotalExT    CP
  Average RT  <1 ms     2.75 s    7.33 s      4.72 s

  Comments:
  QE:       Boolean queries seem to be the fastest.
  PP:       Measured for experimental versions of the profiles.
  TotalExT: CP time is included; this RT will increase when extending the tool with more groups of criteria.
  Note:     TotalExT - CP gives the RT for the model parsing phase of the criteria extraction.

6. Related Work

In the last years there has been a great amount of effort to address the issue of diversity and constant mutability of cloud services and cloud providers. Many ideas and approaches have been published which for the purposes of this work are grouped into three categories.

Measuring, Benchmarking and comparing: One way of approaching the selection problem is to test and evaluate the available choices and finally go for the best one. Some testing methods have been developed and used in this direction: A. Li et al. [2] present a mechanism which aims at comparing public cloud providers. The comparison is held upon three types of services: computing, storage and networking. A set of metrics is defined for each type of service and different types of workloads are used for the actual testing of the providers. Other works include research on the identification of metrics that can prove valuable for the evaluation of cloud services [3], and the construction of a framework together with a set of benchmarking algorithms aiming at the statistical evaluation of PaaS environments [6]. Finally, in another work [15], a framework for measuring and representing service performance across different application types is introduced. The representation of the results contributes to CloudML@artist, which means that it can be easily integrated with the CTS tool, introducing, thus, an additional group of criteria.

Selection of Cloud services and Cloud providers: Another family of related works, more targeted at the heart of the problem, follows a concrete approach: a source of knowledge (database, repository, human input or model) is used, the decision criteria are set, the information coming from the source is combined with the actual requirements and a recommendation is produced. More specifically, a model-based approach [4] introduces and utilizes cloud feature models in order to represent cloud services, cloud features and requirements, together with a methodology which maps the modeled requirements onto the modeled services in order to find the best fitting solution. In another work [5], a framework is proposed for gathering quality of service information from three different sources: users' opinions, providers' specifications and third party monitoring results. The decision maker sets the QoS criteria and a recommendation is made by analyzing the QoS history and the user's preferences with the use of an MCDM algorithm (two algorithms are tested). In the same direction, V. Andrikopoulos et al. [9] present a decision support mechanism for the migration of an application, which uses a knowledge database, selects a set of candidate targets according to the user's preferences and finally adopts the most suitable one by minimizing the predicted costs; again, an MCDM algorithm is used. The issue is also addressed in the context of the Contrail project [18], where the Contrail Federation component enables access to multiple cloud providers. The user provides the application description (in the OVF standard format) and SLA terms for QoS and QoP in order for Contrail to negotiate and select the most suitable provider of the federation. Finally, the ASCETIC project [19] currently promises to offer functionality for provider selection based on performance, energy and ecological constraints at the provider, application and virtual machine level.

Search engines: The final approach presented deals with the lack of information about cloud providers and services, which is the main weakness of the works described in the previous paragraph. Two works are selected which try to automate the process of extracting information about cloud providers and services from the web: the first [7] introduces a search engine for detecting services fulfilling the user's requirements; in addition, a cloud ontology has been constructed which acts supportively during the web search and improves the accuracy of the results. The second [8] presents a method based on crawlers able to detect cloud services from different sources and perform clustering upon them with the aid of the k-means algorithm in order to detect similarities and differences.

7. Conclusions and future work

Given the complexity and multi-objectivity of the Cloud target selection problem, as well as the difficulty of objectively evaluating the resulting decision, the exploitation of human experience and intuition plays a decisive role in the CTS approach introduced in this paper. Interaction is made possible through the implementation of a user interface which supports the process from beginning to end. This work, in its initial steps, aspires to provide a manifold solution by dealing with different aspects of the problem.

The CTS tool aims at addressing the need to obtain and combine different types of information from different sources by using a unified cloud modeling language (CloudML@artist). Every interaction with the metamodels is performed abstractly, so as to be bound not to specific cloud characteristics and services but to the definitions of the cloud concepts of "service", "cloud feature" and "cloud provider" introduced by the modeling language. Thus, the tool becomes flexible and adaptive to technological advances (e.g. the introduction of new providers, or new types of services). Additionally, the extraction of the criteria from the metamodels is performed in groups (abstractly defined as well), which facilitates user interaction. This also means that extensions concerning support for different groups of criteria can be easily designed and incorporated into the overall methodology by reusing existing components and without affecting the already supported groups of criteria. Future plans for extensions include the integration of two groups emerging from the benchmark and cost profiles currently included in the source of knowledge. The final aspect covered is the decision making mechanism itself. At the moment, a simple weight-based algorithm is implemented, and a conjoint analysis approach is proposed and designed to be included in the next versions of the CTS. Future research plans include the adaptation and testing of advanced multi-criteria decision making algorithms in order to capture the end-user's preferences in more detail.

Acknowledgements

The research leading to these results is partially supported by the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 317859, in the context of the ARTIST Project.

References

1. Q. Zhang, L. Cheng, R. Boutaba, Cloud computing: State-of-the-art and research challenges, Journal of Internet Services and Applications, 1 (2010) 7-18.

2. A. Li, X. Yang, S. Kandula, and M. Zhang, "CloudCmp: comparing public cloud providers," in IMC, 2010.

3. Zheng Li , Liam O'Brien , He Zhang , Rainbow Cai, On a Catalogue of Metrics for Evaluating Commercial Cloud Services, Proceedings of the 2012 ACM/IEEE 13th International Conference on Grid Computing, p.164-173, September 20-23, 2012.

4. Erik Wittern , Jörn Kuhlenkamp , Michael Menzel, Cloud service selection based on variability modeling, Proceedings of the 10th international conference on Service-Oriented Computing, November 12-15, 2012, Shanghai, China.

5. Zia Ur Rehman , Omar Khadeer Hussain , Farookh Khadeer Hussain, Parallel Cloud Service Selection and Ranking Based on QoS History, International Journal of Parallel Programming, v.42 n.5, p.820-852, October 2014.

6. Gültekin Ataş, Vehbi Cagri Gungor, Performance evaluation of cloud computing platforms using statistical methods, Computers and Electrical Engineering, v.40 n.5, p.1636-1649, July 2014.

7. Kang, J., Sim, K. M.: Cloudle: A multi-criteria cloud service search engine. In: IEEE Asia-Pacific Services Computing Conference (APSCC), pp. 339-346 (2010).

8. Shengjie Gong, Kwang Mong Sim, "CB-Cloudle and cloud crawlers," Software Engineering and Service Science (ICSESS), 2014 5th IEEE International Conference on, pp. 9-12, 27-29 June 2014.

9. V. Andrikopoulos, Z. Song, and F. Leymann, "Supporting the Migration of Applications to the Cloud through a Decision Support System," in Proceedings of the IEEE Sixth International Conference on Cloud Computing, 2013, pp. 565-572

10. V. Tran, J. Keung, A. Liu, A. Fekete. Application migration to cloud: A taxonomy of critical factors. In Proceedings of the 2nd International Workshop on Software Engineering for Cloud Computing, ACM, New York, USA, pp. 22-28, 2011

11. V. Andrikopoulos, S. Strauch, and F. Leymann, "Decision support for application migration to the cloud: Challenges and vision," in Proceedings of the 3rd International Conference on Cloud Computing and Service Science, CLOSER 2013, 8-10 May 2013, Aachen, Germany. SciTePress, 2013

12. Y. Jadeja and K. Modi, "Cloud computing-concepts, architecture and challenges," in Computing, Electronics and Electrical Technologies (ICCEET), 2012 International Conference on, 2012, pp. 877-880

13. L. Qian, Z. Luo, Y. Du, et al., "Cloud computing: an overview," in Cloud Computing, vol. 5931 of Lecture Notes in Computer Science, pp. 626-631, Springer, Berlin, Germany, 2009.

14. Alkhater, Nouf, Walters, Robert and Wills, Gary (2014) An investigation of factors influencing an organisation's intention to adopt cloud computing. In, International Conference on Information Society (i-Society 2014), London, GB, 10 - 12 Nov 2014, 2pp.

15. G. Kousiouris, G. Giammatteo, A. Evangelinou, N. Galante, E. Kevani, C. Stampoltas, A. Menychtas, A. Kopaneli, K. Ramasamy Balraj, D. Kyriazis, T. Varvarigou, P. Stuer, L. Orue-Echevarria Arrieta, "A Multi-Cloud Framework for Measuring and Describing Performance Aspects of Cloud Services Across Different Application Types", to appear in Proceedings of MultiCloud 2014, Special Session on Multi-Clouds, in the context of the 4th International Conference On Cloud Computing and Services Science (CLOSER 2014), 3-5 April 2014, Barcelona, Spain.

16. R. Buyya, C.S. Yeo and S. Venugopal, "Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities," Proc. 10th IEEE Int'l Conf. High Performance Computing and Comm., pp. 5-13, Sept. 2008.

17. ARTIST D7.2.3, Nunzio Andrea Galante/Gabriele Giammatteo, 2015, D7.2.3 Cloud services modelling and performance analysis framework.

18. Contrail EU funded Project, http://contrail-project.eu/

19. Ascetic EU funded Project, http://www.ascetic-project.eu/