**3. Study design**

different mobile devices of the same category show different power consumption, and a specific power consumption model for each device is difficult to obtain. Thus, instead of using external metering instrumentation to detect power consumption, only the internal battery voltage sensor is used, which is found across many modern smartphones. The authors indicate that, over a 10-second interval, the PowerBooter technique has an accuracy of about 4.1% within measured values.

From a software engineering point of view, most contributions are devoted to developing frameworks and tools for energy metering and profiling. The authors of PowerBooter also propose an on-line power estimation tool called PowerTutor [17]. It implements the PowerBooter model in order to profile the power consumption of applications, based upon their component usage. Another example, which makes use of external metering devices, is ANEPROF [4], which its authors define as a real-measurement-based energy profiler able to reach function-level granularity. It is developed for Android OS-based devices, and is thus aimed at profiling Java applications. It is based on JVM event profiling, using software probes to record runtime events and system calls. The authors had to address several design issues, such as overhead control and proper time synchronization. Power consumption profiling is made through correlation with real-time power measurements done by an external DAQ, connected to an ARM Computer-on-Module running Android 2.0. The authors also provide profiling data for four popular applications (Android Browser, GMail, Facebook, YouTube). The accuracy of ANEPROF depends on the hardware meter used; its CPU overhead is stated to be less than 5%. Finally, SEMO [5] is a smart energy monitoring system, developed for Android, which also provides application-level consumption monitoring. This system is composed of three components: an *inspector*, which monitors the information on the battery, warning users when the battery reaches a critical condition; a *recorder*, which basically logs the actual charge of the battery and the running applications; and an *analyzer*, which calculates the energy consumption rate for each application and ranks the applications according to it.

Another alternative for energy measurement is low-level power analysis using instruction-level models [14]. These models provide accurate power estimates for small kernels of code. An example of this kind of model is presented in Equation 1, where *Energy* is the total energy dissipation of the program [10]:

*Energy* = ∑<sub>*i*</sub> (*BC<sub>i</sub>* · *N<sub>i</sub>*) + ∑<sub>*i*,*j*</sub> (*SC<sub>i,j</sub>* · *N<sub>i,j</sub>*) + ∑<sub>*k*</sub> (*OC<sub>k</sub>*) (1)

The first part is the summation of the base energy cost of each instruction (*BC<sub>i</sub>* is the base energy cost and *N<sub>i</sub>* is the number of times instruction *i* is executed). The second part accounts for the circuit state (*SC<sub>i,j</sub>* is the energy cost when instruction *i* is followed by instruction *j*, and *N<sub>i,j</sub>* the number of times this happens during the program execution). The third part accounts for the energy contribution *OC<sub>k</sub>* of other instruction effects, such as stalls and cache misses, during the program execution.

The study presented here is instead focused on the analysis of power consumption data, and it is designed to find usage patterns of IT devices' energy consumption and to identify situations in which energy is wasted. Webber et al. [16] also collected data on devices, focusing on the after-hours power state of networked devices in office buildings: they showed that most devices are left powered on during the night, concluding that this is the first cause of energy waste.
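To make the model concrete, a minimal sketch of Equation 1 follows. The instruction names, per-instruction costs and counts below are invented for illustration only; real values come from per-device characterization as in [10].

```python
# Sketch of the instruction-level energy model of Equation 1.
# All instruction names, costs (in nanojoules) and counts are invented
# for illustration; real values come from per-device characterization.

def program_energy(base_cost, counts, circuit_cost, pair_counts, other_costs):
    """Energy = sum_i BC_i*N_i + sum_ij SC_ij*N_ij + sum_k OC_k."""
    base = sum(base_cost[i] * n for i, n in counts.items())             # base costs
    circuit = sum(circuit_cost[p] * n for p, n in pair_counts.items())  # circuit state
    other = sum(other_costs)                                            # stalls, cache misses
    return base + circuit + other

base_cost = {"add": 1.0, "mul": 3.0, "load": 5.0}            # BC_i
counts = {"add": 100, "mul": 20, "load": 50}                 # N_i
circuit_cost = {("add", "mul"): 0.5, ("mul", "load"): 0.75}  # SC_ij
pair_counts = {("add", "mul"): 10, ("mul", "load"): 4}       # N_ij
other_costs = [12.0, 7.5]                                    # OC_k

print(program_energy(base_cost, counts, circuit_cost, pair_counts, other_costs))  # → 437.5
```

The structure mirrors the three summations of Equation 1: the result is dominated by the base costs, with circuit-state and inter-instruction terms added on top.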

### **3.1. Goal description and research questions**

The aim of this research is to assess the impact of software and its usage on the power consumption of computer systems. The goal is defined through the Goal-Question-Metric (GQM) approach [1]. This approach, applied to the experiment, led to the definition of the model presented in Table 1. The first research question investigates whether, and how much, software impacts power consumption. The second research question investigates whether a categorization of usage scenarios with respect to functionality is also valid for power consumption figures. The third research question tries to find a quantifiable relationship between power consumption and actual usage of the computer system, by selecting four metrics relative to the main system resources (CPU, Disk, Memory and Network).


**Table 1.** The GQM Model

#### **3.2. Usage scenarios**

The following usage scenarios, described in detail below, provide the basis for the analysis. The scenarios have been designed to simulate common operations for a desktop user, and they provide benchmarks (see Section 1) for the different resources of the computer system. This way, we obtain useful information on the relationship between resource usage and power consumption.



Energy Efficiency in the ICT - Profiling Power Consumption in Desktop Computer Systems

*0 - Idle.* This scenario aims at evaluating power consumption during idle states of the system. In order to avoid variations during the runs, most of the OS' automatic services were disabled (i.e. Automatic Updates, Screen Saver, Anti-virus and such).

*1 - Web Navigation.* This scenario depicts one of the most common activities for a basic user: Web Navigation. During the simulation, the system user starts a web browser, inputs the URL of a web page and follows a determined navigation path. Google Chrome has been chosen as the browser for this scenario because of its better performance on the test system, which allowed us to increase navigation time. The website chosen for this scenario is the homepage of the SoftEng research group http://softeng.polito.it, so that the same contents and navigation path could be maintained during all the scenario runs.

*2 - E-Mail.* This scenario simulates sending and receiving E-Mails. For this scenario's purpose, a dedicated E-Mail account has been created in order to always send and receive the same message. In this scenario, the system user opens an E-Mail client, writes a short message, sends it to himself, then starts checking for new messages by pushing the send/receive button. Once the message has been received, the user reads it (the reading activity has been simulated with an idle period), then deletes the message and starts over.

*3 - Productivity Suite.* This scenario evaluates power consumption during the usage of highly-interactive applications, such as office suites. For this scenario, Microsoft Word 2007 has been chosen, the most used word processor application. During the scenario execution, the system user launches the application and creates a new document, filling it with content and applying several text editing/formatting functions, such as enlarging/shrinking the font size, bold, italics, underline, character and background colors, text alignment and line spacing, and lists. Then the document is saved on the machine's hard drive. For each execution a new file is produced, and the old file is deleted at the end of the scenario.

*4 - Data Transfer (Disk).* This scenario evaluates power consumption during operations that involve the File System, and in particular the displacement of a file between different positions on the hard drive, which is a very common operation. For this scenario's purpose, a data file of relevant size (almost 2 GB) has been prepared in order to match the file transfer time with the prefixed scenario duration (5 minutes). The scenario structure is as follows: the system user opens an Explorer window, selects the file and moves it to another location, waits for the file transfer to end, then closes Explorer and exits.

*5 - Data Transfer (USB).* As using portable data storage devices has become a very common practice, this scenario has been developed to evaluate power consumption during a file transfer from the system hard drive to a USB memory device. This scenario is very similar to the previous one, except for the file size (which is slightly lower, near 1.8 GB) and the file destination, which is the logical drive of the USB device.

*6 - Image Browsing/Presentation.* This scenario evaluates power consumption during another common usage pattern: a full-screen slide-show of medium-size images, which can simulate a presentation as well as browsing through a series of images. In this scenario, the system user opens a PDF file composed of several images, using the Acrobat Reader application, sets the full-screen visualization, then manually switches through the images every 5 seconds (thus simulating a presentation for an audience).

*7 - Skype Call (Video Disabled).* For an average user, the Internet is without any doubt the most common resource accessed via a computer system. Moreover, as broadband technologies become ever more available, it would have been reductive not to consider usage scenarios that make more intensive use of the Internet than Web Navigation and E-Mail. Thus, the Skype scenario has been developed. Skype is the most used application for video calls and video conferences among private users. For this scenario's purposes, a test Skype account was created, and the Skype application was deployed on the test machine. Then, for each run, a test call is made to another machine (a laptop situated in the same laboratory) for 5 minutes, which is the prefixed duration of all scenarios.

*8 - Skype Call (Video Enabled).* This scenario is similar to scenario 7, but the video camera is enabled during the call. This allows evaluating the impact of the video data stream both on power consumption and on system resources.

*9 - Audio Playback.* This scenario aims to evaluate power consumption during the reproduction of audio content. For this scenario's purpose, an MP3 file with a length of 5 minutes has been selected, to be reproduced through a common multimedia player. Windows Media Player has been chosen, as it is the default player in Microsoft systems, and thus one of the most diffused.

*10 - Video Playback.* Same as above, but in this case the subject of reproduction is a video file in AVI format, of the same duration.

*11 - Peer-to-Peer Data Exchange.* As for the Skype scenarios, the decision was made to also take a Peer-to-Peer scenario into account, as it has proven to be a very common practice among private users. For this scenario, BitTorrent was selected as the Peer-to-Peer application, because of its large diffusion and its less variable usage pattern compared to other Peer-to-Peer networks with more complex architectures. During this scenario, the system user starts the BitTorrent client, opens a previously provided .torrent archive, related to an Ubuntu distribution, and starts the download, which proceeds for 5 minutes.

In Table 2 all the scenarios are summarized with a brief description of each. The last column reports the category each scenario belongs to, from a functional point of view, according to the following:

• *Idle* (Scenario 0) is the basis of the analysis; it evaluates power consumption during periods of inactivity of the system.

• *Network* (Scenarios 1, 2, 7, 8, 11) represents activities that involve networking and the Internet.

• *Productivity* (Scenario 3) is related to activities of personal productivity.

• *File System* (Scenarios 4, 5) concerns activities that involve storage devices and File System operations.

• *Multimedia* (Scenarios 6, 9, 10) represents activities that involve audio/video peripherals and multimedia contents.

#### **3.3. Variable selection**

In order to answer the Research Questions, it is necessary to specify the independent variables that will characterize the experiment. As anticipated in the previous section, four metrics have been selected to evaluate the system usage. These metrics were measured by means of software logging (as will be explained in the *Instrumentation* section), considering the following values:

• CPU

	- CPU Time Percentage, intended as time spent by the CPU doing active work in a second
	- CPU User Time Percentage, intended as time spent by the CPU executing user instructions (i.e. applications) in a second
	- CPU Privileged Time Percentage, intended as time spent by the CPU executing system instructions (services, daemons) in a second
	- CPU Deferred Procedure Calls Percentage, intended as time spent by the CPU executing DPCs in a second
	- CPU Interrupt Time Percentage, intended as time spent by the CPU serving interrupts in a second
	- CPU C1 Time Percentage, intended as time spent by the CPU in the low-power C1 state
	- CPU C2 Time Percentage, intended as time spent by the CPU in the low-power C2 state
	- CPU C3 Time Percentage, intended as time spent by the CPU in the low-power C3 state

• Memory

	- Memory Page Writes per second
	- Memory Page Reads per second
	- Memory Available (KiloBytes) per second

• Hard Disk

	- Physical Disk Transfers (Read/Write) per second
	- Logical Disk Transfers (Read/Write) per second

• Network

	- Network Packets per second as seen by the Network Interface Card

The dependent variable selected for the experiment is *P*, i.e. the instant power consumption (W). Therefore, *Pn* is the average power consumption during Scenario *n* = 1..11 and *Pidle*|*net*|*prod*|*file*|*MM* is the average power consumption of (respectively) the Idle, Network, Productivity, File System and Multimedia scenarios.

| **Nr.** | **Title** | **Description** | **Category** |
|---|---|---|---|
| 0 | Idle | No user input, no applications running, most of OS' automated services disabled. | Idle |
| 1 | Web Navigation | Open browser, visit a web-page, operate, close browser. | Network |
| 2 | E-Mail | Open e-mail client, check e-mails, read new messages, write a short message, send, close client. | Network |
| 3 | Productivity Suite | Open word processor, write a small block of text, save, close. | Productivity |
| 4 | Data Transfer (disk) | Copy a large file from a disk position to another. | File System |
| 5 | Data Transfer (USB) | Copy a large file from disk to an USB Device. | File System |
| 6 | Presentation | Execute a full-screen slide-show of a series of medium-size images. | Multimedia |
| 7 | Skype Call (no video) | Open Skype client, execute a Skype conversation (video disabled), close Skype. | Network |
| 8 | Skype Call (video) | Open Skype client, execute a Skype conversation (video enabled), close Skype. | Network |
| 9 | Audio Playback | Open a common media player, play an Audio file, close player. | Multimedia |
| 10 | Video Playback | Open a common media player, play a Video file, close player. | Multimedia |
| 11 | Peer-to-Peer | Open a common peer-to-peer client, put a file into download queue, download for 5 minutes, close. | Network |

**Table 2.** Software Usage Scenarios Overview
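Several of the counters above (disk transfers per second, network packets per second) are exposed by the OS as cumulative totals, so the per-second rates used as independent variables are obtained by differencing consecutive samples. A minimal sketch, with invented sample values:

```python
# Counters such as disk transfers and network packets are reported by the OS
# as cumulative totals; the per-second rates used as independent variables
# are obtained by differencing consecutive samples. Values are invented.

def per_second_rates(samples):
    """samples: list of (timestamp_s, cumulative_count) pairs, sorted by time."""
    return [(c1 - c0) / (t1 - t0)
            for (t0, c0), (t1, c1) in zip(samples, samples[1:])]

disk_transfers = [(0, 0), (1, 120), (2, 300), (3, 300)]  # cumulative transfer count
print(per_second_rates(disk_transfers))  # → [120.0, 180.0, 0.0]
```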

#### **3.4. Hypotheses formulation**

Basing upon the GQM model, the Research Questions can be formalized into hypotheses. In order to formally express Research Question 3, *ρ*(*x*, *y*) denotes the correlation coefficient between variables *x* and *y*, and *β* represents a significant correlation value, which will be defined later in this section.

• *RQ 1: Does Software impact Power Consumption?*

*H*10: *Pidle* ≥ *Pn*, *n* ∈ [1, 11]
*H*1*a*: *Pidle* < *Pn*, *n* ∈ [1, 11]

• *RQ 2: Is it possible to classify software usage scenarios basing upon power consumption?*

*H*20: *Pidle* = *Pnet* = *Pprod* = *Pfile* = *PMM*
*H*2*a*: *Pidle* ≠ *Pnet* ≠ *Pprod* ≠ *Pfile* ≠ *PMM*

• *RQ 3: What is the relationship between usage and power consumption?*

*H*30: *ρ*(*ICPU*, *P*) = *ρ*(*IMemory*, *P*) = *ρ*(*IDisk*, *P*) = *ρ*(*INetwork*, *P*) = 0
*H*3*a*: *max*[*ρ*(*ICPU*, *P*), *ρ*(*IMemory*, *P*), *ρ*(*IDisk*, *P*), *ρ*(*INetwork*, *P*)] > *β*
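As a sketch of how *H*1 can be checked on the collected data, the Mann-Whitney U statistic amounts to counting, over all pairs of runs, how often a scenario run consumed more power than an idle run. The per-run averages below are invented; tie corrections and p-value computation are left to a statistics package.

```python
# Minimal sketch of the Mann-Whitney U statistic used to compare scenarios
# against Idle: count, over all (idle run, scenario run) pairs, how often
# the scenario run consumed more power. Per-run averages are invented.

def mann_whitney_u(a, b):
    """U statistic: number of pairs with a[i] < b[j]; ties count as 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x < y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

idle_runs = [87.1, 86.9, 87.0]      # invented per-run average power (W)
scenario_runs = [91.2, 90.8, 91.0]  # invented per-run average power (W)
u = mann_whitney_u(idle_runs, scenario_runs)
print(u)  # → 9.0: every scenario run exceeded every idle run (max U = 3 x 3)
```

A value of U close to its maximum (the product of the two sample sizes) is evidence in the direction of *H*1*a*, i.e. *Pidle* < *Pn*.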

#### **3.5. Instrumentation**

Every scenario has been executed automatically by means of a GUI Automation Software for 5 minutes, obtaining 30 runs per scenario, each composed of 300 observations (one per second) of the instant power consumption value (W).
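The measurement schedule just described can be sketched as follows; the meter read is a stub standing in for the actual power meter, and the GUI automation is omitted:

```python
# Sketch of the measurement schedule: 30 runs per scenario, 5 minutes per
# run, one power observation per second. The meter read is a stub (the study
# used GUI automation plus an external power meter).

RUNS_PER_SCENARIO = 30
RUN_DURATION_S = 5 * 60  # 300 observations per run

def run_scenario(read_power_w):
    """Return a list of runs, each a list of per-second power samples (W)."""
    return [[read_power_w() for _ in range(RUN_DURATION_S)]
            for _ in range(RUNS_PER_SCENARIO)]

runs = run_scenario(lambda: 87.0)  # stub meter: constant 87 W
observations = sum(len(run) for run in runs)
print(observations)  # → 9000 samples per scenario
```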

The test machines selected are two desktop PCs of different generations. In Table 3, the hardware/software configuration of the machines is presented. As can be seen, the difference in terms of hardware is relevant; this will allow us to make some evaluations about how power consumption has varied over the years, with the evolution of hardware architectures.

|  | **Desktop 1 (old generation)** | **Desktop 2 (new generation)** |
|---|---|---|
| **CPU** | AMD Athlon XP 1500+ | Intel Core i7-2600 |
| **Memory** | 768 MB DDR SDRAM | 4 GB DDR3 SDRAM |
| **Display Adapter** | ATI Radeon 9200 PRO 128 MB | ATI Radeon HD 5400 |
| **HDD** | Maxtor DiamondMax Plus 9, 80 GB | Western Digital, 1 TB |
| **Network Adapter** | 3Com EtherLink XL NIC TX PCI 10/100 | Intel 82579V Gigabit Ethernet |
| **OS** | Microsoft Windows XP Professional SP3 | Windows 7 Professional SP1 |

**Table 3.** HW/SW Configuration of the test machines

Different software and hardware tools have been used for monitoring, measurement and test automation. The software tool adopted is Qaliber<sup>3</sup> (see Figure 2), which is mainly a GUI testing framework, composed of a Test Developer component, which allows a developer to write a specific test case for an application by "recording" GUI commands, and a Test Builder component, which allows the creation of complex usage scenarios by combining the test cases. One of the most important features of Qaliber is the possibility to log system information during scenario execution, using Microsoft's Performance Monitor utility. By defining a specific counter log with all the variables of interest, Qaliber can be instructed to start Performance Monitor simultaneously with the scenario, thus allowing complete monitoring of all the statistics needed for this analysis.

**Figure 2.** Qaliber Test Builder screenshot

<sup>3</sup> Qaliber - GUI Testing Framework, http://www.qaliber.net/

The measurement of power consumption was done through two different devices. For the old-generation PC, the PloggMeter<sup>4</sup> device (see Figure 3) was used. This device is capable of computing active and reactive power, voltage, current intensity and *cos ϕ*. The data is stored within the PloggMeter's 64 kB memory and can be downloaded in text file format via a ZigBee wireless connection to a Windows-enabled PC or laptop, or viewed as instantaneous readings in the installed Plogg Manager software. The device drivers were slightly modified to adapt the PloggMeter recording capability to this analysis' purposes, specifically to decrease the logging interval from 1 minute (which is too wide if compared to software time) to 1 second.

**Figure 3.** The PloggMeter device

<sup>4</sup> Youmeter - Plogg Technologies, http://www.plogginternational.com/products.shtml

For the new-generation PC, the WattsUp PRO ES<sup>5</sup> device (see Figure 4) was used. This device is capable of measuring current power consumption (Watts), power factor, line voltage and other metrics. The data is stored within the device's internal memory, and can then be downloaded via the USB interface. The sampling rate resolution is 1 second.

**Figure 4.** The WattsUp Pro ES device

<sup>5</sup> WattsUp Pro ES, https://www.wattsupmeters.com/secure/products.php?pn=0&wai=0&spec=2
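Since the meter log and the Performance Monitor counter log are produced by different tools, their 1-second samples must be matched on timestamps before any correlation can be computed. A minimal sketch of such a join, with invented readings:

```python
# The meter log (1 Hz power samples) and the Performance Monitor counter log
# are written by different tools, so rows must be matched on their
# whole-second timestamps before correlating. All values are invented.

def join_logs(power_log, counter_log):
    """Inner join of {timestamp: watts} and {timestamp: counters} on timestamp."""
    shared = sorted(power_log.keys() & counter_log.keys())
    return {t: (power_log[t], counter_log[t]) for t in shared}

power = {0: 87.2, 1: 87.5, 2: 88.0}                               # meter samples (W)
counters = {1: {"cpu": 12.0}, 2: {"cpu": 55.0}, 3: {"cpu": 50.0}}  # counter rows
merged = join_logs(power, counters)
print(merged)  # → {1: (87.5, {'cpu': 12.0}), 2: (88.0, {'cpu': 55.0})}
```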

*Energy Efficiency – The Innovative Ways for Smart Energy, the Future Towards Modern Utilities*

#### **3.6. Analysis methodology**

The goal of data analysis is to apply appropriate statistical tests to reject the null hypotheses. The analysis will be conducted separately for each scenario in order to evaluate which ones have an actual impact on power consumption.

In order to extract a power consumption profile for each usage scenario, a set of descriptive statistics was derived from the experimental data. For a single scenario, a total of 30 runs were executed, each composed of 300 observations (one per second) of the power consumption value. Thus, the calculations for the descriptive statistics were made using two approaches: firstly, the average of each run was extracted, obtaining a short vector of 30 elements, which was used as the subject of our analysis. This method allowed us to speed up the calculations, and because of the decreased sampling rate, the data was less variant and showed an almost regular distribution.

Afterwards, the same analysis was applied to the full datasets, which means a total of 9000 observations. Comparing the results from these two approaches, focusing on the index of dispersion and the variance, the variability of a single scenario can be appreciated, which was also a useful tool for validating the experiment.

First of all, the null hypothesis *H*10 will be tested for each scenario. Then the scenarios will be grouped into categories and *H*20 will be tested for each category.

Before that, the data distribution must be analysed, in order to determine the appropriate testing method for each hypothesis. The data distribution analysis was conducted using the Shapiro-Wilk normality test. Since its results pointed out that the data was not normally distributed, non-parametric tests were adopted, in particular the Mann-Whitney test [7] for testing *H*20, and Spearman's rank correlation coefficient (also known as Spearman's *ρ*) for testing *H*30.

The first hypothesis *H*10 is clearly directional, thus the one-tailed variant of the test will be applied. The second and third hypotheses *H*20 and *H*30 are not directional, therefore the two-sided variant of the tests will be applied.

We will draw conclusions from our tests based on a significance level *α* = 0.05, that is, we accept a 5% risk of a type I error, i.e. rejecting the null hypothesis when it is actually true. Moreover, since we perform multiple tests on the same data (precisely twice: first overall and then by category), we apply the Bonferroni correction to the significance level and actually compare the test results against *α<sup>B</sup>* = 0.05/2 = 0.025. As regards the significance of Spearman's *ρ*, using 298 degrees of freedom (since 300 observations per scenario are available), the significance level of the *ρ* coefficient is 0.113. Thus, correlation coefficients higher than this value can be considered significant positive or negative correlations.

#### **3.7. Validity evaluation**

The threats to experiment validity can be classified in two categories: **internal** threats, derived from treatments and instrumentation, and **external** threats, which regard the generalization of the work.

There are three main internal threats that can affect this analysis. The first concerns the *measurement sampling*: measurements were taken with a sampling rate of 1 second. This interval is a compromise between the power metering device capability and the software logging service. However, it could be a wide interval if compared to software time.

Subsequently, *network confounding factors* could arise: as several usage scenarios involving network activity and the Internet are included in our treatments, the unpredictability of the network behaviour could affect some results. Another confounding factor is represented by *OS scheduling operations*: the scheduling of user activities and system calls is out of the experiment's control. This may cause some additional variability in the scenarios, especially for those that involve the File System.

In addition, the two machines on which our tests are performed are different in terms of hardware and software configuration. This was done on purpose, because we wanted to test devices which could represent common machines used in home and office scenarios, for both generations. Installing an old version of an operating system on a new machine, or vice versa, would have altered this assumption. While this introduces another confounding factor, it also provides useful information regarding the evolution of these systems, even if no specific research hypotheses can be verified about the comparison.

Finally, the main external threat concerns a possible *limited generalization* of the results: the experiment was conducted on only two test machines, which is a population too limited to be representative of a whole category.

**4. Results**

The results of the hypothesis tests for the research questions are presented in this section.

**4.1. Preliminary data analysis**

We present in Table 4 and Table 5 descriptive statistics of the measurements for each scenario. The tables report, in this order: mean (Watts), median (Watts), standard error of the mean, 95% confidence interval of the mean, variance, standard deviation (*σ*), variation coefficient (the standard deviation divided by the mean), and index of dispersion (variance-to-mean ratio, VMR).

Power consumption shows an excursion of about 11 W for both PCs, even if the baseline is quite different (an average of 87 W in the Idle scenario for the old PC, 51 W for the new PC). Moreover, the very low variability indexes ensure that the different samples for each scenario are homogeneous.

**4.2. Hypothesis testing**

The tests of hypotheses *H*1 and *H*2 are presented in Table 6 and Table 7. These tables report the scenarios tested, the p-value of the Mann-Whitney test and the estimated difference of the medians between the Idle scenario and the other ones.

Figure 5 represents the bar plot of the power consumption increase (in Watts), with respect to Idle, of each scenario. Figure 6 shows the box plot of scenario categories for each PC. As regards hypothesis *H*3, which evaluates correlations between resource usage and power consumption, more steps are needed. First of all, Table 8 reports the results of the data distribution analysis. Then, Table 9 and Table 10 present the results of the correlation
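The correlation step of the analysis methodology (Section 3.6) can be sketched as follows: Spearman's *ρ* computed via the rank-difference formula (valid in the absence of ties), compared against the significance threshold quoted there (0.113 for 298 degrees of freedom). The usage and power vectors below are invented for illustration.

```python
# Sketch of the correlation step of Section 3.6: Spearman's rho via the
# rank-difference formula (valid when there are no ties), compared with the
# significance threshold quoted in the text (0.113 for 298 degrees of
# freedom). The usage and power vectors below are invented.

def spearman_rho(x, y):
    """rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), d_i = rank difference."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

cpu_usage = [5, 12, 30, 48, 80]  # invented usage samples
power_w = [51, 53, 55, 58, 61]   # invented power samples (monotone with usage)
rho = spearman_rho(cpu_usage, power_w)
print(rho > 0.113)  # → True: this correlation would count as significant
```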
