Posts tagged Instances
End-To-End Performance Study of Cloud Services
Jun 3rd
Here is a good study found at: High Scalability
Cloud computing promises a number of advantages for the deployment of data-intensive applications. Most prominently, these include reducing cost with a pay-as-you-go pricing model and (virtually) unlimited throughput by adding servers if the workload increases. At the Systems Group, ETH Zurich, we did an extensive end-to-end performance study to compare the major cloud offerings regarding their ability to fulfill these promises and their implied cost.
The focus of the work is on transaction processing (i.e., read and update workloads) rather than analytics workloads. We used TPC-W, a standardized benchmark simulating a web shop, as the baseline for our comparison. TPC-W specifies that users are simulated by emulated browsers (EBs) that issue page requests, called web interactions (WIs), against the system. As a major modification to the benchmark, we continuously increase the load from 1 to 9000 simultaneous users to measure the scalability and cost variance of the system. Figure 1 shows an overview of the different combinations of services we tested in the benchmark.
Figure 1: Systems Under Test
The main results are shown in Figure 2 and Tables 1 and 2, and they are surprising in several ways. Most importantly, it seems that all major vendors have adopted a different architecture for their cloud services (e.g., master-slave replication, partitioning, distributed control, and various combinations of these). As a result, the cost and performance of the services vary significantly depending on the workload. A detailed description of the architectures is provided in the paper.
Furthermore, only two architectures, the one implemented on top of Amazon S3 and MS Azure using SQL Azure as the database, were able to scale and sustain our maximum workload of 9000 EBs, resulting in over 1200 web interactions per second (WIPS). MySQL installed on EC2 and Amazon RDS were able to sustain a maximum load of approximately 3500 EBs. MySQL Replication performed similarly to standalone MySQL with EBS, so we left it out of the figure. Figure 2 shows that the WIPS of Amazon’s SimpleDB grow up to about 3000 EBs and more than 200 WIPS. In fact, SimpleDB was already overloaded at about 1000 EBs and 128 WIPS in our experiments. At this point, all write requests to hot spots failed.
Google AppEngine already dropped out at 500 emulated browsers with 49 WIPS. This is mainly because Google’s transaction model is not built for such high write workloads. When implementing the benchmark, our policy was to always use the strongest consistency guarantees offered, which come closest to the TPC-W requirements. Thus, in the case of AppEngine, we used the offered transaction model inside an entity group. However, it turned out that this slows down overall performance considerably. We are now re-running the experiment without transaction guarantees and are curious about the new performance results.
Figure 2: Comparison of Architectures [WIPS]
Table 1 shows the total cost per web interaction in milli-dollars (m$) for the alternative approaches under a varying load (EBs). Google AE is cheapest for low workloads (below 100 EBs), whereas Azure is cheapest for medium to large workloads (more than 100 EBs). The three MySQL variants (MySQL, MySQL/R, and RDS) have (almost) the same cost as Azure for medium workloads (EB=100 and EB=3000), but they are not able to sustain large workloads.
Table 1: Cost per WI [m$], Vary EB
The success of Google AE for small loads has two reasons. First, Google AE is the only variant that has no fixed costs. There is only a negligible monthly fee to store the database. Second, at the time these experiments were carried out, Google gave a quota of six CPU hours per day for free. That is, applications which are below or slightly above this daily quota are particularly cheap.
Azure and the MySQL variants win for medium and large workloads because all these approaches can amortize their fixed cost over these workloads. SQL Azure has a fixed cost of USD 100 per month for a database of up to 10 GB, independent of the number of requests that the database must process. For MySQL and MySQL/R, EC2 instances must be rented in order to keep the database online. Likewise, RDS involves a fixed hourly fee, so the cost per web interaction decreases as the load increases. It should be noted that network traffic is cheaper with Google than with both Amazon and Microsoft.
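The amortization argument can be made concrete with a small calculation. The following is a minimal sketch, not the study's cost model: it considers only the fixed USD 100 monthly database fee mentioned above, ignores compute and network charges, and assumes a 30-day month and a constant load. The function name and the sample WIPS values are illustrative assumptions.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def fixed_cost_per_wi_millidollars(monthly_fee_usd, wips, days_per_month=30):
    """Share of a fixed monthly fee carried by one web interaction, in m$.

    Assumes a constant load of `wips` web interactions per second and a
    month of `days_per_month` days (both simplifying assumptions).
    """
    if wips <= 0:
        raise ValueError("wips must be positive")
    daily_fee_usd = monthly_fee_usd / days_per_month
    interactions_per_day = wips * SECONDS_PER_DAY
    # 1 USD = 1000 milli-dollars
    return daily_fee_usd / interactions_per_day * 1000.0


if __name__ == "__main__":
    # Hypothetical load levels; the fixed fee per interaction shrinks
    # in proportion to the load.
    for wips in (1, 10, 100, 1000):
        cost = fixed_cost_per_wi_millidollars(100.0, wips)
        print(f"{wips:>5} WIPS -> {cost:.6f} m$ per WI")
```

Because the fee does not depend on the number of requests, its per-interaction share falls by a factor of ten each time the load grows tenfold, which is why fixed-fee services look expensive at low load and cheap at high load.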
Table 2 shows the total cost per day for the alternative approaches and a varying load (EBs). (A “-” indicates that the variant was not able to sustain the load.) These results confirm the observations made previously: Google wins for small workloads; Azure wins for medium and large workloads. All the other variants are somewhere in between. The three MySQL variants come close to Azure in the range of workloads that they sustain. Azure and the three MySQL variants roughly share the same architectural principles (replication with master copy architectures). SimpleDB is an outlier in this experiment. With the current pricing scheme, SimpleDB is an exceptionally expensive service. For a large number of EBs, the high cost of SimpleDB is particularly annoying because users must pay even though SimpleDB drops many requests and is not able to sustain the workload.
Continue Reading at: High Scalability
Rackspace Cloud Servers versus Amazon EC2: Performance Analysis
Jan 20th
Here is an excellent analysis that we came across at TheBitSource
The Bitsource conducted a review of the two cloud computing platforms, Rackspace Cloud Servers and Amazon Elastic Compute Cloud (EC2), to get a general idea of overall system performance. Included in the tests were pure computing power (CPU), and raw disk I/O throughput. Using a consistent testing methodology across most instance sizes over a two-month time span (a painstaking process requiring lots of patience) has resulted in the following comparison of CPU performance, disk performance and cost between the two platforms.
We hope that this evaluation will save the reader some effort by answering questions about how the two platforms compare in terms of time, power, efficiency and cost. The purpose of this article is to answer a simple question: How do the two platforms stack up against each other in terms of disk and CPU efficiency?
Test Methodology
Ten executions were run for each test type, at different times, with 5 tests being run one week on 5 different servers, and the following week on another 5, to distribute the tests over time. (The point of this technique was to ensure that, if the cloud platform was experiencing heavy usage at the time of the test, the cloud’s workload would not skew the test results.) Results for all test runs were averaged, graphed, and compared.
For example, for kernel compile tests, 5 virtual instances of each server size were spun up, the same test was executed against all 5 servers, and the results were recorded. The following week, the remaining 5 tests were executed in the same fashion. Finally, all results were averaged for each instance size, which constituted the final result.
The control variables for all tests were as follows: CentOS 5.2 x86_64, GCC version 4.1.2, and only local storage (no remotely mounted volumes). For EC2 Small and Medium instances, no x86_64 CentOS image was available on EC2, so a 32-bit CentOS 5.2 image was used in these specific cases.
When performance-testing cloud platforms, it is important to test multiple instances simultaneously, at different times of the day, and over multiple weeks to get a real measure of CPU and disk performance. The reason is that the capacity of each cloud platform can vary dramatically with the time of day and the load on the platform. Ultimately, good resource-distribution algorithms, system architecture, and capacity planning will shine through in the performance results between platforms. It should be noted that the number and timing of samples are important for measuring the overall system performance of cloud platforms.
How the CPU Tests Were Conducted
Typically it is best to test specific applications to get a true measure of performance, but in this case generic measures of system performance were used to compare the two platforms. SPEC CPU2006 benchmarks were considered, but their minimum requirements exceeded what was available on the systems used for this procedure. Thus, compiling the Linux kernel was selected as the test, both to obtain a general measurement of overall system performance and to measure the CPU usage needed to accomplish the task. The Linux kernel version used in all kernel compilation CPU tests was stable 2.6.31.5.
Compiling the Linux kernel is a compute-intensive task, which requires work from most of the CPUs. We matched the number of parallel compilation tasks to the number of cores on the machine, to demonstrate getting maximal efficiency out of the host tested.
All instance sizes on Cloud Servers have four CPU cores available; regardless of the instance size, there is no limit on the amount of compute power available to that specific virtual system based on size.
On Amazon EC2, different instance sizes (based on memory) have different configurations for the number and type of CPUs available.
To demonstrate the full capacity of a particular instance size, the number of parallel jobs executed for the compilation was matched to the number of CPU cores available to the instance. For example, if an instance had 4 CPU cores, the compilation command executed would be make -j4 bzImage to compile the Linux kernel image. This command creates 4 parallel jobs to compile the Linux source, which speeds up the compilation and fully saturates the system’s CPUs.
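As an illustration of matching the job count to the core count, here is a minimal sketch that builds the make command for whatever machine it runs on. It is not the script used in the review; the helper name kernel_build_command is hypothetical, and os.cpu_count() reports the logical CPUs visible to the operating system, which is assumed here to equal the number of cores available to the instance.

```python
import os


def kernel_build_command(cores=None, target="bzImage"):
    """Return the make invocation with one parallel job per CPU core."""
    if cores is None:
        # os.cpu_count() may return None if the count cannot be determined.
        cores = os.cpu_count() or 1
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return ["make", f"-j{cores}", target]


if __name__ == "__main__":
    # Print the command rather than running it; to actually build, pass the
    # list to subprocess.run(...) from inside a configured kernel source tree.
    print(" ".join(kernel_build_command()))
```

For an instance with 4 cores, kernel_build_command(4) yields the same make -j4 bzImage invocation described above.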
How the Disk Tests Were Conducted
The Iozone (http://iozone.org) disk benchmark was selected as the tool for benchmarking file system performance. Version 3.327 was used for performing the tests. Iozone is an open source disk-performance utility that starts a timer, performs a transfer, and records throughput of the disk subsystem. Iozone was chosen for its flexibility and the fact that it is freely available for anyone who might want to try replicating the results.
Iozone was run using a fixed file size to exceed the operating system caches. Any file size of less than the amount of available memory on the system will typically only test the CPU and disk caches on the system; therefore, a file size greater than the system memory was chosen in order to exercise the physical I/O subsystem.
The global options for all tests began with the record length to be written on the test file; a record length of 32 kilobytes was selected. The tests run were write, read, reread, random read, and random write. The only specific variable was the actual size of the file, and it was set to a size greater than the system’s physical memory in order to ensure that we exercised the physical I/O subsystem for the disk I/O tests.
The Iozone command to perform disk tests was as follows:
where X is the only variable, consisting of the file size according to system memory, as mentioned earlier.
For instances with less than 4 gigabytes of system memory, a static file size of 6 gigabytes was used for the Iozone tests for this particular range of instance sizes.
For instances with a system memory size of 4 to 8 gigabytes, a static test file size of 10 gigabytes was used for the Iozone tests.
For servers with 8 to 15 gigabytes of memory, a static file size of 17 gigabytes was used for the Iozone tests.
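The three file-size tiers above can be summarized as a small lookup. This is a minimal sketch with a hypothetical function name; the text does not say which tier an instance with exactly 8 gigabytes belongs to, so the boundary handling below (4 up to but not including 8 gigabytes uses 10, 8 through 15 gigabytes uses 17) is an assumption, and no file size is stated for instances with more than 15 gigabytes.

```python
def iozone_file_size_gb(memory_gb):
    """Pick the Iozone test file size (GB) for an instance's memory size (GB).

    The file is always larger than system memory so that the test exercises
    the physical I/O subsystem rather than the operating system caches.
    """
    if memory_gb <= 0:
        raise ValueError("memory_gb must be positive")
    if memory_gb < 4:
        return 6
    if memory_gb < 8:  # assumed boundary: exactly 8 GB falls in the next tier
        return 10
    if memory_gb <= 15:
        return 17
    raise ValueError("no file size is stated for more than 15 GB of memory")


if __name__ == "__main__":
    for memory in (0.25, 2, 4, 8, 15):
        print(f"{memory} GB RAM -> {iozone_file_size_gb(memory)} GB test file")
```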
Performance Results
Kernel Compilation Results
The time taken to compile the Linux kernel on all Cloud Server instances was consistent across all instance sizes. All Cloud Server instance types compiled the Linux kernel in approximately 3 ½ minutes, with the exception of the 256 MB instance, which took a little over 5 minutes to compile the kernel.
For the kernel compilation test on Amazon EC2, the results varied widely between instance sizes. The best case was the EC2 Extra Large High CPU instance, which came in under 2 minutes and did not reflect typical performance for EC2 instances. The worst case was the EC2 Small instance, which took approximately 24 minutes to complete. Averaged across all EC2 instance sizes, compilation took nearly 8 ¼ minutes, largely because the EC2 Small instance took 24 minutes to complete.
Linux Kernel Compilation Time Results
Linux Kernel Compilation Time Platform Averages
CPU user time averaged about 45% across all instance sizes on both Cloud Servers and EC2 and didn’t vary much between the two platforms. The CPU usage graphs for the Linux kernel compilation tests show that the Cloud Servers instances used about 6% more CPU system time on average.
Linux Kernel Compilation CPU Usage Results
Linux Kernel Compilation CPU Usage Platform Averages
Amazon creates cloud computing spot market – CloudTweaks.com
Dec 14th
Amazon on Monday rolled out spot pricing for cloud computing so customers can buy capacity at any price on the open market.
The concept is an interesting one since Amazon Web Services is making computing capacity available on the market just like any other commodity (see Amazon statement, Werner Vogels and Amazon Web Services blog).
Dubbed Spot Instances, Amazon customers can bid on unused Elastic Compute Cloud (EC2) capacity and run those instances as long as their bid exceeds the spot price. The rub is that you can be outbid.
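The bidding rule can be sketched in a few lines. This is only an illustration of the rule as described above (an instance runs while the bid exceeds the spot price), not Amazon's actual pricing or termination logic; the function name and the hourly prices are made up for the example.

```python
def hours_running(bid, hourly_spot_prices):
    """Return, for each hour, whether an instance with this bid keeps running.

    Follows the rule as stated in the post: the instance runs only while
    the bid strictly exceeds the current spot price.
    """
    return [bid > price for price in hourly_spot_prices]


if __name__ == "__main__":
    # Hypothetical spot prices in USD per instance-hour.
    prices = [0.03, 0.05, 0.06, 0.04]
    status = hours_running(0.05, prices)
    for price, running in zip(prices, status):
        print(f"spot {price:.2f} -> {'running' if running else 'outbid'}")
```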









