Calxeda's ARM server tested
by Johan De Gelas on March 12, 2013 7:14 PM EST
Benchmark Configuration
First of all, a big thanks to Wannes De Smet, who assisted me with the benchmarks. Below you can read the configuration details of our "real servers". The Atom machines are a mix of systems. The Atom 230 is part of a 1U server featuring a Pegatron IPX7A-ION motherboard with 4GB of DDR2-667. The N450 is found inside an ASUS EeePC netbook, and the Atom N2800 is part of Intel's DN2800MT Marshalltown mainboard. The latter has 4GB of DDR3-1333 while the former only has 1GB of DDR2-667.
Supermicro SYS-6027TR-D71FRF Xeon E5 server (2U Chassis) | |
CPU | Two Intel Xeon processor E5-2660 (2.2GHz, 8c, 20MB L3, 95W) or two Intel Xeon processor E5-2650L (1.8GHz, 8c, 20MB L3, 70W) |
RAM | 64/128GB (8/16x8GB) DDR3-1600 Samsung M393B1K70DH0-CK0 |
Motherboard | X9DRT-HIBFF |
Chipset | Intel C600 |
BIOS version | R 1.1a |
PSU | PWS-1K28P-SQ 1280W 80 Plus Platinum |
The Xeon E5 CPUs have four memory channels per CPU and support DDR3-1600, and thus our dual CPU configuration gets eight DIMMs for maximum bandwidth. Each core supports Hyper-Threading, so we're looking at 16 cores with 32 threads.
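As a quick sanity check on those totals, the arithmetic can be sketched in a few lines of Python. The socket, core, and channel counts come from the text above; the 8 bytes per transfer is an assumption (a standard 64-bit DDR3 channel), and the bandwidth figure is a theoretical peak, not a measured result.

```python
# Figures taken from the configuration described above.
SOCKETS = 2
CORES_PER_SOCKET = 8
THREADS_PER_CORE = 2          # Hyper-Threading
CHANNELS_PER_SOCKET = 4
MEGATRANSFERS_PER_SEC = 1600  # DDR3-1600

# Assumption (not stated in the article): 64-bit wide DDR3 channel.
BYTES_PER_TRANSFER = 8


def xeon_totals():
    """Return (cores, threads, memory channels, theoretical peak GB/s)."""
    cores = SOCKETS * CORES_PER_SOCKET
    threads = cores * THREADS_PER_CORE
    channels = SOCKETS * CHANNELS_PER_SOCKET
    # MT/s * bytes per transfer = MB/s per channel; divide by 1000 for GB/s.
    peak_gb_s = channels * MEGATRANSFERS_PER_SEC * BYTES_PER_TRANSFER / 1000
    return cores, threads, channels, peak_gb_s


if __name__ == "__main__":
    cores, threads, channels, peak = xeon_totals()
    print(f"{cores} cores, {threads} threads, "
          f"{channels} channels, {peak:.1f} GB/s theoretical peak")
```

With one DIMM per channel, the eight channels match the eight DIMMs of the 64GB configuration.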
Boston Viridis Server | |
CPU | 24x ECX-1000 4c Cortex-A9 1.4GHz |
RAM | 24x Netlist 4GB (96GB) low-voltage ECC PC3L-10600W-9-10-ZZ DRAM |
Motherboard | 6x EC-cards |
Chipset | none |
Firmware version | ECX-1000-v2.1.5 |
PSU | SuperMicro PWS-704P-1R 750Watt |
Common Storage System
An iSCSI LIO Unified Target accesses a DataON DNS-1640 DAS. Inside the DAS we have set up eight Intel SSDSA2SH032G1GN (X25-E 32GB SLC) in RAID-0.
Software Configuration
The Xeon E5 server runs VMware ESXi 5.1. All vmdks are thick provisioned and set to independent, persistent mode. The power policy is "Low Power". We chose this policy because it enables C-states while having minimal impact on performance. All other systems use Ubuntu 12.10 with the "ondemand" power management policy, which enables P-states on the Atom and Calxeda ECX-1000.
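For readers who want to confirm the governor on their own Linux nodes, here is a minimal sketch. It assumes the standard cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor`); the function names are hypothetical, and this is not the procedure used for the article.

```python
from pathlib import Path


def read_governors(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU that exposes cpufreq.

    The default root is the usual Linux sysfs location (an assumption
    about the system under test); pass another path for testing.
    """
    governors = {}
    root = Path(sysfs_cpu_root)
    for gov_file in sorted(root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        cpu_name = gov_file.parent.parent.name  # e.g. "cpu0"
        governors[cpu_name] = gov_file.read_text().strip()
    return governors


def all_ondemand(governors):
    """True only if at least one CPU was found and all use 'ondemand'."""
    return bool(governors) and all(g == "ondemand" for g in governors.values())


if __name__ == "__main__":
    found = read_governors()
    for cpu, governor in found.items():
        print(f"{cpu}: {governor}")
    print("all ondemand:", all_ondemand(found))
```

On a machine without cpufreq support the dictionary is simply empty, and `all_ondemand` reports `False` rather than raising.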
99 Comments
Gigaplex - Tuesday, March 12, 2013
I wouldn't call that a spectacular performance per watt ratio. It's a bit faster than the Xeon under a cherry-picked benchmark (much slower under others), and is only marginally lower power. Best case it's an 80% improvement over Sandy Bridge with regards to performance per watt, and Atom wasn't represented. Considering all the hype, I was expecting something a little more... exciting. Ignoring Ivy Bridge improvements, Haswell isn't far off.
spronkey - Tuesday, March 12, 2013
Yeah... I agree. It also only seems to really come into its own in high concurrency. The Xeons idle quite similarly in terms of power - what happens if you compare it to more Xeon cores? It seems like on a per core basis, Intel still has the advantage on both fronts?
spronkey - Tuesday, March 12, 2013
I would also point out that the A15 has already been compared against Sandy and Ivy cores and come up short in performance per watt; so I'm very interested to see what the next step for these ARM node servers is.
JohanAnandtech - Wednesday, March 13, 2013
I warned against the hype in the first sentences. :-) ARM CPUs are still rather weak and not a good match for most applications. However, the fact that we could actually find a case where they do a lot better than the current Xeon systems was surprising to me.
wsw1982 - Wednesday, April 3, 2013
No, it should not surprise anyone, given how picky the use case is. I mean, I do think you can find a use case where the ARM 11 outperforms the Xeon, e.g. serving 1 web request per hour :)
LogOver - Tuesday, March 12, 2013
24 servers ran inside 24 VMs on the Xeon server, while for the ARM server you used the 24 physical server nodes... Hmm... That does not seem like an apples-to-apples comparison to me. Why not compare, for example, 16 physical nodes on both the Xeon and ARM servers?
haplo602 - Wednesday, March 13, 2013
And how do you slice the Xeon server into 16 physical nodes? It does not support any kind of HW partitioning that I am aware of. On the other hand the Calxeda machine is a cluster by design. If you try 16 Xeon nodes you'll go through the roof with power.
Colin1497 - Wednesday, March 13, 2013
I think the question is this: Was 24 VMs optimal for the Xeon? Since we're virtualizing the Xeon, why 24? Just because you had 24 ARM nodes? Would the Xeon have done better with 4 VMs? Or 16? Or 1000? 24 seems arbitrary.
JohanAnandtech - Wednesday, March 13, 2013
We tested with 16 as I briefly mentioned in the conclusion. The 2650L did 170 responses/s per VM, or about 40% better. Total throughput was 2.7k/s, versus 2.9k/s with 24 VMs. The flexibility that the Xeon has to reduce the number of VMs if higher throughput is necessary is definitely an advantage, but the performance numbers are not that different with different VM configs.
Kurge - Wednesday, March 13, 2013
How about with 0 VM's? Just run it on the metal.