Storage configuration:
IBM V7000 Gen2+
10x 4TB SSD
Interface: 10GbE (iSCSI)
Test VM configuration:
2 vCPUs, 4GB RAM
Debian Linux on ESXi 6
Datastore mapped to the ESXi host
Performance was limited by the disks; CPU usage on the storage stayed under 40%.
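The exact fio invocations aren't shown in the logs below, but commands along these lines would reproduce the three workloads (bs, ioengine, iodepth, and the 40GB file size are taken from the output; the file path and --direct=1 are assumptions, and --rwmixread=75 is inferred from the ~75/25 read/write split of the mixed run):

# mixed 75/25 random read/write (path is hypothetical, a file on the sdb filesystem)
fio --name=test --filename=/mnt/test/fio.dat --size=40g --bs=4k --ioengine=libaio --iodepth=64 --direct=1 --rw=randrw --rwmixread=75
# 100% random read
fio --name=test --filename=/mnt/test/fio.dat --size=40g --bs=4k --ioengine=libaio --iodepth=64 --direct=1 --rw=randread
# 100% random write
fio --name=test --filename=/mnt/test/fio.dat --size=40g --bs=4k --ioengine=libaio --iodepth=64 --direct=1 --rw=randwrite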
DRAID5 (8D + 1P + 1S)
test: 75/25 random read/write
test: (g=0): rw=randrw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.0.9
Starting 1 process
test: Laying out IO file(s) (1 file(s) / 40960MB)
test: (groupid=0, jobs=1): err= 0: pid=3987: Wed Mar 29 07:57:35 2017
read : io=30723MB, bw=139125KB/s, iops=34781 , runt=226126msec
write: io=10237MB, bw=46360KB/s, iops=11589 , runt=226126msec
cpu : usr=9.58%, sys=37.75%, ctx=709816, majf=0, minf=4
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=7864963/w=2620797/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=30723MB, aggrb=139125KB/s, minb=139125KB/s, maxb=139125KB/s, mint=226126msec, maxt=226126msec
WRITE: io=10237MB, aggrb=46359KB/s, minb=46359KB/s, maxb=46359KB/s, mint=226126msec, maxt=226126msec
Disk stats (read/write):
sdb: ios=7856666/2618202, merge=0/45, ticks=10471096/3310356, in_queue=13777900, util=100.00%
test: 100% random read
test: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.0.9
Starting 1 process
test: (groupid=0, jobs=1): err= 0: pid=3991: Wed Mar 29 08:00:59 2017
read : io=40960MB, bw=206382KB/s, iops=51595 , runt=203230msec
cpu : usr=9.88%, sys=38.44%, ctx=665551, majf=0, minf=68
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=10485760/w=0/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=40960MB, aggrb=206382KB/s, minb=206382KB/s, maxb=206382KB/s, mint=203230msec, maxt=203230msec
Disk stats (read/write):
sdb: ios=10475310/2, merge=0/1, ticks=11981140/0, in_queue=12170624, util=100.00%
test: 100% random write
test: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.0.9
Starting 1 process
test: (groupid=0, jobs=1): err= 0: pid=3995: Wed Mar 29 08:05:59 2017
write: io=40960MB, bw=139572KB/s, iops=34892 , runt=300512msec
cpu : usr=7.17%, sys=29.45%, ctx=546661, majf=0, minf=4
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=0/w=10485760/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=40960MB, aggrb=139571KB/s, minb=139571KB/s, maxb=139571KB/s, mint=300512msec, maxt=300512msec
Disk stats (read/write):
sdb: ios=0/10480731, merge=0/60, ticks=0/17984528, in_queue=17979712, util=100.00%
DRAID6 (7D + 2P + 1S)
test: 75/25 random read/write
test: (g=0): rw=randrw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.0.9
Starting 1 process
test: Laying out IO file(s) (1 file(s) / 40960MB)
test: (groupid=0, jobs=1): err= 0: pid=4021: Tue Mar 28 19:43:26 2017
read : io=30719MB, bw=127238KB/s, iops=31809 , runt=247222msec
write: io=10241MB, bw=42419KB/s, iops=10604 , runt=247222msec
cpu : usr=9.06%, sys=33.66%, ctx=787004, majf=0, minf=4
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=7864034/w=2621726/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=30719MB, aggrb=127238KB/s, minb=127238KB/s, maxb=127238KB/s, mint=247222msec, maxt=247222msec
WRITE: io=10241MB, aggrb=42418KB/s, minb=42418KB/s, maxb=42418KB/s, mint=247222msec, maxt=247222msec
Disk stats (read/write):
sdb: ios=7862987/2621478, merge=0/49, ticks=11209100/3770780, in_queue=14975764, util=100.00%
test: 100% random read
test: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.0.9
Starting 1 process
test: (groupid=0, jobs=1): err= 0: pid=4025: Tue Mar 28 19:47:06 2017
read : io=40960MB, bw=190892KB/s, iops=47723 , runt=219721msec
cpu : usr=9.46%, sys=35.56%, ctx=756041, majf=0, minf=68
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=10485760/w=0/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=40960MB, aggrb=190892KB/s, minb=190892KB/s, maxb=190892KB/s, mint=219721msec, maxt=219721msec
Disk stats (read/write):
sdb: ios=10478253/3, merge=0/1, ticks=13149764/169304, in_queue=13315696, util=100.00%
test: 100% random write
test: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.0.9
Starting 1 process
test: (groupid=0, jobs=1): err= 0: pid=4029: Tue Mar 28 19:53:33 2017
write: io=40960MB, bw=108597KB/s, iops=27149 , runt=386225msec
cpu : usr=5.93%, sys=22.56%, ctx=660059, majf=0, minf=4
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=0/w=10485760/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=40960MB, aggrb=108597KB/s, minb=108597KB/s, maxb=108597KB/s, mint=386225msec, maxt=386225msec
Disk stats (read/write):
sdb: ios=0/10484791, merge=0/83, ticks=0/23609532, in_queue=23605968, util=100.00%
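Putting the two layouts side by side (IOPS, taken from the runs above):

                 DRAID5 (8D+1P+1S)    DRAID6 (7D+2P+1S)
randrw 75/25     34781 r / 11589 w    31809 r / 10604 w
randread         51595                47723
randwrite        34892                27149

DRAID6 gives up roughly 8% on random reads and roughly 22% on pure random writes, the expected price of the second parity.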
Comments:
- Well, of course performance will be limited by the disks if we slap them into any parity-based RAID configuration, as they will be limited to a single drive's IOPS and will also incur a latency hit that gobbles up resources.
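For a rough sense of where the parity penalty lands (illustrative only; the per-SSD figure is an assumption, not a measured spec): a classic RAID5 random write costs 4 back-end I/Os (read data, read parity, write data, write parity) and RAID6 costs 6. With 9 active drives at, say, 20,000 random write IOPS each:
9 x 20,000 / 4 = 45,000 host write IOPS (RAID5)
9 x 20,000 / 6 = 30,000 host write IOPS (RAID6)
The measured randwrite results above (34,892 vs 27,149 IOPS) land in that ballpark.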
- The 40% CPU usage is not very informative either. While IBM made DRAID use multithreaded queues, it does not scale threads beyond the number of cores assigned to that task, and they limit the number of cores a given task may use. So you will not see it hit 100% in the UI display while the responsible CPU cores may very well be at 100% usage (see the illustration after this comment).
This is IBM's design, and nothing you can avoid in such a closed system. Luckily, in the next generations they needed to go a different route, as newer CPUs would come with both more cores and more PCIe lanes for NVMe.
So they were, in a way, forced by Intel to optimize their scale-out. AFAIK they still haven't ported the multithreading over to RAID 10, meaning that you will, yes, always be "disk limited". But it's not really due to the disks.
Clarification: RAID 1 isn't optimized for multithreaded I/O in the system either, AFAIK. From what I've seen, they put a lot more effort into making sure you can't do anything unsupported with your system than into optimization.
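To illustrate the averaging point (the core count here is hypothetical, just for the arithmetic): on an 8-core controller where the 3 cores assigned to the RAID task are pinned at 100% and the other 5 sit idle, the GUI average is 3/8 = 37.5%, i.e. "under 40% CPU" while the cores actually doing the work are saturated.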
I sold off my roughly same-age Tegile, and the (enterprise, no less) V3k/V5k/V7k performance is abysmal in comparison, albeit with better hardware. IBM's stinginess limits it too much; they seemingly don't want to hand out one more MHz than absolutely necessary. Probably their internal processes and the IBM-ification eat up so many resources and $$$ that it suddenly looks plausible to use the lowest-end CPUs "since we are disk bound anyway", and accordingly they end up being "disk bound".
They surely don't seem to have thought: "OK, but if we go from a Pentium D to an 8-core Xeon D, or from a 2609 to a 2667, we might get 30% more performance; maybe we should do that and let the system show its full potential."