Budget home NAS performance analysis
Intro
I decided to reuse some older hardware I had lying around to build a NAS where we can store our media. Music and movies make up a portion of our data, but it’s the photos which are by far the most important to us. And we have lots and lots of them. I could have bought a NAS and populated it with drives, but I wanted to take on the challenge of setting up Raid arrays myself. This way I could also use the server for things like running containers and other bits and pieces. No Raid setup is a substitute for a good backup strategy: I run a weekly job to transfer everything from the server to another PC’s single drive, and every 3 months I back up the pictures and documents to the cloud. I’m willing to lose movies and videos should the array die along with the backup drive.
After setting it all up I noticed that the write speeds of the array were quite slow. Looking around for articles and posts on various sites, I was unable to find any concrete performance numbers from a setup similar to mine for comparison. I did come across lots of server-grade setups, but those were backed by hardware Raid controllers and/or used SSD caches. My setup uses neither, nor does it really have to, as I’m not looking to make it go faster. I just want to understand whether the numbers I am seeing are indeed the limit of what I can expect.
Articles online put the theoretical write speed at the number of drives multiplied by the speed of a single drive, all divided by 2 because of Raid-10’s mirroring. With my HDDs this comes to 4 * 180 / 2 = 360MB/s. Yet, as shown by the benchmarks, I am only able to get 200MB/s even with sequential writes. Random writes are ~30% slower.
I’m also trying to justify an upgrade of our LAN to 10Gb, and the performance numbers here will help me make that decision. Based on my criteria I formulated the following questions, and this page is all about answering them:
- Am I getting all that I can from the drives I’ve chosen?
- What sort of speeds can I expect to see?
- Based on the speeds, does it make sense to upgrade from 1Gb ethernet?
- How long will a full backup take?
- Can I edit photos and videos directly off of the NAS?
Firstly, I’m going to run some artificial tests against all of my drives to get a baseline of what they can do. I have a secondary SSD attached to my server, and the boot drive is also an SSD. I’m going to run the benchmarks against all the drives, but will not run any other tests on the boot drive. I will then compare the results against a practical test using ‘curl’ and ‘rsync’ to copy the test file between the array and the secondary SSD. All the drives are plugged into SATA 6Gb/s ports on my motherboard.
Hardware
Boot device
Model Number: KINGSTON SH103S3240G
Firmware Revision: 580ABBF0
Read/Write: 555 / 510 MB/s
A SATA SSD left over from a previous build. I suspect this is a dodgy drive, as the write speeds I actually see are far below the rated figure. Never trust this brand.
Secondary SSD
Model Number: CT500MX500SSD1
Firmware Revision: M3CR020
Read/Write: 560 / 510 MB/s
A cheap SATA SSD I use for Virtual Machines. All of my VMs were shut down when running the tests.
RAID 10 drive (x4)
Model Number: ST4000VN008-2DR166
Firmware Revision: SC60
Read/Write: 180 MB/s
Cache: 64MB
RPM: 5900
I chose these because they were the quietest spinning drives I was able to buy here. I understand that NAS-rated drives are meant to be used more for sequential operations, with an emphasis on reads. Apparently such drives are also designed to run in close proximity to each other, so they should have more tolerance to heat and vibration. My drives are installed on rubber grommets and sit next to two 140mm fans which keep them cool. For the file system I went with ext4.
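For context, here is a minimal sketch of how an array like mine can be put together with ‘mdadm’. The device names are illustrative; the 512K chunk size and near layout match what the mdstat output shows later on:
$ sudo mdadm --create /dev/md0 --level=10 --layout=n2 --chunk=512 \
    --raid-devices=4 /dev/sdc /dev/sdd /dev/sde /dev/sdf
$ sudo mkfs.ext4 /dev/md0
$ sudo mount /dev/md0 /mnt/storage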
PCI-e x4 NVME SSD (x2)
Model Number: Intel DC P3605 SSDPEDME016T4S
Firmware Revision: 8DV1RA13
Read/Write: 2,600 / 1,700 MB/s
Random IOPS Read/Write: 450,000 / 56,000
Two second-hand NVME SSDs, each sat in a PCI-e x4 slot. At 100 EUR per drive this was a steal considering their low latency, especially as according to ‘smartctl’ one drive has 75% life left while the second one has 99%. The bandwidth and IOPS numbers are somewhat lower compared to new drives, but are more than enough for a home lab.
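For anyone wanting to check the wear on their own drives, the figure comes from the NVME health log that ‘smartctl’ prints; the device path here is an example, and a ‘Percentage Used’ of 25% would correspond to the 75% life left mentioned above:
$ sudo smartctl -a /dev/nvme0 | grep -i 'percentage used'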
Synthetic test setup
Before trying out any copy tools, we need to benchmark the array as is in order to know how far away from the best-case scenario we are. For this I used fio as the performance benchmarking tool. You can install it with:
$ sudo apt install fio
$ fio -v
fio-3.16
iostat gives us the disk utilization along with useful information about queues and the number and size of requests. In the output below, r/s and w/s are the requests issued per second, rMB/s and wMB/s the resulting throughput, r_await and w_await the average request latency including time spent queuing, aqu-sz the average queue length, and %util how busy the device is.
$ sudo iostat -xmd /dev/sd[c-f] 1 100000
The benchmarks adjust the io-depth (QD), and the number of jobs (W) is also increased to generate more IO. The block-size is set to 512k across all the tests, and ‘--direct=1’ bypasses the page cache so we measure the drives rather than RAM.
Reads
QD=1, W=1
fio --name=myjob --randrepeat=0 --ioengine=libaio --direct=1 --size=10G --filename=testfile \
--bs=512k \
--iodepth=1 \
--rw=read \
--numjobs=1
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 294.00 147.00 0.00 0.00 1.79 512.00 0.00 0.00 0.00 0.00 0.00 0.00 0.08 100.00
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sde 295.00 147.50 0.00 0.00 1.42 512.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 92.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
READ: bw=303MiB/s (318MB/s), 303MiB/s-303MiB/s (318MB/s-318MB/s), io=10.0GiB (10.7GB), run=33771-33771msec
QD=1, W=2
fio --name=myjob --randrepeat=0 --ioengine=libaio --direct=1 --size=10G --filename=testfile \
--bs=512k \
--iodepth=1 \
--rw=read \
--numjobs=2
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 167.00 83.50 0.00 0.00 4.92 512.00 0.00 0.00 0.00 0.00 0.00 0.00 0.55 86.40
sdd 36.00 18.00 0.00 0.00 5.89 512.00 0.00 0.00 0.00 0.00 0.00 0.00 0.15 30.00
sde 179.00 89.50 0.00 0.00 4.25 512.00 0.00 0.00 0.00 0.00 0.00 0.00 0.46 86.00
sdf 24.00 12.00 0.00 0.00 6.67 512.00 0.00 0.00 0.00 0.00 0.00 0.00 0.12 22.00
READ: bw=196MiB/s (206MB/s), 98.2MiB/s-98.2MiB/s (103MB/s-103MB/s), io=20.0GiB (21.5GB), run=104237-104304msec
Now we increase the io-depth (QD) to 32, which is the maximum queue size for SATA HDDs.
QD=32, W=1
fio --name=myjob --randrepeat=0 --ioengine=libaio --direct=1 --size=10G --filename=testfile \
--bs=512k \
--iodepth=32 \
--rw=read \
--numjobs=1
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 216.00 108.00 0.00 0.00 34.15 512.00 0.00 0.00 0.00 0.00 0.00 0.00 6.97 100.00
sdd 246.00 123.00 0.00 0.00 35.79 512.00 0.00 0.00 0.00 0.00 0.00 0.00 8.28 100.40
sde 233.00 116.50 0.00 0.00 33.32 512.00 0.00 0.00 0.00 0.00 0.00 0.00 7.28 99.60
sdf 234.00 117.00 0.00 0.00 34.37 512.00 0.00 0.00 0.00 0.00 0.00 0.00 7.58 100.00
READ: bw=462MiB/s (484MB/s), 462MiB/s-462MiB/s (484MB/s-484MB/s), io=10.0GiB (10.7GB), run=22164-22164msec
QD=32, W=2
fio --name=myjob --randrepeat=0 --ioengine=libaio --direct=1 --size=10G --filename=testfile \
--bs=512k \
--iodepth=32 \
--rw=read \
--numjobs=2
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 289.00 149.00 8.00 2.69 131.89 527.94 0.00 0.00 0.00 0.00 0.00 0.00 37.56 98.80
sdd 327.00 163.50 0.00 0.00 57.14 512.00 0.00 0.00 0.00 0.00 0.00 0.00 18.02 100.00
sde 295.00 147.50 0.00 0.00 18.81 512.00 0.00 0.00 0.00 0.00 0.00 0.00 5.02 96.80
sdf 284.00 142.00 0.00 0.00 22.90 512.00 0.00 0.00 0.00 0.00 0.00 0.00 5.96 99.60
READ: bw=611MiB/s (640MB/s), 611MiB/s-611MiB/s (640MB/s-640MB/s), io=20.0GiB (21.5GB), run=33533-33533msec
Writes
We will run the same set of tests for the writes and we will add a few additional tests with more workers.
QD=1, W=1
fio --name=myjob --randrepeat=0 --ioengine=libaio --direct=1 --size=10G --filename=testfile \
--bs=512k \
--iodepth=1 \
--rw=write \
--numjobs=1
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 0.00 0.00 0.00 0.00 0.00 0.00 136.00 67.01 0.00 0.00 3.42 504.53 0.06 93.60
sdd 0.00 0.00 0.00 0.00 0.00 0.00 136.00 67.01 0.00 0.00 3.35 504.53 0.05 92.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 136.00 67.01 0.00 0.00 3.79 504.53 0.10 98.40
sdf 0.00 0.00 0.00 0.00 0.00 0.00 136.00 67.01 0.00 0.00 3.45 504.53 0.06 94.00
WRITE: bw=114MiB/s (120MB/s), 114MiB/s-114MiB/s (120MB/s-120MB/s), io=10.0GiB (10.7GB), run=89709-89709msec
QD=1, W=2
fio --name=myjob --randrepeat=0 --ioengine=libaio --direct=1 --size=10G --filename=testfile \
--bs=512k \
--iodepth=1 \
--rw=write \
--numjobs=2
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 0.00 0.00 0.00 0.00 0.00 0.00 266.00 132.01 0.00 0.00 3.47 508.18 0.10 99.20
sdd 0.00 0.00 0.00 0.00 0.00 0.00 266.00 132.01 0.00 0.00 3.44 508.18 0.10 99.60
sde 0.00 0.00 0.00 0.00 0.00 0.00 267.00 132.51 0.00 0.00 3.47 508.19 0.09 99.60
sdf 0.00 0.00 0.00 0.00 0.00 0.00 266.00 132.01 0.00 0.00 3.35 508.18 0.09 98.40
WRITE: bw=248MiB/s (260MB/s), 124MiB/s-124MiB/s (130MB/s-130MB/s), io=20.0GiB (21.5GB), run=82449-82452msec
QD=32, W=1
fio --name=myjob --randrepeat=0 --ioengine=libaio --direct=1 --size=10G --filename=testfile \
--bs=512k \
--iodepth=32 \
--rw=write \
--numjobs=1
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 0.00 0.00 0.00 0.00 0.00 0.00 202.00 99.51 0.00 0.00 45.64 504.46 8.86 90.00
sdd 0.00 0.00 0.00 0.00 0.00 0.00 205.00 101.01 0.00 0.00 48.43 504.57 9.50 90.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 197.00 97.01 0.00 0.00 45.09 504.26 8.46 97.60
sdf 0.00 0.00 0.00 0.00 0.00 0.00 205.00 101.01 0.00 0.00 46.79 504.57 9.19 89.60
WRITE: bw=188MiB/s (197MB/s), 188MiB/s-188MiB/s (197MB/s-197MB/s), io=10.0GiB (10.7GB), run=54462-54462msec
QD=32, W=2
fio --name=myjob --randrepeat=0 --ioengine=libaio --direct=1 --size=10G --filename=testfile \
--bs=512k \
--iodepth=32 \
--rw=write \
--numjobs=2
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 0.00 0.00 0.00 0.00 0.00 0.00 193.00 95.51 0.00 0.00 38.73 506.74 7.10 93.20
sdd 0.00 0.00 0.00 0.00 0.00 0.00 191.00 94.51 0.00 0.00 34.30 506.68 6.17 84.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 194.00 96.01 0.00 0.00 38.01 506.76 6.99 98.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 194.00 96.01 0.00 0.00 33.31 506.76 6.06 86.80
WRITE: bw=218MiB/s (229MB/s), 109MiB/s-109MiB/s (114MB/s-114MB/s), io=20.0GiB (21.5GB), run=93777-93908msec
QD=32, W=4
fio --name=myjob --randrepeat=0 --ioengine=libaio --direct=1 --size=10G --filename=testfile \
--bs=512k \
--iodepth=32 \
--rw=write \
--numjobs=4
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 0.00 0.00 0.00 0.00 0.00 0.00 272.00 135.50 0.00 0.00 32.98 510.13 8.43 89.60
sdd 0.00 0.00 0.00 0.00 0.00 0.00 269.00 134.00 0.00 0.00 41.76 510.11 10.70 97.20
sde 0.00 0.00 0.00 0.00 0.00 0.00 271.00 134.51 1.00 0.37 35.55 508.27 9.10 95.20
sdf 0.00 0.00 0.00 0.00 0.00 0.00 271.00 134.51 1.00 0.37 34.84 508.27 8.88 93.60
WRITE: bw=242MiB/s (254MB/s), 60.6MiB/s-60.7MiB/s (63.5MB/s-63.6MB/s), io=40.0GiB (42.9GB), run=168747-169093msec
Note: increasing the io-depth (QD) really helps with the reads, but has the opposite effect on the writes, which slow down. The writes appear to improve by increasing the number of workers.
Practical usage
Now that I have a benchmark, it’s time to run some tests using regular tools. My main use-case is backing up a few TBs of data from the NAS to a different box. The second use-case is writing hundreds of photos in raw format onto the NAS. I’m going to run all the tests between the NAS drives and the PCI-e NVME, as just one of those drives will easily outperform the entire Raid array. Doing so removes any IO bottleneck between the devices. There is an added challenge with testing between onboard SATA devices, as the total SATA bandwidth on the motherboard is not the sum of all the SATA ports combined. My motherboard has 8xSATA6 and 2xSATA3 ports, giving it a theoretical total of 54 Gbps (6.75 GB/s). It’s impossible to run all the drives at full speed to achieve that throughput. Though I have never tried to find the maximum SATA bandwidth for my motherboard, I doubt that it will ever go over 16 Gbps (2 GB/s). I don’t have direct proof, but I’d be surprised if a consumer chipset could handle any more bandwidth. After all, how many people run a 6-drive SSD Raid array on their consumer motherboard?
rsync
Generate a test file first, then copy it between the devices, making sure to start the copy from the NVME:
$ timelimit -t 180 -T 180 cat /dev/random > /mnt/nvme0/testfile
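If you want a file of a known size instead of a time-boxed one, plain ‘dd’ works too; count=5120 produces the exact 5,368,709,120-byte file that shows up in several of the transfers below:
$ dd if=/dev/urandom of=/mnt/nvme0/testfile bs=1M count=5120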
read from Raid
to NVME
$ rsync --progress --stats /mnt/storage/testfile /mnt/nvme0/testfile
testfile
 35,941,646,336 100% 184.59MB/s 0:03:05 (xfr#1, to-chk=0/1)
iostat for all the devices shows:
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
nvme0n1 13.00 0.05 0.00 0.00 0.85 4.00 12668.00 1542.88 52.00 0.41 7.65 124.72 76.97 100.00

Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 122.00 103.97 93.00 43.26 3.49 872.66 0.00 0.00 0.00 0.00 0.00 0.00 0.13 78.40
sdd 25.00 10.53 2.00 7.41 7.56 431.36 0.00 0.00 0.00 0.00 0.00 0.00 0.14 25.60
sde 120.00 34.96 4.00 3.23 3.69 298.33 0.00 0.00 0.00 0.00 0.00 0.00 0.21 76.40
sdf 149.00 78.57 190.00 56.05 3.72 539.95 0.00 0.00 0.00 0.00 0.00 0.00 0.23 78.00
Although the write speed for nvme0n1 shows 1542.88 MB/s, it’s not a constant value. The entire transfer is quite spiky: the write-back cache is periodically flushed to the NVME at high speed, and then the write rate drops to 0 while the cache refills from the array.
to SATA SSD
$ rsync --progress --stats ../storage/testfile .
testfile
 5,368,709,120 100% 179.77MB/s 0:00:28 (xfr#1, to-chk=0/1)
From the synthetic tests and the cached copy shown further down, we know that both the Raid array and the SSD are capable of faster transfers. Inspecting the devices with ‘nmon’ reveals that the array is only being read at around 200MB/s, which points to ‘rsync’ being the bottleneck.
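A quick way to double-check this without rsync is to read the same file with direct IO and discard the data; if this runs faster than the copy did, the tool is the limit. A sketch using plain ‘dd’:
$ dd if=/mnt/storage/testfile of=/dev/null bs=1M iflag=direct status=progress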
write to Raid
from NVME
$ rsync --progress --stats /mnt/nvme0/testfile /mnt/storage/
testfile
51,304,464,384 100% 220.12MB/s 0:03:42 (xfr#1, to-chk=0/1)
iostat for all the devices shows:
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
nvme0n1 2196.00 274.50 0.00 0.00 0.05 128.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 98.80

Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz w/s wMB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sdc 19.00 0.07 0.00 0.00 14.63 4.00 334.00 117.25 2.00 0.60 14.60 359.49 4.42 94.80
sdd 0.00 0.00 0.00 0.00 0.00 0.00 334.00 117.25 2.00 0.60 12.50 359.49 3.49 90.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 307.00 116.80 4.00 1.29 12.75 389.60 3.29 80.40
sdf 0.00 0.00 0.00 0.00 0.00 0.00 307.00 116.80 4.00 1.29 12.60 389.60 3.26 80.80
from SATA SSD
$ rsync --progress --stats ../workers/testfile .
testfile
 5,368,709,120 100% 189.69MB/s 0:00:26 (xfr#1, to-chk=0/1)
A re-run of the copy does not read from the SSD drive; instead it uses the system cache. The write speed is in line with the expectation of 360MB/s, but I don’t think this is actually copying any data to the drive:
$ rsync --progress --stats ../workers/testfile testfile2
testfile
 5,368,709,120 100% 353.88MB/s 0:00:14 (xfr#1, to-chk=0/1)
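To take the page cache out of the picture before re-running a copy, the caches can be dropped first. This is a standard kernel facility rather than anything specific to my setup:
$ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches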
transfer between SATA SSDs
A quick test to see the speeds between the two SATA SSDs. The SATA SSDs I have here are not the best, but I’d still expect them to pull a higher speed between them than what the Raid could achieve.
$ rsync --progress --stats ~/testfile .
testfile
 5,368,709,120 100% 159.97MB/s 0:00:32 (xfr#1, to-chk=0/1)
This is an interesting result. I don’t understand why the transfer between the two SSDs is so slow. It’s possible that the boot device is actually running at 3Gbps instead of 6Gbps, but this result is only around 1.3Gbps, well below even that. No idea here.
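One way to test the link-speed theory, assuming smartmontools is installed, is to ask the drive for its negotiated SATA speed; the device path and output below are illustrative, and the ‘current’ figure is what the link actually runs at:
$ sudo smartctl -a /dev/sda | grep -i 'sata version'
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)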
curl
write to Raid
from SATA SSD
$ curl -o testfile FILE:///mnt/workers/testfile
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 8957M 100 8957M 0 0 397M 0 0:00:22 0:00:22 --:--:-- 308M
read from Raid
to SATA SSD
$ curl -o testfile FILE:///mnt/storage/testfile
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 5120M 100 5120M 0 0 197M 0 0:00:25 0:00:25 --:--:-- 178M
Explanation
Curl looks to be quicker than ‘rsync’ when it comes to writing to the Raid-10 array, while the read speeds remain similar. This is possibly because curl performs a plain sequential copy, whereas rsync pipes the data through separate sender and receiver processes and checksums it along the way.
mdadm sync
Output from mdadm sync; the speed of 189037K/sec lines up with what we are getting from rsync:
$ cat /proc/mdstat
Personalities : [raid10] [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid10 sde[1] sdd[0] sdg[4] sdf[2]
7813772288 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
[===================>.] check = 98.8% (7726865408/7813772288) finish=7.6min speed=189037K/sec
bitmap: 0/59 pages [0KB], 65536KB chunk
unused devices: <none>
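Worth knowing: md throttles these background checks. The limits are standard kernel tunables; on most distributions the defaults are 1000 and 200000 K/sec, which puts the check above close to the default cap as well:
$ sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
dev.raid.speed_limit_min = 1000
dev.raid.speed_limit_max = 200000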
Summary
‘rsync’, ‘curl’ and ‘pv’ all seem to be quite slow at reading from the Raid-10 array for some reason. I can only get read speeds of about 200 MB/s using any of these tools.
There might be a bottleneck between the motherboard storage controllers the drives are attached to. The Raid-10 array is plugged into SATA 6Gb/s ports powered by the ASMedia controller, while the two SSDs are driven by the Intel controller. To test this I’m going to plug in a PCI-e SATA 6Gb/s extension card and connect the secondary SSD to it.
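A quick, generic way to confirm which controllers the ports hang off is to list the SATA devices on the PCI bus; ‘lspci -vv’ additionally shows the PCI-e link width each controller negotiated, which is where a shared-bandwidth ceiling would come from:
$ lspci | grep -i sata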
Conclusion
This wasn’t as easy as I thought it would be. The clearest part was that the synthetic tests were always going to produce better results than real-world file copies. I’m glad about two things:
- I now know what the limits of the Raid-10 array are
- My knowledge in the area of storage has improved
Clearly there is a huge difference in performance depending on the software you are using. Going from write speeds of 309MB/s in an artificial test, to 268MB/s with ‘curl’, down to 190MB/s with ‘rsync’ really proves that you have to understand your use case in order to choose the right hardware, and then follow that up with appropriate configuration. I consider my use cases ‘ticked’ because I now know how fast my array can read and write. The speeds I’m getting suit me just fine; it’s understanding why I am seeing what I am seeing that matters. Getting the most out of a product is satisfying, especially from the point of view of an engineer.
What sort of speeds can I expect to see?
The writes are consistently around the 200MB/s mark, with the ability to hit 300MB/s when utilizing multiple threads. The reads seem to be around 180MB/s, which is roughly a third of what the synthetic tests can do.
Am I getting all that I can from the drives I’ve chosen?
No. The synthetic tests do reach 600MB/s in reads, but the apps I use cannot match that. I tried running multiple copy jobs at the same time, but together they still only summed to a read speed of 200 MB/s. I doubt this configuration can go any faster.
Based on the speeds, does it make sense to upgrade from 1Gb ethernet?
No. We rely on ‘rsync’ for the most part and it only reaches around 190MB/s on writes, which would not even exhaust 2.5Gb ethernet (312.5 MB/s, versus 125 MB/s for 1Gb). Windows copy might be able to make use of the extra bandwidth and come close to curl’s write speeds, but we hardly ever read data from multiple devices at the same time. Since we cannot get anywhere near the 600MB/s read speeds we saw during the synthetic tests, we can stay on 1Gb for the foreseeable future.
Possibilities for the future
Not all hope is lost, however. There are lots of other things we can try, though most of them would require a rebuild of the array. I’d also love to try this out with some different drives that come with higher transfer speeds and bigger caches. Here is a short brain dump of what we can do to explore further:
- hardware Raid, or an SSD drive as a cache; there are some experimental packages out there that can help with adding an SSD to ‘mdadm’ Raid arrays.
- a different file system
- a storage OS
- fiber optic LAN to get away from the presumed chipset bottleneck
- a different Raid setup. I did try Raid-5 initially, but the initial sync took days, with the drives spinning continuously the entire time.
- the ‘far’ layout. Mine is ‘Layout : near=2’, as confirmed below. For this I would have to rebuild the array.
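The current layout can be confirmed at any time with ‘mdadm’; this is the standard detail query, only the grep filter is mine:
$ sudo mdadm --detail /dev/md0 | grep -i layout
Layout : near=2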