Increase the speed of Linux Software RAID reconstruction
If you are sitting in front of the console (or on a remote ssh connection) waiting for a Linux software RAID to finish rebuilding (because you added a new drive, replaced a failed one, etc.), you might be frustrated by how slowly the process runs. You are running cat on /proc/mdstat repeatedly (you should really use watch in this case), and it seems to never finish… Obviously there is a logical reason for this ‘slowness’, and on a production system you should leave it running with the defaults. But in case you want to speed up the process, here is how you can do it. This will place a much higher load on the system, so use it with care.
To see the speed limits your Linux kernel imposes on RAID reconstruction, use:
cat /proc/sys/dev/raid/speed_limit_max
200000
cat /proc/sys/dev/raid/speed_limit_min
1000
In the system logs you can see something similar to:
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
This means that the minimum guaranteed speed of the rebuild of the array is approx 1MB/s. The actual speed will be higher and will depend on the system load and what other processes are running at that time.
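A small sketch for reading both limits at once and converting them to approximate MB/s. The function name show_raid_limits and its optional directory argument are illustrative assumptions, not part of any md tool; the directory defaults to the /proc path used above.

```shell
#!/bin/sh
# Print the md rebuild speed limits in KB/s and approximate MB/s.
# show_raid_limits is a hypothetical helper name. The optional argument
# points it at another directory (default: the real /proc path).
show_raid_limits() {
    dir="${1:-/proc/sys/dev/raid}"
    for name in speed_limit_min speed_limit_max; do
        kb=$(cat "$dir/$name") || return 1
        # The values are in KB/s; dividing by 1000 gives a rough MB/s figure.
        echo "$name: $kb KB/s (~$((kb / 1000)) MB/s)"
    done
}

# Usage (on a machine with the md driver loaded):
#   show_raid_limits
```

With the default values shown above, this would report roughly 1 MB/s for the minimum and 200 MB/s for the maximum.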
In case you want to increase this minimum speed you need to enter a higher value in speed_limit_min. For example to set this to approx 50 megabytes per second as minimum use:
echo 50000 >/proc/sys/dev/raid/speed_limit_min
The results are instant… you can return to the watch window to see it running, and hope that this will finish a little faster (this will really depend on the system you are running, the HDDs, controllers, etc.):
watch cat /proc/mdstat
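Since the higher minimum is only wanted while the rebuild runs, one option is to put the old value back afterwards. The following is a minimal sketch under a few assumptions: boost_rebuild is a made-up helper name, it must run as root, and it treats any line in /proc/mdstat containing resync, recovery or reshape as "still rebuilding".

```shell
#!/bin/sh
# Temporarily raise speed_limit_min, wait for the rebuild, then restore it.
# boost_rebuild is a hypothetical helper:
#   $1 = new minimum in KB/s
#   $2 = limit file (defaults to the real /proc entry)
#   $3 = status file (defaults to /proc/mdstat)
boost_rebuild() {
    new_min="$1"
    limit_file="${2:-/proc/sys/dev/raid/speed_limit_min}"
    mdstat="${3:-/proc/mdstat}"

    old_min=$(cat "$limit_file") || return 1   # remember the current value
    echo "$new_min" > "$limit_file" || return 1

    # Poll once a minute until no array reports a rebuild in progress.
    while grep -Eq 'resync|recovery|reshape' "$mdstat"; do
        sleep 60
    done

    echo "$old_min" > "$limit_file"            # put the old minimum back
}

# Usage (as root):
#   boost_rebuild 50000
```

The polling interval and the grep pattern are arbitrary choices for this sketch; adjust them to taste.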
6th September 2006, 04:39
Thank you for this nice tip. But what value should I set to speed_limit_min? If I give it a too high value (e.g. 100000), will it hurt my raid?
8th September 2006, 14:41
That should be ok and not cause any problems. The system will probably not be able to reach that speed, but it will by no means hurt the system. If you are doing this on an idle system and waiting for the RAID build to complete, this should be ok. If not, you might see the system responding much more slowly, as it will dedicate more resources to the RAID restoration.
21st September 2006, 01:01
Ahh, thanks. Didn’t help too much on these linked up IDEs, but 30 MB/s went to about 32, time dropped from 85 to 65 minutes within an instant.
3rd January 2007, 23:26
I’m desperately searching the web for the cause of the slow reconstruction speed of my Ubuntu Edgy raid1 array. It seems that my system is totally insensitive to changes in speed_limit_max and speed_limit_min.
Whatever value I set in those variables, the resync speed stays stuck between 4000 and 6000 KB/sec.
Yes, my “diskparm -Tt” is ok while the system is healthy. Yes, this horrible performance is independent of system load and usage.
I’m starting to suspect that there are some issues with Ubuntu itself. I think I’m going to report it on the official forums, but it’s important to spread this message across the web: poor Ubuntu mdadm users, you are not alone!
2nd April 2007, 04:02
I cannot seem to write to this file as root:
root ~ # echo 50000 > /proc/sys/dev/raid/speed_limit_min
bash: /proc/sys/dev/raid/speed_limit_min: cannot overwrite existing file
root ~ # ls -l /proc/sys/dev/raid/speed_limit_min
-rwxr--r-- 1 root root 0 Apr 2 00:00 /proc/sys/dev/raid/speed_limit_min
I am in the process of resynching.. does this lock the file?
2nd April 2007, 06:53
Steve:
“does this lock the file?” No. You can change it on the fly, at any time, without doing anything special. The setting takes effect instantly (though you may not see it immediately in the real sync speed, because that depends on many other factors).
Why are you not able to write to the file? I am not sure… It might be something related to your OS/Kernel/SE/Permissions, etc. I can’t tell from the given information alone.
HTH.
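One possible explanation, offered as a guess rather than a confirmed diagnosis of that system: "cannot overwrite existing file" is the message bash prints when its noclobber option (set -C) is turned on, and in that case a plain > refuses to write to any existing regular file. A minimal sketch using a scratch file as a stand-in for the /proc entry, so it is safe to run anywhere:

```shell
#!/bin/sh
# Reproduce the noclobber behaviour on a temporary file
# (a stand-in for /proc/sys/dev/raid/speed_limit_min).
target=$(mktemp)
echo 1000 > "$target"

set -o noclobber                      # same as: set -C
# With noclobber on, a plain > on an existing file fails.
if ! { echo 50000 > "$target"; } 2>/dev/null; then
    echo "plain > refused"
fi
echo 50000 >| "$target"               # >| overrides noclobber for this write
set +o noclobber                      # turn the option off again

cat "$target"
```

This prints "plain > refused" followed by 50000. If noclobber is the cause, either echo 50000 >| /proc/sys/dev/raid/speed_limit_min or running set +o noclobber first should let the write through.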
13th April 2007, 18:15
Doesn’t work on CentOS 4.4. The speeds always stay the same. I did see them change once, but it went from 23000 down to 18000 after setting min and max to 50000. Since 23000 was over 20000 (which was supposed to be the max, according to /proc . . .) I don’t think it’s working at all. This is an old kernel. CentOS is using 2.6.9.
4th June 2007, 06:26
I also had the “permission denied” problem on my ubuntu 6.06 machine. The trick is to REALLY switch user to root, as plain sudo does not work…
Try:
sudo -s -H
and then
echo 50000 >/proc/sys/dev/raid/speed_limit_min
and instantly I was down from 7000 minutes to 240!!
17th November 2007, 17:57
Great tip! Thank you!
I was struggling with my RAID recovering at no more than 256KB/s (min_speed was set to 100) with almost no I/O or CPU activity. Now, with min_speed set to 5000, I get ~5MB/s, which is enough. Interesting that the system doesn’t seem to care about the max_speed (set to 100000) value too much.
19th February 2008, 03:31
[...] Now’s the time to go get a drink, see a movie, wash the car/dog/cat, file your taxes AND take a nap. This will take a long time, depending on the amount of data on the drives and the speed of your system. On my system, it reported 1500 min (that’s 25 hours) when it started. You can speed this up some (at the expense of load on the system), by following the advice here. [...]
3rd November 2008, 11:31
Regarding the permission denied problem: “sudo echo” does not do the trick, because only echo runs with root privileges; the redirection to /proc/sys/dev/raid/speed_limit_min is performed by your own unprivileged shell. What does work is:
echo 10000 | sudo tee /proc/sys/dev/raid/speed_limit_min
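The same idea can be wrapped in a small function that writes directly when the file is already writable and falls back to sudo tee otherwise. This is only a sketch; set_limit is a made-up name, and the default path is the one used throughout this post.

```shell
#!/bin/sh
# Write a value to an md speed limit file, with or without root.
# set_limit is a hypothetical helper:
#   $1 = value in KB/s
#   $2 = file (defaults to the real speed_limit_min entry)
set_limit() {
    value="$1"
    file="${2:-/proc/sys/dev/raid/speed_limit_min}"
    if [ -w "$file" ]; then
        # Already privileged enough: a plain redirection works.
        echo "$value" > "$file"
    else
        # The file must be opened by a root process, so let tee do it
        # under sudo; tee's copy to stdout is discarded.
        echo "$value" | sudo tee "$file" > /dev/null
    fi
}

# Usage:
#   set_limit 50000
```

Two other forms that run the write itself as root are sudo sh -c 'echo 50000 > /proc/sys/dev/raid/speed_limit_min' and sudo sysctl -w dev.raid.speed_limit_min=50000.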
27th November 2008, 01:38
Unfortunately this doesn’t seem to help in our case – 2 sata disks which are seen as SCSI by RHEL5 (2.6 kernel) and the raid rebuild is very very slow (1000-500K/sec).
changing speed_limit_min makes no difference.
anyone else got ideas ? there are no errors or other obvious problems.
-j
6th December 2008, 14:56
[...] the sync speed is only around 1000KB/s. I tried looking for solutions on Google and found this and this. The solution I got didn’t help increase the sync speed, nor the slow response I was getting [...]
7th March 2009, 01:35
Thank you! I was growing a full 3TB raid up to 7.5TB (6×1.5TB raid5) and it was supposed to take 15 _days_. With this tweak I’m down to ~32hours! YAY!
7th October 2009, 06:00
any ideas how to do this on a hardware or dm raid?
i’ve an 4gb reconfiguring to 6.8 and i’ve about 14 days remaining
6th January 2010, 06:15
My problem is that writing or reading files to the array is extremely slow, less than 1MiB/sec
I have 5 x 1TB drives on SATA 1.5gbps ports using MDADM on a system with 1gb RAM, and a 2ghz single core AMD cpu.
Any help would be appreciated.
6th January 2010, 18:38
@Chas: is the speed of the individual drives faster than what you get on the raid partition? I would say that the speed penalty of this comes from the speed of the actual drives as mdadm is very fast normally.
7th January 2010, 02:10
I haven’t checked that, but it is difficult to imagine a modern hard drive anything like that slow, I’ll check and post more info.
9th January 2010, 06:35
Well, I think the problem is the sil3114. After some Googling, I found that it has issues with drives of 1TB +
It is possible that a newer firmware on the card may help .. I have 2 cards, and one I used a year or so back with some 320GB drives in SATA controller mode and it worked great. THAT card hangs after displaying the model of the first drive .. the other one seems to have been assembled with a flash chip that is not supported by the mfg’s flash utility (?!) and so it is in raid mode. However, Linux still detected the drives individually (bypassing the card’s firmware?!). Using hdparm showed no DMA flag at all, so it is probably using PIO mode.
I will try to re-flash the first card and see if things work any better, I might find a flash chip on ebay and do a bit-o field engineering
14th January 2010, 07:08
cont'd: In the end I’d rather just have it work, and it was less expensive to get an ASROCK A780LM motherboard from Newegg than to buy a better controller card ($62.55 USD shipped). It has 6 SATA II ports and built-in Gbit Ethernet. Now I just need to figure out why it’s only getting 100 Mbit link speeds. It’s a great board for a file server. I just transplanted the CPU/RAM from the old one.
PCLinuxOS 2009.2 running on:
AMD Sempron 1150LE (2ghz 45W)
ASROCK A780LM integrated everything board (one IDE, 6 SATA II ports)
2GB Corsair DDR2 5300 RAM
5 WDC green 1TB SATA drives in software raid5
1 laptop 120GB SATA boot drive
The RAID transfer rate over the network is now about 9.94 Mbytes/sec, pretty close to the max for 100Mbit.
28th January 2010, 17:51
Hi
First of all, thank you for this nice explanation. I am running Fedora 10 with kernel 2.6.27.41-170.2.117.fc10.x86_64. My system is a Core 2 Duo 2.66GHz with 4GB of memory, and I have a software RAID 5 with 7x1TB running. I changed the settings as described here, but my reshape speed is like this:
[==>..................] reshape = 11.2% (108814776/969924224) finish=1065.4min speed=13467K/sec
I can’t get a higher speed, and during the reshape, if I have a look at top, my system is 70% idle.
I think it really depends on which disks you have, which controllers are used, and which kernel and distro are in use. Grrr, waiting and watching.
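The finish estimate in a progress line like the one above can be reproduced by hand: remaining blocks divided by the current speed. A minimal sketch, assuming the two numbers in parentheses are 1 KB blocks and the speed is in K/sec; eta_minutes is a made-up helper name.

```shell
#!/bin/sh
# Estimate the remaining minutes from the numbers in an mdstat progress line.
# eta_minutes is a hypothetical helper:
#   $1 = blocks done, $2 = total blocks (assumed 1 KB each), $3 = speed in K/sec
eta_minutes() {
    # remaining KB / (KB per second) = seconds; / 60 = minutes (integer math)
    echo $(( ($2 - $1) / $3 / 60 ))
}

eta_minutes 108814776 969924224 13467   # prints 1065
```

That matches the finish=1065.4min reported in the line above to within rounding, so the estimate simply scales with the speed: doubling the speed would roughly halve the remaining time.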
14th March 2010, 20:45
@phantomd This is why I ditched the 1.5Gbps 4-port PCI card with the crappy bios.
I looked at my options and found that a motherboard with 6 SATA ports @ 3Gbps integrated was less expensive than most good 4 port cards.
If only I’d known about using chunk sizes of 128kb or even 256kb to speed things even more. As it stands with 32kb chunks, I get about 16 to 20Mbytes/sec over Gbit Ethernet. I may swap the CPU with another machine to an Athlon64 x2 @ 2.5Ghz (65W).
Good luck
15th April 2010, 20:34
On an AMD Phenom II system, I was able to drop the rebuild time from almost 8,000 minutes down to about 3 – 4 hours with almost no impact on my desktop performance. Let’s face it, RAID isn’t a huge CPU load so on a modern system, the defaults are probably way too low.
Considering that the entire time the array is rebuilding, you are vulnerable to a disk failure, getting the array rebuilt should be a priority item (unless you have two or more redundant drives).
My latest drive “failure” came from a loose cable. This is the third such incident since I built the array. I only open the box to replug the cables, so I’m wondering if the SATA connectors are more prone to working loose than the PATA connectors were?
I notice that some SATA cables have a clip to hold them in place while others are just press-fit. Does anyone have any comments on the relative merits of the two types?
16th April 2010, 08:14
I’m having exactly the same problem (SATA connectors losing contact without being moved). The connector type with the latch is less prone to this, but it still happens, just less often: I’d say every 3-6 months. Reseating them helps for another 3-6 months. I’ve tried buying new ones, but no change.
I know there are so called contact cleaner sprays that reduce contact resistance, but I haven’t tried one yet.
18th June 2010, 19:24
Thanks a lot for this tip. Really useful piece of information.