I had set up a mirrored RAID in OMV. Then the PC hardware (not the disks, I should emphasize) failed and I needed to move the system disk and the two data disks to other hardware.
Because I decided to test each disk separately, this broke the mirror, and in OMV I now saw:
Work offline
I was really keen to keep the two disks in the same state and not have them accidentally updated by other servers on my network. So for the entire exercise I kept the OMV server and my desktop completely off the LAN, with a static IP set on my PC. I would strongly suggest you do something similar.
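For reference, one minimal way to do this is a direct cable between the desktop and the server, each with a static address and no gateway, so neither machine can reach the LAN. This is only a sketch: the interface name and addresses are assumptions, and the commands are echoed as a dry run so nothing is changed; remove the echos (and run as root) to apply them for real.

```shell
# eth0 and the 192.168.50.x addresses are examples - adjust to your hardware.
IFACE=eth0
echo ip link set "$IFACE" up
echo ip addr add 192.168.50.1/24 dev "$IFACE"   # on the desktop
echo ip addr add 192.168.50.2/24 dev "$IFACE"   # on the OMV server
# With no default gateway set, ssh between the two boxes still works but
# neither machine can see (or be updated by) anything else on the network.
```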
At this point I had a spare blank disk around, so I took one of the drives out of the array, put the new one in, did a "Recover" in OMV and rebuilt the array; once the array was built I took the spare disk out again. The point was that if everything went to custard I could use the spare, which I had now turned into a copy of one side of the mirror, to rebuild the array from scratch.
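As far as I can tell, OMV's "Recover" button is a front end for adding the new disk to the degraded array with mdadm. Treat this as a hedged sketch: the spare's device name is an example, and the command is echoed as a dry run so nothing is touched.

```shell
# /dev/sdd stands in for the blank spare - an assumed name, check yours first.
SPARE=/dev/sdd
echo mdadm --manage /dev/md/Mirror --add "$SPARE"
# mdadm then resyncs onto the spare; once `mdadm --detail` reports a clean
# state, the spare holds a full copy and can be pulled as an offline backup.
```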
If I log in to OMV using ssh and run:
mdadm --detail --scan
I get:
ARRAY /dev/md/Mirror metadata=1.2 name=nas1:Mirror UUID=391fb756:204b7361:f368c71b:fb4eea09
ARRAY /dev/md/Mirror metadata=1.2 name=nas1:Mirror UUID=391fb756:204b7361:f368c71b:fb4eea09
As you can see the two listed arrays are identical.
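That "same UUID, listed twice" pattern is the signature of a split mirror, and you can check for it mechanically. This is just a sketch using the two lines above as sample input; on the real box you would pipe the output of `mdadm --detail --scan` in instead.

```shell
# Count Array UUIDs that appear more than once in the scan output.
scan_output='ARRAY /dev/md/Mirror metadata=1.2 name=nas1:Mirror UUID=391fb756:204b7361:f368c71b:fb4eea09
ARRAY /dev/md/Mirror metadata=1.2 name=nas1:Mirror UUID=391fb756:204b7361:f368c71b:fb4eea09'
dupes=$(printf '%s\n' "$scan_output" | grep -o 'UUID=[0-9a-f:]*' | sort | uniq -d | wc -l)
echo "duplicated UUIDs: $dupes"   # anything above 0 means a mirror has split
```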
From the earlier screenshots you can see it thinks there are now two arrays, md126 and md127. I can query each of these:
mdadm --detail /dev/md126
Gives
/dev/md126:
Version : 1.2
Creation Time : Thu Nov 10 19:41:47 2016
Raid Level : raid1
Array Size : 2930135488 (2794.40 GiB 3000.46 GB)
Used Dev Size : 2930135488 (2794.40 GiB 3000.46 GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed Dec 23 09:37:53 2020
State : active, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Name : nas1:Mirror (local to host nas1)
UUID : 391fb756:204b7361:f368c71b:fb4eea09
Events : 940496
Number Major Minor RaidDevice State
2 8 32 0 active sync /dev/sdc
1 0 0 1 removed
Then if I run:
mdadm --detail /dev/md127
I get:
/dev/md127:
Version : 1.2
Creation Time : Thu Nov 10 19:41:47 2016
Raid Level : raid1
Array Size : 2930135488 (2794.40 GiB 3000.46 GB)
Used Dev Size : 2930135488 (2794.40 GiB 3000.46 GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed Dec 23 09:59:36 2020
State : active, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Name : nas1:Mirror (local to host nas1)
UUID : 391fb756:204b7361:f368c71b:fb4eea09
Events : 942450
Number Major Minor RaidDevice State
0 0 0 0 removed
3 8 16 1 active sync /dev/sdb
So they are nearly identical; the problem was how to join them back together.
If you want to see the details for one disk in the array you can run:
mdadm --examine /dev/sdc
This gives
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 391fb756:204b7361:f368c71b:fb4eea09
Name : nas1:Mirror (local to host nas1)
Creation Time : Thu Nov 10 19:41:47 2016
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
Array Size : 2930135488 (2794.40 GiB 3000.46 GB)
Used Dev Size : 5860270976 (2794.40 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : fbd7e6a5:7e687f1e:4bd1c7cb:42945d52
Internal Bitmap : 8 sectors from superblock
Update Time : Wed Dec 23 09:37:53 2020
Checksum : f4cb61d7 - correct
Events : 940496
Device Role : Active device 0
Array State : A. ('A' == active, '.' == missing)
And for the other disk run:
mdadm --examine /dev/sdb
Giving:
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 391fb756:204b7361:f368c71b:fb4eea09
Name : nas1:Mirror (local to host nas1)
Creation Time : Thu Nov 10 19:41:47 2016
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 2930135488 (2794.40 GiB 3000.46 GB)
Used Dev Size : 5860270976 (2794.40 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 665bdd07:200d588f:1a7bf968:f1b050ee
Internal Bitmap : 8 sectors from superblock
Update Time : Wed Dec 23 10:06:38 2020
Checksum : 8c53478d - correct
Events : 943424
Device Role : Active device 1
Array State : .A ('A' == active, '.' == missing)
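The Events counters above are the key to deciding which half to keep: the disk with the higher count (sdb, at 943424 vs 940496 for sdc) has the most recent writes, so the stale disk should be re-added to the array built around the newer one. A sketch of that comparison, with the values hard-coded from the output above:

```shell
# Events values taken from the `mdadm --examine` output above.
events_sdc=940496
events_sdb=943424
if [ "$events_sdb" -gt "$events_sdc" ]; then
    echo "/dev/sdb is newer: keep its array and re-add /dev/sdc to it"
else
    echo "/dev/sdc is newer: keep its array and re-add /dev/sdb to it"
fi
```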
The resolution
As with many of these things, the fix is a simple trick; it just takes a ton of Googling to find it.
The plan is to stop one of the two half-arrays and then re-add its disk to the one still running. A note on the device paths first: they take the form /dev/md126, not /devmd126; if you mistype the path you will get:
mdadm: error opening /devmd126: No such file or directory
So, to stop one of the arrays, run:
mdadm --stop /dev/md127
Now, the problem is that because I had set this up as a share, one of the current mirrors, md126 or md127, will be serving the SMB share, so you may get back:
mdadm: Cannot get exclusive access to /dev/md127:Perhaps a running process, mounted filesystem or active volume group?
If this happens, try stopping the other mirror instead:
mdadm --stop /dev/md126
Which gives:
mdadm: stopped /dev/md126
So now the mirror md127 is still running, and you will recall that earlier you ran "mdadm --detail /dev/md127", which included the line:
3 8 16 1 active sync /dev/sdb
So you need to re-add the OTHER disk, which is sdc. Run the following:
mdadm --manage /dev/md127 --re-add /dev/sdc
Which returns:
mdadm: re-added /dev/sdc
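After the re-add, md resyncs the stale disk in the background; you can follow the progress in /proc/mdstat (for example with `watch cat /proc/mdstat`). As a sketch, here is how to pull the percentage out of a recovery line; the sample line is made up, but it follows the usual mdstat format.

```shell
# On the real box: mdstat_line=$(grep -o 'recovery.*' /proc/mdstat)
mdstat_line='[==>..................]  recovery = 12.6% (369965248/2930135488) finish=214.5min speed=198808K/sec'
pct=$(printf '%s\n' "$mdstat_line" | grep -o '[0-9.]*%' | head -n1)
echo "resync progress: $pct"
```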
Now in OMV you see the RAID is fixed:
At this point it's fixed. See, it was simple, once you got to the bottom of it.
Testing I did
I was quite concerned about whether the merged array was really working. Once it was up, I went to one of the shares, deleted some files, and added one large video file.
Then I took one drive out, restarted OMV, and checked that the files were as expected and that I could play the video.
Then I swapped: I took out the other drive and connected the one I had just removed. Again I checked the share and confirmed I could play the video.
So for myself, I believe it is working OK.
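The manual spot check above can be made more systematic with checksums. A sketch, demonstrated on a scratch directory so it is runnable anywhere; on the NAS you would point DIR at the share's mount point (the exact OMV path is something you would need to check on your own system).

```shell
# Record checksums of every file while booted from the first disk...
DIR=$(mktemp -d)
echo "test video data" > "$DIR/video.mkv"   # stand-in for the share contents
( cd "$DIR" && find . -type f -exec md5sum {} + | sort ) > /tmp/mirror-check.md5
# ...then, after swapping to the other disk and rebooting, verify against it:
( cd "$DIR" && md5sum -c /tmp/mirror-check.md5 )
```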
What are events?
When you do a "--detail" or "--examine" you will see a line that looks like:
Events : 944167
I assumed that this number should constantly increase whenever I changed the file system. I now don't think that is the case. Although I am not completely sure, I think it relates to "events" on the array, as in things that have happened to the array; I don't believe that the number staying the same, or going up, means anything is broken.
It is also very important to note that the listed "events" value is not real time, or does not appear to be. Give it a few minutes and repeat the command, and you will often see the number change.
If there have been no file changes, I have found that the numbers reported for each disk stabilize at the same value.
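To compare the counters without eyeballing the full output, you can pull just the Events value out of the detail output. Shown here against a sample line rather than a live array:

```shell
# On the real box: mdadm --detail /dev/md127 | awk '/Events/ {print $3}'
detail_output='          Events : 944167'
events=$(printf '%s\n' "$detail_output" | awk '/Events/ {print $3}')
echo "$events"
```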
Spares Missing Event emails
After fixing the array you may find you get emails from OMV with a subject similar to:
SparesMissing event on /dev/md/Mirror:nas1 [nas1.cantabrian]
If you do, follow the instructions in this post: Spares missing event emails after RAID changes