Linux Raid 5 recovery
Hi.
Have a Synology Diskstation 1010+. The data RAID volume got corrupted and may have been reformatted. RAID5 + hot spare, 5x2TB disks. Using R-Studio for Linux (Ubuntu).
Read a lot about RAID recovery here and elsewhere, but only found instructions on how to manually figure out disk order/offset using NTFS, not EXT3, which is what I have.
The Synology partitions each drive into 3 partitions. The last partition on each disk, starting around 4.5 GB in, is the one that holds the RAID5 data.
I used several programs to try to recover the disk order/offset, but have gotten quite confused. One program says the disk order is drives 3102, with an 8-sector offset. Another identical 1010+ unit I have (which is working) shows (through mdadm) that the order on that unit for the RAID5 volume is drives 0123, with a 264 offset. On that working unit, I can mount its RAID5 volume in Ubuntu with no problem.
To add to the confusion, I thought that recovery would only work if I created a virtual block RAID in R-Studio with the correct order for the 4 disks (leaving out the hot spare) and the correct offset. However, I have been able to partially recover some very large video files by constructing the virtual RAID in either 1234 or 4213 disk order, which doesn't make sense to me. The recovered files have "glitches" every 30 seconds or so, but I would have thought that a recovered AVI wouldn't work at all if the disk order and/or offset weren't totally correct. I have been working on this on and off for several weeks and cannot figure out what the correct parameters should be.
Surely there must be a way I can look at the individual drives with a hex search to try to manually figure out the correct order/offset for that particular RAID volume (like the instructions here on NTFS RAIDs)? I have googled a lot but found no instructions for similar examination of ext2/3/4 component drives to find this out.
Many thanks for any input,
Jose
Re: Linux Raid 5 recovery
As this RAID is a software one, its member devices contain metadata, including the RAID creation date, so you can find out when the RAID was created. See man mdadm and man dmraid for more information on them.
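A minimal sketch of that check, assuming the member disks show up as /dev/sda3 through /dev/sde3 on your Ubuntu machine (adjust the device names to match yours; --examine only reads, it does not write):

# print each member's md superblock: Creation Time, Raid Level,
# chunk size, device order and, for 1.x metadata, the Data Offset
for d in /dev/sd[abcde]3; do
    echo "== $d =="
    mdadm --examine "$d"
done

If --examine finds no superblocks, they may have been overwritten by the reformat, and the log files described below become the fallback.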
Generally, such RAID systems have several partitions on each disk: the last one holds the data RAID (the RAID5 here), while the first ones are mirrors. The first partitions may contain the /var/log directory with various log files (dmesg, syslog, and their backups). These contain logs of previous successful RAID assemblies, which include the disk order and algorithm. When such logs are found, it's possible to find where the RAID starts. It's also important to use the correct version of metadata, possibly specifying it on the command line. Such metadata is stored at the beginning of the partition for current versions of mdadm, and at the end for version 0.90 (the --metadata=0.90 option).
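A hedged sketch of that log hunt (assuming the first partitions are RAID1 members with 0.90 metadata at their end, so one copy can usually be mounted read-only as a plain filesystem; the mount point and log paths are illustrative):

mkdir -p /mnt/syno
mount -o ro /dev/sda1 /mnt/syno          # one half of the system mirror
grep -h "raid level\|disk [0-9], o:" /mnt/syno/var/log/messages* | less
umount /mnt/syno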
If this doesn't help, it's possible to use specific data structures whose record numbers increase monotonically, like NTFS file records. They can be used to determine the disk order and RAID parameters.
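For example, a rough sketch of such a manual hex comparison (the 264-sector offset and 64 KB chunk here are assumptions carried over from your working unit; permute them as needed):

# dump the first candidate 64 KB chunk (128 sectors) past the data offset
for d in /dev/sd[abcd]3; do
    echo "== $d =="
    dd if="$d" bs=512 skip=264 count=128 2>/dev/null | od -A d -t x4 | head -8
done

Members whose dumped structures carry record numbers that continue from one disk to the next reveal the order; the disk holding parity for that stripe will look like noise by comparison.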
Re: Linux Raid 5 recovery
OK, so I found some old logs on the first partitions of the drives (which are not part of the RAID5).
Looking at message logs from last year, I found this section, which seems to show how the RAID was configured. It seems to show the disk order (sda3, sdb3, etc.) at the bottom, but I don't see any clues as to offset, block size, etc. Is there anything here that might be helpful that I am missing?
Many thanks,
José
*************************************************************
Jul 28 17:31:43 kernel: [ 12.115576] VFS: Mounted root (ext4 filesystem) readonly on device 9:0.
Jul 28 17:31:43 kernel: [ 12.837837] Brand: Synology
Jul 28 17:31:43 kernel: [ 12.840778] Model: DS-1010+
Jul 28 17:31:44 kernel: [ 15.227989] md: md2: set sda3 to auto_remap [0]
Jul 28 17:31:44 kernel: [ 15.232634] md: md2: set sde3 to auto_remap [0]
Jul 28 17:31:44 kernel: [ 15.237394] md: md2: set sdd3 to auto_remap [0]
Jul 28 17:31:44 kernel: [ 15.242110] md: md2: set sdc3 to auto_remap [0]
Jul 28 17:31:44 kernel: [ 15.246770] md: md2: set sdb3 to auto_remap [0]
Jul 28 17:31:44 kernel: [ 15.279732] 0: w=1 pa=0 pr=4 m=1 a=2 r=4 op1=0 op2=0
Jul 28 17:31:44 kernel: [ 15.284919] 3: w=2 pa=0 pr=4 m=1 a=2 r=4 op1=0 op2=0
Jul 28 17:31:44 kernel: [ 15.290153] 2: w=3 pa=0 pr=4 m=1 a=2 r=4 op1=0 op2=0
Jul 28 17:31:44 kernel: [ 15.295333] 1: w=4 pa=0 pr=4 m=1 a=2 r=4 op1=0 op2=0
Jul 28 17:31:44 kernel: [ 15.300519] raid5: raid level 5 set md2 active with 4 out of 4 devices, algorithm 2
Jul 28 17:31:44 kernel: [ 15.308496] RAID5 conf printout:
Jul 28 17:31:44 kernel: [ 15.311894] --- rd:4 wd:4
Jul 28 17:31:44 kernel: [ 15.314713] disk 0, o:1, dev:sda3
Jul 28 17:31:44 kernel: [ 15.318338] disk 1, o:1, dev:sdb3
Jul 28 17:31:44 kernel: [ 15.321845] disk 2, o:1, dev:sdc3
Jul 28 17:31:44 kernel: [ 15.325416] disk 3, o:1, dev:sdd3
Re: Linux Raid 5 recovery
algorithm=2 shows that the RAID is left-symmetric.
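For reference, here is how left-symmetric (the Linux md default, algorithm 2) lays out data chunks D0, D1, ... and parity P across 4 disks; parity rotates backwards and data continues past the parity disk:

            disk0   disk1   disk2   disk3
stripe 0:    D0      D1      D2      P
stripe 1:    D4      D5      P       D3
stripe 2:    D8      P       D6      D7
stripe 3:    P       D9      D10     D11

A sequential file therefore touches every disk in turn, which is why a wrong disk order or chunk size tends to show up as periodic bands of misplaced data in recovered media files.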
But what do you mean by saying "may have been reformatted"? Do you mean reassembling the RAID, or just formatting the RAID?
Re: Linux Raid 5 recovery
I mean that the Synology may have reformatted the RAID5 (data) partition. I can find 3 volumes in there with R-Studio, and the one that was lost was created with a previous version of the firmware, so it has a different "number" on it (please see the recovered information in the enclosed .png files).
I have used several different sector offsets, and the one that appears to give me the most "found files", with some of the previous tree structure, is 264. It is also the only setting where I get both a "direct volume" and a black "recognized" entry on the list of discovered volumes/partitions (please see the enclosed screenshot files). However, I am still having the problem of glitches every 30 seconds or so when playing back the media files. A lot of the folder structure is not being recovered, although there appear to be several "inodes" and "superblocks" listed in R-Studio's recovered list.
Also, when trying to recover certain files, R-Studio gives me several errors stating that the "file fragment is outside the size of the disc/volume/partition", or something similar. This RAID5 volume is the last one on the disks (partition 3 on each disk, starting at around 4.5 GB into each disk). I assume R-Studio would know how to configure this software RAID volume without any other settings tweaks, correct?
I am about to give up on this recovery effort as it does appear the files are damaged. If I can see parts of the files, I assume my sector offset is correct (although I was able to see some files with other sector offsets). Given the above, if you or anyone else has any ideas of something else I should try, please do let me know. I would also appreciate if you could let me know if you see something obviously wrong on the screenshots.
Many thanks,
José
P.S.: Unable to upload files; I get a "Could not upload attachment..." error. The files are only 220 KB each... Let me know if you would like to see them and where to send them. Thanks!
Re: Linux Raid 5 recovery
Yes, just send them to my mail: raptorbck (at) gmail (dot) com .
Re: Linux Raid 5 recovery
I'd find a relatively large jpg file, preview it, and try to find the correct block size, judging by the stripe structure of the picture.
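If you want to test candidate geometries quickly, here is a minimal sketch that rebuilds just the first megabyte of the virtual volume for one guess (the disk order, 64 KB chunk, 264-sector offset and left-symmetric layout are all candidate values from this thread, to be permuted; run it against read-only images):

CHUNK=65536; OFFSET=264
MEMBERS=(/dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3)   # candidate disk order
N=${#MEMBERS[@]}
rm -f window.bin
for v in $(seq 0 15); do                    # 16 data chunks = 1 MB
    stripe=$(( v / (N - 1) ))               # stripe row of data chunk v
    pos=$(( v % (N - 1) ))                  # position within that row
    parity=$(( (N - 1) - stripe % N ))      # left-*: parity rotates backwards
    disk=$(( (parity + 1 + pos) % N ))      # left-symmetric: data follows parity
    dd if="${MEMBERS[$disk]}" bs=512 count=$(( CHUNK / 512 )) \
       skip=$(( OFFSET + stripe * (CHUNK / 512) )) >> window.bin 2>/dev/null
done

Then preview a JPEG found near the start of window.bin: a wrong chunk size breaks the image into bands of the wrong width, while a wrong order gives bands that look intact but swapped.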
Re: Linux Raid 5 recovery
OK, so we are about ready to throw in the towel...
As far as we can tell, the damaged RAID5 volume has the following attributes:
4 drives, using Partition 3 on the 4 drives (there is a hot spare, but that shouldn't matter).
Linux software raid inside a Synology 1010+.
Total volume size approx. 5.4 TB
Left Synchronous
64K block
264 sector offset (see the mdadm sketch below)
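In mdadm terms, these attributes would translate to something like the following sketch (clones/images of the disks only, never the originals, since --create writes new superblocks; the 1.2 metadata version is a guess consistent with the non-zero data offset, and --data-offset needs a reasonably recent mdadm, with 132 KiB = 264 sectors):

# reassemble the candidate geometry without triggering a resync
mdadm --create /dev/md9 --assume-clean --level=5 \
      --layout=left-symmetric --chunk=64 --raid-devices=4 \
      --metadata=1.2 --data-offset=132 \
      /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3   # candidate order
mount -o ro /dev/md9 /mnt/test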
I found several jpegs using the settings above, but they were fairly small. The biggest one was about 622 KB. Several folders previously in that volume did not show up even after a complete scan, so we can't find any larger jpegs.
This particular jpeg file opens up perfectly after recovery. Several other, smaller jpegs do not open correctly, although 3 or 4 do. I would imagine that a 622 KB jpeg would be big enough to ascertain whether the stripe size/disk order/offset is correct, given our block settings, but I can't be sure. (A 264-sector offset is fairly large, and I wonder if that might not be correct; I also don't have enough experience to know whether an incorrect sector offset might cause these symptoms, or if only the block size matters when viewing these recovered jpegs.)
However, we do believe that these are pretty much the correct settings for the damaged RAID5 volume. We tried several block sizes, disk orders, and offsets. None gave us a "direct volume" like the settings above did, and only these settings yielded a large number of recovered files (approx. 350 GB on the direct volume, more after a scan), even though several of the media files we were able to recover have glitches in them. I think that the accidental reformat of the RAID5 volume by the Synology OS probably messed up some or all of the stripes on at least one RAID disk component. Even when the jpegs opened only partially, the stripe order looked correct.
So, I think it's probably time to give up. After several weeks of partial success, I can only imagine that the entire volume got corrupted by the Synology's "parity check" routine. We also checked leaving one disk out (adding a "missing disk") and letting the program recover with the parity information, but the results were the same.
Any last thoughts welcome!
Many thanks and all best,
Cleiber
Re: Linux Raid 5 recovery
My last thought: did you check the "Extra Found Files" section in the Drive pane? This is where R-Studio puts files found by the scan for Known File Types (i.e., file signatures). Probably, this is the last resort.
Re: Linux Raid 5 recovery
Every time we ran a full scan, the "Extra Found Files" feature was on.
Do you think a 622 KB file should have been big enough to determine whether the disk order/block size/offset was correct? With 4 disks and 64K blocks, I think a stripe would be either 64K or 256K, correct? (I read about this, but right now I forget how it is set up.)
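(Trying to work it out myself, hedged by my shaky memory of the layout: with 4 disks and 64 KB chunks, one full stripe row should be 4 x 64 KB = 256 KB on disk, of which 3 x 64 KB = 192 KB is data, since one chunk per row is parity. So a 622 KB file would span about 622 / 192, roughly 3.2 stripe rows, and touch every disk at least three times.)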
Thanks!