MKV header & raw file recovery
Posted: Mon Mar 28, 2016 1:35 am
While working on two hard drives containing many movies in MKV format, I discovered an issue (probably easy to fix) in R-Studio's raw file detection for that format.
The first HDD has an intact file system; the files were simply deleted, and R-Studio (v7.7) finds them with their original names and attributes. However, some of the MKVs (but not all of them) also appear as "Extra found files" (with a "link" symbol).
The second HDD has a severely corrupted file system (about 25GB have been filled with 0's), so no partition structure or file tree can be identified. Surprisingly, R-Studio only manages to find three MKV files there, all three truncated at a few KB, whereas Photorec finds many full-length, perfectly readable MKV files. Examining those files with WinHex shows that there are two different headers, and R-Studio only detects one.
[1] 1A 45 DF A3 93 42 82 88 6D 61 74 72 6F 73 6B 61 > detected by R-Studio
[2] 1A 45 DF A3 A3 42 86 81 01 42 F7 81 01 42 F2 81 04 42 F3 81 08 42 82 88 6D 61 74 72 6F 73 6B 61 > not detected by R-Studio
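Both byte sequences above are EBML headers that differ only in which optional elements come before the DocType element (ID 42 82). Here is a minimal Python sketch, assuming the header fits in the buffer passed in, that walks the header elements instead of matching a fixed byte pattern. The function names are made up for illustration, and this is not how R-Studio or WinHex work internally.

```python
EBML_MAGIC = bytes.fromhex("1A45DFA3")   # EBML header element ID
DOCTYPE_ID = bytes.fromhex("4282")       # DocType element ID

# The two headers quoted in the post.
HEADER_1 = bytes.fromhex("1A45DFA3934282886D6174726F736B61")
HEADER_2 = bytes.fromhex(
    "1A45DFA3A3428681014" "2F7810142F2810442F38108"
    "4282886D6174726F736B61"
)


def vint_length(first_byte):
    """Byte count of an EBML variable-length integer, from its first byte."""
    for length in range(1, 9):
        if first_byte & (0x80 >> (length - 1)):
            return length
    raise ValueError("invalid EBML variable-length integer")


def read_size(buf, pos):
    """Decode an element size at pos; return (size, position after it)."""
    length = vint_length(buf[pos])
    value = buf[pos] & (0xFF >> length)   # drop the length marker bit
    for b in buf[pos + 1:pos + length]:
        value = (value << 8) | b
    return value, pos + length


def doc_type(buf):
    """Return the DocType string of an EBML header, or None if not found."""
    if not buf.startswith(EBML_MAGIC):
        return None
    try:
        size, pos = read_size(buf, 4)
        end = min(pos + size, len(buf))
        while pos < end:
            id_len = vint_length(buf[pos])
            elem_id = buf[pos:pos + id_len]
            elem_size, pos = read_size(buf, pos + id_len)
            if elem_id == DOCTYPE_ID:
                return buf[pos:pos + elem_size].decode("ascii", "replace")
            pos += elem_size              # skip elements we do not care about
    except (IndexError, ValueError):
        return None
    return None


print(doc_type(HEADER_1), doc_type(HEADER_2))
```

Because the sketch skips over whatever elements precede DocType, it does not depend on the exact layout of either header variant.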
WinHex has a file carving function and recognizes MKV files with both headers by default. The "File Type Signatures Search.txt" file in the WinHex directory does indeed contain definitions for both headers:
Matroska mkv;mka (matroska|\x01\x42\xF7\x81\x01\x42\xF2\x81) 8 10485760
(Which means: at offset 8 there can be either "matroska" = type [1] or "01 42 F7 81 01 42 F2 81" = type [2], and the default length will be 10485760 bytes, i.e. 10 MiB.)
And indeed, if I examine the files from the first HDD, those which appear in "Extra found files" all have header type [1], while those which do not appear there all have header type [2].
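To double-check my reading of the WinHex rule, here is a small Python sketch that applies the same alternation at offset 8 to the two headers quoted earlier. It only mimics the rule as I understand it from the signature file (the name `matches_rule` is made up) and is not WinHex's actual matching code.

```python
import re

# Alternation copied from the WinHex signature line; 8 is its offset field.
PATTERN = re.compile(rb"(matroska|\x01\x42\xF7\x81\x01\x42\xF2\x81)")
OFFSET = 8

HEADER_1 = bytes.fromhex("1A45DFA3934282886D6174726F736B61")
HEADER_2 = bytes.fromhex(
    "1A45DFA3A3428681014" "2F7810142F2810442F38108"
    "4282886D6174726F736B61"
)


def matches_rule(buf):
    """True if either alternative of the pattern starts exactly at OFFSET."""
    return PATTERN.match(buf, OFFSET) is not None


for name, header in (("type [1]", HEADER_1), ("type [2]", HEADER_2)):
    print(name, matches_rule(header))
```

Note that the second alternative is tied to the specific element values found in type [2] headers, so a file whose header elements are ordered or valued differently would presumably still be missed by such a rule.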
The other issue is file length. R-Studio apparently cuts MKV files after detecting a certain number of "00" bytes. Yet many files, including these MKVs, can contain long runs of null bytes at any point (here, right after the header), so this is not a reliable way to determine where a file ends. The default behaviour should be: assume the file continues until another file start is detected. I tried to create custom settings for MKV so that R-Studio would detect all those files, but so far without success.
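To illustrate the behaviour I have in mind, here is a rough Python sketch of "the file continues until the next file start", assuming the files are stored contiguously and the image fits in memory. The helper names are hypothetical, the start check is a crude heuristic, and only MKV starts are considered here (a real carver would also stop at the start of any other known file type).

```python
EBML_MAGIC = bytes.fromhex("1A45DFA3")


def looks_like_mkv_start(image, pos):
    """Heuristic: EBML magic followed closely by the 'matroska' DocType."""
    return b"matroska" in image[pos:pos + 64]


def carve_ranges(image):
    """Return (start, end) pairs; each file ends where the next one starts."""
    starts = []
    pos = image.find(EBML_MAGIC)
    while pos != -1:
        if looks_like_mkv_start(image, pos):
            starts.append(pos)
        pos = image.find(EBML_MAGIC, pos + 1)
    ends = starts[1:] + [len(image)]      # last file runs to end of image
    return list(zip(starts, ends))


# Tiny synthetic "disk": two MKV-like files, each with a long run of
# null bytes right after the header, as in the case described above.
header_1 = bytes.fromhex("1A45DFA3934282886D6174726F736B61")
header_2 = bytes.fromhex(
    "1A45DFA3A3428681014" "2F7810142F2810442F38108"
    "4282886D6174726F736B61"
)
image = header_1 + b"\x00" * 1000 + b"\xAA" * 100 + header_2 + b"\x00" * 500

ranges = carve_ranges(image)
print(ranges)
```

With this approach the null bytes after each header do not end the file; the last file simply runs to the end of the scanned area and may carry trailing garbage, which seems an acceptable trade-off for video files.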
(Besides, R-Studio can't preview MKV files as it can other video formats; in such cases it would be nice to at least have the option of using an external program, like VLC Media Player.)
So, could this issue be fixed in a future update? And how could I create a customized MKV definition that recognizes both types of headers and finds the correct file length, in such a simple case where the files are mostly stored one right after the other on the HDD?