Recovering large (3tb) image file on Linux and getting "There is not enough space on the disk".

slaapliedje · Post by **slaapliedje** » Sat Sep 19, 2020 1:12 pm

Hello,

I had previously made an image of a 3tb disk I messed up at some point, and was attempting to figure out how to get files off of it, when the inevitable happened and I messed up the GPT on the disk I'd copied it to (it's been a bad few months for disks for me). I managed to recover everything else off the drive except this one file. It says the recovery chance is good, but whenever I make enough space on another drive, it simply says there is not enough space. So I went overboard and bought an 8tb drive... and it still gives me the same error.

It was formatted NTFS (which has a rather large maximum file size), and the file was originally on an NTFS partition, so I know it should have been fine. But then I formatted the new drive to Ext4, and still get the same error.

I see a few other mentions of a bug with this particular issue. I'm running 4.9.3919 on Linux.
Thanks

abolibibelot · Post by **abolibibelot** » Sat Sep 19, 2020 2:49 pm

Not sure... From what I read, the allocation information in the MFT record for files larger than 4GB generally gets wiped when the file is deleted, but from what I understand about this particular case, that image file was not deleted, it's rather an issue preventing regular access to the partition. If the image file is in one fragment, and you know where it starts (that information should still be there in the MFT), you could create a new image (with dd / ddrescue, or a hexadecimal editor) starting from that offset and with a size corresponding to the exact size of the file, if it appears somewhere. Or maybe there's a chance to fix the issue preventing regular access, but since you have plenty of space on that new 8TB drive, and since you seem to be in the middle of a bad luck streak, it would be wise to make a backup by whatever crude method that works, before attempting any in-place fix.
By v. 4.9.3919, you mean R-Studio or R-Undelete ? For R-Studio it would be a very old version. I haven't much experience with R-Undelete. Does it also display the list of sectors somewhere ? (In R-Studio : right-click on a file, “View/Edit”, then on the bottom left area click on the “Sectors” tab on the far left, beside “Properties” ; it's often hidden if the window is too small. If the file is contiguous, i.e. in 1 fragment, there should be no discontinuity in the “Parent sector” column ; if a file is fragmented, at some points it jumps from a range of incremental values to a completely different value.)
Recuva on Windows makes this quite straightforward, saying something like : “2292417 cluster(s) allocated at offset 325942721” for a 9 389 736 093 bytes MKV file, which means that, if push came to shove, it could be flawlessly extracted without relying on the allocation information (but Recuva probably wouldn't work if the partition is no longer accessible because of that GPT issue).
Another option is to directly examine the corresponding MFT record. I did that once, it's a bit tricky but interesting ; for a non-fragmented file there's only one field, like, for the aforementioned example file : 43 C1 FA 22 C1 7D 6D 13. The first byte (43) is a header : its low digit (3) means that the number of allocated clusters (which comes first) is coded in 3 bytes, and its high digit (4) means that the location of that cluster run (which comes second) is coded in 4 bytes ; since it's in “little endian” it's read backwards, so it translates as : 22 FA C1 = 2292417 clusters are allocated to that file starting from cluster n° 13 6D 7D C1 = 325942721 ; with a common cluster size of 4096 bytes, it translates as 9389740032 bytes starting from offset 1335061385216. If there are a few fragments though not an overwhelming number, say less than 10, it's still manageable to write a dd or ddrescue script by decoding the cluster runs manually.
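For what it's worth, the decoding of that single field can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above : the name `decode_run` is made up for the example, and the 4096-byte cluster size is the same assumption as in the text.

```python
CLUSTER_SIZE = 4096  # assumed cluster size, as in the text above


def decode_run(raw: bytes):
    """Decode one cluster run: header byte, then count, then location (little endian)."""
    header = raw[0]
    count_bytes = header & 0x0F   # low digit: bytes used for the cluster count
    offset_bytes = header >> 4    # high digit: bytes used for the location
    count = int.from_bytes(raw[1:1 + count_bytes], "little")
    # The location field is signed; this only matters for later runs of a fragmented file.
    end = 1 + count_bytes + offset_bytes
    offset = int.from_bytes(raw[1 + count_bytes:end], "little", signed=True)
    return count, offset


count, start = decode_run(bytes.fromhex("43 C1 FA 22 C1 7D 6D 13"))
print(count, start)                                # 2292417 325942721
print(count * CLUSTER_SIZE, start * CLUSTER_SIZE)  # 9389740032 1335061385216
```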
(I once successfully pieced together a bunch of files which were each in thousands of fragments, from a partial image from which they couldn't be recovered by R-Studio or WinHex or anything because the MFT was incomplete, and the HDD was kaputt by then, but I already had a complete list of their clusters obtained through several means before it went south.)
Sometimes when bad luck strikes and none of the tools you rely on is willing to help and the ones that would are unavailable, you got to adapt and improvise ! è_é

slaapliedje · Post by **slaapliedje** » Sat Sep 19, 2020 9:51 pm

I'm running R-Studio, according to https://www.r-studio.com/data_recovery_ ... load.shtml version 4.9 build 3919 was released on Apr 06 2020.

Doing some quick math for the sectors vs parent sector; Sector starts with 0, and ends with 5860357382 and parent sector starts with 1465130624 and ends with 7325977598.

So if my logic is correct, take the number from the last Parent Sector, subtract the last Sector number, and it should equal the starting Parent sector (if they're contiguous). Unfortunately it's close, but not quite. It equals 1,465,620,216. This would mean that in that huge amount of sectors, it's off by 489,592...
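The same check can be redone in a few lines of Python, as a rough sketch. The sector numbers are the ones quoted above ; the 8 sectors per cluster figure is an assumption (512-byte sectors, 4096-byte clusters).

```python
SECTORS_PER_CLUSTER = 8  # assumption: 512-byte sectors, 4096-byte clusters

first_sector, last_sector = 0, 5860357382
first_parent, last_parent = 1465130624, 7325977598

# For a contiguous file, (parent sector - sector) is the same on every row.
shift_at_start = first_parent - first_sector
shift_at_end = last_parent - last_sector
gap_sectors = shift_at_end - shift_at_start

print(shift_at_end)                               # 1465620216
print(gap_sectors)                                # 489592
print(divmod(gap_sectors, SECTORS_PER_CLUSTER))   # (61199, 0)
```

The gap comes out as a whole number of clusters, which is at least consistent with a file split into a few fragments rather than with a miscount.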

I decided as a test to try to DD an empty 3TB file onto the new external drive to see if it complains at all about saving 3TB to it. It's still chugging away though...

abolibibelot · Post by **abolibibelot** » Sun Sep 20, 2020 10:45 am

If it's off by such a small amount it should still be manageable with such a method. Could you directly examine the MFT record ? With R-Studio, in “Metafiles”, right-click on “$MFT”, then “View/Edit”, then in the new window, “Edit” => “Find”, type the name of the image file in “Unicode” and run the search. (Or if your version has the “Get info” option you can directly see the MFT number of a selected file, then open the MFT in the hex viewer and type this value in KB.) An MFT record always has a size of 1024 bytes ; my knowledge of its structure is still fuzzy, but the cluster run(s) should be located near the end, before “ÿÿÿÿ,yG” which marks the end of the used area. If I search a file which I know is in two fragments, I see this :
43 77 C0 00 4D 96 D6 19 43 74 C2 00 E2 96 0F F2
Well, this is a tricky example because the second starting offset is negative, I'm still not quite sure how to spot and translate negative hexadecimal values... Another one which is more simple :
43 | 80 BA 00 | 81 82 E3 06 | 32 | F2 6E | D9 D5 00
This translates as : BA 80 = 47744 clusters allocated starting from cluster 06 E3 82 81 = 115573377, then 6E F2 = 28402 clusters allocated starting from cluster 115573377 + 54745 (00 D5 D9) = 115628122 (each cluster run is relative to the one before {*}). As 1 cluster (4096 bytes) = 8 sectors, I can then verify in the “Sectors” column for that file that it starts at “parent” sector 115573377 x 8 = 924587016, then that at sector 47744 x 8 = 381952 the “parent” sector value jumps from 924587016 + 381951 = 924968967 to 115628122 x 8 = 925024976. YEAH !
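The check against the “Sectors” column can be sketched like this in Python. It takes the two runs as already decoded (cluster count, relative location) pairs ; `parent_sector_ranges` is a name made up for the example, and 8 sectors per cluster is the same assumption as above.

```python
SECTORS_PER_CLUSTER = 8  # 4096-byte clusters, 512-byte sectors (assumed)

# (cluster count, location relative to the start of the previous run)
runs = [(47744, 115573377), (28402, 54745)]


def parent_sector_ranges(runs):
    """Return (first, last) "parent" sector for each run."""
    ranges, cluster = [], 0
    for count, relative in runs:
        cluster += relative  # each location is relative to the previous run
        first = cluster * SECTORS_PER_CLUSTER
        ranges.append((first, first + count * SECTORS_PER_CLUSTER - 1))
    return ranges


for first, last in parent_sector_ranges(runs):
    print(first, last)
# 924587016 924968967
# 925024976 925252191
```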

{*} Which was probably designed to save some bytes, but can wreak havoc if there's only one value corrupted in the case of a highly fragmented file ; I lost my Inbox e-mail database that way once, as mentioned in the thread linked earlier -- a CHKDSK “repair” had completely wiped it to 0 bytes, since it couldn't locate all its cluster runs ; luckily I had a recent backup, otherwise it would have been impossible to piece it back together as it was in hundreds of fragments, some of which had already been overwritten when I noticed the issue.

slaapliedje · Post by **slaapliedje** » Sun Sep 20, 2020 11:52 am

Ouch on wiping the inbox file, but then that's why I use IMAP

Hmm, Get Info said the MFT number was 64, I didn't see where to put that in, but was able to find the name under Unicode in $MFTReconstructed. This is definitely a bug though (saying there isn't enough disk space), as this worked fine:

Code: Select all

sudo dd if=/dev/zero of=disk1.raw bs=1M count=3145728
[sudo] password for jfergus: 
3145728+0 records in
3145728+0 records out
3298534883328 bytes (3.3 TB, 3.0 TiB) copied, 20716.9 s, 159 MB/s

I guess the real question is, is it a bug in the Linux version? Which sounds like maybe it's based on an older code base? I mean it looks like the Windows version is at 8.14 build 179623. Kind of seems to me the software isn't a 'one license fits all' sort of thing, so buying the Linux version doesn't translate into having access to the Windows one. Otherwise I'd try that too.

abolibibelot · Post by **abolibibelot** » Sun Sep 20, 2020 1:17 pm

slaapliedje wrote: ↑
Sun Sep 20, 2020 11:52 am
I guess the real question is, is it a bug in the Linux version? Which sounds like maybe it's based on an older code base? I mean it looks like the Windows version is at 8.14 build 179623. Kind of seems to me the software isn't a 'one license fits all' sort of thing, so buying the Linux version doesn't translate into having access to the Windows one. Otherwise I'd try that too.

No idea about that, let's hope someone in-the-know chimes in.
But indeed this looks to be the relevant MFT record, so the allocation information should be found at offset 40(hex) of the field “80” ($Data -- see here for an in-depth description) down there, so :

Code: Select all

41 | 06 | D0 83 EA 0A | 34 | 66 DC BE 20 | 14 AF 00 | 44 | 0A 38 EB 0A | 67 1C BF 20

Which, if I'm not mistaken (and if I'm reading the characters correctly -- hard to distinguish a “0” from a “8” or a “B” on this small screenshot), translates as :
– 6 clusters allocated starting from cluster 183141328
– 549379174 clusters allocated starting from cluster 183141328 + 44820 = 183186148
– 183187466 clusters allocated starting from cluster 183186148 + 549395559 = 732581707
Total number of clusters = 732566646 = 3000592982016 bytes, which looks right. (Corrected : I first wrote 732583031 clusters and 3000660094976 bytes.)
I have little experience with dd, but with ddrescue this should do the trick :

Code: Select all

ddrescue /dev/sdb1 /media/sdc1/sdf_copy.img /media/sdc1/sdf_copy.log -i 750146879488 -s 24576 -o 0
ddrescue /dev/sdb1 /media/sdc1/sdf_copy.img /media/sdc1/sdf_copy.log -i 750330462208 -s 2250257096704 -o 24576
ddrescue /dev/sdb1 /media/sdc1/sdf_copy.img /media/sdc1/sdf_copy.log -i 3000654671872 -s 750335860736 -o 2250257121280
-i = input offset
-s = size
-o = output offset
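The three sets of -i / -s / -o values can be recomputed rather than typed by hand. A rough sketch in Python, assuming 4096-byte clusters and the three cluster runs as decoded above ; `copy_plan` is a made-up helper name, and the script only prints numbers, it does not run ddrescue.

```python
CLUSTER_SIZE = 4096  # assumed cluster size

# (absolute starting cluster, cluster count) for the three runs decoded above
runs = [(183141328, 6), (183186148, 549379174), (732581707, 183187466)]


def copy_plan(runs, cluster_size=CLUSTER_SIZE):
    """Yield (input offset, size, output offset) in bytes for each run."""
    written = 0
    for start, count in runs:
        size = count * cluster_size
        yield start * cluster_size, size, written
        written += size  # the next fragment is appended right after this one


for i, s, o in copy_plan(runs):
    print(f"-i {i} -s {s} -o {o}")
# -i 750146879488 -s 24576 -o 0
# -i 750330462208 -s 2250257096704 -o 24576
# -i 3000654671872 -s 750335860736 -o 2250257121280
```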

Last sector should be (732581707 + 183187466) x 8 -1 = 7326153383 ; it's off by 175785 from the 7325977598 last sector value you noted, but I've noticed that for some reason the “Sectors” column could stop before the actual end for large files, so perhaps if you scroll down to the very end of the file you'll get the number I calculated. Either that, or it's a 0/8/B mistake, you'll have to check.

EDIT 1 : Well, actually the file size field says : 00 60 47 A1 BA 02, which is 3000592982016 (which is the exact capacity of my own 3TB HDDs), which makes both values inconsistent. But, after checking, it appears that the total number of clusters is 732566646, which gives the correct size -- I must have made a mistake when copy-pasting the previously obtained values, adding 549395559 instead of 549379174. (I corrected the wrong numbers above.) The ddrescue commands should be correct (24576 + 2250257096704 + 750335860736 = 3000592982016).
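That consistency check is quick to redo in Python, as a sketch : the size field is read as little endian, and the three run lengths are summed assuming 4096-byte clusters.

```python
CLUSTER_SIZE = 4096  # assumed cluster size

size_field = bytes.fromhex("00 60 47 A1 BA 02")   # bytes as they appear in the record
file_size = int.from_bytes(size_field, "little")

run_lengths = [6, 549379174, 183187466]           # clusters per run, from above
allocated = sum(run_lengths) * CLUSTER_SIZE

print(file_size)               # 3000592982016
print(sum(run_lengths))        # 732566646
print(allocated == file_size)  # True
```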

EDIT 2 : “Hmm, Get Info said the MFT number was 64, I didn't see where to put that in”
Since each MFT record is exactly 1KB, putting this value directly in the offset box set in KB gets to the corresponding MFT record. Otherwise, it would be 64 x 1024 = 65536 in bytes. I don't see an easier way to show a file's MFT record in R-Studio.
DMDE is convenient for this : right-click on a file, then “Open MFT file”, et voilà. And the MFT records are presented in a more easily readable form, with readily translated values. For instance, if I open the MFT record for a 419172179 bytes file which is in 4 fragments, then open the 80h $Data field (click on the “+”), I can read :

Code: Select all

allocated: 419233792
size:      419172179
initializ: 419172179
compress:  419233792
    0 run: 42h len: 30765 relc: 1B880BFEh :461900798
30765 run: 42h len: 30735 relc: F50609ABh :277747113
61500 run: 42h len: 30209 relc:  BF60DB5h :478421854
91709 run: 42h len: 10643 relc: F6173CDEh :312172604
102352 run: 00h
FFFFFFFFh End Mark

Which is consistent with the values in the MFT record :

Code: Select all

48 00 04 00 00 00 00 00 00 00 FD 18 00 00 00 00
53 0F FC 18 00 00 00 00 53 0F FC 18 00 00 00 00
00 00 FD 18 00 00 00 00 42 2D 78 FE 0B 88 1B 42
0F 78 AB 09 06 F5 42 01 76 B5 0D F6 0B 42 93 29
DE 3C 17 F6 00 F8 FF FF FF FF FF FF 82 79 47 11

18 FC 0F 53 = 419172179 (actual size)
18 FD 00 00 = 419233792 (allocated size, multiple of the cluster size)
Then the cluster runs :
42 | 2D 78 | FE 0B 88 1B => 30765 clusters at relative cluster +461900798
42 | 0F 78 | AB 09 06 F5 => 30735 clusters at relative cluster -184153685 = 277747113
42 | 01 76 | B5 0D F6 0B => 30209 clusters at relative cluster +200674741 = 478421854
42 | 93 29 | DE 3C 17 F6 => 10643 clusters at relative cluster -166249250 = 312172604

So apparently to calculate a negative hexadecimal value it goes like this :
F5 06 09 AB
Apparently if the first digit is higher than 7, then it's a negative number (I'll have to check that again). Then the value of the first byte (or last as it appears in “little endian”) has to be converted to binary, and the “1” on the outer left has to be removed ; then 128 is subtracted from the remaining value converted to decimal, and the (negative) result is multiplied by the corresponding power of 16 and added to the rest of the number converted to decimal. I still don't quite get it (it looks like what is called “two's complement”), but at least I get a correct result :
11 x 16^0 + 10 x 16^1 + 9 x 16^2 + 0 x 16^3 + 6 x 16^4 + 0 x 16^5 = 395691
F5(h) = 11110101(b) => 1110101(b) = 117(d) => 117 - 128 = -11 => -11 x 16^6 = -184549376
-184549376 + 395691 = -184153685
461900798 - 184153685 = 277747113
For the other one :
F6 17 3C DE
173CDE(h) = 1522910
F6(h) = 11110110(b) => 1110110(b) = 118(d) => 118 - 128 = -10 => -10 x 16^6 + 1522910 = -166249250
478421854 - 166249250 = 312172604
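As a cross-check, here is a small Python sketch comparing the manual recipe above with Python's built-in signed conversion (`int.from_bytes` with `signed=True`, which decodes two's complement). The function names are made up for the example, and the bytes are given in reading order (most significant first), as written above.

```python
def signed_manual(hex_digits: str) -> int:
    """Manual recipe: drop the top bit of the first byte, subtract 128, scale, add the rest."""
    raw = bytes.fromhex(hex_digits)
    top, rest = raw[0], raw[1:]
    if top >= 0x80:                    # first hex digit higher than 7
        top = (top & 0x7F) - 128       # e.g. F5h -> 117 - 128 = -11
    return top * 256 ** len(rest) + int.from_bytes(rest, "big")


def signed_builtin(hex_digits: str) -> int:
    """Same value through the standard two's complement conversion."""
    return int.from_bytes(bytes.fromhex(hex_digits), "big", signed=True)


for digits in ("F5 06 09 AB", "F6 17 3C DE", "0B F6 0D B5"):
    print(digits, signed_manual(digits), signed_builtin(digits))
# F5 06 09 AB -184153685 -184153685
# F6 17 3C DE -166249250 -166249250
# 0B F6 0D B5 200674741 200674741
```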

Pfiouh ! At least I learned sumpting today...

EDIT 3 : I re-read this, which made me realize two potential caveats with what I suggested above :
1) You might need to place the parameters before the names of the input / output files for the ddrescue commands to work (and “sudo” before).
2) Since the issue is that the partition on the recovery drive is no longer accessible by regular means, the partition might not be recognized at all on a Linux environment, so referencing it with (for instance) “/dev/sdb1” might not work. In which case, use the whole device as input, and change the offset values according to the partition offset. The typical partition offset for a GPT partitioned 3TB HDD with a single partition is 129MB or 135266304 bytes ; if R-Studio recognizes the partition it should indicate its offset somewhere, if it's a different value, correct accordingly.

Code: Select all

sudo ddrescue -i 750282145792 -s 24576 -o 0 /dev/sdb /media/sdc1/sdf_copy.img /media/sdc1/sdf_copy.log
sudo ddrescue -i 750465728512 -s 2250257096704 -o 24576 /dev/sdb /media/sdc1/sdf_copy.img /media/sdc1/sdf_copy.log
sudo ddrescue -i 3000789938176 -s 750335860736 -o 2250257121280 /dev/sdb /media/sdc1/sdf_copy.img /media/sdc1/sdf_copy.log
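If the partition offset turns out to be different, the three input offsets have to be recomputed. A minimal sketch in Python, where `PARTITION_OFFSET` is the typical 129MB value mentioned above and should be replaced by whatever R-Studio reports :

```python
PARTITION_OFFSET = 129 * 1024 * 1024  # 135266304 bytes, typical value only; verify yours

# Input offsets relative to the start of the partition (first set of commands)
partition_relative = [750146879488, 750330462208, 3000654671872]

# Input offsets relative to the start of the whole device
device_relative = [offset + PARTITION_OFFSET for offset in partition_relative]

print(PARTITION_OFFSET)  # 135266304
print(device_relative)   # [750282145792, 750465728512, 3000789938176]
```

Only the -i values change ; the -s and -o values stay the same, since they describe the output file.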

It would be wise to run a quick test with for instance a 1MB size (-s 1048576), then open the output with a hexadecimal editor to check if it looks like it's supposed to, before running the whole script, which should take a few hours.

EDIT 4 : I had made some mistakes in the commands above, mixing up “sdc” and “sdc1” ; fixed.

slaapliedje · Post by **slaapliedje** » Mon Sep 21, 2020 11:45 am

Missed your last reply, but after buying the Windows version and loading up the scan data, it wouldn't even show the lost NTFS partition with the same scan data from the Linux version...

So I ended up having to reboot into Linux, and tried again. And now it is actually recovering correctly! Guess I will see if I can finally get more data out of it this time. Not sure why Linux needed a reboot, but if it works, I will not complain.

abolibibelot · Post by **abolibibelot** » Mon Sep 21, 2020 2:25 pm

Well, then it may be useful to someone, someday... é_è
(I got such a bad headache yesterday after all these tedious calculations, I was hoping that it would be worth it. But it's one of my specialties to put in a lot of effort for very little effect.)

slaapliedje · Post by **slaapliedje** » Mon Sep 21, 2020 4:37 pm

abolibibelot wrote: ↑
Mon Sep 21, 2020 2:25 pm
Well, then it may be useful to someone, someday... é_è
(I got such a bad headache yesterday after all these tedious calculations, I was hoping that it would be worth it. But it's one of my specialties to put in a lot of effort for very little effect.)

I totally appreciate it, and I learned a lot actually. It's kind of weird that the version numbers of the software are so far off from each other. Also, why is it that the Linux version would show the NTFS partition fine, but Windows would not?

Then again, that brings us full circle on why this happened in the first place. The drive was 6tb in Linux, but formatted as NTFS. But Windows was seeing it as non-GPT and did not ask to make sure I wanted to lose my data before it tried to convert it! Everything in Linux accepted it was GPT, otherwise it wouldn't have allowed me to write above 2tb.

Data Recovery and Disk Utilities Forum @R-TT