Thursday, April 14, 2022
Introduction
Recovery records are a known benefit of the RAR5 format, yet no research on their effectiveness is available. As such, users do not know whether the recovery record will be effective for their purposes. In this study, both sequential and sparse corruptions of the archive are tested, with varying recovery record sizes.
Background
First of all, why use such software?
Archival software is used to create an archive of files. Basic archival functions are built into common operating systems through ZIP files, but the possibilities are limited. For more options, programs such as WinRAR are required. Whilst WinRAR does support ZIP, its main attraction is the RAR5 format, which has a much greater feature set.
Once an archive of files has been created, it can be sent to someone as a single unit, encrypted for added security, and backed up to the cloud. In the case of RAR5, a recovery record can also be added, protecting the archive from damage.
When a folder containing many files is archived, a single file containing all others can be produced. This single file will be faster to transfer between devices, sometimes substantially so. If a large file is to be uploaded to a storage service, a split archive can be created, turning one 40GB file into, say, twenty 2GB files, so that network instability only requires the reupload of one 2GB chunk rather than the entire 40GB file. Additionally, instability during restoration is less of an issue - if the provider times out whilst a single 40GB file is being downloaded, it could be difficult to retrieve the data.
More advanced formats additionally support superior compression algorithms, often resulting in smaller archive sizes than ZIP. The ZIP support built into operating systems is quite basic and does not offer features such as split archives or strong encryption. Anyone working seriously with archival will require something more substantial.
The options are RAR5 through WinRAR, as tested in this study, and notably 7z through 7-Zip. RAR5 and 7z support many comparable features. In the author's personal experience, 7z often achieves superior compression, but it does not support recovery data. Without recovery data, any damage to the archive has the potential to result in a complete loss of data. When these archives are uploaded for off-site backup, encryption is essential. The primary concern, after security, is data integrity.
The Recovery Record
What could cause a data integrity issue? When files are stored, the main culprits are bit rot, general disk errors, and disk failure.
Bit rot is a natural phenomenon caused by the decay of charge within the storage device, whereby there is a chance that a bit of data can change from one to zero.
In time, a storage device will encounter an uncorrectable error of some kind when trying to read or write data. The only scenario where this does not occur is if it fails completely in advance. When an uncorrectable error is encountered, there is a chance that data has been lost, depending on the type of error. If the error identifies a failed sector, and that sector is now unreadable, then any file using that sector is now corrupt. A sector is a chunk of data typically either 512 bytes or 4096 bytes in size. These errors can be monitored using CrystalDiskInfo.
The above phenomena are somewhat rare, but if the data is valuable, action must be taken to detect and recover from these faults. This article is not intended as a deep dive into these strategies, but know that the first line of defence is to store the data across multiple devices - any storage device can fail at any time, so if data is stored only on one disk, then this data is unsafe. If using only two storage devices, ensure they are isolated from each other to mitigate the risks of accidental deletion and malware.
A backup strategy can include RAR5 archives with recovery records, such that any missing data can be recovered - but just how useful is the recovery record?
A recovery record must be added to an archive whilst it is intact. The recovery record is then used to resolve any corruptions encountered later.
How do we know if an archive has been corrupted?
Knowing Whether An Archive Is Corrupt
Good archive formats such as RAR5 contain checksums, which they use to verify data integrity. A manual check can be performed using WinRAR's Test function. Extraction can also be attempted, and WinRAR will report any faults it encounters, such as with the message below.
WinRAR, during an extraction, reporting archive corruption.
Even after repairing the archive, WinRAR may report "Unexpected end of archive". This does not necessarily indicate an issue. It was found that even after seeing this message, the extracted files verified successfully against independent checksums created with LiamFootHash.
WinRAR reporting unexpected end of archive, after repairing.
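An independent check of this kind can be sketched in a few lines of Python. This is only an illustration: SHA-256 from the standard library stands in for whatever algorithm LiamFootHash uses, and the function names file_digest and files_match are hypothetical.

```python
import hashlib

def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so that large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def files_match(original_path: str, extracted_path: str) -> bool:
    """True when the extracted file hashes identically to the original."""
    return file_digest(original_path) == file_digest(extracted_path)
```

The digest of the original file would be recorded before archiving, then compared against the digest of the file extracted from the repaired archive.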
Repairing an Archive
On finding that the archive is corrupt, repairs can be attempted. Open the archive, go to Tools > Repair Archive, and click OK. A window similar to the following then appears:
WinRAR repairing 1MB of damage
Note that the original archive was named "Firmware.rar", and the repaired archive is named "fixed.Firmware.rar". This is an automatic convention. Once this stage completes, files must be extracted from the fixed archive, not from the initial, damaged archive.
If the archive has damaged headers, it may not open at all when clicked through File Explorer - in this case, the WinRAR Graphical User Interface or command-line must be used.
Note that the presence of a fixed archive does not mean it was actually fixed. This "fixed" archive must be tested. The key point from the image below is that a file reports as corrupt. This means the repair was not successful. The unexpected end of archive, and corrupt recovery record, are not important. If the repair was successful, the fixed archive should be extracted and then re-archived.
When testing the fixed archive, the archive is known to be fixed correctly if there are no reported checksum errors. See an example of a checksum error below, indicating a failed repair.
WinRAR fixed archive reports as corrupt when tested.
Recovery Testing Introduction
In the next section, an archive is created containing a single file nested inside two folders. The file, a PlayStation 3 firmware, was randomly selected for testing, and this is simply the structure in which it was found. As such, all tests are performed on an archive containing only one file.
In this testing, the archive contains a recovery record of 1% (Firmware.rar) or 5% (Firmware5.rar), as specified, with other archive settings being identical. Sequential and sparse corruptions are then tested against the archives containing 1% and 5% recovery records, explained in further detail in the discussion. Though a user specifies a percentage for the recovery record, WinRAR will decide on its exact sizing.
The corruptions are performed by replacing bytes within the file. To do this, a brief .NET 6 Console Application was created. This program reads the file stream, and writes random noise to the appropriate positions.
All prefixes such as k (kilo) and M (mega) are used as binary prefixes, meaning they actually stand for kibi, mebi, etc. in this context.
For example, 1k == 1024 bytes.
512k == 1024 * 512 bytes == 524,288 bytes.
1M == 1024^2 bytes == 1,048,576 bytes.
In the program, the above rules are used to calculate bytes. Where a fractional number of bytes is found, for example 1.7M == 1.7 * 1024^2 == 1,782,579.2 bytes, the value is always rounded down to the nearest byte.
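The size rules above can be sketched as follows. The original program was a .NET application whose source is not shown, so this Python version is only an illustration of the same arithmetic, and parse_size is a hypothetical name.

```python
import math
from decimal import Decimal

# Binary multipliers, following the article's convention (k = 1024, M = 1024^2).
MULTIPLIERS = {"k": 1024, "M": 1024 ** 2}

def parse_size(text: str) -> int:
    """Convert a size such as '512k', '1.7M' or '1024' to whole bytes, rounding down."""
    text = text.strip()
    if text and text[-1] in MULTIPLIERS:
        number, factor = text[:-1], MULTIPLIERS[text[-1]]
    else:
        number, factor = text, 1
    # Decimal keeps '1.7' exact, so the floor is not affected by float rounding.
    return math.floor(Decimal(number) * factor)
```

For example, parse_size("1.7M") gives 1782579 and parse_size("512k") gives 524288, matching the figures above.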
Please note all tests use WinRAR 6.02.
Recovery Testing - 1% Recovery Record
Firmware.rar: 198.3781MB, 1% Recovery (2.01MB), Solid, Encrypted, BLAKE2 Checksums, Best compression, 1024MB dictionary.
Sequential Corruption Test
In this test, one sequential chunk of bytes is replaced, beginning at Start Position, and then replacing the number of bytes specified under Bytes Changed. The replacement bytes are random noise.
Some results are marked with an asterisk. See the Header Corruption section of the discussion for details.
Start Position (Bytes) | Bytes Changed (Bytes) | Recoverable |
0 | 1024 | Yes* |
0 | 1.925M | Yes* |
0 | 1.950M | Yes* |
0 | 1.975M | No |
0 | 2.000M | No |
512k | 1024 | Yes |
1M | 1024 | Yes |
1M | 1.000M | Yes |
1M | 1.500M | Yes |
1M | 1.900M | Yes |
1M | 1.925M | Yes |
1M | 1.950M | No |
1M | 2.000M | No |
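The sequential corruption used in this test can be sketched as below. This is a minimal Python illustration rather than the .NET program used for the study. The function name corrupt_sequential is hypothetical, and it should only ever be run against a disposable copy of an archive.

```python
import os

def corrupt_sequential(path: str, start: int, length: int) -> None:
    """Overwrite `length` bytes of the file, beginning at `start`, with random noise."""
    size = os.path.getsize(path)
    if start < 0 or length < 0 or start + length > size:
        raise ValueError("corruption range falls outside the file")
    # "r+b" modifies the file in place without truncating or shifting data.
    with open(path, "r+b") as f:
        f.seek(start)
        f.write(os.urandom(length))
```

For instance, the row "1M | 1.925M" would correspond to a start of 1,048,576 and a length of 2,018,508 bytes. The file size never changes, as bytes are replaced rather than inserted or removed.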
Sparse Corruption Test - 1%
In this test, X bytes are replaced at every interval of 1% of the file's size - so bytes are replaced at 1%, 2%, 3%... all the way to 99%. 100% is the end of the file, so nothing can be done there. In total there are 99 starting points. For this file, 1% of its size is 1.9838MB.
Total Bytes Replaced (Bytes) | Approximate Bytes Per Starting Point (Bytes) | Recoverable |
512k | 5.17k | No |
256k | 2.59k | No |
50k | 517 | No |
1k | 10 | No |
99 | 1 | No |
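The sparse procedure can be sketched in the same way. How the original program placed each starting point is not stated, so the offset formula here (file size multiplied by the percentage, rounded down) is an assumption, as are the names sparse_offsets and corrupt_sparse.

```python
import os

def sparse_offsets(file_size: int, interval_percent: int) -> list[int]:
    """Start offsets at every multiple of interval_percent, excluding 0% and 100%."""
    return [file_size * p // 100
            for p in range(interval_percent, 100, interval_percent)]

def corrupt_sparse(path: str, interval_percent: int, total_bytes: int) -> int:
    """Spread total_bytes of random noise evenly across the starting points.

    Returns the number of bytes written at each point (rounded down).
    """
    size = os.path.getsize(path)
    offsets = sparse_offsets(size, interval_percent)
    per_point = total_bytes // len(offsets)
    if offsets[-1] + per_point > size:
        raise ValueError("last hole would run past the end of the file")
    with open(path, "r+b") as f:   # in-place edit, no truncation
        for offset in offsets:
            f.seek(offset)
            f.write(os.urandom(per_point))
    return per_point
```

With intervals of 1%, 2%, 5% and 10%, sparse_offsets yields 99, 49, 19 and 9 starting points respectively, matching the counts used in these tests.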
Sparse Corruption Test - 2%
As above, but with intervals at 2% of the file's size - so bytes are replaced at 2%, 4%, 6%... all the way to 98%. 100% is the end of the file, so nothing can be done there. In total there are 49 starting points. For this file, 2% of its size is 3.9676MB.
Total Bytes Replaced (Bytes) | Approximate Bytes Per Starting Point (Bytes) | Recoverable |
256k | 5.22k | No |
50k | 1.02k | No |
1k | 20 | No |
49 | 1 | No |
Sparse Corruption Test - 5%
As above, but with intervals at 5% of the file's size - so bytes are replaced at 5%, 10%, 15%, 20%... all the way to 95%. 100% is the end of the file, so nothing can be done there. In total there are 19 starting points. For this file, 5% of its size is 9.9189MB.
Total Bytes Replaced (Bytes) | Approximate Bytes Per Starting Point (Bytes) | Recoverable |
1.9M | 102k | No |
1.8M | 97k | No |
1.5M | 81k | No |
1M | 54k | No |
900k | 47k | No |
875k | 46k | No |
850k | 45k | Yes |
768k | 40k | Yes |
512k | 27k | Yes |
256k | 13.5k | Yes |
Sparse Corruption Test - 10%
As above, but with intervals at 10% of the file's size - so bytes are replaced at 10%, 20%, 30%... all the way to 90%. 100% is the end of the file, so nothing can be done there. In total there are 9 starting points. For this file, 10% of its size is 19.8378MB.
Total Bytes Replaced (Bytes) | Approximate Bytes Per Starting Point (Bytes) | Recoverable |
1.9M | 216k | No |
1.5M | 171k | No |
1.35M | 154k | No |
1.325M | 151k | No |
1.3M | 148k | Yes |
1.2M | 137k | Yes |
1M | 114k | Yes |
Recovery Testing - 5% Recovery Record
Firmware5.rar: 206.4367MB, 5% Recovery (10MB), Solid, Encrypted, BLAKE2 Checksums, Best compression, 1024MB dictionary.
Sequential Corruption Test
In this test, one sequential chunk of bytes is replaced, beginning at Start Position, and then replacing the number of bytes specified under Bytes Changed. The replacement bytes are random noise.
Some results are marked with an asterisk. See the Header Corruption section of the discussion for details.
Start Position (Bytes) | Bytes Changed (Bytes) | Recoverable |
0 | 1024 | Yes* |
0 | 9.75M | Yes* |
0 | 9.80M | Yes* |
0 | 9.85M | No |
512k | 1024 | Yes |
1M | 2.00M | Yes |
1M | 8.00M | Yes |
1M | 9.50M | Yes |
1M | 9.75M | Yes |
1M | 9.80M | No |
1M | 9.85M | No |
1M | 9.90M | No |
1M | 10.00M | No |
Sparse Corruption Test - 1%
In this test, X bytes are replaced at every interval of 1% of the file's size - so bytes are replaced at 1%, 2%, 3%... all the way to 99%. 100% is the end of the file, so nothing can be done there. In total there are 99 starting points. For this file, 1% of its size is 2.0644MB.
Total Bytes Replaced (Bytes) | Approximate Bytes Per Starting Point (Bytes) | Recoverable |
2M | 20.69k | No |
1.75M | 18.10k | No |
1.725M | 17.84k | No |
1.7M | 17.58k | Yes |
1.625M | 16.81k | Yes |
1.5M | 15.52k | Yes |
512k | 5.17k | Yes |
99 | 1 | Yes |
Sparse Corruption Test - 2%
As above, but with intervals at 2% of the file's size - so bytes are replaced at 2%, 4%, 6%... all the way to 98%. 100% is the end of the file, so nothing can be done there. In total there are 49 starting points. For this file, 2% of its size is 4.1287MB.
Total Bytes Replaced (Bytes) | Approximate Bytes Per Starting Point (Bytes) | Recoverable |
6M | 125.39k | No |
5.9M | 123.30k | No |
5.825M | 121.73k | No |
5.8M | 121.21k | Yes |
5.75M | 120.16k | Yes |
5.5M | 114.94k | Yes |
5M | 104.49k | Yes |
4M | 83.59k | Yes |
2M | 41.80k | Yes |
Sparse Corruption Test - 5%
As above, but with intervals at 5% of the file's size - so bytes are replaced at 5%, 10%, 15%, 20%... all the way to 95%. 100% is the end of the file, so nothing can be done there. In total there are 19 starting points. For this file, 5% of its size is 10.3218MB.
Total Bytes Replaced (Bytes) | Approximate Bytes Per Starting Point (Bytes) | Recoverable |
8M | 431.16k | No (corrupt header) |
6M | 323.37k | No (corrupt header) |
5M | 269.47k | No (corrupt header) |
4.75M | 256k | No (corrupt header) |
4.7M | 253.31k | Yes |
4.65M | 250.61k | Yes |
4.55M | 245.22k | Yes |
4.5M | 242.53k | Yes |
4M | 215.58k | Yes |
3M | 161.68k | Yes |
1M | 53.89k | Yes |
Sparse Corruption Test - 10%
As above, but with intervals at 10% of the file's size - so bytes are replaced at 10%, 20%, 30%... all the way to 90%. 100% is the end of the file, so nothing can be done there. In total there are 9 starting points. For this file, 10% of its size is 20.6437MB.
Total Bytes Replaced (Bytes) | Approximate Bytes Per Starting Point (Bytes) | Recoverable |
9M | 1024k | No |
8.75M | 995.56k | No |
8.625M | 981.33k | No |
8.6M | 978.49k | No |
8.55M | 972.80k | Yes |
8.5M | 967.11k | Yes |
8M | 910.22k | Yes |
Discussion
Header Corruption
With both 1% and 5% recovery records, it was initially reported that header damage at the beginning of the archive resulted in a complete loss of data. These results are marked with asterisks in the Sequential Corruption tests, and have been amended, as they would previously have shown No, where they now show Yes.
When such damage occurs, if a user double-clicks the archive in the Windows File Explorer, or right-clicks the archive and Opens With WinRAR, the following message appears.
Error message when the beginning of an archive has been damaged, shown when opening through File Explorer.
When this message appears, WinRAR does not open, and no recovery options are presented. This led to the conclusion that such damage to the archive's headers, placed at the beginning of the archive, was not recoverable. After publishing this article, the author was contacted by Eugene Roshal of WinRAR, who stated that the recovery record includes backup headers and that this damage should be recoverable.
This damage is indeed recoverable, as Eugene stated. However, a user must open the WinRAR Graphical User Interface, which presents as a file explorer, navigate to the damaged archive, and then click the Repair button in the Ribbon. Alternatively, a user could execute the Repair command from a command line within the WinRAR installation directory. For example:
WinRAR.exe r "F:\temp\Recovery Testing\Firmware.rar" "F:\temp\Recovery Testing"
This command repairs Firmware.rar and places a fixed.Firmware.rar into the Recovery Testing directory.
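Where many archives need repairing, the same command could be driven from a script. The sketch below simply wraps the command shown above in Python. The function names are hypothetical, the executable path is an assumption that depends on the installation, and the meaning of the returned exit code is not covered here.

```python
import subprocess

def build_repair_command(winrar_exe: str, archive: str, output_dir: str) -> list[str]:
    """Argument list equivalent to: WinRAR.exe r <archive> <output_dir>"""
    return [winrar_exe, "r", archive, output_dir]

def repair(winrar_exe: str, archive: str, output_dir: str) -> int:
    """Run the repair command and return the process exit code unchanged."""
    command = build_repair_command(winrar_exe, archive, output_dir)
    # Passing a list avoids manual quoting of paths that contain spaces.
    return subprocess.run(command, check=False).returncode
```

As with a manual repair, the resulting fixed archive still has to be tested before it can be trusted.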
Conversely, for all other types of damage, even for the 5% sparse corruption test with the 5% recovery record where WinRAR explicitly reports damaged headers and is unable to recover, the archive still opens in WinRAR when selected through File Explorer.
WinRAR reporting a corrupt header when opening the 5% recovery record archive after the 5% sparse corruption test with 8M of corruption.
For the case of damage to the beginning of the archive where the archive was not openable through File Explorer, the author suggested to Eugene that this be fixed, and the following response was received:
This is a rather complicated issue. Indeed, it would make repairing archives missing their signature and headers simpler. But it can confuse users in other cases. Suppose someone renamed a .jpg file to .rar, either by mistake or on purpose. Currently WinRAR refuses to open such files and issues that "wrong format or damaged" message. If we allow WinRAR to open such a file as an empty .rar archive, even if we display the same error message in another window, a user is more likely to mistakenly decide that it might be a valid archive. On the one hand there are broken archives without a signature, where it might be better to open them as an empty archive. On the other hand there are files of an incorrect format with their signature mismatching their file extension and it is better to refuse opening them. We need to choose which use case is more important. For now I decided to prioritize the wrong format scenario.
The author appreciates the reasoning behind this decision.
Again regarding the 5% sparse corruption test for the 5% recovery record, it is seen that important headers, for this particular archive, are also located approximately 256k after one or more intervals of 10.3218M, causing the recovery record to underperform due to loss of headers. Eugene stated that each block of the recovery record, which can be up to 64KB in size, includes its own headers. It seems the interval here is coincidentally damaging the recovery record beyond repair. This behaviour was not seen in any other tests.
The archive with 5% recovery record was also re-archived with the same settings, and still encountered corrupt headers at one or more of these locations. This means the archive layout is consistent when re-archiving. Additionally, re-archiving with the same password does produce a physically different archive even if the layout appears consistent, which is good for security - if it produced a byte-for-byte identical archive, that could expose the encryption to attack.
The following was found at rarlab.com:
"Normally RAR uses the quick open data to store copies of file and service headers."
https://www.rarlab.com/technote.htm
The 5% recovery record archive was again re-archived with the same settings, but also including quick open information for all files. The corrupt headers were still encountered at the same place, so the quick open data could not resolve this corruption.
Sequential Corruption
The sequential corruption tests simulate a sequential loss of data; that is, a single block of data that is replaced with random noise. It is in this scenario that the recovery record performs at its best, and is able to repair slightly less damage than the size of the recovery record.
The below table reports the recovery record efficiency. This is calculated as (number of bytes repaired) / (recovery record size). Note that for the 5% recovery record, the size of this record is reported as 10MB with no specified precision, whereas the 1% recovery record is reported as 2.01MB.
Start Point of Sequential Damage (Bytes) | 1% Recovery Record Efficiency | 5% Recovery Record Efficiency |
0 | 97.0% | 98.0% |
1M | 95.8% | 97.5% |
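The efficiency figures in the table can be reproduced from the largest recoverable run in each sequential test and the reported record sizes. A short Python sketch, with efficiency_percent as a hypothetical helper name:

```python
def efficiency_percent(repaired_m: float, record_m: float) -> float:
    """(bytes repaired) / (recovery record size), as a percentage to one decimal place."""
    return round(repaired_m / record_m * 100, 1)

# Largest recoverable sequential damage per test (in M), taken from the result tables.
results = {
    "1% record, start 0": (1.950, 2.01),
    "1% record, start 1M": (1.925, 2.01),
    "5% record, start 0": (9.80, 10.0),
    "5% record, start 1M": (9.75, 10.0),
}

for label, (repaired, record) in results.items():
    print(f"{label}: {efficiency_percent(repaired, record)}%")
```

Because both values share the same unit, the ratio is unaffected by the binary prefix convention.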
The slight variance in efficiency depending on the start point is because WinRAR is working in blocks of 64KB. When one byte of a block is corrupted, the entire block is rendered corrupt. Depending on the start point, the number of corrupt blocks can be slightly larger than when starting at 0. These blocks also start at some point after the headers, so they are offset from zero. It is likely that the test starting at 1M is damaging an additional block of data, which is why the efficiency is slightly lower.
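The alignment effect described here can be illustrated with a small calculation. The sketch assumes fixed 64KB blocks starting at some origin after the headers. The true origin inside a RAR5 archive is not known from this study, so the origin parameter is purely hypothetical.

```python
BLOCK = 64 * 1024  # assumed block size in bytes

def blocks_touched(start: int, length: int, origin: int = 0, block: int = BLOCK) -> int:
    """Number of fixed-size blocks overlapped by the byte range [start, start + length)."""
    if length <= 0:
        return 0
    first = (start - origin) // block
    last = (start + length - 1 - origin) // block
    return last - first + 1

# A 128k run aligned to a block boundary touches two blocks...
aligned = blocks_touched(0, 128 * 1024)
# ...but the same run shifted by a single byte touches three.
shifted = blocks_touched(1, 128 * 1024)
```

The same quantity of damage can therefore cost one extra block of recovery data, depending only on where it begins.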
Sparse Corruption
This test simulates a sparse loss of data, that is to say it loses many chunks of data. Imagine punching many holes into a sheet of paper. As the number of holes increases, the effectiveness of the recovery record decreases, restoring progressively less data. For 1% (99 holes) and 2% (49 holes) sparse corruptions, the 1% recovery record was not effective at all, even for single byte losses. The 5% recovery record resolved this.
Note that in practice, a single byte of damage will corrupt an entire block of data sized up to 64k. For 19 holes, this results in 19 damaged blocks, or up to 1216k total. For the 1% recovery record, this is less than the size of the recovery record and is recoverable. For 49 holes, this is 49 damaged blocks, or up to 3136k total, which is larger than the size of the recovery record, and so is not recoverable.
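The worst-case arithmetic above can be checked with a few lines of Python. This is a simplified model that assumes each hole damages exactly one block of up to 64k and that the record can repair at most its own size. It ignores holes that span several blocks.

```python
BLOCK_K = 64  # worst-case block size, in k

def worst_case_damage_k(holes: int) -> int:
    """Total damaged data, in k, if every hole ruins one full 64k block."""
    return holes * BLOCK_K

def fits_in_record(holes: int, record_k: float) -> bool:
    """True when the worst-case damage is smaller than the recovery record."""
    return worst_case_damage_k(holes) < record_k

record_1_percent_k = 2.01 * 1024   # 2.01M record, as reported for Firmware.rar
record_5_percent_k = 10 * 1024     # 10M record, as reported for Firmware5.rar

for holes in (9, 19, 49, 99):
    print(holes,
          worst_case_damage_k(holes),
          fits_in_record(holes, record_1_percent_k),
          fits_in_record(holes, record_5_percent_k))
```

Under this model the 1% record covers 9 and 19 holes but not 49 or 99, whereas the 5% record covers all four cases, which agrees with the single-byte results in the tables.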
Bytes Restored vs. Number of Holes, 1% Recovery Record
Bytes Restored vs. Number of Holes, 5% Recovery Record
Relative Recoverability
Number of Holes | Relative Recoverability of 5% record, compared with 1% record |
1 | 5.06 |
9 | 6.58 |
19 | 5.66 |
49 | infinite |
99 | infinite |
Whilst the recovery record increased to 5x its initial size, the amount of data that was recoverable increased by more than this. The sequential test was closest, restoring only slightly more than 5x the data; but there are notable improvements beyond sheer size for 9 and 19 holes, as well as for 49 and 99 holes, which result in an infinite improvement.
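The ratios in the table can be reproduced from the largest recoverable totals in the result tables. In the sketch below, the single-hole row uses the sequential results starting at 1M, which is an assumption based on those values matching the published ratio. The 49 and 99 hole rows use zero for the 1% record, since nothing was recoverable there.

```python
import math

K_PER_M = 1024

# Largest recoverable total per hole count, in k: (1% record, 5% record).
largest_recovered_k = {
    1: (1.925 * K_PER_M, 9.75 * K_PER_M),
    9: (1.3 * K_PER_M, 8.55 * K_PER_M),
    19: (850, 4.7 * K_PER_M),
    49: (0, 5.8 * K_PER_M),
    99: (0, 1.7 * K_PER_M),
}

def relative_recoverability(one_percent_k: float, five_percent_k: float) -> float:
    """Ratio of data recovered by the 5% record to data recovered by the 1% record."""
    if one_percent_k == 0:
        return math.inf  # the 1% record recovered nothing at all
    return five_percent_k / one_percent_k

ratios = {holes: relative_recoverability(*pair)
          for holes, pair in largest_recovered_k.items()}
```

Rounded to two decimal places, the finite ratios agree with the table: 5.06, 6.58 and 5.66.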
Brief Notes on Recoverability without Encryption and without Recovery Records
Additional archives were created, each with no recovery record. One was encrypted, the other was not. In separate tests, single bytes were replaced at positions 1M and 2M. Neither archive could recover from these single-byte losses. However, if the archive contained many files, this corruption would perhaps only affect one file, and the others may be recoverable. This theory may only be true if Solid archiving is not used.
"if any file in a solid archive is damaged, it will be impossible to extract all files which follow the damaged area. Thus if a solid archive is stored to a potentially unreliable media, it is recommended to make use of the recovery record."
https://documentation.help/WinRAR/HELPArcSolid.htm
Solid archiving tends to result in stronger compression by considering many files as one, and then compressing them together. This is not relevant to the tests in this study, as the archives contain only a single file. Solid archiving was chosen for its general compression benefit.
Further Research
It would be interesting to see whether recoverability is influenced by the number of files in an archive, or the number of parts the archive is split into. Some users may also like to see recoverability without encryption.
In this testing, data was replaced by random noise without shifting the remaining data. This is representative of disk corruption. Further testing could repeat this experiment, but rather than replace a chunk of data with random noise, simply shift the next data down to write over the chunk to be replaced.
Conclusion
The RAR5 recovery record is effective at recovering data. In the case of sequential corruptions, recoverable data is slightly less than the size of the recovery record. For sparse corruptions, recoverable data lessens with each separate hole. Furthermore, increasing the size of the recovery record enables recovery from larger numbers of holes. For the archives used in this study, the absence of a recovery record meant that any archive damage resulted in a complete loss of data, making the recovery record essential in the author's opinion.
RAR5 does not negate the requirement to back up important data across multiple isolated devices, but is likely to protect archives from damage on any single storage device, increasing overall data durability.
Change Log
2022-04-22
After being contacted by Eugene Roshal of WinRAR, the following amendments were made: