File recovery: Difference between revisions

From ArchWiki
(→‎Text file recovery: correcting a typing error)
(→‎List of utilities: Add {{AUR|xfs_undelete-git}} to the list of recovery utilities)
 
(79 intermediate revisions by 31 users not shown)
Line 1: Line 1:
[[Category:File systems]]
[[Category:File systems]]
[[Category:System recovery]]
[[Category:System recovery]]
[[zh-cn:File recovery]]
[[ja:ファイルリカバリ]]
This article lists data recovery and undeletion options for Arch Linux.
[[zh-hans:File recovery]]
{{Related articles start}}
{{Related|/Post recovery tasks#Photorec}}
{{Related articles end}}


== Special notes==
This article lists data recovery and undeletion options for Linux.
 
== Special notes ==


=== Before you start ===
=== Before you start ===


This page is mostly intended to be used for educational purposes. If you have accidentally deleted or otherwise damaged your '''valuable and irreplaceable''' data and have no previous experience with data recovery, turn off your computer immediately (Just press and hold the off button or pull the plug; do not use the system shutdown function) and seek professional help. It is quite possible and even probable that, if you follow any of the steps described below without fully understanding them, you will worsen your situation.
{{Expansion|The following advice is not true for devices which are failing, and mostly applies to accidental file deletion on a healthy drive.}}
 
This page is mostly intended to be used for educational purposes. If you have accidentally deleted or otherwise damaged your '''valuable and irreplaceable''' data and have no previous experience with data recovery, turn off your computer immediately (Just press and hold the off button or pull the plug; do not use the system shutdown function) and seek professional help.
 
{{Warning|It is quite possible and even probable that, if you follow any of the steps described below without fully understanding them, you will worsen your situation.}}


=== Failing drives ===
=== Failing drives ===
Line 17: Line 26:


The image files created from a utility like ddrescue can then be mounted like a physical device and can be worked on safely. Always make a copy of the original image so that you can revert if things go sour!
The image files created from a utility like ddrescue can then be mounted like a physical device and can be worked on safely. Always make a copy of the original image so that you can revert if things go sour!
{{Accuracy|Although written on the blog of a data recovery company, it seems there are [https://drivesaversdatarecovery.com/why-the-freezer-trick-for-hard-drives-doesnt-work/ voices against] the "freezer trick" on drives from the last 10 years. This paragraph is mostly untouched since 2009 and might not be applicable to modern drives.}}


A tried and true method of improving failing drive reads is to keep the drive cold. A bit of time in the freezer is appropriate, but be careful to avoid bringing the drive from cold to warm too quickly, as condensation will form. Keeping the drive in the freezer with cables connected to the recovering PC works great.
A tried and true method of improving failing drive reads is to keep the drive cold. A bit of time in the freezer is appropriate, but be careful to avoid bringing the drive from cold to warm too quickly, as condensation will form. Keeping the drive in the freezer with cables connected to the recovering PC works great.
Line 29: Line 40:


To make an image, one can use {{ic|dd}} as follows:
To make an image, one can use {{ic|dd}} as follows:
  # dd if=/dev/target_partition of=/home/user/partition.image
 
  # dd if=/dev/''target_partition'' of=/home/''user/partition''.image
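
Since all later work should happen on a copy of the image, it helps to verify the image and set the original aside right after creating it. The following is only a sketch: a scratch file stands in for the partition so that it can be tried without root, and the file names under the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Stand-in for /dev/target_partition: a 1 MiB scratch file (assumption, so no root is needed).
workdir=$(mktemp -d)
source_dev="$workdir/target_partition"
head -c 1048576 /dev/urandom > "$source_dev"

# Image the "partition". A block size larger than dd's 512-byte default
# usually speeds up large copies.
dd if="$source_dev" of="$workdir/partition.image" bs=1M 2>/dev/null

# Verify that the image is byte-for-byte identical to its source.
cmp "$source_dev" "$workdir/partition.image" && echo "image verified"

# Keep the original image read-only and experiment on a writable copy.
chmod a-w "$workdir/partition.image"
cp "$workdir/partition.image" "$workdir/partition.work.image"
chmod u+w "$workdir/partition.work.image"
```

On a healthy drive the same steps apply with the real partition as {{ic|1=if=}}; for a failing drive, prefer ddrescue as described above.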


=== Working with digital cameras ===
=== Working with digital cameras ===
Line 35: Line 47:
In order for some of the utilities listed in the next section to work with flash media, the device in question needs to be mounted as a block device (i.e., listed under /dev). Digital cameras operating in PTP (Picture Transfer Protocol) mode will not work in this regard. PTP cameras are transparently handled by libgphoto and/or libptp. In this case, "transparently" means that PTP devices do not get block devices. The alternative to PTP mode, USB Mass Storage (UMS) mode, is not supported by all cameras. Some cameras have a menu item that allows switching between the two modes; refer to your camera's user manual. If your camera does not support UMS mode and therefore cannot be accessed as a block device, your only alternative is to use a flash media reader and physically remove the storage media from your camera.
In order for some of the utilities listed in the next section to work with flash media, the device in question needs to be mounted as a block device (i.e., listed under /dev). Digital cameras operating in PTP (Picture Transfer Protocol) mode will not work in this regard. PTP cameras are transparently handled by libgphoto and/or libptp. In this case, "transparently" means that PTP devices do not get block devices. The alternative to PTP mode, USB Mass Storage (UMS) mode, is not supported by all cameras. Some cameras have a menu item that allows switching between the two modes; refer to your camera's user manual. If your camera does not support UMS mode and therefore cannot be accessed as a block device, your only alternative is to use a flash media reader and physically remove the storage media from your camera.


== Foremost==
== List of utilities ==
 
See also [[Wikipedia:List of data recovery software#File Recovery]]
 
* {{App|ddrutility|Complement to GNU {{Pkg|ddrescue}}. Finds which files are affected by bad sectors and includes some special tools for NTFS. No longer actively supported.|https://sourceforge.net/projects/ddrutility/|{{AUR|ddrutility}}}}
* {{App|[[Wikipedia:dvdisaster|dvdisaster]]|Additional error protection for CD/DVD media.|https://sourceforge.net/projects/dvdisaster/|{{AUR|dvdisaster}}}}
* {{App|ext4magic|Recovers deleted or overwritten files on ext3 and ext4 filesystems.|https://sourceforge.net/projects/ext4magic/|{{Pkg|ext4magic}}}}
* {{App|[[Foremost]]|Console program to recover files based on their headers, footers, and internal data structures. This process is commonly referred to as data carving. The headers and footers can be specified by a configuration file or command line switches can be used to specify built-in file types.|https://foremost.sourceforge.net/|{{Pkg|foremost}}}}
* {{App|[[Wikipedia:PhotoRec|PhotoRec]]|File data recovery software designed to recover lost files including video, documents and archives from hard disks, CD-ROMs, and lost pictures (thus the Photo Recovery name) from digital camera memory.|https://www.cgsecurity.org/|{{Pkg|testdisk}}}}
* {{App|Scalpel|File carving and indexing application originally based on [[Foremost]], although significantly more efficient. It allows an examiner to specify a number of headers and footers to recover filetypes from a piece of media.|https://github.com/sleuthkit/scalpel|{{AUR|scalpel-git}}}}
* {{App|[[Wikipedia:TestDisk|TestDisk]]|Data recovery software primarily designed to help recover lost partitions and/or make non-booting disks bootable again when these symptoms are caused by faulty software: certain types of viruses or human error (such as accidentally deleting a Partition Table).|https://www.cgsecurity.org/|{{Pkg|testdisk}}}}
* {{App|[[XFS#Undelete|xfs_undelete]]|Traverses the inode B+trees of each allocation group and tries to recover all files on an XFS filesystem marked as deleted.|https://github.com/ianka/xfs_undelete|{{AUR|xfs_undelete-git}}}}
 
== Ext4Magic ==
 
{{Pkg|ext4magic}} is a recovery tool for the ext3 and ext4 file systems.
 
{{Accuracy|Leaving the system on may let other processes running on the system overwrite the data blocks occupied by the deleted files. This is why [[#Before you start]] suggests to shut down the system ASAP. If anything, the advice should be to [https://man.archlinux.org/man/fsync.2 sync data to disk] before the shutdown.}}
 
If you erroneously deleted some files/folders, '''do not''' turn off your computer. For best results (actually, for decent results) you must save the ext4 journal somewhere.
 
Immediately open a terminal and dump a copy of the filesystem journal:
 
# sudo debugfs -R "dump <8> /some/safe/path/sd''XY''.journal" /dev/sd''XY''
 
Depending on whether the deleted files are on your root partition, you will want to save the journal to different locations: for root partitions, mount an external drive and dump the journal there; for non-root partitions, any other partition will do. Avoid saving to {{ic|/tmp}}, because your data may be cleaned up.
 
If the deleted files are on your root partition, you should turn off the computer after saving the journal. Do so by holding the power button until it turns off, because this prevents additional writes to the disk. Proceed with the process from the Arch bootable media. Otherwise, you may proceed with your booted system, after unmounting the affected partition ({{ic|sudo umount /dev/sdXY}} or {{ic|sudo umount /home}}, for example).


[http://foremost.sourceforge.net Foremost] is a console program to recover files based on their headers, footers, and internal data structures.  This process is commonly referred to as data carving. Foremost can work on disk image files (such as those generated by dd, Safeback, Encase, etc.) or directly on a drive. The headers and footers can be specified by a configuration file or command line switches can be used to specify built-in file types. These built-in types look at the data structures of a given file format, allowing for more reliable and faster recovery.
To list the recoverable files:


See [[Foremost]] article.
# ext4magic /dev/sd''XY'' -a "$(date -d "-2hours" +%s)" -f deleted/folders/root -j ''/some/safe/path/sdXY''.journal -l


== Extundelete ==
* {{ic|-a}} applies a filter so that only files deleted after a certain Unix epoch are shown; in this example, it's set at the last 2 hours. If you are running from Arch bootable media, you may want to change the timezone ({{ic|1=export TZ="Europe/Berlin"}}) before using absolute times. If not specified, the default is the last 24h.
* {{ic|-f}} indicates that {{ic|ext4magic}} should only list files in a certain subfolder. This path is relative to the partition root.
* {{ic|-j}} indicates that {{ic|ext4magic}} should use a specific backup of the journal; otherwise it uses the regular system journal, which does not contain a record of the deletion if you have restarted your computer.
* {{ic|-l}} lists the deleted files
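
The value passed to {{ic|-a}} is a plain Unix timestamp, so it can be prepared and checked with {{ic|date}} before handing it to {{ic|ext4magic}}. A minimal sketch, assuming GNU {{ic|date}}; the absolute time below is made up for illustration, and {{ic|UTC0}} is used only to keep the example independent of the local timezone.

```shell
#!/bin/sh
# Relative cutoff, as in the command above: two hours before now.
after_relative=$(date -d "-2hours" +%s)

# Absolute cutoff (hypothetical deletion time). The TZ variable decides how
# the wall-clock time is interpreted; UTC0 is a POSIX TZ string that needs
# no timezone database.
after_absolute=$(TZ=UTC0 date -d "2024-02-22 10:12:00" +%s)

# Convert the timestamps back to readable times to double-check them.
date -d "@$after_relative"
date -u -d "@$after_absolute" "+%Y-%m-%d %H:%M:%S"
```

Either variable can then be used in place of the command substitution, e.g. {{ic|-a "$after_absolute"}}.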


'''[http://extundelete.sourceforge.net/ Extundelete]''' is a terminal-based utility designed to recover deleted files from ext3 and ext4 partitions. It can recover all the recently deleted files from a partition and/or a specific file(s) given by relative path or inode information. Note that it works only when the partition is unmounted. The recovered files are saved in the current directory under the folder named {{ic|RECOVERED_FILES/}}.
Files are listed with their recoverability percentage in the first column.


=== Installation ===
To actually recover all files with 100% recoverability, run this command:
 
# ext4magic /dev/sd''XY'' -a "$(date -d "-2hours" +%s)" -f deleted/folders/root -j /some/safe/path/sd''XY''.journal -d /recovery/path -r
 
* {{ic|-d}} indicates the target where the recovered files will be stored
* {{ic|-r}} indicates that only files with 100% recoverability should be recovered; {{ic|-m}} will try to recover more files, but will take longer.
 
If you are missing the journal backup you may still try to recover your files, but expect poor results.
 
To recover all files, deleted in the last 24 hours:
 
# ext4magic /dev/sd''XY'' -r
 
To recover a directory or file:
 
# ext4magic /dev/sd''XY'' -f ''path/to/lost/file'' -r
 
The ''lowercase r'' flag {{ic|-r}} recovers only complete files that were not overwritten.
To also recover broken files that were partially overwritten, use the ''uppercase R'' flag {{ic|-R}}.
This also restores files that were not deleted, as well as empty directories.
 
The default destination is {{ic|./RECOVERDIR}}
which can be changed by adding the option {{ic|-d path/to/dest/dir}}.
 
If a file exists in the destination directory,
the new file is renamed with a trailing hash sign {{ic|#}}.


{{Pkg|extundelete}} is available in the [[official repositories]].
To recover files deleted after 'five days ago':


=== Usage ===
# ext4magic /dev/sd''XY'' -f ''path/to/lost/file'' -a $(date -d -5days +%s) -r


''Derived from the post on [http://linuxpoison.blogspot.com/2010/09/utility-to-recover-deleted-files-from.html Linux Poison].''
To use a file list:


To recover data from a specific partition, the device name for the partition, which will be in the format {{ic|/dev/sd''XN''}} (''X'' is a letter and ''N'' is a number.), must be known. The example used here is {{ic|/dev/sda4}}, but your system might use something different (For example, MMC card readers use {{ic|/dev/mmcblkNpN}} as their naming scheme.) depending on your filesystem and device configuration. If you are unsure, run {{ic|df}}, which prints currently mounted partitions.
# ext4magic /dev/sd''XY'' -f ''path/to/lost/file'' -Lx | grep -a ^--- >recovery-files-big.txt
# ext4magic /dev/sd''XY'' -i recovery-files-big.txt -R
# ext4magic /dev/sd''XY'' -f ''path/to/lost/file'' -lx | grep -a '^  100%' >recovery-files-small.txt
# ext4magic /dev/sd''XY'' -i recovery-files-small.txt -r


Once which partition data is to be recovered from has been determined, simply run:
The difference between the ''uppercase L'' flag {{ic|-L}} and the ''lowercase L'' flag {{ic|-l}}  
# extundelete /dev/sda4 --restore-file ''directory''/''file''
is the same as between the two ''r'' flags {{ic|-R}} and {{ic|-r}} (see above).
Any subdirectories must be specified, and the command runs from the highest level of the partition, so, to recover a file in {{ic|/home/''SomeUserName''/}}, assuming {{ic|/home}} is on its own partition, run:
# extundelete /dev/sda4 restore-file ''SomeUserName''/''SomeFile''
To speed up multi-file recovery, extundelete has a {{ic|--restore-files}} option as well.


To recover an entire directory, run:
Use {{ic|grep -a}} to preserve binary file names.
# extundelete /dev/sda4 --restore-directory ''SomeUserName''/''SomeDirectory''


For advanced users, to manually recover blocks or inodes with extundelete, debugfs can be used to find the inode to be recovered; then, run:
Using a file list makes it possible to filter the files, for example by file extension:
# extundelete --restore-inode ''inode''
''inode'' stands for any valid inode. Additional inodes to recover can be listed in an unspaced, comma-separated fashion.


Finally, to recover all deleted files from an entire partition, run:
  # cat recovery-files-big.txt | grep -a '\.jpg"$' >recovery-files-big-jpg.txt
  # extundelete /dev/sda4 --restore-all


== Testdisk and PhotoRec ==
... or to split the file list:


TestDisk and Photorec are both open-source data recovery utilities licensed under the terms of the [http://www.gnu.org/licenses/gpl.html GNU Public License] (GPL).
# cat recovery-files-big.txt | split -l 100 - recovery-files-big-100-each-
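
The filtering and splitting steps can be tried on a scratch list before touching a real one. In this sketch the list contents are invented: the only assumption is that each line ends with a file name in double quotes, which is what the {{ic|grep}} pattern above matches. The real list written by {{ic|ext4magic}} may contain further columns.

```shell
#!/bin/sh
workdir=$(mktemp -d)
list="$workdir/recovery-files-big.txt"

# Made-up stand-in for a file list: 250 JPEG entries and 50 PNG entries.
for i in $(seq 1 250); do printf '"photos/img%s.jpg"\n' "$i"; done >  "$list"
for i in $(seq 1 50);  do printf '"scans/page%s.png"\n' "$i"; done >> "$list"

# Keep only the lines whose quoted name ends in .jpg.
grep -a '\.jpg"$' "$list" > "$workdir/recovery-files-big-jpg.txt"

# Split the filtered list into chunks of at most 100 lines each
# (suffixes aa, ab, ac, ... are appended by split).
split -l 100 "$workdir/recovery-files-big-jpg.txt" "$workdir/recovery-files-big-jpg-100-each-"

wc -l "$workdir"/recovery-files-big-jpg-100-each-*
```

Each chunk can then be passed to {{ic|ext4magic}} with {{ic|-i}} in a separate run.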


'''TestDisk''' is primarily designed to help recover lost partitions and/or make non-booting disks bootable again when these symptoms are caused by faulty software, certain types of viruses, or human error, such as the accidental deletion of partition tables.
== TestDisk and PhotoRec ==


'''PhotoRec''' is file recovery software designed to recover lost files including photographs (Hint: '''Photo'''graph'''Rec'''overy), videos, documents, archives from hard disks and CD-ROMs. PhotoRec ignores the filesystem and goes after the underlying data, so it will still work even with a re-formatted or severely damaged filesystems and/or partition tables.
TestDisk and PhotoRec are both open-source data recovery utilities licensed under the terms of the [https://www.gnu.org/licenses/gpl.html GNU General Public License] (GPL).


=== Installation ===
'''TestDisk''' is primarily designed to help recover lost partitions and/or make non-booting disks bootable again when these symptoms are caused by faulty software, certain types of viruses, or human error, such as the accidental deletion of partition tables. TestDisk detects numerous filesystems, including NTFS, FAT12, FAT16, FAT32, exFAT, ext2, ext3, ext4, btrfs, BeFS, CramFS, HFS, JFS, Linux Raid, Linux Swap, LVM, LVM2, NSS, ReiserFS, UFS, XFS. It can also undelete files from FAT, NTFS, exFAT and ext2 filesystems.


{{Pkg|testdisk}} from the [[official repositories]] provides both TestDisk and PhotoRec.
TestDisk can fix partition tables, recover deleted partitions, recover a FAT32 boot sector from its backup, rebuild FAT12/FAT16/FAT32 boot sectors, fix FAT tables, rebuild an NTFS boot sector and more.
=== Post recovery tasks ===
After the data were recovered by utilities like the [[File recovery#Testdisk and PhotoRec|photorec]] they will store recovered files with a random names(for most of the files) under a numbered directories. An example of the [[File recovery#Testdisk and PhotoRec|photorec]] recovered files {{ic|./recup_dir.1/f872690288.jpg}}, {{ic|./recup_dir.1/f864563104_wmclockmon-0.1.0.tar.gz}}.
{{Note|To leave only the original names of the files you can use something like this with combination of the ''grep'', ''cat'' or echo commands:{{ic|<nowiki>awk -F'|' '{print $1}'|awk -F'/' '{print $3}' | grep '_' | sed 's/[^_]*_//'</nowiki>}}. But during restoration, files with the same names can be overwritten if you will not use the ''-i'' option with the ''cp'' command and/or removing/disabling the ''$CountAll'' variable in the script. And files without the original name will be skipped.}}
Example: {{ic|<nowiki>$ echo ./recup_dir.1/f864563104_wmclockmon-0.1.0.tar.gz |awk -F'|' '{print $1}'|awk -F'/' '{print $3}' | grep '_' | sed 's/[^_]*_//'</nowiki>}} or {{ic|<nowiki>$ basename ./recup_dir.1/f864563104_wmclockmon-0.1.0.tar.gz | sed 's/[^_]*_//m'</nowiki>}}
The whole ''restore-orignames-only.sh'' script you can download from the [https://sourceforge.net/projects/postrecoverytasksphotorec/ SourceForge] website. To the files with the same name will be added ''_DuplicateXXX'' where XXX is a number of a processed file.


==== Creating a database with more details about files ====
'''PhotoRec''' is file recovery software designed to recover lost files, including photographs (hint: '''Photo'''graph'''Rec'''overy), videos, documents and archives, from hard disks and CD-ROMs. PhotoRec ignores the filesystem and goes after the underlying data, so it still works even with a re-formatted or severely damaged filesystem and/or partition table.


In this example the [[Xdg-open#get mime type|xdg-mime]] is used to gather information about the mime types but the {{ic|file --mime-type -b}} and {{ic|file -i -b}} commands does the same output as the {{ic|xdg-mime query filetype}} command, with more or less details. This script will collect a
=== Installation ===
lot of more additional information about the files into the '''info-mime-size-db.txt'''. Put the script in the destination directory that you used in photorec, make it executable and run it.


{{hc|start-collect-file-info.sh|<nowiki>#!/bin/bash
[[Install]] the {{Pkg|testdisk}} package, which provides both TestDisk and PhotoRec.
ScriptsName=$(basename $0)
if [ 'XX' != 'XX'"$1" ]; then
if [ -f "$1"  ]; then
echo $1
echo $(file "$1" -F" |"  ) '|' $(xdg-mime query filetype "$1") '|' $(du -h $1 | awk '{print $1}' ) >> info-mime-size-db.txt
else
echo The « $1 » is not a valid file name.
fi
else
find -type f -exec sh -e ./$ScriptsName "{}" \;
fi</nowiki>}}


The script will build a file with pattern '''path to file/file name | info about the file | mime type | size''', here is an example:{{ic|<nowiki>./recup_dir.1/f872690288.jpg | JPEG image data, JFIF standard 1.01 | image/jpeg | 24K</nowiki>}}<br>
=== Usage ===
You can download ''start-collect-file-info.sh'' also from the [https://sourceforge.net/projects/postrecoverytasksphotorec/files/ SourceForge] website.


==== Choosing a restoration method ====
After running e.g. {{pkg|ddrescue}} to create image.img,
Those scripts will copy up to 25 files in an each new created folder and some of them are using the ''info-mime-size-db.txt'' from above.
{{ic|photorec image.img}} will open a terminal UI where you can select which file types to search for and where to put the recovered files. Detailed documentation is available on the project's [https://www.cgsecurity.org/wiki/TestDisk wiki].
{{Warning|Remove the {{ic|echo}} command in front of the ''cp'' and ''mkdir'' otherwise the scripts will only show what is gonna to be done without restoring anything to a destination, do a dry run.}}
===== Search folder by folder =====
This will copy files from one destination to another, based on the name or the file extension, it doesn't use or do checks for any other information about files as e.g. a mime-type or a pre-made file with the descriptions. You can modify the script depends on what kind of files you will need.
This script is slow because is must go through each folder and search for files.
You can download the example script «search-folder-by-folder.sh» from the [https://sourceforge.net/projects/postrecoverytasksphotorec/ SourceForge].<br>
If it is not so many files with the same extension then it will be enough to use something like {{ic|find -name *.xcf -exec copy "{}" $HOME/Desktop \;}} to avoid the ''overload'' of a destination folder you can calculate how many files are found {{ic|<nowiki>find -type f -name *xcf | wc -l</nowiki>}}. {{Note|The photorec utility stores up to 500 recovered files in a single folder.}}


===== Reading filenames from a file =====
=== Files recovered by photorec ===
Does the same as above but reading data from the the ''info-mime-size-db.txt'' file. It is a slow way of restoring but very good if the filename contains spaces.
{{bc|<nowiki>#!/bin/bash
CountAll="0"
CountToLimit="0"
Img='Jpg'
Destination="$HOME/Images/${Img}-images/"
NewFolder="0"
ToLimit="0"
echo mkdir -v "$(echo $Destination$Img$NewFolder)" -p
while read line ; do
GetFullPath="$(echo $line | grep 'jpg |' | awk '{print $1}'  )"
if [ 'XX' != 'XX'$GetFullPath  ]; then
FileName=$(basename $GetFullPath )
echo $FileName
CountAll=$((CountAll+1))
CountToLimit=$((CountToLimit+1))
    if [ $CountToLimit -gt 25 ]; then
CountToLimit="0"
NewFolder=$((NewFolder+1))
echo mkdir -v "$Destination$Img$NewFolder" -p
    fi
echo cp -fv "$PWD/$(echo $FileName )" "$Destination$Img$NewFolder/File${Img}_${CountAll}_${FileName}"
fi;
done < info-mime-size-db.txt;</nowiki>}}


===== Adding filenames from a file into an array  =====
The photorec utility stores recovered files in numbered directories, giving most of them random names, e.g. {{ic|./recup_dir.1/f872690288.jpg}}, {{ic|./recup_dir.1/f864563104_wmclockmon-0.1.0.tar.gz}}.
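
When a recovered name has the form {{ic|f''NUMBER''_''original name''}}, as in the second example, the original name can be split off with plain shell parameter expansion. The helper below is a hypothetical sketch ({{ic|original_name}} is not part of photorec) and assumes that the prefix is always {{ic|f}} followed by digits and an underscore.

```shell
#!/bin/sh
# Print the original file name embedded in a photorec-style name,
# or an empty line if the name carries no original name.
original_name() {
    name=$(basename "$1")
    case "$name" in
        f[0-9]*_*) printf '%s\n' "${name#*_}" ;;  # drop everything up to the first "_"
        *)         printf '\n' ;;                 # purely random name, nothing to restore
    esac
}

original_name ./recup_dir.1/f864563104_wmclockmon-0.1.0.tar.gz
original_name ./recup_dir.1/f872690288.jpg
```

Several recovered files can carry the same original name, so when copying files under their restored names use something like {{ic|cp -i}} to avoid silently overwriting earlier copies.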
Does the same as above but by preparing an array as source of necessary data. It is much faster and more easier to modify the search pattern by editing the script but may have a problem if the filename contains spaces.
{{bc|<nowiki>#!/bin/bash
CountAll="0"
CountToLimit="0"
Img='Jpg'
Destination="$HOME/Images/${Img}-images/"
NewFolder="0"
CountToLimit="0"
echo mkdir -v "$(echo $Destination$Img$NewFolder)" -p
#ArrayOfFiles=($(grep -i 'jpg | '  info-mime-size-db.txt | awk '{print $1 }'));
ArrayOfFiles=($(grep -i -e 'image/jpeg' -e 'image/gif' -e 'image/png' -e 'image/tiff' -e 'image/x-ms-bmp' info-mime-size-db.txt | awk '{print $1  }'));
TotalItems=${#ArrayOfFiles[@]}
while [ $TotalItems != $CountAll ] ; do
CountToLimit=$((CountToLimit+1 ))
FileName=$(basename ${ArrayOfFiles[$CountAll]} )
echo $FileName
    if [ $CountToLimit -gt 25 ]; then
CountToLimit="0"
NewFolder=$((NewFolder+1))
echo mkdir -v "$Destination$Img$NewFolder" -p
#ImageResolution="$(identify $FileName | awk '{print $3}')"
#ImageResolution=$(feh -l "$FileName"  | tail -1 | awk '{print $3"x"$4}')
    fi
echo cp -fv "$PWD/$FileName" "$(echo $Destination$Img$NewFolder)/Img${Img}${ImageResolution}_${CountAll}_${FileName}"
CountAll=$((CountAll+1))
done</nowiki>}} The {{Pkg|feh}} and ''identify'' from {{Pkg|imagemagic}} package can give more information about images or if you want just add the image size to name of the file then just uncomment one of them and file names will look like {{ic|ImgJpg300x177_15209_f247817680.jpg}}. They represent information in the different ways.
An example for the {{Pkg|feh}}:
{{hc|$ feh -l Jpg_1_f872690176.jpg|NUM FORMAT WIDTH HEIGHT PIXELS SIZE ALPHA FILENAME
1 jpeg 640 480 307k 56k - Jpg_1_f872690176.jpg}}
An example for the ''identify'':
{{hc|$ identify Jpg_1_f872690176.jpg|Jpg_1_f872690176.jpg JPEG 640x480 640x480+0+0 8-bit sRGB 56.7KB 0.000u 0:00.009}}
Or you can use any other specialized utility to get information about file types you need.<br>
If you searching only for a file extensions then you can populate the array with the ''find'' command, e.g. {{ic|1=ArrayOfFiles=($(find -type f -name *jpg \;))}}.


=== See also ===
=== See also ===


* Wiki (TestDisk): http://www.cgsecurity.org/wiki/TestDisk
* How to get the original filenames: [https://www.cgsecurity.org/wiki/PhotoRec_FAQ#How_to_get_the_original_filenames_.3F PhotoRec FAQ]
* Wiki (Photorec): http://www.cgsecurity.org/wiki/PhotoRec
* How to add your own custom file signature: [https://www.cgsecurity.org/wiki/Add_your_own_extension_to_PhotoRec CGSecurity Wiki]
* Homepage: http://www.cgsecurity.org/
* Wiki (TestDisk): https://www.cgsecurity.org/wiki/TestDisk
* [[Sort images by resolution]]
* Wiki (Photorec): https://www.cgsecurity.org/wiki/PhotoRec
* [[Restore name of a tar.gz archive]]
* Homepage: https://www.cgsecurity.org/


== e2fsck ==
== e2fsck ==
Line 193: Line 176:
To determine where the superblocks are, run {{ic|dumpe2fs -h}} on the target, unmounted partition. Superblocks are spaced differently depending on the filesystem's blocksize, which is set when the filesystem is created.
To determine where the superblocks are, run {{ic|dumpe2fs -h}} on the target, unmounted partition. Superblocks are spaced differently depending on the filesystem's blocksize, which is set when the filesystem is created.


An alternate method to determine the locations of superblocks is to use the -n option with mke2fs. Be '''sure''' to use the {{ic|-n}} flag, which, according to the {{ic|mke2fs}} manpage, "''Causes mke2fs to not actually create a filesystem, but display what it would do if it were to create a filesystem. This can be used to determine the location of the backup superblocks for a particular filesystem, so long as the mke2fs parameters that were passed when the filesystem was originally created are used again. (With the -n option added, of course!)''".
An alternate method to determine the locations of superblocks is to use the -n option with mke2fs. Be '''sure''' to use the {{ic|-n}} flag, which, according to {{man|8|mke2fs}}, "Causes mke2fs to not actually create a filesystem, but display what it would do if it were to create a filesystem. This can be used to determine the location of the backup superblocks for a particular filesystem, so long as the mke2fs parameters that were passed when the filesystem was originally created are used again. (With the -n option added, of course!)".


=== Installation ===
=== Installation ===
Line 199: Line 182:
Both {{ic|e2fsck}} and {{ic|dumpe2fs}} are included in the base Arch install as part of {{pkg|e2fsprogs}}.
Both {{ic|e2fsck}} and {{ic|dumpe2fs}} are included in the base Arch install as part of {{pkg|e2fsprogs}}.


=== See also ===
See also {{man|8|e2fsck}} and {{man|8|dumpe2fs}}.


* e2fsck man page: http://phpunixman.sourceforge.net/index.php/man/e2fsck/8
== Working with raw disk images ==
* dumpe2fs man page: http://phpunixman.sourceforge.net/index.php?parameter=dumpe2fs&mode=man


== Working with raw disk images ==
{{Merge|QEMU}}


If you have backed up a drive using ddrescue or dd and you need to mount this image as a physical drive, see this section.
If you have backed up a drive using ddrescue or dd and you need to mount this image as a physical drive, see this section.
Line 211: Line 193:


To mount a complete disk image to the next free loop device, use the {{ic|losetup}} command:
To mount a complete disk image to the next free loop device, use the {{ic|losetup}} command:
  # losetup -f -P /path/to/image
  # losetup -f -P /path/to/image


Line 217: Line 200:
* The {{ic|-P}} flag creates additional devices for every partition.
* The {{ic|-P}} flag creates additional devices for every partition.
}}
}}
See also [[QEMU#With loop module autodetecting partitions]].


=== Mounting partitions ===
=== Mounting partitions ===


In order to be able to mount a partiton of a whole disk image, follow [[File recovery#Mount_the_Entire_Disk|the steps above]].
In order to be able to mount a partition of a whole disk image, follow [[#Mount the entire disk|the steps above]].


Once the whole disk image is mounted, a normal {{ic|mount}} command can be used on the loop device:
Once the whole disk image is mounted, a normal {{ic|mount}} command can be used on the loop device:
  # mount /dev/loop0p1 /mnt/example
  # mount /dev/loop0p1 /mnt/example
This command mounts the first partition of the image attached to loop0 at the mountpoint {{ic|/mnt/example}}. Remember that the mountpoint directory must exist!
This command mounts the first partition of the image attached to loop0 at the mountpoint {{ic|/mnt/example}}. Remember that the mountpoint directory must exist!


Line 230: Line 217:
Once the entire disk image has been mounted as a loopback device, its drive layout can be inspected.
Once the entire disk image has been mounted as a loopback device, its drive layout can be inspected.


=== Using QEMU to Repair NTFS ===
=== Using QEMU to repair NTFS ===


With a disk image that contains one or more NTFS partitions that need to be {{ic|chkdsk}}ed by Windows since no good NTFS filesystem checker for Linux exists, QEMU can use a raw disk image as a real hard disk inside a virtual machine:
With a disk image that contains one or more NTFS partitions that need to be {{ic|chkdsk}}ed by Windows since no good NTFS filesystem checker for Linux exists, QEMU can use a raw disk image as a real hard disk inside a virtual machine:
  # qemu -hda ''/path/to/primary''.img -hdb ''/path/to/DamagedDisk''.img
  # qemu -hda ''/path/to/primary''.img -hdb ''/path/to/DamagedDisk''.img
Then, assuming Windows is installed on {{ic|''primary''.img}}, it can be used to check partitions on {{ic|''/path/to/DamagedDisk''.img}}.
Then, assuming Windows is installed on {{ic|''primary''.img}}, it can be used to check partitions on {{ic|''/path/to/DamagedDisk''.img}}.
{{Warning|Do not use an older version of Windows to check NTFS partitions created by a newer version, e.g. Windows XP can damage NTFS partitions created by Windows 8 by "fixing" [[wikipedia:NTFS#Metafiles|metadata]] configuration that it does not support, resulting in damage to or removal of these unsupported entries.}}


== Text file recovery ==
== Text file recovery ==


It's possible to find deleted plain text on a hard drive with a few commands. A preferably unique string from the file you are trying to recover is needed.
It is possible to find deleted plain text files on a hard drive by directly searching on the block device. A preferably unique string from the file you are trying to recover is needed.
 
First, use the {{ic|strings}} command to dump all the text from a partition:


# strings /dev/sd''XN'' > ''BigStringsFile''
Use {{ic|grep}} to search for fixed strings ({{ic|-F}}) directly on the partition:


Then use {{ic|grep}} to filter through the content of ''BigStringsFile'':
$ grep -a -C 200 -F 'Unique string in text file' /dev/sd''XN'' > ''OutputFile''


$ grep -i -200 "Unique string in text file" ''BigStringsFile'' > ''GrepOutputFile''
Hopefully, the content of the deleted file is now in ''OutputFile'', which can be extracted from the surrounding context manually.


{{Note|The {{ic|-200}} option tells grep to print 200 lines of context from before and after each match of the string. You may need to adjust this if the text you are looking for is very long.}}
{{Note|The {{ic|-C 200}} option tells grep to print 200 lines of context from before and after each match of the string. Alternatives are the {{ic|-A}} and {{ic|-B}} flags, which print context only from after and before each match, respectively. You may need to adjust the number of lines if the file you are looking for is very long.}}
Hopefully, the correct deleted data is now in ''GrepOutputFile''.


== See also ==
== See also ==


* [https://help.ubuntu.com/community/DataRecovery Data Recovery] on the Ubuntu wiki
* [https://help.ubuntu.com/community/DataRecovery Data Recovery] on the Ubuntu wiki

Latest revision as of 10:12, 22 February 2024

This article lists data recovery and undeletion options for Linux.

Special notes

Before you start

This article or section needs expansion.

Reason: The following advice is not true for devices which are failing, and mostly applies to accidental file deletion on a healthy drive. (Discuss in Talk:File recovery)

This page is mostly intended to be used for educational purposes. If you have accidentally deleted or otherwise damaged your valuable and irreplaceable data and have no previous experience with data recovery, turn off your computer immediately (Just press and hold the off button or pull the plug; do not use the system shutdown function) and seek professional help.

Warning: It is quite possible and even probable that, if you follow any of the steps described below without fully understanding them, you will worsen your situation.

Failing drives

In the area of data recovery, it is best to work on images of disks rather than physical disks themselves. Generally, a failing drive's condition worsens over time. The goal ought to be to first rescue as much data as possible as early as possible in the failure of the disk and to then abandon the disk. The ddrescue and dd_rescue utilities, unlike dd, will repeatedly try to recover from errors and will read the drive front to back, then back to front, attempting to salvage data. They keep log files so that recovery can be paused and resumed without losing progress.

See Disk cloning.

The image files created from a utility like ddrescue can then be mounted like a physical device and can be worked on safely. Always make a copy of the original image so that you can revert if things go sour!
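A minimal sketch of the "copy first" advice, assuming GNU coreutils. A temporary file stands in for the rescued image so the commands can be tried safely; the file names are hypothetical.

```shell
#!/bin/sh
# Stand-in for an image produced by ddrescue; in practice this would be
# your own image file (hypothetical example: /path/to/rescued.img).
original=$(mktemp)
printf 'rescued data\n' > "$original"

# Make a working copy. --sparse=always avoids allocating space for runs of
# zeros; --reflink=auto makes a cheap copy on filesystems that support it
# and falls back to a normal copy elsewhere.
working="$original.work"
cp --reflink=auto --sparse=always "$original" "$working"

# Drop write permission on the original as a guard against accidents.
chmod a-w "$original"

# Confirm both files are byte-identical before touching the working copy.
cmp "$original" "$working" && echo "working copy verified"
```

All further recovery attempts would then be run against the working copy only.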

The factual accuracy of this article or section is disputed.

Reason: Although written on the blog of a data recovery company, it seems there are voices against the "freezer trick" on drives from the last 10 years. This paragraph is mostly untouched since 2009 and might not be applicable to modern drives. (Discuss in Talk:File recovery)

A tried and true method of improving failing drive reads is to keep the drive cold. A bit of time in the freezer is appropriate, but be careful to avoid bringing the drive from cold to warm too quickly, as condensation will form. Keeping the drive in the freezer with cables connected to the recovering PC works great.

Do not attempt a filesystem check on a failing drive, as this will likely make the problem worse. Mount it read-only.

Backup flash media/small partitions

As an alternative to working with a 'live' partition (mounted or not), it is often preferable to work with an image, provided that the filesystem in question is not too large and that you have sufficient free HDD space to accommodate the image file. For example, flash memory devices like thumb drives, digital cameras, portable music players, cellular phones, etc. are likely to be small enough to image in many cases.

Be sure to read the man pages for the utilities listed below to verify that they are capable of working with image files.

To make an image, one can use dd as follows:

# dd if=/dev/target_partition of=/home/user/partition.image
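As a hedged illustration, the image can be checked against its source by comparing checksums. The sketch below assumes GNU coreutils and uses a temporary file as a stand-in for the partition; with a real device the commands would need root and the device must not change between the copy and the check.

```shell
#!/bin/sh
# Stand-in for the source partition; a real run would read from a block
# device such as /dev/target_partition instead.
source_dev=$(mktemp)
image=$(mktemp)
head -c 1048576 /dev/urandom > "$source_dev"

# Copy the source into the image file using 1 MiB blocks.
dd if="$source_dev" of="$image" bs=1M 2>/dev/null

# Hash both sides and compare the digests.
source_sum=$(sha256sum < "$source_dev" | cut -d ' ' -f 1)
image_sum=$(sha256sum < "$image" | cut -d ' ' -f 1)

if [ "$source_sum" = "$image_sum" ]; then
    echo "image matches source"
else
    echo "image differs from source" >&2
fi
```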

Working with digital cameras

In order for some of the utilities listed in the next section to work with flash media, the device in question needs to be mounted as a block device (i.e., listed under /dev). Digital cameras operating in PTP (Picture Transfer Protocol) mode will not work in this regard. PTP cameras are transparently handled by libgphoto and/or libptp. In this case, "transparently" means that PTP devices do not get block devices. The alternative to PTP mode, USB Mass Storage (UMS) mode, is not supported by all cameras. Some cameras have a menu item that allows switching between the two modes; refer to your camera's user manual. If your camera does not support UMS mode and therefore cannot be accessed as a block device, your only alternative is to use a flash media reader and physically remove the storage media from your camera.

List of utilities

See also Wikipedia:List of data recovery software#File Recovery

  • ddrutility — Complement to GNU ddrescue. Finds which files are related to the bad sectors and provides some special tools for NTFS. No longer actively supported.
https://sourceforge.net/projects/ddrutility/ || ddrutilityAUR
  • dvdisaster — Additional error protection for CD/DVD media.
https://sourceforge.net/projects/dvdisaster/ || dvdisasterAUR
  • ext4magic — Recover deleted or overwritten files on ext3 and ext4 filesystems.
https://sourceforge.net/projects/ext4magic/ || ext4magic
  • Foremost — Console program to recover files based on their headers, footers, and internal data structures. This process is commonly referred to as data carving. The headers and footers can be specified by a configuration file or command line switches can be used to specify built-in file types.
https://foremost.sourceforge.net/ || foremost
  • PhotoRec — File data recovery software designed to recover lost files including video, documents and archives from hard disks, CD-ROMs, and lost pictures (thus the Photo Recovery name) from digital camera memory.
https://www.cgsecurity.org/ || testdisk
  • Scalpel — File carving and indexing application originally based on Foremost, although significantly more efficient. It allows an examiner to specify a number of headers and footers to recover filetypes from a piece of media.
https://github.com/sleuthkit/scalpel || scalpel-gitAUR
  • TestDisk — Data recovery software primarily designed to help recover lost partitions and/or make non-booting disks bootable again when these symptoms are caused by faulty software: certain types of viruses or human error (such as accidentally deleting a Partition Table).
https://www.cgsecurity.org/ || testdisk
  • xfs_undelete — Traverses the inode B+trees of each allocation group and tries to recover all files on an XFS filesystem marked as deleted.
https://github.com/ianka/xfs_undelete || xfs_undelete-gitAUR

Ext4Magic

ext4magic is a recovery tool for the ext3 and ext4 file system.

The factual accuracy of this article or section is disputed.

Reason: Leaving the system on may let other processes running on the system overwrite the data blocks occupied by the deleted files. This is why #Before you start suggests to shut down the system ASAP. If anything, the advice should be to sync data to disk before the shutdown. (Discuss in Talk:File recovery)

If you erroneously deleted some files/folders, do not turn off your computer. For best results (actually, for decent results) you must save the ext4 journal somewhere.

Immediately open a terminal and dump a copy of the filesystem journal:

# debugfs -R "dump <8> /some/safe/path/sdXY.journal" /dev/sdXY

Depending on whether the deleted files are on your root partition, you will want to save the journal to different locations: for root partitions, mount an external drive and dump the journal there; for non-root partitions, any other partition will do. Avoid saving to /tmp, because your data may be cleaned up.

If the deleted files are on your root partition, you should turn off the computer after saving the journal. Do so by holding the power button until it turns off, because this prevents additional writes to the disk. Proceed with the process from the Arch bootable media. Otherwise, you may proceed with your booted system, after unmounting the affected partition (sudo umount /dev/sdXY or sudo umount /home, for example).

To list the recoverable files:

# ext4magic /dev/sdXY -a "$(date -d "-2hours" +%s)" -f deleted/folders/root -j /some/safe/path/sdXY.journal -l
  • -a applies a filter so that only files deleted after a certain Unix epoch time are shown; in this example, it is set to two hours ago. If you are running from Arch bootable media, you may want to change the timezone (export TZ="Europe/Berlin") before using absolute times. If not specified, the default is the last 24 hours.
  • -f indicates that ext4magic should only list files in a certain subfolder. This path is relative to the partition root.
  • -j indicates that ext4magic should use a specific backup of the journal; otherwise it uses the regular system journal, which does not contain a record of the deletion if you have restarted your computer.
  • -l lists the deleted files
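The value passed to -a is plain Unix time, so it can be inspected before it is handed to ext4magic. The following sketch assumes GNU date; the absolute timestamp is an arbitrary example.

```shell
#!/bin/sh
# Relative cut-off: two hours before now, as seconds since the Unix epoch.
after=$(date -d '-2hours' +%s)
now=$(date +%s)

# Convert the value back to a readable date to double-check it.
echo "Listing files deleted after: $(date -d "@$after")"
echo "Window length in seconds: $((now - after))"

# Absolute cut-off: the timezone decides which instant a wall-clock time
# refers to, which matters on live media that default to UTC.
absolute=$(TZ="Europe/Berlin" date -d '2024-02-22 10:12' +%s)
echo "Absolute cut-off as epoch: $absolute"
```

Either value can then be substituted for the `$(date ...)` expression in the -a option.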

Files are listed with their recoverability percentage in the first column.

To actually recover all files with 100% recoverability, run this command:

# ext4magic /dev/sdXY -a "$(date -d "-2hours" +%s)" -f deleted/folders/root -j /some/safe/path/sdXY.journal -d /recovery/path -r
  • -d indicates the target where the recovered files will be stored
  • -r indicates that only files with 100% recoverability should be recovered; -m will try to recover more files, but will take longer.

If you are missing the journal backup you may still try to recover your files, but expect poor results.

To recover all files deleted in the last 24 hours:

# ext4magic /dev/sdXY -r

To recover a directory or file:

# ext4magic /dev/sdXY -f path/to/lost/file -r

The lowercase flag -r only recovers complete files that were not overwritten. To also recover broken files that were partially overwritten, use the uppercase flag -R. This will also restore files that were not deleted, as well as empty directories.

The default destination is ./RECOVERDIR which can be changed by adding the option -d path/to/dest/dir.

If a file exists in the destination directory, the new file is renamed with a trailing hash sign #.

To recover files deleted after 'five days ago':

# ext4magic /dev/sdXY -f path/to/lost/file -a $(date -d -5days +%s) -r

To use a file list:

# ext4magic /dev/sdXY -f path/to/lost/file -Lx | grep -a ^--- >recovery-files-big.txt
# ext4magic /dev/sdXY -i recovery-files-big.txt -R

# ext4magic /dev/sdXY -f path/to/lost/file -lx | grep -a '^  100%' >recovery-files-small.txt
# ext4magic /dev/sdXY -i recovery-files-small.txt -r

The difference between the uppercase L flag -L and the lowercase L flag -l is the same as between the two r flags -R and -r (see above).

Use grep -a to preserve binary file names.

Using a file list makes it possible to filter the files, for example by file extension:

# cat recovery-files-big.txt | grep -a '\.jpg"$' >recovery-files-big-jpg.txt

... or to split the file list:

# cat recovery-files-big.txt | split -l 100 - recovery-files-big-100-each-
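The filtering and splitting steps can be tried on a small sample list first. This is only a sketch: the sample entries are hypothetical, and the only property assumed about the real list is the one the grep pattern above relies on, namely that each line ends with a quoted file name.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample entries standing in for real ext4magic output.
cat > recovery-files-big.txt <<'EOF'
--- "photos/holiday.jpg"
--- "photos/scan.PNG"
--- "docs/notes.txt"
--- "photos/portrait.jpeg"
EOF

# Keep several image extensions at once, ignoring case.
# -a treats the input as text even if file names contain binary bytes.
grep -a -i -E '\.(jpe?g|png)"$' recovery-files-big.txt > recovery-files-big-images.txt

# Split the filtered list into chunks of two entries each.
split -l 2 recovery-files-big-images.txt recovery-files-big-images-part-

wc -l < recovery-files-big-images.txt
ls recovery-files-big-images-part-*
```

Each resulting chunk could then be passed to ext4magic with -i in a separate run.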

TestDisk and PhotoRec

TestDisk and PhotoRec are both open-source data recovery utilities licensed under the terms of the GNU General Public License (GPL).

TestDisk is primarily designed to help recover lost partitions and/or make non-booting disks bootable again when these symptoms are caused by faulty software, certain types of viruses, or human error, such as the accidental deletion of partition tables. TestDisk detects numerous filesystems including NTFS, FAT12, FAT16, FAT32, exFAT, ext2, ext3, ext4, btrfs, BeFS, CramFS, HFS, JFS, Linux Raid, Linux Swap, LVM, LVM2, NSS, ReiserFS, UFS, XFS. It can also undelete files from FAT, NTFS, exFAT and ext2 filesystems.

TestDisk can fix partition tables, recover deleted partitions, recover a FAT32 boot sector from its backup, rebuild FAT12/FAT16/FAT32 boot sectors, fix FAT tables, rebuild an NTFS boot sector and more.

PhotoRec is file recovery software designed to recover lost files including photographs (hence the name, from Photograph Recovery), videos, documents and archives from hard disks and CD-ROMs. PhotoRec ignores the filesystem and goes after the underlying data, so it will still work even with a re-formatted or severely damaged filesystem and/or partition table.

Installation

Install the testdisk package, which provides both TestDisk and PhotoRec.

Usage

After running e.g. ddrescue to create image.img, photorec image.img will open a terminal UI where you can select what file types to search for and where to put the recovered files. There is very good documentation on their wiki.

Files recovered by photorec

The photorec utility stores recovered files with random names (for most of the files) under numbered directories, e.g. ./recup_dir.1/f872690288.jpg, ./recup_dir.1/f864563104_wmclockmon-0.1.0.tar.gz.
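Because the recovered names carry little meaning, a common first step is to group the files by extension. The sketch below shows one possible approach on a mock directory tree; the "sorted" directory name and the third sample file are hypothetical, and files are copied rather than moved so the photorec output stays untouched.

```shell
#!/bin/sh
# Build a small mock of photorec's output layout.
base=$(mktemp -d)
mkdir -p "$base/recup_dir.1" "$base/recup_dir.2"
: > "$base/recup_dir.1/f872690288.jpg"
: > "$base/recup_dir.1/f864563104_wmclockmon-0.1.0.tar.gz"
: > "$base/recup_dir.2/f000000001.jpg"   # hypothetical file name

# Copy every recovered file into a per-extension directory.
sorted="$base/sorted"
for file in "$base"/recup_dir.*/*; do
    [ -f "$file" ] || continue
    name=${file##*/}
    case $name in
        *.*) ext=${name##*.} ;;   # text after the last dot
        *)   ext=noext ;;         # files without any extension
    esac
    mkdir -p "$sorted/$ext"
    cp "$file" "$sorted/$ext/"
done

ls "$sorted"
```

Note that only the last suffix is used, so a .tar.gz archive ends up under gz.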

See also

e2fsck

e2fsck is the ext2/ext3/ext4 filesystem checker included in the base install of Arch. e2fsck relies on a valid superblock. A superblock is a description of the entire filesystem's parameters. Because this data is so important, several copies of the superblock are distributed throughout the partition. With the -b option, e2fsck can take an alternate superblock argument; this is useful if the main, first superblock is damaged.

To determine where the superblocks are, run dumpe2fs -h on the target, unmounted partition. Superblocks are spaced differently depending on the filesystem's blocksize, which is set when the filesystem is created.

An alternate method to determine the locations of superblocks is to use the -n option with mke2fs. Be sure to use the -n flag, which, according to mke2fs(8), "Causes mke2fs to not actually create a filesystem, but display what it would do if it were to create a filesystem. This can be used to determine the location of the backup superblocks for a particular filesystem, so long as the mke2fs parameters that were passed when the filesystem was originally created are used again. (With the -n option added, of course!)".
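To illustrate how the block size determines the spacing, the sketch below computes where backup superblocks would be expected. It assumes the sparse_super feature, where backups are kept only in block group 1 and in groups whose number is a power of 3, 5 or 7. The three input values are example figures for a 4 KiB block size and should be replaced with the ones reported for the actual filesystem; the output of mke2fs -n remains the authoritative source.

```shell
#!/bin/sh
# Example inputs (assumptions): take the real values from dumpe2fs -h.
blocks_per_group=32768   # "Blocks per group"
first_data_block=0       # "First block": 0 for 4 KiB blocks, 1 for 1 KiB blocks
group_count=30           # number of block groups in the filesystem

# is_power_of N BASE: succeeds if N equals BASE^k for some k >= 1.
is_power_of() {
    n=$1
    while [ "$n" -gt 1 ] && [ $((n % $2)) -eq 0 ]; do
        n=$((n / $2))
    done
    [ "$1" -gt 1 ] && [ "$n" -eq 1 ]
}

# Collect the first block of every group that holds a backup superblock.
backups=""
g=1
while [ "$g" -lt "$group_count" ]; do
    if [ "$g" -eq 1 ] || is_power_of "$g" 3 || is_power_of "$g" 5 || is_power_of "$g" 7; then
        backups="$backups $((g * blocks_per_group + first_data_block))"
    fi
    g=$((g + 1))
done

echo "Backup superblocks expected at:$backups"
```

One of the resulting block numbers can then be passed to e2fsck with -b, for instance the first one.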

Installation

Both e2fsck and dumpe2fs are included in the base Arch install as part of e2fsprogs.

See also e2fsck(8) and dumpe2fs(8).

Working with raw disk images

This article or section is a candidate for merging with QEMU.

Notes: please use the second argument of the template to provide more detailed indications. (Discuss in Talk:File recovery)

If you have backed up a drive using ddrescue or dd and you need to mount this image as a physical drive, see this section.

Mount the entire disk

To mount a complete disk image to the next free loop device, use the losetup command:

# losetup -f -P /path/to/image
Tip:
  • The -f flag mounts the image to the next available loop device.
  • The -P flag creates additional devices for every partition.

See also QEMU#With loop module autodetecting partitions.

Mounting partitions

In order to be able to mount a partition of a whole disk image, follow the steps above.

Once the whole disk image is mounted, a normal mount command can be used on the loop device:

# mount /dev/loop0p1 /mnt/example

This command mounts the first partition of the image attached to loop0 at the mountpoint /mnt/example. Remember that the mountpoint directory must exist!

Getting disk geometry

Once the entire disk image has been mounted as a loopback device, its drive layout can be inspected.
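One practical use of the layout is to compute the byte offset of a partition, which allows mounting it straight from the image file. The sketch below is an assumption-laden example: the sector size and start sector are hypothetical values of the kind a partitioning tool such as fdisk reports, and the mount command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical values; read the real ones from the partition table listing.
sector_size=512      # bytes per logical sector
start_sector=2048    # first sector of the partition of interest

# The partition begins this many bytes into the image.
offset=$((start_sector * sector_size))

# Print the command that would mount the partition read-only via a loop device.
echo "mount -o ro,loop,offset=$offset /path/to/image /mnt/example"
```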

Using QEMU to repair NTFS

With a disk image that contains one or more NTFS partitions that need to be chkdsked by Windows since no good NTFS filesystem checker for Linux exists, QEMU can use a raw disk image as a real hard disk inside a virtual machine:

# qemu -hda /path/to/primary.img -hdb /path/to/DamagedDisk.img

Then, assuming Windows is installed on primary.img, it can be used to check partitions on /path/to/DamagedDisk.img.

Warning: Do not use an older version of Windows to check NTFS partitions created by a newer version, e.g. Windows XP can damage NTFS partitions created by Windows 8 by "fixing" metadata configuration that it does not support, resulting in damage to or removal of these unsupported entries.

Text file recovery

It is possible to find deleted plain text files on a hard drive by directly searching on the block device. A preferably unique string from the file you are trying to recover is needed.

Use grep to search for fixed strings (-F) directly on the partition:

$ grep -a -C 200 -F 'Unique string in text file' /dev/sdXN > OutputFile

Hopefully, the content of the deleted file is now in OutputFile, which can be extracted from the surrounding context manually.

Note: The -C 200 option tells grep to print 200 lines of context from before and after each match of the string. Alternatives are the -A and -B flags, which print context only from after and before each match, respectively. You may need to adjust the number of lines if the file you are looking for is very long.
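The effect of the context flags can be tried safely on a small test file before searching a real partition. In this sketch a temporary file padded with zero bytes stands in for the block device, and the surrounding lines are made-up sample text.

```shell
#!/bin/sh
# Build a stand-in "device": text lines surrounded by binary padding.
img=$(mktemp)
{
    head -c 4096 /dev/zero
    printf '\nline before\nUnique string in text file\nline after\n'
    head -c 4096 /dev/zero
} > "$img"

# -a treats binary data as text, -F matches the string literally,
# -B 1 and -A 1 print one line of context before and after the match.
out=$(mktemp)
grep -a -B 1 -A 1 -F 'Unique string in text file' "$img" > "$out"

cat "$out"
```

On a real partition the same command would be run as root against /dev/sdXN, with larger context values.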

See also