In the previous post, we discovered who Mr. ‘X’ was, using BulkExtractor. Now it is time to find out what he has ‘seen’. That’s why today I want to talk to you about Foremost: in my opinion, the carving tool par excellence.
Foremost is a forensic program that runs in the Linux console and is designed to recover erased or deleted data. It was developed by Jesse Kornblum and Kris Kendall while they served in the United States Air Force Office of Special Investigations. It is the basis of Scalpel, its evolution, although the way it operates is essentially the same.
To perform file recovery it uses a configuration file (foremost.conf), which we will see below. This file specifies the headers and footers of each file type, that is, the file structure, as search patterns known as magic numbers. Foremost looks for a header specified in foremost.conf and, when it finds one, writes it to a file along with the data that follows, until it finds the specified footer or reaches the size limit indicated in the configuration file. This file, ‘foremost.conf’, is very important, because the reliability of the recovery depends on it.
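To make the idea concrete, here is a minimal Python sketch of header/footer carving as described above. It is only an illustration of the technique, not Foremost's actual code: the function name `carve` and the sample bytes are made up for the example, and the JPEG markers (FF D8 FF and FF D9) are used as the search patterns.

```python
from typing import List, Optional, Tuple


def carve(data: bytes, header: bytes, footer: Optional[bytes],
          max_size: int) -> List[Tuple[int, bytes]]:
    """Return (offset, carved_bytes) for every header found in data."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    results = []
    pos = data.find(header)
    while pos != -1:
        # Never read past the size limit or the end of the input.
        limit = min(pos + max_size, len(data))
        end = limit
        if footer:
            # Look for the footer only inside the allowed window.
            foot = data.find(footer, pos + len(header), limit)
            if foot != -1:
                end = foot + len(footer)
        results.append((pos, data[pos:end]))
        # Resume the search after the data just carved.
        pos = data.find(header, end)
    return results


JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"

# Synthetic input: one complete "image" and one with no footer at all.
sample = (b"junk" + JPEG_HEADER + b"AAAA" + JPEG_FOOTER
          + b"more junk" + JPEG_HEADER + b"B" * 50)

carved = carve(sample, JPEG_HEADER, JPEG_FOOTER, max_size=16)
for offset, blob in carved:
    print(offset, len(blob))
```

The first hit ends at its footer; the second one never finds a footer, so it is cut at the size limit. That is exactly why some recovered files turn out truncated or padded with unrelated data.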
It was originally designed to work on disk images, but it can be run against any file, regardless of format, or even directly against a drive.
Let’s see it in operation, to explain it as best as possible.
So as not to lose our good practices (remember that the habit makes the monk) …
We have, again, the evidence from the previous case: a RAM dump, which belongs to Mr. ‘X’, and which we list with ‘ls’
We proceed to identify it by means of ‘file’
Installing Foremost is very simple
And the first thing we should do is ask how it works. For this we pass the parameter ‘-h’
-V: Display copyright information and exit
-t: Specify file type (-t jpeg,pdf …)
-d: Turn on indirect block detection (for UNIX file systems)
-i: Specify input file (default is ‘stdin’)
-a: Write all headers, perform no error detection (corrupted files)
-w: Only write the audit file, do not write any detected files to the disk
-o: Set output directory (defaults to ‘output’)
-c: Set configuration file to use (defaults to ‘foremost.conf’)
-q: Enable quick mode. Searches are performed on 512-byte boundaries.
-Q: Enable quiet mode. Suppress output messages.
-v: Verbose mode. Logs all messages to screen.
What does its most basic use look like? We invoke foremost with the ‘-i’ parameter to indicate the recovery target.
Very easy, right? We have extracted some results in the ‘output’ directory. And it has also generated an ‘audit.txt’ file, which is the report with the result.
But look at what kind of files have been recovered.
It has only extracted files in ‘doc’, ‘ost’, ‘exe’ and ‘pf’ format. I do not consider that sufficient. We have many options to play with in this great program. One of them is that we can specify which types of files we want, or tell it to recover everything it is able to recognize (that is, everything written in the file ‘foremost.conf’).
We are going to tell it to retrieve everything it is able to identify, and we will also specify an output directory.
But what the hell do I see on my screen!! What is all that scrolling past? Simply the data that is being recovered.
Let’s see what it has recovered for us this time, in the directory that we have indicated.
Look at the number of file extensions it has managed to recover, unlike the previous run.
In the basic execution we had recovered a total of four extensions, four types of files. This time it has managed to extract a total of fifteen.
Let’s see the folders …
And a minimum of its content. Image files ‘jpg’…
Image files ‘bmp’…
For obvious reasons I will not show the content that ‘Mr. X’ was watching.
If you look closely, you can see that there are files that cannot be displayed and others that seem incomplete. This means that they are incomplete or damaged, or that Foremost has found a match for one of the established magic numbers in data that does not really correspond to that file type.
For example, we also have files in ‘zip’ format. And here I show you one of them.
We can even extract it, without any problems.
That’s right. I do not usually go around with all my hard drives and forensics tools. I only carry some USB flash drives with me, but I did not want to plug one into that computer and take home some ‘gift’, so I downloaded the DumpIt utility to capture the RAM.
All right. We have given it an output directory. But what if we want to re-run Foremost on that same directory? Do we delete it? Do we overwrite it? I usually use Foremost in this way (when I do not specify my own ‘foremost.conf’ file).
What exactly did I tell it?
-t all: Retrieve all the file types it knows (the ones it has in its own ‘foremost.conf’).
-T: Append a time stamp to the output directory name, so I do not have to delete it if I run the tool several times.
-v: Enable verbose mode, so it prints all the information it processes.
-i: The image file that is the target of the recovery.
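If you want to script this invocation, the options above can be assembled programmatically. The following is only a sketch: the helper `build_foremost_command` and the image name `memdump.raw` are hypothetical, and the only flags used are the ones discussed in this post (‘-c’, ‘-t’, ‘-T’, ‘-v’, ‘-o’ and ‘-i’). It builds the argument list without running anything.

```python
import shlex
from typing import List, Optional


def build_foremost_command(image: str,
                           types: Optional[str] = "all",
                           timestamp: bool = True,
                           verbose: bool = True,
                           config: Optional[str] = None,
                           output_dir: Optional[str] = None) -> List[str]:
    """Build a foremost argument list from the options covered in the post."""
    cmd = ["foremost"]
    if config:
        # A custom configuration file is given first, right after the command.
        cmd += ["-c", config]
    if types:
        cmd += ["-t", types]
    if timestamp:
        cmd.append("-T")
    if verbose:
        cmd.append("-v")
    if output_dir:
        cmd += ["-o", output_dir]
    cmd += ["-i", image]
    return cmd


# "memdump.raw" is a placeholder name for the evidence file.
command = build_foremost_command("memdump.raw")
print(shlex.join(command))
```

The resulting list could be handed to `subprocess.run(command, check=True)` on a machine where Foremost is installed.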
But where did I get the ‘-T’ option? It was not explained in the help shown with the ‘-h’ parameter. This is why we always have to read the documentation: it comes from ‘man foremost’
With this, we can run the tool as many times as we want without deleting the previous results.
And we review the files that it has been able to extract, which should be the same as in the previous case, since we have not changed its configuration file.
So far, we have only seen in the ‘audit.txt’ file, the information regarding the extracted files. But it contains much more.
What can we see in this audit report? When the execution started, which invocation was made, the output directory specified and the configuration file used. And, with regard to the recovered material: the recovery number, the file name, the size, the offset of the file and a comments section, where there may be some kind of metadata.
We could leave it here, but …
If we do not break things, if we do not play, if we are not curious, we will not know how things work. That is not my way. Let’s be curious. We said at the beginning that it works with a configuration file. Well, let’s see what it tells us.
Much to read? In short, the configuration file controls which types of files to look for. For each file type it describes the extension, the header and/or footer, whether or not the match is case-sensitive and the maximum file size. You can comment out lines with ‘#’, so they will be ignored. It works with hexadecimal and octal values. You can use ‘?’ as a wildcard character. And if we want the extracted files to be written without any extension, we put the value ‘NONE’ in the extension name field …
Let’s look at it with a simple example.
Here we see how the ‘jpg’ file extension is configured in the file ‘foremost.conf’. In the extension name field, ‘jpg’ is set. There is a ‘y’, which indicates that the search is case-sensitive. There is a value of ‘20000000’, which is the maximum file size in bytes. We can also see the different headers and footers that exist for this type of file, as hexadecimal values.
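To check that we are reading those fields correctly, here is a small Python sketch that splits one such line into its parts. It is a simplified reader written for this post, not Foremost's own parser: it assumes whitespace-separated columns in the order extension, case sensitivity, maximum size, header, footer, and it only decodes ‘\xHH’ hexadecimal escapes, three-digit octal escapes and ‘\s’ as a space. The sample line uses the values mentioned above together with a JFIF-style JPEG header and footer as an illustration.

```python
import re
from typing import Optional

# \xHH (hex), \NNN (octal, up to \377) or \s (assumed here to mean a space).
_ESCAPE = re.compile(rb"\\x([0-9a-fA-F]{2})|\\([0-3][0-7]{2})|\\s")


def decode_pattern(text: str) -> bytes:
    """Turn an escaped header/footer pattern into raw bytes."""
    def replace(match):
        if match.group(1):
            return bytes([int(match.group(1), 16)])
        if match.group(2):
            return bytes([int(match.group(2), 8)])
        return b" "
    return _ESCAPE.sub(replace, text.encode("latin-1"))


def parse_conf_line(line: str) -> Optional[dict]:
    """Parse one configuration line; return None for comments and blanks."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least: extension, case, size, header")
    extension, case, size, header = fields[:4]
    footer = fields[4] if len(fields) > 4 else None
    return {
        "extension": None if extension == "NONE" else extension,
        "case_sensitive": case.lower() == "y",
        "max_size": int(size),
        "header": decode_pattern(header),
        "footer": decode_pattern(footer) if footer else None,
    }


entry = parse_conf_line(r"jpg y 20000000 \xff\xd8\xff\xe0\x00\x10 \xff\xd9")
print(entry)
```

Any field beyond the footer is ignored by this sketch, so it is only suitable for understanding simple entries like the one shown.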
Let’s go and see it…
Still not clear at all? Let’s use GHex, which is a hexadecimal file editor.
It’s better that way, right? Look at the header and the footer of the ‘jpg’ files that we have in the file ‘foremost.conf’ and compare them with the ones in the image above. Do they match? This is how the parameters of the extensions are set.
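The same comparison can be done without a hex editor. This is a minimal sketch, assuming the usual JPEG start and end markers (FF D8 FF and FF D9); the helper names and the synthetic sample bytes are made up for the example. With a real file you would read its contents with `open(path, "rb").read()` instead.

```python
from typing import Tuple


def hex_preview(data: bytes, count: int = 8) -> Tuple[str, str]:
    """Return the first and last `count` bytes as space-separated hex."""
    return data[:count].hex(" "), data[-count:].hex(" ")


def has_signature(data: bytes, header: bytes, footer: bytes) -> bool:
    """True when data starts with the header and ends with the footer."""
    return data.startswith(header) and data.endswith(footer)


# Synthetic stand-in for a carved file: header, filler bytes, footer.
sample = b"\xff\xd8\xff\xe0" + b"\x00" * 20 + b"\xff\xd9"

first, last = hex_preview(sample, 4)
print("first bytes:", first)
print("last bytes: ", last)
print("looks like a JPEG:", has_signature(sample, b"\xff\xd8\xff", b"\xff\xd9"))
```

A file that passes this check can still be damaged in the middle, so it is a quick filter for false positives and not a guarantee.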
Let’s see it with some tests.
We have this description for this type of ‘jpg’ file, with the extension name ‘jpg’, with a value of ‘y’ to make it case-sensitive, and with a header and a footer in hexadecimal format.
Let’s change the name of the extension and size.
If you look at it, I have changed the name of the extension from ‘jpg’ to ‘fwhibbit’, and I have set the maximum size to ‘4194304’ (4 MB). Note that this capture contains an error: the size actually used in the next execution is 1024 bytes. I proceed, again, to run it …
First of all, pay attention, because in this case the order of the factors does alter the product. If we want to use our own ‘foremost.conf’ file, the first thing to state after invoking Foremost is where that file is, with the ‘-c’ parameter.
Now we can see that it really has worked without any problem. The extension name that we have modified is only a reference name, there so that we can identify the files. It does not affect the structure of the data we want to retrieve. Look at the size of the extracted data.
We can see how the file extension is written.
We can see the information about the execution that has been carried out with the tool.
And we can see that it has extracted a total of 976 files. If you compare these results, you can see that this time more files have been extracted than in the other case.
Let’s look at another example.
In this case, I have changed the name of the extension that I want to identify, to ‘jpg’, in addition to the size, which I have set to 4096 bytes. We run it again.
And we list what has been extracted.
Again, look at the results. Note the size of the extracted files. Look at the names of the files, ignoring their extension, and you will see that they match the previous case.
Let’s take one final look at how Foremost works.
As you can see, I have two Foremost configuration files, identical except for the size. In the first I have set a value of 1024 bytes and in the second a size of 4194304 bytes (4 MB). We run the tool again. First with one …
And then with the other …
Let’s see what he has extracted in both cases. First one …
And then another …
In both cases, 976 files have been extracted.
We will list the first entries of each directory and we will see that they match neither in the names of the files nor in their size.
And we could leave it here. But … what happens when we want to recover a file type that is not found in the ‘foremost.conf’ file? Well, that is easy to answer. We have two options. We can look it up in one of the online databases, of which there are a few available, for example https://www.filesignatures.net/ or http://www.garykessler.net/library/file_sigs.html, and trust them. Or we can download a file of that type, find the magic numbers ourselves and use the available databases to compare the results.
For example, a file that is not available in the foremost configuration file is ‘flv’. Well, if we do not have that type of file available on our computer, we look for it and download it.
We study it, with the tool that we feel most comfortable with, and we look for its header.
And we can compare our data with others.
And with this data, we create or modify our file ‘foremost.conf’
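The search for a header can also be automated: take the first bytes of several known-good files and keep what they have in common. The sketch below is an illustration under a few assumptions. The three samples are synthetic stand-ins for the first bytes of real ‘flv’ files (which start with the ASCII letters "FLV" followed by a version byte), the helper names are made up, and the maximum size of 4194304 bytes is an arbitrary choice for the example.

```python
from typing import List


def common_prefix(samples: List[bytes]) -> bytes:
    """Return the longest run of leading bytes shared by every sample."""
    if not samples:
        return b""
    shortest = min(samples, key=len)
    for i in range(len(shortest)):
        if any(sample[i] != shortest[i] for sample in samples):
            return shortest[:i]
    return shortest


def to_conf_pattern(data: bytes) -> str:
    """Write raw bytes as \\xHH escapes, as used for hexadecimal patterns."""
    return "".join(f"\\x{byte:02x}" for byte in data)


# Synthetic stand-ins; with real files, read e.g. open(path, "rb").read(16).
samples = [
    b"FLV\x01\x05\x00\x00\x00\x09\x00\x00\x00\x00",
    b"FLV\x01\x01\x00\x00\x00\x09\x00\x00\x00\x00",
    b"FLV\x01\x04\x00\x00\x00\x09\x00\x00\x00\x00",
]

header = common_prefix(samples)
# Columns: extension, case-sensitive, maximum size (assumed value), header.
conf_line = "\t".join(["flv", "y", "4194304", to_conf_pattern(header)])
print(conf_line)
```

The more samples you compare, the less likely it is that the shared prefix includes bytes that only happen to coincide, so it is still worth checking the result against the online databases.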
And we re-run the tool. If any of the data were wrong, it would tell us.
Now we are ready. With what we have seen, and for the case at hand, I prepare this configuration file, which is more than enough.
I’m interested in retrieving the ‘jpg’, ‘png’, ‘bmp’ and ‘tif’ files. We could safely make all of these extensions case-insensitive.
I run it for the last time
List the results
I check the audit report, ‘audit.txt’
And we can see what types of files have been extracted.
Keep in mind that for this tool to work as reliably as possible, we must know what we are looking for. And, in this case, it’s about knowing what kind of files we want to find, using their headers, their footers, (their data structure), and their maximum size.
That is all, for now. See you in the next entry. This Minion, devoted and loyal to you, says goodbye… for now.