In this second post in my short series on timeline analysis, I’m going to discuss the CSV output module. In my previous post I covered the different modules available in log2timeline, at least in the version released at the time, and the meaning of each entry in the mactime format. Since then I’ve started to use the CSV output module instead of the mactime one, and that is the focus today.
After reading Corey’s excellent blog post about reviewing timelines using Excel a few months ago, I immediately told myself: that’s exactly what I was planning to do for the second post in my timeline analysis series. However, I wanted to focus more on the CSV output and the benefits of using it, so I decided to write a similar blog post, just with an alternative method. Somehow this post got buried somewhere far, far away and never got written… well, until now. You can read the original post by Corey here. I’m not going to repeat what is said there, so please read it yourself, as it contains lots of useful information that I’m not including here.
The problem I’ve got with the mactime format and Excel is that filtering is not really optimal when it comes to something like a super timeline. It is easy to filter on dates, but as soon as you try to filter on the description field you hit the limitations of Excel quite quickly, just as Corey pointed out. The CSV output module of log2timeline, however, splits the information into more fields, allowing more fine-grained filtering. That makes the basic filters more useful and the advanced filters less necessary, so I’m going to focus on that aspect in this post. The CSV output consists of the following fields:
date,time,timezone,MACB,source,sourcetype,type,user,host,short,desc,\
version,filename,inode,notes,format,extra
The meaning of each field is:
- date: The date of the event, in the format MM/DD/YYYY.
- time: The time of day in 24-hour format, HH:MM:SS.
- timezone: The timezone that was given to the tool when it was run.
- MACB: The MACB meaning of the fields, mostly for compatibility with the mactime format.
- source: The short name for the source. All web browser history is for instance WEBHIST, registry entries are REG, simple log files are LOG, etc.
- sourcetype: A slightly more comprehensive description of the source, “Internet Explorer” instead of WEBHIST, “NTUSER.DAT” instead of REG, etc.
- type: The type of the timestamp itself, such as “Last Accessed”, “Last Written” or “Last modified”, etc.
- user: The username associated with the entry, if one is available.
- host: The hostname associated with the entry, if one is available.
- short: A short description of the entry, usually contains less text than the full description field.
- desc: The description field. This is where most of the information is stored: the actual parsed description of the entry.
- version: The version number of the timestamp object.
- filename: The full path of the file that contained the entry.
- inode: The inode number of the file being parsed.
- notes: Some input modules insert additional information in the form of a note, which goes here. The investigator can also use this field during the review.
- format: The name of the input module that was used to parse the file.
- extra: Some additional information parsed is joined together and put here.
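To make the layout concrete, here is a minimal Python sketch that reads a line in this format with the standard csv module. It assumes the file starts with the header line shown above, and the sample event is invented for illustration; it is not real tool output.

```python
import csv
import io

# The 17 column names, in the order listed above.
FIELDS = ("date,time,timezone,MACB,source,sourcetype,type,user,host,"
          "short,desc,version,filename,inode,notes,format,extra").split(",")

# Hypothetical sample data: the header line plus one made-up event.
SAMPLE = (
    ",".join(FIELDS) + "\n"
    "01/15/2009,13:37:00,UTC,MACB,WEBHIST,Internet Explorer,Last Accessed,"
    "joe,WORKSTATION,visited example.com,URL: http://example.com/,2,"
    "/Documents and Settings/joe/index.dat,1234,-,example_module,-\n"
)


def read_events(handle):
    """Yield each timeline line as a dict keyed by column name."""
    yield from csv.DictReader(handle)


events = list(read_events(io.StringIO(SAMPLE)))
print(events[0]["sourcetype"], "|", events[0]["desc"])
```

Keying each row by column name is what makes the per-field filtering described in this post possible outside of Excel as well.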
Now we only need to create the timeline. The first step is to generate a timeline using TSK and save it to a file; let’s call it “bodyfile”. The next step is to generate the timeline using log2timeline (in this case we are dealing with a Windows XP image):
cd /mnt/analyze
timescanner -d . -z local -f winxp -o csv -w /cases/123456/timeline.txt
Now you’ve got two files, one in CSV format and the other in mactime format (the one produced by TSK). We need to convert the mactime file into CSV, again using log2timeline. We use the mactime input module to read the bodyfile and append the result to the timeline.txt file we created earlier.
log2timeline -f mactime -z local -o csv -w /cases/123456/timeline.txt bodyfile
This way we have a single file, “timeline.txt”, a CSV file containing both the timeline extracted using TSK and the one from log2timeline. We can now open this file in Excel. I’m using the Mac OS X version, so the screenshots might differ a bit from the Windows version, yet the principles should all stay the same (Windows screenshots can be found in Corey’s post here).
Import the File Into Excel
The first step, obviously, is to open Excel and import the file. That can be done in two ways: either by simply opening the file itself or by choosing “File/Import”.

If you decide to go for the “import” function, the next screen asks for the file type. Here we choose a text file instead of the default, a CSV file.

Choosing the file type in the import menu
The next step of the text import wizard gives us the choice of splitting the file either by fixed width or by delimiter. Since this is a comma-separated file, we will choose “Delimited”.

The file is comma delimited, so choose delimited.
The second step of the wizard lets us choose which delimiters we’ve got. We only want the “comma” option, so please un-check the “Tab” box and check the “comma” one.

Check the box marked "comma", and un-check the box "tab"
The third and final step of the wizard is to choose the data type for each column. I usually set the “date” column to “Date” with the format “MDY” to make things a bit easier later on.

In the last window of the text import wizard, set the date field to "Date: MDY"
Sorting
Now that all the data is imported, one thing to notice is that timescanner does not sort the timestamps at all. This is because timescanner recursively scans the image and parses each file it finds, inserting the events as it goes. We therefore need to sort the data and create filters. Start by highlighting the top row and choosing “Filter” from the “Data” menu.

Highlight fields and choose "Data/Filter"
This turns all of the fields in the top row into filterable columns, each with a drop-down menu. Now all we need to do is sort the data into the correct time order, so we go for the “Data / Sort…” option in the menu.

Time to sort the data
To sort the data correctly we start by sorting on the “Date” column. We usually want the latest events at the top, so we sort on values in the order “Newest to oldest”. This is not enough, however, since we’ve also got the time field. We therefore add another sort level using the “+” sign, choose the column “time”, and sort it on values in the order “Largest to Smallest”.

Sorting the timestamp based on date from newest to oldest and time on largest to smallest
Now we’ve got our timeline all sorted out and ready to analyse.
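The same two-level sort can be sketched in a few lines of Python, in case you want to check the ordering outside Excel. This is only an illustration with made-up rows; it assumes the date and time fields use the MM/DD/YYYY and HH:MM:SS formats described earlier. Note that sorting the date column as plain text would not give chronological order, because the month comes first.

```python
from datetime import datetime


def event_key(row):
    """Combine the date (MM/DD/YYYY) and time (HH:MM:SS) fields into one sortable value."""
    return datetime.strptime(row["date"] + " " + row["time"], "%m/%d/%Y %H:%M:%S")


# Hypothetical rows, reduced to the fields needed for sorting.
rows = [
    {"date": "01/15/2009", "time": "08:00:00", "desc": "b"},
    {"date": "12/31/2008", "time": "23:59:59", "desc": "a"},
    {"date": "01/15/2009", "time": "13:37:00", "desc": "c"},
]

# Newest first, matching "Newest to oldest" on date and "Largest to Smallest" on time.
rows.sort(key=event_key, reverse=True)
print([r["desc"] for r in rows])
```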
Analysis
Here are the simple filters and what we can do with them. We can now easily filter on the sourcetype, for instance: just click on the arrow next to the “sourcetype” column and choose which sourcetypes you would like to include in the analysis.

Filtering based on sourcetype
Now you can easily choose which days, months or years to examine. Simply use the “date” column and filter based on that. Here we are only focusing on January 2009 (or janúar as it is written in Icelandic).

Filtering dates, simple filter
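The two simple filters above, sourcetype and date, boil down to a per-row test. Here is a rough Python sketch of that test; the function name, parameters and sample rows are my own for illustration and are not part of log2timeline.

```python
def matches(row, sourcetypes=None, month=None, year=None):
    """Return True if the row passes every filter that was given."""
    if sourcetypes is not None and row["sourcetype"] not in sourcetypes:
        return False
    mm, _dd, yyyy = row["date"].split("/")  # the date field is MM/DD/YYYY
    if month is not None and int(mm) != month:
        return False
    if year is not None and int(yyyy) != year:
        return False
    return True


# Hypothetical rows, reduced to the fields the filter looks at.
rows = [
    {"date": "01/15/2009", "sourcetype": "Internet Explorer", "desc": "visited a page"},
    {"date": "01/20/2009", "sourcetype": "NTUSER.DAT", "desc": "key written"},
    {"date": "02/01/2009", "sourcetype": "Internet Explorer", "desc": "visited another page"},
]

# Internet Explorer events from January 2009 only.
ie_january = [r for r in rows if matches(r, {"Internet Explorer"}, month=1, year=2009)]
print(len(ie_january))
```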
One final trick I use quite frequently with the simple filters: as soon as I’ve gone through some timestamps that are of some value, I highlight them and give them a color, usually using only a few colors. Then I can filter out, or include only, events of a certain color.

Fields can be color coded
Filtering based on color:

Lines can be filtered by color
Finally, the CSV output often contains too many columns, so it is often wise to hide a few of them from view. I usually hide the following fields:
- timezone (the same for everyone usually)
- host (if I’m examining one host only on the timeline)
- short (better to see the full description)
- version (don’t care about that one)
- format
Sometimes I hide other fields as well, depending on the investigation.

Some fields can be hidden
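If you prefer to drop those columns before the file ever reaches Excel, something like the following Python sketch would do it. The helper names are hypothetical, and the sample row is made up; the hidden set is just the list of fields I mentioned above.

```python
import csv
import io

FIELDS = ("date,time,timezone,MACB,source,sourcetype,type,user,host,"
          "short,desc,version,filename,inode,notes,format,extra").split(",")

# The columns I usually hide; adjust per investigation.
HIDDEN = {"timezone", "host", "short", "version", "format"}


def visible_columns(fieldnames, hidden=HIDDEN):
    """Keep the original column order, minus the hidden ones."""
    return [name for name in fieldnames if name not in hidden]


def write_view(rows, fieldnames, out):
    """Write rows to out with only the visible columns."""
    writer = csv.DictWriter(out, fieldnames=visible_columns(fieldnames),
                            extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical row: every field set to "-" except a few made-up values.
row = dict.fromkeys(FIELDS, "-")
row.update(date="01/15/2009", time="13:37:00", desc="made-up event")

out = io.StringIO()
write_view([row], FIELDS, out)
print(out.getvalue())
```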
One thing to note, though, is that timelines can quickly become far too large for Excel and other spreadsheet applications. The spreadsheet becomes incredibly slow and difficult to manage. So I usually use grep and other command-line kung-fu to remove irrelevant entries from the timeline, or to include only the timespan of the investigation, making it as small as possible.
In my next post, I will go over some of the command-line steps I usually take to minimize my timeline, and other tricks I use before loading it into Excel.
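As one possible way to cut the timeline down to the timespan of the investigation, here is a small Python sketch that streams the CSV and keeps only the rows inside a date range. It assumes the file begins with a header line containing a “date” column in MM/DD/YYYY format; the function name and the sample data are invented for illustration.

```python
import csv
import io
from datetime import date, datetime


def trim(src, dst, start, end):
    """Copy the header and every row dated between start and end (inclusive)."""
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)
    date_col = header.index("date")
    writer.writerow(header)
    kept = 0
    for row in reader:
        if not row:  # skip blank lines
            continue
        day = datetime.strptime(row[date_col], "%m/%d/%Y").date()
        if start <= day <= end:
            writer.writerow(row)
            kept += 1
    return kept


# Hypothetical three-event timeline, reduced to three columns.
SAMPLE = (
    "date,time,desc\n"
    "12/31/2008,23:59:59,too early\n"
    "01/15/2009,13:37:00,in range\n"
    "02/01/2009,00:00:01,too late\n"
)

out = io.StringIO()
kept = trim(io.StringIO(SAMPLE), out, date(2009, 1, 1), date(2009, 1, 31))
print(kept)
```

Because the rows are processed one at a time, this works on files that are far too large to open in a spreadsheet.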
I will also go over visualization, using the output module of BeeDocs.
And if you have any other suggestions, please add them in the comment section.