Archive

Posts Tagged ‘log2timeline’

Very quick update – new release

October 1st, 2011 No comments

I know I haven’t been very active on the blog lately (I really haven’t written a thing), but I wanted to talk about the new release of log2timeline. Version 0.61 was released a few days ago. It mostly contains bug fixes (at least on my behalf). The only real change I made was to add an input module that reads the log2timeline CSV format. I also added a bash_completion script that makes it easier to use the tool in the bash shell.

However, this release hopefully marks a shift in the development of the tool. It has seven new input modules, six of which were donated to me, and I’ve got one more module that will be part of the next release. This is, at least to me, very exciting news, since it means that other people are starting to use the tool and find it useful enough to add modules to it. I hope that this shift in development will continue ;)

I would like to thank an anonymous donor who contributed five input modules and Willi Ballenthin, who contributed another module to the tool, his second one. Thanks also to John Ritchie, who contributed a module that parses the Firefox cache files; it was sent just slightly too late for me to add it to this release, so it hasn’t yet been distributed with the tool. I hope that people will continue to contribute modules.

I know that documentation has been lacking for potential developers. I hope to find the time soon to create such documentation, making it easier for people to contribute modules. In the meantime, I suggest looking at previously developed modules, and downloading the source code and looking inside the “dev/” directory. There you can find templates for new modules that can assist in their creation. Better documentation is hopefully on its way ;)

Quick update

May 5th, 2011 No comments

It’s been a while since my last post, and several things have happened since then, for instance the release of version 0.52 of log2timeline. I will publish another blog post detailing the differences between versions 0.51 and 0.52, such as the use of l2t_process, a new tool released alongside log2timeline.

I will also be talking at the upcoming SANS Forensics and Incident Response Summit in Austin, Texas. I’m going to be discussing the release of version 0.60 of log2timeline (stay tuned), which I’m working hard at getting ready (the reason for the few blog posts lately). There will be a lot of great talks at the summit, so I urge you to check it out…

And the nominees for the Forensic4Cast awards have been posted, and for some obscure reason someone decided to nominate log2timeline as the best computer forensics software. I’m honored that the tool got nominated, which was really a surprise to me, and given the competition (FTK and EnCase) the tool is really up against giants in the field.

Timeline Analysis 201 – review the timeline

February 22nd, 2011 1 comment

In this second post in my short series on timeline analysis I’m going to discuss the use of the CSV output module. In my previous post I discussed the different modules in log2timeline, at least in the version that was released then, and the meaning of each entry within the mactime format. Since then, however, I’ve started to use the CSV output module instead of the mactime one, and that will be the focus today.

After reading Corey’s excellent blog post about reviewing timelines using Excel a few months ago, I immediately told myself… that’s exactly what I was planning to do for the second post in my timeline analysis series. However, I wanted to focus more on the CSV output and its benefits, so I decided to write a similar blog post, just with an alternative method. Somehow this blog post just got buried away somewhere far, far away and never got written… well, until now. You can read the original post by Corey here. I’m not going to repeat what is said there, so please read the post yourselves, as it contains lots of useful information that I’m not including here.

The problem I’ve got with the mactime format and Excel is that filtering is not really optimal when it comes to something like the super timeline. It is easy to filter on dates, but as soon as you start trying to filter on the description field you hit the limitations of Excel quite quickly, just as Corey pointed out. The CSV output module of log2timeline, however, splits the information into more fields, allowing more fine-grained filtering; this makes the basic filters more useful and the need to resort to advanced filters less likely. So I’m going to focus on that aspect in this post. The CSV output consists of the following fields:

date,time,timezone,MACB,source,sourcetype,type,user,host,short,desc,\
version,filename,inode,notes,format,extra

The meaning of each field is:

  • date: The date of the event, in the format MM/DD/YYYY.
  • time: The time of day, expressed in 24h format, HH:MM:SS.
  • timezone: The timezone that was supplied when the tool was called.
  • MACB: The MACB meaning of the fields, mostly for compatibility with the mactime format.
  • source: The short name for the source.  All web browser history is for instance WEBHIST, registry entries are REG, simple log files are LOG, etc.
  • sourcetype: A slightly more comprehensive description of the source, “Internet Explorer” instead of WEBHIST, “NTUSER.DAT” instead of REG, etc.
  • type: The type of the timestamp itself, such as “Last Accessed”, “Last Written” or “Last modified”, etc.
  • user: The username associated with the entry, if one is available.
  • host: The hostname associated with the entry, if one is available.
  • short: A short description of the entry, which usually contains less text than the full description field.
  • desc: The description field.  This is where most of the information is stored: the actual parsed description of the entry.
  • version: The version number of the timestamp object.
  • filename: The full path and name of the file that contained the entry.
  • inode: The inode number of the file being parsed.
  • notes: Some input modules insert additional information in the form of a note, which goes here.  The field can also be used by the investigator during the review.
  • format: The name of the input module that was used to parse the file.
  • extra: Any additional information that was parsed is joined together and put here.
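For anyone who would rather script the review than use a spreadsheet, here is a minimal Python sketch of reading this layout with the standard csv module. It assumes the file may contain a header line naming the fields, and the sample event is invented purely for illustration; it is not output from a real run of the tool.

```python
import csv
import io

# Field order as listed in the post for the log2timeline CSV output.
L2T_FIELDS = [
    "date", "time", "timezone", "MACB", "source", "sourcetype", "type",
    "user", "host", "short", "desc", "version", "filename", "inode",
    "notes", "format", "extra",
]


def read_l2t_csv(handle):
    """Yield one dict per event, skipping any header line found in the file."""
    for row in csv.DictReader(handle, fieldnames=L2T_FIELDS):
        if row["date"] == "date":  # a header line, not an event
            continue
        yield row


# Hypothetical sample data (assumption: one header line, one event).
sample = io.StringIO(
    "date,time,timezone,MACB,source,sourcetype,type,user,host,short,desc,"
    "version,filename,inode,notes,format,extra\n"
    "01/15/2009,13:37:00,UTC,MACB,WEBHIST,Internet Explorer,Last Accessed,"
    "joe,WKS01,visited example.com,visited http://example.com/,2,"
    "index.dat,1234,-,iehistory,-\n"
)
events = list(read_l2t_csv(sample))
print(events[0]["sourcetype"], events[0]["date"])
```

Using csv rather than splitting on commas by hand matters if a description field is quoted and contains a comma of its own.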

Now we only need to create the timeline. The first step is to generate a filesystem timeline using TSK and save it to a file; let’s call it “bodyfile”. The next step is to generate the timeline using log2timeline (in this case we are dealing with a Windows XP image):

cd /mnt/analyze
timescanner -d . -z local -f winxp -o csv -w /cases/123456/timeline.txt

Now you have two files, one in CSV format and the other in mactime format (the one produced by TSK). We need to convert the mactime file into CSV, again using log2timeline. We use the mactime input module to read in the bodyfile, and we append its output to the timeline.txt file we created earlier.

log2timeline -f mactime -z local -o csv -w timeline.txt bodyfile

This way we have a single file, “timeline.txt”, which is a CSV file containing both the timeline extracted using TSK and the one from log2timeline. We can now open this file in Excel. I’m using the Mac OS X version, so the screenshots might differ a bit from the Windows version, yet the principles should all stay the same (Windows screenshots can be found in Corey’s post here).

Import the File Into Excel

The first step obviously is to open Excel and import the file.  That can be done in two ways, either simply opening the file itself or by choosing “File/Import”.

Go to the menu File/Import...

If you decided to go for the “import” function, the next screen is a choice of file type.  Here we will choose a text file instead of the default value of a CSV file.

Choose a text file

Choosing the file type in the import menu

The next step of the text import wizard gives us the choice of splitting the file using either a fixed width or delimiters. Since this is a comma separated file, we will choose “Delimited”.

Choose a delimited file

The file is comma delimited, so choose delimited.

The second step of the wizard lets us choose which delimiters we’ve got. We only want to use the “comma” option, so please un-check the “Tab” field and check the “comma” one.

Don't check the box "tab", instead just have the "comma" box checked

Check the box marked "comma", and un-check the box "tab"

The third and final step of the wizard is to choose the data type for each column.  I usually choose the value of the “date” field as a “Date” of the format “MDY” to make things a bit easier in the final step.

In the last window in the text import wizard, choose the date field as the value "Date: MDY"

Sorting

Now that all the data is imported, one thing to notice is that timescanner does not sort the timestamps at all. This is because timescanner recursively scans the image and parses each file it finds, inserting the events as it goes. Therefore we need to sort the data and create filters. Start by highlighting the top row and choosing “Filter” from the “Data” menu.

Turn on filters for the fields

Highlight fields and choose "Data/Filter"

This turns all of our fields inside the top row into filterable columns, with a drop-down menu.  Now all we need to do is to sort the data in the correct time order.  So we go for the “Data / Sort…” option in the menu.

Go to the "Data/Sort" menu to sort the data

Time to sort the data

To correctly sort the data we start by sorting on the “date” column, and since we usually want to have the latest events on top, we choose to sort on values in the order “Newest to Oldest”. However, this is not enough, since we’ve also got the time field. We therefore add another sort level using the “+” sign, choose the column “time”, and sort it on values in the order “Largest to Smallest”.

Sorting the timestamp based on date from newest to oldest and time on largest to smallest

Now we’ve got our timeline all sorted out and ready to analyse.
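The same two-level sort can be sketched in Python for timelines that are handled outside of Excel. This is only an illustration with made-up rows; it assumes the date and time columns use the MM/DD/YYYY and HH:MM:SS formats described earlier.

```python
from datetime import datetime


def event_key(row):
    """Combine the date and time columns into one sortable datetime value."""
    return datetime.strptime(row["date"] + " " + row["time"], "%m/%d/%Y %H:%M:%S")


# Hypothetical rows, deliberately out of order like raw timescanner output.
rows = [
    {"date": "01/15/2009", "time": "08:00:00", "desc": "b"},
    {"date": "12/31/2008", "time": "23:59:59", "desc": "a"},
    {"date": "01/15/2009", "time": "13:37:00", "desc": "c"},
]

# Newest events first, matching the sort order chosen in the spreadsheet.
newest_first = sorted(rows, key=event_key, reverse=True)
print([r["desc"] for r in newest_first])
```

Parsing the date first is the scripted equivalent of importing the column as “Date: MDY”: sorted as plain text, 12/31/2008 would land after 01/15/2009.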

Analysis

Here are some of the simple filters and what we can do with them. We can now easily filter based on the sourcetype, for instance. Just click on the arrow next to the “sourcetype” column and choose which sourcetypes you would like to include in the analysis.

simple filter

Filtering based on sourcetype

Now you can easily choose which days, months or years to examine. Simply use the “date” column and filter based on that. Here we are only focusing on January 2009 (or janúar as it is written in Icelandic).

Simple filter (dates)

Filtering dates, simple filter

One final trick I use quite frequently with the simple filters: as soon as I’ve gone through some timestamps that are of some value, I often highlight them and give them a color, usually using only a few colors. Then I can filter out or include only events of a certain color.

Color coding fields

Fields can be color coded

Filtering based on color:

Filtering based on colors

Lines can be filtered by the color

Finally, the CSV output often contains too many columns, so it is often wise to hide a few of them from view. I usually hide the following fields:

  • timezone (the same for everyone usually)
  • host (if I’m examining one host only on the timeline)
  • short (better to see the full description)
  • version (don’t care about that one)
  • format

Sometimes I hide other fields as well, depending on the investigation.

Hide fields

Some fields can be hidden

One thing to notice, though, is that timelines can quickly become way too large for Excel and other spreadsheet applications. The spreadsheet becomes incredibly slow and difficult to manage. So I usually use some grep and other command-line kung-fu to remove irrelevant entries from the timeline, or to include only the timespan of the investigation, making the file as small as possible.
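As one possible way of shrinking the file before it goes into a spreadsheet, the sketch below keeps only the events inside a given date range. It is not the grep workflow mentioned above, just a Python stand-in for it. The function name and the shortened four-column sample are hypothetical, and the sketch assumes the file starts with a header line (and tolerates the header being repeated after an append).

```python
import csv
import io
from datetime import date, datetime


def trim_timeline(src, dst, start, end):
    """Copy only the events dated between start and end (inclusive) to dst."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if row["date"] == "date":  # header repeated by an appended run
            continue
        day = datetime.strptime(row["date"], "%m/%d/%Y").date()
        if start <= day <= end:
            writer.writerow(row)
            kept += 1
    return kept


# Hypothetical, shortened sample (only four of the real columns).
src = io.StringIO(
    "date,time,source,desc\n"
    "12/31/2008,23:59:59,LOG,too early\n"
    "01/15/2009,13:37:00,WEBHIST,in range\n"
    "date,time,source,desc\n"
    "01/20/2009,08:00:00,REG,in range\n"
    "02/01/2009,00:00:01,LOG,too late\n"
)
dst = io.StringIO()
kept = trim_timeline(src, dst, date(2009, 1, 1), date(2009, 1, 31))
print(kept)
print(dst.getvalue())
```

With real files, src and dst would be handles opened with newline="" as the csv documentation recommends.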

In my next post, I will go over some of the command-line steps I usually take to minimize my timeline, and other tricks I tend to use before loading it up in Excel.

I will also go over visualization, using the output module of BeeDocs.

And if you have any other suggestions, please add them in the comment section.

SANS summit and gold paper

August 27th, 2010 1 comment

Well, it’s been quite a while since my last post; summer vacation coupled with paternity leave gave me a pleasant absence from the computer screen. But I’m back now, and surprisingly my gold paper has finally been published. The title of the paper is “Mastering the Super Timeline With log2timeline”, and as those who read the title carefully will have guessed, it describes my little pet project, log2timeline, and timeline analysis in general.

And I’m about to give a talk at the SANS EU Forensics summit taking place in London on the 8th and 9th of September. Well unless some unnamed volcanoes here in Iceland start to protest again… it will take place then.

log2timeline Version 0.50 Released

June 30th, 2010 No comments

Well, I’ve finally decided to release version 0.50 of log2timeline. A lot of things have changed since version 0.43, although only one new input module has been introduced; we will get to that later. I just wanted to go over some of the changes made to the tool.

First of all, the verification phase has been slightly changed in all input modules to make it more optimized, thus making timescanner run considerably faster on a mounted image. Secondly, and perhaps the biggest structural change, the timestamp object, as it is called within the framework, has been modified. The timestamp object is basically a Perl hash that contains information about the parsed line or timestamp within a file being parsed by the tool. It is created by the input module and then used by the output module to create the appropriate output. The old structure of the timestamp object was too dependent on the mactime output, making it difficult to create new output modules without repeating some information. So the timestamp object was completely changed to make it independent of any particular output method. The output modules now need to process the data a bit more, but in return the output is a lot more intuitive.

Another very important change is the addition of contributed code to the tool; for the very first time someone actually contributed code (something I’m hoping is a new trend, not just a once-in-a-lifetime event for the tool). A new input module for parsing the output of both the psscan and psscan2 modules in Volatility was added by Julien Touche, and the new timescanner_threaded application was contributed by Ben Schmitt. One of the goals for version 0.50 was to turn timescanner into a threaded application, making it considerably quicker than the older single-threaded one. With version 0.50 the very first threaded version is released as a proof of concept. It is not recommended for real use, since it really isn’t any faster than the single-threaded one and it sometimes skips printing some of the timestamps. So do not use it for anything other than testing and reporting bugs. I’m hoping that we will have this fixed in the next version; perhaps it needs to be completely rewritten to work properly, perhaps there are just a few bugs that we need to sort out.

Another change that has to be mentioned is the ability of timescanner to choose which modules to use. The older versions had the mentality of either using every input module (timescanner) or just a single one (log2timeline). Version 0.50 introduces the possibility of manually selecting which modules are loaded by timescanner and used for the recursive scan. The -f option of timescanner is the key to choosing the appropriate modules. The option can be used in the following ways:

  • -f module1,module2,module3
    The first option is just to list all the modules that you want to use, comma separated.
  • -f=-module1,module2,module3
    The second method is to use the minus (-) sign to indicate which modules you want to skip.  However, it should be noted that if you use the - sign you have to precede it with an equal sign ( -f=-module1).  This tells timescanner to load all available input modules EXCEPT the ones that are listed.
  • -f list
    The third option is to use one of the pre-defined lists of modules.  These pre-defined lists are simply text files that contain the names of the modules to use.  The lists currently included with the tool are:

    web
      chrome, firefox3, firefox2, ff_bookmark, opera, iehistory, iis

    winvista
      chrome, evt, exif, ff_bookmark, firefox3, iehistory, iis, mcafee, opera, oxml,
      pdf, prefetch, recycler, restore, sol, userassist, win_link, xpfirewall

    winxp
      chrome, evt, exif, ff_bookmark, firefox3, iehistory, iis, mcafee, opera, oxml,
      pdf, prefetch, recycler, restore, setupapi, sol, userassist, win_link,
      xpfirewall
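To make the three forms of the -f value concrete, here is a small Python sketch of the selection rules as described above. It is not timescanner’s actual code (the tool is written in Perl); the function name and the shortened list of available modules are hypothetical, and only the “web” list is taken from the post.

```python
# Pre-defined list taken from the post; the real tool keeps these in text files.
MODULE_LISTS = {
    "web": ["chrome", "firefox3", "firefox2", "ff_bookmark", "opera", "iehistory", "iis"],
}


def select_modules(spec, available, lists=MODULE_LISTS):
    """Resolve the value given to -f into a list of module names."""
    if spec.startswith("-"):
        # Exclusion form: everything available except the listed modules.
        skipped = set(spec[1:].split(","))
        return [name for name in available if name not in skipped]
    if spec in lists:
        # Name of a pre-defined list.
        return list(lists[spec])
    # Plain comma separated list of modules.
    return spec.split(",")


# Hypothetical subset of installed modules, for illustration only.
available = ["chrome", "evt", "iis", "pdf"]

print(select_modules("evt,iis", available))    # explicit list
print(select_modules("-evt,pdf", available))   # what -f=-evt,pdf would pass in
print(select_modules("web", available))        # pre-defined list
```

The value in the second call starts with a minus sign, which is presumably why the command line needs the -f=-... spelling: without the equal sign the value would look like another option.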

There are several other changes that have been made, so read the full changelog to see a list of all of them. Two new output modules have been added to the tool. The first is TLNX, which is simply TLN output in a simple XML format. The other is for BeeDocs, a timeline visualization tool that runs on Mac OS X; the output module saves a tab delimited text file that can be imported directly into that tool.

I was also supposed to give a talk about log2timeline at the SANS EU forensic summit, which got canceled because of our lovely volcano here in Iceland. I promised to release my presentation as soon as I released the new version, so here it is. The presentation contains, among other things, screenshots of the BeeDocs output module as well as a better description of the timestamp object and the inner structure of log2timeline.

Timeline Analysis 101

May 28th, 2010 1 comment

I recently got the question of how to start with your timeline analysis.  And usually when someone finally asks you the question, you know that there are quite a lot of others that have absolutely no idea how to go about such analysis yet somehow don’t have the guts to ask.  Therefore for those that have never done any timeline analysis before or just want to get a better clarification on the meaning of the fields provided in the timeline, etc, here is my mini guide to get you started in your quest of timeline analysis, and who knows, this might be the first post in a series of similar ones.

First of all you need to create the super timeline. That should be the first step, without it there isn’t much to analyze.

We don’t want a simple filesystem timeline, which, although it can be revealing, just doesn’t give us enough of an overview of what happened on the system. So we would like to start by extending it into a super timeline, using a few tools to extract as much information as we possibly can. And since there is still not a single tool to do all that for us, at least until I’ve added the functionality of these tools into log2timeline, we will have to make do with these great instructions on how to create a super timeline.

Since we are adding information using a few tools we need a common output method, and in this case we used the mactime output to create the bodyfile and then changed it into a CSV file using the mactime tool from the Sleuthkit (the instructions go over this step by step). Now we are ready to import this into a spreadsheet application of our choice to start our analysis. But first things first… what do all of these fields mean, especially in the context of a super timeline?

The format of the mactime body file starting from version 3.x is an ASCII file, pipe delimited, which is structured in the following way:

MD5|name|inode|mode_as_string|UID|GID|size|atime|mtime|ctime|crtime

The mactime body file was created to properly represent timestamps from filesystems (since that is what TSK does) so you can see there are four timestamps in each line, even though there aren’t always four recorded timestamps in every filesystem. We still use this format to describe entries that have originated from other sources than files, that is information extracted from within log files, even though some of the fields have no real meaning and quite possibly have the potential to make the analysis more difficult to understand.
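As a rough illustration of the layout, the following Python sketch splits one body file line into named fields. The sample line is invented, and the sketch assumes that the four timestamps are stored as Unix epoch seconds and that the file name itself contains no pipe character.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Field order of the body file line shown above (names lower-cased).
BODY_FIELDS = [
    "md5", "name", "inode", "mode_as_string", "uid", "gid", "size",
    "atime", "mtime", "ctime", "crtime",
]
BodyEntry = namedtuple("BodyEntry", BODY_FIELDS)


def parse_body_line(line):
    """Split one pipe delimited body file line into a BodyEntry."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != len(BODY_FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(BODY_FIELDS), len(parts)))
    return BodyEntry(*parts)


# Hypothetical line, not taken from a real image.
line = "0|/WINDOWS/system32/cmd.exe|1234|-rwx---r--|0|0|389120|1232026620|1232026620|1232026620|1232026620\n"
entry = parse_body_line(line)

# Assumption: timestamps are seconds since the Unix epoch, shown here in UTC.
accessed = datetime.fromtimestamp(int(entry.atime), tz=timezone.utc)
print(entry.name, entry.size, accessed.isoformat())
```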

A quick explanation of all the available fields in the mactime body format is perhaps needed.  Some of the information found here is taken from TSK wiki.

  • MD5
    This is the MD5 sum of the file, something that isn’t really used at all, but kept there just in case someone would want it (very time consuming to calculate the md5 sum for each file) – although it is possible to populate this field using log2timeline.
  • name
    This is the name and path of the file
  • inode
    Although there is no notion of inodes in most filesystems this field’s name is still inode.  The value of this field refers to the metadata address, which differs depending on the filesystem in question.  In the FAT context this refers to the FAT number and in NTFS this is the MFT number, etc…
  • mode_as_string
    Again, here is something that refers to the *NIX way of representing file access settings; drwxrwxrwx is the standard way of representing file access rights in *NIX.  If you see the letter, then that access is defined, otherwise it is filled with -.  The first letter represents the file type; if we take a closer look at the TSK wiki we see the following definition of the file type field:

    • -: Unknown type
    • r: Regular file
    • d: Directory
    • c: Character device
    • b: Block device
    • l: Symbolic link
    • p: Named FIFO
    • s: Shadow
    • h: Socket
    • w: Whiteout
    • v: TSK Virtual file / directory (not a real directory, created by TSK for convenience).

    The majority of the entries will be either ‘-’, ‘r’ or ‘d’; the others are mostly *NIX focused. The next three letters represent read/write/execute, which are the three access rights you can set on a file in a *NIX system.  You see that this is repeated three times: the first set is the access settings of the owner of the file (user settings), the next the group settings (each file has only one group and one user), and last you have the settings for everyone else.

    In other words, the mode:
    -rwx---r--
    Means that this is a file that can be read, modified and executed by its owner. All other members of the group that the file belongs to cannot do anything with it, that is to say they have no access rights. And everyone else, those that do not belong to the group and are not the owner, can read the file but not execute it nor modify it.

  • UID
    This is the User ID for the owner of the file.
  • GID
    This is the Group ID for the group permission of the file.
  • size
    The size of the file
  • atime
    Mactime uses the MACB method of representing timestamps.  The meaning of each of these timestamps differs between filesystems, so I will use a very generic description here and then show you a more detailed one later.
    This is the file’s last access time.
  • mtime
    This represents the last time it was modified.
  • ctime
    This represents the time when the file was changed.
  • crtime
    This represents the time the file was created.
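The mode_as_string field is the fiddliest one in the list, so here is a short Python sketch that decodes it according to the description above. It handles only the ten-character form discussed in this post and treats each permission position as a simple on/off flag (special bits such as setuid are ignored); the function name is hypothetical.

```python
# File type letters as listed above (from the TSK wiki).
FILE_TYPES = {
    "-": "Unknown type", "r": "Regular file", "d": "Directory",
    "c": "Character device", "b": "Block device", "l": "Symbolic link",
    "p": "Named FIFO", "s": "Shadow", "h": "Socket", "w": "Whiteout",
    "v": "TSK Virtual file / directory",
}


def decode_mode(mode):
    """Return the file type and the read/write/execute flags per class."""
    if len(mode) != 10:
        raise ValueError("expected a 10 character mode string")
    file_type = FILE_TYPES.get(mode[0], "Unrecognised type")
    rights = {}
    for who, start in (("owner", 1), ("group", 4), ("other", 7)):
        triplet = mode[start:start + 3]
        rights[who] = {
            "read": triplet[0] == "r",
            "write": triplet[1] == "w",
            "execute": triplet[2] == "x",
        }
    return file_type, rights


file_type, rights = decode_mode("-rwx---r--")
print(file_type)
for who in ("owner", "group", "other"):
    print(who, rights[who])
```

For the example mode from the text this reports full access for the owner, none for the group, and read-only access for everyone else.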

So we need to take a closer look at the MACB (modified, accessed, changed, birth) definition which is used.  Since each filesystem has its own definition of the timestamps, we really can’t generalize and say that these timestamps have the same meaning in each context.  So we need to take a look at it from a different perspective; I will just include a table from the Sleuthkit web site:

MAC Meaning by File System

File System  M              A         C             B
Ext2/3       Modified       Accessed  Changed       N/A
FAT          Written        Accessed  N/A           Created
NTFS         File Modified  Accessed  MFT Modified  Created
UFS          Modified       Accessed  Changed       N/A
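The table above translates naturally into a lookup structure. The Python sketch below is one way to use it when reading a timeline; it assumes the type column uses the letters m, a, c and b with dots for the timestamps that do not apply (for example “m.c.”), and the function name is hypothetical.

```python
# Meaning of each MACB letter per filesystem, copied from the table above.
# None marks a timestamp the filesystem does not record (N/A in the table).
MACB_MEANING = {
    "Ext2/3": {"M": "Modified", "A": "Accessed", "C": "Changed", "B": None},
    "FAT": {"M": "Written", "A": "Accessed", "C": None, "B": "Created"},
    "NTFS": {"M": "File Modified", "A": "Accessed", "C": "MFT Modified", "B": "Created"},
    "UFS": {"M": "Modified", "A": "Accessed", "C": "Changed", "B": None},
}


def explain(filesystem, macb):
    """Translate a type string such as 'm.c.' into the timestamp meanings."""
    meanings = MACB_MEANING[filesystem]
    found = []
    for letter in macb.upper():
        # Dots and N/A entries carry no meaning, so they are skipped.
        if letter in meanings and meanings[letter] is not None:
            found.append(meanings[letter])
    return found


print(explain("NTFS", "m.c."))
print(explain("FAT", "macb"))
```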

If we take a closer look at this, you can start to see where this form does not properly describe timestamps extracted from other sources. Although the timeline to be analyzed in the spreadsheet application contains only one timestamp per entry, it will contain the MACB definition of the timestamp in question. Each line in the CSV file contains the following fields:

Date,Size,Type,Mode,UID,GID,Meta,File Name

In other words, you have the date, the size of the file, the type (the MACB meaning of the timestamp), the mode, user ID, group ID, the META or inode number and the “File Name” field.

How does a registry entry fall into these fields? It does not necessarily have a group ID or a user ID, and not really a size, an inode number or a file name, etc… This is where a bit of artistic license comes into play when we use these fields to describe such events.  And another thing: a registry entry has one timestamp, called “Last Write time”… how does that fit into the MACB definition, which is more geared towards filesystems?  This creates the need to adjust the definition of MACB.

Each file type has its own settings with regard to timestamps, and in the log2timeline context you will see the following input modules (assuming the latest published release, which is at this time version 0.43).  This is the meaning of the MACB fields found within the timeline:

Input module | atime | mtime | ctime | crtime
chrome | The time a URL was visited or a file downloaded
evt | The time when the event is registered in the Event log file
evtx | The time when the event is registered in the Event log file
exif | The time when the event is registered in the Even log file
ff_bookmark | The time when a bookmark was created or visited or when a bookmark folder was created or modified (the type is contained within the description)
firefox3 (URL record) | The time when a URL was visited
firefox3 (bookmark) | When a bookmark or a bookmark folder was last modified | When a bookmark or a bookmark folder was added
iehistory | Meaning different depending on the location of the file – cache meaning when the file was saved on the hard drive | When the user visited the URL – but can differ between index.dat files
iis | The time when the entry was logged down
isatxt | The time when the entry was logged down
mactime | atime of the record | mtime of the record | ctime of the record | crtime of the record
opera | The time a user visited the URL
oxml | The time of the action that is referred to in the text
pcap | The time when the packet was recorded
prefetch | The time when the executable that the prefetch file points to was executed last
recycler | The time the file was sent to the recycle bin
restore | The time when the restore point was created according to the timestamp in the rp.log file
setupapi | The timestamp in the log file for each entry, indicating the time when the information was logged down
sol | The time of the event that was found inside the file
sol (no timestamp found) | The last access time of the sol file | The last modification time of the sol file | The last inode or metadata change time of the sol file
squid | The time when the entry was logged down
tln | The timestamp within the TLN entry
userassist | The extracted timestamp from the binary value of the registry entry
win_link | The extracted last access time from the LNK file | The extracted last modified time from the LNK file | The extracted last creation time from the LNK file
xpfirewall | The time when the entry was logged down

Some of these modules will include a size field, while others do not, etc.  So you will see different fields populated by different modules, depending on whether that particular item is really applicable. And you will see that the “File Name” field is replaced by a description of the event as extracted from the artifact in question.

This of course only applies to the mactime output, yet other output mechanisms share some of these fields.  In future posts I will go into more detail about the other output mechanisms as well as the actual analysis part, showing examples of timelines and how to interpret them.

One of the nice things about exporting the data into the mactime body file (besides the fact that you can include information from other tools) is that you can easily transform it into a CSV file for importing into a spreadsheet application.  And in the spreadsheet application you can easily hide the fields that you are not interested in, making the analysis easier. So typically what I do is split the date field into date and time, then hide the mode, UID and GID fields.  After that you can turn on filtering in the spreadsheet and start analyzing (or just use a grep/less/vim combination). Of course this is just my own preference while doing timeline analysis, and it depends on what I’m looking for as well.
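The same clean-up can be sketched in Python for those who prefer to prepare the file before opening it. This is only an illustration: it assumes the Date column looks like “Thu Jan 15 2009 13:37:00”, so check it against your own mactime output, and the row shown is invented.

```python
from datetime import datetime

# Columns I would normally hide in the spreadsheet.
HIDDEN = {"Mode", "UID", "GID"}


def simplify(row):
    """Split Date into separate date and time values and drop hidden columns."""
    # Assumed date layout: weekday, month name, day, year, time of day.
    stamp = datetime.strptime(row["Date"], "%a %b %d %Y %H:%M:%S")
    out = {
        "Date": stamp.strftime("%m/%d/%Y"),
        "Time": stamp.strftime("%H:%M:%S"),
    }
    for key, value in row.items():
        if key != "Date" and key not in HIDDEN:
            out[key] = value
    return out


# Hypothetical row using the column names of the mactime CSV output.
row = {
    "Date": "Thu Jan 15 2009 13:37:00", "Size": "389120", "Type": "m.c.",
    "Mode": "-rwx---r--", "UID": "0", "GID": "0", "Meta": "1234",
    "File Name": "/WINDOWS/system32/cmd.exe",
}
clean = simplify(row)
print(clean)
```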
