Archive

Posts Tagged ‘Timeline analysis’

Timeline Analysis 101

May 28th, 2010 kiddi 1 comment

I was recently asked how to get started with timeline analysis.  Usually when someone finally asks a question like that, you know there are quite a few others who have no idea how to go about such an analysis yet don’t have the guts to ask.  So for those who have never done any timeline analysis before, or who just want a clearer picture of what the fields in the timeline mean, here is my mini guide to get you started. Who knows, this might be the first post in a series of similar ones.

First of all you need to create the super timeline. Without it there isn’t much to analyze.

We don’t want a simple filesystem timeline, which, although it can be revealing, just doesn’t give us enough of an overview of what happened on the system.  So we start by extending it into a super timeline, using a few tools to extract as much information as we possibly can.  And since there is still no single tool that does all of that for us, at least until I’ve added the functionality of these tools into log2timeline, we will have to make do with these great instructions on how to create a super timeline.

Since we are adding information using a few tools we need a common output format. In this case we used the mactime format to create the bodyfile and then changed it into a CSV file using the mactime tool from the Sleuthkit (the instructions go over this step by step).  Now we are ready to import this into a spreadsheet application of our choice to start our analysis.  But first things first: what do all of these fields mean, especially in the context of a super timeline?

The format of the mactime body file starting from version 3.x is an ASCII file, pipe delimited, which is structured in the following way:

MD5|name|inode|mode_as_string|UID|GID|size|atime|mtime|ctime|crtime

The mactime body file was created to properly represent timestamps from filesystems (since that is what TSK does), so there are four timestamps in each line, even though not every filesystem records four timestamps. We still use this format to describe entries that originate from sources other than files, that is, information extracted from within log files, even though some of the fields then have no real meaning and can make the analysis more difficult to understand.
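To make the layout concrete, here is a minimal sketch of splitting one body file line into the eleven fields listed above. The class and function names are mine, the sample line is made up, and I am assuming the four timestamps are stored as Unix epoch seconds; treat it as an illustration rather than a full parser.

```python
from typing import NamedTuple


class BodyEntry(NamedTuple):
    """One line of a mactime 3.x body file (field order as listed above)."""
    md5: str
    name: str
    inode: str
    mode: str
    uid: str
    gid: str
    size: int
    atime: int
    mtime: int
    ctime: int
    crtime: int


def parse_body_line(line: str) -> BodyEntry:
    """Split a pipe-delimited body line into its eleven fields.

    Assumption: names never contain a literal '|' and the timestamps are
    Unix epoch seconds. Empty numeric fields are treated as 0.
    """
    parts = line.rstrip("\r\n").split("|")
    if len(parts) != 11:
        raise ValueError(f"expected 11 fields, got {len(parts)}")
    md5, name, inode, mode, uid, gid = parts[:6]
    size, atime, mtime, ctime, crtime = (int(p) if p else 0 for p in parts[6:])
    return BodyEntry(md5, name, inode, mode, uid, gid,
                     size, atime, mtime, ctime, crtime)


# Made-up sample line, only for illustration.
sample = "0|/tmp/a.txt|1234|-rwxr-xr-x|0|0|10|1275000000|1274000000|1274500000|0"
entry = parse_body_line(sample)
print(entry.name, entry.size, entry.mtime)
```

Keeping the inode, UID and GID as strings is deliberate here: entries that come from log files rather than from a filesystem may not carry meaningful numeric values in those fields.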

A quick explanation of all the available fields in the mactime body format is perhaps needed.  Some of the information found here is taken from the TSK wiki.

  • MD5
    This is the MD5 sum of the file. It isn’t really used, since calculating the MD5 sum for each file is very time consuming, but it is kept in case someone wants it, and it is possible to populate this field using log2timeline.
  • name
    This is the name and path of the file
  • inode
    Although most filesystems have no notion of inodes, this field is still called inode.  The value of this field refers to the metadata address, which differs depending on the filesystem in question.  In the FAT context this refers to the FAT number and in NTFS this is the MFT number, etc…
  • mode_as_string
    Again, this refers to the *NIX way of representing file access settings; drwxrwxrwx is the standard form for representing file access rights in *NIX.  If you see the letter, that access is granted, otherwise the position is filled with -.  The first letter represents the file type. If we take a closer look at the TSK wiki we see the following definition of the file type field:

    • -: Unknown type
    • r: Regular file
    • d: Directory
    • c: Character device
    • b: Block device
    • l: Symbolic link
    • p: Named FIFO
    • s: Shadow
    • h: Socket
    • w: Whiteout
    • v: TSK Virtual file / directory (not a real directory, created by TSK for convenience).

    The majority of the entries will be either ‘-’, ‘r’ or ‘d’; the others are mostly *NIX focused. The next three letters represent read/write/execute, which are the three access rights you can set on a file in a *NIX system.  This is repeated three times: the first set is the access settings of the owner of the file (user settings), the next is the group settings (each file has only one group and one user) and the last is the settings for everyone else.

    In other words, the mode:
    -rwx---r--
    means that this is a file that can be read, modified and executed by its owner. All other members of the group that the file belongs to have no access rights at all. And everyone else, those who do not belong to the group and are not the owner, can read the file but neither execute nor modify it.

  • UID
    This is the User ID for the owner of the file.
  • GID
    This is the Group ID for the group permission of the file.
  • size
    The size of the file
  • atime
    This is the file’s last access time.
    Mactime uses the MACB method of representing timestamps.  The meaning of each of these timestamps differs between filesystems, so I will use a very generic description here and show a more detailed one later.
  • mtime
    This represents the last time it was modified.
  • ctime
    This represents the time when the file’s metadata (the inode or MFT entry) was last changed.
  • crtime
    This represents the time the file was created.
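The mode_as_string field described above can be decoded mechanically. The following is a small sketch under a few assumptions: the mode is the ten character form used in this post (one type letter followed by three rwx triplets), the type letters are the ones from the TSK list above, and special bits such as setuid or sticky are ignored. The function name is hypothetical.

```python
# File type letters as listed in the TSK wiki excerpt above.
FILE_TYPES = {
    "-": "unknown type",
    "r": "regular file",
    "d": "directory",
    "c": "character device",
    "b": "block device",
    "l": "symbolic link",
    "p": "named FIFO",
    "s": "shadow",
    "h": "socket",
    "w": "whiteout",
    "v": "TSK virtual file / directory",
}


def describe_mode(mode: str) -> dict:
    """Break a ten character mode string into type and access rights.

    Assumption: plain r/w/x letters only; anything else in a position
    (including '-') is treated as "not granted".
    """
    if len(mode) != 10:
        raise ValueError(f"expected 10 characters, got {len(mode)}")
    result = {"type": FILE_TYPES.get(mode[0], "unrecognised type")}
    triplets = {"owner": mode[1:4], "group": mode[4:7], "other": mode[7:10]}
    for who, bits in triplets.items():
        result[who] = {
            "read": bits[0] == "r",
            "write": bits[1] == "w",
            "execute": bits[2] == "x",
        }
    return result


info = describe_mode("drwxr-x---")
print(info["type"], info["owner"], info["group"], info["other"])
```

For entries that do not come from a filesystem the mode field often carries no real information, so a decoder like this is mostly useful for the filesystem part of the timeline.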

So we need to take a closer look at the MACB (modified, accessed, changed, birth) definition that is used.  Since each filesystem has its own definition of the timestamps we can’t generalize and say that they have the same meaning in every context.  So we need to look at it from a different perspective; I will just include a table from the Sleuthkit web site:

MAC Meaning by File System

File System  M              A         C             B
Ext2/3       Modified       Accessed  Changed       N/A
FAT          Written        Accessed  N/A           Created
NTFS         File Modified  Accessed  MFT Modified  Created
UFS          Modified       Accessed  Changed       N/A

If we take a closer look at this, you can start to see where this format does not properly describe timestamps extracted from other sources. Although each entry in the timeline that we analyze in the spreadsheet application contains only one timestamp, it also contains the MACB definition of that timestamp. Each line in the CSV file contains the following fields:

Date,Size,Type,Mode,UID,GID,Meta,File Name

In other words, you have the date, the size of the file, the type (the MACB definition of the timestamp), the mode, user ID, group ID, the meta or inode number and the “File Name” field.

How does a registry entry fall into these fields? It does not necessarily have a group ID or a user ID, and not really a size, an inode number or a file name, etc… Here a bit of artistic license comes into play when we use these fields to describe such events.  And another thing: a registry entry has one timestamp, called “Last Write time”. How does that fit into the MACB definition, which is more geared towards filesystems?  This creates the need to adjust the definition of MACB.
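As a rough sketch of reading such a CSV file, the snippet below uses Python’s csv module and expands the Type column into the MACB letters that apply to each row. I am assuming the Type column uses the four position form with dots for unset positions (for example m.c. or .a..); the sample rows are made up and the function names are mine.

```python
import csv
import io


def expand_type(type_field: str) -> list:
    """Return the MACB letters that are set, e.g. 'm.c.' -> ['m', 'c']."""
    lowered = type_field.lower()
    return [flag for flag in "macb" if flag in lowered]


def read_timeline(handle):
    """Yield (date, flags, description) for each row of a mactime CSV file."""
    for row in csv.DictReader(handle):
        yield row["Date"], expand_type(row["Type"]), row["File Name"]


# Made-up sample data using the header shown above.
SAMPLE = (
    "Date,Size,Type,Mode,UID,GID,Meta,File Name\n"
    "Fri May 28 2010 10:15:00,1024,m.c.,-rwxr-xr-x,0,0,1234,/tmp/example.txt\n"
    "Fri May 28 2010 10:16:30,1024,.a..,-rwxr-xr-x,0,0,1234,/tmp/example.txt\n"
)

rows = list(read_timeline(io.StringIO(SAMPLE)))
for date, flags, description in rows:
    print(date, flags, description)
```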

Each file type has its own settings in regards to timestamps, and in the log2timeline context you will see the following input modules (assuming the latest published release, which at this time is version 0.43).  This is the meaning of the MACB fields found within the timeline:

Input module: meaning of atime / mtime / ctime / crtime
chrome: The time a URL was visited or a file downloaded
evt: The time when the event is registered in the Event log file
evtx: The time when the event is registered in the Event log file
exif: The time when the event is registered in the Event log file
ff_bookmark: The time when a bookmark was created or visited or when a bookmark folder was created or modified (the type is contained within the description)
firefox3 (URL record): The time when a URL was visited
firefox3 (bookmark): When a bookmark or a bookmark folder was last modified; when a bookmark or a bookmark folder was added
iehistory: Meaning differs depending on the location of the file (for the cache, when the file was saved on the hard drive); when the user visited the URL, although this can differ between index.dat files
iis: The time when the entry was logged down
isatxt: The time when the entry was logged down
mactime: atime of the record; mtime of the record; ctime of the record; crtime of the record
opera: The time a user visited the URL
oxml: The time of the action that is referred to in the text
pcap: The time when the packet was recorded
prefetch: The time when the executable that the prefetch file points to was executed last
recycler: The time the file was sent to the recycle bin
restore: The time when the restore point was created according to the timestamp in the rp.log file
setupapi: The timestamp in the log file for each entry, indicating the time when the information was logged down
sol: The time of the event that was found inside the file
sol (no timestamp found): The last access time of the sol file; the last modification time of the sol file; the last inode or metadata change time of the sol file
squid: The time when the entry was logged down
tln: The timestamp within the TLN entry
userassist: The extracted timestamp from the binary value of the registry entry
win_link: The extracted last access time from the LNK file; the extracted last modified time from the LNK file; the extracted creation time from the LNK file
xpfirewall: The time when the entry was logged down

Some of these modules will populate the size field, while others do not, etc.  So you will see different fields populated by different modules, depending on whether that particular item is really applicable. And you will see that the “File Name” field is replaced by a description of the event as extracted from the artifact in question.

This of course only applies to the mactime output, yet other output mechanisms share some of these fields.  In future posts I will go into more detail on both the other output mechanisms and the actual analysis part, showing examples of timelines and how to interpret them.

One of the nice things about exporting the data into the mactime body file (besides the fact that you can include information from other tools) is that you can easily transform it into a CSV file for importing into a spreadsheet application.  And in the spreadsheet application you can easily hide the fields that you are not interested in, making the analysis easier. So typically what I do is split the date part into date and time, then hide the mode, uid and gid fields.  After that you can turn on filtering in the spreadsheet and start analyzing (or just use a grep/less/vim combination). Of course this is just my own preference while doing timeline analysis, and it depends on what I’m looking for as well.
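If you would rather do that preparation outside of a spreadsheet, the same steps can be sketched in a few lines of Python. This assumes the Date column ends with the time of day after the last space (as in "Fri May 28 2010 10:15:00"); the sample rows and the function names are made up for illustration.

```python
import csv
import io

# Columns I usually hide during analysis.
HIDDEN = {"Mode", "UID", "GID"}


def simplify(rows):
    """Split Date into Date and Time and drop the hidden columns."""
    for row in rows:
        # Everything before the last space is the date, the rest is the time.
        date_part, _, time_part = row["Date"].rpartition(" ")
        slim = {"Date": date_part, "Time": time_part}
        for key, value in row.items():
            if key != "Date" and key not in HIDDEN:
                slim[key] = value
        yield slim


def only_matching(rows, text):
    """Keep rows whose File Name field contains text (case-insensitive)."""
    needle = text.lower()
    return [row for row in rows if needle in row["File Name"].lower()]


# Made-up sample data.
SAMPLE = (
    "Date,Size,Type,Mode,UID,GID,Meta,File Name\n"
    "Fri May 28 2010 10:15:00,1024,m.c.,-rwxr-xr-x,0,0,1234,/tmp/example.txt\n"
    "Fri May 28 2010 10:16:30,2048,.a..,-rw-r--r--,0,0,1235,/tmp/other.log\n"
)

timeline = list(simplify(csv.DictReader(io.StringIO(SAMPLE))))
for row in only_matching(timeline, "other"):
    print(row)
```

The filter at the end plays the role of the spreadsheet’s column filter, or of grep on the text file.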

Timelines, again

March 23rd, 2010 kiddi No comments

I forgot to mention Aftertime in my last blog post, a new tool to create and analyse timelines.  Rob pointed this tool out to me the other day, and I’ve done some limited testing on it.  It is very easy to create the timeline: just add the image file and let it crunch through it, all point-and-click.  That is nice and I’m sure some will prefer that over log2timeline, where you need to use the command line and know the parameters of the tool, etc.  The tool also provides a nice GUI to display the timeline, using separate colors for each source, and to create reports.  Yet somehow I got the feeling it might be easy to overlook some of the important events, especially if there are only a couple of them. This might be because I’m not used to examining timelines visually like this, or because it is hard to spot a single event that is surrounded by benign ones using a visual method. This is something I have to test further, since I think there are a lot of benefits to being able to visualize the timeline.

Harlan Carvey posted yesterday about some of the links that you see in this post.  One of them covered adding the output of regtime.pl to the timeline, which includes every change made to the registry.  I haven’t added that functionality into log2timeline yet, that is, parsing every single registry key into the timeline.  So far I’ve only included the UserAssist key, which adds more context to the registry entries than simply dumping everything there.  In the near future you will see a lot more registry entries parsed by log2timeline, where I intend to parse only specific keys, parsed and put into context.  I’ve been playing around with a few of these entries and I hope to include at least part of my ideas on the subject in version 0.51.

I agree with Harlan that adding every registry entry into the timeline can sometimes be overkill and drown you in events, and that in some cases it loses some context (since you are not parsing the content of the keys).  However, I have to admit that in some cases it really helps you find registry entries that you might otherwise have missed.  I know that it has helped me greatly in at least a few exams where I used tools like regtime.pl or reglookup-timeline to create the timeline.  In those cases I was looking at a very specific timeframe, which made the additional entries not so difficult to go through, and I found evidence or settings of software that I did not know at the time was installed (since timeline analysis is often the first step I do).  That led me quickly to what I was really looking for, thereby shortening the investigation time considerably.  I’m not saying that I wouldn’t have found what I was looking for using other methods, but adding the content of the entire registry into the timeline greatly reduced the investigation time, so I think there is definitely value in it.  That being said, log2timeline modules that actually parse the content of specific keys and put those last write times into context add more value to the timeline than just the last write time and the name of the key, but they will never catch everything and every little piece of software you might have installed.  One thing that I liked about Aftertime, though, was that you could easily put everything into the timeline and then, if a particular source such as the registry keys turned out not to be useful, simply exclude it from the timeline and focus on something else (something that can be done using awk, for instance, on an ASCII file, but perhaps not something that everyone likes to do).

Timeline analysis, links and discussion

March 22nd, 2010 kiddi 1 comment

Timeline analysis has been getting a lot of press lately.  Harlan discussed some of the sources and usability of timeline analysis in a recent blog post. And then you’ve got a few posts that describe how to create timelines, both from a live Windows machine and from registry files. Rob Lee also posted a blog about creating timelines from shadow volume copies.

log2timeline has been getting some discussion as well. Paul Bobby pointed out some bugs to me and wrote two posts on his blog. One discusses the issues of mounting the image file using Encase and accessing log2timeline from a virtual machine. The other is about an Encase script that extracts all the files that log2timeline is capable of parsing (at least most of them) to a directory, making it easy for log2timeline to parse them without the need to mount the image file.

Rob Lee also posted a blog post about super timeline creation using, among other tools, log2timeline.  I will write posts similar to Rob’s soon, I just have to complete the new version of the tool first.  The plan is to complete it, with what I hope will be some good improvements, before the SANS EU forensics summit. And speaking of the summit, the detailed agenda for the SANS EU forensics summit is up, make sure you don’t miss it.

log2timeline updated

March 6th, 2010 kiddi 5 comments

I’ve just released a new version of log2timeline, version 0.42.  The new version includes two new input modules, one for extracting timestamps from PDF metadata and another for McAfee anti-virus log files.  The new version also includes several bug fixes; the full changelog can be read here. The development focus will now be to move the tool to version 0.50, which will introduce a new design for how timestamps and related information are handled within the framework, including a shift to TLN as the standard output format. More details can be found inside the roadmap.

log2timeline will also be included in the upcoming 2.0 release of the SIFT (SANS Investigative Forensic Toolkit) workstation, which will be available soon (and yes it is based on Ubuntu now). That way people can enjoy the tool without needing to go through the installation process with all the needed dependencies.

The agenda for the upcoming SANS EU forensics summit is up. I encourage everyone who has the chance to attend this summit; there are some great talks and of course a great chance to meet some of the top experts in computer forensics in Europe.  And of course a chance to meet with me and get me to implement some feature in log2timeline that you always wished was there, but for some odd reason never sent me an e-mail to request.

Small updates

February 17th, 2010 kiddi No comments

I just recently saw a post on Slashdot about Adobe implementing private browsing in their Flash Player.  That means that when the user starts private browsing mode in their web browser, LSO files will not be stored on disk.  During the private browsing session all Flash cookies are stored only in memory, and as soon as the browser is closed they are cleared.

Why do we care about this? Well, with this change private browsing is becoming more private (or actually private), and it will make our lives as forensic investigators more difficult, since we can no longer simply examine Flash cookies to determine (at least partially) the user’s browsing history.

I just posted a blog post on the SANS forensics blog about the structure of LSO files and a quick view of how log2timeline parses them to extract timestamps. I’m not going to repeat that post on this site, so if you would like to know more about the binary format of LSO files, please read the blog post.

Recently there has been a lot of discussion about creating a standard for timeline analysis. Currently log2timeline relies upon the good old mactime format for its output (although it is possible to use several different output mechanisms), a standard that was created for filesystem timelines.  Although it works great for its original purpose, it might not be the best output mechanism when incorporating timestamp information from other sources, which is one of the reasons why this push for a new standard has come up. The structure of log2timeline will be changed soon to separate the internal structure from the emphasis on mactime and move to a more neutral approach, and perhaps change the default output mechanism to something like TLN.

With the move to a more neutral approach more logic will be moved into the output modules. That will make it easier for the description text (which every output module includes) to be more descriptive, since it no longer needs to repeat information that is already contained in the output itself (TLN, for example, includes a source field, so why repeat the source in the description field?).
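To illustrate the point about the source field, here is a small sketch that renders the same event in two ways. It assumes the five field TLN layout as it is commonly described (time|source|host|user|description, with the time as a Unix epoch); the Event type, the function names and the bracketed prefix in the mactime style name are all hypothetical and not taken from log2timeline itself.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """A neutral, output-independent event (hypothetical structure)."""
    time: int          # Unix epoch seconds
    source: str
    host: str
    user: str
    description: str


def to_tln(event: Event) -> str:
    """Render the event as a pipe-delimited TLN line (assumed layout)."""
    return "|".join(
        (str(event.time), event.source, event.host, event.user, event.description)
    )


def to_mactime_name(event: Event) -> str:
    """Build a name field for a mactime style line.

    There is no source field in that format, so in this sketch the source
    has to be folded into the descriptive text.
    """
    return f"[{event.source}] {event.description}"


event = Event(1266364800, "LSO", "host1", "alice", "Flash cookie created")
print(to_tln(event))
print(to_mactime_name(event))
```

With the logic living in the output module, the TLN version can keep the description short because the source already has its own field, while the mactime version has to carry it inside the text.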
