Archive for June, 2009

Squid Timeline analysis

June 24th, 2009 kiddi No comments

Sometimes it is useful to know when a piece of malware started communicating with the outside world, which is often done over HTTP or HTTPS.  Examining network log files can therefore help determine when the malware first contacted its C&C server.

One way to do this is to use the mactime tool from TSK (The Sleuth Kit) to read the Squid access log; the log only needs to be converted into a bodyfile first.  I wrote the script squid2timeline to do that conversion.  The usage of the script is:

squid2timeline -c CONFIG [-l] [-h HOST] [ACCESS]
 Where CONFIG refers to the configuration file of squid, usually /etc/squid/squid.conf
 The script then reads the variables needed to determine the correct format of the squid
 access file and the location of the current squid access file.
 Optional: ACCESS defines the access file to read, otherwise the current one as it is
 defined in the squid.conf file will be read.

squid2timeline [-l] [-h HOST] [-e] ACCESS
 -e Indicates that the access file is constructed using emulate_httpd_log on
 Otherwise (the default behaviour) emulate_httpd_log will be assumed to be off

 [-l] Defines a legacy timeline format as used by TSK version 1.X and 2.X,
 otherwise version 3.0+ is assumed.

 [-h HOST] defines a host name to be included in the timeline.
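
The conversion itself can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of squid2timeline: it assumes the native Squid log format (emulate_httpd_log off) and the TSK 3.x body layout (MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime). The function name and the way the name field is composed are hypothetical.

```python
def squid_line_to_body(line, host=None):
    """Turn one native-format Squid access log line into a TSK 3.x body line.

    Assumed native format:
    time elapsed client action/code size method URL ident hierarchy/peer type
    """
    fields = line.split()
    if len(fields) < 7:
        return None  # not a complete access log entry
    epoch = int(float(fields[0]))  # Squid logs seconds.milliseconds
    client, status, size, method, url = fields[2:7]
    # Hypothetical layout of the name field; the real script may differ.
    name = "[Squid] {} {} {} ({})".format(client, method, url, status)
    if host:
        name = "{}: {}".format(host, name)
    name = name.replace("|", "%7C")  # keep the field separator unambiguous
    # MD5, inode, mode, UID and GID have no meaning for a log entry, so use 0.
    # A log entry carries a single timestamp, so it fills all four time fields.
    return "0|{}|0|0|0|0|{}|{}|{}|{}|{}".format(
        name, size, epoch, epoch, epoch, epoch)
```

Writing one such line per log entry to a file gives a bodyfile that mactime can read with the -b option.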

One use of the script is to build a timeline for a single IP address that is infected, or suspected of being infected: extract its entries from the access log and run them through the script and mactime.

grep 10.1.1.1 access.log.1 > access.log_10.1
squid2timeline access.log_10.1 > body
mactime -b body -i hour summary -d > timeline.csv

The content of the file “timeline.csv” would then be a timeline in a CSV format and the file “summary” contains an hourly summary of the traffic.  If we examine the content of the summary file it looks like this:

Hourly Summary for Timeline of body

Mon Jun 22 2009 04:00:00, 835
Mon Jun 22 2009 05:00:00, 945
Mon Jun 22 2009 06:00:00, 807
Mon Jun 22 2009 07:00:00, 814
Mon Jun 22 2009 08:00:00, 810
Mon Jun 22 2009 09:00:00, 804
Mon Jun 22 2009 10:00:00, 879
Mon Jun 22 2009 11:00:00, 1680
Mon Jun 22 2009 12:00:00, 1789
Mon Jun 22 2009 13:00:00, 1023
....

An unusual spike appears in the traffic around 11:00, which could be an indication of an infection.  This can then help the analyst focus the investigation on that part of the timeline.
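
Spotting such a spike can also be automated. The following Python sketch is one possible approach, not part of squid2timeline or mactime: it reads the lines of the summary file and reports every hour whose count exceeds a multiple of the median. The function name and the factor of 1.5 are arbitrary assumptions.

```python
from statistics import median


def find_spikes(summary_lines, factor=1.5):
    """Return (hour, count) pairs whose count exceeds factor * median."""
    counts = []
    for line in summary_lines:
        # Data lines look like "Mon Jun 22 2009 11:00:00, 1680".
        stamp, sep, count = line.rpartition(",")
        if sep and count.strip().isdigit():
            counts.append((stamp.strip(), int(count)))
    if not counts:
        return []
    baseline = median(value for _, value in counts)
    return [(stamp, value) for stamp, value in counts
            if value > factor * baseline]
```

Applied to the ten hours shown above, the median is 857, so only the 11:00 and 12:00 entries (1680 and 1789) are reported.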

Categories: Forensics, Network Analysis Tags:

Office 2007 metadata

June 12th, 2009 kiddi No comments

Metadata from documents can be a great source of information for investigators, and I’ve often come across documents created in Microsoft Word or other Office applications.  There are several scripts and tools that read the proprietary binary format used by Office 2003 and earlier, and I’ve got nothing to add to those tools.  But I couldn’t find any tools that list the metadata from documents created using Office 2007, which uses the OpenXML document format.  So I decided to examine it a bit further.

Microsoft has published a good document describing the structure of OpenXML, for instance here. Essentially, a document created in the OpenXML format is a compressed file that uses the well-known ZIP format.  Inside the ZIP file is a predefined structure of files, mostly XML files that describe the document and its content.  So it can easily be read using the standard libraries available in scripting languages such as Perl.

According to Microsoft, a folder called “_rels” is created inside the ZIP archive.  This folder contains a file named “.rels” that defines the root relationships within the package.  This is the first place to look when parsing the content of the document.  Within the .rels file you find tags that define the relationships of the document:

<Relationship Id="someID" Type="relationshipType" Target="targetPart"/>

Metadata is stored in files whose relationship type is “*properties”, most notably “core-properties” and “extended-properties”. These files are usually stored in the following locations:

  • docProps/core.xml
  • docProps/app.xml

These files then contain the actual metadata information, such as document creator, last saved by information, etc. These files then need to be extracted and parsed to display the metadata information.
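
The steps above can be sketched in Python using only the standard library. This is an illustration of the approach, not the code of the Perl script introduced below; the function name is hypothetical, and it assumes the package uses the standard relationships namespace and that each properties part is a flat list of elements.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the Relationship elements in _rels/.rels
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def read_openxml_metadata(path):
    """Return {part name: {tag: text}} for every *properties part."""
    metadata = {}
    with zipfile.ZipFile(path) as package:
        rels = ET.fromstring(package.read("_rels/.rels"))
        for rel in rels.iter(RELS_NS + "Relationship"):
            # Only follow relationships such as core-properties
            # and extended-properties.
            if not rel.get("Type", "").endswith("properties"):
                continue
            target = rel.get("Target", "").lstrip("/")
            part = ET.fromstring(package.read(target))
            # Strip the XML namespace from each tag: {ns}creator -> creator
            metadata[target] = {
                child.tag.split("}")[-1]: (child.text or "")
                for child in part
            }
    return metadata
```

Calling read_openxml_metadata("test.docx") would return one dictionary per properties part, keyed by names such as docProps/core.xml.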

To do this I wrote the script read_open_xml.pl, which parses the contents of the .rels file to locate the metadata parts of the document, then extracts the metadata and prints it to the screen. An example of its usage:

./read_open_xml.pl test.docx
==========================================================================
 cmd line: ./read_open_xml.pl test.docx
==========================================================================

Document name: test.docx
Date: Tue Jun  9 16:51:23 GMT 2009

--------------------------------------------------------------------------
File Metadata
--------------------------------------------------------------------------
 title = my company template
 subject = Document template
 creator = Kristinn Gudjonsson
 keywords = template, word
 description =
 lastModifiedBy = Kristinn Gudjonsson
 revision = 3
 lastPrinted = 2008-08-15T10:14:00Z
 created = 2008-08-15T10:14:00Z
 modified = 2008-08-15T10:14:00Z
 category = template
--------------------------------------------------------------------------
Application Metadata
--------------------------------------------------------------------------
 Template = my_template.dot
 TotalTime = 0
 Pages = 2
 Words = 159
 Characters = 908
 Application = Microsoft Word 12.1.2
 DocSecurity = 0
 Lines = 7
 Paragraphs = 1
 ScaleCrop = false
 Manager = Some dude
 Company = My Company
 LinksUpToDate = false
 CharactersWithSpaces = 1115
 SharedDoc = false
 HyperlinksChanged = false
 AppVersion = 12.0258

copyright, Kristinn Gudjonsson, 2009

The script also reads the character encoding of the XML documents and encodes the output accordingly.  If you experience any problems using the script, please notify me so I can fix them, but so far I haven’t come across any OpenXML document that hasn’t been correctly parsed by this script.

Update 1

I’ve modified the script slightly so it can be used in Windows.  I’ve tested the script on a Win XP SP3 machine using ActivePerl 5.10 and it should work.  You can get the Windows version here.
Categories: Forensics Tags:

Read a unicode file

June 9th, 2009 kiddi No comments

In a recent case I came across a machine that was infected with malware.  The machine had the free AVG antivirus installed.  AVG keeps its logs in “C:\Documents and Settings\All Users\Application Data\avg8\Log”.  That folder contains several log files, all identified by the “file” command as “MPEG ADTS, layer I, v1, 160 kbits, 32 kHz, Stereo”.  This is obviously not true, so I took a short look at one of the log files:

cat avgcore.log | xxd | head -10
0000000: fffe 5b00 4100 5600 4700 3800 2e00 4300  ..[.A.V.G.8...C.
0000010: 6f00 7200 6500 5d00 2000 4900 4e00 4600  o.r.e.]. .I.N.F.
0000020: 4f00 2000 3200 3000 3000 3800 2d00 3100  O. .2.0.0.8.-.1.
0000030: 3200 2d00 3000 3400 2000 3200 3000 3a00  2.-.0.4. .2.0.:.
0000040: 3200 3700 3a00 3100 3500 2c00 3000 3300  2.7.:.1.5.,.0.3.
...

As can be seen in the above output, the file is written in Unicode (little-endian UTF-16, starting with the byte order mark ff fe).  Since the text is in English, every other byte is simply an ASCII character followed by a null byte.  So I wrote a quick Perl script to read the file for me, which can be seen here.

The usage of the script is:

    read_unicode [-l] [-h] [-o OFFSET] FILE
Where:
        -l Precede each printed line with a line number
        -h Print this help message
        -o OFFSET Defines the offset where the script starts reading the unicode text.
        This option can be used to skip a file header and read the content of the file.
        FILE this is the file in Unicode that is to be read by the script

So to read the log file in question, I could simply use

read_unicode avgcore.log
??[AVG8.Core] INFO 2008-12-04 20:27:15,031 XXX-F0C226 PID:528 THID:2772
ID:XX-XX-XX-XX-XX:YY.YY.YY MSG:'ERRORCODE:0x0', 'SIZE:0x16', 'SIZE:0x16'
[AVG8.Core] INFO 2008-12-04 20:27:15,265 XXX-F0C226 PID:528 THID:2772
ID:XX-XX-XX-XX-XX:YY.YY.YY MSG:'ERRORCODE:0x0', 'SIZE:0x16', 'SIZE:0x16'
...

Or to skip the file header

read_unicode -o 2 avgcore.log
[AVG8.Core] INFO 2008-12-04 20:27:15,031 XXX-F0C226 PID:528 THID:2772
ID:XX-XX-XX-XX-XX:YY.YY.YY MSG:'ERRORCODE:0x0', 'SIZE:0x16', 'SIZE:0x16'
[AVG8.Core] INFO 2008-12-04 20:27:15,265 XXX-F0C226 PID:528 THID:2772
ID:XX-XX-XX-XX-XX:YY.YY.YY MSG:'ERRORCODE:0x0', 'SIZE:0x16', 'SIZE:0x16'
...

Just a simple Perl script that does the job for me, at least for this case.
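
For comparison, the same idea can be sketched in Python. This is not the Perl script: it simply assumes the file is little-endian UTF-16, as the ff fe byte order mark in the hex dump suggests, and the function name is hypothetical.

```python
def read_utf16_text(path, offset=0):
    """Decode a little-endian UTF-16 file, skipping `offset` leading bytes."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        raw = handle.read()
    # "utf-16-le" does not strip a byte order mark, so with offset=0 the
    # text starts with U+FEFF; offset=2 skips the two-byte mark instead.
    return raw.decode("utf-16-le", errors="replace")
```

With offset=2 this mirrors the -o 2 example above: the byte order mark is skipped and the text begins at [AVG8.Core].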

Categories: Forensics Tags:

Windows Prefetch Directory

June 8th, 2009 kiddi No comments

The Prefetch folder in Windows contains information about software that was recently run on a Windows machine.  It can be very valuable to examine the content of the Prefetch directory (found at %WINDIR%/Prefetch, usually either C:\WINDOWS\Prefetch or C:\WINNT\Prefetch) to find clues about which software has recently been run on the system.

To use the script that I wrote to read the Prefetch directory, you first need to mount the Windows image file (see my previous post on how to mount an NTFS volume in Linux).  You can then run the script, which can be found here, like this:

read_prefetch /mnt/analyze/WINNT/Prefetch

Or you can create an HTML report like this:

read_prefetch -h /tmp/report.html /mnt/analyze/WINNT/Prefetch

An example report can be seen here:

Example report
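
Even without parsing the binary content of the prefetch files, a quick overview can be had from the directory listing alone. The Python sketch below is not read_prefetch; it relies on two assumptions: that prefetch files are named EXECUTABLE-HASH.pf with an eight-digit hexadecimal hash, and that the modification time of a file roughly reflects when the program was last run. The function name is hypothetical.

```python
import os
import re
from datetime import datetime, timezone

# Prefetch file names look like NOTEPAD.EXE-336351A9.pf
PF_NAME = re.compile(r"^(?P<exe>.+)-(?P<hash>[0-9A-F]{8})\.pf$", re.IGNORECASE)


def list_prefetch(directory):
    """Return (mtime, executable, hash) tuples, most recent first."""
    entries = []
    for name in os.listdir(directory):
        match = PF_NAME.match(name)
        if not match:
            continue  # skip Layout.ini and anything else that is not a .pf file
        mtime = os.stat(os.path.join(directory, name)).st_mtime
        entries.append((datetime.fromtimestamp(mtime, tz=timezone.utc),
                        match.group("exe"),
                        match.group("hash").upper()))
    return sorted(entries, reverse=True)
```

For example, list_prefetch("/mnt/analyze/WINNT/Prefetch") lists the most recently modified prefetch entries first, with times in UTC.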

Categories: Forensics, Windows Analysis Tags:

Restore Point Analysis

June 8th, 2009 kiddi No comments

Recently I wanted a small script that reads the content of the restore point directory (C:\System Volume Information\_restore{GUID}), parses all the rp.log files inside it, and prints a list of all the restore points, when they were taken and why.

So I wrote this script here to do that for me.  I borrowed some methods from some of Harlan Carvey’s older scripts.  A few weeks after writing the script I saw a post from Carvey about timeline analysis of the system restore points, so I decided to add support for timeline analysis.  The script is pretty similar to Carvey’s but still differs enough for me to publish it here.

The script is easy to use, but you need to mount the suspect image first.  The script was created and tested on a Linux box, so permissions on the mounted image are not a problem.  One method of mounting the image is:

mount.ntfs-3g -o ro,loop,nodev,noexec,show_sys_files /pathtoimage/image.dd /mnt/analyze

If the image is mounted at the mount point /mnt/analyze, the script can be run like this:

cd /mnt/analyze/System\ Volume\ Information/_restore....
rp_list .

The output is then something like this:

================================================================
RP    Name                Date
----------------------------------------------------------------
RP190    System Checkpoint        Thu Oct  9 00:27:28 2008
RP191    System Checkpoint        Sun Oct 12 16:41:07 2008
RP192    System Checkpoint        Mon Oct 13 21:57:47 2008
RP193    System Checkpoint        Sat Oct 18 01:40:42 2008
RP194    System Checkpoint        Sun Oct 19 10:54:00 2008
RP195    System Checkpoint        Tue Oct 21 21:40:45 2008
....

This is the default behaviour of the script.  There is also an option to get the output in the bodyfile format that can be read by TSK (The Sleuth Kit), according to the information found here.

rp_list -t .
0|Restore Point (RP190) - System Checkpoint|36534|16895|0|0|4096|1224712189|1223829667|1223829667|1223512048
0|Restore Point (RP191) - System Checkpoint|36475|16895|0|0|4096|1224009368|1223935059|1223935059|1223829667
0|Restore Point (RP192) - System Checkpoint|34410|16895|0|0|8192|1225402222|1224294042|1224294042|1223935067
0|Restore Point (RP193) - System Checkpoint|9856|16895|0|0|4096|1224414756|1224413640|1224413640|1224294042
0|Restore Point (RP194) - System Checkpoint|6961|16895|0|0|4096|1225735837|1224625236|1224625236|1224413640
0|Restore Point (RP195) - System Checkpoint|9502|16895|0|0|4096|1224799423|1224717084|1224717084|1224625245
.....
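
To make the body lines above easier to read, the following Python sketch splits one line into named fields and converts the four timestamps to UTC. It assumes the TSK 3.x field order (MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime); the function name is hypothetical and the code is not part of rp_list or TSK.

```python
from datetime import datetime, timezone

# Field order of a TSK 3.x body file line
BODY_FIELDS = ("md5", "name", "inode", "mode", "uid", "gid", "size",
               "atime", "mtime", "ctime", "crtime")


def parse_body_line(line):
    """Split a body line into a dict and convert its timestamps to UTC."""
    values = line.rstrip("\n").split("|")
    if len(values) != len(BODY_FIELDS):
        raise ValueError("expected %d fields, got %d"
                         % (len(BODY_FIELDS), len(values)))
    entry = dict(zip(BODY_FIELDS, values))
    for key in ("atime", "mtime", "ctime", "crtime"):
        # The timestamps are stored as seconds since the Unix epoch.
        entry[key] = datetime.fromtimestamp(int(entry[key]), tz=timezone.utc)
    return entry
```

For the RP190 entry, the last field (1223512048) converts to 2008-10-09 00:27:28 UTC, which matches the date shown for RP190 in the default listing.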

This format can then easily be read using the tool mactime from TSK.

rp_list -t . > /tmp/rp.body
mactime -b /tmp/rp.body
Thu Oct 09 2008 00:27:28     4096 ...b 16895 0        0        36534    Restore Point (RP190) - System Checkpoint
Sun Oct 12 2008 16:41:07     4096 ...b 16895 0        0        36475    Restore Point (RP191) - System Checkpoint
                             4096 m.c. 16895 0        0        36534    Restore Point (RP190) - System Checkpoint
Mon Oct 13 2008 21:57:39     4096 m.c. 16895 0        0        36475    Restore Point (RP191) - System Checkpoint
Mon Oct 13 2008 21:57:47     8192 ...b 16895 0        0        34410    Restore Point (RP192) - System Checkpoint
Tue Oct 14 2008 18:36:08     4096 .a.. 16895 0        0        36475    Restore Point (RP191) - System Checkpoint
Sat Oct 18 2008 01:40:42     8192 m.c. 16895 0        0        34410    Restore Point (RP192) - System Checkpoint
....

It is also possible to pass the -h HOST parameter to the script to include a host name in the timeline.  The timeline is formatted according to the specifications of TSK 3.0+, but the listing can also be produced in the format used by older TSK versions with the -l (legacy) switch.

Categories: Forensics, Windows Analysis Tags:

Firefox web history

June 5th, 2009 kiddi No comments

Update 1:

This blog post has been superseded by a new one, which has since been replaced as well.

Update 2:

Updated the blog post again, this time here.

One thing I noticed is that most of the tools, at least the ones that I found, don’t seem to read the Firefox 3 web history and display it in a human-readable format (there is one that I found, on firefoxforensics.com, but that is a Windows tool).  So I decided to write a small bash script (I will most likely rewrite it in Perl later) that reads the places.sqlite file and displays it in a browser (w3m or lynx) in an easy-to-read format.

Firefox 3 stores all of its history in a file called places.sqlite, which is an SQLite 3 database.  The schema of the table that holds the visited URLs (moz_places) is:

  • id INTEGER PRIMARY KEY, an integer that serves as the primary key of the table, of no real interest
  • url LONGVARCHAR, the URL that has been visited, including the protocol used; something that one likes to examine
  • title LONGVARCHAR, the title of the page as it appears in the browser
  • rev_host LONGVARCHAR, the reverse of the host name that was visited, used to ease searching and querying for hosts in the history file
  • visit_count INTEGER DEFAULT 0, as the name implies, a counter of visits to the site
  • hidden INTEGER DEFAULT 0 NOT NULL, either 0 or 1; if the URL is hidden then the user did not navigate directly to it, which usually indicates an embedded page using something like an iframe
  • typed INTEGER DEFAULT 0 NOT NULL, indicates whether the user typed the URL directly into the location bar
  • favicon_id INTEGER, a relationship to another table containing the favicon
  • frecency INTEGER DEFAULT -1 NOT NULL, a combination of frequency and recency, used to calculate which sites appear at the top of the suggestion list when URLs are typed in the address bar

The script can be downloaded here and is very simple to use. It is best to put it somewhere in your PATH and run it like this:

read_firefox_history DATABASE

Where DATABASE is replaced with the file name of the SQLite database containing the web history, e.g. places.sqlite.

The small script then reads the places.sqlite file and parses it to display it in an easy-to-read format.
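
The query behind such a listing can be sketched in Python with the standard sqlite3 module. This is not the bash script: the function name and the choice of sort order are assumptions, and only columns from the schema above are used.

```python
import sqlite3


def read_places(database):
    """Return (url, title, visit_count, typed, hidden) rows, most visited first."""
    query = ("SELECT url, title, visit_count, typed, hidden "
             "FROM moz_places "
             "ORDER BY visit_count DESC, url")
    connection = sqlite3.connect(database)
    try:
        return connection.execute(query).fetchall()
    finally:
        connection.close()
```

It is safest to run this against a copy of places.sqlite, since Firefox may hold a lock on the file while it is running.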

Categories: Forensics Tags: