Archive for the ‘Windows Analysis’ Category

timescanner and IE history

April 26th, 2010 No comments

There has been some discussion lately about a limitation in timescanner when it reads timestamps from the various index.dat files.  Windows does not stick to good old UTC for these files; the time zone used for a timestamp depends on the location of the index.dat.  For instance, the history files (the index.dat stored in the History.IE5 folder) use local timestamps, while the daily and weekly history files have timestamps that are stored using both UTC and the time zone of the machine in question.  All of these timestamps are still stored in the Windows FILETIME format.

So as a quick fix to the current release (and nightly build) I’ve just excluded the daily and weekly files from timescanner.  The coming 0.50 release (I keep mentioning that) includes a more intelligent scanner, which takes the location of the file in question into consideration and applies the appropriate settings to the timestamps.  Timestamps whose meaning depends on the location of the index.dat file will therefore get the correct meaning and description in version 0.50.
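As a rough illustration of what location-aware handling involves, here is a minimal Python sketch; it is not how timescanner does it.  It converts a FILETIME value to UTC and, for values that were written in local time, removes the machine’s UTC offset.  The function name, the stored_as_local flag and the offset parameter are assumptions made for this example, and the caller has to decide from the index.dat location which setting applies.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals since 1601-01-01.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(filetime, stored_as_local=False, utc_offset_hours=0.0):
    """Convert a FILETIME integer to an aware UTC datetime.

    stored_as_local and utc_offset_hours are hypothetical knobs: set them
    when the value was written in the machine's local time.
    """
    moment = FILETIME_EPOCH + timedelta(microseconds=filetime // 10)
    if stored_as_local:
        # Local time = UTC + offset, so subtract the offset to get UTC.
        moment -= timedelta(hours=utc_offset_hours)
    return moment


# Sanity check: this FILETIME value is the Unix epoch.
print(filetime_to_utc(116444736000000000))
```

The same raw value gives two different answers depending on the flag, which is exactly why the location of the file has to be known before the timestamp is interpreted.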

I will provide examples and more details about this new feature when I release the tool.

Malware analysis

November 19th, 2009 No comments

I decided to do some malware analysis as part of a presentation I had to give.  Since I went through the process anyway, I decided to post it here in case anyone is interested.

To begin with, I needed to find some malware to analyze.  A great place to find live links to active malware is the site http://www.malwaredomainlist.com/mdl.php.

What I wanted to show was that a fully patched machine with a fully updated AV is not always enough to protect you.  One way to do that is to find a PDF or Flash exploit.  The one that I chose for this experiment was this one:

PDF exploit to be used

First things first: download the malware sample and do some static analysis on it.  I started by running pdfid.py from Didier Stevens to get an overview of the PDF document:

pdfid.py dhkn.pdf
PDFiD 0.0.9 dhkn.pdf
 PDF Header: %PDF-1.4
 obj                    9
 endobj                 9
 stream                 2
 endstream              2
 xref                   0
 trailer                1
 startxref              1
 /Page                  0
 /Encrypt               0
 /ObjStm                0
 /JS                    1
 /JavaScript            2
 /AA                    0
 /OpenAction            0
 /AcroForm              0
 /JBIG2Decode           0
 /RichMedia             0
 /Colors > 2^24         0

By looking at this we can see that there is JavaScript code in the document, which is commonly used to exploit Adobe Reader, and that the document contains some streams.  Now we need to take a closer look at the source code of the document. This can be done with any text editor, such as vim, or simply with less (or cat).
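To get a feel for what such a keyword count does, here is a much simplified Python sketch.  It is not pdfid.py: it only counts a handful of names in the raw bytes and does not handle names written with #xx escapes.  The keyword list and the function name are assumptions for this example.

```python
import re

# A small subset of the names that appear in the report above.
KEYWORDS = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA", b"/AcroForm"]


def count_keywords(data):
    """Count how often each PDF name occurs in the raw bytes of a file."""
    counts = {}
    for keyword in KEYWORDS:
        # The lookahead keeps a short name from matching inside a longer one.
        pattern = re.escape(keyword) + rb"(?![A-Za-z0-9])"
        counts[keyword.decode("ascii")] = len(re.findall(pattern, data))
    return counts


# Inline sample instead of a real file, to keep the sketch self-contained.
sample = b"<< /JS 7 0 R /S /JavaScript >> << /JavaScript 8 0 R >>"
print(count_keywords(sample))
```

For a real document the bytes would come from reading the PDF in binary mode.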

If we examine the document we don’t see any text part, just a stream that contains JavaScript:

...
endstream
endobj
6 0 obj
<</CS /DeviceRGB /S /Transparency >>
endobj
7 0 obj
<</Length 76450 /Filter [/ASCIIHexDecode]>>
stream
7661722066686e7075783d27273b6567686a76783d22223b616567777a3d35303431323b6465757....
...

This stream can be easily decoded.  The filter that is used is a simple ASCIIHexDecode, so I simply copy the stream:

grep ^stream dhkn.pdf -A 2  > stream

I then edited the file and deleted the lines that did not belong to the stream itself.  Since the file now contained only the hex code of the stream, I could decode it with a simple Perl script:

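#!/usr/bin/perl
# Decode a file of hex digits (an ASCIIHexDecode stream) into FILE.txt

use strict;
use warnings;

my $file = shift;

die( 'Wrong usage: ' . $0 . ' FILE' ) unless defined $file and -e $file;

open( my $in, '<', $file ) or die( 'Unable to read file ' . $file );
open( my $out, '>', $file . '.txt' ) or die( 'Unable to write file ' . $file . '.txt' );

while( my $line = <$in> )
{
 print "Processing line\n";
 # keep only hex digits, so that newlines and stray characters
 # are not packed into the output as garbage bytes
 $line =~ s/[^0-9a-fA-F]//g;

 # turn every pair of hex digits into one byte
 print $out pack( 'H*', $line );
}

close( $in );
close( $out );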

I run the script like this:

./conv.pl stream

The content of the text file stream.txt looks something like this

var fhnpux='';eghjvx="";aegwz=50412;deuv="";var fiqy='',lmuz=false,ekrt=true,gpqr=false,
hnrty='',hloqr=0,dimtz=String,afioxy=dimtz['fkrEoAmkCBhkaBrRCAoAdkeE'.replace(/[EABkR]/g,'')],
gilmq=String,abdmv=gilmq['eBvBaLlI'.replace(/[ILABJ]/g,'')],ikmqw="61",begily="",aefmos=[67,59,
63,151,171,159,153,165,159,160,164,81,156,154,174,144,159,165,94,170,151,163,169,161,98,81,162,...
...

This is obviously obfuscated JavaScript, so we need to dig a little deeper.  To make it easier to read, a simple substitution is done:

cat stream.txt | sed -e 's/;/;\n/g' > stream.js

This makes the code a little bit easier to read.  To make it even easier, vim is used to edit the file and add spaces and new lines where needed.  Two things in this code are interesting (or at least they are the two most interesting things that pop up).  The first is the function close to the end of the file:

lquv=function()
{
      for(var fjpu;hloqr<aefmos.length;hloqr+=1)
      {
                var fjqu=hloqr%ikmqw.length+1;
                var dorvy=ikmqw.substring(hloqr%ikmqw.length,fjqu);
                var blrwy=aefmos[hloqr];
                begily+=afioxy(blrwy-dorvy.charCodeAt(0));
        }
        abdmv(begily);
};
lquv();

We see a function called “lquv”, which seems to take care of decoding the obfuscated code.  At the end, a function called “abdmv” is called with the parameter begily, which is the variable that holds the decoded JavaScript.  The function “abdmv” is defined earlier in the code:

gilmq=String
abdmv=gilmq['eBvBaLlI'.replace(/[ILABJ]/g,'')]

This is a very simple obfuscation.  We see that “gilmq” is set to String, and “abdmv” then becomes (once we complete the simple substitution)

gilmq['eval']
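The substitution can be reproduced outside JavaScript as well.  The short Python sketch below removes the padding characters from the two padded strings that appear in the stream above; the helper name is made up for this example.

```python
import re


def strip_padding(padded, junk_chars):
    """Remove every character listed in junk_chars (case-sensitive)."""
    return re.sub("[" + re.escape(junk_chars) + "]", "", padded)


# The two padded names and their character classes, copied from the stream.
print(strip_padding("eBvBaLlI", "ILABJ"))                   # eval
print(strip_padding("fkrEoAmkCBhkaBrRCAoAdkeE", "EABkR"))   # fromCharCode
```

So “afioxy” resolves to String.fromCharCode in the same way that “abdmv” resolves to eval.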

So when the function calls “abdmv(begily)” it evaluates, that is executes, the code that is stored in the variable begily.  We therefore need to know what is inside the variable “begily”.  The basis for “begily” resides in the variable “aefmos” (the second interesting thing we found), which is defined as:

aefmos=[67,59,6....

The easiest way (or at least an easy method) to decode this string is to turn stream.js into an HTML document and open it in a browser or another JavaScript interpreter.

We add to the top of the document

<html>
<head><title>TESTING</title></head>
<body>
<script>

And at the bottom

</script>
</body>
</html>

We then modify the JavaScript itself.  First of all the document ends with a ?, which we delete.  Then we modify the function "lquv" so that it prints the JavaScript instead of evaluating it.

lquv=function()
{
 for(var fjpu; hloqr<aefmos.length;hloqr+=1)
 {
 var fjqu=hloqr%ikmqw.length+1;
 var dorvy=ikmqw.substring(hloqr%ikmqw.length,fjqu);
 var blrwy=aefmos[hloqr];
 begily+=afioxy(blrwy-dorvy.charCodeAt(0));
 }
 //abdmv(begily);
 document.write(begily);
};

The change that I made is at the end of the function: the call to abdmv is commented out and a document.write call is added instead.  I then open this document in a sandboxed environment to get the variable begily decoded.  The decoded script looks like this:

function fix_it(yarsp, len)
{
 while (yarsp.length * 2 < len)
 {
 yarsp += yarsp;
 }

 yarsp = yarsp.substring(0, len/2);
...
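The decoding loop can also be reproduced without letting the sample run at all.  The Python sketch below mirrors what lquv does: each number in aefmos has the character code of the matching key character subtracted, with the key “61” (the value of ikmqw) repeating.  Only the values visible in the listing earlier are used here, and the function name is an assumption.

```python
def decode(values, key):
    """Subtract the repeating key's character codes from each value."""
    chars = []
    for index, value in enumerate(values):
        key_char = key[index % len(key)]
        chars.append(chr(value - ord(key_char)))
    return "".join(chars)


# Only the start of the array as shown above; the real one is far longer.
aefmos_prefix = [67, 59, 63, 151, 171, 159, 153, 165, 159, 160, 164, 81,
                 156, 154, 174, 144, 159, 165, 94, 170, 151, 163, 169, 161,
                 98, 81, 162]

print(repr(decode(aefmos_prefix, "61")))
# '\r\n\tfunction fix_it(yarsp, l'
```

The result matches the start of the decoded script, which is a useful cross-check on the browser approach.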

Now we have the true JavaScript code that is to be run on the machine.  It contains several functions, some of which use the magic variable names “shellcode” or “payload”, which is usually considered an indication of malware (if the obfuscation isn’t indication enough).  Near the end of the code we see this:

function pdf_start()
{
 var version = app.viewerVersion.toString();
 version = version.replace(/\D/g,'');
 var varsion_array = new Array(version.charAt(0), version.charAt(1), \
version.charAt(2));

 if ((varsion_array[0] == 8 ) && (varsion_array[1] == 0) || \
(varsion_array[1] == 1 && varsion_array[2] < 3))
 {
 util_printf();
 }

 if ((varsion_array[0] < 8 ) || (varsion_array[0] == 8 && \
varsion_array[1] < 2 && varsion_array[2] < 2))
 {
 collab_email();
 }

 if ((varsion_array[0] < 9 ) || (varsion_array[0] == 9 && \
varsion_array[1] < 1))
 {
 collab_geticon();
 }
}

pdf_start();

This function is called at the end.  It begins by getting the Adobe Reader version and then goes through a series of if statements to determine which exploits to use based on the version number.  The exploits are used against the following Adobe Reader versions:

  • util_printf: 8.0, or 8.1.0 to 8.1.2 (the second half of the test does not check the major version, so any x.1.0 to x.1.2 matches)
  • collab_email: versions older than 8.0, or 8.0.0, 8.0.1, 8.1.0 and 8.1.1
  • collab_geticon: versions older than 9.0, or 9.0
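The version tests are easier to reason about when they can be tried against a few version numbers.  The Python sketch below mirrors the three if statements in pdf_start(); it is a re-implementation for illustration, the function name is an assumption, and missing digits are padded with zeros because JavaScript treats the empty string returned by charAt() as equal to 0.

```python
import re


def exploits_for(viewer_version):
    """Return the exploit functions pdf_start() would call for a version."""
    digits = re.sub(r"\D", "", str(viewer_version))
    # Only the first three digits are looked at, as in the original code.
    major, minor, patch = (int(d) for d in (digits + "000")[:3])

    selected = []
    if (major == 8 and minor == 0) or (minor == 1 and patch < 3):
        selected.append("util_printf")
    if major < 8 or (major == 8 and minor < 2 and patch < 2):
        selected.append("collab_email")
    if major < 9 or (major == 9 and minor < 1):
        selected.append("collab_geticon")
    return selected


for version in ("7.0", "8.1.2", "9.0"):
    print(version, exploits_for(version))
```

Note that several exploits can fire for the same version, since the three tests are independent of each other.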

Different exploits are run depending on which of the criteria listed above is met.  If we examine the payloads or shellcodes, we see that they are encoded in the format produced by the JavaScript function “escape” and are all similar to the one listed below:

var payload = unescape("%uEBE9%u0001%u5600%uA164%u0030%u0....
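It helps to see what these escapes mean.  Each %uXXXX stands for one 16-bit value, and when the string ends up in memory the low byte comes first (little-endian), so %uEBE9 becomes the bytes E9 EB.  The Python sketch below illustrates just that byte order using the first few escapes shown above; the function name is made up for the example.

```python
import re
import struct


def decode_percent_u(escaped):
    """Turn %uXXXX escapes into bytes, low byte first (little-endian)."""
    words = re.findall(r"%u([0-9a-fA-F]{4})", escaped)
    return b"".join(struct.pack("<H", int(word, 16)) for word in words)


sample = "%uEBE9%u0001%u5600%uA164%u0030"
print(decode_percent_u(sample).hex())  # e9eb0100005664a13000
```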

To further analyse this malware the payload has to be examined.  So we copy the payload to a file and create a simple Perl script to change the JavaScript to binary:

#!/usr/bin/perl
# Convert a JavaScript %uXXXX escaped payload into the raw bytes it encodes
use strict;
use warnings;

my $file = shift;

die('file does not exist') unless defined $file and -e $file;

open( my $in, '<', $file ) or die( 'Unable to open file: ' . $file );
open( my $out, '>', $file . '.dat' ) or die( 'Unable to write file: ' . $file . '.dat' );
binmode $out;

# read all lines
while( my $line = <$in> )
{
 # every %uXXXX escape holds one 16-bit value; anything else
 # on the line (quotes, "unescape(", newlines) is ignored
 my @chars = $line =~ /%u([0-9a-fA-F]{4})/g;
 print "Processing line..\n";
 print "LINE CONSISTS OF " . scalar( @chars ) . " CHARS\n";

 foreach my $char ( @chars )
 {
  # write each value as a little-endian 16-bit word
  print $out pack( 'v', hex( $char ) );
 }
}

close( $out );
close( $in );

So to run the script

./decode_shell shell

Now I’ve got a binary document, called shell.dat which can be easily analysed using strings

strings -a -t x shell.dat
36 QQSVW`
65 B`;U
1a8 PhhC
1d5 PSSSSSS
1f3 QQSVWjB
209 a.ex
229 YYt9
243 YYt
 W
264 YYFF;
274 QSf`
2a1 t
 @8
2ba http://style-boards.com/forum/bmosz2.exe
2e3 http://style-boards.com/forum/click.php?r=

We see that this particular shellcode (the one that is used to exploit version 9.0) simply downloads more malware to the machine.  Two files are fetched, both of which had been removed from the server in question by the time of analysis, so further analysis wasn’t possible.
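When the strings tool is not at hand, the same step can be approximated in a few lines of Python.  This is only a sketch: the function names are made up for the example, only printable ASCII is considered, and the sample bytes at the bottom are synthetic rather than taken from shell.dat.

```python
import re


def ascii_strings(data, min_len=4):
    """Yield (offset, text) for runs of printable ASCII in the data."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    for match in re.finditer(pattern, data):
        yield match.start(), match.group().decode("ascii")


def find_urls(data):
    """Return (hex offset, URL) pairs for http:// strings in the data."""
    return [(format(match.start(), "x"), match.group().decode("ascii"))
            for match in re.finditer(rb"http://[\x21-\x7e]+", data)]


# Synthetic stand-in for a decoded payload: padding, a URL, a terminator.
blob = b"\x00" * 0x10 + b"http://example.invalid/a.exe" + b"\x00"
print(find_urls(blob))
```

For the real payload the bytes would be read from shell.dat in binary mode.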

To test the other payloads, we examine the one that exploits the util_printf vulnerability.

./decode_shell util_shell
Processing line..
LINE CONSISTS OF 391 CHARS
strings -a -t x util_shell.dat
...
209 a.ex
...
2ba http://style-boards.com/forum/cdruz2.exe
2e3 http://style-boards.com/forum/click.php?r=

And the collab_email exploit:

./decode_shell collab_email_shell
Processing line..
LINE CONSISTS OF 392 CHARS
strings -a -t x collab_email_shell.dat
...
209 a.ex
...
2ba http://style-boards.com/forum/fkntuw2.exe
2e4 http://style-boards.com/forum/click.php?r=

We can see that each of the exploits downloads two files.  The file that comes with “click.php?r=” seems to be the same one for each of them, while the second executable has a different name in each case: fkntuw2.exe, cdruz2.exe and bmosz2.exe.

I was unable to analyze the executables further since they had all been removed from the server by the time I tried to download them (the server returned a 404 error).  The PDF document, however, still remained on the server the last time I checked.

This concludes the static analysis of the code.  I also did a live dynamic analysis of the malware that I might share at a later time, but for now the static analysis will have to do.

log2timeline, artifact timeline analysis – Part I

August 1st, 2009 2 comments

Update 1

Updated one command (following a comment) and updated the text regarding the availability of comparable tools, in line with a post that I just published on the SANS forensic blog.

 

Timeline analysis can be extremely useful during any investigation.  Although a traditional file system timeline can be very helpful, it sometimes misses important events that are stored inside files.  These events might be crucial to the investigation, or at least provide a better view of the events that really occurred on the suspect system.  So to get the big picture, or a complete and accurate description, we need to dig deeper and incorporate information found inside artifacts or log files into our timeline analysis.  These artifacts or log files could reside on the suspect system itself or on another device, such as a firewall or a proxy (or any other device that logs information that might be relevant to the investigation).

Unfortunately there are few tools out there that can parse the various artifacts found on different operating systems and produce body files to include with the traditional filesystem analysis.  Harlan Carvey has been working on some scripts for the Windows platform to accomplish this, such as regtime.pl, which creates a body file from the registry.  Usually these tools are built specifically to parse a file or artifact of one particular format (such as a tool that only produces a body file from restore points).  I’ve released some tools like that, as have H. Carvey and others.  I know of one attempt to create a framework that correlates different artifacts into a timeline, a project called Ex-Tip by Mike Cloppert.  There is a GCFA gold paper describing the framework.  This project was started in May 2008, but hasn’t been maintained since then.

Instead of extending that project I decided to start my own: a tool that can correlate information found inside different log files and artifacts with the traditional timeline analysis.  I wanted to be able to integrate this tool easily with existing tools that deal with timeline analysis, so I chose to output all timelines in the mactime body format, to be used with the tool mactime from TSK (The SleuthKit).  This tool is called log2timeline and already supports incorporating seven different artifacts into the timeline.

In other words, this tool has been created to use artifacts and log files found on suspect systems (and others) in a timeline analysis, to help the investigator see the “big picture” more easily; that is, to build a more accurate timeline of which events occurred, when, and in which order.  Such a tool needs a wide range of support for different log files and artifacts to be useful for investigators.  Yet despite it only being capable of parsing seven artifacts today, I would like to publish my first beta version of the tool for people to download and try out.  The current version of the tool parses the following artifacts:

  • Prefetch directory (reads the content of the directory and parses files found inside)
  • UserAssist key info (reads the NTUSER.DAT user registry file to parse the content of UserAssist keys)
  • Squid access logs (with emulate_httpd_log off)
  • Restore points (reads the content of the directory and parses rp.log file inside each restore point)
  • Windows shortcut files (LNK)
  • Firefox history (for version 3.+)
  • Windows Recycle Bin (INFO2)

This is not nearly enough support for different artifacts, but at least it is a start.  Future versions will support at least:

  • Event Logs
  • Index.dat files (IE History)
  • Firefox history (versions older than 3)
  • ISA text export
  • Squid access logs (with emulate_httpd_log on)
  • Cisco ACL entries
  • Linux syslog
  • pcap dump files
  • Mac OS X artifacts
  • Other Linux artifacts
  • Opera and Safari history files

Ideas about new artifacts, or even contribution to the tool are greatly appreciated. The tool can be downloaded from here and the man page is accessible here.

One example of the usage is the following scenario.  A user has opened CMD.EXE and run the command ipconfig.  To show that the user in question was the one who actually ran the command, we start by taking a traditional timeline using TSK (in this instance the machine was booted into HELIX to create the timeline):

fls -m C: -r /dev/sda1 > /tmp/bodyfile
ils -m /dev/sda1 >> /tmp/bodyfile

Then mount the drive, for instance by issuing this command:

mkdir /mnt/analyze
mount.ntfs-3g -o ro,nodev,noexec,show_sys_files /dev/sda1 /mnt/analyze

Now the suspect drive is mounted read-only, so we can inspect some of the artifacts found on the system.

cd /mnt/analyze/WINDOWS/Prefetch
log2timeline -f prefetch . >> /tmp/bodyfile

We start by navigating to the Prefetch directory, which stores information about recently started programs (created to speed up boot time of those processes) and run the tool against the Prefetch directory.  The output is then stored in the same bodyfile as the traditional file system timeline.  Then we navigate to the user that we are taking a closer look at to examine the UserAssist (stores information about recently run processes by that user) part of the user’s registry.

cd /mnt/analyze/Documents\ and\ Settings/USER
log2timeline -f userassist NTUSER.DAT >> /tmp/bodyfile

Now we have incorporated information from that particular user’s registry into the bodyfile.  Let’s examine the timeline a little more closely by using the tool mactime from TSK to create it:

mactime -b /tmp/bodyfile > /tmp/timeline

We can then see part of the output below:

User running CMD and ipconfig

If we examine the timeline we can now see that on Sunday July 19th at 14:25:46 the user USER ran CMD.EXE, as recorded in the UserAssist part of that particular user’s registry file.  A few seconds later, at 14:25:50, IPCONFIG.EXE was accessed according to the traditional timeline.  We then see that a .pf file (inside the Prefetch directory) is created at 14:25:53.  According to the Prefetch file the command has been executed six times, and the last time it was executed was at 14:25:50, so we know that the update of the access time did not come from someone merely opening the file or otherwise modifying the access time; it really was executed.

Other examples of usage include reading LNK files to bring the information found there into the timeline.  Take for instance all the shortcuts found inside the folder “C:\Documents and Settings\USER\Recent”, which stores information about documents recently opened by that particular user.  We can read the content of that directory and include it in our timeline, for instance by issuing these commands:

cd /mnt/analyze/Documents\ and\ Settings/USER/Recent
ls -b *.lnk | xargs -n1 log2timeline -f win_link >> /tmp/bodyfile

We then recreate the timeline and examine the document “Not to be seen document.txt”, which is a document that this particular user should not have read.

Timeline Analysis

If we examine the timeline above we see that at 20:23:22 on July 31st the Prefetch file NOTEPAD.EXE-336351A9.pf is created, suggesting that NOTEPAD.EXE has been opened.  Then at 20:23:27 both the M (modified) and A (access) timestamps have been updated (these timestamps are found inside the shortcut file itself), suggesting that the file was opened at that time, most probably using NOTEPAD.EXE.  The reason why we don’t see NOTEPAD.EXE in the Prefetch timeline at that point is that it was run again later, at 20:23:49 (which is the last time it was used).  The shortcut file itself was created at 20:23:38, which is after the document had been opened according to the information found inside the LNK file.

This shows that it is important to also examine the artifacts found on suspect systems and include them in the timeline analysis.
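The M, A and creation timestamps that come out of the LNK files in the example above sit at fixed offsets in the 76-byte header of a shortcut, as documented in Microsoft’s shell link format specification.  The Python sketch below reads just those three values.  It is an illustration and not the parser used by log2timeline; the function name is an assumption.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def lnk_header_times(header):
    """Return the target's creation, access and write times from a .lnk header."""
    # The header is 76 bytes long and starts with its own size (0x4C).
    if len(header) < 76 or struct.unpack_from("<I", header, 0)[0] != 0x4C:
        raise ValueError("not a shell link header")
    # Three consecutive FILETIME values start at offset 0x1C.
    created, accessed, written = struct.unpack_from("<QQQ", header, 0x1C)

    def to_utc(filetime):
        return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

    return {"created": to_utc(created),
            "accessed": to_utc(accessed),
            "modified": to_utc(written)}


# Usage idea (path is hypothetical):
#   with open("example.lnk", "rb") as handle:
#       print(lnk_header_times(handle.read(76)))
```

These values describe the file the shortcut points to, which is why they can differ from the file system timestamps of the .lnk file itself.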

Windows Prefetch Directory

June 8th, 2009 2 comments

The Prefetch folder in Windows contains information about software that was recently run on the machine.  It can be very valuable to examine the content of the Prefetch directory (found at %WINDIR%\Prefetch, usually either C:\WINDOWS\Prefetch or C:\WINNT\Prefetch) to find clues about which software has recently been run on the system.

To use the script that I wrote, you first need to mount the Windows image file (see my previous post on how to mount an NTFS volume in Linux).  Then you can run the script, which can be found here, like this:

read_prefetch /mnt/analyze/WINNT/Prefetch

Or you can create an HTML report like this:

read_prefetch -h /tmp/report.html /mnt/analyze/WINNT/Prefetch

An example report can be seen here:

Example report
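Even without parsing the files, the names in the Prefetch directory already carry information: each .pf file is named after the executable followed by a hash of its path, as in NOTEPAD.EXE-336351A9.pf.  The Python sketch below lists those two parts together with the file’s modification time.  It is a minimal illustration and not the read_prefetch script; the function name is an assumption, and the data stored inside the .pf files is not touched.

```python
import os
import re
from datetime import datetime, timezone

# EXECUTABLE-HASH.pf, where HASH is eight hex digits.
PF_NAME = re.compile(r"^(?P<exe>.+)-(?P<hash>[0-9A-Fa-f]{8})\.pf$", re.IGNORECASE)


def list_prefetch(directory):
    """Return (executable, path hash, file mtime in UTC) for each .pf file."""
    entries = []
    for name in sorted(os.listdir(directory)):
        match = PF_NAME.match(name)
        if not match:
            continue  # skip anything that is not named like a prefetch file
        mtime = os.path.getmtime(os.path.join(directory, name))
        entries.append((match.group("exe"), match.group("hash"),
                        datetime.fromtimestamp(mtime, tz=timezone.utc)))
    return entries


# Usage idea: list_prefetch("/mnt/analyze/WINNT/Prefetch")
```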

Categories: Forensics, Windows Analysis Tags:

Restore Point Analysis

June 8th, 2009 No comments

Recently I wanted a small script that reads the content of the restore point directory (C:\System Volume Information\_restore{GUID}), parses all the rp.log files inside it and prints out a list of all the restore points, when they were taken and what the reason was.

So I wrote this script here to do that for me.  I borrowed some methods from some of the older scripts by Harlan Carvey.  A few weeks after writing the script I saw a post from Carvey about timeline analysis of system restore points, so I decided to add support for timeline analysis.  The script that I have written is pretty similar to Carvey’s, but still differs enough for me to publish it here.

The script is easy to use, but you still need to mount the suspect image first.  It was created and tested on a Linux box, so permissions on the mounted image are not a problem.  One method of mounting the image is:

mount.ntfs-3g -o ro,loop,nodev,noexec,show_sys_files /pathtoimage/image.dd /mnt/analyze

If the image is mounted at the mount point /mnt/analyze the script can be easily run like this

cd /mnt/analyze/System\ Volume\ Information/_restore....
rp_list .

The output is then something like this:

================================================================
RP    Name                Date
----------------------------------------------------------------
RP190    System Checkpoint        Thu Oct  9 00:27:28 2008
RP191    System Checkpoint        Sun Oct 12 16:41:07 2008
RP192    System Checkpoint        Mon Oct 13 21:57:47 2008
RP193    System Checkpoint        Sat Oct 18 01:40:42 2008
RP194    System Checkpoint        Sun Oct 19 10:54:00 2008
RP195    System Checkpoint        Tue Oct 21 21:40:45 2008
....

This is the default behaviour of the script.  There is an option to get the output in a format that can be easily imported into a bodyfile that can be read by TSK (the Sleuthkit) according to the information found here.

rp_list -t .
0|Restore Point (RP190) - System Checkpoint|36534|16895|0|0|4096|1224712189|1223829667|1223829667|1223512048
0|Restore Point (RP191) - System Checkpoint|36475|16895|0|0|4096|1224009368|1223935059|1223935059|1223829667
0|Restore Point (RP192) - System Checkpoint|34410|16895|0|0|8192|1225402222|1224294042|1224294042|1223935067
0|Restore Point (RP193) - System Checkpoint|9856|16895|0|0|4096|1224414756|1224413640|1224413640|1224294042
0|Restore Point (RP194) - System Checkpoint|6961|16895|0|0|4096|1225735837|1224625236|1224625236|1224413640
0|Restore Point (RP195) - System Checkpoint|9502|16895|0|0|4096|1224799423|1224717084|1224717084|1224625245
.....
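Each record in this format is a single line of eleven fields separated by “|”, with the four timestamps at the end stored as seconds since the Unix epoch.  The Python sketch below splits one of the records above into named fields; the field names follow the TSK 3.x body file layout, while the function name is an assumption and names that themselves contain “|” are not handled.

```python
from datetime import datetime, timezone

# Field order of the TSK 3.x body file format.
FIELDS = ["md5", "name", "inode", "mode", "uid", "gid", "size",
          "atime", "mtime", "ctime", "crtime"]


def parse_body_line(line):
    """Split one body file line into a dict; times become UTC datetimes."""
    values = line.rstrip("\n").split("|")
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    record = dict(zip(FIELDS, values))
    for key in ("atime", "mtime", "ctime", "crtime"):
        record[key] = datetime.fromtimestamp(int(record[key]), tz=timezone.utc)
    return record


entry = parse_body_line(
    "0|Restore Point (RP190) - System Checkpoint|36534|16895|0|0|4096|"
    "1224712189|1223829667|1223829667|1223512048")
print(entry["crtime"])  # 2008-10-09 00:27:28+00:00
```

The creation time of RP190 comes out as the same moment that the default listing reports for that restore point.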

This format can then easily be read using the tool mactime from TSK.

rp_list -t . > /tmp/rp.body
mactime -b /tmp/rp.body
Thu Oct 09 2008 00:27:28     4096 ...b 16895 0        0        36534    Restore Point (RP190) - System Checkpoint
Sun Oct 12 2008 16:41:07     4096 ...b 16895 0        0        36475    Restore Point (RP191) - System Checkpoint
                             4096 m.c. 16895 0        0        36534    Restore Point (RP190) - System Checkpoint
Mon Oct 13 2008 21:57:39     4096 m.c. 16895 0        0        36475    Restore Point (RP191) - System Checkpoint
Mon Oct 13 2008 21:57:47     8192 ...b 16895 0        0        34410    Restore Point (RP192) - System Checkpoint
Tue Oct 14 2008 18:36:08     4096 .a.. 16895 0        0        36475    Restore Point (RP191) - System Checkpoint
Sat Oct 18 2008 01:40:42     8192 m.c. 16895 0        0        34410    Restore Point (RP192) - System Checkpoint
....

It is also possible to add the -h HOST parameter to the script to include a host name into the timeline.  The timeline is formatted according to the specifications of TSK 3.0+, but it is possible to get the listing in a format that can be read using older TSK format by using the -l (legacy) switch to the script.

Categories: Forensics, Windows Analysis Tags: