Second Network Forensics Contest

November 23rd, 2009 kiddi 3 comments

I just wanted to go over my solution to the second network forensics contest.

First of all, a little disclaimer: since this is a competition where scripting is encouraged, I decided beforehand to write a script and not rely on any available tools to complete this task (or at least to minimize the use of existing tools).

To begin with, we know that Ann is being monitored closely, since she was an apparent flight risk. After Ann’s disappearance the police bring along a network capture, claiming that it quite possibly indicates her whereabouts.

There are definitely some questions that need to be answered.  So to begin with, let’s examine the content quickly using tcpdump.  We want to see every IP and port number that has issued any IP traffic.  So let’s begin by quickly seeing all the possible sources.

tcpdump -nn -r evidence02.pcap  | awk -F 'IP' '{print $2}' | \
awk '{print $1}' | sort -nu
reading from file evidence02.pcap, link-type EN10MB (Ethernet)
10.1.1.20.53
64.12.102.142.587
192.168.1.10.52111

And then to see all the destinations.

tcpdump -nn -r evidence02.pcap | grep IP  | awk -F '>' '{print $2}' \
| awk '{print $1}' | sort -nu
reading from file evidence02.pcap, link-type EN10MB (Ethernet)
10.1.1.20.53:
64.12.102.142.587:
192.168.1.30.514:
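The same listing can be sketched in Python by parsing tcpdump's text output and collecting every distinct endpoint in a set. This is only an illustration: the function name endpoints is made up, the sample lines are invented rather than taken from evidence02.pcap, and the pattern only handles IPv4 lines.

```python
import re

# Matches the "IP src > dst:" part of a tcpdump -nn text line (IPv4 only).
ENDPOINTS = re.compile(r"\bIP (\S+) > (\S+):")

def endpoints(lines):
    """Return (sources, destinations) as sorted lists of unique 'ip.port' strings."""
    sources, destinations = set(), set()
    for line in lines:
        match = ENDPOINTS.search(line)
        if match:
            sources.add(match.group(1))
            destinations.add(match.group(2))
    return sorted(sources), sorted(destinations)

# Invented sample lines in tcpdump's text format (NOT from evidence02.pcap).
sample = [
    "15:35:16.000000 IP 192.168.1.159.1036 > 64.12.102.142.587: Flags [S], length 0",
    "15:35:16.100000 IP 64.12.102.142.587 > 192.168.1.159.1036: Flags [S.], length 0",
    "15:35:17.000000 IP 192.168.1.159.1036 > 64.12.102.142.587: Flags [.], length 0",
]
srcs, dsts = endpoints(sample)
print(srcs)
print(dsts)
```

On a real capture the lines would come from piping tcpdump -nn -r evidence02.pcap into the script instead of the sample list.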

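We see what is most likely DNS traffic (port 53) and some other traffic that seems to be going to the server 64.12.102.142 on port 587.  Note that sort -nu compares only the leading number of each line, so addresses that start with the same number collapse into a single entry and the lists above are not complete.  Let’s examine the TCP traffic a little bit closer using a script that I wrote for the previous network forensics challenge, pcapcat.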

pcapcat -r evidence02.pcap -b 0
[1] TCP 192.168.1.159:1036 -> 64.12.102.142:587
[2] TCP 192.168.1.159:1038 -> 64.12.102.142:587

We see that there are only two TCP connections that have been set up in this dump.  They correspond with the output that we saw from tcpdump, that is, Ann’s laptop is clearly communicating with the server 64.12.102.142 on port 587.  We need to examine this traffic a little closer, so let’s dump it using pcapcat.

pcapcat -r evidence02.pcap
[1] TCP 192.168.1.159:1036 -> 64.12.102.142:587
[2] TCP 192.168.1.159:1038 -> 64.12.102.142:587
Enter the index number of the conversation to dump or press enter to quit: 1
Dumping index value 1
Unable to determine output file
Give the name of the output file: file_1

And the second stream

pcapcat -r evidence02.pcap
[1] TCP 192.168.1.159:1036 -> 64.12.102.142:587
[2] TCP 192.168.1.159:1038 -> 64.12.102.142:587
Enter the index number of the conversation to dump or press enter to quit: 2
Dumping index value 2
Unable to determine output file
Give the name of the output file: file_2

Now we have two files, file_1 and file_2 that contain the gathered TCP stream from the network capture.  Start by checking out what this is. Try to identify the content using the command file, which uses magic values to determine the filetype.

file file_*
file_1: ASCII HTML document text, with CRLF line terminators
file_2: ASCII HTML document text, with CRLF line terminators

According to the file command, we are dealing with an HTML document.  Let’s see if that is correct:

head -3 file_1
220 cia-mc06.mx.aol.com ESMTP mail_cia-mc06.1; Sat, 10 Oct 2009 15:35:16 -0400
EHLO annlaptop
250-cia-mc06.mx.aol.com host-69-140-19-190.static.comcast.net

head -3 file_2
220 cia-mc07.mx.aol.com ESMTP mail_cia-mc07.1; Sat, 10 Oct 2009 15:37:56 -0400
EHLO annlaptop
250-cia-mc07.mx.aol.com host-69-140-19-190.static.comcast.net

By examining the first three lines in each of these documents it becomes clear to us that this is in fact not an HTML document but an SMTP conversation.  So now we know that Ann was actually sending e-mails to the server 64.12.102.142.

What IP address is this?  Let’s examine that a bit:

 dig -x 64.12.102.142
; <<>> DiG 9.6.0-APPLE-P2 <<>> -x 64.12.102.142
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57356
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 1
;; QUESTION SECTION:
;142.102.12.64.in-addr.arpa.    IN    PTR
;; ANSWER SECTION:
142.102.12.64.in-addr.arpa. 3600 IN    PTR    smtp-mc.mx.aol.com.
;; AUTHORITY SECTION:
102.12.64.in-addr.arpa.    3600    IN    NS    dns-02.ns.aol.com.
102.12.64.in-addr.arpa.    3600    IN    NS    dns-01.ns.aol.com.
;; ADDITIONAL SECTION:
dns-02.ns.aol.com.    51683    IN    A    205.188.157.232
...

We see that the reverse DNS (the PTR record) for the IP address points to a server that looks to be an SMTP server belonging to AOL, which can be further confirmed by running whois against the IP address:

whois  64.12.102.142
OrgName:    America Online, Inc.
OrgID:      AMERIC-158
Address:    10600 Infantry Ridge Road
City:       Manassas
StateProv:  VA
PostalCode: 20109
Country:    US
NetRange:   64.12.0.0 - 64.12.255.255
CIDR:       64.12.0.0/16
NetName:    AOL-MTC
NetHandle:  NET-64-12-0-0-1
Parent:     NET-64-0-0-0-0
NetType:    Direct Assignment
NameServer: DNS-01.NS.AOL.COM
NameServer: DNS-02.NS.AOL.COM
Comment:
RegDate:    1999-12-13
Updated:    1999-12-16
RTechHandle: AOL-NOC-ARIN
RTechName:   America Online, Inc.
RTechPhone:  +1-703-265-4670
RTechEmail:  doma...@aol.net
# ARIN WHOIS database, last updated 2009-10-15 20:00
# Enter ? for additional hints on searching ARIN's WHOIS database.

So we now know that Ann did in fact send two e-mails to this server that belongs to AOL.  Now we need to examine the conversation a bit better. To do that I created a script called smtp_anex (SMTP ANalyse and EXtraction tool).  So let’s use that script to analyse the traffic:

./smtp_anex -r file_1 -d data_1
------------------------------------------------------------
 SMTP_ANEX (SMTP ANALYSIS AND EXTRACTION)
------------------------------------------------------------
Information from e-mail header
 Mail from:  snea...@aol.com
 Recipient:  sec...@gmail.com
Information from e-mail body
 Mail from:  "Ann Dercover" <snea...@aol.com>
 Mail to  :  <sec...@gmail.com>
 Subject  :  lunch next week
Authentication information:
 Username: snea...@aol.com
 Password: 558r00lz
Header information:
 date :  Sat, 10 Oct 2009 07
 x-mimeole :  Produced By Microsoft MimeOLE V6.00.2900.2180
 x-mailer :  Microsoft Outlook Express 6.00.2900.2180
 content-type :  multipart/alternative;
 boundary="----=_nextpart_000_0006_01ca497c.3e4b6020" :
 x-priority :  3
 x-msmail-priority :  Normal
 mime-version :  1.0
 message-id :  <000901ca49ae$89d698c0$9f01a8c0@annlaptop>
Additional information:
 data_response: 250 OK
 cmd_ehlo: HASH(0x8b3610)
 banner: 220 cia-mc06.mx.aol.com esmtp mail_cia-mc06.1; sat, 10 oct 2009 15:35:16 -0400
 auth_leftovers: 235 - AUTHENTICATION SUCCESSFUL
 data_cmd_response: 354 start mail input, end with "." on a line by itself
 header: HASH(0x8b6ec0)
------------------------------------------------------------
 The message content
------------------------------------------------------------
-------- Text --------
Sorry-- I can't do lunch next week after all. Heading out of town. =
Another time! -Ann
-------- HTML --------
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; =
charset=iso-8859-1">
<META content="MSHTML 6.00.2900.2853" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>Sorry-- I can't do lunch next week =
after all.
Heading out of town. Another time! -Ann</FONT></DIV></BODY></HTML>

The script works by default by going through the SMTP conversation and plucking out the relevant data.

It then prints the data on screen and also saves it to files (the printing to screen can be silenced using the option -q).  I used the option -d to save all the data in the folder “data_1”, which now contains the following files:

  • 1-HTML.html
  • 1-RAW.txt
  • 1-Text.txt
  • 1-info.txt

We can clearly see from the output that Ann was sending this e-mail from the address snea...@aol.com and was sending it to the address sec...@gmail.com.  The content of the conversation was (again taken from the output of the script):

Sorry-- I can't do lunch next week after all. Heading out of town. =
Another time! -Ann

This looks to be quite suspicious.  Ann is claiming that she cannot do lunch because she is heading out of town?

We also see the username and password that Ann uses in this conversation:

Authentication information:
 Username: snea...@aol.com
 Password: 558r00lz

The authentication information that the script reads comes from the command AUTH that is issued during the SMTP conversation:

AUTH LOGIN
334 VXNlcm5hbWU6
c25lYWt5ZzMza0Bhb2wuY29t
334 UGFzc3dvcmQ6
NTU4cjAwbHo=
235 AUTHENTICATION SUCCESSFUL

This is a very common authentication mechanism (LOGIN), where base64 is used to encode the messages.  If we just decode them, we get:

S: 334 Username:
C: snea...@aol.com
S: 334 Password:
C: 558r00lz
S: 235 AUTHENTICATION SUCCESSFUL

where S: denotes the server side of the communication and C: the client side. But we do not need to do this manually, the script does this for us.
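For anyone who wants to check the decoding by hand, here is a minimal Python sketch using the standard base64 module. The helper name decode_line is made up for this illustration; the tokens are the server prompts and the password answer copied from the exchange above (the user name token decodes the same way).

```python
import base64

def decode_line(token):
    """Decode one base64 token from an AUTH LOGIN exchange into text."""
    return base64.b64decode(token).decode("ascii")

# Tokens copied from the AUTH LOGIN exchange shown above.
for token in ("VXNlcm5hbWU6", "UGFzc3dvcmQ6", "NTU4cjAwbHo="):
    print(decode_line(token))
```

Since base64 is an encoding and not encryption, anyone who can see the traffic can recover the credentials this way.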

Let’s examine the second e-mail a bit closer, again using the script:

smtp_anex -r file_2 -d data_2
------------------------------------------------------------
 SMTP_ANEX (SMTP ANALYSIS AND EXTRACTION)
------------------------------------------------------------
Information from e-mail header
 Mail from:  snea...@aol.com
 Recipient:  mist...@aol.com
Information from e-mail body
 Mail from:  "Ann Dercover" <snea...@aol.com>
 Mail to  :  <mist...@aol.com>
 Subject  :  rendezvous
Authentication information:
 Username: snea...@aol.com
 Password: 558r00lz
Header information:
 date :  Sat, 10 Oct 2009 07
 x-mimeole :  Produced By Microsoft MimeOLE V6.00.2900.2180
 x-mailer :  Microsoft Outlook Express 6.00.2900.2180
 boundary="----=_nextpart_000_000d_01ca497c.9dec1e70" :
 content-type :  multipart/mixed;
 x-priority :  3
 x-msmail-priority :  Normal
 mime-version :  1.0
 message-id :  <001101ca49ae$e93e45b0$9f01a8c0@annlaptop>
Additional information:
 data_response: 250 OK
 msg: Attachment dumped to file - name: secretrendezvous.docx
 cmd_ehlo: HASH(0x8b3610)
 banner: 220 cia-mc07.mx.aol.com esmtp mail_cia-mc07.1; sat, 10 oct 2009 15:37:56 -0400
 auth_leftovers: 235 - AUTHENTICATION SUCCESSFUL
 data_cmd_response: 354 start mail input, end with "." on a line by itself
 header: HASH(0x8b6ec0)
------------------------------------------------------------
 The message content
------------------------------------------------------------
-------- Text --------
Hi sweetheart! Bring your fake passport and a bathing suit. Address =
attached. love, Ann
-------- HTML --------
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; =
charset=iso-8859-1">
<META content="MSHTML 6.00.2900.2853" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>Hi sweetheart! Bring your fake passport =
and a
bathing suit. Address attached. love, Ann</FONT></DIV></BODY></HTML>

Now this looks to be quite suspicious.  We can see from the output that Ann is again sending an e-mail, this time to mist...@aol.com with the subject “rendezvous”.  The text from the message is:

Hi sweetheart! Bring your fake passport and a bathing suit. Address =
attached. love, Ann

We also see from the output of the script the following additional information:

 msg: Attachment dumped to file - name: secretrendezvous.docx

So there was an attachment with the message, let’s examine the output of the folder data_2

  • 1-HTML.html
  • 1-RAW.txt
  • 1-Text.txt
  • 1-info.txt
  • 1-secretrendezvous.docx

We can therefore examine the content of the attachment. First of all, let’s calculate the MD5sum of the docx document that was attached to the e-mail.

md5sum data_2/1-secretrendezvous.docx
9e423e11db88f01bbff81172839e1923  data_2/1-secretrendezvous.docx

Since this is a .docx document, we can use other scripts to read it, such as cat_open_xml.pl

cat_open_xml.pl 1-secretrendezvous.docx
Meet me at the fountain near the rendezvous point. Address below. I'm bringing
all the cash.
returning from a call..

We don’t get much from this, so perhaps there is more to this document than just text.  Since we know that a .docx document is nothing more than a ZIP archive, we can just extract the content of the document:

unzip -d doc 1-secretrendezvous.docx
Archive:  1-secretrendezvous.docx
 inflating: doc/[Content_Types].xml
 inflating: doc/_rels/.rels
 inflating: doc/word/_rels/document.xml.rels
 inflating: doc/word/document.xml
 extracting: doc/word/media/image1.png  
 inflating: doc/word/theme/theme1.xml
 inflating: doc/word/settings.xml
 inflating: doc/word/webSettings.xml
 inflating: doc/word/styles.xml
 inflating: doc/docProps/core.xml
 inflating: doc/word/numbering.xml
 inflating: doc/word/fontTable.xml
 inflating: doc/docProps/app.xml
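The same step can be sketched in Python with the standard zipfile and hashlib modules: list whatever sits under word/media/ in the archive and hash it without unpacking to disk. The function name media_digests is made up, and the archive in the sketch is an in-memory stand-in with an empty image, not the real attachment.

```python
import hashlib
import io
import zipfile

def media_digests(docx):
    """Map each member under word/media/ to the MD5 hex digest of its contents."""
    digests = {}
    with zipfile.ZipFile(docx) as archive:
        for name in archive.namelist():
            if name.startswith("word/media/") and not name.endswith("/"):
                digests[name] = hashlib.md5(archive.read(name)).hexdigest()
    return digests

# Stand-in archive built in memory so the sketch runs without the real file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<w:document/>")
    archive.writestr("word/media/image1.png", b"")  # empty placeholder image
buffer.seek(0)

result = media_digests(buffer)
print(result)
```

Passing the path of the extracted attachment instead of the buffer would list and hash its embedded media in the same way.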

Now we see that there is an image that is contained within the document.  Let’s examine it

md5sum doc/word/media/image1.png
aadeace50997b1ba24b09ac2ef1940b7  doc/word/media/image1.png

The image seems to be taken from Google maps, displaying the meeting place.

Playa del Carmen
1. Av. Constituyentes 1 Calle 10 x la 5ta
Avenida
Playa del Carmen, 77780, Mexico
01 984 873 4000
Meeting Place

Now we know that there are strong indications that Ann’s secret lover has the email address mist...@aol.com (very sneaky address) and that Ann was sending him a message containing a possible meeting point (again a very subtle document called secretrendezvous).  This could very well be the location where she is at right now (since she has disappeared already and this seems to be the only clue of her whereabouts).

The rest is up to the police chief, our job here is done…

Malware analysis

November 19th, 2009 kiddi No comments

I decided to do some malware analysis as part of a presentation I had to give.  And since I went through the process, I decided to post it here in case anyone is interested.

To begin with, I needed to find some malware to analyze.  And a great place to find live links to active malware is to visit the site: http://www.malwaredomainlist.com/mdl.php.

What I wanted to show was that having a fully patched machine with a fully updated AV is not always enough to protect you.  One way to do that is to find either a PDF or a Flash exploit.  The one that I chose for this experiment was this one:

PDF exploit to be used

First things first: download the malware sample and do some static analysis on it.  First of all I ran pdfid.py from Didier Stevens to get an overview of the PDF document:

pdfid.py dhkn.pdf
PDFiD 0.0.9 dhkn.pdf
 PDF Header: %PDF-1.4
 obj                    9
 endobj                 9
 stream                 2
 endstream              2
 xref                   0
 trailer                1
 startxref              1
 /Page                  0
 /Encrypt               0
 /ObjStm                0
 /JS                    1
 /JavaScript            2
 /AA                    0
 /OpenAction            0
 /AcroForm              0
 /JBIG2Decode           0
 /RichMedia             0
 /Colors > 2^24         0

By looking at this we can see that there is JavaScript code in the document, which is commonly used to exploit Adobe Reader, and we also see that there are some streams in the document.  Now we need to take a closer look at the source code of the document. This can be done with any text editor, such as vim, or just with less (or cat).

If we examine the document we don’t see any text part, just a stream that is JavaScript:

...
endstream
endobj
6 0 obj
<</CS /DeviceRGB /S /Transparency >>
endobj
7 0 obj
<</Length 76450 /Filter [/ASCIIHexDecode]>>
stream
7661722066686e7075783d27273b6567686a76783d22223b616567777a3d35303431323b6465757....
...

This stream can be easily decoded.  We see that the filter that is used is a simple ASCIIHexDecode, so I simply copy the stream

grep ^stream dhkn.pdf -A 2  > stream

I then edited the file and deleted the lines that did not belong to the stream itself.  Since the file now contained only the hex code of the stream I could decode it with a simple Perl script:

#!/usr/bin/perl
# Decode a file of hex digits (an ASCIIHexDecode stream) into FILE.txt.

use strict;
use warnings;

my $file = shift;

die( 'Wrong usage: ' . $0 . ' FILE' ) unless defined $file and -e $file;

open( my $in, '<', $file ) or die( 'Unable to read file ' . $file );
open( my $out, '>', $file . '.txt' ) or die( 'Unable to write file ' . $file . '.txt' );

while ( my $line = <$in> )
{
 print "Processing line\n";
 # keep only the hex digits (drops '%', line endings and the '>' end marker)
 $line =~ s/[^0-9A-Fa-f]//g;

 print {$out} pack( 'H*', $line );
}

close( $in );
close( $out );

I run the script like this:

./conv.pl stream
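As a cross-check, the ASCIIHexDecode filter is simple enough to sketch in a few lines of Python. The function name ascii_hex_decode is made up for this illustration, and the input is only the start of the stream quoted earlier, not the whole 76450-byte stream.

```python
import re

def ascii_hex_decode(text):
    """Decode ASCIIHexDecode data: hex digit pairs, whitespace ignored, '>' ends the data."""
    text = text.split(">", 1)[0]        # '>' is the end-of-data marker
    digits = re.sub(r"\s+", "", text)   # whitespace carries no meaning
    if len(digits) % 2:
        digits += "0"                   # an odd final digit is padded with 0
    return bytes.fromhex(digits)

# The first bytes of the stream shown earlier.
head = "7661722066686e7075783d27273b"
print(ascii_hex_decode(head).decode("latin-1"))
```

The decoded bytes are the first statement of the obfuscated JavaScript.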

The content of the text file stream.txt looks something like this

var fhnpux='';eghjvx="";aegwz=50412;deuv="";var fiqy='',lmuz=false,ekrt=true,gpqr=false,
hnrty='',hloqr=0,dimtz=String,afioxy=dimtz['fkrEoAmkCBhkaBrRCAoAdkeE'.replace(/[EABkR]/g,'')],
gilmq=String,abdmv=gilmq['eBvBaLlI'.replace(/[ILABJ]/g,'')],ikmqw="61",begily="",aefmos=[67,59,
63,151,171,159,153,165,159,160,164,81,156,154,174,144,159,165,94,170,151,163,169,161,98,81,162,...
...

This is obviously obfuscated JavaScript, so we need to dig a little deeper. To make it easier to read, a simple substitution is done:

cat stream.txt | sed -e 's/;/;\n/g' > stream.js

This makes the code a little bit easier to read.  Then, to make it even easier, vim is used to edit the file and add spaces and new lines where needed. There are two things in this code that are interesting (well, the two most interesting things that pop up at least).  First of all the function close to the end of the file:

lquv=function()
{
      for(var fjpu;hloqr<aefmos.length;hloqr+=1)
      {
                var fjqu=hloqr%ikmqw.length+1;
                var dorvy=ikmqw.substring(hloqr%ikmqw.length,fjqu);
                var blrwy=aefmos[hloqr];
                begily+=afioxy(blrwy-dorvy.charCodeAt(0));
        }
        abdmv(begily);
};
lquv();

We see that we have a function called “lquv”, which seems to take care of decoding the obfuscated code.  At the end we see that a function called “abdmv” is called with the parameter begily, which is the variable that holds the decoded JavaScript.  The function “abdmv” is defined above in the code:

gilmq=String
abdmv=gilmq['eBvBaLlI'.replace(/[ILABJ]/g,'')]

This is a very simple obfuscation.  We see that “gilmq” is defined as String and then “abdmv” is (when we complete the simple substitution)

gilmq['eval']
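The substitution is easy to verify without running the sample. A small Python sketch that mimics the JavaScript replace(/[...]/g, '') calls (the helper name strip_filler is made up here):

```python
import re

def strip_filler(text, filler):
    """Remove every character listed in filler, like replace(/[...]/g, '') in JavaScript."""
    return re.sub("[" + re.escape(filler) + "]", "", text)

# The two obfuscated property names from the sample and their filler sets.
print(strip_filler("eBvBaLlI", "ILABJ"))
print(strip_filler("fkrEoAmkCBhkaBrRCAoAdkeE", "EABkR"))
```

The second call shows that “afioxy” is built the same way and resolves to String's fromCharCode.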

So when the function calls “abdmv(begily)” we are about to evaluate or execute the code that is displayed in the variable begily.  We therefore need to know what is inside the variable “begily”.  The basis for “begily” resides in the variable “aefmos” (the second interesting thing we found), which is defined as:

aefmos=[67,59,6....

The easiest way (or at least an easy method) to decode this string is simply to turn stream.js into an HTML document and open it up in a browser or another JavaScript interpreter.

We add to the top of the document

<html>
<head><title>TESTING</title></head>
<body>
<script>

And at the bottom

</script>
</body>
</html>

We then modify the JavaScript itself.  First of all the document ends with a ?, which we delete.  Then we modify the function "lquv" so that it prints the JavaScript instead of evaluating it.

lquv=function()
{
 for(var fjpu; hloqr<aefmos.length;hloqr+=1)
 {
 var fjqu=hloqr%ikmqw.length+1;
 var dorvy=ikmqw.substring(hloqr%ikmqw.length,fjqu);
 var blrwy=aefmos[hloqr];
 begily+=afioxy(blrwy-dorvy.charCodeAt(0));
 }
 //abdmv(begily);
 document.write(begily);
};

The change that I made is the commented-out call to abdmv and the added document.write line.  I then opened this document in a sandboxed environment to get the variable begily decoded.  The decoded script looks like this:

function fix_it(yarsp, len)
{
 while (yarsp.length * 2 < len)
 {
 yarsp += yarsp;
 }

 yarsp = yarsp.substring(0, len/2);
...
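The decoding loop can also be reproduced offline, which avoids running any of the sample's code in a browser. Below is a minimal Python sketch of what lquv does: subtract the character codes of the repeating key “61” (the value of ikmqw) from each number. The function name decode is made up, and the list holds only the values visible in the listing earlier, so only the beginning of the script comes out.

```python
def decode(values, key):
    """Subtract the cycling key's character codes from each value, as lquv does."""
    chars = []
    for index, value in enumerate(values):
        key_code = ord(key[index % len(key)])
        chars.append(chr(value - key_code))
    return "".join(chars)

# Only the values visible in the listing; the real aefmos array is much longer.
aefmos_head = [67, 59, 63, 151, 171, 159, 153, 165, 159, 160, 164, 81, 156,
               154, 174, 144, 159, 165, 94, 170, 151, 163, 169, 161, 98, 81, 162]
print(repr(decode(aefmos_head, "61")))
```

The output starts with a line break and a tab followed by the opening of the fix_it function, which matches the decoded script shown above.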

Now we have the true JavaScript code that is to be run on the machine.  Inside it there are several functions, some of which contain the telltale variable names “shellcode” or “payload”, which are usually a good indication of malware (if the obfuscation isn’t enough).  Near the end of the code we see this:

function pdf_start()
{
 var version = app.viewerVersion.toString();
 version = version.replace(/\D/g,'');
 var varsion_array = new Array(version.charAt(0), version.charAt(1), version.charAt(2));

 if ((varsion_array[0] == 8 ) && (varsion_array[1] == 0) || (varsion_array[1] == 1 && varsion_array[2] < 3))
 {
 util_printf();
 }

 if ((varsion_array[0] < 8 ) || (varsion_array[0] == 8 && varsion_array[1] < 2 && varsion_array[2] < 2))
 {
 collab_email();
 }

 if ((varsion_array[0] < 9 ) || (varsion_array[0] == 9 && varsion_array[1] < 1))
 {
 collab_geticon();
 }
}

pdf_start();

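This function is called at the end, and we can see that it begins by getting the Adobe Reader version before going through a series of if statements that determine which exploit to use based on the first three digits of the version number.  Reading those digits as major.minor.patch, the exploits are run against the following Adobe Reader versions:

  • util_printf: 8.0.x, or any version with minor.patch between 1.0 and 1.2 (the second clause never checks the major version, because && binds tighter than ||)
  • collab_email: versions older than 8.0, or 8.0.0, 8.0.1, 8.1.0 and 8.1.1
  • collab_geticon: versions older than 9.0, or 9.0.x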
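To make the three conditions easier to test, here is a hedged Python transcription of the checks in pdf_start(). It assumes the three digits are passed as integers (the JavaScript compares single characters and relies on type coercion), and the function name exploits_for is made up for this sketch.

```python
def exploits_for(major, minor, patch):
    """Return the exploit functions pdf_start() would call for the given digits."""
    chosen = []
    # && binds tighter than ||, so the second clause ignores the major version.
    if (major == 8 and minor == 0) or (minor == 1 and patch < 3):
        chosen.append("util_printf")
    if major < 8 or (major == 8 and minor < 2 and patch < 2):
        chosen.append("collab_email")
    if major < 9 or (major == 9 and minor < 1):
        chosen.append("collab_geticon")
    return chosen

for digits in [(7, 0, 0), (8, 1, 2), (9, 0, 0), (9, 1, 0)]:
    print(digits, exploits_for(*digits))
```

Note that more than one exploit can fire for the same version, since the three if statements are independent of each other.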

Different exploits are run depending on which of the criteria above are met.  If we examine the payloads or shellcodes, we see that they are all encoded as %u escape sequences that are decoded with the JavaScript function “unescape”, and are all similar to the one listed below:

var payload = unescape("%uEBE9%u0001%u5600%uA164%u0030%u0....

To further analyse this malware the payload has to be examined.  So we copy the payload to a file and create a simple Perl script to change the JavaScript to binary:

#!/usr/bin/perl
use strict;
use Encode;

my $file = shift;
my $line = undef;
my $string = undef;
my @chars = undef;
my $done;

die('file does not exist') unless -e $file;

open( FH, $file ) or die( 'Unable to open file: ' . $file );
open( RW, '>' . $file . '.dat' );
binmode RW;

# read all lines
while( <FH> )
{
 @chars = split( /%u/ );
 print "Processing line..\n";
 print "LINE CONSISTS OF " . $#chars . " CHARS\n";

 $done = -1;
 foreach my $char (@chars )
 {
 $done++; # increase done by one
 next unless $done;

 print RW pack( 'v',hex( $char ));
 }

}

close(RW);
close( FH );

So to run the script

./decode_shell shell
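The conversion itself is small enough to sketch in Python as well: each %uXXXX escape is one 16-bit value that is written out low byte first, which is what pack('v', ...) does in the Perl script. The function name unescape_to_bytes is made up, and the input is only the visible start of the payload string quoted above.

```python
import re
import struct

def unescape_to_bytes(text):
    """Turn %uXXXX escapes into bytes, writing each 16-bit unit little-endian."""
    units = re.findall(r"%u([0-9A-Fa-f]{4})", text)
    return b"".join(struct.pack("<H", int(unit, 16)) for unit in units)

# The visible start of the payload string quoted above.
head = "%uEBE9%u0001%u5600%uA164%u0030"
print(unescape_to_bytes(head).hex())
```

Anything in the input that is not a %u escape (quotes, the closing parenthesis, line endings) is simply skipped by the pattern.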

Now I’ve got a binary document, called shell.dat which can be easily analysed using strings

strings -a -t x shell.dat
36 QQSVW`
65 B`;U
1a8 PhhC
1d5 PSSSSSS
1f3 QQSVWjB
209 a.ex
229 YYt9
243 YYt
 W
264 YYFF;
274 QSf`
2a1 t
 @8
2ba http://style-boards.com/forum/bmosz2.exe
2e3 http://style-boards.com/forum/click.php?r=

We see that this particular shellcode (the one that is used to exploit version 9.0) is simply downloading more malware to the machine.  There are two files fetched, both of which at the time of analysis were removed from the server in question, so further analysis wasn’t possible.
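If the strings utility is not at hand, a rough equivalent of strings -a -t x can be sketched in Python. This is an approximation only: it treats bytes 0x20 to 0x7e as printable and uses a minimum length of 4. The function name strings and the sample bytes are made up for the illustration and are not taken from shell.dat.

```python
import re

def strings(data, min_len=4):
    """Yield (offset, text) for every run of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode("ascii") + rb",}")
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

# Invented sample bytes (NOT taken from shell.dat).
blob = b"\x00\x01a.ex\x00\xffhttp://example.invalid/x.exe\x00"
found = list(strings(blob))
for offset, text in found:
    print(format(offset, "x"), text)
```

The offsets are printed in hex to match the -t x option used above.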

To test the other payloads, we examine the one that exploits the util_printf vulnerability.

./decode_shell util_shell
Processing line..
LINE CONSISTS OF 391 CHARS
strings -a -t x util_shell.dat
...
209 a.ex
...
2ba http://style-boards.com/forum/cdruz2.exe
2e3 http://style-boards.com/forum/click.php?r=

And the collab_email exploit:

./decode_shell collab_email_shell
Processing line..
LINE CONSISTS OF 392 CHARS
strings -a -t x collab_email_shell.dat
...
209 a.ex
...
2ba http://style-boards.com/forum/fkntuw2.exe
2e4 http://style-boards.com/forum/click.php?r=

We can see that each of the exploits downloads two files.  The one fetched through “click.php?r=” seems to be the same for all of them, while the executable has a different name in each case: fkntuw2.exe, cdruz2.exe and bmosz2.exe.

I was unable to analyze the executables further since they had all been removed from the server by the time I tried to download them (I got a 404 error from the server), although the PDF document still remained on the server the last time I checked.

This concludes the static analysis of the code.  I also did a live dynamic analysis of the malware that I might share at a later time, but for now the static analysis will have to do.

Small update

October 16th, 2009 kiddi No comments

It’s been a while since I last posted a blog, so here is a little update.  There is a new network forensics contest published, and I’ve already submitted my solution (I will post it on the site after the deadline).  I encourage people to try it out; it is always fun to play with challenges like these, if you have the time.

I’ve been trying to find time to work on log2timeline, to complete the new version.  I’ve updated the GUI so it is feature compatible with the CLI, for those who prefer to use a GUI (yes, there are those who actually prefer a GUI).  There are a few things I would like to complete before releasing the new version, but I decided to publish the development version online, so that you can always download the latest version (I upload a new version almost as soon as I complete the code).  I still haven’t found time to update timescanner, which I’ve had reports is not functioning as advertised; that is scheduled to be completed in the next release.  I’ve also been playing a bit with CFTL (Computer Forensics TimeLab) and log2timeline, that is, creating timelines in log2timeline and visually inspecting them using CFTL.  I will post a blog soon with the results.

Network Forensics Puzzle

September 21st, 2009 kiddi 3 comments

Update 1

The winner has been announced, see further detail here.  And against all odds, it seems that I won the challenge this time, despite very different and really good solutions from the other finalists. So here is my answer again, in a little bit more detail than the posted solution.

 

There was a very interesting network forensics puzzle that was announced on the GCFA mailing list, and later on the SANS forensic blog and on the ISC. It contains a PCAP file and a short story about a fictional case dealing with suspected data leakage.  The PCAP file, evidence.pcap, can be downloaded from here. Since we are examining the content of the PCAP file, we need to be able to extract files and conversations from the network traffic.  A good list of various tools to achieve that can be found here on the Internet Storm Center’s site.

The deadline for submission is over so I guess it is alright for me to post my solution to this challenge here, even though the winner hasn’t been announced yet. I decided beforehand to depend as little as possible on tools created by others, and instead to create my own tools to solve this challenge.

There are of course various other tools to extract information from a PCAP file, including tcpdump, Wireshark and tons of other tools, although they lack an easy method to extract file contents (it can be done through Wireshark, though).  But in the spirit of a contest, I had to write my own script, especially since I wanted to dump the content of a stream, something that these tools aren’t really well equipped to do (although there are other tools that do so).

But anyway, to cut a long story short, I wrote a script called pcapcat in Perl, a self explanatory name, that reads the content of a PCAP file and gives you an option to dump the content of a TCP stream into a file.  This little script works fine especially when dealing with small PCAP files (first version of the script).

To use the script on the given evidence file we can use the default option of only showing new connections (that is, we are looking for SYN packets without any other flags, ignoring ECN).  If we do that we only see a few connections, since the IM conversation obviously was started before the packet capture started, so we only have partial information from it (not the beginning containing the initial TCP handshake).

Therefore we need to call the script so that it shows us all TCP packets, like this:

pcapcat -r evidence.pcap -a
[1] TCP 192.168.1.2:55488 -> 192.168.1.30:22[16]
[2] TCP 192.168.1.30:22 -> 192.168.1.2:55488[24]
[3] TCP 192.168.1.2:55488 -> 192.168.1.30:22[16]
[4] TCP 192.168.1.30:22 -> 192.168.1.2:55488[24]
[5] TCP 192.168.1.2:54419 -> 192.168.1.157:80[2]
[6] TCP 192.168.1.2:54419 -> 192.168.1.157:80[16]
[7] TCP 192.168.1.157:80 -> 192.168.1.2:54419[18]
[8] TCP 192.168.1.2:54419 -> 192.168.1.157:80[17]
[9] TCP 192.168.1.157:80 -> 192.168.1.2:54419[16]
[10] TCP 192.168.1.2:54419 -> 192.168.1.157:80[16]
Read more packets [Y|n]: y
[11] TCP 192.168.1.157:80 -> 192.168.1.2:54419[17]
[12] TCP 192.168.1.158:51128 -> 64.12.24.50:443[24]
[13] TCP 64.12.24.50:443 -> 192.168.1.158:51128[16]
[14] TCP 192.168.1.158:51128 -> 64.12.24.50:443[24]
[15] TCP 64.12.24.50:443 -> 192.168.1.158:51128[16]
[16] TCP 192.168.1.158:51128 -> 64.12.24.50:443[24]
[17] TCP 64.12.24.50:443 -> 192.168.1.158:51128[16]
[18] TCP 64.12.24.50:443 -> 192.168.1.158:51128[24]
[19] TCP 64.12.25.91:443 -> 192.168.1.159:1221[24]
[20] TCP 64.12.24.50:443 -> 192.168.1.158:51128[24]
Read more packets [Y|n]: n
Not printing out more packets

The script prints by default only new TCP connections (that is TCP SYN). To be able to capture an already started IM conversation I had to use the -a option to the script, to tell it to print an index number for every TCP packet (as previously stated).

The IM conversation is taking place on port 443 (usually associated with HTTPS traffic). To capture the IM conversation we need to dump the content of the first associated traffic that takes place on port 443 (index 12 according to pcapcat):
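The “new connection” test can be sketched in a few lines of Python. This assumes that the number in square brackets at the end of each pcapcat line is the value of the TCP flags byte, which fits the listing (2 for the first packet of a connection, 18 for the reply) but is an assumption about the script's output. The names flag_names and is_new_connection are made up for this sketch.

```python
# TCP flag bits in the order they appear in the flags byte.
FLAG_BITS = [
    (0x01, "FIN"), (0x02, "SYN"), (0x04, "RST"), (0x08, "PSH"),
    (0x10, "ACK"), (0x20, "URG"), (0x40, "ECE"), (0x80, "CWR"),
]
ECN_MASK = 0x40 | 0x80  # the two ECN-related bits

def flag_names(flags):
    """Return the names of the flags set in a TCP flags byte."""
    return [name for bit, name in FLAG_BITS if flags & bit]

def is_new_connection(flags):
    """True for a bare SYN, ignoring the two ECN bits."""
    return (flags & ~ECN_MASK & 0xFF) == 0x02

for value in (2, 16, 17, 18, 24):
    print(value, flag_names(value), is_new_connection(value))
```

Read this way, the values 16, 17, 18 and 24 in the listing are ACK, FIN+ACK, SYN+ACK and PSH+ACK, and only the packet marked 2 starts a new connection.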

pcapcat -r evidence.pcap -a -d 12 -w conversation

The command above reads the PCAP file evidence.pcap, dumps the content of the conversation identified by index 12 when reading all packets (not just new connections), and saves the dump in the file “conversation”.

To examine the content of the conversation, xxd is useful:

cat conversation | xxd
0000000: 2a02 0061 00b7 0004 0006 0000 0000 0045  *..a...........E
0000010: 3436 3238 3737 3800 0001 0b53 6563 3535  4628778....Sec55
0000020: 3875 7365 7231 0002 008f 0501 0004 0101  8user1..........
0000030: 0102 0101 0083 0000 0000 4865 7265 2773  ..........Here's
0000040: 2074 6865 2073 6563 7265 7420 7265 6369   the secret reci
0000050: 7065 2e2e 2e20 4920 6a75 7374 2064 6f77  pe... I just dow
0000060: 6e6c 6f61 6465 6420 6974 2066 726f 6d20  nloaded it from
0000070: 7468 6520 6669 6c65 2073 6572 7665 722e  the file server.
0000080: 204a 7573 7420 636f 7079 2074 6f20 6120   Just copy to a
0000090: 7468 756d 6220 6472 6976 6520 616e 6420  thumb drive and
00000a0: 796f 7527 7265 2067 6f6f 6420 746f 2067  you're good to g
00000b0: 6f20 2667 743b 3a2d 2900 0300 002a 0200  o &gt;:-)....*..
00000c0: 6200 2200 0400 1400 0000 0000 4600 0000  b.".........F...
00000d0: 0000 0000 0000 010b 5365 6335 3538 7573  ........Sec558us
00000e0: 6572 3100 002a 0256 d400 cb00 0100 0a80  er1..*.V........
00000f0: 0085 2a8b 4100 0e00 0200 0400 0000 4500  ..*.A.........E.
0000100: 0100 0200 0300 0100 0100 0000 5000 0009  ............P...
0000110: c400 0007 d000 0005 dc00 0003 2000 0017  ............ ...
0000120: 7000 0017 7000 0094 dc00 0002 0000 0050  p...p..........P
0000130: 0000 0bb8 0000 07d0 0000 05dc 0000 03e8  ................
...
00007c0: 2000 0400 0c00 0000 0000 4935 3038 3834   .........I50884
00007d0: 3936 0000 010b 5365 6335 3538 7573 6572  96....Sec558user

There are a few possibilities for the next step: either just reading the hex or writing a script to parse the conversation. The script version would obviously have been preferred, but I will leave that as an exercise for the reader (or for a later post).  But to give a small preview of how this would be handled, we can see that every conversation starts with the following pattern:

  • Two bytes indicating the type of client
  • One byte, indicating the length of the user name
  • The user name (equal to the length byte sent previously)

So for this conversation, we can see the following:

0001 0b 5365 6335 3538 7573 6572 3100
00 01 =>  AIM_CLIENTTYPE_MC
0b => length of user name is 0x0b or 11 bytes
username => Sec558user1

We can then see that the username is “Sec558user1”.  Some more header values follow before we reach the real conversation.  Again, a Perl script to parse and display the IM content in a “nice” way is left as an exercise for the reader, or possibly for a later blog post here.
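
As a rough preview of what such a parser could look like, here is a minimal Python sketch (not the Perl script mentioned above). It only decodes the three fields from the pattern described above; the function name parse_im_prefix is made up for this illustration, and the interpretation of the fields simply follows that pattern.

```python
import struct


def parse_im_prefix(data: bytes):
    """Decode client type, name length and user name as described above."""
    # Two bytes of client type followed by one length byte, big-endian.
    client_type, name_len = struct.unpack_from(">HB", data, 0)
    # The user name follows directly and is name_len bytes long.
    user_name = data[3:3 + name_len].decode("ascii")
    return client_type, name_len, user_name


# The bytes from the example above (bytes.fromhex skips the spaces).
sample = bytes.fromhex("0001 0b 5365 6335 3538 7573 6572 3100")
print(parse_im_prefix(sample))  # prints (1, 11, 'Sec558user1')
```

A real parser would first have to locate this pattern inside each message and then walk the remaining header values, which this sketch does not attempt.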

Another option is to use existing tools to read the IM conversation, for example aimsniff:

perl aimSniff.pl --nodb --read evidence.pcap
Working in Offline Mode
Reading File: evidence.pcap
Handle File not defined
Type: Outgoing Message
 Timestamp: 2009-8-13 05:57:37
 direction: 00040006
 proto: AIM
 handle: Sec558user1
 ip: 192.168.1.158
 fromHandle:
 message: Here's the secret recipe... I just downloaded it from the file server.
Just copy to a thumb drive and you're good to go &gt;:-)

Type: Incoming Message
 Timestamp: 2009-8-13 05:58:12
 direction: 00040007
 proto: AIM
 handle:
 ip: 192.168.1.158
 fromHandle: Sec558user1
 message: <HTML><BODY><FONT FACE="Arial" SIZE=2 COLOR=#000000>thanks
dude</FONT></BODY></HTML>

Type: Incoming Message
 Timestamp: 2009-8-13 05:58:26
 direction: 00040007
 proto: AIM
 handle:
 ip: 192.168.1.158
 fromHandle: Sec558user1
 message: <HTML><BODY><FONT FACE="Arial" SIZE=2 COLOR=#000000>can't
wait to sell it on ebay</FONT></BODY></HTML>

untieing hashes

Dumping Pcap Stats:
Packets Received: 137957384
Packets Dropped: 141197632
Packets Dropped by interface: 3083938203
Dumping AIM Stats:
Buddy List Captured: 0
File Xfer Count: 0
Login Count: 0
Chat Message Count: 0
Message Count: 3

Dumping MSN Stats:
Login Count: 0
Message Count: 0
Parent process exiting

The client types that are defined inside the header are the following:

#define AIM_CLIENTTYPE_UNKNOWN  0x0000
#define AIM_CLIENTTYPE_MC       0x0001
#define AIM_CLIENTTYPE_WINAIM   0x0002
#define AIM_CLIENTTYPE_WINAIM41 0x0003
#define AIM_CLIENTTYPE_AOL_TOC  0x0004

By examining this we see that the username is Sec558user1. We could also find this out by issuing a simple strings command against the file:

strings -a -t d evidence.pcap > evidence.str
cat evidence.str
...
2277 Sec558user1
2308 Here's the secret recipe... I just downloaded it from the file server.
Just copy to a thumb drive and you're good to go &gt;:-)
2612 Sec558user1
...
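
For those who prefer a script over the strings binary, the same idea can be sketched in a few lines of Python. This is only an approximation of what strings -a -t d does (runs of at least four printable ASCII bytes, reported with their decimal offset); the function name ascii_strings is my own, and the real tool also treats a few more characters, such as tabs, as printable.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4):
    """Yield (decimal offset, text) for each run of printable ASCII bytes."""
    # 0x20-0x7e covers space through tilde; shorter runs are ignored.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


def dump_strings(path: str) -> None:
    """Print every string found in the file at `path`, one per line."""
    with open(path, "rb") as handle:
        for offset, text in ascii_strings(handle.read()):
            print(offset, text)
```

Calling dump_strings("evidence.pcap") would then print offset and string pairs in the same spirit as the listing above.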

We also see in the same output (strings):

67136 GET /adiframe/3.0/5113.1/221794/0/-1/size=120x90;noperf=1;alias=93245558;
cfp=1;noaddonpl=y;artexc=all;artinc=art_image%2Cart_img1x1%2Cart_3pimg%2C
art_text%2Cart_imgtrack;kvmn=93245558;target=_blank;aduho=360;grp=143115875;
misc=143115875 HTTP/1.1
67383 Accept: */*
67396 Referer: http://www.aim.com/redirects/inclient/AIM_UAC_v2.adp?magic=93245558
&width=120&height=90&sn=Sec558user1
67509 Accept-Language: en-us
67533 Accept-Encoding: gzip, deflate
67565 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
67634 Host: at.atwola.com
67655 Connection: Keep-Alive
67679 Cookie: JEB2=4A839DDB6E65181C45921CB2F00016D8;
ATTACID=a3Z0aWQ9MTU4NzdpYTAwYTh2Ymk=;ATTAC=a3ZzZWc9OTk5OTk6NTAyODA=;
badsrfi=V0d710994e8ccb8db64a83a07939b2; atdemo=a3ZhZz1hbTM6dWEzOTtrdnVnPTE7;
AxData=; atdses=0

That is, we can see an HTTP request whose Referer contains “sn=Sec558user1”, further strengthening our hypothesis that the username is Sec558user1.
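
Pulling the screen name out of such a Referer does not need any hand parsing. A small Python sketch using only the standard library, with the URL copied from the strings output above (joined back into one line):

```python
from urllib.parse import parse_qs, urlsplit

# Referer header value as it appears in the strings output above.
referer = (
    "http://www.aim.com/redirects/inclient/AIM_UAC_v2.adp"
    "?magic=93245558&width=120&height=90&sn=Sec558user1"
)

# parse_qs maps every query parameter to a list of its values.
query = parse_qs(urlsplit(referer).query)
screen_name = query["sn"][0]
print(screen_name)  # prints Sec558user1
```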

Let’s examine all connections that were created during the capture time.

pcapcat -r evidence.pcap
[1] TCP 192.168.1.2:54419 -> 192.168.1.157:80
[2] TCP 192.168.1.159:1271 -> 205.188.13.12:443
[3] TCP 192.168.1.159:1272 -> 192.168.1.158:5190
[4] TCP 192.168.1.159:1273 -> 64.236.68.246:80
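
To give an idea of how a listing like this can be produced without any libraries, here is a minimal Python sketch. It is not pcapcat itself, just an illustration under a few assumptions: the capture is a classic pcap file (not pcapng), the link type is Ethernet, the traffic is IPv4, and a "connection" is taken to be any TCP segment with SYN set and ACK clear. The function name list_tcp_connections is hypothetical.

```python
import socket
import struct


def list_tcp_connections(pcap_bytes: bytes):
    """Return (src, sport, dst, dport) for each TCP SYN in an Ethernet pcap."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # written on a big-endian machine
    else:
        raise ValueError("not a classic pcap file")

    connections = []
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_usec, captured length, original length.
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", pcap_bytes, offset)
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len

        # Need Ethernet (14 bytes) plus a minimal IPv4 header, EtherType 0x0800.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ip_header_len = (frame[14] & 0x0F) * 4
        if frame[14 + 9] != 6:  # IP protocol 6 is TCP
            continue
        tcp = 14 + ip_header_len
        if len(frame) < tcp + 14:
            continue

        sport, dport = struct.unpack_from(">HH", frame, tcp)
        flags = frame[tcp + 13]
        if flags & 0x02 and not flags & 0x10:  # SYN without ACK
            src = socket.inet_ntoa(frame[26:30])
            dst = socket.inet_ntoa(frame[30:34])
            connections.append((src, sport, dst, dport))
    return connections
```

Feeding it the raw bytes of a capture, for example open("evidence.pcap", "rb").read(), returns one tuple per connection attempt, which can then be numbered and printed in the style shown above.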

If we examine the created conversations, we see that only one occurred between two internal hosts (that is, a direct connection).  The other newly created sessions all originate from the unknown laptop (192.168.1.159), except one empty HTTP session.

[3] TCP 192.168.1.159:1272 -> 192.168.1.158:5190

This could be an indication of a file transfer.  We know that Ann’s IP address is 192.168.1.158 (given) and that the unknown laptop is 192.168.1.159.  Let’s examine this connection in further detail:

pcapcat -r evidence.pcap -w file -d 3

Try to find out which kind of file this is…

file file
file: data

No luck here, so let’s examine the header:

cat file | xxd | head -4
0000000: 4f46 5432 0100 0101 0000 0000 0000 0000  OFT2............
0000010: 0000 0000 0001 0001 0001 0001 0000 2ee8  ................
0000020: 0000 2ee8 0000 0000 b164 0000 ffff 0000  .........d......
0000030: 0000 0000 0000 0000 ffff 0000 0000 0000  ................

Now we see that the file has the magic number OFT2, indicating that this is an OFT file (Oscar File Transfer).

To extract the transferred file itself, I created a script called oftcat. To learn the structure I downloaded the source code for Pidgin and read libpurple/protocols/oscar/oft.c, which contains a nice description of the structure (in C).
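
To give a taste of that structure without going through all of it, here is a small Python sketch that decodes only the first fields of the header. It is not oftcat; the field names and order reflect my reading of the struct in oft.c and should be treated as an assumption, although they do line up with the bytes in the xxd dump above. The names OftPrefix and parse_oft_prefix are made up for this example.

```python
import struct
from collections import namedtuple

# Leading header fields only, named after my reading of libpurple's oft.c.
OftPrefix = namedtuple(
    "OftPrefix",
    "magic header_len type cookie encrypt compress total_files files_left "
    "total_parts parts_left total_size size mod_time checksum",
)
# Big-endian: 4-byte magic, two shorts, 8-byte cookie, six shorts, four longs.
_OFT_PREFIX = struct.Struct(">4sHH8sHHHHHHIIII")


def parse_oft_prefix(data: bytes) -> OftPrefix:
    """Decode the first 44 bytes of an OFT2 header."""
    header = OftPrefix(*_OFT_PREFIX.unpack_from(data, 0))
    if header.magic != b"OFT2":
        raise ValueError("not an OFT2 header")
    return header


# First three rows of the xxd output above.
sample = bytes.fromhex(
    "4f46 5432 0100 0101 0000 0000 0000 0000"
    "0000 0000 0001 0001 0001 0001 0000 2ee8"
    "0000 2ee8 0000 0000 b164 0000 ffff 0000"
)
header = parse_oft_prefix(sample)
print(header.header_len, header.total_size, header.checksum)
# prints 256 12008 2976120832
```

The header length field (256 here) tells a parser where the header ends, and the size field gives the number of bytes of file content to expect.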

We want to extract the file content, so let’s try it out. This is just to see the output: explaining the structure of an OFT file would make this blog post too long, but the subroutine read_header inside oftcat explains it for those who are interested.

oftcat -r file
------------------------------------------------------------
 File name: file
------------------------------------------------------------
Parsing OFT (Oscar File Transfer) header
Name of file transferred:
 Total number of files 1
 Files left: 1
 Total parts: 1
 Parts left: 1
 Total size: 12008
 Size: 12008
 Checksum: 2976120832
 ID string 'Cool FileXfer'
 Type: PEER_TYPE_GETFILE_RECEIVELISTING, PEER_TYPE_RESUMEACK,
PEER_TYPE_RESUME, PEER_TYPE_GETFILE_REQUESTLISTING, PEER_TYPE_RESUMEACCEPT,
PEER_TYPE_GETFILE_ACKLISTING, PEER_TYPE_PROMPT,
 Name offset 0
------------------------------------------------------------
------------------------------------------------------------
Parsing OFT (Oscar File Transfer) header
Name of file transferred:
 Cookie value: 7174647
 Total number of files 1
 Files left: 1
 Total parts: 1
 Parts left: 1
 Total size: 12008
 Size: 12008
 Checksum: 2976120832
 ID string 'Cool FileXfer'
 Flag: PEER_CONNECTION_FLAG_IS_INCOMING,
 Type: PEER_TYPE_GETFILE_RECEIVELISTING, PEER_TYPE_DONE, PEER_TYPE_RESUMEACK,
PEER_TYPE_RESUME, PEER_TYPE_RESUMEACCEPT, PEER_TYPE_GETFILE_RECEIVEDLISTING,
PEER_TYPE_ACK, PEER_TYPE_GETFILE_REQUESTFILE, PEER_TYPE_GETFILE_ACKLISTING,
 Name offset 28
------------------------------------------------------------
parsing file information
Final header (after file transfer)
------------------------------------------------------------
Parsing OFT (Oscar File Transfer) header
Name of file transferred:
 Cookie value: 7174647
 Total number of files 1
 Files left: 1
 Total parts: 1
 Parts left: 1
 Total size: 12008
 Size: 12008
 Checksum: 2976120832
 ID string 'Cool FileXfer'
 Flag: PEER_CONNECTION_FLAG_IS_INCOMING,
 Type: PEER_TYPE_GETFILE_RECEIVELISTING, PEER_TYPE_DONE, PEER_TYPE_RESUMEACK,
PEER_TYPE_RESUME, PEER_TYPE_RESUMEACCEPT, PEER_TYPE_GETFILE_RECEIVEDLISTING,
PEER_TYPE_ACK, PEER_TYPE_GETFILE_REQUESTFILE, PEER_TYPE_GETFILE_ACKLISTING,
 Name offset 28
------------------------------------------------------------
File: recipe.docx saved in file recipe.docx

We see that the transferred file is named “recipe.docx”, and oftcat saves it under that same name.

We can then use tools such as cat_open_xml.pl (an antiword equivalent for Office 2007 documents), or simply open the file in Word, to extract the content itself.

cat_open_xml.pl recipe.docx
Recipe for Disaster:
1 serving
Ingredients:
4 cups sugar
2 cups water
In a medium saucepan, bring the water to a boil. Add sugar. Stir gently over
low heat until sugar is fully dissolved. Remove  the  saucepan from heat.
Allow to cool completely. Pour into gas tank. Repeat as necessary.
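
Since a docx file is just a ZIP archive with XML inside, the text can also be pulled out with a few lines of Python. This is a minimal sketch of the idea and not a reimplementation of cat_open_xml.pl: it only reads word/document.xml and joins the text runs of each paragraph. The function name docx_paragraphs is my own.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by the w:p (paragraph) and w:t (text) tags.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(docx):
    """Return the text of every non-empty paragraph in a docx file.

    `docx` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W + "p"):
        text = "".join(node.text or "" for node in para.iter(W + "t"))
        if text:
            paragraphs.append(text)
    return paragraphs
```

Printing "\n".join(docx_paragraphs("recipe.docx")) should give output along the lines of what is shown above.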

And to extract metadata information from the file:

read_open_xml.pl recipe.docx
==========================================================================
 cmd line: ./read_open_xml.pl recipe.docx
==========================================================================
Document name: recipe.docx
Date: Fri Aug 14 20:09:28 GMT 2009
--------------------------------------------------------------------------
Application Metadata
--------------------------------------------------------------------------
 Template = Normal.dotm
 TotalTime = 1
 Pages = 1
 Words = 43
 Characters = 249
 Application = Microsoft Office Word
 DocSecurity = 0
 Lines = 2
 Paragraphs = 1
 ScaleCrop = false
 HeadingPairs = Title1
 TitlesOfParts =
 Company =
 LinksUpToDate = false
 CharactersWithSpaces = 291
 SharedDoc = false
 HyperlinksChanged = false
 AppVersion = 12.0000
--------------------------------------------------------------------------
File Metadata
--------------------------------------------------------------------------
 title =
 subject =
 creator = lmg
 keywords =
 description =
 lastModifiedBy = lmg
 revision = 2
 created = 2009-08-12T21:33:00Z
 modified = 2009-08-12T21:33:00Z
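
The “File Metadata” part of this output lives in docProps/core.xml inside the ZIP archive (the application metadata is in docProps/app.xml). A minimal Python sketch for reading the core properties, again only an illustration and not what read_open_xml.pl does internally; the function name docx_core_properties is made up:

```python
import xml.etree.ElementTree as ET
import zipfile


def docx_core_properties(docx):
    """Return the properties stored in docProps/core.xml as a dict.

    `docx` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    # Drop the "{namespace}" prefix so keys read like "creator" or "created".
    return {child.tag.split("}")[-1]: (child.text or "") for child in root}
```

Stripping the namespace keeps the sketch short; a stricter parser would match the full qualified names instead.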

So to answer the questions that we were asked:

  1. What is the name of Ann’s IM buddy?
    Sec558user1
  2. What was the first comment in the captured IM conversation?
    Here’s the secret recipe… I just downloaded it from the file server. Just copy to a thumb drive and you’re good to go &gt;:-)
  3. What is the name of the file Ann transferred?
    recipe.docx
  4. What is the magic number of the file you want to extract (first four bytes)?
    Well, OFT2 is the magic number of the file in transit.
    We then need to “extract” the real file (the docx file) that is carried inside the transit file.
    So the magic value, or the first four bytes, of the transit file is:
    4f46 5432       or OFT2

    And for the docx file (since that is a ZIP file) we have:
    504b 0304            or PK..

  5. What was the MD5sum of the file?
    8350582774e1d4dbe1d61d64c89e0ea1  recipe.docx
    52c13d8c0a99ac0d3210e8e8edb046bf  file
  6. What is the secret recipe?

Recipe for Disaster:
1 serving
Ingredients:
4 cups sugar
2 cups water
In a medium saucepan, bring the water to a boil. Add sugar. Stir gently over low heat until sugar is fully dissolved. Remove  the  saucepan from heat.  Allow to cool completely. Pour into gas tank. Repeat as necessary.
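
The magic numbers and MD5 sums in answers 4 and 5 can be checked with a short Python sketch like the one below. It is simply one way to do it (md5sum and xxd work just as well), and the function name magic_and_md5 is my own.

```python
import hashlib


def magic_and_md5(path, chunk_size=65536):
    """Return (first four bytes, MD5 hex digest) of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        magic = handle.read(4)
        digest.update(magic)
        # Read the rest in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return magic, digest.hexdigest()
```

Running it on recipe.docx and on the dumped transfer file should give the PK and OFT2 magic values together with the two checksums listed above.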

Using SIMILE for timeline visualization

September 2nd, 2009 kiddi 1 comment

As my previous post discussed, the new version of log2timeline includes the option to output the timeline as an XML document that can be read by timeline visualization tools such as the SIMILE timeline widget.

Timeline analysis can be very time consuming, especially since we are often dealing with a tremendous amount of data in a traditional timeline.  Tools like log2timeline, which extract timeline information from artifacts, only add to this problem by increasing the amount of information an investigator has to review.  Yet adding artifact information to the timeline can provide a great wealth of information, shortening the time an analyst needs to “solve” a case (given that the investigator manages to find the needle in the haystack).

Traditionally, timeline analysis has been a manual process, where an investigator needs to sift through the information in an Excel sheet or a text file (a la mactime output).  The notion of visually representing the timeline has always been an appealing one, since it could make the analysis easier.

One solution to the visualization problem is to use tools like SIMILE to represent the data.  An example of such a timeline can be found here.  It shows the timeline that I discussed in previous posts on the SANS Forensic blog site (here and here).

Although very promising, this approach has some problems.  For one, it requires some manual work from the investigator to get it working: creating an HTML file with the correct parameters to load the XML file, and having access to a web server to host the timeline.  Another, quite significant, problem is that this method does not help the investigator reduce the data set: there is no real way to remove unneeded events (just some simple filters), nor an easy way to zoom in and out of the timeline.

But still, this is one step towards better visualization of timeline analysis, something that can be further developed…

Categories: Forensics Tags:

Update to log2timeline and timeline visualization

September 2nd, 2009 kiddi No comments

I finally managed to push a new version of log2timeline out, version 0.30b, which contains several changes (see changelog). One of the changes is support for timeline visualization tools.

I wasn’t able to find that many timeline visualization tools that I could use. I will mention a few of the projects that I’ve seen that can visually represent a timeline.

  • Zeitline, a tool written in Java in 2005 that hasn’t been updated since.
  • SIMILE timeline project, which is a widget that can be put on a web site, reads an XML document and produces a very nice visual timeline.
  • CFTL or the CyberForensics TimeLab, which was written by Jens Olsson and Martin Boldt (a paper describing it can be found here).

There are pros and cons to every one of those tools. Some of the problems that I can think of are:

  • Zeitline is not very flexible and is difficult to get working.
  • SIMILE requires the user to create an HTML file that describes the timeline, and to use a web server (which can be a local one) to visually inspect the timeline.
  • CFTL is still a beta tool that hasn’t been released (and will probably not be free anyway).

The most promising visualization tool that I saw was definitely the CFTL, although I haven’t been able to test it myself, since there is no publicly available version out there.

I decided to create some output modules for log2timeline so that timelines created by the tool could be visually analyzed using one of these tools.  Since both the SIMILE and CFTL projects use XML documents to describe the timeline, it was quite easy to create an output module that writes a file these tools can read.  As soon as I’ve tested and evaluated both of these tools I will post reviews and show how they can be used to augment timeline analysis (using log2timeline to create the timeline and these tools to visually represent it).
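
To illustrate why this is easy, here is a minimal Python sketch that writes timeline events as XML. It is not the log2timeline output module (which is written in Perl), and the layout is an assumption on my part: a data root element with one event element per entry, carrying start and title attributes and the description as text, which is how I understand the SIMILE event source format. The function name simile_xml and the sample event are purely illustrative.

```python
import xml.etree.ElementTree as ET


def simile_xml(events):
    """Build a SIMILE-style XML string from (start, title, description) tuples."""
    root = ET.Element("data")  # assumed root element of the event source
    for start, title, description in events:
        event = ET.SubElement(root, "event", {"start": start, "title": title})
        event.text = description
    return ET.tostring(root, encoding="unicode")


# Illustrative event only; real input would come from the parsed artifacts.
document = simile_xml(
    [("Aug 12 2009 21:33:00 GMT", "recipe.docx created", "creator: lmg")]
)
print(document)
```

Using an XML library instead of string concatenation means that characters such as & or < in file names and descriptions are escaped automatically.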

Categories: Forensics Tags: