Wednesday, July 25, 2007

Supercomputing done with style

I was just reading through a list of the top 7 supercomputers and saw this:




That's some style putting together a supercomputer, what a location!
It's the MareNostrum in Barcelona.

Saturday, July 14, 2007

BlackHat Vegas is nearly here...

I'll be there, teaching a couple of rounds (weekend and week) of our training with Pedram, Reverse Engineering on Windows: Application in Malicious Code Analysis, and then ranting together with Halvar in a turbo talk, 4 x 5.

Pedram and myself "live on stage"


For the people more into cutting-edge vulnerability research, Halvar will also be doing his Analyzing Software for Security Vulnerabilities. Feel free to grab any of us during the conference if you have any questions regarding BinDiff, BinNavi or VxClass.

And now that I am in the mood for advertising things, be sure to check OpenRCE's event calendar; you can even subscribe to the iCal feed. I try to keep it up to date with whatever events reach my ears. If anyone knows of more, please let me know.

Saturday, July 07, 2007

Windows XP and Bochs

This might be of interest to anyone out there who has been trying, with no luck, to get Windows XP to install inside Bochs.

I had not been able to get it to install in any recent version. For one reason or another, setup always failed with a problem regarding the "catalogs" (the error message was along the lines of "Setup failed to install the product catalogs. This is a fatal error."), no matter what options I had compiled into Bochs.

I had got it to install in the past, a long, long time ago, so I figured I might get lucky with other versions of Bochs and started trying: 1.4.1, nothing; 2.0.2, nothing; 2.1.1, nothing (trying each of them with different configuration options took a while)... but finally I got to 2.2.6 and bingo! It made it through with no errors. Once installed, the image runs just fine in the latest incarnation of Bochs. Here it is, Bochs running Windows XP on Fedora running inside Parallels on my OS X...

Virtualization(emulation) = madness^2 ?


I hope this saves somebody from hours of compiling, recompiling and reinstalling...

Friday, July 06, 2007

Scanning data for entropy anomalies II

Recently Phantal (aka Brian) left some comments on my blog and in OpenRCE on some calculations he did following up on my post Scanning data for entropy anomalies.

His algorithm aims to improve the execution speed of the entropy "scanner" example I had shown. I went through his steps and arrived at the same conclusions he reached in his latest comment. I thought it would be worth showing his work as a separate post rather than just a comment.

His idea is to look at the standard definition of entropy, isolate everything in the expression that doesn't change when the window slides, and just update the entropy instead of blindly recalculating its value from scratch for each offset of the scan window.

Shannon's entropy, usually represented as H, takes the following form if we work with the 256 possible byte values as the symbols:

$$H = -\sum_{b=0}^{255} p(b) \log_2 p(b)$$

where p(b) is the probability of the occurrence of a given byte.

H, the entropy, will tend towards its maximum value, 8, if the data has the maximum possible entropy. In such a case the probability of each byte occurring would be the same, $p(b) = 1/256$, which produces

$$H = -\sum_{b=0}^{255} \frac{1}{256} \log_2 \frac{1}{256} = \log_2 256 = 8$$

Note that, although this is usually thought of as measuring the "amount of randomness", that is not quite the case. A sequence of bytes starting at 0 and increasing to 255, going through all the values in order, would reach the maximum entropy value of 8, even though it is anything but random.
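
A quick way to check that claim is the following minimal Python 3 sketch. The `shannon_entropy` helper is a name made up for this illustration and is not part of the scanner discussed in this post.

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Shannon entropy of a byte sequence, in bits per byte."""
    total = len(data)
    # p * log2(1/p) is the same quantity as -p * log2(p)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(data).values())

ordered = bytes(range(256))   # 0, 1, 2, ..., 255: completely predictable
constant = bytes(256)         # 256 zero bytes

print(shannon_entropy(ordered))    # every value appears once, p(b) = 1/256
print(shannon_entropy(constant))   # a single symbol with p = 1
```

The ordered sequence scores the maximum of 8 bits per byte while the constant one scores 0, so the measure only reflects how evenly the byte values are distributed, not whether their order is predictable.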

The probability of a given byte appearing in our window can also be expressed as $p(b) = n_b / W$, $n_b$ being the number of times the byte appears within the window and $W$ the width of the window.

The expression for the entropy can be expanded as follows

$$H = -\sum_{b=0}^{255} \frac{n_b}{W} \log_2 \frac{n_b}{W}$$
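
One way to see why an incremental update is possible, sketched here with $n_b$ for the number of times byte $b$ appears in the window and $W$ for the window width (both symbol names are choices made for this illustration): substituting $p(b) = n_b / W$ and using $\sum_b n_b = W$, with the usual convention $0 \log_2 0 = 0$, gives

$$H = -\sum_{b=0}^{255} \frac{n_b}{W} \log_2 \frac{n_b}{W} = \log_2 W - \frac{1}{W} \sum_{b=0}^{255} n_b \log_2 n_b$$

The width $W$ is fixed during a scan, so the only parts of this expression that can change when the window moves by one byte are the two terms of the sum whose counts change.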


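The entropy after sliding the window, $H'$, will have the same sum expansion except for two terms, the ones for the bytes leaving and entering the window. We can then just update those and recalculate the expression by first removing the old values for the incoming and outgoing bytes and then adding the new values for both, after updating their counts. Writing $p$ and $p'$ for the probabilities before and after the slide:

$$H' = H + p(b_{\mathrm{out}}) \log_2 p(b_{\mathrm{out}}) + p(b_{\mathrm{in}}) \log_2 p(b_{\mathrm{in}}) - p'(b_{\mathrm{out}}) \log_2 p'(b_{\mathrm{out}}) - p'(b_{\mathrm{in}}) \log_2 p'(b_{\mathrm{in}})$$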

and that's all. Now on to some implementations in Mathematica and Python (the latter wrapped as a Mathematica function with Pythonika). His implementation in C can be found in the comments of the previous post.



EntropyScan = Function[{Data, WindowScanSize},

  SummationTerm [Prob_] := If[Prob > 0, Prob Log[2, Prob], 0];

  (* Get the initial chunk and calculate the entropy *)
  CurrentChunk = Data[[ Range[1, WindowScanSize] ]];

  (* Calculate initial byte count and probabilities *)
  ByteCounts = Table[Count[CurrentChunk, i - 1], {i, 1, 256}];
  ByteProbs = Table[ByteCounts[[i]]/WindowScanSize, {i, 1, 256}];

  FilteredByteProbs = Select[ByteProbs, # > 0 &];
  H = - Total[
    Table[FilteredByteProbs[[i]] Log[2, FilteredByteProbs[[i]]],
    {i, 1, Length[FilteredByteProbs]}]];
  Entropies = {H};

  (* Slide the window and recalculate for incoming and outgoing bytes *)
  For[offset = 1, offset + WindowScanSize <= Length[Data], offset++,

    (* Get incoming and outgoing bytes *)
    ByteOut = Data[[offset]] + 1;
    ByteIn = Data[[offset + WindowScanSize]] + 1;

    (* Get the old probabilities *)
    OldValByteOut = SummationTerm[ByteProbs[[ByteOut]]];
    OldValByteIn = SummationTerm[ByteProbs[[ByteIn]]];

    (* Update counters and values *)
    ByteCounts[[ByteOut]]--;
    ByteCounts[[ByteIn]]++;
    ByteProbs[[ByteOut]] = ByteCounts[[ByteOut]]/WindowScanSize;
    ByteProbs[[ByteIn]] = ByteCounts[[ByteIn]]/WindowScanSize;

    (* Get the new probabilities *)
    ValByteOut = SummationTerm[ByteProbs[[ByteOut]]];
    ValByteIn = SummationTerm[ByteProbs[[ByteIn]]];

    (* Update the entropy *)
    H = H + OldValByteOut + OldValByteIn - ValByteIn - ValByteOut;

    Entropies = Append[Entropies, H];
  ];

  Entropies
];





Py["import math"]

EntropyScanPython = PyFunction["\<
def entropy_scan(args):
  data = args[0]
  window_size = float(args[1])

  summation_term = lambda p: p*math.log(p,2) if p>0 else 0

  current_chunk = data[:int(window_size)]
  byte_counts = [current_chunk.count(i) for i in range(256)]
  byte_probs = [byte_counts[i]/window_size for i in range(256)]

  H = -sum(
    [byte_probs[i]*math.log(byte_probs[i], 2)
      for i in range(256) if byte_probs[i]>0])
  entropies = [H]

  # window_size is a float (kept so for the divisions), but range() needs an int
  for offset in range(len(data)-int(window_size)):
    byte_out, byte_in = data[offset], data[int(offset+window_size)]

    old_val_byte_out = summation_term(byte_probs[byte_out])
    old_val_byte_in = summation_term(byte_probs[byte_in])

    byte_counts[byte_out] -= 1
    byte_counts[byte_in] += 1
    byte_probs[byte_out] = byte_counts[byte_out]/window_size
    byte_probs[byte_in] = byte_counts[byte_in]/window_size

    val_byte_out = summation_term(byte_probs[byte_out])
    val_byte_in = summation_term(byte_probs[byte_in])

    H = H + old_val_byte_out + old_val_byte_in - val_byte_out - val_byte_in
    entropies.append(H)
  return entropies
\>"];
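
To gain some confidence in the incremental update, it can be compared against recomputing every window from scratch. What follows is a standalone Python 3 sketch under my own assumptions, not Phantal's C code nor the Pythonika function above: the names `window_entropy` and `entropy_scan`, the test data and the window size of 256 are all made up for this illustration.

```python
import math
import random
from collections import Counter

def window_entropy(chunk):
    """Entropy of one window, recomputed from scratch (the slow way)."""
    total = len(chunk)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(chunk).values())

def entropy_scan(data, window_size):
    """Entropy at every window offset, updating only the two changed terms."""
    def term(count):
        # p * log2(p) for one byte value, taking 0 * log2(0) as 0
        if count == 0:
            return 0.0
        p = count / window_size
        return p * math.log2(p)

    counts = [0] * 256
    for b in data[:window_size]:
        counts[b] += 1
    h = -sum(term(n) for n in counts)
    entropies = [h]
    for offset in range(len(data) - window_size):
        byte_out, byte_in = data[offset], data[offset + window_size]
        if byte_out != byte_in:  # same byte leaving and entering: no change
            h += term(counts[byte_out]) + term(counts[byte_in])  # drop old terms
            counts[byte_out] -= 1
            counts[byte_in] += 1
            h -= term(counts[byte_out]) + term(counts[byte_in])  # add new terms
        entropies.append(h)
    return entropies

# Hypothetical test data: a noisy region followed by a run of zero bytes
random.seed(0)
data = bytes(random.randrange(256) for _ in range(2000)) + bytes(500)

fast = entropy_scan(data, 256)
slow = [window_entropy(data[i:i + 256]) for i in range(len(data) - 256 + 1)]
max_error = max(abs(a - b) for a, b in zip(fast, slow))
print(len(fast), max_error)
```

The two lists should agree up to floating-point rounding. Since the incremental version accumulates a little rounding error at every step, a very long scan might want to recompute the value from scratch every so often.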

Wednesday, July 04, 2007

iPhone restore image on the loose

While getting my morning coffee and browsing through the news feeds, I bumped into a post pointing to the iPhone's restore image. Apparently it's been making the rounds for a couple of days already.

On July 2nd there was already a thread on Full Disclosure discussing the contents. Of special interest is the password-protected 82 MB image "694-5262-39.dmg", which likely contains the whole of the iPhone's software. People already seem to be trying to crack it open.

iPhone madness in San Francisco Apple Store

Also, a couple of UNIX passwords for two user accounts were cracked, but that might be of limited use.

The photos are from the iPhone launch in San Francisco on the 28th of June. That was something worth seeing.

Tuesday, July 03, 2007

Talk and visualization of third world statistics by Hans Rosling

A friend just pointed me to a great talk, titled "Debunking third-world myths with the best stats you've ever seen", by Hans Rosling, the founder of Gapminder.



I like their dynamic visualization of statistical data. In my opinion, adding the time dimension to the data definitely gives a better understanding of how the systems under study evolve, and this is an excellent example.