Today I finally got a hacked-together minimal version of the iPhone debugger client for BinNavi working. It's heavily based on Patrick Walton's weasel debugger (with HD's updates). Once tied to BinNavi's debug client framework, the whole client-server interaction is trivial.
It feels just right: the best-looking debugger together with the slickest device... a recipe for fun ;-)
The test application is telnet on the iPhone. On the iPhone's screen is the debug output from BinNavi's debug client. telnet is launched from an ssh session in OSX, where BinNavi is running.
For anybody trying to link Mach's debugging interface with a C++ iPhone application, remember the extern "C" when declaring boolean_t exc_server(mach_msg_header_t *in, mach_msg_header_t *out); (which is not declared in the header files, as pointed out in weasel's source code). Otherwise you'll get a nasty "Undefined symbols" message when linking.
extern "C" is also needed for catch_exception_raise(...) so exc_server can call it to handle exceptions. Documented here. (I've used the standard iPhone toolchain on Debian; this is running on firmware 1.1.3.)
Some weeks ago I started updating pydot to support all the attributes and enhancements in GraphViz 2.16. While attempting to make it pass all the regression tests, some of its severe shortcomings became apparent. pydot users had also provided insight into how to improve performance by redesigning the way the data for the objects is stored internally. All in all, the limitations I was facing led me to rewrite the whole core of pydot, which took much longer than I wanted, but I feel it was well worth it as it's orders of magnitude better than the last release, 0.9.
Performance-wise, the new pydot stores graphs and their objects using a hierarchy of nested dictionaries and lists. Graph, Node and Edge objects are mere proxies to the data and are created on demand. It's now possible to have a graph with 1 million edges without a single Edge instance existing (instances are only created when requested, mapping the data and providing all the methods to act on the data in the global dictionary). Storing a graph with 1 million edges in pydot 1.0 has approximately the same memory requirements (~813MiB) as dealing with one with only 40,000 edges in pydot 0.9 (~851MiB); the 40,000-edge graph needs ~35MiB in pydot 1.0. Handling graphs should be much faster, as no linear searches are performed in pydot 1.0.2.
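To make the storage idea above more concrete, here is a minimal sketch of the general pattern: all data sits in nested dictionaries, and a proxy object is only built when somebody asks for an edge. The names Graph, EdgeProxy, get_edge and the layout of the store are made up for illustration; they are not pydot's actual classes or internals.

```python
# Hypothetical sketch: these names are illustrative, not pydot's real internals.

class EdgeProxy:
    """Thin view over one edge's attribute dictionary in the shared store."""

    def __init__(self, store, key):
        self._store = store   # the graph-wide dictionary
        self._key = key       # (source, destination) pair

    def get(self, name, default=None):
        return self._store["edges"][self._key].get(name, default)

    def set(self, name, value):
        # Writes go straight to the shared dictionary, not to the proxy.
        self._store["edges"][self._key][name] = value


class Graph:
    def __init__(self):
        # All data lives in nested dictionaries; no per-edge objects are kept.
        self._store = {"nodes": {}, "edges": {}}

    def add_edge(self, src, dst, **attrs):
        self._store["nodes"].setdefault(src, {})
        self._store["nodes"].setdefault(dst, {})
        self._store["edges"][(src, dst)] = dict(attrs)

    def get_edge(self, src, dst):
        # Dictionary lookup by key, so no linear search over the edges.
        if (src, dst) not in self._store["edges"]:
            return None
        return EdgeProxy(self._store, (src, dst))  # created only on request

    def edge_count(self):
        return len(self._store["edges"])


graph = Graph()
for i in range(1000):
    graph.add_edge(i, i + 1, weight=i)

edge = graph.get_edge(10, 11)
edge.set("color", "red")
print(graph.edge_count(), edge.get("weight"), graph.get_edge(10, 11).get("color"))
```

Since the proxy holds only a reference to the store and a key, two proxies for the same edge always see the same data, and throwing a proxy away loses nothing.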
And ran it through the packer time-series I harvested from Google Groups. Then I picked some widget demo code and put it all together in a mash-up. The results of the quick hack are here... much nicer to visualize than in the previous post. (and it's interactive!)
Use the mouse-wheel to zoom
Drag the plot left/right to browse around different date ranges
You can pick any packer and the data will be plotted against the previously selected one
Lately an animation of a woman has been going around. It shows a rotating silhouette; the catch is that it can be perceived to be rotating either clockwise or counter-clockwise. Once one direction has been recognized, it tends to be a bit hard to switch to the other (at least in my case), although I've read that for some people it switches direction more or less randomly after looking at it for a while. I was curious as to why it works, whether I could reproduce the trick, and whether I could make myself see her rotating in one direction or the other at will.
Why it works is relatively straightforward. Whether the rotation is clockwise or counter-clockwise is impossible to say if it happens in the same plane as the one where the viewer's viewpoint lies and there's no feeling of depth. The brain needs perspective in order to tell the direction for sure: perspective makes objects that are farther away look smaller and closer ones bigger, which helps the brain favor one direction over the other. The dancing woman has been created in such a way that it appears to have some perspective, yet it's still ambiguous (and you can see things jumping strangely during the rotation as a result of this composition; just pay attention to the magical stretch of the arm closer to the body when it passes in front/behind).
It's easy to reproduce; just look at this example I quickly put together. With perspective it's easy to tell whether it rotates in one direction or the other. We can display it either by setting the viewpoint above, or directly in front with a large aperture angle that exaggerates the perspective.
Then, if we set the viewpoint in front, yet so far away that the projection lines become nearly parallel, we lose the sense of perspective. It becomes much harder to tell the direction of motion and it's even possible to see it going in both directions.
Then, regarding choosing one direction over the other at will... I figured that the only thing preventing my brain from deciding is the ambiguity caused by the lack of information that would bias some layer of my neural networks towards clockwise or counter-clockwise... So I went really high tech and started moving my finger in front of the dancing woman in the direction I wanted to see her rotate... That seems to resolve the ambiguity, and I can make her turn one way or the other at will...
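The effect of moving the viewpoint away can be checked with a bit of arithmetic. The sketch below is not the code behind my example; it assumes a simple pinhole projection of a single point circling a vertical axis, and the function names and the distances tried are made up for illustration. It measures how different the clockwise and counter-clockwise rotations look on screen.

```python
import math

def screen_x(angle, viewer_distance, radius=1.0):
    """Horizontal screen position of a point circling a vertical axis.

    Assumed pinhole model: the point sits at (x, z) = (r*cos(a), r*sin(a)) and
    the viewer is viewer_distance away along the z axis (must exceed radius).
    """
    x = radius * math.cos(angle)
    z = radius * math.sin(angle)
    # Scale by viewer_distance so the on-screen size stays comparable.
    return viewer_distance * x / (viewer_distance - z)

def direction_difference(viewer_distance, steps=64):
    """Largest on-screen gap between the two rotation directions."""
    gap = 0.0
    for i in range(steps):
        a = 2 * math.pi * i / steps
        # Angle a (counter-clockwise) versus angle -a (clockwise).
        diff = screen_x(a, viewer_distance) - screen_x(-a, viewer_distance)
        gap = max(gap, abs(diff))
    return gap

for distance in (2.0, 10.0, 1000.0):
    print(distance, direction_difference(distance))
```

With the viewer close, the two directions produce clearly different positions; as the distance grows the gap shrinks towards zero, which is the ambiguity the silhouette exploits.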
The other day I was talking with a friend and the discussion went into when certain anti-disassembly, anti-debug, etc. techniques might have appeared. That's bound to be difficult because tricks are usually simultaneously discovered by different people.
So I thought: a trick will usually be regarded as "common" once it gets implemented in some packer, as those try to make analysis difficult and will attempt to embed whichever tricks are good/popular within the underground at the time, in order to make the reverse engineering process as cumbersome as possible. Therefore, if I could somehow place packers in time, I'd have a starting point...
That reminded me of Google Groups. It's possible to make queries restricted to date ranges, and the archives go back to 1981. I quickly put together a script to scan with a one-month window from 1981 to 2007 for a set of popular packers.
The most painful part of the whole process was fooling Google... they sure do not like robots... Whenever they get a bunch of very simple automated queries they'll serve back a "403 Forbidden" saying the queries look like they're coming from a virus or spyware app... But my script is good, it's no evil spyware... so I got into the mood of working my way around the checks. I needed to do quite a lot of queries (> 10K), so I'd better make them believe I'm not a robot. Besides finding the right timing for the queries (too often will make Google sad), I had to distribute the search over a few hosts and randomize the headers, the User-Agents and the query itself (just throw in some randomized, "orthogonal" search terms that have nothing to do with your query). After that the script was good to go...
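The one-month scanning window is simple to generate. This is a minimal sketch and not the actual script; the function name is made up, and it assumes the scan covers every calendar month of both the first and the last year.

```python
from datetime import date

def month_windows(first_year, last_year):
    """Yield (start, end) date pairs, one per calendar month, end exclusive."""
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            start = date(year, month, 1)
            # December rolls over into January of the following year.
            end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
            yield start, end

windows = list(month_windows(1981, 2007))
print(len(windows), windows[0], windows[-1])
```

Each window would then be paired with each packer name to build one date-restricted query.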
So, after mining the news groups for popular packer names (the search string was, most of the time, " exe" plus the "randomized" terms), I got a cute small data set to throw into Mathematica...
The results will have some inaccuracies, as it's possible some of the terms appeared in news posts not related to the packers. Yet I think they look plausible. When the volume of hits is high enough or constant over time, it feels like it would indicate the approximate release date of the packer in question, or at least the first public discussion about it, which, I would tend to think, will not necessarily be too far apart. If someone can either corroborate or refute the data, I'd be glad to hear about it.
I also did some tests overlaying virus release times in order to try to spot correlations between big outbreaks and news posts about packers, but I couldn't see anything particularly significant.
I've always found that clear diagramming and laying out of complicated information makes it much more accessible and understandable. When I started looking into the Portable Executable format I found it really helpful to lay out all the headers and structures I was trying to understand, to visualize how they relate to each other and the information they contain. The resulting diagrams have been available under the corresponding section in OpenRCE for some time already.
Now, given the feedback I received about some of those, I decided to put them up in an online store so people can get the real posters, high-resolution, updated and redesigned versions of those diagrams.
Rantings on whatever I'm tinkering with... computer security, reverse engineering, tools, mathematics, linguistics, economics, etc.
Visit also dkbza, my site.