Are You Prepared for the NSA?

IMHO, the greatest value of the Snowden leaks is that they illustrate the threats you face. Even if the NSA and related organizations were dismantled today, you would still face the same threats and attack strategies as described in these documents.

The threats may come from other countries' state operations (almost all countries have them now), professional cyber criminal organizations, "Ocean's 11" style hacker teams, sophisticated individuals, and fully automated, multi-vector mal-systems perhaps operating independently of their creators (25 years ago Robert T. Morris accidentally released such a beast).

The latest revelations from the Snowden leaks are a series of documents, including one titled "I Hunt Sys Admins".

Of course sys admins are going to be targets!

So will your organizational partners, your suppliers and contractors, your supply chain, and your employees. I remember hearing a story in which a VMS system administrator received a tape in the mail with a new OS release. It was a Trojaned operating system. VMS? Software distribution by tapes? Yes, these techniques are that old.

So while you can wring your hands and decry the NSA's activities, there is one thing you should absolutely ask yourself:

Am I prepared to deal with these types of threats?

Windows EVTX Log Format

About four years ago I added Windows' EVTX audit log analysis to my Audit Monitoring Framework (AMF) code base. AMF is the foundation library for a number of my software tools, including Audit Explorer, Data Fence, and Audit Viewer.

Unfortunately, at the time there seemed to be very little detailed information about using Windows auditing (configuring auditing and analyzing the data) and virtually nothing about the underlying binary data format of the log files in case you wanted to write your own tools. That led to a large number of experiments and reverse engineering of the data. The results of that work were two documents:

Windows 7 Auditing: An Introduction

Windows 7’s auditing system can provide a rich source of information to detect and analyze a wide range of threats against computer systems. Unfortunately, few people know this auditing system exists, much less how to turn it on and configure it. This paper provides step-by-step instructions to configure a simple audit policy useful for understanding how data was exfiltrated from the computer.

Windows 7 Security Event Log Format

The Windows security event log provides a rich source of information to detect and analyze a wide range of threats against computer systems. Unfortunately, Windows' tool for viewing these logs, Event Viewer, is extremely limited in its functionality. Furthermore, there are very few third-party analysis tools to fill the gap between what the Event Viewer provides and the potential information that can be leveraged from the security event logs. One potential reason for this gap is that the format of these event logs is poorly documented, making it very difficult for third-party developers to write tools to analyze the data. This paper documents the event log format, thus providing a blueprint for developers to create native tools to analyze Windows 7 event logs. We begin by providing an overview of the format and key concepts that are needed to understand the details. Then we dive into a detailed description of the syntax of the event log format.
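
As a small taste of the format, here is a minimal sketch (not code from AMF) that checks the magic strings at the start of an EVTX file and its first chunk. The offsets follow the published EVTX layout: a 4096-byte file header whose signature is "ElfFile\0", followed by 64 KB chunks whose signature is "ElfChnk\0". This only peeks at the magic values; it is not a parser.

    /* evtx_peek.c -- verify the magic strings of a Windows EVTX log. */
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s file.evtx\n", argv[0]);
            return 1;
        }
        FILE *fp = fopen(argv[1], "rb");
        if (fp == NULL) { perror("fopen"); return 1; }

        char file_magic[8] = {0}, chunk_magic[8] = {0};
        fread(file_magic, 1, 8, fp);        /* file header at offset 0 */
        fseek(fp, 4096, SEEK_SET);          /* first chunk follows the 4096-byte header */
        fread(chunk_magic, 1, 8, fp);
        fclose(fp);

        printf("file header: %s\n",
               memcmp(file_magic, "ElfFile\0", 8) == 0 ? "ElfFile OK" : "bad magic");
        printf("first chunk: %s\n",
               memcmp(chunk_magic, "ElfChnk\0", 8) == 0 ? "ElfChnk OK" : "bad magic");
        return 0;
    }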

I added the Windows 7 EVTX parsing and analysis capability to AMF, and built a number of internal tools to analyze Windows 7 audit data. I posted a write-up and screenshots of an internal version of Audit Explorer analyzing the data: Analyzing Windows EVTX Logs. I also posted a video showing additional analysis tools using data flow analysis to track insiders collaborating to exfiltrate classified information (this was in 2010, well before Edward Snowden): Windows 7 Audit Trails: Exfiltration of the Swift (reproduced below).

I tried to get the government to fund additional R&D on this, but they were never interested. Maybe they didn't think insiders were a problem (cough, cough). Still, the latest versions of Data Fence and Audit Viewer have the Windows audit analysis code embedded in their executables. It just isn't exposed. If there is enough interest (ping me on Twitter), I'll expose the Windows analysis code.

Audit Viewer: Getting Started

Sometimes software can take an unexpectedly winding path to release.

I basically finished Audit Viewer version 1.1 a year ago, but I delayed it until FAAS was released. Audit Viewer version 1.1 takes advantage of process snapshots to enhance the BSM data, and FAAS generates these snapshots. But then Mavericks broke a few pieces of FAAS, including its crypto (which is funny, because Apple's crypto changes also created a huge vulnerability for Apple), and I became focused on Data Fence (currently under review at the Mac App Store).

In the meantime, I created PS Logger, so users can take advantage of process snapshots without installing all of FAAS's infrastructure. With PS Logger released, I finally figured it was time to release version 1.1 of Audit Viewer (and it too is under review at Apple right now, sigh...).

Here is a video showing how to get started with Audit Viewer. It also gives you a glimpse of how you can zoom into different levels of audit analysis.

Is NSA's TURBINE just a high-end botnet?

The Intercept's "How the NSA Plans to Infect ‘Millions’ of Computers with Malware" by Gallagher and Greenwald describes more Snowden documents including an NSA system called TURBINE. While I encourage everyone to read the article, I kept asking myself, "Is there anything new here?"

I think the answer is "No." Most of the techniques described have been done before, to one degree or another, by various hacker groups. The HBGary Federal documents released by Anonymous several years ago described many of the same goals and techniques, which HBGary Federal proposed to its clients. Even my 1996 paper (has it been 18 years?!) "ATTACK CLASS: ADDRESS SPOOFING" describes various spoofing strategies, including rerouting packet flows and session hijacking.

The Intercept's article is just another example of the increasing professionalization of cyberspace conflict. You can think of TURBINE and related components as a high-end botnet.

Cyberspace is a contested and valuable space. Virtually every government, criminal organization, and patriotic hacker group is developing tools, techniques, and talent to do similar things. You and your site may or may not be targets of the NSA programs, but there is a very good chance you *will* be the target of one of these other groups using similar techniques.

Patent Trial and Appeal Board

I ran across a fascinating article today in the Wall Street Journal: "A New Weapon in Intellectual Property Wars". It is about the Patent Trial and Appeal Board (PTAB), which did not exist when I served my time as an expert witness in a couple of patent lawsuits. (Note: I read the physical print version of the article, not the online version.)

One of the big challenges facing an expert witness is trying to explain your argument for the validity or invalidity of a patent (as well as infringement) to a jury that has no experience in the technical field. For example, for a patent to be valid, it must have been non-obvious to a "person having ordinary skill in the art" (abbreviated PHOSITA) at the time of the patent. An example of a PHOSITA might be someone with 6 or more years of training and experience in the field, perhaps someone who went through 4 years of college to get a computer science degree and then spent 2 years working in that specific field (e.g., computer security).

How do you explain to a member of the jury who has 0 years of experience in the technical field what a person who has 6 or more years of experience in the field might think? And chances are, everyone in the jury will have no experience in the technical field. For example, I was told about a computer patent case where in the initial jury pool of 40 people, only 1 person had even heard of the word "Linux".

And of course, the "other side" will have their own expert who will say everything you said is wrong.

Here is a quote from the WSJ article describing the PTAB:

"It's fast and has a whole fleet of expert judges that understand the science and know the technology."

One patent lawyer described appearing before the PTAB judges as

"getting CAT-scanned, MRI-ed, and X-rayed, all within a three-hour period"

That sounds like a completely different situation than a jury case. Fascinating. Very fascinating.

Is My Bug Report Related to "goto fail" Vulnerability?

Apple recently suffered an embarrassing security vulnerability, known as the "goto fail" bug, in which an SSL certificate check was done wrong.
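
The bug's name comes from a duplicated "goto fail" line in Apple's SSL code that unconditionally skipped a later verification step while still reporting success. Here is a small, self-contained toy (not Apple's actual code) that reproduces the pattern:

    /* goto_fail_toy.c -- the duplicated goto always jumps to the cleanup
     * label with err still 0, so final_check() never runs and the
     * function reports success. */
    #include <stdio.h>

    static int step_one(void)    { return 0; }   /* passes */
    static int step_two(void)    { return 0; }   /* passes */
    static int final_check(void) { return -1; }  /* would fail, but is skipped */

    static int verify(void)
    {
        int err;

        if ((err = step_one()) != 0)
            goto fail;
        if ((err = step_two()) != 0)
            goto fail;
            goto fail;   /* the duplicated line: unconditional jump, err == 0 */
        if ((err = final_check()) != 0)   /* never reached */
            goto fail;

    fail:
        return err;
    }

    int main(void)
    {
        printf("verify() = %d\n", verify());   /* prints 0: "success" */
        return 0;
    }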

This reminded me of a bug report I filed earlier about some changes Apple made to SSL that broke my Free Audit Aggregation System (FAAS), and I have to wonder whether the problem I was having and this "goto fail" bug intersected at some point. At a minimum, it shows Apple was breaking people's crypto code (e.g., the curl command-line program and PHP), which would have made it harder to spot the original source of the problem.

Speaking for myself, when something that should work according to the documentation doesn't, I start trying lots of things, hoping to find something that does work. We call these "workarounds". Perhaps the extra "goto fail" line in Apple's code was a workaround to make something else pass a test?

Below is my bug report that I posted on 25 June 2013 followed by an update added later that same day:


Original bug report 25-June-2013 11:56 AM

Summary:

When using curl (either the command-line tool or embedded in a PHP script) to connect to a web server over HTTPS that uses a self-signed certificate, passing the certificate to curl doesn't help. The connection fails.


Steps to Reproduce:

(1) Create a web server that uses a self-signed certificate. I'm using Mountain Lion with the Server App.

(2) Get a local copy of the server's certificate. I use

$ echo -n | openssl s_client -connect bigmac.lab.netsq.com:443 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > server.crt

(3) Use that certificate to connect to the web server over HTTPS via curl

$ curl --cacert server.crt https://bigmac.lab.netsq.com/


Expected Results:

The HTML for the page.


Actual Results:

curl: (60) SSL certificate problem: Invalid certificate chain


Regression:

This works properly on Lion and Mountain Lion, but it fails on 10.9 DP1 and DP2


Notes:

The workaround is to turn off checking of the server's certificate. For the curl command line, this is the -k option

$ curl -k https://your-secure-server/

For curl embedded in PHP use the following line

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);


Update 25-June-2013 01:31 PM

I added the certificate to my keychain, and now I can use curl (both command line and inside PHP).

You might want to add a developer note that --cacert for the curl command, and

    curl_setopt($ch, CURLOPT_CAINFO, $certificate_file);

inside PHP are essentially no-ops for OS X 10.9, and that certificates should be added to the keychain instead.


America the Vulnerable and Today's WSJ Article

One of my favorite cyber security books is Joel Brenner's 2011 "America the Vulnerable: Inside the New Threat Matrix of Digital Espionage, Crime, and Warfare". (I just noticed the paperback version has a new title, "Glass Houses: Privacy, Secrecy, and Cyber Insecurity in a Transparent World"; I prefer the old title.) I also have the audio version from audible.com, and it is a great performance too. I've used this book's materials in a number of talks, and I will be coming back to it frequently in my blog posts. Since reading this book several years ago, I see many news stories in a completely new light. I cannot encourage you strongly enough to read this book.

Today I am highlighting a hypothetical scenario described in the book's chapter "June 2017" and today's (4 March 2014) Wall Street Journal article "Transformers Expose Limits in Securing Power Grid". 

The "June 2017" chapter paints a grim but compelling story of what could happen if an adversary (in this case China) leverages our cyber vulnerabilities in a coherent campaign. Examples of each cyber component in the story has already happened or has been shown to be possible. In other words, the story is hypothetical but very real. Let me pick up from several pages in...

Washington–12:00 P.M.; San Diego–9:00 A.M.; Honolulu–3:00 A.M.

The San Diego grid goes down, followed by the grids in Seattle (another big Navy base) and Honolulu. In California's Central Valley, turbines in three electric generators mysteriously blow up. The secretary of energy tells the president that this kind of equipment takes twelve to twenty-four months to replace.

"What?!" the president says. "Don't I have emergency powers to deal with that?"

"We don't make those generators in this country anymore, Mr. President–haven't made them for years."

"Who does make them?"

"India, sir. The Indians make them, and the Chinese."

Today's Wall Street Journal article refers to the vulnerability of transformers instead of generators, but much of the threat is the same.

The U.S. electric grid could take months to recover from a physical attack due to the difficulty in replacing one of its most critical components.

The article describes the long process FirstEnergy Corp went through to order a transformer from South Korea and install it in a new substation in Pennsylvania.

Total elapsed time from purchase to delivery: about two years.

The gist of this part of Brenner's "June 2017" chapter and the WSJ article is that our most critical infrastructure (our power grid) is vulnerable, can take months to recover from if damaged, and to a great extent we will depend on foreign countries for critical components. This dependence on foreign countries and our vulnerability means that foreign countries can dictate our national policies.

Very scary stuff. Again, if you have a chance, I strongly encourage you to read Brenner's book.

BTW: Here is a video of a generator being physically damaged by a cyber attack. Fortunately, in this case the attack's purpose was just to demonstrate the potential.

“Intelligence” in Cyber Security

David Bianco recently posted Use of the term "Intelligence" in the RSA 2014 Expo. I suspect “intelligence”, as used in much of cyber security, is a buzzword used for marketing purposes. However, I thought I would put a stake in the ground by providing my definition of intelligence in cyber security.

Intelligence is extrinsic knowledge applied to local data to optimize analysts’ efforts.

Let me unpack that definition.

“Extrinsic knowledge” is the most important term. This is knowledge from outside your network. By definition, no matter how good your sensors or analysts are, extrinsic knowledge cannot be generated locally.

“Applied to local data” means that the knowledge must enhance the value of local data. In a sense, it would say, “pay attention to this fact, not that fact.” Implicit in the phrase is that the intelligence is structured so that algorithms can automatically apply it to potentially large volumes of data.

“To optimize analysts’ efforts” means the purpose of this intelligence is to make the best use of people’s time. Of all the aspects of cyberspace, people’s time is the least scalable, so it is the most precious. The end result is that analysts should be provided with actionable data prioritized to give them the biggest bang for their buck (or time) for protecting their network.

What is intelligence applied to?

An important point is that I refer to applying the intelligence to “data”, but I did not explicitly say what the data was. Certainly one source of data is event logs of activity -- syslog messages, netflow data, and so on. This is for detecting malicious activity. But “data” can also refer to data collected about a network’s configuration -- programs installed, patches applied, hashing algorithms used to protect passwords, etc. Intelligence should also be applied to hardening the network to prevent successful attacks in the first place.
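
To make the log-matching case concrete, here is a toy sketch of applying extrinsic knowledge to local data; the indicator values below are made up for illustration:

    /* ioc_match.c -- scan local log lines (stdin) for externally
     * supplied indicators and flag matches for an analyst. */
    #include <stdio.h>
    #include <string.h>

    /* "Extrinsic knowledge": indicators produced outside your network. */
    static const char *indicators[] = {
        "203.0.113.77",               /* a reported C&C address (made up) */
        "evil-updates.example.com",   /* a reported malicious domain (made up) */
    };

    int main(void)
    {
        char line[4096];
        while (fgets(line, sizeof line, stdin) != NULL) {
            for (size_t i = 0; i < sizeof indicators / sizeof indicators[0]; i++) {
                if (strstr(line, indicators[i]) != NULL) {
                    /* "Applied to local data": this line now deserves attention. */
                    printf("ALERT [%s]: %s", indicators[i], line);
                    break;
                }
            }
        }
        return 0;
    }

Real systems use structured intelligence formats and far better matching, but the shape is the same: outside knowledge decides which local events deserve an analyst's attention (e.g., ioc_match < /var/log/system.log).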

Where does that intelligence come from?

Intelligence comes from many sources. Here are a few:

  1. Aggregated sensor logs. If 100 of a community of 1,000 sites have recently been attacked by a new attack tool, it would be nice if the other 900 sites knew to prepare for the attack and be alert for its use.
  2. Cyber attack investigations. When investigating an attack at one site, tools, techniques, targets, external network addresses, and other evidence are uncovered that can then be looked for at other sites.
  3. Human intelligence. Infiltrating hacker groups, grooming contacts, lurking on bulletin boards, and looking for stolen data for sale are great sources of intel (think of Brian Krebs’ work).
  4. Penetrating attackers’ networks. This is a great source of intel, but it is probably best left to those who can legally get away with it.

What do you need to think about when using intelligence?

Here are some questions you should consider:

  1. How comprehensive and accurate is the original intelligence?
  2. How expressive is the structured language used to represent that intelligence?
  3. How well do the algorithms apply the intelligence to data streams?
  4. What is the coverage and quality of the data streams against which the intelligence is applied?

(1) reflects the size, skills, and tools of the organization gathering the intelligence. (4) depends on the local organization. I think questions (2) and (3) are what David Bianco wanted to know about. For example, how much of David’s “Pyramid of Pain” can the structured language for intelligence capture? I think these are areas ripe for research & development.

There you have it. Intelligence is knowledge collected, processed, and packaged outside your network and applied to the data collected from your network in order to maximize prevention and detection.

Proud To Be an Apple Customer

I have Apple in my blood. My first computer was an Apple II+. An Apple IIc got me through my undergraduate degree. A Mac SE/30 got me through my graduate degree. I started my business with code written on a NeXTstation. And once Mac OS X came out, I've had one Apple laptop or workstation after another. I like Macs. I like UNIX. And I like Cocoa.

But after Tim Cook's response at a shareholder meeting today (as reported in the Mac Observer):

"When we work on making our devices accessible by the blind," he said, "I don't consider the bloody ROI."

I have never been prouder to be an Apple customer.

Half Billion Dollar Bank Heist?

The Wall Street Journal is reporting that Mt. Gox has "lost" 850,000 bitcoins, worth about $473 million today or $975 million about 3 months ago, according to CoinDesk. To put that in perspective, the most expensive diamonds in the world, the Cullinan Diamond and the more famous Hope Diamond, are each valued at less than the lower loss estimate. Somewhere along the way, Mt. Gox lost the equivalent of about two Hope Diamonds.

Was this "lost" as in the disk was corrupt and they had no backup? Or "lost" as in their software was bad and information fell into the ether? Or "lost" as in "stolen" where someone has them? There is a lot of speculation but no clear answers.

When was the last time you heard of a half billion dollar bank heist?

I wonder if this will be turned into a movie?

Honeypot Credit Cards?

In his most recent blog post, Brian Krebs touches on the false positive problem, something that has plagued those of us in intrusion detection since the very beginning.

However, in the shadow of massive card thefts like the one that occurred at Target, false positives abound, Sartin said. The problem of false positives often come from small institutions that may not have a broader perspective on how far a breach like Target can overlap with purchasing patterns at similar retailers.

And that can lead to a costly and frustrating situation for many retailers, particularly if enough banks report the errant finding to Visa, MasterCard and other card associations. ...

I wonder if these banks and retailers use (or should use) honeypot credit cards – cards swiped every day by employees but used only at a single retailer. If the card information shows up anywhere else, they'll know the exact path where the compromise could have happened.
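
Here is a toy sketch of the bookkeeping such a scheme might need; the card numbers and retailer names are made up for illustration:

    /* honeypot_cards.c -- each honeypot card is swiped at only one
     * retailer, so a card surfacing in a stolen-card dump identifies
     * exactly where the compromise happened. All values are made up. */
    #include <stdio.h>
    #include <string.h>

    struct honeypot {
        const char *card;       /* card number (test values only) */
        const char *swiped_at;  /* the one place this card is ever used */
    };

    static const struct honeypot cards[] = {
        { "4111111111111111", "Retailer A, store #12" },
        { "4222222222222222", "Retailer B, store #3"  },
    };

    /* Call this for each card number seen in a dump offered for sale. */
    static void check_dump_entry(const char *card)
    {
        for (size_t i = 0; i < sizeof cards / sizeof cards[0]; i++)
            if (strcmp(card, cards[i].card) == 0)
                printf("Honeypot hit: compromise path is %s\n",
                       cards[i].swiped_at);
    }

    int main(void)
    {
        check_dump_entry("4222222222222222");  /* -> Retailer B, store #3 */
        return 0;
    }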

Wolfram Language

Stephen Wolfram released a 13-minute firehose of a video about his new Wolfram Language. Actually, it has been decades in the making and is used in many of Wolfram's key products. Every time you use Siri on your iOS device, there is a good chance it is using Wolfram Language on the back end.

The video is a slick sales job to geeks like me. At one point he enters a two-line program that crawls the web from a starting point and generates a graph showing what that corner of the web looks like.


In another example, in two quick commands his language sucks in his friends list from Facebook, clusters them into groups, and then plots the groups.

I can imagine this could be an incredible boon to analyzing your site's security – that is, once you have the right data and know the right Wolfram Language instructions/commands to use.

Double Encryption for Command & Control

Unveiling "Careto" - The Masked APT

The communication between the C&Cs and the victims uses an encrypted protocol over HTTP or HTTPs.

In case of the Careto implant, the C&C communication channel is protected with two layers of encryption. The data received from the C&C server is encrypted using a temporary AES key, which is also passed with the data and is encrypted with an RSA key. The same RSA key is used to encrypt the data that is sent back to the C&C server. This double encryption is uncommon and shows the high level of protection implemented by the authors of the campaign.

Kaspersky refers to the "high level of protection" of The Mask espionage system because it double encrypts its data. I take a little umbrage at the suggestion that double encryption implies a "high level" of anything. I've been at least encrypting payloads, and usually using double encryption, whenever I write little espionage demo systems. It is an easy and obvious thing to do. I've posted a few videos over the years showing toy espionage systems using encryption.
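
For the curious, here is a minimal sketch of the pattern Kaspersky describes: a payload encrypted with a temporary AES key, and that AES key wrapped with an RSA public key so it can travel with the data. This uses OpenSSL; error handling and RSA key loading are omitted, and the key sizes are just illustrative.

    /* envelope.c -- a sketch of "double encryption" (compile with -lcrypto).
     * The payload is encrypted under a fresh, per-message AES key; the AES
     * key itself is encrypted under the recipient's RSA public key and
     * shipped alongside the ciphertext. */
    #include <openssl/evp.h>
    #include <openssl/rand.h>
    #include <openssl/rsa.h>

    /* 'out' must have room for 'len' plus one AES block of padding;
     * 'wrapped_key' must have room for RSA_size(rsa) bytes. */
    int envelope_encrypt(RSA *rsa, const unsigned char *payload, int len,
                         unsigned char *out, int *out_len,
                         unsigned char *iv, unsigned char *wrapped_key)
    {
        unsigned char aes_key[32];          /* temporary, per-message key */
        RAND_bytes(aes_key, sizeof aes_key);
        RAND_bytes(iv, 16);

        /* Layer 1: the payload under the temporary AES key. */
        EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
        int n = 0, fin = 0;
        EVP_EncryptInit_ex(ctx, EVP_aes_256_cbc(), NULL, aes_key, iv);
        EVP_EncryptUpdate(ctx, out, &n, payload, len);
        EVP_EncryptFinal_ex(ctx, out + n, &fin);
        *out_len = n + fin;
        EVP_CIPHER_CTX_free(ctx);

        /* Layer 2: the AES key under the RSA public key, so the key can
         * be sent along with the data it protects. */
        return RSA_public_encrypt(sizeof aes_key, aes_key, wrapped_key,
                                  rsa, RSA_PKCS1_OAEP_PADDING);
    }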

Penetration by Contractor

Target Hackers Broke in Via HVAC Company

Sources close to the investigation said the attackers first broke into the retailer’s network on Nov. 15, 2013 using network credentials stolen from Fazio Mechanical Services, a Sharpsburg, Penn.-based provider of refrigeration and HVAC systems.

In 1998 or 1999 DARPA's 97-11 security program conducted an "Integration Feasibility Demonstration" (IFD) to show how automated response could make it harder for an attacker to carry out his mission. The demonstration network, set up at DARPA's Technology Integration Center (TIC), included multiple network sensors monitoring the perimeter (one being my Network Radar tool). The attacker used stolen contractor credentials to log into the network via ssh.

Surprise! (or not) The attacker carried out his attack without a single automated response being triggered to slow him down. The problem: the first line of network sensors just saw a normal encrypted ssh connection.

It seems that after 15+ years, some things are remarkably the same.

Scrubbing systems is hard

Target Data Breach Went on Longer Than Thought

John Mulligan said the company has learned software on another 25 checkout machines continued to steal payment card data three days after Dec. 15, the date by which the discounter had said the malware was removed from its system.

The machines that were still infected after Dec. 15 had been offline when the initial cleaning occurred. Cleaning a heavily infected network, especially while the organization tries to remain operational, must be one of the hardest jobs in computer security. There are just so many nooks and crannies in complex networks for threats to hide.

Curse You Google Software Updates!

I try to keep track of changes to the software on my machine. I look for programs run out of unusual locations (possibly malware dropped on my machine) and modifications made to existing programs (possibly indicating a Trojaned version of the program). To track which programs were running or being changed, I used Apple's BSM audit data.

I quickly realized how naive this simple view was. Today's operating systems and applications are constantly updating themselves. Sometimes they notify you of the change, but more and more frequently they just do it behind the scenes and never tell you about it.

Google may have led the way with this, quietly and frequently updating its software, such as Google Chrome. I wrote a pair of articles about how this software could serve as a model for malware that an Advanced Persistent Threat (APT) could use to maintain a presence inside your network:

The Advanced Persistent Threat You Have: Google Chrome

The Making of "The Advanced Persistent Threat You Have: Google Chrome"

Recently, as I've been developing my Data Fence program, I've had to revisit this problem, and Google has again been annoying me.

I have a fence rule that tracks updates to files in the /Applications folder. Part of that fence rule requires me to specify legitimate pathways that can lead to changes to applications, and now I think I've seen Google Chrome update itself through programs installed in at least three different locations. Arg!

/Users/heberlei/Library/Google/GoogleSoftwareUpdate/GoogleSoftwareUpdate.bundle/Contents/Resources/GoogleSoftwareUpdateAgent.app/Contents/MacOS/GoogleSoftwareUpdateAgent
/Users/heberlei/Library/Google/GoogleSoftwareUpdate/GoogleSoftwareUpdate.bundle/Contents/MacOS/GoogleSoftwareUpdateDaemon
/Library/Google/GoogleSoftwareUpdate/GoogleSoftwareUpdate.bundle/Contents/MacOS/GoogleSoftwareUpdateDaemon

When I get it wrong (last night I only had two of these pathways marked), Data Fence lights up (and in this case it also purrs since Data Fence includes sound effects) while Google Chrome is updating itself. At this stage of testing I actually find this amusing, but I need to address this issue.

Another lesson learned, I guess. I've generalized the regular expression for this match a bit more (see the sketch below), and I realized I hadn't added the "track current working directory" code back into the audit analysis program. I guess that is today's task.
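
For illustration, here is the kind of generalized pattern that covers all three observed pathways, written in POSIX extended regex syntax with a small test harness (this is not Data Fence's actual rule syntax):

    /* path_match.c -- testing a generalized pattern for the legitimate
     * GoogleSoftwareUpdate pathways. */
    #include <regex.h>
    #include <stdio.h>

    int main(void)
    {
        /* Optional per-user prefix, then the update bundle, then either
         * the daemon's MacOS folder or the agent app's MacOS folder. */
        const char *pattern =
            "^(/Users/[^/]+)?/Library/Google/GoogleSoftwareUpdate/"
            "GoogleSoftwareUpdate\\.bundle/Contents/"
            "(MacOS|Resources/GoogleSoftwareUpdateAgent\\.app/Contents/MacOS)/"
            "GoogleSoftwareUpdate(Agent|Daemon)$";

        const char *path =
            "/Library/Google/GoogleSoftwareUpdate/GoogleSoftwareUpdate.bundle"
            "/Contents/MacOS/GoogleSoftwareUpdateDaemon";

        regex_t re;
        regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
        printf("%s\n", regexec(&re, path, 0, NULL, 0) == 0 ? "match" : "no match");
        regfree(&re);
        return 0;
    }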

Data Fence: Audit Data, Last Access Time, and the Two-Fence Strategy

Abstract

This article discusses strange behaviors in programs, when these behaviors occur, and how these behaviors cause discrepancies between audit data, last access times for files, and a user's actual actions. These behaviors have led to the "Two-Fence Strategy" I use when constructing fence rules for Data Fence.

This article is basically the transcript of a video presentation showing the experiment being carried out.

Introduction

Modern operating systems like Apple's OS X can record the last time a file was accessed. This can be useful for a number of reasons including forensics. During testing of my Data Fence program, I discovered several interesting issues with audit data, last access times, and certain program behaviors.

With many Mac programs, when you want to open an existing file or save a new file, the program presents you with the open or save panel. For example, when saving a new document, a panel appears to let you set the file’s name and choose the folder where you want to save the file. As it turns out, with many programs, as soon as one of these panels appears, the program immediately starts opening many of the files in the current folder.

This noisy behavior can make it harder for intrusion detection programs to detect suspicious behavior. In general, when trying to detect signals (for example, an attack), noise is a bad thing. It can also make it harder for forensics specialists to determine what data might have actually been compromised.

There is another issue that an attorney defending a client might love. Sometimes there appears to be a discrepancy between opening a file (as observed in the audit data) and the last access time (as observed on the disk). This discrepancy in the evidence can open an attack path for a defense lawyer.

This article presents my analysis of this activity.

Setting up the Experiment

The first thing to note is that I did a lot of testing, data collection, and analysis with Data Fence before I got to the point where I conducted the experiment described here. During this time I was able to narrow down when these noisy activities occurred: they occurred when the open or save panels were displayed.

At this point I created a folder with many different document types, and then I ran many different programs and tried to open files in or save files to this folder. 

I used Data Fence analyzing live data to determine when programs were apparently reading files in this folder. Analyzing live data made it much easier to correlate user activity with the audit data.

I also used the “ls” command (specifically “ls -lu”, which lists last access times instead of modification times) to determine a file’s recorded “last access time”.

I began with five windows open: three Terminal windows, a Safari window displaying a picture I will ask Safari to save, and the Data Fence window.

First I looked at the last access times for all the documents in the folder. This became the baseline.

The Save Panel, Data Fence Alerts, and File Access Time Changes

I switched over to Safari and selected File > Save As… to bring up the Save Panel. Immediately Data Fence started throwing up alerts with an accompanying “tink” sound. Meanwhile, the Safari Save Panel showed the contents of the folder.

By clicking on each alert in Data Fence I could see the files that Safari opened. I chose the file foo.pages on which to carry out some additional tests.

In the second Terminal window I listed file last access times again and compared them to the baseline. A few of the access times were changed; many were not.

For example, the access time for the Keynote version 6 document titled “Keynote 6 doc.key” changed from 13:43 to 13:47 even though I never selected or viewed this document. This was one apparent discrepancy: access times of a document changed even though I never (apparently) opened the file.

However, the file foo.pages showed no change to its last access time. Both the first and second windows showed a last access time of 13:23.

At first blush, this would seem right. I never opened the file “foo.pages”, so the last access time should not change.

But this introduced a second apparent discrepancy: the last access time said the file was never opened, but the audit trail said it was. Inconsistent evidence seems like a defense attorney’s best friend.

The open_test Program

I came up with a hypothesis for the cause of the discrepancy between the audit data and the last access time, and I wrote a program called open_test to test the hypothesis.

open_test prompts the user for a file to open and whether it should also read data from that file. That is, the program would follow one of two paths: either it would open() the file then close() the file, or it would open() the file, read() a single byte from the file, and then close() the file. It is important to note that from the audit trail perspective, both paths appear the same because read() calls do not generate audit records.
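
Here is a minimal sketch of what such a program looks like (a reconstruction for illustration, not the original open_test):

    /* open_test.c -- open a file, optionally read a single byte, then
     * close it. The open() generates an audit record; the read() does
     * not, but it updates the file's last access time. */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        char path[1024], answer[8];

        printf("File to open: ");
        if (scanf("%1023s", path) != 1) return 1;
        printf("Read one byte? (y/n): ");
        if (scanf("%7s", answer) != 1) return 1;

        int fd = open(path, O_RDONLY);       /* audited */
        if (fd < 0) { perror("open"); return 1; }

        if (answer[0] == 'y') {
            char byte;
            read(fd, &byte, 1);              /* not audited; bumps atime */
        }
        close(fd);
        return 0;
    }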

open_test Run 1

For the first run of the program I chose not to read any data.

Data Fence fired off an alert about “foo.pages” being opened. In fact, it fired off two alerts. The second alert, signified by the chimpanzee scream, indicated that this was actually a more suspicious act.

The Data Fence rules I have installed grant Safari a little more trust than a random program run out of a random location, so the open_test program earned this second, louder alert.

When I examined the last access time for “foo.pages”, again I saw it had not changed. It was still 13:23. An open() followed by a close() generated an audit record, but it did not change the last access time.

open_test Run 2

For the second run of the program I told open_test to read in a single byte from the file.

Again, Data Fence fired off two alerts.

However, when I examined the last access time for “foo.pages”, the time had changed from 13:23 to 13:49. An open() followed by a read() followed by a close() generated an audit record, and it changed the last access time.

Conclusion One

From this experiment, plus the many others I carried out, I reached several conclusions.

The first conclusion is that the discrepancy between the audit data and the last access time occurs because a program displaying an open or save panel sometimes opens files in the currently selected folder without reading any data from them. This path leads to the apparent inconsistency between the audit data and the last access time data.

Conclusion Two

Sometimes the program does open a file and read its contents. This creates consistency between the audit data and the last access time.

In particular, programs do this when the file is actually a bundle and not a simple file. Apple’s latest Pages, Keynote, and Numbers documents fall into this category.

However, the program sometimes reads the contents of simple files too, like the PHP text file I have in the folder. I have not yet figured out the conditions under which a program’s open or save panel chooses to read the contents of these files.

It is important to note that even though both the audit data and last access time say that a file was opened and read, this does not mean the user selected that file or that its contents were ever shown to the user.

Conclusion Three

While not shown in the experiments discussed here, sandboxed applications behave differently when opening or saving new documents.

When you are running a sandboxed app and you choose to open an existing document or save a new document, the application itself does not open the files in the folder. The files are still read, but this duty is handed off to a number of other Apple processes. While this will still lead to a change in a file’s last access time, the user’s intent and what is actually happening are much easier to understand when looking at the audit data.

By handing off the open/save panel to Apple processes there is less noise in the audit data. In particular, when a sandboxed app opens a document, it usually means the user selected that document. In other words, it reflects actual use. This reduced noise makes detection easier, and in particular, it makes Data Fence a more valuable program.

Two-Fence Strategy

The behavior of non-sandboxed applications opening lots of files when displaying an open or save panel led me to develop a two-fence strategy when monitoring access to files with Data Fence.

The first fence is for the highly paranoid, in that it does not trust anything. It can chatter a bit when a non-sandboxed app displays an open or save panel, but this chattering can be safely ignored.

However, if you hear the fence rule firing when you are not explicitly opening or saving a document, it may mean that one of your standard applications has a vulnerability that is currently being exploited, or that one of your applications is a Trojan horse. You may want to pay attention to these alerts.

As more of the applications you use become sandboxed (a requirement for new apps submitted to the Mac App Store), the false positive chattering should die away, making this first, paranoid fence a better strategy.

The second type of fence I use to protect the data is more forgiving to non-sandboxed apps. It trusts programs you’ve installed in the Applications folder. This second rule type, which I personally use the chimpanzee scream for, is great for detecting malicious code that has been dropped onto your machine without your knowledge and is harvesting data for exfiltration.

Final Thoughts

While this has been a fairly technical presentation, I hope it gives you the underlying reason for the two-fence strategy I recommend for Data Fence. And should you ever need to conduct a forensic analysis on a Mac, I hope this helps you understand some of the data better.

Finally, when given a choice between a non-sandboxed application and a sandboxed application, I recommend you go with the sandboxed application. It not only prevents that program from being used as a vector for attacking your computer, it also reduces the noise in the data, which makes it easier to detect and analyze attacks that use other vectors.

Sandboxing – it really is a good idea.

Handing off responsibilities

I've managed my own web site since 1996 – over 17 years. I tried to keep it simple because I was focused on the content – papers, articles, presentations, videos, and tools. I ran it on Mac OS X, and it progressed through many operating system versions and several pieces of hardware. But I've decided it was time to hand off the web site management to someone else.

My current system for my web site is an old Mac mini running the dated (but still beloved) Snow Leopard operating system. It is co-located at a small, local ISP. The ISP has moved its offices since the last time I visited, so I don't even know where my box is physically located. Managing the system remotely was always a bit of a pain (in part because of poor bandwidth), and I was always worried when rebooting, especially after a software update, that there might be a problem with it coming back up. More than once I've had to go over to the ISP's office (when I knew where it was) with keyboard, mouse, and monitor to physically manage my system.

And then there was the fear that I would post something that was actually popular, and my little Mac mini would choke under the load. Wanting a web site to share information but hoping it stays unpopular because of load concerns seems like a contradiction.

So today I've decided to hand off web site management to Squarespace.

They are responsible for the hardware, software updates, security issues, and bandwidth. I just need to focus on the content.