This article discusses some strange program behaviors, when they occur, and how they cause discrepancies between audit data, files' last access times, and a user's actual actions. These behaviors led to the "Two-Fence Strategy" I use when constructing fence rules for Data Fence.
This article is essentially the transcript of a video presentation in which I carry out the experiment.
Modern operating systems like Apple's OS X can record the last time a file was accessed. This can be useful for a number of reasons including forensics. During testing of my Data Fence program, I discovered several interesting issues with audit data, last access times, and certain program behaviors.
With many Mac programs, when you want to open an existing file or save a new file, the program presents you with the open or save panel. For example, when saving a new document, a panel appears to let you set the file’s name and choose the folder where you want to save the file. As it turns out, with many programs, as soon as one of these panels appears, the program immediately starts opening many of the files in the current folder.
This noisy behavior can make it harder for intrusion detection programs to detect suspicious behavior. In general, when trying to detect signals (for example, an attack), noise is a bad thing. It can also make it harder for forensics specialists to determine what data might have actually been compromised.
There is another issue that an attorney defending a client might love. Sometimes there appears to be a discrepancy between opening a file (as observed in the audit data) and the last access time (as observed on the disk). This discrepancy in the evidence can open an attack path for a defense lawyer.
This article presents my analysis of this activity.
Setting up the Experiment
I did a lot of testing, data collection, and analysis with Data Fence before I conducted the experiment described here. That work let me narrow down when the noisy activity occurred: it happened while an open or save panel was displayed.
At this point I created a folder with many different document types, and then I ran many different programs and tried to open files in or save files to this folder.
I used Data Fence analyzing live data to determine when programs were apparently reading files in this folder. Analyzing live data made it much easier to correlate user activity with the audit data.
I also used the “ls” command to determine a file’s recorded “last access time”.
I began with five windows open: three Terminal windows, a Safari window displaying a picture I would ask Safari to save, and the Data Fence window.
First I looked at the last access times for all the documents in the folder. This became the baseline.
The Save Panel, Data Fence Alerts, and File Access Time Changes
I switched over to Safari and selected File > Save As… to bring up the Save Panel. Immediately Data Fence started throwing up alerts with an accompanying “tink” sound. Meanwhile, the Safari Save Panel showed the contents of the folder.
By clicking on each alert in Data Fence I could see the files that Safari opened. I chose the file foo.pages on which to carry out some additional tests.
In the second Terminal window I listed file last access times again and compared them to the baseline. A few of the access times were changed; many were not.
For example, the access time for the Keynote version 6 document titled “Keynote 6 doc.key” changed from 13:43 to 13:47 even though I never selected or viewed this document. This was one apparent discrepancy: a document’s access time changed even though I (apparently) never opened the file.
However, the file foo.pages showed no change to its last access time. Both the first and second windows showed a last access time of 13:23.
At first blush, this would seem right. I never opened the file “foo.pages”, so the last access time should not change.
But this introduced a second apparent discrepancy: the last access time said the file was never opened, but the audit trail said it was. Inconsistent evidence seems like a defense attorney’s best friend.
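The baseline-and-compare step used above can be sketched in a few lines of Python. This is only an illustration, not the procedure from the video, which used the “ls” command (on OS X, “ls -lu” is one way to list access times). The function names snapshot_atimes and changed_atimes are hypothetical.

```python
import os

def snapshot_atimes(folder):
    """Map each entry name in `folder` to its last access time (seconds since the epoch).

    Bundles (directories) are included too, much as a plain `ls` would list them.
    """
    times = {}
    for name in os.listdir(folder):
        # lstat reads metadata only; it does not itself update the access time.
        times[name] = os.lstat(os.path.join(folder, name)).st_atime
    return times

def changed_atimes(baseline, current):
    """Return {name: (old, new)} for entries whose access time differs from the baseline."""
    return {
        name: (baseline[name], new)
        for name, new in current.items()
        if name in baseline and baseline[name] != new
    }
```

Taking one snapshot before bringing up the save panel and another afterwards, then passing both to changed_atimes, would list entries like “Keynote 6 doc.key” whose access time moved and omit entries like “foo.pages” whose access time did not.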
The open_test Program
I came up with a hypothesis for the cause of the discrepancy between the audit data and the last access time, and I wrote a program called open_test to test the hypothesis.
open_test prompts the user for a file to open and asks whether it should also read data from that file. That is, the program follows one of two paths: either it calls open() and then close() on the file, or it calls open(), calls read() for a single byte, and then calls close(). It is important to note that, from the audit trail’s perspective, both paths look the same because read() calls do not generate audit records.
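The two paths can be sketched roughly as follows. This is not the original open_test source, which is not shown in this article; it is a minimal Python approximation that takes its two choices as function arguments instead of prompting. The helper atime_changed is a hypothetical addition for checking the result.

```python
import os

def open_test(path, read_byte):
    """Open `path` read-only, optionally read one byte, then close it.

    Returns the bytes read (empty when read_byte is False).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Path 1: open() then close(). Path 2: open(), read() one byte, close().
        data = os.read(fd, 1) if read_byte else b""
    finally:
        os.close(fd)
    return data

def atime_changed(path, read_byte):
    """Report whether the last access time differs after running open_test.

    The answer depends on the filesystem: mount options and timestamp
    granularity can both suppress or hide an access time update.
    """
    before = os.stat(path).st_atime
    open_test(path, read_byte)
    return os.stat(path).st_atime != before
```

Calling atime_changed with read_byte set to False and then True corresponds to the two runs described below. Whether the second call reports a change will vary with how the volume is mounted.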
open_test Run 1
For the first run of the program I chose not to read any data.
Data Fence fired off an alert about “foo.pages” being opened. In fact, it fired off two alerts. The second alert, signified by the chimpanzee scream, indicated that this was a more suspicious act.
The Data Fence rules I have installed grant Safari a little more trust than a random program run out of a random location, so open_test, unlike Safari, also earned this second, louder alert.
When I examined the last access time for “foo.pages”, I saw that it still had not changed. It was still 13:23. An open() followed by a close() generated an audit record, but it did not change the last access time.
open_test Run 2
For the second run of the program I told open_test to read in a single byte from the file.
Again, Data Fence fired off two alerts.
However, when I examined the last access time for “foo.pages”, the time had changed from 13:23 to 13:49. An open() followed by a read() followed by a close() generated an audit record, and it changed the last access time.
From this experiment, plus the many others I carried out, I reached several conclusions.
The first conclusion is that the discrepancy between the audit data and the last access time arises because a program displaying an open or save panel sometimes opens files in the currently selected folder without reading any data from them. The open is audited, but the last access time does not change.
Sometimes the program does open a file and read its contents. In that case the audit data and the last access time agree.
In particular, programs do this when the file is actually a bundle and not a simple file. Apple’s latest Pages, Keynote, and Numbers documents fall into this category.
However, sometimes the program also reads the contents of simple files, like the PHP text file I have in the folder. I have not yet figured out the conditions under which a program’s open or save panel chooses to read the content of these files.
It is important to note that even though both the audit data and last access time say that a file was opened and read, this does not mean the user selected that file or that its contents were ever shown to the user.
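The reasoning above can be summarized as a small lookup. This is only a sketch of how I read the two pieces of evidence together. The function name interpret and the wording of its results are hypothetical, and it assumes the volume actually records access times.

```python
def interpret(audit_shows_open, atime_changed):
    """Combine the audit trail and the last access time into a cautious reading.

    Neither result says the user selected the file or saw its contents.
    """
    if audit_shows_open and atime_changed:
        return "opened and read"
    if audit_shows_open:
        # The open() was audited, but no read() updated the access time.
        return "opened, probably not read"
    if atime_changed:
        # Not a case examined in this experiment.
        return "not covered here"
    return "no evidence of access"
```

For example, the first open_test run corresponds to interpret(True, False), and the second run corresponds to interpret(True, True).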
While not shown in the experiments discussed here, sandboxed applications behave differently when opening existing documents or saving new ones.
When you are running a sandboxed app, and you choose to open an existing document or save a new document, the actual application does not open the files in the folder. The files are still read, but this duty is handed off to a number of other Apple processes. While this will still lead to a change in a file’s last access time, the user’s intent and what is actually happening are much easier to understand when looking at the audit data.
By handing off the open/save panel to Apple processes there is less noise in the audit data. In particular, when a sandboxed app opens a document, it usually means the user selected that document. In other words, it reflects actual use. This reduced noise makes detection easier, and in particular, it makes Data Fence a more valuable program.
The behavior of non-sandboxed applications opening lots of files when displaying an open or save panel led me to develop a two-fence strategy when monitoring access to files with Data Fence.
The first fence is for the highly paranoid, in that it does not trust anything. It can chatter a bit when a non-sandboxed app displays an open or save panel, but this chattering can be safely ignored.
However, if you hear the fence rule firing when you are not explicitly opening or saving a document, it may mean that one of your standard applications has a vulnerability that is currently being exploited, or that one of your applications is a Trojan horse. These are the alerts you may want to pay attention to.
As more of the applications you use become sandboxed (a requirement for new apps submitted to the Mac App Store), the false positive chattering should die away. That will make the first, paranoid fence more useful.
The second type of fence I use to protect the data is more forgiving to non-sandboxed apps. It trusts programs you’ve installed in the Applications folder. This second rule type, which I personally use the chimpanzee scream for, is great for detecting malicious code that has been dropped onto your machine without your knowledge and is harvesting data for exfiltration.
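The two fences can be pictured with a short sketch. This is not Data Fence’s rule syntax, which is not shown in this article. The function fence_alerts, the use of the alert sounds as labels, and the simple path test are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumption: "installed in the Applications folder" means a path under /Applications.
TRUSTED_ROOT = PurePosixPath("/Applications")

def fence_alerts(program_path):
    """Return the alert sounds raised when `program_path` opens a protected file."""
    alerts = ["tink"]  # first fence: trusts nothing, so it always fires
    path = PurePosixPath(program_path)
    if TRUSTED_ROOT not in path.parents:
        # Second fence: fires only for programs outside the trusted folder.
        alerts.append("scream")
    return alerts
```

Under these assumptions, a program at /Applications/Safari.app/Contents/MacOS/Safari raises only the first alert, while a program at a location such as /tmp/open_test raises both, which matches what the experiment showed. A real rule would also need to resolve “..” components and symbolic links before comparing paths.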
While this has been a fairly technical presentation, I hope it gives you the underlying reason for the two-fence strategy I recommend for Data Fence. And should you ever need to conduct a forensic analysis on a Mac, I hope this helps you understand some of the data better.
Finally, when given a choice between a non-sandboxed application and a sandboxed application, I recommend you go with the sandboxed application. It not only makes it harder for that program to be used as a vector for attacking your computer, it also reduces the noise in the data, which makes it easier to detect and analyze attacks that use other vectors.
Sandboxing – it really is a good idea.