Thursday 29 October 2009

File Systems

To visualise a timeline of a PC's usage, the information first has to be extracted from the PC. The most common method is to capture an 'image' of the hard disks (and other rewriteable media) within the physical hardware; these images then exist as files on the analyser's own systems, and in a raw image each byte in the file corresponds to the byte at the same offset on the drive itself.

Note that as little work as possible is done on the original hardware; the Association of Chief Police Officers has released guidelines on how evidence should be gathered, and one fundamental principle (unique to digital evidence) is that the data should, wherever possible, be analysed without altering the original. (PDF here)

There are already plenty of free tools that can create these images, from the Unix dd command to AccessData's FTK Imager; replicating this functionality would therefore be pointless (not to mention time-consuming in the development process). We can assume that by the time the visualisation tool starts, the image has already been captured.

The first step of the analysis process is then to work out how the drive was structured. Some manual input may be required here (which is allowable, as an actual person with a real brain would have had to create the images in the first place) in the event that an analysis covers multiple images (i.e. multiple hard drives), but in general what would need to be worked out at this stage is:
  • Partition information, i.e. how the physical drive was divided into individual drive letters; and,
  • The file system of each partition.
Partition information can vary between different operating systems (for example, I believe early versions of DOS allowed only one primary partition, with an extended partition then holding multiple logical drives), and in some cases it may be essential to work out the drive letter that the original PC assigned to each partition.
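
As a rough sketch of the partition step, assuming a raw image of a disk that uses the classic MBR partition table (not GPT) and 512-byte sectors: the table lives in the first sector as four 16-byte entries starting at byte 446, followed by the two-byte signature 0x55 0xAA. The function names and the small type-code table below are my own illustrations rather than part of any existing tool, and logical drives inside an extended partition are not followed.

```python
import struct

SECTOR_SIZE = 512  # assumption: some drives use 4096-byte sectors instead

# A few well-known MBR partition type codes (far from exhaustive).
PARTITION_TYPES = {
    0x01: "FAT12",
    0x05: "Extended",
    0x06: "FAT16",
    0x07: "NTFS/exFAT/HPFS",
    0x0B: "FAT32 (CHS)",
    0x0C: "FAT32 (LBA)",
    0x0E: "FAT16 (LBA)",
    0x0F: "Extended (LBA)",
    0x82: "Linux swap",
    0x83: "Linux",
}


def parse_mbr(sector0: bytes) -> list:
    """Return the non-empty primary partition entries found in an MBR sector."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    partitions = []
    for index in range(4):
        start = 446 + 16 * index
        entry = sector0[start:start + 16]
        # Entry layout: status (1 byte), CHS start (3, skipped), type (1),
        # CHS end (3, skipped), first sector LBA (4, LE), sector count (4, LE).
        status, ptype, lba_start, num_sectors = struct.unpack("<B3xB3xII", entry)
        if ptype == 0x00:
            continue  # type 0 marks an unused slot
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": ptype,
            "type_name": PARTITION_TYPES.get(ptype, "unknown"),
            "byte_offset": lba_start * SECTOR_SIZE,
            "byte_length": num_sectors * SECTOR_SIZE,
        })
    return partitions


def read_partitions(image_path: str) -> list:
    """Read the first sector of a raw disk image and parse its partition table."""
    with open(image_path, "rb") as image:
        return parse_mbr(image.read(SECTOR_SIZE))
```

Something like read_partitions("disk.dd") (a hypothetical file name) would then give the byte offset of each partition within the image. The type code is only a hint about the file system, so it still needs confirming from the partition's own contents, and the drive letters are not stored in the table at all.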

The next step is to then work out the file systems held in each partition; these could be FAT16, FAT32, NTFS, ext3, or any combination of the above. How to work this out is the subject of a later post, however.
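
As a very rough preview, and not the method the later post will necessarily use: several of these file systems leave recognisable signatures at fixed offsets from the start of the partition. The sketch below assumes the caller has already read the first 2048 bytes of a partition from the image; detect_filesystem is a hypothetical name of my own. The FAT label strings are informational rather than authoritative, so a match should be treated as a hint, and telling ext3 apart from ext2 would need the journal feature flag in the superblock, which is not checked here.

```python
def detect_filesystem(data: bytes) -> str:
    """Guess a file system from the first 2048 bytes of a partition."""
    # NTFS stores the OEM ID "NTFS" padded to 8 bytes at offset 3 of the boot sector.
    if data[3:11] == b"NTFS    ":
        return "NTFS"
    # FAT32 boot sectors normally carry a type label at offset 82.
    if data[82:90] == b"FAT32   ":
        return "FAT32"
    # FAT12/FAT16 boot sectors normally carry the label at offset 54.
    if data[54:62] == b"FAT16   ":
        return "FAT16"
    # ext2/ext3/ext4: the superblock starts at byte 1024 and holds the magic
    # number 0xEF53 (little-endian) at offset 56 within it.
    if data[1080:1082] == b"\x53\xef":
        return "ext2/ext3/ext4"
    return "unknown"
```

Combined with a partition's byte offset in the image, this only needs a seek and a 2048-byte read per partition, so it is cheap to run before any deeper parsing.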


  1. If you're aiming to work with the most commonly seen filesystems, you're missing Mac formats - HFS and HFS+ (default Mac FSs) and a couple of *NIX formats - ext2, xfs and reiserfs. Those make up the majority of filesystems seen out in the wild, I think.

  2. I did mean to mention the Mac formats, but couldn't remember their names offhand - thanks for reminding me.

    For timelining, journalling filesystems (which appears to cover pretty much all the newer ones) will probably produce the most interesting, albeit most complex, results; however, I'll aim to implement a couple of formats initially and leave the metaphorical door open to implement others in the future.

    Personally, I'm glad WinFS isn't in use yet.