>> Today we’ve got Andrew Hoog from viaForensics, correct?

>> HOOG: That’s correct.

>> All right. And he’s going to give us a brief chat on...

>> HOOG: Digital forensics.

>> Digital forensics.

>> HOOG: All right. Well, first of all I want to thank Google for the invite today. Most of the companies that we end up working with would like us to go out of business. Instead, I met Fitz a little while ago and he said, “Hey, why don’t you come out and give a talk about digital forensics?” And I said, “I’d love to do that.” I said, “[INDISTINCT] a lot of people at Google that may not like some of the things that we’ve uncovered, some of the chances, I...” “It doesn’t matter. Come on, we want to hear about it.” So, I want to thank you guys for the chance to be here.

We’re going to talk today about digital forensics. Obviously, with the crowd that’s here today, it’s going to be a very technical talk, but if you have any questions in the middle of it, go ahead and interrupt me. Otherwise, we’ll cover some at the end.

viaForensics has been around since about 2008. My background is as a computer scientist. I’ve had various management roles in different companies, and then in about 2008 we started viaForensics. We’ve got a couple of books; they literally came out this week. One of them is on Android forensics, which is my particular specialty. The other one is on iPhone forensics, which I’m sure would be a big hit here. So, those just came out, and we focus on both forensics and data security. We’ve got a couple of patents pending in this space. I also do quite a bit of expert witness work; in the forensics space, you actually have to have certifications, and then you can be an expert in state and federal courts. I’m also [INDISTINCT] like most of you; I’ve been using [INDISTINCT] for quite some time. And I remember the first time I ever learned [INDISTINCT] special edition came out with an 800-page book and I literally opened it up,
started at page 1 and went through the whole thing. I was hooked from there and haven’t looked back since.

We do quite a bit of work in the mobile space. I’ll just tell you a quick story. This gentleman approaches us and needs his phone examined, and so we said, “Okay, but what kind of phone is it? Is it an Android device? You know, those guys are rolling out 400,000 devices a day.” He said, “No, it’s not an Android device.” We said, “How about iOS? Is it an iPhone? You’ve got 200 million iPhones out there.” He’s like, “Oh no, no, no. I don’t have a smartphone, I have a BlackBerry.” So, we do quite a bit of work in the mobile space, and again, iPhone and Android are really where we spend quite a bit of time.

Today’s talk is really meant to be an overview of digital forensics. It’s going to be a quick run-through. We’re going to skip over some of the detailed and boring stuff and jump right into examples. If you’re interested in tinkering in this space, it’ll give you some things that you can go back and install on your workstations and start playing around with.

But briefly, digital forensics is a science. It’s recognized as a science, which is why I can be an expert in federal court. We’re interested in the preservation, analysis and reporting of digital artifacts. That typically covers computers, laptops, and obviously things like thumb drives and USB storage. Mobile phones have become a very, very big deal; that’s why we chose to specialize. It also covers electronic documents that are used in court cases.

What we’ll talk about near the very end is that forensics is typically a reactive science: we get called in when there’s been a problem, when there’s been a civil lawsuit or criminal case, an intrusion, incident response. The big thing that we’re interested in, kind of a [INDISTINCT] of the company, is this: we do all the forensics cases, and that’s kind of fun and interesting, you learn a bunch of stuff. What gets really exciting is when you move
forensics out of the reactive and move it into the proactive space. And so near the end, we’ll cover a couple of topics in mobile app security and in enterprise security that kind of step outside the typical forensics role of being reactive, coming in after the scene, and instead do things proactively.

So real quickly, there are three types of storage devices that we typically deal with. The traditional hard drive, spinning magnetic media, is pretty simple. We can physically disconnect these things, hook up write blockers and deal with them. Then there are solid-state drives. Every time a new technology comes out, there’s a bunch of [INDISTINCT] in the industry that say, “This is going to change. We’re not going to recover any data. Forensics is over,” and that never happens. So obviously, we deal with solid-state drives, and they have their own host of issues and challenges that they come with. Where we really play quite a bit is similar to the solid-state drives, but it is basically the raw NAND Flash memory.

And so this is the type of memory that you’ll find on a smartphone, on a USB thumb drive, and on other types of portable devices. They’re obviously not easy to remove. We have some techniques where you can hook up JTAG clips, take over the CPU and basically pull data out in a debug mode. Or you can do chip-off, where you basically take the flash chip off of the PCB, put it into a special reader, and pull the data off that way. But for the most part, you’re not pulling data off of these devices with a physical technique, so you have to come up with other ways to image them.

The other big thing about NAND Flash, and the reason why I spend a lot of time on this, is that NAND Flash has really changed the forensics and, I’m sorry, the security space. It has many characteristics. Your average NAND Flash can only take about 10,000 writes; it won’t sustain a charge after that. For that reason, the Android team here at Google, the folks at Apple and a number of other companies developed or chose special file systems that are optimized for NAND storage. We’ll come back to that in a little while when we talk about the types of data that you can recover off of it.

The other thing is that in the forensics space, we’re very interested in preserving a piece of evidence and proving to the court, or whomever it may be, that we have an exact copy. With traditional hard drives, that was very simple for us to do. We pull the plug, hook up a write blocker, and as long as you don’t plug that thing back in, you can image it ten times and we’ll always be able to verify it and show we have a bit-for-bit copy. With NAND Flash memory, thumb drives and solid-state drives, that’s impossible to do. The reason is that there’s a lot of drive management going on behind the scenes, and that prevents you from getting an exact copy. We’ll talk about some strategies to get around that, but NAND Flash memory is the other big type of storage that we deal with considerably.

So, in the forensics space, we have to talk
about how we’re going to acquire our data. There are three primary things that we do.

The simplest approach is essentially working with backup files. We’ll get these in court cases that involve discovery, where we don’t have to come in and look for deleted data. They really just need to get a bunch of files out, take a look at them and analyze them. We also do this quite a bit on iPhones, where we’ll do a backup of an iPhone and then logically analyze the backup files that came off. You’ll see it with email files, Word docs, all different types of documents. It’s the least forensically sound, it’s the least interesting from a technical perspective, and it’s probably the most common thing that people do, because most cases simply don’t warrant the other techniques.

A second approach, and I’m emphasizing this in the mobile space, is the logical acquisition. On an iPhone or an Android device, we can pull out data through content providers, or we can pull out data through the Apple backup protocol. But what if we can get into the phone and basically do a tar.gz of the entire file system? Now, I’m not going to get deleted data, but I can get anything in /mobile or /data and everything underneath that. So that’s a type of acquisition that we do, and you could do these on Windows computers too. You are not pulling out all the unallocated space, but you are going in there and preserving the date and time stamps and everything of that sort on the actual file system. So we consider that a logical acquisition of the device.

And then the gold standard of what we really strive for in the forensics space is a physical acquisition. A physical acquisition is a bit-for-bit copy of the storage medium at the time we did our acquisition. In the traditional hard drive space, we can repeat and verify that. But again, with NAND Flash memory and solid-state drives, we’ll be getting a point-in-time copy. The device, whenever it’s powered on, even if it’s connected to a write blocker, will
always be changing behind the scenes. The nice thing about a physical acquisition is that we can easily recover deleted data out of it. There are some specialized tools, some specialized software and hardware that we can use to do the physical acquisition. But in terms of just tinkering, anybody can hook up a drive and do a physical acquisition with basically, say, a USB adapter hooked up to a Linux box. We’ll go through some of the tools that you can use to do that. So physical acquisition is really the gold standard of what we’re looking for.

Now, how do we do the verification that we have an exact copy of the data? Well, it’s very simple: we use hash values. They’re accepted in court, and everybody essentially knows how they work. For those that may not be familiar, a hash is just a hex value that’s calculated from some sort of input data. The nice thing about a hash value, of course, is that a single-byte change in your source data will have an avalanche effect and produce a radically different hash value. So we use hash values, they’re admissible in court, and they allow us to say the [INDISTINCT] that are identical: I have an exact copy of the original, and I did all my investigation on the copy. Now we don’t have to produce that physical media every single time we do referencing. And again, this is the challenge in the NAND Flash,
in the mobile space, is that we just can’t get a hash signature to stay the same when we image that same device [INDISTINCT] times. So you kind of do a point-in-time hash signature and basically say, “This is what it was, the data hasn’t changed since then, and we’re going to operate off of that data set.”

There are two common ones. MD5 is what most people use. The forensics folks are starting to use SHA-256, because there’s a possibility of some collisions now that the number of files has increased. And again, for anybody that hasn’t seen an MD5, if you took my name and ran it through MD5, the hex signature at the bottom of this slide is what you would get out for that particular data set. We do this on entire drives. Yes?

>> Sorry, so, I apologize for [INDISTINCT] my question. I know that if you’ve got [INDISTINCT], there’s something that’s allowed to [INDISTINCT], like move things around, and it’s those additional writes that would explain why you’re not getting any [INDISTINCT] hash?

>> HOOG: Exactly.

>> But if you turn off writes to the thing, you should be able to get a hash on the flash as well?

>> HOOG: The problem is, behind the scenes, even if you’re not writing, the disk is still managing its space actively. There’s wear leveling, and it’s very difficult for forensics folks to come in, because most of the information on that topic is intellectual property. So when we grab a solid-state drive from Toshiba or Intel, that...

>> They don’t tell you what they’re doing?
>> HOOG: They don’t tell you what they’re doing. The wear leveling, the bad block management, the remapping and moving of data around to optimize it, all of that happens behind the scenes and we’re not aware of it. Now, we’ll talk about how Android is a little bit different; we have some more access in the Android space. It’s still problematic, and you don’t even have to write anything. You can literally hook up a write blocker so nothing is being written, and it’ll still come out with a different hash value.

>> Okay.

>> HOOG: So let’s talk for a minute about how to acquire a forensic image. Conceptually, if possible, if you’re dealing with a solid-state drive that you pull out or a traditional hard drive, you hook it up to a physical write blocker. These are little black boxes you can buy from [INDISTINCT] and a number of other companies. A write blocker physically prevents any writes from ever going back to the drive; it essentially intercepts them and doesn’t pass them through. There are software techniques you can use: on Linux you can flip some flags, and Windows has got a USB driver. But if you’re really doing something for court, or it may be used somewhere else, you know, don’t mess around; that’s why [INDISTINCT] on a physical write blocker. And again, this is essentially impossible to do in the NAND Flash space unless you do a physical chip removal and put the chip on a chip reader, where you’ve stripped away essentially all of that flash translation layer and things of that sort.

Then you physically acquire the device with software. We don’t do a lot of commercial stuff; in fact, I don’t think we use any commercial tools in house for this. We primarily focus on open source, and the presentation will give you examples that are all open source. There’s a whole bunch of different tools out there. They’re maintained sometimes by different federal agencies, sometimes by different forensics companies. There’s a
couple of examples up here. The Department of Defense’s dc3dd is the one that we use the most, the mobile [INDISTINCT] example. There are also some free tools out there. For instance, there’s FTK Imager. FTK is a commercial forensics product, but they have a free imager out there that you can use to acquire an image. You can download that and run a command-line version on Mac, Windows and, I think, a couple of other environments. And then there are the full-blown commercial tools that will also do this. A lot of forensic shops go down the commercial route, they kind of drink that Kool-Aid, and they do all of their acquisition and analysis in a particular commercial tool.

After you do the forensic acquisition, you then want to do the verification, where you essentially reread the source device, compare the hash signature and make sure that you have that identical copy.

So here’s an example. If you guys want to refer back to this, we’re going to post it on viaforensics.com, our web site, so anybody can take a look at it, and I know the Google folks are going to put this up on YouTube. The Department of Defense has a Cyber Crime Center. They have a vested interest in making sure there’s validated software that works and allows them to do their job, and so they put that out there as open source. It’s a patched version of the dd that you’ve seen on many Unix systems, but they added a number of features that are helpful in the forensics space. I put an example up here where essentially you run the dc3dd command and give it your source device, which would typically be /dev/sda or sdb, whatever [INDISTINCT] device has been assigned by the operating system. Then you put “of=”, which is the output file that you’re going to write into, so you give it some name like drive01.dd. You turn on verbose, do a hash signature on the fly, track that in a log file, and the very last thing is rec=off.
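
As a rough illustration of the acquire-then-verify workflow just described, here is a minimal Python sketch. It is not dc3dd and does not reproduce its options; the function names (`image_with_hash`, `hash_file`, `verify_image`) and the chunk size are hypothetical choices made for this example. It copies a source to an image file while hashing on the fly, then rereads both and compares the hashes.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time (arbitrary choice for this sketch)


def image_with_hash(source_path, image_path, algorithm="sha256"):
    """Copy source_path to image_path, hashing the bytes as they are read."""
    digest = hashlib.new(algorithm)
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK_SIZE)
            if not chunk:
                break
            digest.update(chunk)  # hash on the fly, during acquisition
            dst.write(chunk)
    return digest.hexdigest()


def hash_file(path, algorithm="sha256"):
    """Hash a file (or device node) in chunks so it never has to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(source_path, image_path, acquisition_hash, algorithm="sha256"):
    """Reread the source and the image; both must match the acquisition hash."""
    return (hash_file(source_path, algorithm) == acquisition_hash
            and hash_file(image_path, algorithm) == acquisition_hash)
```

On a write-blocked hard drive the verification step should succeed every time. On NAND Flash, as discussed earlier in the talk, rereading the source may give a different hash even when nothing was written, so in that case the acquisition hash serves as a point-in-time record for the image.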

And that rec option basically determines how you handle it when the drive has errors. We [INDISTINCT] two drives yesterday. It was going to be a one-day turnaround; we were going to get them back overnight, that same day. Wouldn’t you know it, both drives were throwing read errors, so we were unable to acquire the drives and get a hash signature. So this tells the program what to do when it encounters an error. We basically say, “If you see an error, do you want to keep going with the recovery, or do you want to stop and then go figure out what to do next?”

I built this from source. If you’re going to use this on a workstation, it takes like 10 seconds; you can just download it from SourceForge. And I just gave an example here. On the second line, you can see write protect is on. This was a 500 gig hard drive. When you use a physical write blocker and you connect it up, if you look in the system messages, or dmesg on your Linux box or whatever, you’ll actually see “write protect is on,” so the operating system has detected that it’s unable to write to the device. This is the type of log that we capture to show the process that we use.

About 10 percent of our cases involve failing hard drives. So what do you do with those? You don’t want to throw in the towel and say you can’t get anything off the device. So you have that little flag that says, “Hey, what do you want to do?” You can stop when you have an error and then decide, “Hey, I’m going to go down a totally different path and try to figure out what I’m going to do.” A lot of times people will continue on error and simply pad the sectors that they can’t read with nulls. That way you maintain the same size for the dd image as the actual hard drive: you rip past the bad blocks and pad them with zeros, and then you decide whether you go back and explain that later or how you want to deal with it. The other option that exists is that you skip the bad blocks, and that’s probably a really
bad idea. If you image a 500 gig hard drive and 499 gigs come through, you’ve got a problem explaining why there are these differences.

So this is an example of what you’ll see, or rather what you don’t want to see. We’re imaging a hard drive that’s connected as sde, we start getting abort commands, we can’t sense the information, and it says, “Hey, I can’t read this sector. I’ve got to find out I’m buffering or I can’t do anything.” The trick that we found, and it’s great software (again, this is open source, under the [INDISTINCT] project), is ddrescue. It’s an extremely powerful program. You go out there, you compile it, and what it does is it begins to read the drive as fast as it can. As soon as it starts hitting bad blocks, it essentially skips over them and keeps going. The idea is, if your hard drive is going to fail, let’s rip every piece of data we can off of it as quickly as possible. It maintains a list of which blocks are bad, and as soon as it gets to the end of the drive, it goes back and makes the size of the sector that it reads smaller and smaller and smaller. It takes a long time, but we typically get very, very good recovery by skipping over the bad blocks first. It has the ability to read things backwards, to read them direct or indirect; it has a whole bunch of different options. So if you ever have a hard drive from a family member, and I’m sure you guys have gotten these requests before, and the hard drive has partially failed, all is not lost. Take a look at ddrescue. It’s a great program, and again, you can just download it and compile it.

So, I want to give just a quick overview of what a typical forensic investigation looks like. I say typical because there really isn’t a typical one, but there are a number of steps that you ought to consider. We believe very, very strongly in building a timeline of events; it’s the first thing that we want to do when we get a computer. We want to figure out the entire, what we call MACB, the
Modified, Accessed, Changed and Created timestamps on the entire file system. We want to rip through the metadata inside the files themselves, and we want to build an entire timeline of anything that happened on that device. Then we can zero in and say this file was modified at this time, we saw a registry change here, somebody connected a USB drive, and we can find out everything that was happening. So the first thing that we do is create a timeline, and I’m going to step you through these in some examples. You don’t have to mount the dd image to do that; we’ve got some special open source software that will allow you to do it.

We then mount the dd image read-only. We’re only working on the copy, but we still mount that read-only. We list off every file that’s on there and every file that the file system is aware was deleted, so we list every single file, deleted and not deleted. And then we begin to analyze key files. If you’re in the Windows space, you’re going to be looking at registry files, link files, the user profile, web history, whatever it happens to be. If you’re running [INDISTINCT] or Linux, you’re going to look at the bash history, you’re going to look at the recently run programs, you’re going to look at the gvfs metadata about the file systems that have been connected. You’re basically going to try to piece together all of the information on the system.

At that point, we typically move into recovering deleted files that may
be important to the case. Now, deleted files are typically still referenced inside the database, if you will, of the file system. In an NTFS file system, you have the Master File Table, the MFT, or you’ve got FAT file systems that have their own FAT table. It is essentially a list of all the different files that it’s aware of, where their inodes are, where they point. When you delete a file, the entries can still exist in that master database. So we’ll go into the NTFS database, the MFT, parse it out and then decide whether we can recover some of these deleted files. I’ll show you an example of that in a minute.

If you’re unable to recover them, there’s still the possibility that the data exists in what we call unallocated space, space that was perhaps allocated at some point. The operating system says, “I am not using this anymore,” but maybe there are files or file fragments in there. So we use a technique called file carving: we’ll go in and see if we can extract, or carve out, files from unallocated space. Then we may do something like a full index search of the dd image and a full index search of all of the logical files, so that we can come in and search for keywords and things of that sort.

And then from there it really goes in a million directions. People hide sensitive data in other files; that’s called steganography. You can go in and try to figure these things out. So there are a million different specialties in the forensics space, but these first six or seven steps are really what you’re going to do in many, many investigations to get started.

So, in this part of the talk, we’re going to go into specific examples. These are all open source tools that, again, you can download and install. There’s an excellent tool that we use all the time. It’s called the Sleuth Kit. It’s written by Brian Carrier, and he still actively maintains it. He also wrote a fantastic book; we call it FSFA. It’s File System Forensic Analysis. It’s 400 pages of
everything you wanted or didn’t want to know about file systems. If you have insomnia, and with all respect to Brian, pick up that book late in the evening and you’ll be set. Or pick up our book; it’s pretty much the same thing. But when you need to go in there and understand why Microsoft updates this timestamp in milliseconds, this one in an hour, and this other one every two seconds, Brian’s got all the details in his book. It’s kind of the bible for forensics people when it comes to file systems.

So he’s got the book, and then he has the Sleuth Kit that’s out there. You can install that with different forensic packages and whatnot, but again, if you’re going to be playing around with this, just download it and build it from source. It’s very, very simple to compile. He pushed out an update two or three days ago. It supports a lot of the file systems that you may run into: NTFS, FAT, different Linux file systems, CD-ROMs. And it’s just sitting out there at sleuthkit.org.

So we’re going to spend a few slides going through some examples. One of the first programs you’ll find in there is called mmls, media management ls, if you will, and that basically gives you partition info. If you look at the screen here, you can see that I’m doing an mmls on sdb, so that’s on a physically connected disk. You can just as well do these on dd images. And you can see each one of the different pieces of the disk layout. It’s probably quite obvious here that we’re dealing with a Linux file system. You can see that, and this is very, very common, the DOS partition table area is typically 63 sectors long. The first sector tells us everything that we need to know about the partitions, and then you’ve got 62 sectors that are essentially unused, unallocated. So that’s why you see the primary table and then unallocated. And then you can see that we’ve got a Linux partition, ext3. And you go down the table and you can see all the different data. So this tells us, for our physical device or dd image, what does a file system
look like and where should we be looking for data? In a lot of cases, we’re going to jump right into the ext3 or the NTFS partition. If we’ve got somebody that’s very good technically, we might start looking at unallocated space and say, “You know what? Somebody could hide data physically on the drive by moving it into an unallocated partition.” That’s easy enough for us to then see here and kind of focus the investigation. So mmls will give you that background information.

The next thing that you can do is run a program in the Sleuth Kit called fsstat, file system statistics. It will give you a lot of information. On this particular one, you see I switched to a different file. This time I’m looking at a dd image of webOS taken from a Palm Pre. So this is a dd where we went out and got a physical image of a Palm Pre that was running webOS. We had done a secure erase on it, because we wanted to see how effective secure erase is on webOS. By doing fsstat, you can see a lot of information; it didn’t all fit on the screen, so I just cut it off after the first couple of lines. But you can see information about the file system: What file system is it? What was the volume ID? When was it last written, updated or mounted? There’s a whole bunch of information, and it gets into all the metadata and then will individually
list out each of the files and the inodes, what files they’re connected to, and essentially allow you to reverse engineer it and recover information. So that’s fsstat.

Now, the one that’s really interesting, and that we use quite often, is something called fls, forensic ls. This is a utility where we can come in and query things like the Master File Table, the MFT, that’s part of NTFS. We can say, “Rip through that whole database and tell me everything that you see on the system.” You can provide different offsets, so you can take a dd image and examine just that third partition or that fourth partition. So fls will basically rip through and pull out everything about the file system that it can find. Again, I refer back to this MACB. This is going to give us any time that a file is modified, accessed, changed or created, and we use this to build our timeline analysis.

Here’s an example of a command running fls. I’m putting in the Central time zone; otherwise it will be in GMT. We track what the skew is in terms of the real time versus what the BIOS is reporting. So when we do an investigation, we boot up the computer, we look at the atomic clocks that we have, we look at the BIOS, and we figure out, say, there’s a three-second skew. That’s probably most important in something like an incident response, where we have to come in and try to decide whether or not something happened three seconds earlier, and it matters if we’re matching up logs. So you can tell it what the skew offset is, you give it a label, a file system type and some offsets, and then you basically point it at the file.

In this particular example, looking down here at the command, I ran it against an NTFS file system. And you can begin to see the different files; it’s difficult to read, and the next slide will address that. If you notice, about halfway down, there’s one that says $MFT and then $MFTMirr. Those are the two NTFS databases that track your entire file system. It
actually stores a primary MFT and then it mirrors the MFT. So if somebody tries to wipe out your entire system, there’s some ability to protect you: you can come back, grab the mirror MFT and essentially recover what has been mirrored by the operating system. So by looking at it from a forensics perspective, we’re actually looking at the dollar-sign special files that you don’t have access to through the normal operating system, and we can then parse that information out.

Now, looking at it in this format is a little challenging. The file that you created is called a body file; that’s just the terminology that the forensics community came up with. What you do is you then take a program called mactime, you point it at the body file and you say, “Hey, I need to make this human readable, so give me something that’s better to use.” In forensics, what we’ll typically do is put it into CSV, and then we can hand this off to attorneys. We can go in and filter it and say, “Hey, show me everything that was modified at this time. Show me anything that was deleted,” or whatnot. In this particular example (I’m going to go back a page... well, maybe we can cover it later), you can actually see the files that have been deleted and the files that have been deleted and reallocated. If they’ve been reallocated, that space has basically been reused by another file, so we can’t fully recover it. If it simply shows up as deleted, we have tools that will then jump in and recover that deleted information off of the drive or the dd image.

So, once you’ve got the dd image, you’ve got a forensic copy and you’ve got your hash signature, now what do you want to do? You need to mount that dd image. You need to be able to open that single file up and do stuff on it.

>> I just want... under what circumstances is there a record of the file actually staying on the disk, either as just deleted or as reallocated?
>> HOOG: So, if somebody deletes it... and I actually had a different example, and I think I changed it out at the last minute. But if somebody comes in and deletes a file on an NTFS file system (and we’ll talk about YAFFS2 next, which is a log-structured file system and totally different in how it handles this), essentially that record stays in the MFT database until it gets reallocated, until the file system says, “Hey, I need to reuse that space.” So what we end up having is that the file system marks it as deleted, but it’s still sitting there. It’s still allocated on the disk. It’s referenced as deleted, so it’s no longer shown in the actual file system. So when you come in with something like fls, you’ll find tons of references to deleted files that are sitting there and recoverable.

Now, there’s another case where it’s still sitting in the MFT, but some of the sectors on the disk that were comprising that file get reallocated to another file. Then we have a situation where we’re aware that the file existed, but part or all of it has been reused on the disk. So then we get a status flag of deleted/reallocated.

>> But for most NTFS systems, files that you deleted years ago still show up?

>> HOOG: It’s kind of a mix. The system-level files tend to get reallocated quite a bit, but we find a lot of user-based files that we do end up recovering, and it depends. If somebody had a 250 gig hard drive and they only used 30 gigs, and let’s say they came in and deleted all their files, or went into Internet Explorer and tried to clear the cache because, you know, they wanted to hide what they were doing, we’ll essentially recover all of that. Now, if it was five years ago, it kind of depends.

>> You also have a record of files...

>> HOOG: A lot of times you have the record, unless the MFT itself does a cleanup and basically, you know, completely gets rid of that. But in general, we see all that information.

So, in this particular example, we need to mount the dd image. We come in with mmls and take a look at the dd image that we have out there. Just like you saw with the physical disk, you can see the partition table. This actually was pulled out of an Android device and was the SD card. We pulled out the SD card, we imaged it, and we can see that, unlike most hard drives, we actually have the primary partition table in basically the first sector, then we have the sectors up through a hundred and twenty-eight that are unallocated, and then at sector a hundred and twenty-nine [INDISTINCT] the FAT16 file system. So we basically use that information, that 129, to then go out and mount the file system. We go out and create a directory, and with sudo access you basically say, “Hey, I want you to mount the vfat file system. I want you to mount it on a loopback,” so we’re setting up a loopback device because we don’t have a physical device that we’re using, “mounted read-only, and here’s my offset.” The offset number is basically the start of the FAT16 partition times the size of a sector, so 129 times 512 gets you out to 66048. So it tells mount to seek out to that part of the file, to mount it read-only as a FAT file system, and then here’s my dd file and where to
mount it If you then go out and take a look at the–at your mount tables, you’ll see that on /dev/loop0, we have the VFAT File System, and then you can see that the VFAT File System here at the very bottom has got 1.9 gigs, 244 megs are used So, you can basically mount that dd image on your workstation At that point in time you jump in and you can do any analysis that you want because you’re working on a read-only copy of the original source media So, a couple more slides and I want to just kind of give you some ideas There’s a gentleman I’ve been speaking with for probably over a year now, Kristinn Gudjonsson, he’s out in Iceland, he developed Log2timeline, which I slightly misspelled and I have it here But Log2timeline was Kristinn’s attempt to basically say there’s a lot of valuable metadata in individual files, sitting in registry files I can pull out timing info from registry files, from the event logs, from the MFT, prefetch, browser history, flash cookies done by the flash [INDISTINCT] So, he’s got 46 different file types that he can extract timeline data out of And so, if you download his software and essentially compile that, he’ll export it into ten different formats, it’s just sitting out there at log2timeline It’s great software and so what we do is we run a piece of his software, it’s called Timescanner, so we basically tell Timescanner to go in, to look at the mounted SD card directory that we mounted the file system at, to put everything in Central Time Zone and to rip off any piece of forensic metadata, file–timeline metadata it can from all the files that it finds And so it’ll find DLLs, what time the DLLs were created, what sort of cookies are found, any kind of information, and it will put that into a body file We take that same body file that the Sleuth Kit helped us build, put those two things together and then we run a [INDISTINCT] against that and create a–it’s basically called a super timeline So, we’ve got every piece of information we could want, 
whether we could positively track out that device and now we’ve got a timeline A couple of other tools to mention: Harlan Carvey, he’s also published a few books [INDISTINCT] he focuses really only on the Windows space and he developed a tool called RegRipper A lot of people use these together, in Perl–I tried to convince Kristinn to move to Python and I think he’s considering it Harlan does all his stuff in Perl and he actually wrote it for the Windows platform but there is a Linux [INDISTINCT] which is the one that we use And the goal of RegRipper is essentially to parse out the Windows registry files, pull out every piece of information it can possibly get out of the registry And it’s pretty amazing what you can find there in the registry So, you can go out to regripper.wordpress.com and essentially download that tool, compile it and you can specify the registry file and then what sort of data you want to extract

from it This is open source software and it will rip forensic data out of Windows systems The last tool that I want to talk about is Scalpel Scalpel is a file carving utility Again, this is open source For years and years it was sitting, I’ve [INDISTINCT] and about a month ago, they released a [INDISTINCT] version So you can go out to the website, download Scalpel, you can compile it, and they don’t actually have a make install, so you can basically copy that into /usr/local/bin And what you do with Scalpel is that many files have essentially a magic number at the very top, and you can identify a SQLite file, you can identify a JPEG And so what Scalpel does is it rips through the dd image and it says, “Hey, I’m looking for any of these known file headers.” They specify a bunch of them ahead of time for you You can basically put your own ones in there And anytime it finds one, it then will parse through the system and look for the footer If it has a defined footer, it will go through and do 10k or 800k or whatever you tell it to do It’ll look for a separate type of identifier and then reverse and go backwards, so for some PDF files, you need to find the start, find the bottom marker and then go back up a couple spots, and so it will find the last one So, there’s a lot of functionality built into Scalpel that will allow you to carve files out of the file system So, there’s kind of a standard scalpel.conf that comes with it We developed our own Scalpel configuration for Android and iPhone because they are different types of file systems and we’re pulling out different information All of that goes in the Scalpel output directory and then you can go in and see all the recovered files So, I’m going to shift gears here and I want to talk a little bit about the Android space Android obviously uses NAND Flash Memory This is, again, we have a specialty in this We’ve got our books out, we’ve got some commercial software Unlike iPhone and other platforms, 
the Android folks decided to not have NAND Flash Memory where the manufacturer had to use a certain one So, it allows them to use any NAND Flash that they want And they provide this layer that sits between the developer and the NAND Flash called the Flash Translation Layer, that basically exposes the flash as a block [INDISTINCT] So, that is implemented in software in the Android space and the Flash Translation Layer handles the wear leveling, bad block management, some of the stuff we were talking about earlier In Android and in Linux, the Flash Translation Layer that most people use is called MTD, Memory Technology Devices Again, it’s another open source project The newer Android devices, Samsung started doing this first They’re actually beginning to move away from MTD and they’re coming out with their own NAND Flash chips that have the Flash Translation Layer built into the [INDISTINCT] it’s already baked in, so we don’t have the same kind of access we had in the earlier Android devices But–so those are built into the [INDISTINCT] but on a lot of the other phones, we still have the /dev/mtd devices where we can do our physical imaging Now, MTD divides the memory essentially into different blocks The setup is a little different than a traditional hard drive You’re normally looking at a 128k block and there are 64 bytes of Out-of-Band data that are stored inside the block for each chunk or each particular page And inside that is where–for instance, YAFFS2 stores a bunch of metadata, bad blocks, error correction code and things of that sort So, this is kind of what it looks like in the Android space So, if you have an Android device using MTD, Memory Technology Devices, to access your NAND memory, you basically have 132k as your block size, in which case you have 64 two-kilobyte chunks and after each one of these 2k chunks you have 64 bytes of Out-of-Band data The great thing about doing forensics on Android with MTD is that when you’re able to get your hands on the OOB 
data, you can do a lot more with the devices because we’re actually seeing how the NAND Flash is being managed by the Flash Translation Layer So, we can see where the bad blocks have been marked What is the wear leveling technique? Can we reassemble the blocks back in [INDISTINCT] allocation even though they’re scattered all over the physical image? So that is a big change for us and something that in the Android space, we’re now able to do, and this is kind of what it looks like As we talked about earlier, there’s a couple of different forensic techniques that you use You’ve got your physical techniques, but the first thing that you start out with is a logical recovery In most cases it’s sufficient to start there, it’s the least complex In the Android space, you do a logical recovery using content providers It’s an interface that the Android team built in to allow apps to share data So we essentially come in, we say, “Hey, we want to share some of the–take some of that information that’s being shared.” We have a free tool that we developed We give it away to law enforcement and to different government agencies It’s called AFLogical and it basically goes out and it takes the content providers, it reads that information logically, so it won’t get any deleted data
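
For readers who want to check the numbers, the MTD layout described above (64 chunks of 2 KB per block, each followed by 64 bytes of Out-of-Band data) can be sketched in a few lines of Python. This is a minimal sketch and not the speaker's tooling: the constant and function names are hypothetical, and it assumes a raw dump in which each chunk's OOB bytes directly follow that chunk.

```python
# Sizes taken from the talk; the names themselves are illustrative only.
CHUNK_SIZE = 2048        # bytes of data per chunk (page)
OOB_SIZE = 64            # bytes of Out-of-Band data after each chunk
CHUNKS_PER_BLOCK = 64    # chunks in one erase block


def raw_block_size() -> int:
    """Size in bytes of one block in a dump that includes the OOB data."""
    return CHUNKS_PER_BLOCK * (CHUNK_SIZE + OOB_SIZE)


def split_chunks(raw: bytes):
    """Yield (data, oob) pairs from a dump where OOB follows each chunk.

    Assumption: the dump is a whole number of (chunk + OOB) records.
    """
    record = CHUNK_SIZE + OOB_SIZE
    if len(raw) % record != 0:
        raise ValueError("dump length is not a multiple of chunk + OOB size")
    for offset in range(0, len(raw), record):
        data = raw[offset:offset + CHUNK_SIZE]
        oob = raw[offset + CHUNK_SIZE:offset + record]
        yield data, oob


if __name__ == "__main__":
    print(raw_block_size(), "bytes per raw block")
    print(raw_block_size() // 1024, "KiB per raw block")
```

With these sizes one raw block comes to 135168 bytes, which is the 132k figure from the talk (128k of data plus 4k of OOB).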

And then it stores it and analyzes it We [INDISTINCT] about 10 days ago to release the commercial tool based on AFLogical that takes all of the manual stuff that had to be done, does different analysis on it and puts it into a virtual machine and makes it point-and-click, kind of easy So, logical recoveries are the primary thing that you could do on Android devices But we’re interested in moving beyond those content providers because we’re only getting the information that the Android developer chose to share with us So, we can pull SMS, we can pull out–right now, we pull out about 40, 45 content providers, we’re working on a new version that may pull out a couple hundred, but it’s still a limited amount of data So, to get beyond the content providers, you basically need to escalate privileges, you need to get some sort of access to the device Now, if you had the original group of Dev Phones, you just had root access, that was great, no problem This talk is not about how to–how to get [INDISTINCT] on Android If you want to do that, you know, you can Google Dev Phone, go out to XDA, go buy our book So we’re not really going to cover how to do that, but basically if you escalate privileges on the device, you can then take the next step forward, which says, “All right, I want to tar gz up the entire file system.” It’s not the same as having unallocated, but it is going to get us everything under /data/data And if you’re in the Android space, if you can get that directory, you’ve got a lot of what you need So, that would be all of the SQLite databases, preferences, files, pictures, images that app developers are storing inside their protected space that they could–when they spin off any [INDISTINCT] So we’ll–if we can escalate privileges, then we’ll go for a logical acquisition You could push [INDISTINCT] up to the phone as long as you recompile it for the ARM platform, you could tar gz it and send it out over netcat, or you can just use 
something like an adb pull of /data as a recursive pull We have some issues when you do larger recursive pulls, you could run into some issues So, in general, if we’re doing it for a case, we’ll do a tar gz and send it out over netcat But the real goal in the forensic space, of course, is the physical acquisition And so in the Android space, once you’ve escalated privileges–and quite frankly, it’s the same deal in the iPhone space, you get escalated privileges and then you’ve got two options In the Android space, dd comes built in I love it There’s no copy command, there’s no cp command in Android If you want to copy a file and you’ve got shell access, you have to dd it from one file to the next, and I like that and it makes me smile and mostly confuses everybody else If you do the dd, dd does not have access to the Out-of-Band data So, if you go and you do a dd on one of the MTD devices, you’re actually not going to get all of the information that you would want for a forensic analysis Now, it gives you quite a bit of data, it’s going to get you unallocated data, but it’s not going to get you all the pieces of the puzzle So, what you really need to do–and this took us, and some other people, some time to figure out, but you need to go in there and do a full NAND dump That’s going to include all of that Out-of-Band data that we talked about We have a custom version of nanddump that we developed, it allows us to get a full dump of the MTD partition, and then on top of that, you have to deal with things like bad blocks and things of that sort So we basically built our own, you could go out and compile some of the nanddump versions that are available out there and do it for ARM and essentially use that as well And once you do that, now you can take advantage of all of the special stuff that you get with YAFFS2 YAFFS2 is the file system that Google originally chose–it’s basically being moved away from, and some people run ext3 and some people are now–the Google team and the Android teams 
with ext4, but YAFFS2 is great It was open source, it’s a log-structured file system, so the best way to think about that, and I had to look it up when I first read about it, is that it’s essentially like source control on your file system Because it doesn’t ever go back and rewrite a block–it can only erase the block and then write the data there–it just says it’s more efficient for me to write at the front of the log So, if you have a file and you change a couple of bytes in it, it just says go ignore that previous byte and rewrite that entire block at the front of the log So, what we get when we analyze the YAFFS2 file system, if garbage collection hasn’t occurred, is basically an entirely versioned file system where we can recreate every single state the file was ever in Now, of course, in practice, we have to reclaim space on the device and so garbage collection occurs and so we may end up having fragments of different files But in fact, we get a very, very dramatic recovery from the YAFFS2 file system I don’t think we’ve been geeky enough, so I want to take it up one more notch here and say that–let’s take a look at YAFFS2 from a [INDISTINCT] point So, if we’re–you have access on the device, you can get into /dev/mtd, so, here we do a nanddump of /dev/mtd and I wanted to get rid of a bunch of zeros and Fs that

go flying by that are important to the file system but not that interesting when we’re looking at it on screen So, what you essentially have here is we’re looking at the raw NAND flash dump of a particular file At the very top, you can see that the file one.txt is the name The YAFFS2 file system has basically two types of data It either has an object header or it has object data What we’re looking at here is an object header, so this is giving us the file name And then most people would say, “Well, there’s no other information over here, there’s nothing else I can do, there’s just a bunch of binary data and a couple here, so let’s move on.” But honestly, there’s quite a bit more information here, you just have to look at the YAFFS2 source code, figure out what it is So, Android stores integers in little endian, so right to left And if you look in here, I highlighted a couple different things, you’ll see a repeating pattern of 6399D5D4 In the end, this [INDISTINCT] of being a time stamp So what you’ve got is you’ve got a little endian number at the [INDISTINCT] so you actually have to completely reverse these guys So, you take that 6399 and you flip it around completely So you end up with 4D5D9936 You take that number, that hex number and you convert it to a base ten number You come out with a time stamp Android does time stamps in milliseconds for the most part, but here you end up getting the number of seconds since 1970 As soon as you recognize that date format, you can pass it into a number of tools, convert that date, format it as a date-time stamp So, file one.txt was written on Thursday, February 17th at 3:55 PM, which means that I was working on my book in the middle of a workday in mid-February So, this is actually an example that was taken out of the book But it’s very interesting, with the YAFFS2 file system, you could essentially come back in, rip out all of the object headers and recreate every single time that a file 
was accessed, modified or changed on the entire file system And so what you have to do is you’ve got to get into the source code You have to look at the stuff in hex, you have to try to figure out what the data looks like and then essentially write programs The type of stuff that we’re doing here is not supported in the commercial forensic tools for the most part So, what we can do is stay a couple of years ahead of where the big forensics tools are going to be, and we write our own Python scripts to essentially rip through the image, pull off the OOB stuff and do some data carving, go back in, put the file system back in block allocation order, start ripping out the object headers, build a timeline and figure out what happened on this device So, by starting with the basic tools, the Sleuth Kit, dd, hex editors, you can basically get physical images of these devices, work your way all the way up into the hex dumps and then again figure out the file system structure YAFFS2 is interesting, they actually don’t track the access time Because every time a file is accessed, they didn’t want to write a new object header, which would be a new write to the–to the NAND Flash, which would ultimately wear the device out So, there’s an atime that’s in there It’s actually the first time that it was created and then they never update the access time after that But they do also track the modified time and the changed time on the file You could pull up the object ID that’s out in the Out-of-Band, you can do different cross referencing and basically figure out, you know, what file is this, what are the different blocks that are used in the allocation So we could build that entire timeline and then you can also go and begin your [INDISTINCT] files and other pieces of binary data that might be of importance to your investigation or your analysis So, the last slide, to kind of wrap this up: this is all kind of interesting stuff, it’s Android, it’s 
iPhone, it’s whatever the different file systems are, and you can do this on anything that’s out there But the forensic space is kind of in the corner of security So, you’ve got security that sometimes sits at the side, and then on the side of that, we’re all the way off in the corner So, we’re the guys that don’t get out of the lab that often And for a commercial forensics company, the traditional technique was do more investigations How do I get bigger? I do ten times the investigations and then ten times that, and maybe someday we could have a couple hundred employees doing investigations There’s a change that’s happening and we’d like to think that we’re kind of at the forefront of that While we find the forensic investigation, the hex analysis fascinating, what’s far more interesting is if you take this reactive science of Forensics and say, “Let’s not call the Forensics guys in after there’s an incident, let’s kind of invite them to the party ahead of time, you know, we want to be in the nice offices and have the nice foosball tables and hang out with you guys.” So, let’s get us out of the corner and move us into the proactive space And when you apply forensics in the proactive space, amazing things happen And I want to just give you a couple of quick examples

You can check this stuff out online and take a look at it The first thing that we do, we do some basic mobile app security testing It’s low-hanging stuff I mean, it’s kind of the easiest of the easiest So you go out there, you take a device, you may have privileges on it, you may not need privileges depending on whether you use content providers or what the app puts in the backup utilities And you go out there and you look for sensitive data that’s stored on the device in an insecure fashion Now, we’ve been doing this for a little while We’ve got about a hundred mobile app reviews out on our website You can take a look, you can filter it, you can see what applications are storing data in an insecure fashion So what’s interesting about this? Well, by using Forensics, we can spot different issues where we may say to the development team, “Hey, there’s a better way to store this information.” Now, we can have lots of debates about, well, if you’re storing information on a device and you encrypt it and the user does not type in a 32-character, you know, key file every time they want to access their SMS, have you really secured the information? 
And in this space, in the mobile space, especially when you look at the threat to consumers, the main threat to consumers is cyber criminals, people that want to steal their identity, they want to get financial information So what they go for is the easiest stuff, the lowest-hanging fruit If they have to come in, compromise the device, perhaps revoke the–get in there, find out what programs are running, try to pull the encryption keys out, get that data off and then maybe get a username or password, it’s way easier for them to just take all of the different apps that store your username and password in plain text, they just copy them all So, there’s kind of this–yes, you can’t necessarily fully secure a device if somebody gets root on it, but you can make it far more difficult for them So that’s one space, when you apply Forensics to mobile app security and you take a look at what sort of data exists on the device I actually did a presentation down at an American Banker conference, I think it was a week or so ago So, there’s a lot more information about this, and if you hit that second link, you can kind of go through the presentation and get some more details on, “How do you apply Forensics to this space? 
What kind of information can be recovered?” We have a very simple rating of pass or fail Something around 17% of the apps passed, I think somewhere around 30-some percent get a warning, and almost 50% of the apps failed the most basic tests And this is information that you would typically consider private, that would be protected by a username and password, and it can basically be pulled off the device So that’s kind of an interesting space, applying Forensics proactively in the security space and saying, “What can we find out about these devices?” So we could change some of our development techniques and only store the information that really needs to be stored there If my Android device or my iPhone or whatever I happen to have is always online, then why do we have to cache pieces of info? Now, there are applications that require data to be cached In those particular cases, you do a balance between security and usability and a number of other things But a lot of times, we’ll simply find information that has no business being on the device and it’s just sitting there So, that’s kind of an interesting application in the mobile space The, you know, the other space I want to talk about is that when Forensics guys get called in, incident response guys get called in, we’ll come and we’ll look at a computer or we’ll look at a server and something happened It may be an hour ago, most likely it was a day or a week or a month ago, and we basically are told, “Hey, something happened Can you help piece together the puzzle?” And we’re actually really good at that But it’s a really tough job, so we’ll come in and about 70 or 80% of what we need to tell you what happened is gone Network connections, RAM, link files, somebody cleaned up after themselves, it’s gone, you can never get it back Windows does a great thing Windows will only track the last time you plugged in a USB drive And that’s also only when it feels like it Sometimes it just doesn’t track it at all So we come back 
and somebody said, “Well, we know this USB drive had sensitive info on it How many times did they connect it?” We can’t tell you Windows doesn’t track it So instead of coming in after the milk has been spilled and trying to put Humpty Dumpty back together again, there’s a totally different way to approach this problem Forensic metadata is actually pretty tiny If you look at a registry file, it’s a couple [INDISTINCT] So, if you have a key server that has potential information and they’ve taken all of it, don’t wait until you get compromised, just pull those three megs off every day, every hour, every 15 minutes You guys know something about storing data and putting it in a database, and analyzing and making sense of it, right? So, why not just go put that information somewhere? And so that’s what we did It’s one of the other interesting things, we call it Continuous Forensics Monitoring, but the idea is let’s not wait until something happens Let’s see if we can get ahead of that And now you’ve got an exact copy of everything that you need to know And if something happened a day or a week ago, guess what, we’ll just pull it up Yeah, all those USB drives were connected, you guys missed it, we missed it, we weren’t monitoring it And I can tell you who, what, when, I can tell you the network connection, I can tell you what happened So it’s a really interesting space and I just wanted to share it with you guys to think about If you kind of get into the Forensics stuff and you start tinkering, it’s interesting to find what’s sitting on your device, but it’s far more interesting to think about how can

you take the Forensic Science and apply that proactively to security or to improving development techniques so that we can come up with more efficient ways or more effective and more secure ways to store information So, you know, our big goal is to get the Forensics guys out of the corner and we appreciate the opportunity to be with you here today We share tons of information on our website, we update our blogs, we have lots of how to’s out there So if anybody wants to talk to us about it, here’s how you get a hold of us And, you know, by all means, buy our books, give us a call and thanks so much
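
The two pieces of arithmetic from the talk, the mount offset for the FAT16 partition and the YAFFS2 object-header time stamp, can be reproduced with a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the on-disk byte order 36 99 5D 4D is inferred from the reversed value 4D5D9936 quoted in the talk, and a fixed UTC-6 offset stands in for Central Standard Time. Note that this particular value only lands on the stated date when it is read as seconds since 1970.

```python
import struct
from datetime import datetime, timedelta, timezone

SECTOR_SIZE = 512  # bytes per sector, the figure used in the talk

# Assumption: a fixed UTC-6 offset approximates Central Standard Time.
CENTRAL_STANDARD = timezone(timedelta(hours=-6))


def partition_offset(start_sector: int, sector_size: int = SECTOR_SIZE) -> int:
    """Byte offset of a partition inside a dd image (start sector x sector size)."""
    return start_sector * sector_size


def decode_le_timestamp(raw: bytes) -> datetime:
    """Decode a 4-byte little-endian Unix time stamp (seconds) as a UTC datetime."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    (seconds,) = struct.unpack("<I", raw)  # "<I" = little-endian unsigned 32-bit
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    # FAT16 partition starting at sector 129, as in the mmls example.
    print(partition_offset(129))

    # Assumed on-disk bytes for the value the talk reverses to 4D5D9936.
    stamp = decode_le_timestamp(bytes.fromhex("36995d4d"))
    print(stamp.astimezone(CENTRAL_STANDARD).strftime("%A, %B %d %Y %I:%M %p"))
```

Run as a script, this should give 66048 for the offset and a Thursday, February 17 afternoon time stamp, in line with the figures quoted in the talk.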