How to analyze documents with FOCA in ten steps (or fewer)
Every time we create an office document, such as a word processor file (e.g., Microsoft Word), a presentation (PowerPoint), a spreadsheet (Excel), a PDF, or even an image, the file stores far more information by default than we might expect.
Embedded within these files is additional content known as metadata, which can include details such as the author's name, creation/modification dates, or even the document’s title.
While this already provides quite a bit of information, deeper analysis can extract even more data, including highly valuable insights into the infrastructure where the document was created.
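To make the idea concrete, here is a minimal Python sketch that reads the basic metadata fields mentioned above from a modern Office file. It relies on the fact that .docx, .xlsx, and .pptx files are ZIP archives that normally contain a `docProps/core.xml` part. The function name `read_core_properties` and the `FIELDS` mapping are illustrative choices, and this is not a description of how FOCA itself works internally.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an OOXML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly name -> qualified tag inside docProps/core.xml (illustrative selection).
FIELDS = {
    "title": "dc:title",
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(source):
    """Return basic metadata from an OOXML file (a path or a file-like object).

    Raises KeyError if the archive has no docProps/core.xml part.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    result = {}
    for name, tag in FIELDS.items():
        node = root.find(tag, NS)
        if node is not None and node.text:
            result[name] = node.text
    return result
```

Calling `read_core_properties("Test1.docx")` on a real document would return a dictionary containing only the fields that are actually present in the file.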

For instance, it’s possible to extract passwords, usernames, folder names, server names, printers, version history, and more—all from a simple office document.
This kind of information can pose a serious threat not only to personal privacy but also to the security of an entire company or organization. It exposes valuable data that an attacker could use to study your infrastructure (a technique known as fingerprinting) and potentially launch a targeted attack.
In the case of images, the most sensitive information is often the geolocation data, which could, for example, reveal the route of a trip.
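As an illustration of how little work it takes to turn image metadata into a position on a map: EXIF stores latitude and longitude as degrees, minutes, and seconds plus a hemisphere letter (N/S/E/W). The sketch below converts such a value to decimal degrees. The function name `dms_to_decimal` and the sample coordinates are made up for this example, and reading the raw EXIF tags from a file is deliberately left out.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees.

    `ref` is the hemisphere letter; south and west yield negative values.
    """
    if ref not in ("N", "S", "E", "W"):
        raise ValueError(f"unexpected hemisphere reference: {ref!r}")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


# Illustrative values only, not taken from any real photo.
lat = dms_to_decimal(40, 25, 0.0, "N")
lon = dms_to_decimal(3, 42, 0.0, "W")
print(round(lat, 4), round(lon, 4))  # prints: 40.4167 -3.7
```

A sequence of such points, ordered by each photo's timestamp, is what makes it possible to reconstruct a route.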
Metadata is more important than it may initially seem. Perhaps the best-known case is that of Tony Blair's government and the Word document on Iraq's alleged weapons of mass destruction: a review of its metadata revealed a host of hidden content, including revisions and comments, that exposed who had edited the document and seriously undermined its credibility.
FOCA is a free, open-source tool created by ElevenPaths to analyze metadata in individual documents and across entire organizations. It is available for download from the ElevenPaths GitHub repository. Let’s see how to extract the metadata from an office document, and how to obtain metadata reports for an entire organization, in just a few steps.
Extracting metadata from one or more local files
Step 1: Once FOCA is open, simply select the “Metadata” option [1]. Then, right-click on the area shown in the image [2] and select “Add file” [3] (if you want to analyze the contents of an entire folder, use “Add folder”). Choose the file whose metadata you want to analyze (you can also drag and drop files or folders directly into the interface).

Step 2: Once the file is loaded, right-click on it [4] and select “Extract Metadata” [5].

Step 3: To view the results, check the left panel where the “Metadata” section will display the file name and format (e.g., a .docx file called “Test1”) [6]. Clicking on it will display a summary of all extracted metadata in the right panel [7].
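For readers who want to see roughly what a folder-level scan involves, the following Python sketch walks a folder and reports which application produced each Office file. It assumes the files are modern OOXML documents (.docx, .xlsx, .pptx), which are ZIP archives that normally include a `docProps/app.xml` part. The name `scan_folder` and the shape of the report are illustrative, and FOCA supports many more formats than this sketch does.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace of the extended-properties part (docProps/app.xml).
APP_NS = {
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
}
OOXML_EXTENSIONS = {".docx", ".xlsx", ".pptx"}


def scan_folder(folder):
    """Map each OOXML file under `folder` to the software details it declares.

    Files that cannot be parsed are reported with a value of None.
    """
    base = Path(folder)
    report = {}
    for path in sorted(base.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in OOXML_EXTENSIONS:
            continue
        key = path.relative_to(base).as_posix()
        try:
            with zipfile.ZipFile(path) as archive:
                root = ET.fromstring(archive.read("docProps/app.xml"))
        except (zipfile.BadZipFile, KeyError, ET.ParseError):
            report[key] = None  # not a valid archive, or no app.xml part
            continue
        report[key] = {
            "application": root.findtext("ep:Application", namespaces=APP_NS),
            "version": root.findtext("ep:AppVersion", namespaces=APP_NS),
            "company": root.findtext("ep:Company", namespaces=APP_NS),
        }
    return report
```

Keying the report by relative path rather than by bare file name avoids collisions when two subfolders contain files with the same name.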

Extracting all metadata from an organization
Step 1: The first step is to define a project. Go to the “Project” section and select “New Project” [1].

Step 2: Use the “Select Project” field [2] only if you’ve already created a project and want to reuse it. If starting from scratch, leave this blank. Enter a Project name [3] and the Domain website [4] you want to audit.
If there are alternative domains you want FOCA to include in the search, add them in “Alternative domains” [5]. The files you download (we’ll go over that process shortly) will be saved to the folder you define under “Folder where save documents” [6]. Then click “Create” [7] to set up your project.

Step 3: You’ll now return to the “Metadata” screen. First, select your search engines [8] (in the example, all three are selected).
In the “Extensions” section, choose which file types you want FOCA to look for in your project [9]. Then click “Search All”. After a wait that depends on how many files are found at the project’s URL, a list of results will appear [10].
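The combination of a domain and a list of extensions maps naturally onto the `site:` and `filetype:` operators that common search engines support. The sketch below builds one query per extension. It only illustrates the idea: the function name `build_queries` is hypothetical, `example.com` is a placeholder domain, and the exact queries FOCA sends to each engine are not described here and may differ.

```python
from urllib.parse import quote_plus


def build_queries(domain, extensions):
    """Build one 'site:<domain> filetype:<ext>' query per file extension."""
    return [
        f"site:{domain} filetype:{ext.lstrip('.').lower()}"
        for ext in extensions
    ]


# Placeholder domain and an arbitrary selection of extensions.
for query in build_queries("example.com", ["pdf", "docx", "xlsx"]):
    # quote_plus encodes the query so it can be used as a URL parameter value.
    print(query, "->", quote_plus(query))
```

Issuing one query per extension keeps each result list small and makes it easy to see which file types an organization exposes most.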

Step 4: To analyze the files found, the process is similar to the one for a single file. However, here you must first download them. Right-click on the file [11] (you can select multiple files by holding the Shift key), then choose “Download” as shown in [12]. To download everything, use “Download All”.

Step 5: Once downloaded, you’ll see a dot on the right side along with the date and time of the download. Now proceed to extract the metadata using “Extract Metadata” [13], and then analyze it using “Analyze Metadata” [14].

Step 6: Finally, you’ll see the output shown in the following image (content hidden for privacy). The analysis reveals the name of the computer where the file was created [15], server data [16], two usernames [17], the software type used [18], and other general information such as the creation date.
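The kind of aggregation shown in that final view can be approximated in a few lines. The Python sketch below takes per-document metadata records and counts in how many documents each username, server, and software product appears, pulling usernames out of Windows profile paths and server names out of UNC paths. The record keys (`author`, `last_modified_by`, `application`, `paths`), the function name `summarize`, and the sample data are assumptions made for this example, not FOCA's data model.

```python
import re
from collections import Counter

# C:\Users\<name>\... or C:\Documents and Settings\<name>\...
USER_PATH = re.compile(
    r"[A-Za-z]:\\(?:Users|Documents and Settings)\\([^\\]+)\\", re.IGNORECASE
)
# \\<server>\share\... (UNC path)
UNC_SERVER = re.compile(r"\\\\([A-Za-z0-9._-]+)\\")


def summarize(records):
    """Count, per item, the number of documents in which it appears."""
    users, servers, software = Counter(), Counter(), Counter()
    for record in records:
        found_users, found_servers = set(), set()
        for key in ("author", "last_modified_by"):
            if record.get(key):
                found_users.add(record[key])
        for path in record.get("paths", []):
            user = USER_PATH.search(path)
            if user:
                found_users.add(user.group(1))
            server = UNC_SERVER.match(path)
            if server:
                found_servers.add(server.group(1))
        users.update(found_users)
        servers.update(found_servers)
        if record.get("application"):
            software[record["application"]] += 1
    return {"users": users, "servers": servers, "software": software}


# Made-up records for illustration.
sample = [
    {"author": "alice", "last_modified_by": "bob",
     "application": "Microsoft Office Word",
     "paths": [r"C:\Users\alice\Documents\report.docx"]},
    {"author": "alice", "paths": [r"\\fileserver01\shared\budget.xlsx"]},
]
summary = summarize(sample)
print(summary["users"].most_common())
```

Counting documents rather than raw occurrences keeps a single file with many embedded paths from dominating the summary.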

A tool like FOCA is very useful for auditing both personal and organizational files. It helps you understand what information you might be exposing without realizing it, so that you can clean it up before publishing and reduce the risk of data leaks.