DFIRinProgress Blog

Directory Tree to Folder Paths


TL;DR

Why Should I Care? Ransomware groups often provide a directory tree as proof of data exfiltration, and that format is hard to work with.

What Should I Consider? Converting the directory tree to folder paths, which are easier to work with.

Introduction

Ransomware groups often post proof of data exfiltration on their dark web sites before releasing the data. The aim is to push the compromised organisation into negotiating with the group. The proof of breach often includes a directory tree of the exfiltrated data, in a format similar to the one below:

Drive:.
├───Private
│   └───confidential
│       └───SensitiveData
├───DepartmentX
│   ├───Section1
│   │   ├───SubsectionA
│   │   ├───SubsectionB
│   │   └───SubsectionC
│   ├───Section2
│   └───Section3
│       ├───SubsectionA
│       └───SubsectionB
├───ProgramY
│   └───Year
│       ├───Research
│       └───Proposal
├───UserPersonalData
│   ├───Documents
│   └───PrivateJournal
│       └───Weekly
└───GenericTemplates

Several questions need to be answered quickly, before the full data set is published. These include:

  • What are the files within the directories?
  • What are the sizes of these files?
  • Is any sensitive data included?

A directory tree is harder to work with than a list of folder paths. The following script, written with the help of ChatGPT, parses a directory tree and outputs the folder paths. This makes it easier to compare the published directory tree with MFTECmd output.

Conversion Script

class FileProcessor:
    """Convert a saved tree-style listing into full folder paths."""

    def __init__(self, input_file):
        self.input_file = input_file
        self.content = self.read_file()

    def read_file(self):
        try:
            with open(self.input_file, 'r', encoding='utf-8') as file:
                return file.read()
        except Exception as e:
            print(f"Error reading file: {e}")
            return None

    def replace_characters(self, replacements):
        if self.content is None:
            return
        for old_char, new_char in replacements.items():
            self.content = self.content.replace(old_char, new_char)

    def extract_paths(self, base_path=""):
        if self.content is None:
            return []
        paths = []
        lines = self.content.split('\n')
        stack = [(0, base_path)]  # (indent_level, current_path)
        for line in lines:
            if not line.strip():
                continue  # Skip empty lines
            # After the replacements, leading whitespace encodes the depth
            indent_level = len(line) - len(line.lstrip())
            folder_name = line.strip()
            # Pop siblings and deeper folders until the parent is on top
            while stack and stack[-1][0] >= indent_level:
                stack.pop()
            current_path = stack[-1][1] if stack else ""
            full_path = current_path + '\\' + folder_name if current_path else folder_name
            paths.append(full_path)
            stack.append((indent_level, full_path))
        return paths

    def write_paths_to_file(self, output_file_path, paths):
        try:
            # Write UTF-8 so non-ASCII folder names survive the round trip
            with open(output_file_path, 'w', encoding='utf-8') as output_file:
                for path in paths:
                    output_file.write(path + '\n')
            print(f"Output written to {output_file_path}")
        except Exception as e:
            print(f"Error writing to file: {e}")


# Replacements: turn the box-drawing connectors into spaces so that
# indentation alone reflects folder depth. The third key was lost in the
# page extraction; the vertical connector is assumed here.
replacements = {
    "├───": "    ",
    "└───": "    ",
    "│": " ",
}

# Example usage
file_processor = FileProcessor('filetree.txt')
file_processor.replace_characters(replacements)
paths = file_processor.extract_paths()
file_processor.write_paths_to_file('outputpaths.txt', paths)
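
A quick way to sanity-check the indentation logic before pointing the script at a real leak listing is to run the same idea against a small in-memory tree. The sketch below is a condensed variant of the script above, not a replacement for it; the function name `tree_to_paths` and the four-line sample are made up for illustration, and it assumes the listing uses the same box-drawing connectors as the example tree.

```python
def tree_to_paths(tree_text: str, sep: str = "\\") -> list[str]:
    """Flatten tree-style text into full folder paths (illustrative helper)."""
    # Replace every connector with spaces so indentation alone encodes depth.
    for glyph in ("├───", "└───"):
        tree_text = tree_text.replace(glyph, "    ")
    tree_text = tree_text.replace("│", " ")

    paths = []
    stack = []  # (indent, full_path) for each ancestor of the current line
    for line in tree_text.splitlines():
        name = line.strip()
        if not name:
            continue  # skip blank lines
        indent = len(line) - len(line.lstrip())
        # Drop siblings and deeper entries; what remains is the parent chain.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        full = stack[-1][1] + sep + name if stack else name
        paths.append(full)
        stack.append((indent, full))
    return paths


# Hypothetical four-line sample, cut down from the example tree.
sample = "\n".join([
    "Drive:.",
    "├───Private",
    "│   └───confidential",
    "└───GenericTemplates",
])
for path in tree_to_paths(sample):
    print(path)
```

If the printed paths do not nest the way the tree does, the most likely cause is a connector character that was not replaced, for example because the listing was saved in a different encoding.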

Example Output

Drive:.
Drive:.\Private
Drive:.\Private\confidential
Drive:.\Private\confidential\SensitiveData
Drive:.\DepartmentX
Drive:.\DepartmentX\Section1
Drive:.\DepartmentX\Section1\SubsectionA
Drive:.\DepartmentX\Section1\SubsectionB
Drive:.\DepartmentX\Section1\SubsectionC
Drive:.\DepartmentX\Section2
Drive:.\DepartmentX\Section3
Drive:.\DepartmentX\Section3\SubsectionA
Drive:.\DepartmentX\Section3\SubsectionB
Drive:.\ProgramY
Drive:.\ProgramY\Year
Drive:.\ProgramY\Year\Research
Drive:.\ProgramY\Year\Proposal
Drive:.\UserPersonalData
Drive:.\UserPersonalData\Documents
Drive:.\UserPersonalData\PrivateJournal
Drive:.\UserPersonalData\PrivateJournal\Weekly
Drive:.\GenericTemplates
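
With folder paths in hand, the comparison with MFTECmd output can start to answer which files sat in the listed directories and how large they were. The following is a minimal sketch under several assumptions: the MFTECmd CSV is assumed to contain `ParentPath`, `FileName`, `FileSize` and `IsDirectory` columns, `ParentPath` is assumed to start with `.` for the volume root, and the leaked tree is assumed to be rooted at that same volume root. The helper names and the sample rows are invented for illustration.

```python
import csv
import io


def relative_key(path: str, root: str) -> str:
    """Strip the root label and lower-case so both sources compare equal."""
    if path.lower().startswith(root.lower()):
        path = path[len(root):]
    return path.strip("\\").lower()


def files_in_leaked_folders(leaked_paths, mft_csv, tree_root="Drive:.", mft_root="."):
    """Yield (parent path, file name, size) for MFT rows in a listed folder."""
    leaked = {relative_key(p, tree_root) for p in leaked_paths}
    for row in csv.DictReader(mft_csv):
        # Assumed column: skip directory entries, keep files only.
        if row.get("IsDirectory", "").lower() == "true":
            continue
        if relative_key(row["ParentPath"], mft_root) in leaked:
            yield row["ParentPath"], row["FileName"], int(row["FileSize"])


# Invented sample rows standing in for a real MFTECmd CSV export.
sample_mft = io.StringIO(
    "ParentPath,FileName,FileSize,IsDirectory\n"
    ".\\Private\\confidential,plan.docx,2048,False\n"
    ".\\Public,readme.txt,10,False\n"
)
leaked = ["Drive:.", "Drive:.\\Private", "Drive:.\\Private\\confidential"]
for folder, name, size in files_in_leaked_folders(leaked, sample_mft):
    print(folder, name, size)
```

For a real case, the `io.StringIO` object would be swapped for the opened CSV file and `leaked` for the lines of outputpaths.txt. If the leaked tree is rooted at a subfolder rather than the volume root, `mft_root` would need to be set to that subfolder's path as it appears in `ParentPath`.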