dfTimewolf

A framework for orchestrating forensic collection, processing and data export.

dfTimewolf consists of collectors, processors and exporters (modules) that pass data on to one another. How modules are orchestrated is defined in predefined “recipes”.

Getting started

Installation

Ideally you’ll want to install dftimewolf in its own virtual environment. We leverage pipenv for that.

$ pip install pipenv
$ git clone https://github.com/log2timeline/dftimewolf.git && cd dftimewolf
$ pipenv install -e .

Attention

If you want to leverage other modules such as log2timeline, you'll have to install them separately and make them available in your virtual environment.

Then use pipenv shell to activate your freshly created virtual environment. You can then invoke the dftimewolf command from any directory.

You can still use python setup.py install or pip install -e . if you’d rather install dftimewolf this way.

Quick how-to

dfTimewolf is typically run by specifying a recipe name and any arguments the recipe defines. For example:

$ dftimewolf local_plaso /tmp/path1,/tmp/path2 --incident_id 12345

This will launch the local_plaso recipe against path1 and path2 in /tmp. In this recipe --incident_id is used by Timesketch as a sketch description.

Details on the available recipes can be obtained using the standard Python help flags:

$ dftimewolf -h
usage: dftimewolf [-h]
                  {grr_huntresults_plaso_timesketch,local_plaso,...}

Available recipes:

 local_plaso                        Processes a list of file paths using plaso and sends results to Timesketch.

positional arguments:
  {grr_huntresults_plaso_timesketch,local_plaso,...}

optional arguments:
  -h, --help            show this help message and exit

To get more help on a recipe’s specific flags, specify a recipe name before the -h flag:

$ dftimewolf local_plaso -h
usage: dftimewolf local_plaso [-h] [--incident_id INCIDENT_ID]
                              [--sketch_id SKETCH_ID]
                              paths

Analyze local file paths with plaso and send results to Timesketch.

- Collectors collect from a path in the FS
- Processes them with a local install of plaso
- Exports them to a new Timesketch sketch

positional arguments:
  paths                 Paths to process

optional arguments:
  -h, --help            show this help message and exit
  --incident_id INCIDENT_ID
                        Incident ID (used for Timesketch description)
                        (default: None)
  --sketch_id SKETCH_ID
                        Sketch to which the timeline should be added (default:
                        None)

User manual

dfTimewolf ships with recipes, which are essentially instructions on how to launch and chain modules.

Listing all recipes

Since you won’t know all the recipe names off the top of your head, start with:

$ dftimewolf -h
usage: dftimewolf [-h]
                  {grr_huntresults_plaso_timesketch,local_plaso,timesketch_upload,grr_artifact_hosts,grr_hunt_artifacts,grr_flow_download,grr_hunt_file}
                  ...

Available recipes:

 grr_artifact_hosts                 Fetches default artifacts from a list of GRR hosts, processes them with plaso, and sends the results to Timesketch.
 grr_flow_download                  Downloads the contents of a specific GRR flow to the filesystem.
 grr_hunt_artifacts                 Starts a GRR hunt for the default set of artifacts.
 grr_hunt_file                      Starts a GRR hunt for a list of files.
 grr_huntresults_plaso_timesketch   Fetches the findings of a GRR hunt, processes them with plaso, and sends the results to Timesketch.
 local_plaso                        Processes a list of file paths using plaso and sends results to Timesketch.
 timesketch_upload                  Uploads a .plaso file to Timesketch.

positional arguments:
  {grr_huntresults_plaso_timesketch,local_plaso,timesketch_upload,grr_artifact_hosts,grr_hunt_artifacts,grr_flow_download,grr_hunt_file}

optional arguments:
  -h, --help            show this help message and exit

Get detailed help for a specific recipe

To get more details on a specific recipe:

$ dftimewolf grr_artifact_hosts -h
usage: dftimewolf grr_artifact_hosts [-h] [--artifacts ARTIFACTS]
                                 [--extra_artifacts EXTRA_ARTIFACTS]
                                 [--use_tsk USE_TSK]
                                 [--approvers APPROVERS]
                                 [--sketch_id SKETCH_ID]
                                 [--incident_id INCIDENT_ID]
                                 [--grr_server_url GRR_SERVER_URL]
                                 hosts reason

Collect artifacts from hosts using GRR.

- Collect a predefined list of artifacts from hosts using GRR
- Process them with a local install of plaso
- Export them to a Timesketch sketch

positional arguments:
  hosts                 Comma-separated list of hosts to process
  reason                Reason for collection

optional arguments:
  -h, --help            show this help message and exit
  --artifacts ARTIFACTS
                        Comma-separated list of artifacts to fetch (override
                        default artifacts) (default: None)
  --extra_artifacts EXTRA_ARTIFACTS
                        Comma-separated list of artifacts to append to the
                        default artifact list (default: None)
  --use_tsk USE_TSK     Use TSK to fetch artifacts (default: False)
  --approvers APPROVERS
                        Emails for GRR approval request (default: None)
  --sketch_id SKETCH_ID
                        Sketch to which the timeline should be added (default:
                        None)
  --incident_id INCIDENT_ID
                        Incident ID (used for Timesketch description)
                        (default: None)
  --grr_server_url GRR_SERVER_URL
                        GRR endpoint (default: http://localhost:8000/)

Running a recipe

One typically invokes dftimewolf with a recipe name and a few arguments. For example:

$ dftimewolf <RECIPE_NAME> arg1 arg2 --optarg1 optvalue1

Given the help output above, you can then use the recipe like this:

$ dftimewolf grr_artifact_hosts tomchop.greendale.xyz collection_reason

If you only want to collect browser activity:

$ dftimewolf grr_artifact_hosts tomchop.greendale.xyz collection_reason --artifacts=BrowserHistory

In the same way, if you want to specify one (or more) approver(s):

$ dftimewolf grr_artifact_hosts tomchop.greendale.xyz collection_reason --artifacts=BrowserHistory --approvers=admin
$ dftimewolf grr_artifact_hosts tomchop.greendale.xyz collection_reason --artifacts=BrowserHistory --approvers=admin,tomchop

~/.dftimewolfrc

If you want to set recipe arguments to specific values without typing them on the command line (e.g. your development Timesketch server, or your favorite set of GRR approvers), you can use a .dftimewolfrc file. Create a ~/.dftimewolfrc file containing a JSON object that maps parameter names to the values to use:

$ cat ~/.dftimewolfrc
{
  "approvers": "approver@greendale.xyz",
  "timesketch_endpoint": "http://timesketch.greendale.xyz/"
}

This will set your timesketch_endpoint and approvers parameters for all subsequent dftimewolf runs. You can still override these settings for a single run by specifying the argument on the command line.
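The precedence described above (command line over ~/.dftimewolfrc over recipe defaults) can be sketched in a few lines of Python. This is an illustration only, not dfTimewolf's actual loading code: the load_rc and resolve_args helpers and the default endpoint value are hypothetical.

```python
import json
import os


def load_rc(path="~/.dftimewolfrc"):
    """Return the parameters stored in the rc file, or {} if it is absent."""
    path = os.path.expanduser(path)
    if not os.path.exists(path):
        return {}
    with open(path) as rc_file:
        return json.load(rc_file)


def resolve_args(defaults, rc_values, cli_values):
    """Merge parameters with precedence: defaults < rc file < command line."""
    resolved = dict(defaults)
    resolved.update(rc_values)
    # Only arguments actually given on the command line (not None) override.
    resolved.update({k: v for k, v in cli_values.items() if v is not None})
    return resolved


# Hypothetical values, chosen only to show the precedence.
defaults = {"approvers": None, "timesketch_endpoint": "http://localhost:5000/"}
rc_values = {"approvers": "approver@greendale.xyz"}
cli_values = {"approvers": "admin,tomchop", "timesketch_endpoint": None}
print(resolve_args(defaults, rc_values, cli_values))
```

Here approvers ends up as the command-line value, while timesketch_endpoint keeps its default because neither the rc file nor the command line sets it.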

Recipe list

dfTimewolf uses recipes, which are a way to configure Collectors, Processors, and Exporters (called Modules).

grr_artifact_hosts

Use this recipe to collect a predefined set of artifacts from a specific list of hosts. If you want to collect the BrowserHistory and LinuxLogFiles artifacts from tomchop.greendale.xyz and admin.greendale.xyz, use this command:

$ dftimewolf grr_artifact_hosts tomchop.greendale.xyz,admin.greendale.xyz collection_reason --artifacts=BrowserHistory,LinuxLogFiles

If --artifacts is not provided, the list defaults to:

  • Linux
    • AllUsersShellHistory
    • BrowserHistory
    • LinuxLogFiles
    • AllLinuxScheduleFiles
    • LinuxScheduleFiles
    • ZeitgeistDatabase
    • AllShellConfigs
  • Mac OS
    • MacOSRecentItems
    • MacOSBashHistory
    • MacOSLaunchAgentsPlistFiles
    • MacOSAuditLogFiles
    • MacOSSystemLogFiles
    • MacOSAppleSystemLogFiles
    • MacOSMiscLogs
    • MacOSSystemInstallationTime
    • MacOSQuarantineEvents
    • MacOSLaunchDaemonsPlistFiles
    • MacOSInstallationHistory
    • MacOSUserApplicationLogs
    • MacOSInstallationLogFile
  • Windows
    • WindowsAppCompatCache
    • WindowsEventLogs
    • WindowsPrefetchFiles
    • WindowsScheduledTasks
    • WindowsSearchDatabase
    • WindowsSuperFetchFiles
    • WindowsSystemRegistryFiles
    • WindowsUserRegistryFiles
    • WindowsXMLEventLogTerminalServices

grr_flow_download

Use this recipe to download the results of a given GRR flow.

If because of test_reason you want to fetch flow F:920AFD8 from tomchop.greendale.xyz and dump results into /tmp/tomflow/, use the following command:

$ dftimewolf grr_flow_download tomchop.greendale.xyz F:920AFD8 test_reason /tmp/tomflow

grr_hunt_artifacts

Launches a hunt for specific artifacts. The hunt is launched with a client limit set to 100 hosts.

If because of test_reason you want to launch a fleet-wide artifact hunt on BrowserHistory artifacts, use the following command:

$ dftimewolf grr_hunt_artifacts BrowserHistory test_reason

NOTE: Since hunts take time to complete, dfTimewolf will launch the hunt and return a Hunt ID that you can then feed to grr_huntresults_plaso_timesketch.

grr_hunt_file

Launches a hunt for specific files. The hunt is launched with a client limit set to 100 hosts, which is standard procedure for creating new hunts.

If because of test_reason you want to launch a fleet-wide file hunt on /tmp/billgates.pl files, use the following command:

$ dftimewolf grr_hunt_file /tmp/billgates.pl test_reason

Note

Since hunts take time to complete, dfTimewolf will launch the hunt and return a Hunt ID that you can then feed to grr_huntresults_plaso_timesketch.

grr_huntresults_plaso_timesketch

Use this recipe to collect results from a GRR hunt, process them with a local instance of plaso, and send them to your Timesketch server.

If you want to fetch results for H:7481F262 because of test_reason, use the following command:

$ dftimewolf grr_huntresults_plaso_timesketch H:7481F262 test_reason

local_plaso

Use this recipe to process local files using plaso and send the results to your Timesketch server.

To process all files in /mnt/winroot with plaso and send the results to Timesketch, use the following command:

$ dftimewolf local_plaso /mnt/winroot

timesketch_upload

Use this recipe to upload a .plaso or .csv file to Timesketch:

$ dftimewolf timesketch_upload ~/cases/sem12345/sdb1.plaso

Module list

This is a list of existing dfTimewolf modules. To see how well they play together, see the recipe list.

Collectors

  • FilesystemCollector - a simple collector that just passes a local path on to the processors.

GRR hunts

Launch or fetch results from fleet-wide GRR hunts.

  • GRRHuntArtifactCollector - Launches a fleet-wide GRR ArtifactCollectorFlow
  • GRRHuntFileCollector - Launches a fleet-wide GRR FileFinder
  • GRRHuntDownloader - Downloads results from a GRR hunt.

GRR flows

Launch and fetch flows on a specific list of hosts.

  • GRRArtifactCollector - Launches a GRR ArtifactCollectorFlow on specific hosts.
  • GRRFileCollector - Launches a FileFinder flow on specific hosts.
  • GRRFlowCollector - Downloads the results of an arbitrary flow.

NOTE: As a general rule, GRRHuntArtifactCollector and GRRHuntFileCollector collectors are asynchronous. They will create a hunt and return the hunt ID that should be used with GRRHuntDownloader once the hunt is complete. GRRArtifactCollector, GRRFileCollector and GRRFlowCollector will wait for results before exiting.

Processors

  • LocalPlasoProcessor - processes a list of file paths with a local plaso (log2timeline.py) instance.

Exporters

  • TimesketchExporter - exports the result of a processor to a remote Timesketch instance.
  • LocalFileSystemExporter - exports the results of a processor to the local filesystem.

Developer’s guide

This section gives a few hints on how to develop new recipes and modules for dftimewolf. Start with the Architecture section if you haven’t read it already.

Creating a recipe

If you’re not satisfied with the way modules are chained, or with the default arguments that are passed to some of the recipes, you can create your own. See existing recipes such as local_plaso for simple examples. Details on recipe keys are given in the Architecture section.

Recipe arguments

Recipes launch Modules with a given set of arguments. Arguments can be specified in different ways:

  • Hardcoded values in the recipe’s Python code
  • @ parameters that are dynamically changed, either:
    • Through a ~/.dftimewolfrc file
    • Through the command line

Parameters are declared for each Module in a recipe’s recipe variable in the form of @parameter placeholders. How these are populated is then specified in the args variable right after, as a list of (argument, help_text, default_value) tuples that will be passed to argparse. For example, the public version of the grr_artifact_hosts.py recipe specifies arguments in the following way:

args = [
    ('hosts', 'Comma-separated list of hosts to process', None),
    ('reason', 'Reason for collection', None),
    ('--artifacts', 'Comma-separated list of artifacts to fetch '
     '(override default artifacts)', None),
    ('--extra_artifacts', 'Comma-separated list of artifacts to append '
     'to the default artifact list', None),
    ('--use_tsk', 'Use TSK to fetch artifacts', False),
    ('--approvers', 'Emails for GRR approval request', None),
    ('--sketch_id', 'Sketch to which the timeline should be added', None),
    ('--incident_id', 'Incident ID (used for Timesketch description)', None),
    ('--grr_server_url', 'GRR endpoint', 'http://localhost:8000')
]

hosts and reason are positional arguments - they must be provided through the command line. artifacts, extra_artifacts, use_tsk, approvers, sketch_id, incident_id, and grr_server_url are all optional. If they are not specified through the command line, the default value will be used.
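To make the mechanism concrete, here is a minimal sketch of how such (argument, help_text, default_value) tuples could be fed to argparse. The build_parser helper is hypothetical and the real dfTimewolf code may differ; only a shortened copy of the tuples above is used.

```python
import argparse

# Shortened copy of the tuples from the grr_artifact_hosts recipe.
args = [
    ('hosts', 'Comma-separated list of hosts to process', None),
    ('reason', 'Reason for collection', None),
    ('--artifacts', 'Comma-separated list of artifacts to fetch', None),
    ('--grr_server_url', 'GRR endpoint', 'http://localhost:8000'),
]


def build_parser(recipe_name, recipe_args):
    """Build an argparse parser from (switch, help, default) tuples."""
    parser = argparse.ArgumentParser(
        prog='dftimewolf ' + recipe_name,
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    for switch, help_text, default in recipe_args:
        # Names without a leading "--" become required positional arguments.
        parser.add_argument(switch, help=help_text, default=default)
    return parser


parser = build_parser('grr_artifact_hosts', args)
parsed = parser.parse_args(
    ['tomchop.greendale.xyz', 'collection_reason', '--artifacts=BrowserHistory'])
print(parsed.hosts, parsed.artifacts, parsed.grr_server_url)
```

Because --grr_server_url is not given in this invocation, the parsed value falls back to the default from the tuple.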

Modules

If dftimewolf lacks the actual processing logic, you need to create a new module. If you can achieve your goal in Python, then you can include it in dfTimewolf. “There is no learning curve™”.

Check out the Module architecture and read up on simple existing modules such as the LocalPlasoProcessor module for an example of a simple Module.

Architecture

Three main objects

The main concepts you need to be aware of when digging into dfTimewolf’s codebase are:

  • Modules
  • Recipes
  • The state attribute

Modules are individual Python objects that will (for the most part) take some kind of input and produce some kind of output. Recipes are instructions that define how modules are chained, essentially defining which Module’s output becomes another Module’s input. Input and output are both stored in a State object that is attached to each module.

Modules

Modules all extend the BaseModule class, and implement the setup, process and cleanup methods.

setup is what is called with the recipe’s modified arguments. Actions here should include things that have low overhead and can be accomplished sequentially with no big delay, like checking for permissions on a cloud project, creating an analysis VM, verifying that a file exists, etc.

process is where all the magic happens - here is where you’ll want to parallelize things as much as possible (copying a disk, running plaso, etc.). You’ll be adding information to the state (e.g. processed plaso files) in the module’s output as you go. You can access a previous module’s output (i.e. your input) using self.state.input and manipulate the current module’s output using self.state.output.

cleanup is mostly optional; use it when you manipulated the state in a way that needs post-processing (e.g. adding a “# out of #” description to the module’s output).
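As a rough illustration of the three methods, the sketch below defines a toy module. BaseModule and FakeState here are simplified stand-ins written for this example, not dfTimewolf's real classes, and ExistingPathFilter, its require_some argument and the add_error signature are assumptions.

```python
import os


class FakeState:
    """Stand-in for the state object; only the attributes used below."""

    def __init__(self):
        self.input = []
        self.output = []
        self.errors = []

    def add_error(self, message, critical=False):
        self.errors.append((message, critical))


class BaseModule:
    """Stand-in for dfTimewolf's BaseModule, which is not reproduced here."""

    def __init__(self, state):
        self.state = state


class ExistingPathFilter(BaseModule):
    """Hypothetical module that keeps only the input paths that exist."""

    def setup(self, require_some=True):
        # Cheap, sequential preparation belongs here.
        self.require_some = require_some

    def process(self):
        # Read the previous module's output, write this module's output.
        for path in self.state.input:
            if os.path.exists(path):
                self.state.output.append(path)
        if self.require_some and not self.state.output:
            self.state.add_error('No existing paths found', critical=True)

    def cleanup(self):
        # Optional post-processing; nothing to do in this example.
        pass


state = FakeState()
state.input = [os.getcwd(), '/no/such/path/for/this/example']
module = ExistingPathFilter(state)
module.setup()
module.process()
module.cleanup()
print(state.output)
```

Only the current working directory survives the filter, so no error is recorded.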

Recipes

A recipe is a Python dictionary that describes how Modules are chained and which parameters can be ingested from the command line. These dictionaries have a few specific keys:

  • name: This is the name with which the recipe will be invoked (e.g. local_plaso)
  • short_description: This is what will show up in the help message when invoking dftimewolf -h
  • modules: A list of dicts describing modules and their corresponding arguments.
    • name: The name of the module class that will be instantiated
    • args: A list of (argument_name, argument) tuples that will be passed on to the module’s setup() method. If argument starts with an @, it will be replaced with its corresponding value from the command-line or the ~/.dftimewolfrc file.

Recipes need to describe the way arguments are handled in a global args variable. This variable is a list of (switch, help_message, default_value) tuples that will be passed to the argparse.add_argument method for later parsing.
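The sketch below shows what a recipe in this shape might look like, together with one possible way of resolving @ placeholders. The module names come from the module list, but the argument names passed to each module and the resolve_module_args helper are illustrative assumptions, not the contents of a shipped recipe.

```python
# Hypothetical recipe dictionary following the keys described above.
recipe = {
    'name': 'local_plaso',
    'short_description': 'Processes a list of file paths using plaso and '
                         'sends results to Timesketch.',
    'modules': [
        {'name': 'FilesystemCollector', 'args': [('paths', '@paths')]},
        {'name': 'LocalPlasoProcessor', 'args': []},
        {'name': 'TimesketchExporter',
         'args': [('incident_id', '@incident_id'),
                  ('sketch_id', '@sketch_id')]},
    ],
}

# Global args variable: (switch, help_message, default_value) tuples.
args = [
    ('paths', 'Paths to process', None),
    ('--incident_id', 'Incident ID (used for Timesketch description)', None),
    ('--sketch_id', 'Sketch to which the timeline should be added', None),
]


def resolve_module_args(module, values):
    """Replace '@name' placeholders with entries from the values dict."""
    resolved = {}
    for name, argument in module['args']:
        if isinstance(argument, str) and argument.startswith('@'):
            argument = values.get(argument[1:])
        resolved[name] = argument
    return resolved


# Values as they might arrive from the command line or ~/.dftimewolfrc.
values = {'paths': '/tmp/path1,/tmp/path2', 'incident_id': '12345',
          'sketch_id': None}
for module in recipe['modules']:
    print(module['name'], resolve_module_args(module, values))
```

Arguments that do not start with @ would be passed through unchanged, which corresponds to hardcoded values in the recipe.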

State

The State object is an instance of the DFTimewolfState class. It has a few useful methods:

  • add_error: Used by modules to indicate that an error occurred during execution (e.g. missing file, unauthorized access).
  • check_errors: Display any errors that have been added. If any critical errors were added, dftimewolf will stop the execution of the recipe and exit. Non-critical errors will just be displayed and execution will continue.
  • cleanup: Resets the state: moves the output data to the input attribute and clears the output for the next Module. Moves remaining (and therefore non-critical) errors to global_errors for later processing.
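The behaviour of these three methods can be approximated with a small stand-in class. SketchState is not the real DFTimewolfState: the attribute names other than input, output and global_errors, and the method signatures, are assumptions. Unlike the real check_errors, this version returns a flag instead of exiting, so the caller decides what to do.

```python
class SketchState:
    """Illustrative stand-in for DFTimewolfState, not the real class."""

    def __init__(self):
        self.input = []
        self.output = []
        self.errors = []          # (message, critical) for the current module
        self.global_errors = []   # non-critical errors kept for later

    def add_error(self, message, critical=False):
        self.errors.append((message, critical))

    def check_errors(self):
        """Print current errors; return True if any of them is critical."""
        for message, critical in self.errors:
            print('CRITICAL' if critical else 'ERROR', message)
        return any(critical for _, critical in self.errors)

    def cleanup(self):
        """Prepare the state for the next module."""
        self.global_errors.extend(self.errors)
        self.errors = []
        self.input = self.output
        self.output = []


state = SketchState()
state.output = ['/tmp/evidence.plaso']  # hypothetical module output
state.add_error('Could not read one optional file')
should_stop = state.check_errors()
state.cleanup()
print(should_stop, state.input, state.output, state.global_errors)
```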

What happens when you run a recipe

The dftimewolf cycle is as follows:

  • The recipe is parsed, and the first Module is instantiated
  • Command-line arguments are taken into account and passed to the Module’s setup method.
    • Errors are checked
  • The module’s process method is called
    • Errors are checked
  • Cleanup occurs; the output becomes input and the process is repeated with the next module in the recipe.
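The cycle above can be sketched as a simple loop. This is a simplified model under stated assumptions, not dfTimewolf's actual runner: State, ListCollector, UpperCaseProcessor and run_recipe are all hypothetical names written for this example.

```python
import sys


class State:
    """Minimal stand-in for the state object shared by all modules."""

    def __init__(self):
        self.input, self.output = [], []
        self.errors, self.global_errors = [], []

    def add_error(self, message, critical=False):
        self.errors.append((message, critical))

    def check_errors(self):
        for message, critical in self.errors:
            print('CRITICAL' if critical else 'ERROR', message, file=sys.stderr)
        if any(critical for _, critical in self.errors):
            sys.exit(1)  # a critical error stops the recipe

    def cleanup(self):
        self.global_errors.extend(self.errors)
        self.input, self.output, self.errors = self.output, [], []


class ListCollector:
    """Toy collector: outputs the items it was set up with."""

    def __init__(self, state):
        self.state = state

    def setup(self, items):
        self.items = items

    def process(self):
        self.state.output = list(self.items)

    def cleanup(self):
        pass


class UpperCaseProcessor:
    """Toy processor: upper-cases every input item."""

    def __init__(self, state):
        self.state = state

    def setup(self):
        pass

    def process(self):
        self.state.output = [item.upper() for item in self.state.input]

    def cleanup(self):
        pass


def run_recipe(module_specs, state):
    """Run (module_class, setup_kwargs) pairs in order, sharing one state."""
    for module_class, setup_kwargs in module_specs:
        module = module_class(state)
        module.setup(**setup_kwargs)
        state.check_errors()
        module.process()
        state.check_errors()
        module.cleanup()
        state.cleanup()  # output becomes the next module's input
    return state.input   # the last module's output after the final cleanup


result = run_recipe(
    [(ListCollector, {'items': ['a', 'b']}), (UpperCaseProcessor, {})],
    State())
print(result)
```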