Architecture
Three main objects
The main concepts you need to be aware of when digging into dfTimewolf’s codebase are:
- Modules
- Recipes
- The state attribute
Modules are individual Python objects that will (for the most part) take some kind of input and produce some kind of output. Recipes are instructions that define how modules are chained, essentially defining which Module’s output becomes another Module’s input. Input and output are all stored in a State object that is attached to each module.
Modules
Modules all extend the BaseModule class and implement the setup, process and cleanup methods.
setup is called with the recipe’s arguments, after any @ placeholders have been replaced with their values from the command-line or the ~/.dftimewolfrc file. Actions here should have low overhead and complete sequentially without a big delay: checking for permissions on a cloud project, creating an analysis VM, verifying that a file exists, etc.
process is where the actual work happens, and where you’ll want to parallelize things as much as possible (copying a disk, running plaso, etc.). As you go, you’ll add information (e.g. processed plaso files) to the module’s output in the state. You can access the previous module’s output (i.e. your input) using self.state.input and manipulate the current module’s output using self.state.output.
cleanup is mostly optional; implement it if you manipulated the state in a way that needs post-processing (e.g. adding a “# out of #” description to the module’s output).
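As a rough illustration of the three methods described above, here is a minimal sketch of a module. BaseModule and SimpleState below are simplified stand-ins written for this example, not dfTimewolf’s actual classes, and UppercaseModule and its suffix argument are hypothetical.

```python
class BaseModule:
    """Simplified stand-in for dfTimewolf's BaseModule (assumption)."""

    def __init__(self, state):
        self.state = state

    def setup(self, **kwargs):
        raise NotImplementedError

    def process(self):
        raise NotImplementedError

    def cleanup(self):
        # Optional: subclasses override this only when needed.
        pass


class SimpleState:
    """Stand-in for the state object; only input and output are modeled."""

    def __init__(self):
        self.input = []
        self.output = []


class UppercaseModule(BaseModule):
    """Hypothetical module that upper-cases every string in its input."""

    def setup(self, suffix=""):
        # Cheap, sequential preparation only.
        self.suffix = suffix

    def process(self):
        # Read the previous module's output, write our own output.
        for item in self.state.input:
            self.state.output.append(item.upper() + self.suffix)

    def cleanup(self):
        # Post-process the output with a "# of #" description.
        total = len(self.state.output)
        self.state.output = [
            "{0} ({1} of {2})".format(item, index, total)
            for index, item in enumerate(self.state.output, start=1)
        ]


state = SimpleState()
state.input = ["disk.raw", "memory.dmp"]
module = UppercaseModule(state)
module.setup(suffix=".processed")
module.process()
module.cleanup()
print(state.output)
```

The heavy lifting sits in process, while setup only stores configuration and cleanup only decorates output that already exists.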
Recipes
Recipes are Python dictionaries that describe how Modules are chained, and which parameters can be ingested from the command-line. These dictionaries have a few specific keys:
- name: The name with which the recipe will be invoked (e.g. local_plaso)
- short_description: What will show up in the help message when invoking dftimewolf -h
- modules: A list of dicts describing modules and their corresponding arguments.
  - name: The name of the module class that will be instantiated
  - args: A list of (argument_name, argument) tuples that will be passed on to the module’s setup() method. If argument starts with an @, it will be replaced with its corresponding value from the command-line or the ~/.dftimewolfrc file.
Recipes need to describe the way arguments are handled in a global args variable. This variable is a list of (switch, help_message, default_value) tuples that will be passed to the argparse.add_argument method for later parsing.
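The recipe layout described above could look roughly like the following sketch. The variable name contents, the module names ExampleCollector and ExampleProcessor, and the helpers build_parser and resolve_args are all hypothetical; how dfTimewolf maps each tuple onto add_argument is assumed here to be switch, help and default, in that order.

```python
import argparse

# Hypothetical recipe: two made-up modules chained one after the other.
contents = {
    "name": "local_example",
    "short_description": "Processes local files (illustrative only).",
    "modules": [
        {
            "name": "ExampleCollector",
            "args": [("paths", "@paths")],
        },
        {
            "name": "ExampleProcessor",
            "args": [("timezone", "@timezone"), ("output_format", "csv")],
        },
    ],
}

# Global args: (switch, help_message, default_value) tuples.
args = [
    ("paths", "Comma-separated list of paths to collect", None),
    ("--timezone", "Timezone of the collected evidence", "UTC"),
]


def build_parser(recipe_args):
    """Registers every tuple with argparse (assumed mapping)."""
    parser = argparse.ArgumentParser(prog="dftimewolf-sketch")
    for switch, help_message, default_value in recipe_args:
        parser.add_argument(switch, help=help_message, default=default_value)
    return parser


def resolve_args(module_args, values):
    """Replaces '@name' placeholders with the matching parsed value."""
    resolved = {}
    for name, argument in module_args:
        if isinstance(argument, str) and argument.startswith("@"):
            resolved[name] = values.get(argument[1:])
        else:
            resolved[name] = argument
    return resolved


parsed = vars(build_parser(args).parse_args(["/tmp/evidence"]))
resolved = [resolve_args(m["args"], parsed) for m in contents["modules"]]
print(resolved)
```

Literal values such as "csv" pass through untouched, so a recipe can mix fixed settings with values supplied by the user. Lookup in ~/.dftimewolfrc is not modeled in this sketch.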
State
The State object is an instance of the DFTimewolfState class. It has a few useful methods:
- add_error: Used by modules to indicate that an error occurred during execution (e.g. missing file, unauthorized access).
- check_errors: Displays any errors that have been added. If any critical errors were added, dftimewolf will stop the execution of the recipe and exit. Non-critical errors will just be displayed and execution will continue.
- cleanup: Resets the state: moves the output data to the input attribute and clears the output for the next Module. Moves remaining (and therefore non-critical) errors to global_errors for later processing.
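The behaviour of these three methods can be approximated with the sketch below. SketchState is a hypothetical stand-in, not the real DFTimewolfState: the method signatures and the (message, critical) tuple format are assumptions, and a critical error raises an exception here instead of exiting the program.

```python
class SketchState:
    """Illustrative stand-in for the state object (not the real class)."""

    def __init__(self):
        self.input = []
        self.output = []
        self.errors = []         # (message, critical) tuples
        self.global_errors = []  # non-critical errors kept for later

    def add_error(self, message, critical=False):
        self.errors.append((message, critical))

    def check_errors(self):
        """Displays pending errors and aborts if any of them is critical."""
        critical_found = False
        for message, critical in self.errors:
            label = "CRITICAL" if critical else "ERROR"
            print("{0}: {1}".format(label, message))
            critical_found = critical_found or critical
        if critical_found:
            # The real tool exits; raising keeps this sketch testable.
            raise RuntimeError("Critical error, aborting recipe")

    def cleanup(self):
        """Turns output into the next input and archives remaining errors."""
        self.input = self.output
        self.output = []
        self.global_errors.extend(self.errors)
        self.errors = []


state = SketchState()
state.output = ["evidence.plaso"]
state.add_error("Optional file not found")
state.check_errors()  # displays the error, execution continues
state.cleanup()
print(state.input, state.global_errors)
```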
What happens when you run a recipe
The dftimewolf cycle is as follows:
- The recipe is parsed, and the first Module is instantiated
- Command-line arguments are taken into account and passed to the Module’s setup method.
  - Errors are checked
- The module’s process method is called
  - Errors are checked
- Cleanup occurs; the output becomes input and the process is repeated with the next module in the recipe.
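The cycle above can be sketched as a simple loop. Everything in this example is hypothetical: LoopState, the ListFiles and CountFiles modules, MODULE_REGISTRY and run_recipe are invented for illustration, and the sketch assumes that the cleanup step refers to the state’s cleanup method.

```python
class LoopState:
    """Minimal stand-in for the state; error display is omitted."""

    def __init__(self):
        self.input, self.output, self.errors = [], [], []

    def check_errors(self):
        # Abort as soon as a critical error has been recorded.
        if any(critical for _, critical in self.errors):
            raise RuntimeError("Critical error, aborting recipe")

    def cleanup(self):
        # The output of this module becomes the input of the next one.
        self.input, self.output = self.output, []


class ListFiles:
    """Hypothetical module that emits the paths it was given."""

    def __init__(self, state):
        self.state = state

    def setup(self, paths):
        self.paths = paths

    def process(self):
        self.state.output.extend(self.paths)


class CountFiles:
    """Hypothetical module that counts the items in its input."""

    def __init__(self, state):
        self.state = state

    def setup(self):
        pass

    def process(self):
        self.state.output.append(len(self.state.input))


MODULE_REGISTRY = {"ListFiles": ListFiles, "CountFiles": CountFiles}


def run_recipe(recipe, values):
    """Runs each module in order: setup, check, process, check, cleanup."""
    state = LoopState()
    for description in recipe["modules"]:
        module = MODULE_REGISTRY[description["name"]](state)
        kwargs = {}
        for name, argument in description["args"]:
            if isinstance(argument, str) and argument.startswith("@"):
                argument = values[argument[1:]]
            kwargs[name] = argument
        module.setup(**kwargs)
        state.check_errors()
        module.process()
        state.check_errors()
        state.cleanup()
    # After the final cleanup, the last module's output sits in input.
    return state.input


recipe = {
    "name": "count_example",
    "short_description": "Counts the given paths (illustrative only).",
    "modules": [
        {"name": "ListFiles", "args": [("paths", "@paths")]},
        {"name": "CountFiles", "args": []},
    ],
}
result = run_recipe(recipe, {"paths": ["a.raw", "b.raw"]})
print(result)
```

Because cleanup runs after every module, the result of the last module ends up in the state’s input attribute rather than in its output.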