Software Structure

Overview

Sunbeam is a snakemake pipeline wrapped by a python library. Calling sunbeam [cmd] [args] [options] invokes this wrapper library, which then runs the necessary snakemake commands. The main Snakefile is in the root directory and uses rules from rules/ and extensions/, scripts from scripts/, and environments from envs/. Tests are managed by the script tests/run_tests.bash, which collects test functions from tests/test_suite.bash. Documentation lives in docs/ and is served by ReadTheDocs.
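
As a rough illustration of the wrapper idea, the sketch below assembles a snakemake command line from a root directory and a config file. The function name build_snakemake_command and its parameters are hypothetical, and the exact flags Sunbeam passes are not stated here; --snakefile, --configfile, and --use-conda are standard snakemake options chosen as plausible assumptions.

```python
from pathlib import Path


def build_snakemake_command(sunbeam_root, configfile, extra_args=()):
    """Return an argv list that a wrapper could hand to subprocess.run().

    Hypothetical helper: the real wrapper may build its command differently.
    """
    snakefile = Path(sunbeam_root) / "Snakefile"
    return [
        "snakemake",
        "--snakefile", str(snakefile),
        "--configfile", str(configfile),
        "--use-conda",  # assumption: lets snakemake manage the per-rule envs
        *extra_args,    # anything the user passed through to snakemake
    ]
```

The function only builds the list and does not start snakemake, so it can be inspected or logged before anything runs.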

Sections

sunbeam/ (root directory)

The root sunbeam directory holds a few important files, including environment.yml, setup.py, and Snakefile. The environment file defines the dependencies required to run sunbeam and is used to create the main sunbeam environment. The setup file defines the structure and dependencies of sunbeamlib and makes it installable via pip. The Snakefile manages all of the snakemake components of the pipeline.

Tip

environment.yml defines the main sunbeam environment that you activate in order to run the pipeline. Internally, sunbeam then manages a number of other environments (defined in envs) on a per-rule basis.

docs/

Each page of the sunbeam documentation lives here as a .rst file. The remaining files handle the setup and deployment of the docs to ReadTheDocs using Sphinx, and most of them are autogenerated by Sphinx. The one tricky part is importing the sunbeam version into the docs build. This is done in conf.py by adding the sunbeam root to sys.path and then importing sunbeamlib, which stores the version tag in a __version__ variable using semantic_version.
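
The version import described above might look roughly like the following sketch. The helper names add_repo_root_to_path and read_version are hypothetical, and the sketch assumes that docs/ sits directly under the repository root; the actual conf.py may be organized differently.

```python
import importlib
import os
import sys


def add_repo_root_to_path(docs_dir):
    """Put the parent of docs_dir at the front of sys.path and return it."""
    root = os.path.abspath(os.path.join(docs_dir, os.pardir))
    if root in sys.path:
        sys.path.remove(root)
    sys.path.insert(0, root)
    return root


def read_version(module_name="sunbeamlib"):
    """Return the package's __version__ as a string, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    version = getattr(module, "__version__", None)
    return None if version is None else str(version)
```

In a conf.py, the result of read_version() could then be assigned to Sphinx's version and release settings. Converting to str matters because a semantic_version object is not a plain string.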

envs/

This directory contains .yml files defining environments that snakemake manages as it runs. When snakemake reaches a rule that declares conda: /path/to/ENV_NAME.yml, it creates that environment if it does not already exist and activates it while the rule runs. These environments are created in sunbeam/.snakemake/ by default.

extensions/

This directory contains any extensions that you install with sunbeam extend or develop yourself, along with a .placeholder file that exists only to ensure the directory is always present. Each extension should be in its own directory whose name starts with sbx_.
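
To make the naming convention concrete, here is a minimal sketch that lists the extension directories under a given path. The function list_extensions is hypothetical and is not part of sunbeamlib; it only applies the sbx_ prefix rule described above.

```python
from pathlib import Path


def list_extensions(extensions_dir):
    """Return the sorted names of subdirectories that start with 'sbx_'."""
    base = Path(extensions_dir)
    if not base.is_dir():
        return []
    return sorted(
        entry.name
        for entry in base.iterdir()
        # Plain files such as .placeholder are skipped by the is_dir() check.
        if entry.is_dir() and entry.name.startswith("sbx_")
    )
```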

rules/

This directory contains all of the snakemake rules that get imported by the main Snakefile. The rules are organized into subdirectories by function, and each subdirectory has an associated environment in envs/ in which its rules run.

scripts/

This directory contains any python code that needs to be executed by snakemake rules. As with rules/, the scripts are organized into subdirectories by function, and each script is named after the rule that calls it.

sunbeamlib/

This directory contains the python library that acts as a wrapper for snakemake. The python files in the root contain a number of utility functions, while those in scripts/ define the commands for sunbeam. scripts/command.py takes in sunbeam [cmd] and routes it to the file matching the given command. The data/ directory contains the default config file as well as some sample configs for running on a cluster.
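
The routing step can be pictured with a small dispatch table. This is a simplified sketch, not the contents of scripts/command.py: the handler functions and the COMMANDS mapping are hypothetical stand-ins for the per-command files, and the real code may locate handlers in another way.

```python
def run_extend(args):
    """Hypothetical handler standing in for the file behind 'sunbeam extend'."""
    return f"extend called with {args}"


def run_pipeline(args):
    """Hypothetical handler standing in for a 'run' command."""
    return f"run called with {args}"


# Maps the [cmd] token to the code that implements it.
COMMANDS = {"extend": run_extend, "run": run_pipeline}


def route(argv):
    """Dispatch argv (everything after 'sunbeam') to the matching handler."""
    if not argv:
        raise ValueError("no command given")
    name, rest = argv[0], list(argv[1:])
    handler = COMMANDS.get(name)
    if handler is None:
        raise ValueError(f"unknown command: {name}")
    return handler(rest)
```

Keeping the mapping in one place means that adding a command only requires a new handler and one new entry in the table.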

tests/

This directory contains the testing framework and all of the tests for sunbeam. The framework is written mostly in bash, with test_suite.bash holding the main test suite and run_tests.bash running it. The first step in running the tests is to create dummy data, which is generated using generate_dummy_data.py. Any installed extensions are then moved temporarily to the extensions/ subdirectory to avoid any interference from them (extensions should each be tested separately from sunbeam). Some tests end by calling find_targets.py to check that all of the files specified in targets.txt or targets_singleend.txt are present. Unit tests for the scripts/ directory live in the unit_tests/ subdirectory. The other subdirectories are auxiliary fixtures for running certain tests.
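
The target check at the end of a test can be sketched as follows. This is not the code of find_targets.py: the function missing_targets is hypothetical, and the sketch assumes that the targets file lists one path per line, relative to an output directory.

```python
from pathlib import Path


def missing_targets(targets_file, output_dir):
    """Return the listed paths that do not exist under output_dir."""
    missing = []
    for line in Path(targets_file).read_text().splitlines():
        relative = line.strip()
        if not relative:
            continue  # ignore blank lines in the targets file
        if not (Path(output_dir) / relative).exists():
            missing.append(relative)
    return missing
```

A test would pass when the returned list is empty and could otherwise print the missing paths to help with debugging.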

Hidden Directories

.circleci/

This directory contains the config.yml file which defines the CI jobs to be run by CircleCI as well as any scripts that are included in those jobs.

.github/

This directory contains the PULL_REQUEST_TEMPLATE.md file which defines a template for any pull requests on the sunbeam repository. This is also where definitions for any CI/CD workflows run through GitHub Actions would live.

.snakemake/

This directory is created the first time you run sunbeam. It will contain all the auxiliary environments created by snakemake (each environment will be named by a hash of the .yml file, so any changes to those files will result in a new environment being built).
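
The effect of hash-based naming can be illustrated with a short sketch. This is only an illustration of the idea, not snakemake's actual naming scheme, which may hash additional inputs; the function env_name_for and the choice of MD5 truncated to eight characters are assumptions.

```python
import hashlib


def env_name_for(yml_path, length=8):
    """Derive a short directory name from the bytes of an environment file."""
    with open(yml_path, "rb") as handle:
        digest = hashlib.md5(handle.read()).hexdigest()
    # Any edit to the file changes the digest, and therefore the name.
    return digest[:length]
```

Because the name depends only on the file contents in this sketch, an unchanged .yml file maps to the same environment on every run, while an edited file maps to a new one.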