Sunbeam is a snakemake pipeline with a python wrapper library (sunbeamlib). Calling sunbeam run [args] [options] calls this wrapper library, which then invokes the necessary snakemake commands. The main Snakefile lives in the workflow/ directory and uses rules from workflow/rules/ and extensions/, scripts from workflow/scripts/, and environments from workflow/envs/. Tests are run with pytest and live in the tests/ directory. Documentation lives in docs/ and is served by ReadTheDocs.

Tip

Some of these sections won’t exist if you install from a tar archive.

Sections

sunbeam/ (root directory)

The root sunbeam directory holds a few important files, including environment.yml, pyproject.toml, Dockerfile, and install.sh. The environment file defines the dependencies required to run sunbeam and is used to create the main sunbeam environment. The pyproject file defines the structure and dependencies of sunbeamlib and makes it installable via pip. The Dockerfile defines the image containing all internal environments for containerized runs. The install script is used to install sunbeam and has its own page in the documentation.

Tip

environment.yml defines the main sunbeam environment that you activate in order to run the pipeline. Internally, sunbeam then manages a number of other environments (defined in envs) on a per-rule basis.

There is also .readthedocs.yaml, which sets up the Sphinx build of the documentation to be able to import sunbeamlib, and MANIFEST.in, which tells sunbeamlib to include the data/ subdirectory while installing.

docs/

Each page of the sunbeam documentation is here in the form of a .rst file. The additional files are all involved in the setup and deployment of the docs to ReadTheDocs using Sphinx. Most of these are autogenerated by Sphinx. The one bit of trickiness comes from importing the version of sunbeam into the docs build. This is done in conf.py by adding the sunbeam root to sys.path and then importing sunbeamlib which stores the version tag in __version__.
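The version import described above can be sketched roughly as follows. This is a minimal illustration, not the contents of the actual conf.py: the helper name read_version is hypothetical, and the path passed to it is assumed to be whichever directory makes the package importable.

```python
import importlib
import sys
from pathlib import Path


def read_version(root: Path, package: str = "sunbeamlib") -> str:
    """Make `root` importable, then return the package's __version__."""
    root_str = str(root.resolve())
    if root_str not in sys.path:
        # Prepend so this checkout wins over any installed copy.
        sys.path.insert(0, root_str)
    module = importlib.import_module(package)
    return module.__version__


# In a Sphinx conf.py the result would typically feed the `release` setting,
# for example (assuming conf.py sits one level below the root):
# release = read_version(Path(__file__).resolve().parent.parent)
```

Prepending to sys.path rather than appending means the docs build reports the version of the checked-out source even when another copy of the package is installed.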

workflow/envs/

This directory contains .yml files defining environments that snakemake manages as it runs. When snakemake reaches a rule defined with conda: /path/to/ENV_NAME.yml, it creates that environment if it doesn’t already exist and then activates it while running the rule. These environments are created in sunbeam/.snakemake/ by default.

The accompanying files named something like ENV_NAME.ARCH.pin.txt are generated with snakedeploy. They list all the packages and exact versions in a given environment (for the architecture they were generated on, e.g. linux-64) so that snakemake can first try to use that exact environment and only if it fails, try to solve the .yml file for itself.
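The ENV_NAME.ARCH.pin.txt naming convention can be illustrated with a short sketch. The helper pin_file_for is hypothetical and is not part of snakemake or snakedeploy; it only shows how a pin file name relates to its .yml file and architecture.

```python
from pathlib import Path
from typing import Optional


def pin_file_for(env_yml: Path, arch: str) -> Optional[Path]:
    """Return the pin file next to `env_yml` for `arch`, or None if absent."""
    # ENV_NAME.yml -> ENV_NAME.ARCH.pin.txt, in the same directory
    candidate = env_yml.with_name(f"{env_yml.stem}.{arch}.pin.txt")
    return candidate if candidate.is_file() else None
```

A None result corresponds to the fallback case described above, where the .yml file has to be solved directly.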

extensions/

This directory contains any extensions you install with sunbeam extend or develop yourself, as well as a .placeholder file whose only purpose is to make sure the directory always exists. Each extension should live in its own directory whose name starts with sbx_.
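As a quick way to see which extensions are present, the sbx_ naming rule can be checked with a few lines of Python. The function find_extensions is a hypothetical helper written for this illustration, not something sunbeamlib is known to provide.

```python
from pathlib import Path
from typing import List


def find_extensions(extensions_dir: Path) -> List[str]:
    """Return the names of sbx_* directories, sorted alphabetically."""
    return sorted(
        entry.name
        for entry in extensions_dir.iterdir()
        # Skip plain files such as .placeholder and unrelated directories.
        if entry.is_dir() and entry.name.startswith("sbx_")
    )
```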

workflow/rules/

This directory contains all of the snakemake rules that get imported by the main Snakefile. The rules are organized into subdirectories by function, and each subdirectory has an associated environment in envs/ that its rules run in.

workflow/scripts/

This directory contains any python code that needs to be executed by snakemake rules. Each is named according to the rule that calls it.

src/sunbeamlib/

This directory contains the python library that acts as a runner/utility for the underlying snakemake pipeline. Many python files contain utility functions, while those prefixed by script_ define the commands for sunbeam. script_sunbeam.py takes in sunbeam [cmd] and then routes it to the file matching the given command. The .yml/.yaml data files include the default config file as well as some sample config templates for running on a cluster. The directory also contains the default profile template and one for slurm.
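The routing that script_sunbeam.py performs can be pictured as a small dispatch table. The sketch below is an assumption about the general shape only: the handler functions, the COMMANDS table, and the return values are invented for illustration and do not reproduce the real sunbeamlib code.

```python
from typing import Callable, Dict, List, Tuple

Result = Tuple[str, List[str]]


def handle_init(args: List[str]) -> Result:
    # Stand-in for the logic of a script_ file; just echoes its input.
    return ("init", args)


def handle_run(args: List[str]) -> Result:
    return ("run", args)


# Hypothetical command table: one entry per sunbeam command.
COMMANDS: Dict[str, Callable[[List[str]], Result]] = {
    "init": handle_init,
    "run": handle_run,
}


def main(argv: List[str]) -> Result:
    """Route `sunbeam [cmd] [args]` to the handler registered for cmd."""
    if not argv or argv[0] not in COMMANDS:
        raise SystemExit(f"usage: sunbeam [{'|'.join(sorted(COMMANDS))}] [args]")
    command, rest = argv[0], argv[1:]
    # Everything after the command is passed through untouched.
    return COMMANDS[command](rest)
```

Keeping one handler per command mirrors the one-file-per-command layout described above, so adding a command only means adding a file and a table entry.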

tests/

This directory contains the tests for the core sunbeam pipeline. Under data/ are raw, shortened bacterial genomes and host genomes used for generating the reads used as input. e2e/ contains end-to-end tests for each sunbeam program: config, extend, init, list_samples, and run. unit/ contains unit tests broken into two sections: rules/, which tests each rule’s logic individually, and sunbeamlib, which tests functions within sunbeamlib.

Hidden Directories

.github/

This directory contains PULL_REQUEST_TEMPLATE.md, which defines a template for any pull requests on the sunbeam repository, and ISSUE_TEMPLATE/, which contains issue templates for the repository. It is also where CI/CD job workflows live.

.snakemake/

This directory is created the first time you run sunbeam. It will contain all the auxiliary environments created by snakemake (each environment will be named by a hash of the .yml file, so any changes to those files will result in a new environment being built). It also includes things like logs of previous runs and singularity images/builds if you use singularity.
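To see why editing a .yml file leads to a new environment, consider naming a directory after a hash of the file's contents. The sketch below is only an illustration of that idea: the choice of SHA-256, the 12-character prefix, and the function name env_dir_name are assumptions, and snakemake's actual hashing scheme may differ.

```python
import hashlib
from pathlib import Path


def env_dir_name(env_yml: Path) -> str:
    """Derive a directory name from the contents of an environment file."""
    digest = hashlib.sha256(env_yml.read_bytes()).hexdigest()
    # A short prefix is enough for illustration purposes.
    return digest[:12]
```

Because the name depends only on the file contents, an unchanged file maps to the existing environment, while any edit produces a different name and therefore a fresh build.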