Project Tour

Let’s take a tour of the project repository. If you haven’t already, clone the repository for the project:

git clone git@github.com:phyletica/ecoevolity-model-prior.git

and then cd into it:

cd ecoevolity-model-prior

Here’s an overview of the contents of the project repository.

bin directory

This directory includes executable Bash scripts psub, nsub, and spawn_job_array.

These Bash scripts were written for members of the Phyletica Lab to submit jobs to the queues on Auburn University’s Hopper cluster. If you are working on a different system and want to use these scripts to submit analyses for this project, you will need to edit these files to work on your system. Alternatively, you can submit the analyses “manually”; simple for loops at the command prompt would work just fine to submit the jobs.
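As a rough sketch of the “manual” approach, the function below loops over every .sh file in a directory and hands each one to a submission command. The names submit_all, SUBMIT_CMD, and my-job-scripts are hypothetical placeholders, not names from this project. SUBMIT_CMD defaults to echo, so nothing is submitted until you set it to your scheduler's submission command (for example, sbatch on a Slurm system).

```shell
# submit_all DIR: pass every *.sh file in DIR to the submission command.
# SUBMIT_CMD is a hypothetical variable holding a single command name;
# it defaults to "echo", which only prints the paths (a dry run).
submit_all() {
    job_dir="$1"
    submit_cmd="${SUBMIT_CMD:-echo}"
    for script_path in "$job_dir"/*.sh
    do
        # If nothing matches, the pattern is left unexpanded; skip it
        [ -e "$script_path" ] || continue
        "$submit_cmd" "$script_path"
    done
}
```

For a dry run, call submit_all my-job-scripts (a placeholder directory name) and check the printed paths before setting SUBMIT_CMD to a real submission command.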

Also, when you run the setup_project_env.sh script (described below), all of the ecoevolity tools will be installed in this bin directory.

conda-environment.yml file

This is a YAML-formatted configuration file that tells conda how to set up a Python environment with the necessary requirements to allow all the Python scripts in the project to run successfully.

This Python environment will be created when you run the setup_project_env.sh script (described below).

data directory

This directory contains data files in a file format (nexus) recognized by ecoevolity.

All of the files with the naming scheme of comp##-#species-#genomes-######chars.txt are “dummy” data files that will be used by the simcoevolity tool to simulate datasets. They are fully valid data files for ecoevolity, but are only meant to serve the purpose of “telling” simcoevolity the size of datasets to simulate.
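Because the file names encode the dataset dimensions, they can be taken apart at the command line. The sketch below uses a made-up file name that follows the scheme, and it assumes the fields mean what their labels suggest (a comparison number, then the numbers of species, genomes, and characters). The variable names are illustrative only.

```shell
# Hypothetical file name following the
# comp##-#species-#genomes-######chars.txt naming scheme
file_name="comp01-2species-10genomes-500000chars.txt"

# Drop the extension, then split what is left on hyphens
base="${file_name%.txt}"
IFS=- read -r comp_field species_field genomes_field chars_field <<EOF
$base
EOF

# Strip the text label from each field, leaving only the number
comparison="${comp_field#comp}"
n_species="${species_field%species}"
n_genomes="${genomes_field%genomes}"
n_chars="${chars_field%chars}"

echo "comparison=$comparison species=$n_species genomes=$n_genomes characters=$n_chars"
```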

Dockerfile file

This is a text file with the commands for assembling the image of a Docker container. If you are familiar with containers, you can use this file to create a container of an environment with all the requirements to set up and work on this project.

docs directory

This directory contains the HTML of the project documentation, which is served by GitHub at http://phyletica.org/ecoevolity-model-prior.

These HTML files are automatically generated by Sphinx from the files in docs-source.

docs-source directory

This directory contains the source files that are used by Sphinx to create the HTML files for the project site. These files are in reStructuredText format.

If you want to add to or edit the documentation for this project, docs-source is where to do that. This is covered in the Working on project docs section.

ecoevolity-configs directory

As the name implies, this directory contains all of the configuration files needed for the ecoevolity tools. These configuration files specify where the data files are located, and all of the settings for analysis.

These configs are a critical component of the project and are covered more thoroughly in The ecoevolity configs section.

For more details about ecoevolity config files, please see the ecoevolity documentation.

modules-to-load.sh file

This file contains the shell commands to load the modules on AU’s Hopper cluster that are needed for setting up and working on the project. If you are working on a different system, you will have to determine what modules are needed on your system to replace these; if your cluster is relatively new, perhaps none!

README.md file

This file contains some basic information about the project, and is rendered on the GitHub landing page for the project repository.

scripts directory

This directory contains a number of Bash and Python scripts that will be doing most of the “heavy lifting” for this project.

setup_project_env.sh script

This is an executable Bash script that will

  1. Compile and install (locally; within the project directory) ecoevolity.

  2. Create a conda Python environment called ecoevolity-model-prior-project and activate it (if conda is available on the system).
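The “if conda is available” condition in step 2 can be illustrated with a short check. This is only a sketch of one common way to test for a command. It is not taken from setup_project_env.sh, the conda_status variable is hypothetical, and the snippet only reports what it finds without creating or activating anything.

```shell
# Environment name as given in the description of the setup script
env_name="ecoevolity-model-prior-project"

# "command -v" succeeds only if "conda" resolves to something runnable
if command -v conda >/dev/null 2>&1
then
    conda_status="available"
    echo "conda found; the '$env_name' environment can be created"
else
    conda_status="missing"
    echo "conda not found; the '$env_name' environment will be skipped"
fi
```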

See the Setting up the project section for instructions on using this script to set up the working environment for this project.