4. Use case III: Mutational signature analysis for C. elegans

4.1. Background

In this example, we re-analyze whole-genome sequencing data from Volkova et al.

In the original paper, C. elegans of 54 genotypes were treated with 12 genotoxins at 2-3 different doses, generating a total of 2717 mutagenesis experiments with whole-genome sequencing data.

Figure: experimental design of the original paper.

Here, we select ten samples, which include:

  1. xpc-1 gene knockout with UV treatment.

The xpc-1 gene is predicted to enable damaged DNA binding activity and single-stranded DNA binding activity, and it is involved in the response to UV. It is predicted to be located in the nucleus, to be part of the XPC complex and the nucleotide-excision repair factor 2 complex, and to be active in the cytoplasm. It is expressed in the germline precursor cell, the intestine, and the nervous system, and it is used to study xeroderma pigmentosum. Human orthologs of this gene are implicated in pancreatic cancer, serous cystadenocarcinoma, xeroderma pigmentosum, and xeroderma pigmentosum group C. It is orthologous to human XPC (XPC complex subunit, DNA damage recognition and repair factor).

  2. mlh-1 gene knockout.

The mlh-1 gene is predicted to enable ATP hydrolysis activity and to be involved in mismatch repair. It is predicted to be located in the nucleus and to be part of the MutLalpha complex. Human orthologs of this gene are implicated in several diseases, including Lynch syndrome (multiple), carcinoma (multiple), and cervix uteri carcinoma in situ. It is an ortholog of human MLH1 (mutL homolog 1).

  3. mrt-2 gene knockout.

The mrt-2 gene is predicted to enable damaged DNA binding activity and is involved in DNA metabolic process and intracellular signal transduction. It is predicted to be located in the nucleus and to be part of the checkpoint clamp complex. It is orthologous to human RAD1 (RAD1 checkpoint DNA exonuclease).

Table 4.1 Sample information for the re-analysis.

Sample    Genotype  Generation  Replicate  Mutagen
--------  --------  ----------  ---------  -------
CD0009b   mrt-2     0           0
CD0009f   mrt-2     20          3
CD0001b   N2        0           0
CD0134a   mlh-1     20          2
CD0134c   mlh-1     20          3
CD0134d   mlh-1     20          4
CD0392a   xpc-1     0           0
CD0842b   xpc-1     1           1          UV
CD0842c   xpc-1     1           2          UV
CD0842d   xpc-1     1           3          UV

4.2. Download data

4.3. Download and configure the C. elegans genome file

4.4. Write the Snakemake file

For this project, we need to change the sample sheet information.
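As an illustration of the kind of change involved, the sketch below renders the ten samples of Table 4.1 as a CSV sample sheet. The column names and the helper `sample_sheet_csv` are hypothetical. The sample sheet schema that clindet actually expects is not shown in this section and may differ, so treat this as a starting point rather than the required format.

```python
import csv
import io

# Rows copied from Table 4.1: (sample, genotype, generation, replicate, mutagen).
# An empty mutagen string means no mutagen is listed for that sample.
SAMPLES = [
    ("CD0009b", "mrt-2", 0, 0, ""),
    ("CD0009f", "mrt-2", 20, 3, ""),
    ("CD0001b", "N2", 0, 0, ""),
    ("CD0134a", "mlh-1", 20, 2, ""),
    ("CD0134c", "mlh-1", 20, 3, ""),
    ("CD0134d", "mlh-1", 20, 4, ""),
    ("CD0392a", "xpc-1", 0, 0, ""),
    ("CD0842b", "xpc-1", 1, 1, "UV"),
    ("CD0842c", "xpc-1", 1, 2, "UV"),
    ("CD0842d", "xpc-1", 1, 3, "UV"),
]

# Hypothetical column names; adapt them to the schema your workflow expects.
FIELDS = ["sample", "genotype", "generation", "replicate", "mutagen"]


def sample_sheet_csv(samples=SAMPLES):
    """Return the sample sheet as CSV text, one row per sample."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(samples)
    return buf.getvalue()


if __name__ == "__main__":
    # Print the sheet; redirect stdout to a file to save it.
    print(sample_sheet_csv(), end="")
```

Running the script and redirecting its output to a file of your choice gives a sheet that can then be extended with whatever extra columns the workflow needs.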

4.5. Run clindet

nohup snakemake --profile workflow/config_slurm \
-j 30 --printshellcmds -s snake_wgs_worm.smk \
--use-singularity \
--singularity-args "--bind /public/home/:/public/home/,/public/ClinicalExam:/public/ClinicalExam" \
--latency-wait 300 --use-conda --conda-frontend conda  -k > worm.out &
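Before submitting the full run, it can be useful to check the job graph with Snakemake's dry-run mode (`-n` / `--dry-run`), which lists the planned jobs without executing them. The sketch below assembles the same options as the command above in Python and optionally appends the dry-run flag. The helper name `build_snakemake_cmd` is hypothetical, and the profile, Snakefile, and bind paths are copied from the command as written.

```python
import shlex


def build_snakemake_cmd(snakefile="snake_wgs_worm.smk", jobs=30, dry_run=False):
    """Build the Snakemake argument list used in this section."""
    cmd = [
        "snakemake",
        "--profile", "workflow/config_slurm",
        "-j", str(jobs),
        "--printshellcmds",
        "-s", snakefile,
        "--use-singularity",
        "--singularity-args",
        "--bind /public/home/:/public/home/,/public/ClinicalExam:/public/ClinicalExam",
        "--latency-wait", "300",
        "--use-conda",
        "--conda-frontend", "conda",
        "-k",
    ]
    if dry_run:
        # Only list the jobs that would run; nothing is executed.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(build_snakemake_cmd(dry_run=True)))
```

If the dry run lists the expected jobs for all ten samples, the original `nohup` command can be launched unchanged.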

4.6. Results

4.6.1. Mutation detection

4.6.2.

In this example, we selected two types of