msaexplorer
What is MSAexplorer?
MSAexplorer lets you analyze and plot multiple sequence alignments in a straightforward way. It is designed to work as a simple Python 3 extension or as a Shiny app, with minimal dependencies and syntax. It is easy to set up and highly customizable.
Usage as a shiny application
The current version of the app is deployed to GitHub Pages. This application is serverless, and all computation runs in your browser. There is no need to install anything. Enjoy the app!
However, you can also deploy it yourself and host it however you like!
git clone https://github.com/jonas-fuchs/MSAexplorer
cd MSAexplorer
pip install -r requirements.txt # installs all dependencies
shiny run app.py
Now just follow the link provided in your terminal.
Usage as a python3 package
Installation
For now, installation takes a few manual steps. In the future, you will be able to install MSAexplorer via pip install msaexplorer.
git clone https://github.com/jonas-fuchs/MSAexplorer
cd MSAexplorer
pip install .
You can now use MSAexplorer like any other package installed via pip.
Analysis
The explore module lets you load an alignment file and analyze it.
'''
a small example on how to use the MSAexplorer package
'''
from msaexplorer import explore
# load MSA
msa = explore.MSA('example_alignments/DNA.fasta')
annotation = explore.Annotation(msa, 'example_alignments/DNA_RNA.gff3')
# you can set the zoom range and the reference id if needed
msa.zoom = (0, 1500)
msa.reference_id = 'your_ref_id'
# access functions on what to compute on the MSA
msa.calc_pairwise_identity_matrix()
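To give a rough idea of what a pairwise identity matrix contains, here is a minimal pure-Python sketch. It is not MSAexplorer's implementation: the helper names `pairwise_identity` and `identity_matrix` are hypothetical, and the choice to skip every column in which either sequence has a gap is an assumption made for illustration. The library may treat gaps differently.

```python
from itertools import combinations


def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap (assumed convention)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(alignment: dict[str, str]) -> list[list[float]]:
    """Symmetric matrix of pairwise identities, in the order of the dict keys."""
    ids = list(alignment)
    n = len(ids)
    matrix = [[100.0] * n for _ in range(n)]  # diagonal: a sequence vs. itself
    for i, j in combinations(range(n), 2):
        value = pairwise_identity(alignment[ids[i]], alignment[ids[j]])
        matrix[i][j] = matrix[j][i] = value
    return matrix


# toy alignment, made up for this example
toy = {"Seq1": "ATCGATCG", "Seq2": "ATCGATGG", "Seq3": "AT-GATCG"}
matrix = identity_matrix(toy)
```

In this toy alignment, Seq1 and Seq2 differ at one of eight columns, so their entry is 87.5.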
Importantly, multiple sequence alignments should be in the following FASTA format:
>Seq1
ATCGATCGATCGATCG
>Seq2
ATCGATCGATCGATCG
>Seq3
ATCGATCGATCGATCG
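If you want to check a file before loading it, a small validator along the following lines may help. This is a sketch using only the standard library, independent of MSAexplorer; `read_aligned_fasta` is a hypothetical helper. It relies on the fact that all sequences in an alignment must have the same length, gaps included.

```python
import io


def read_aligned_fasta(handle) -> dict[str, str]:
    """Parse FASTA records and verify that all sequences share one length."""
    records: dict[str, list[str]] = {}
    current = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            current = line[1:].strip()
            if current in records:
                raise ValueError(f"duplicate sequence identifier: {current}")
            records[current] = []
        elif current is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[current].append(line)  # sequences may span several lines
    sequences = {name: "".join(parts) for name, parts in records.items()}
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) > 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return sequences


# the example alignment from above, held in memory instead of a file
example = io.StringIO(
    ">Seq1\nATCGATCGATCGATCG\n"
    ">Seq2\nATCGATCGATCGATCG\n"
    ">Seq3\nATCGATCGATCGATCG\n"
)
alignment = read_aligned_fasta(example)
```

For a file on disk, pass an open file handle instead of the `io.StringIO` object.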
Additionally, you can read in an annotation in bed, gff3 or gb format and connect it to the MSA. Importantly, the sequence identifier of the annotation has to be part of the alignment. All genomic locations are then automatically adapted to the alignment.
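The idea behind adapting genomic locations can be sketched as follows: positions in an annotation refer to the ungapped sequence, so each one has to be shifted by the number of gaps that precede it in the aligned sequence. The code below is an illustration only, not the library's code. The helpers `ungapped_to_alignment` and `map_feature` are hypothetical, and the 0-based, half-open coordinate convention is an assumption.

```python
def ungapped_to_alignment(aligned_seq: str, gap: str = "-") -> list[int]:
    """For each ungapped position, return its column index in the alignment."""
    return [col for col, char in enumerate(aligned_seq) if char != gap]


def map_feature(aligned_seq: str, start: int, end: int) -> tuple[int, int]:
    """Map a 0-based, half-open feature [start, end) onto alignment columns."""
    columns = ungapped_to_alignment(aligned_seq)
    if not 0 <= start < end <= len(columns):
        raise ValueError("feature lies outside the ungapped sequence")
    # the last covered residue is end - 1; add 1 to keep the interval half-open
    return columns[start], columns[end - 1] + 1


# made-up aligned reference with two gap stretches
reference = "AT--CGAT-CG"
feature_in_alignment = map_feature(reference, 2, 6)
```

Here the feature covering ungapped positions 2 to 5 (`CGAT`) ends up at alignment columns 4 to 7, because two gap columns precede it.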
Plotting
The draw module provides several predefined functions for plotting alignments.
'''
an example demonstrating how to plot multiple sequence alignments
'''
# import all packages
import matplotlib.pyplot as plt
from msaexplorer import explore
from msaexplorer import draw
# load alignment
aln = explore.MSA("example_alignments/DNA.fasta", reference_id=None, zoom_range=None)
# set reference to e.g. the first sequence
aln.reference_id = list(aln.alignment.keys())[0]
fig, ax = plt.subplots(nrows=2, height_ratios=[0.2, 2], sharex=False)
draw.stat_plot(
    aln,
    ax[0],
    stat_type="entropy",
    rolling_average=1,
    line_color="indigo"
)
draw.identity_alignment(
    aln,
    ax[1],
    show_gaps=False,
    show_mask=True,
    show_mismatches=True,
    reference_color='lightsteelblue',
    color_scheme='purine_pyrimidine',
    show_seq_names=False,
    show_ambiguities=True,
    fancy_gaps=True,
    show_x_label=False,
    show_legend=True,
    bbox_to_anchor=(1, 1.05)
)
plt.show()
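The example above plots an entropy track. As a rough illustration of what a per-column entropy measures, the following sketch computes the Shannon entropy (in bits) of each alignment column. It is not taken from MSAexplorer: `column_entropy` and `alignment_entropy` are hypothetical helpers, gaps are counted as an ordinary symbol, and no normalization is applied, all of which may differ from what the library does.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy in bits of the characters in one alignment column."""
    counts = Counter(column.upper())
    total = sum(counts.values())
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(sequences: list[str]) -> list[float]:
    """Entropy of every column; sequences are assumed to have equal length."""
    return [column_entropy("".join(col)) for col in zip(*sequences)]


# toy alignment, made up for this example
toy = ["ATCG", "ATCC", "AACT", "AGCA"]
entropy = alignment_entropy(toy)
```

A fully conserved column has an entropy of 0, while a column in which all four nucleotides occur equally often reaches 2 bits.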
import importlib.metadata
import pathlib
import tomllib

# get __version__ from pyproject.toml
# locate the package directory from this module's own path
source_location = pathlib.Path(__file__).resolve().parent
pyproject = source_location.parent / "pyproject.toml"
if pyproject.exists():
    # source checkout: read the version straight from pyproject.toml
    with open(pyproject, "rb") as f:
        __version__ = tomllib.load(f)['project']['version']
else:
    # installed package: fall back to the distribution metadata
    __version__ = importlib.metadata.version("msaexplorer")