msaexplorer
What is MSAexplorer?
MSAexplorer allows the analysis and straightforward plotting of multiple sequence alignments. Its focus is to act as a simple python3 extension or shiny app with minimal dependencies and syntax. It's easy to set up and highly customizable.
Installation
Via pip (recommended)
pip install msaexplorer # or
pip install msaexplorer[process] # additionally installs pyfamsa and pytrimal (not required, but optional in the app)
From this repo
git clone https://github.com/jonas-fuchs/MSAexplorer
cd MSAexplorer
pip install . # or
pip install .[process]
Usage as a shiny application
The current version of the app is also deployed to GitHub Pages (https://jonas-fuchs.github.io/MSAexplorer/app/). This application is serverless: all computation runs in your browser, so there is no need to install anything. Enjoy the app!
However, you can also run it yourself or host it however you like!
Running the app
msaexplorer --run
Now just follow the link provided in your terminal.
Exporting as a static site
pip install shinylive
git clone https://github.com/jonas-fuchs/MSAexplorer
cd MSAexplorer
shinylive export ./ site/ # you should now have a new 'site' folder with the app
Usage as a python3 package
If you only want to use the MSAexplorer package without the shiny app, you can install it as follows:
pip install msaexplorer
Analysis
The `explore` module lets you load an alignment file and analyze it.
'''
a small example on how to use the MSAexplorer package
'''
from msaexplorer import explore
# load MSA
msa = explore.MSA('example_alignments/DNA.fasta')
annotation = explore.Annotation(msa, 'example_alignments/DNA_RNA.gff3')
# you can set the zoom range and the reference id if needed
msa.zoom = (0, 1500)
msa.reference_id = 'your_ref_id'
# call methods on the MSA object to compute statistics, e.g. the pairwise identity matrix
msa.calc_pairwise_identity_matrix()
Importantly, multiple sequence alignments should be in FASTA format, for example:
>Seq1
ATCGATCGATCGATCG
>Seq2
ATCGATCGATCGATCG
>Seq3
ATCGATCGATCGATCG
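To check an input file against the format above before loading it, a small standalone sketch like the following may help. It uses only the standard library and is not MSAexplorer's own parser; the helper names `read_fasta` and `is_aligned` are hypothetical. It assumes that an alignment is valid when every record has the same length.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text into a dict mapping sequence id -> sequence string."""
    records = {}
    current_id = None
    for raw_line in handle:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("empty FASTA header")
            # the id is the first whitespace-delimited token of the header
            current_id = header[0]
            if current_id in records:
                raise ValueError(f"duplicate sequence id: {current_id}")
            records[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # sequences may span several lines, so collect the pieces
            records[current_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


def is_aligned(records):
    """Return True if there is at least one sequence and all share one length."""
    lengths = {len(seq) for seq in records.values()}
    return len(lengths) == 1


example = StringIO(
    ">Seq1\nATCGATCGATCGATCG\n"
    ">Seq2\nATCGATCGATCGATCG\n"
    ">Seq3\nATCGATCGATCGATCG\n"
)
records = read_fasta(example)
print(list(records), is_aligned(records))
```

Sequences of unequal length usually mean the file was not aligned yet, so running this check first can surface the problem early.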
Additionally, you can read in an annotation in `bed`, `gff3` or `gb` format and connect it to the MSA. Importantly, the sequence identifier of the annotation has to be part of the alignment. All genomic locations are then automatically adapted to the alignment.
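To illustrate what adapting genomic locations to an alignment involves, here is a minimal sketch of the general idea: positions on the ungapped reference are shifted by the gaps that the alignment inserted into that reference. This is not MSAexplorer's implementation. The function names are hypothetical, and the 0-based, end-exclusive coordinates are an assumption made for the example.

```python
def reference_to_alignment_positions(aligned_reference, gap_char="-"):
    """List whose index i holds the alignment column of the i-th ungapped base."""
    return [col for col, char in enumerate(aligned_reference) if char != gap_char]


def adapt_feature(start, end, aligned_reference):
    """Map a 0-based, end-exclusive feature on the ungapped reference
    to 0-based, end-exclusive alignment columns."""
    positions = reference_to_alignment_positions(aligned_reference)
    if not 0 <= start < end <= len(positions):
        raise ValueError("feature lies outside the reference sequence")
    # the end is exclusive, so map the last included base and add one
    return positions[start], positions[end - 1] + 1


# reference sequence as it appears in the alignment (gaps included)
aligned_ref = "AT--CGA-TCG"

# a feature covering ungapped bases 2..4 ("CGA") of the reference
print(adapt_feature(2, 5, aligned_ref))
```

A feature that spans a gapped region becomes longer in alignment coordinates than it was on the ungapped sequence.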
Plotting
The `draw` module has several predefined functions to draw alignments.
'''
an example demonstrating how to plot multiple sequence alignments
'''
# import all packages
import matplotlib.pyplot as plt
from msaexplorer import explore
from msaexplorer import draw
# load alignment
aln = explore.MSA("example_alignments/DNA.fasta", reference_id=None, zoom_range=None)
# set reference to e.g. the first sequence
aln.reference_id = list(aln.alignment.keys())[0]
fig, ax = plt.subplots(nrows=2, height_ratios=[0.2, 2], sharex=False)
draw.stat_plot(
    aln,
    ax[0],
    stat_type="entropy",
    rolling_average=1,
    line_color="indigo"
)
draw.identity_alignment(
    aln,
    ax[1],
    show_gaps=False,
    show_mask=True,
    show_mismatches=True,
    reference_color='lightsteelblue',
    color_scheme='purine_pyrimidine',
    show_seq_names=False,
    show_ambiguities=True,
    fancy_gaps=True,
    show_x_label=False,
    show_legend=True,
    bbox_to_anchor=(1,1.05)
)
plt.show()
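
For readers unfamiliar with the statistic behind `stat_type="entropy"`, the sketch below computes a plain per-column Shannon entropy in bits with the standard library. It is only an illustration of the general concept. MSAexplorer may normalize the values or treat gaps and ambiguous characters differently, and the function names here are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column, gap_char="-"):
    """Shannon entropy (in bits) of the non-gap characters in one alignment column."""
    counts = Counter(char for char in column if char != gap_char)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # column contains only gaps
    # sum of p * log2(1/p) over the observed characters
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(sequences):
    """Entropy of every column for a list of equal-length aligned sequences."""
    return [column_entropy(column) for column in zip(*sequences)]


sequences = ["ATCG", "ATCA", "AACT", "A-CC"]
entropies = alignment_entropy(sequences)
print([round(value, 3) for value in entropies])
```

Fully conserved columns score 0, and a column in which all four nucleotides occur equally often scores 2 bits.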

from importlib.metadata import version, PackageNotFoundError

try:
    __version__ = version("msaexplorer")
except PackageNotFoundError:
    __version__ = "unknown"