PSORT.org provides links to the PSORT family of programs for subcellular localization prediction as well as other datasets and resources relevant to localization prediction. The page is currently hosted by the Brinkman Laboratory at Simon Fraser University, and our goal is to provide an open-source resource centre for researchers interested in subcellular localization prediction.
Please choose from the following PSORT programs for localization prediction:
Locally hosted resources:
- PSORTdb: A two-component searchable and browsable database. ePSORTdb contains bacterial proteins of experimentally verified localization used in training and testing of PSORTb. cPSORTdb contains predictions of localization for bacterial genomes.
- Standalone PSORTb for Linux: A downloadable version of PSORTb which can be run locally.
- Datasets of Proteins of Known Localization: Datasets of proteins used to train and evaluate PSORTb. NB: the datasets used in PSORTb development can now be accessed through ePSORTdb.
- Precomputed Genomes: Precomputed PSORTb results for available bacterial genomes. NB: these results are now available in a more powerful searchable and browsable form via cPSORTdb.
- Motifs and Profiles Associated with Specific Localizations: Motifs and profiles characteristic of specific localization sites used in PSORTb's Motif, Profile, and OMPMotif modules.
PSORTb and PSORTdb are maintained by the Brinkman Laboratory, Simon Fraser University, British Columbia, Canada. PSORT and PSORT II are maintained by Kenta Nakai at the Human Genome Center, Institute for Medical Science, University of Tokyo, Japan. iPSORT is maintained by Hideo Bannai at the Human Genome Center.
Other predictive methods, datasets and resources:
The following is a collection of links relevant to subcellular localization prediction. If you would like to see a link to a particular program or resource added to this page, please contact us. At the bottom of the page, we have also provided a suggested reading list containing selected review articles describing subcellular localization (SCL) and SCL prediction.
Other prokaryotic subcellular localization predictors (with web servers):
- Augur (Billion et al, 2006) is a computational pipeline for whole-genome surface protein prediction in Gram-positive bacteria.
- SubcellPredict (Niu et al, 2008) uses the AdaBoost algorithm to predict cytoplasmic, periplasmic and extracellular localization sites for prokaryotic organisms.
- P-classifier (Wang et al, 2005) predicts subcellular localizations of proteins for Gram-negative bacteria based on amino acid subalphabets and a combination of multiple support vector machines.
- PSLDoc (Chang et al, 2008) uses document classification techniques and incorporates a probabilistic latent semantic analysis with a support vector machine model, for prediction on prokaryotes and eukaryotes.
- TBpred (Rashid et al, 2007) is a prediction server that predicts four subcellular localizations (cytoplasmic, integral membrane, secretory, and membrane-attached by lipid anchor) of mycobacterial proteins.
- PSL101 (Su et al, 2007) is a hybrid prediction method for Gram-negative bacteria that combines a one-versus-one support vector machine (SVM) model and a structure homology approach.
- SLP-Local (Matsuda et al, 2005) predicts localizations for chloroplast, mitochondria, secretory pathway, and other locations (nucleus or cytosol) for eukaryotic proteins, as well as cytoplasmic, extracellular, and periplasmic localizations for Gram-negative organisms.
- Gpos-PLoc (Shen and Chou, 2007) and Gneg-PLoc (Chou and Shen, 2006) use a K-nearest neighbor-based classifier to predict localizations for Gram-positive and Gram-negative bacteria, respectively.
- CELLO version 2 (Yu et al, 2006) uses a two-level Support Vector Machine system to assign localizations to both prokaryotic and eukaryotic proteins. Version 1 of the software is described in Yu et al, 2004.
- PSLpred (Bhasin et al, 2005) is a localization prediction tool for Gram-negative bacteria which utilizes a support vector machine and PSI-BLAST to generate predictions for 5 localization sites.
- Proteome Analyst's Subcellular Localization Server (Lu et al, 2004) is a specialized server, available at the PENCE Proteome Analyst site, that is able to classify Gram-negative, Gram-positive, fungi, plant and animal proteins to many localization sites. A database of predictions is also available and is described below.
- LOCtree (Nair and Rost, 2005) is a eukaryotic and prokaryotic localization prediction tool available at the CUBIC site. Databases of localization predictions made by CUBIC's servers are also available and are described below.
- SubLoc (Hua and Sun, 2001) uses a Support Vector Machine to assign a
prokaryotic protein to the cytoplasmic, periplasmic, or extracellular
sites, and a eukaryotic protein to the cytoplasmic, mitochondrial,
nuclear, or extracellular sites. A modified version of SubLoc was
used in PSORT-B v.1.1 to differentiate cytoplasmic and non-cytoplasmic
proteins.
- SignalP (Bendtsen et al, 2004) predicts traditional N-terminal signal peptides in both prokaryotic and eukaryotic proteins.
- TatP (Bendtsen et al, 2005) predicts twin-arginine signal peptides in bacteria.
- LipoP (Juncker et al, 2003) uses a hidden Markov model (HMM) to predict lipoprotein signal peptides in Gram-negative bacteria.
Other prokaryotic subcellular localization prediction methods (without web servers):
- FFT-based SCL predictor (Wang et al, 2007) is a fast Fourier transform-based support vector machine for subcellular localization prediction using different substitution models.
- GNBSL (Guo et al, 2006) generates subcellular localization predictions for Gram-negative bacteria using a combination of several different SVMs based on the PSSM and PSFM generated from the input protein.
- HensBC (Bulashevska and Eils, 2006) predicts localizations by constructing a hierarchical ensemble of classifiers, namely Bayesian classifiers based on Markov chain models.
Other eukaryotic subcellular localization predictors:
- AdaBoost Learner (Jin et al, 2008) predicts 12 eukaryotic localizations using the AdaBoost algorithm.
- SubcellPredict (Niu et al, 2008) uses the AdaBoost algorithm to predict cytoplasmic, nuclear, mitochondrial, and extracellular localization sites for eukaryotic organisms.
- PSLDoc (Chang et al, 2008) uses document classification techniques and incorporates a probabilistic latent semantic analysis with a support vector machine model, for prediction on prokaryotes and eukaryotes.
- EpiLoc (Brady and Shatkay, 2008) is a text-based system for predicting animal, plant and fungal protein subcellular locations.
- ProLoc-GO (Huang et al, 2008) utilizes Gene Ontology terms for sequence-based prediction of subcellular localization.
- AAIndexLoc (Tantoso and Li, 2007) predicts protein subcellular localization by using amino acid composition and physicochemical properties.
- SLPFA (Tamura and Akutsu, 2007) predicts localizations using feature vectors based on amino acid composition (frequency) and sequence alignment. Subcellular locations predicted include chloroplast, mitochondria, secretory pathway, and other locations (nucleus or cytosol) for eukaryotic proteins.
- YimLOC (Shen and Burger, 2007) integrates previously published subcellular localization prediction tools using a stacked decision tree and makes predictions for mitochondrial proteins.
- SLP-Local (Matsuda et al, 2005) predicts localizations for chloroplast, mitochondria, secretory pathway, and other locations (nucleus or cytosol) for eukaryotic proteins, as well as cytoplasmic, extracellular, and periplasmic localizations for Gram-negative organisms.
- SherLoc (Shatkay et al, 2007) integrates several sequence and text-based features and provides predictions for plant, animal, and fungal proteins.
- SLPS (Jia et al, 2007), or Subcellular Localization Predicting System, predicts localization using a Nearest Neighbor Algorithm (NNA) and incorporating a protein functional domain profile.
- Hum-mPLoc (Shen and Chou, 2007) is a localization predictor specific for human proteins. It uses an ensemble classifier that handles cases where a human protein has multiple possible location sites.
- Hum-PLoc (Chou and Shen, 2006) uses a KNN classifier to predict localizations of human proteins.
- Euk-mPLoc (Chou and Shen, 2007) is a general eukaryotic predictor. It uses an ensemble classifier that handles cases where a protein has multiple possible location sites.
- Euk-PLoc (Shen et al, 2007) is a general eukaryotic predictor that uses a KNN (K-Nearest Neighbor)-based algorithm to predict localizations.
- Plant-PLoc (Chou and Shen, 2007) is a plant-specific predictor that uses a KNN algorithm to predict localizations.
- BaCelLo (Pierleoni et al, 2006) is a predictor for five classes of eukaryotic subcellular localization (secretory pathway, cytoplasm, nucleus, mitochondrion and chloroplast) and it is based on different SVMs organized in a decision tree.
- Protein Prowler version 1.2 (Hawkins and Boden, 2006) uses a multi-layer classifier system for predicting the subcellular localization of proteins based on their amino acid sequence. It classifies eukaryotic targeting signals as secretory, mitochondrion, chloroplast or other. Version 1.1 was originally described in Boden and Hawkins, 2005.
- pTARGET (Guda, 2006), (Guda and Subramaniam, 2005) uses amino acid composition and localization-specific Pfam domains to assign a eukaryotic protein to one of nine localization sites.
- CELLO version 2 (Yu et al, 2006) uses a two-level Support Vector Machine system to assign localizations to both prokaryotic and eukaryotic proteins.
- Golgi Localization Predictor (Yuan and Teasdale, 2002) predicts Golgi Type II membrane proteins and can discriminate between proteins destined for the Golgi apparatus or other post-Golgi locations.
- pSLIP (Sarda et al, 2005) uses a support vector machine and multiple
physicochemical properties of amino acids to assign a eukaryotic
protein to one of six localization sites.
- HSLpred (Bhasin et al, 2005) is a localization prediction tool for
human proteins which utilizes a support vector machine and PSI-BLAST
to generate predictions for 4 localization sites.
- LOCSVMPSI (Xie et al, 2005) is a eukaryotic localization prediction method that incorporates evolutionary information into its predictions. The method uses PSI-BLAST and a support vector machine to generate predictions for up to 12 localization sites.
- PSLT (Scott et al, 2004) is a Bayesian network-based method that
predicts human protein localization based on motif/domain co-occurrence.
The tool is not yet available online; however, its predictions
for 9793 human proteins in SWISS-PROT are available for download
from the PSLT site.
- ESLPred (Bhasin and Raghava, 2004) uses a Support Vector Machine and
PSI-BLAST to assign eukaryotic proteins to the nucleus, mitochondrion,
cytoplasm, or extracellular space.
- Proteome Analyst's Subcellular Localization Server (Lu et al, 2004) is a specialized server, available at the PENCE Proteome Analyst site, that is able to classify Gram-negative, Gram-positive, fungi, plant and animal proteins to many localization sites. A database of predictions is also available and is described below.
- LOCtree (Nair and Rost, 2005) is a eukaryotic and prokaryotic localization prediction tool available at the CUBIC site. Databases of localization predictions made by CUBIC's servers are also available and are described below.
- SecretomeP (Bendtsen et al, 2004) predicts eukaryotic proteins which are
secreted via a non-traditional secretory mechanism.
- SignalP (Bendtsen et al, 2004) predicts traditional N-terminal signal
peptides in both prokaryotic and eukaryotic proteins.
- SubLoc (Hua and Sun, 2001) uses a Support Vector Machine to assign a
prokaryotic protein to the cytoplasmic, periplasmic, or extracellular
sites, and a eukaryotic protein to the cytoplasmic, mitochondrial,
nuclear, or extracellular sites. A modified version of SubLoc
was used in PSORT-B v.1.1 to differentiate cytoplasmic and non-cytoplasmic
proteins.
- TargetP (Emanuelsson et al, 2000) predicts the presence of signal peptides,
chloroplast transit peptides, and mitochondrial targeting peptides
for plant proteins, and the presence of signal peptides and mitochondrial
targeting peptides for eukaryotic proteins.
- Predotar is designed to predict the presence of mitochondrial
and plastid targeting peptides in plant sequences.
Other eukaryotic subcellular localization prediction methods (without web servers):
- ngLOC (King and Guda, 2007) uses an n-gram-based Bayesian classifier that predicts the localization of a protein sequence over ten distinct subcellular organelles. An enhanced version of ngLOC was developed to estimate the subcellular proteomes of eight eukaryotic organisms: yeast, nematode, fruit fly, mosquito, zebrafish, chicken, mouse, and human.
Nucleus-specific localization predictors:
- Nuc-PLoc (Shen and Chou, 2007) is a web server for predicting protein subnuclear localization by fusing PseAA composition and PsePSSM.
- NUCLEO (Hawkins et al, 2007) predicts possible nuclear localization while taking dually localized proteins into consideration. It uses an SVM-based approach with a custom kernel that employs a composite spectrum (or multiple k-mer) encoding conjoined with a bit vector indicating the presence or absence of a range of sequence motifs known to be important for nuclear proteins.
- NucPred (Brameier et al, 2007) predicts possible nuclear localization by using a genetic programming-based algorithm. A previous version was described in Heddad et al, 2004.
- ProLoc (Huang et al, 2007) predicts subnuclear localizations using an evolutionary SVM-based classifier with automatic selection from a large set of physicochemical composition (PCC) features.
- Subnuclear Compartments Prediction System (Lei and Dai, 2006), (Lei and Dai, 2005) predicts subnuclear localization by combining an SVM-based system for sequence analysis with a nearest-neighbor classifier using a similarity measure derived from the GO annotation terms for the protein sequences.
- NetNES (la Cour et al, 2004) predicts nuclear export signals using neural networks and HMMs.
- predictNLS (Cokol et al, 2000) uses nuclear localization signal motifs
to predict whether a protein might be localized to the nucleus.
Viral protein subcellular localization predictors:
- Virus-PLoc (Shen and Chou, 2007) predicts viral protein subcellular localizations using a fusion of classifiers implemented with K-nearest neighbor rules, with Swiss-Prot-annotated viral proteins as training data.
Other subcellular localization-related databases: