ISREC Profile Homepage
Overview
Generalized profiles are a very sensitive means of discovering distant
sequence relationships. In contrast to conventional sequence comparison and
database search methods, the query is not a single sequence but a profile
constructed from a family of related sequences. These profiles are normally
derived from multiple alignments of the initial sequence set. In addition to
the sequences themselves, a profile contains the following information:
- which types of residues are allowed at which position
- which positions are important (i.e. highly conserved) and which are not
- which positions or regions can allow insertions, and which regions may be
  dispensable
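To make these three kinds of information concrete, the following minimal Python sketch derives a simple position-specific scoring table from a toy alignment. It is an illustration only, not the generalized profile format or the algorithm used by the programs described on this page: the function names `build_profile` and `score_ungapped`, the uniform background frequencies, the pseudocount, and the toy alignment are all assumptions made for this example.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_profile(alignment, pseudocount=1.0):
    """Build one entry per alignment column: log-odds scores and gap fraction.

    Assumptions for this sketch: uniform background frequencies and a flat
    pseudocount; real profile construction uses more elaborate weighting.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(ALPHABET)
    profile = []
    for col in range(length):
        column = [row[col] for row in alignment]
        counts = Counter(res for res in column if res in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        # Which residues are allowed here: positive score = favoured residue.
        scores = {
            res: math.log2(((counts[res] + pseudocount) / total) / background)
            for res in ALPHABET
        }
        # Where insertions or deletions are tolerated: columns with many gaps.
        gap_fraction = column.count("-") / len(column)
        profile.append({"scores": scores, "gap_fraction": gap_fraction})
    return profile


def score_ungapped(profile, sequence):
    """Sum per-position scores for a sequence as long as the profile."""
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    # Residues outside ALPHABET (e.g. 'X') contribute a neutral score of 0.
    return sum(pos["scores"].get(res, 0.0) for pos, res in zip(profile, sequence))


if __name__ == "__main__":
    toy_alignment = ["ACDK", "ACEK", "AC-K"]  # hypothetical family
    toy_profile = build_profile(toy_alignment)
    for query in ("ACDK", "WWWW"):
        print(query, round(score_ungapped(toy_profile, query), 2))
```

In this toy case, the fully conserved first column gives its residue a higher score than the variable third column gives either of its residues, and the third column records that one of three sequences has a gap there.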
In collaboration with Amos Bairoch in Geneva, we are currently creating profiles
of various protein domains that are being incorporated into the PROSITE pattern
library. For this purpose, we created a new, generalized profile format
containing many more parameters than the previous one. A new set of profile
search programs takes advantage of these new parameters, allowing more
sensitive searches as well as novel types of searches.
For a detailed description of this format and related topics, see the documents
below.
Selected references
- The original profile method:
  - Gribskov, M., McLachlan, A.D. and Eisenberg, D. (1987)
    Profile analysis: detection of distantly related proteins.
    Proc. Natl. Acad. Sci. USA 84:4355-4358
- Improvements to the profile method:
  - Lüthy, R., Xenarios, I. and Bucher, P. (1994)
    Improving the sensitivity of the sequence profile method.
    Prot. Sci. 3:139-146
  - Thompson, J.D., Higgins, D.G. and Gibson, T. (1994)
    Improved sensitivity of profile searches through the use of sequence
    weights and gap excision.
    Comput. Applicat. Biosci. 10:19-29
- The generalized sequence profiles:
  - Bucher, P. and Bairoch, A. (1994)
    A generalized profile syntax for biomolecular sequence motifs and its
    function in automatic sequence interpretation.
    In: Proceedings of the 2nd ISMB Conference, pp. 53-61, AAAI Press.
  - Bucher, P., Karplus, K., Moeri, N. and Hofmann, K. (1996)
    A flexible search technique based on generalized profiles.
    Computers and Chemistry 20:3-24
- The PROSITE pattern library:
  - Bairoch, A., Bucher, P. and Hofmann, K. (1996)
    The PROSITE database, its status in 1995.
    Nucleic Acids Res. 24:189-196
For various applications of the generalized profile technique, see our
publication list and the documents listed below.
Documents on generalized profile syntax and methods
- The syntax of profiles in PROSITE
  This document is part of the current PROSITE release. It contains a detailed
  description of the format and provides all the information needed to write
  programs that read or write the new format. Note, however, that we have also
  released a set of free programs that perform sequence comparisons and
  database searches with profiles in the new format. This program package also
  contains portable routines for reading and writing the new format that can be
  used in other programs.
- PROSITE user's manual
  This document, written by Amos Bairoch, explains all the information stored
  in PROSITE and how it can be used.
- Methods for the construction of profile entries for the PROSITE database
  (K. Hofmann and P. Bucher, 1995). Poster presented at the 3rd International
  Conference on Intelligent Systems for Molecular Biology, Cambridge/UK, July
  1995. This document explains how the generalized profiles in the PROSITE
  database are constructed. Issues such as iterative profile refinement and
  profile scaling are briefly discussed.
- Normalized profile scores
  This document deals with assessing the statistical significance of matches
  found by the profile search methods. It explains how the 'normalized
  profile score' (NScore) is applied.
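As a rough illustration of how a raw profile score might be mapped to a normalized one, the sketch below assumes a simple linear normalization, N = R1 + R2 * raw. The linear form, the function names `normalized_score` and `raw_cutoff`, and the coefficient values are assumptions chosen for this example; the document listed above remains the authoritative description of how NScores are defined and interpreted.

```python
import math


def normalized_score(raw_score, r1, r2):
    """Map a raw profile score to a normalized score (assumed linear form)."""
    return r1 + r2 * raw_score


def raw_cutoff(target_nscore, r1, r2):
    """Smallest integer raw score whose normalized score reaches the target."""
    if r2 <= 0:
        raise ValueError("r2 must be positive for the mapping to be increasing")
    return math.ceil((target_nscore - r1) / r2)


if __name__ == "__main__":
    # Hypothetical coefficients; real values would be specific to each profile.
    r1, r2 = 1.5, 0.015
    print(normalized_score(500, r1, r2))
    print(raw_cutoff(8.5, r1, r2))
```

Because the mapping is monotonic, a threshold chosen on the normalized scale can be converted back to a raw-score cut-off for any individual profile, which is what makes scores from different profiles comparable.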
A collection of posters on profile applications
- Benefits of a Generalized Profile Syntax for Biomolecular Sequence Motifs
  (K. Hofmann and P. Bucher, 1994). Poster presented at the 3rd conference on
  Genes, Proteins and Computers, Chester/UK 1994. This poster is also available
  in compressed PostScript format. It describes the advantages of profile-based
  database searches. As an example, it demonstrates the detection of sequence
  similarity between inositol-monophosphatase, fructose-1,6-bisphosphatase and
  inositol polyphosphate 1-monophosphatase.
- Detection and Analysis of Distantly Related C2-like Membrane Attachment
  Domains
  (K. Hofmann and P. Bucher, 1995). Poster presented at the 1st European Protein
  Society Meeting, Davos/CH 1995. This poster is also available in
  compressed PostScript format. The generalized profile method is used to
  demonstrate the occurrence of C2-like domains in proteins like the novel PLC
  isoforms, phospholipase C, cytosolic phospholipase A2, perforin, and many
  more.
- Conserved sequence domains in cell cycle regulatory proteins
  (K. Hofmann and P. Bucher, 1996). Poster presented at the joint ISREC/AACR
  meeting "Cancer and the Cell Cycle", Lausanne/CH, January 1996. This document
  shows several examples of weakly conserved domains in cell cycle regulatory
  proteins that have been detected using the profile method.
Profile-related software
- ISREC ProfileScan Server
  (Search the profile entries in PROSITE with your sequence.) This
  is an experimental implementation of the pfscan program. The
  profile entries contained in PROSITE, recognizable by the keyword
  MATRIX, can be searched with a single, user-supplied sequence.
  Major new data release, and Pfam is now searchable!
- Download the pftools package
  The pftools package contains programs for generalized profile
  applications. The source code in FORTRAN77 and executables for various
  platforms are available. The current release 1.0 contains the programs
  pfsearch, pfscan, and GtoP. Problems should be reported to Philipp Bucher,
  the author of the package.
  Pftools 2.0 now available!
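Since profile entries in PROSITE are recognizable by the keyword MATRIX, as noted for the ProfileScan server, a small script can list them from a local copy of the database. The sketch below is a hedged example rather than part of pftools: it assumes the PROSITE flat-file layout in which each entry starts with a line of the form `ID   NAME; TYPE.`, and the function name `matrix_entry_ids` and the sample lines are made up for illustration.

```python
def matrix_entry_ids(lines):
    """Yield the names of entries whose ID line carries the MATRIX keyword.

    Assumed line layout: "ID   ENTRY_NAME; ENTRY_TYPE."
    """
    for line in lines:
        if not line.startswith("ID "):
            continue
        # Drop the "ID" line code, surrounding whitespace and the final period.
        body = line[2:].strip().rstrip(".")
        name, _, entry_type = body.partition(";")
        if entry_type.strip() == "MATRIX":
            yield name.strip()


if __name__ == "__main__":
    # Hypothetical excerpt; a real run would iterate over an open database file.
    sample = [
        "ID   SH3; MATRIX.",
        "AC   PS50002;",
        "//",
        "ID   ASN_GLYCOSYLATION; PATTERN.",
        "//",
    ]
    print(list(matrix_entry_ids(sample)))
```

Passing an open file object instead of the sample list works the same way, because the function only iterates over lines.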
Anyone interested in more information on profiles, or who would like to
contribute profiles or good multiple alignments of protein domains, should
contact Philipp Bucher or Kay Hofmann.
For more information on PROSITE, visit the PROSITE homepage in Geneva.
This document was last modified on Monday, February 28, 2005 at 09:24:12 AM.