NAME
     pfscan - scan a protein  or  DNA  sequence  with  a  profile
     library


SYNOPSIS
     pfscan [ -abflLrsuxyz ] [ seq-file | - ]
                 [ profile-library-file | - ]    [L=#] [W=#]

DESCRIPTION
     pfscan compares a protein or nucleic acid sequence against a
     profile  library. The result is an unsorted list of profile-
     sequence matches written to the standard output.  A  variety
     of  output  formats containing different information can  be
     specified  via  the options -a, -l, -L, -r, -u, -s, -x,  -y,
     and -z.
     seq-file  contains  a  sequence  in  EMBL/SWISS-PROT  format
     (assumed by default) or in Pearson/Fasta  format  (indicated
     by  option  -f).  profile-library-file contains a library of
     profiles in PROSITE format. pfscan can be used as  a  filter
     if - is used instead of one of the input filenames.
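
     The filter usage described above can be sketched as follows.
     This is a minimal illustration, not part of pfscan itself:
     the helper names pfscan_command and scan_sequence are
     hypothetical, and the sketch assumes that pfscan is installed
     and on the PATH. Only flags and parameters documented on this
     page are used.

```python
import subprocess


def pfscan_command(profile_library, seq_file="-", fasta=False,
                   level=None, width=None):
    """Build an argv list for pfscan (hypothetical helper)."""
    cmd = ["pfscan"]
    if fasta:
        cmd.append("-f")          # input is in Pearson/Fasta format
    # Positional arguments follow the SYNOPSIS order.
    cmd += [seq_file, profile_library]
    if level is not None:
        cmd.append(f"L={level}")  # cut-off level parameter
    if width is not None:
        cmd.append(f"W={width}")  # output width parameter
    return cmd


def scan_sequence(sequence_text, profile_library, **options):
    """Pipe sequence_text to pfscan on stdin and return its stdout."""
    cmd = pfscan_command(profile_library, seq_file="-", **options)
    result = subprocess.run(cmd, input=sequence_text,
                            capture_output=True, text=True, check=True)
    return result.stdout
```

     Here "-" takes the place of seq-file, so the sequence is read
     from the standard input while the profile library is still
     read from a named file.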

OPTIONS
     -a   Report  optimal  alignment  scores  for  all   profiles
          regardless of the cut-off value. This option simultane-
          ously forces DISJOINT=UNIQUE.

     -b   Search the complementary strand of the DNA sequence  as
          well.

     -f   Input sequence is in Pearson/Fasta format.

     -l   Indicate highest cut-off level exceeded  by  the  match
          score in the output list.

     -L   Indicate by character string the highest cut-off  level
          exceeded  by  the  match score in the output list. Note
          that the generalized profile  format  includes  a  text
          string field to specify a name for a cut-off level. The
          -L option causes the program to display the  first  two
          characters  of this text string (usually something like
          "!", "?", "??", etc.) at the beginning  of  each  match
          description.

     -r   Use raw scores rather than normalized scores for  match
          selection.  Normalized scores will not be listed in the
          output.

     -s   List the sequences of the matched regions as well.  The
          output   will  be  a  Pearson/Fasta-formatted  sequence
          library.

     -u   Forces DISJOINT=UNIQUE.

     -x   List profile-sequence alignments in pftools PSA format.

     -y   Display alignments between the profile and the  matched
          sequence regions in a human-friendly format.

     -z   Indicate starting and ending position  of  the  matched
          profile  range.  The latter position will be given as a
          negative offset from the end of the profile.  Thus  the
          range [    1,    -1] means the entire profile.
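
     The negative end offset reported by -z can be converted back
     to an ordinary position as in the sketch below. It assumes
     1-based profile positions and that an offset of -1 denotes
     the last profile position, which is what the [1, -1] example
     above implies. The function name absolute_end is
     hypothetical, and the profile length must be known from the
     profile itself.

```python
def absolute_end(profile_length, end_offset):
    """Convert a negative end offset (as printed with -z) to a
    1-based position counted from the start of the profile.

    Assumption: -1 refers to the last position of the profile.
    """
    if end_offset >= 0:
        raise ValueError("end offset is expected to be negative")
    return profile_length + end_offset + 1
```

     For a profile of length 100, an end offset of -1 maps to
     position 100 and an end offset of -11 maps to position 90.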

PARAMETERS
     L=#  Cut-off level to be used for match selection.  If level
          L  is not specified in the profile, the next higher (if
          L is negative) or next lower (if L is  positive)  level
          specified is used instead.

     W=#  Output width.  Output lines will be truncated  after  W
          characters.  Default: W=132.
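
     The fallback rule for L=# can be illustrated with a short
     sketch. It is only a reading of the rule as worded above, not
     the program's actual code: the function name effective_level
     is hypothetical, "next higher" is taken to mean the smallest
     defined level above L, and "next lower" the largest defined
     level below L. The behaviour for an undefined L=0, or when no
     suitable level exists, is not stated on this page; the sketch
     treats 0 like a positive level and returns None when nothing
     fits, both of which are assumptions.

```python
def effective_level(requested, defined_levels):
    """Pick the cut-off level used for match selection.

    requested      -- the value given as L=#
    defined_levels -- cut-off levels specified in the profile
    """
    levels = sorted(set(defined_levels))
    if requested in levels:
        return requested
    if requested < 0:
        # Negative L: fall back to the next higher defined level.
        higher = [lv for lv in levels if lv > requested]
        return higher[0] if higher else None
    # Positive L (and, by assumption, L=0): next lower defined level.
    lower = [lv for lv in levels if lv < requested]
    return lower[-1] if lower else None
```

     With levels -1, 0 and 1 defined in a profile, L=2 would
     select level 1 and L=-3 would select level -1 under this
     reading.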

EXAMPLES
     (1)  pfscan -s GTPA_HUMAN prosite13.prf

          Scans the human GAP protein for matches to profiles  in
          PROSITE  release 13. GTPA_HUMAN contains the SWISS-PROT
          entry P20936|GTPA_HUMAN.   prosite13.prf  contains  all
          profile  entries of PROSITE release 13. The output is a
          Pearson/Fasta-formatted sequence library containing all
          sequence  regions of the input sequence matching a pro-
          file in the profile library.

     (2)  pfscan -by CVPBR322 ecp.prf L=2

          Scans both strands of plasmid pBR322  for  high-scoring
          (level  2)  E. coli promoter matches. CVPBR322 contains
          EMBL entry J01749|CVPBR322.  ecp.prf contains a profile
          for  E.  coli  promoters.  The output includes profile-
          sequence alignments in a human-friendly format.
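
     The Pearson/Fasta-formatted output of example (1) can be
     post-processed with a few lines of code. The sketch below
     assumes only that each record starts with a ">" header line
     followed by one or more sequence lines. The layout of the
     header line written by pfscan is not described on this page,
     so it is returned unparsed. The function name read_fasta is
     hypothetical.

```python
def read_fasta(text):
    """Split Pearson/Fasta text into (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []  # start a new record
        elif header is not None:
            chunks.append(line)            # sequence continuation line
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

     Each returned tuple holds the header text without the leading
     ">" and the matched sequence region joined into one string.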

AUTHOR
     Philipp Bucher
     Philipp.Bucher@isrec.unil.ch