Performing a Search
1. Go to the ProfileScan Server sequence entry page. As with other tool pages, this page has a number of links to help and information.
3. In the game, ProfileScan is used to widen the set of databases searched for family, motif and domain information. Check all four of the database boxes to get the most information from the search.
4. Set the sensitivity by selecting "significant matches only" in the first drop-down menu.
5. Leave the hit sorting in the next drop-down menu unchanged.
5. Type "unknown protein" into the query's title box, and then paste the following sequence into the sequence entry box.
MDDDIAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKD
SYVGDEAQSKRGILTLKYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLL
TEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVMDSGDG
VTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVRDIKE
KLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGM
ESCGIHETTFNSIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPST
MKIKIIAPPERKYSVWIGGSILASLSTFQQMWISKQEYDESGPSIVHRKCF
6. Wait for the results to be returned. A search through all available databases often takes a couple of minutes, so be patient.
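Because the sequence in step 5 is pasted as several separate lines, it can be worth checking it locally before submitting. The short Python sketch below joins the lines, removes whitespace and confirms that only the 20 standard amino-acid letters are present. The function name clean_sequence is hypothetical and is not part of the ProfileScan Server. The tutorial does not state which input formats the server accepts, so this is only a sanity check on the text being pasted.

```python
# Sanity-check a pasted protein sequence before submitting it.
# clean_sequence is a hypothetical helper, not part of ProfileScan.

VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def clean_sequence(raw_sequence: str) -> str:
    """Join lines, drop whitespace, and reject unexpected characters."""
    sequence = "".join(raw_sequence.split()).upper()
    unexpected = sorted(set(sequence) - VALID_RESIDUES)
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {unexpected}")
    return sequence


# The query sequence from step 5, exactly as it is laid out in the tutorial.
RAW_QUERY = """
MDDDIAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKD
SYVGDEAQSKRGILTLKYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLL
TEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVMDSGDG
VTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVRDIKE
KLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGM
ESCGIHETTFNSIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPST
MKIKIIAPPERKYSVWIGGSILASLSTFQQMWISKQEYDESGPSIVHRKCF
"""

query = clean_sequence(RAW_QUERY)
print(f"{len(query)} residues, starting with {query[:10]}")
```

If the check raises a ValueError, the most likely cause is a stray character picked up while copying the sequence.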
Interpreting Results
1. ProfileScan searches provide additional information for playing the game because they search several databases not searched by the Blocks Searcher.
2. This results page also begins by providing information about the query sequence, the parameters that were set and the databases that were searched. It then explains the colors of the hit identifiers and notes that any significant match is preceded by a red "!". Since the search parameters were set to return only significant matches, all hits should have a red "!".
3. The next section shows the significant hits for the query. The line for each hit shows the normalized score (if it is available), the raw score, the position in the query where the match occurs for that hit, the name of the profile (color coded to identify the database it came from), and the name/description of the hit sequence in the database.
4. Hits from the PROSITE Profiles, Pfam, and Gribskov collection databases have normalized scores. These scores have an inverse logarithmic relationship to the E-value: the higher the normalized score, the smaller the E-value. A normalized score of 10 corresponds to an E-value of around 1 × 10^-2, and a score of 10.5 to an E-value of around 2 × 10^-3. So any score above 11, and certainly any above 20, indicates a very significant match.
5. Hits from the PROSITE patterns databases do not show a normalized or raw score per se. Marco Pagni, the Webmaster for the ProfileScan Server, says, "In principle, there is no score associated with a match by a pattern. A score was nevertheless assigned to match by pattern in the output of profile-scan: it just acts as a placeholder. It received a value of 1.0 for match by "regular" pattern and a score of 0.1 for match by frequent matcher, i.e. when weak matches are allowed." This means that if an exact match to a pattern in the database is found in the search sequence, the score is given as 1.0. If the query contains a pattern that is only similar to one in the database, a score of 0.1 is given, and such hits are seen only if weak matches are allowed.
6. These scores, along with the name/description of the hits, indicate whether or not there are any families or domains related to the query in the databases searched. As stated above, this will add information to that obtained from any search with the Blocks Searcher on the same query.
7. The result page has two additional features that can be of use. Below the list of hits is a button labelled "SEView Applet." If your browser is capable of running Java applets, clicking this button will bring up a graphical overview of the aligned family profiles or domains. This gives an idea of the areas of the query that match the hits. Below this button, there is another that is labelled "more about these motifs". In a normal search, this button will display a screen that links to the actual database entries for each hit and other related information. This feature has been disabled in the tutorial, but will be active during game play.
8. For the query sequence run for this tutorial, five hits are returned: one family profile from Pfam, one signature from the Gribskov collection, and three from PROSITE patterns. All of the hits are from actins, and the ones that have normalized scores are obviously very good matches (scores of 293 and 24). This strongly suggests that the query protein is at least related to actins, if not an actin itself.
9. Take a look at another ProfileScan result page. These results show several hits to beta-lactamase B families/domains, so the query is most likely to be related to beta-lactamase B. However, there is also a hit to a lipoprotein pattern. This illustrates the fact that some hits may seem unrelated. Such hits most often indicate something about different properties of the query sequence. The prokaryotic lipoprotein hit in this case may indicate some function of the query.
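The scoring rules in steps 4 and 5 can be illustrated with a short Python sketch. It draws a straight line in log10 space through the two figures quoted in step 4 (a normalized score of 10 for an E-value of about 1 × 10^-2, and 10.5 for about 2 × 10^-3) and labels the two placeholder scores used for pattern hits. Both function names are hypothetical, and the straight-line fit is an assumption made only for illustration. The server's real calibration is not given in this tutorial, so the estimates are rough and become unreliable far from the two quoted points.

```python
import math

# The two (normalized score, approximate E-value) pairs quoted in step 4.
ANCHORS = [(10.0, 1e-2), (10.5, 2e-3)]


def approx_e_value(normalized_score: float) -> float:
    """Rough E-value estimate from a line through the two anchor points
    in log10 space. Illustrative assumption, not the server's formula."""
    (s1, e1), (s2, e2) = ANCHORS
    slope = (math.log10(e2) - math.log10(e1)) / (s2 - s1)
    return 10 ** (math.log10(e1) + slope * (normalized_score - s1))


def describe_pattern_score(score: float) -> str:
    """Label the placeholder scores assigned to PROSITE pattern hits."""
    if math.isclose(score, 1.0):
        return "regular pattern match"
    if math.isclose(score, 0.1):
        return "frequent-matcher (weak) match"
    return "not a pattern placeholder score"


for score in (10.0, 10.5, 11.0, 20.0):
    print(f"normalized score {score}: E-value roughly {approx_e_value(score):.1e}")

for score in (1.0, 0.1):
    print(f"pattern score {score}: {describe_pattern_score(score)}")
```

Under this assumed fit, each additional point of normalized score lowers the estimated E-value by a little more than one order of magnitude, which is why scores above 11 are already treated as significant.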
© Copyright 2000 The Southwest Biotechnology and Informatics Center (SWBIC) / Regents of New Mexico State University. All rights reserved.