Tuesday, October 12, 2010

Lecture 2

Sequence alignment is the comparison of 2 (pairwise) or more (multiple) DNA or protein sequences by searching for a series of individual characters or character patterns that occur in the same order in the sequences.

To produce an optimal alignment, gaps are inserted so that as many identical or similar characters as possible are brought into vertical register.

It is a powerful tool for exploring functional, structural and evolutionary information about DNA or proteins.
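
To make "vertical register" concrete, here is a minimal sketch that marks the identical columns of two already-aligned sequences. The function name `describe_alignment` and the example sequences are made up for illustration; they are not from the lecture.

```python
def describe_alignment(top: str, bottom: str, gap: str = "-") -> tuple[str, int]:
    """Return a marker line ('|' under identical columns) and the number of identical columns."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    markers = "".join(
        "|" if x == y and x != gap else " " for x, y in zip(top, bottom)
    )
    return markers, markers.count("|")


# Hypothetical example: the same pair of sequences, with the gap placed
# at the end (poor register) and in the middle (good register).
for top, bottom in [("GATTACA", "GATACA-"), ("GATTACA", "GAT-ACA")]:
    markers, identical = describe_alignment(top, bottom)
    print(top)
    print(markers)
    print(bottom)
    print("identical columns:", identical)
    print()
```

With the gap at the end only 3 columns are identical; moving the gap into the middle brings 6 columns into register.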

Global vs local

Global: compares the whole length of the sequences, up to both ends. Gaps are introduced to match as many characters as possible.

Local: concentrates on the region(s) of the sequences where the longest matches are found.

Three principal methods:
Dot matrix analysis
Dynamic programming (DP) algorithm
Word or k-tuple method

Dot matrix:
Uses a graph to display possible alignments. A possible alignment is shown on the graph as a diagonal line, running from top left to bottom right or in the opposite direction.

Advantages: Shows direct and inverted repeats easily
                  Shows the presence of insertions and deletions.
Disadvantage: Does not show the actual alignment.
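
A minimal sketch of such a graph, drawn as text. It assumes sequence A runs along the top and sequence B down the left side, and that a `*` is placed wherever the two characters are identical. The function names and the example sequences are illustrative only.

```python
def dot_matrix(seq_a: str, seq_b: str) -> list[str]:
    """One text row per character of seq_b; '*' marks a match with seq_a, '.' a mismatch."""
    return [
        "".join("*" if a == b else "." for a in seq_a)
        for b in seq_b
    ]


def print_dot_matrix(seq_a: str, seq_b: str) -> None:
    """Print the matrix with seq_a as the column header and seq_b as the row labels."""
    print("  " + seq_a)
    for b, row in zip(seq_b, dot_matrix(seq_a, seq_b)):
        print(b + " " + row)


# Hypothetical sequences; the second lacks one T of the first.
print_dot_matrix("GATTACA", "GATACA")
```

In the printed matrix the diagonal of `*` shifts one column to the right after the missing T, which is how an insertion or deletion shows up. The plot shows where an alignment could lie, but not the alignment itself.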

Dot matrix:

2 approaches: Basic and filtering

Basic:
Sequence A is listed along the top of the matrix and sequence B down the left side.
Starting with the first character of B, compare it with every character of A and place a dot wherever the two match; then repeat with the second character, the third character and so forth.

Filtering:
A sliding window can be used.
Windows from the 2 sequences are compared at the same time.
A dot is printed on the graph only if a certain minimal number of matches (the stringency) occurs within the windows being compared (the window size).
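
The filtering approach can be sketched as follows. This is only an illustration under a few assumptions: the dot is recorded at the starting position of each window (some programs place it at the window centre), and the parameter names `window` and `stringency` and their default values are made up for the example.

```python
def filtered_dot_matrix(seq_a: str, seq_b: str, window: int = 3, stringency: int = 2) -> list[str]:
    """Dot matrix where a '*' needs at least `stringency` matches inside a window.

    Row i, column j compares seq_b[i:i+window] with seq_a[j:j+window],
    position by position.
    """
    rows = []
    for i in range(len(seq_b) - window + 1):
        row = []
        for j in range(len(seq_a) - window + 1):
            matches = sum(seq_a[j + k] == seq_b[i + k] for k in range(window))
            row.append("*" if matches >= stringency else ".")
        rows.append("".join(row))
    return rows


# Hypothetical example: a strict filter (3 of 3) against a loose one (1 of 3).
for stringency in (3, 1):
    print("stringency", stringency)
    for row in filtered_dot_matrix("GATTACA", "GATTACA", window=3, stringency=stringency):
        print(row)
```

Raising the stringency removes scattered chance matches and leaves mainly the diagonals; lowering it brings the noise back.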



Direct repeats show as diagonal lines running from top left to bottom right.
Inverted repeats show as diagonal lines running in the opposite direction, from top right to bottom left.
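
A rough sketch of how the two kinds of diagonal could be picked out of a dot matrix. It makes a simplifying assumption: an "inverted" repeat here is a plain reversed copy of the text, whereas for DNA one would normally compare against the reverse complement. The function name `diagonal_runs` and the sequences are hypothetical.

```python
def diagonal_runs(seq_a: str, seq_b: str, min_len: int = 3, inverted: bool = False):
    """Return (row, column, length) for each run of matches of at least min_len.

    Rows index seq_b, columns index seq_a. Direct runs step down and to the
    right; inverted runs step down and to the left.
    """
    step = -1 if inverted else 1
    runs = []
    for i in range(len(seq_b)):
        for j in range(len(seq_a)):
            # Skip cells that merely continue a run started one row earlier.
            pi, pj = i - 1, j - step
            if pi >= 0 and 0 <= pj < len(seq_a) and seq_b[pi] == seq_a[pj]:
                continue
            length = 0
            while (
                i + length < len(seq_b)
                and 0 <= j + step * length < len(seq_a)
                and seq_b[i + length] == seq_a[j + step * length]
            ):
                length += 1
            if length >= min_len:
                runs.append((i, j, length))
    return runs


# A sequence compared with itself: ACGT occurs twice (direct repeat).
print(diagonal_runs("ACGTTTACGT", "ACGTTTACGT"))
# A sequence compared with its reversed copy (simplified inverted repeat).
print(diagonal_runs("GATTACA", "ACATTAG", inverted=True))
```

In the self-comparison the main diagonal is the sequence matching itself, and the two shorter runs beside it are the direct repeat. In the second call the single run starts in the top right corner and ends in the bottom left.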

Needleman-Wunsch (NW) algorithm (for global sequence alignment):
The scoring system is important for finding the optimal alignment. It uses 3 scores: match, mismatch and gap.
       
Score matrix and backtracking:

Need to know:
Match = ?          Mismatch = ?         Gap = ?

For each box, calculate the score from 3 directions: diagonal, from above and from the left.
Put down the highest score in the box.
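
The score matrix and backtracking steps above can be sketched as below. The notes leave the three scores open, so the defaults used here (match = 1, mismatch = -1, gap = -2) are assumptions chosen only for the example, as are the function name and the test sequences. When several directions tie, this sketch prefers the diagonal, then the box above, then the box to the left; other tie-breaking rules give other, equally optimal alignments.

```python
def needleman_wunsch(seq_a: str, seq_b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment of seq_a (columns) and seq_b (rows).

    Returns (optimal score, aligned seq_a, aligned seq_b).
    The default scores are illustrative assumptions, not values from the lecture.
    """
    rows, cols = len(seq_b) + 1, len(seq_a) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and first column: only gaps are possible.
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        score[i][0] = i * gap

    def pair(i: int, j: int) -> int:
        return match if seq_a[j - 1] == seq_b[i - 1] else mismatch

    # Fill each box with the best of the 3 directions.
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + pair(i, j)
            above = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(diagonal, above, left)

    # Backtrack from the bottom right box to the top left box.
    top, bottom = [], []
    i, j = rows - 1, cols - 1
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(seq_a[j - 1])
            bottom.append(seq_b[i - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append("-")
            bottom.append(seq_b[i - 1])
            i -= 1
        else:
            top.append(seq_a[j - 1])
            bottom.append("-")
            j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


best, aligned_a, aligned_b = needleman_wunsch("GATTACA", "GATACA")
print(best)
print(aligned_a)
print(aligned_b)
```

For these two sequences and scores the best total is 4: six matches and one gap.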




     
