PTreeGenerator  1.0
Simple phylogenetic tree generation from multiple sequence alignment.
ptreegen.distance_functions Namespace Reference

Functions

def p_distance
 Computation of the uncorrected p-distance.
def poisson_corrected
 Poisson corrected p-distance.
def jukes_cantor
 Distance according to the Jukes-Cantor model.

Function Documentation

def ptreegen.distance_functions.jukes_cantor(seq1, seq2, *args, **kwargs)

Distance according to the Jukes-Cantor model.

For more info see: http://goo.gl/upr3wR

Parameters
seq1       first sequence
seq2       second sequence
*args      positional arguments
**kwargs   keyword arguments (the "sequence_type" option is used to determine the parameters for the formula)
Returns
distance as a single value

Definition at line 66 of file distance_functions.py.

References ptreegen.distance_functions.p_distance().


# Relies on the module-level "import math" and on SeqTypes being in scope.
def jukes_cantor(seq1, seq2, *args, **kwargs):
    p_dist = p_distance(seq1, seq2, *args, **kwargs)
    if kwargs["sequence_type"] == SeqTypes.AA:
        # 20 amino-acid states
        return (-19.0/20.0) * math.log(1 - (20.0/19.0) * p_dist)
    elif kwargs["sequence_type"] == SeqTypes.DNA or kwargs["sequence_type"] == SeqTypes.RNA:
        # 4 nucleotide states
        return (-3.0/4.0) * math.log(1 - (4.0/3.0) * p_dist)
    else:
        # unknown sequence type
        assert False

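The two branches of the listing above differ only in the number of character states (20 for amino acids, 4 for nucleotides). A minimal standalone sketch of that relationship follows; the name jukes_cantor_sketch and the n_states parameter are illustrative assumptions and are not part of ptreegen, and the explicit range check is an addition that the documented function does not perform.

```python
import math


def jukes_cantor_sketch(p_dist, n_states=4):
    """Jukes-Cantor style correction of a p-distance (hypothetical helper).

    n_states is the alphabet size: 4 for DNA/RNA, 20 for amino acids.
    """
    # b is the saturation level: 3/4 for nucleotides, 19/20 for amino acids.
    b = (n_states - 1.0) / n_states
    if p_dist >= b:
        # The logarithm's argument would be zero or negative.
        raise ValueError("p-distance too large for the correction to be defined")
    # Same expression as in the listing: -b * ln(1 - p / b).
    return -b * math.log(1.0 - p_dist / b)


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.3):
        print(p, jukes_cantor_sketch(p), jukes_cantor_sketch(p, n_states=20))
```

The corrected distance is never smaller than the p-distance it is computed from, and it grows without bound as the p-distance approaches the saturation level.
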
def ptreegen.distance_functions.p_distance(seq1, seq2, *args, **kwargs)

Computation of the uncorrected p-distance.

The formula is similar to the one used in EMBOSS (see http://emboss.sourceforge.net/apps/release/6.6/emboss/apps/distmat.html).

Parameters
seq1       first sequence
seq2       second sequence
*args      positional arguments
**kwargs   keyword arguments (the "gap_penalty" argument sets the weight of gapped positions in the denominator; it defaults to 0)
Returns
distance as a single value

Definition at line 22 of file distance_functions.py.

Referenced by ptreegen.distance_functions.jukes_cantor() and ptreegen.distance_functions.poisson_corrected().


def p_distance(seq1, seq2, *args, **kwargs):
    assert len(seq1) == len(seq2)
    # dict.has_key() exists only in Python 2; get() works in Python 2 and 3.
    gap_penalty = kwargs.get("gap_penalty", 0)
    positions_all = len(seq1)
    matches = 0
    gaps = 0
    for idx in range(positions_all):
        res1 = seq1[idx]
        res2 = seq2[idx]
        if res1 == res2 and res1 != "-":
            # identical residues; a column with two gaps is not a match
            matches += 1
        elif res1 == "-" or res2 == "-":
            # Gap in either sequence or in both. This is how the original
            # condition evaluates, because "and" binds tighter than "or".
            gaps += 1
    # matches over ungapped positions plus penalty-weighted gapped positions
    return 1 - float(matches) / ((positions_all - gaps) + gaps * gap_penalty)

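To make the effect of the gap penalty concrete, here is a small self-contained sketch that mirrors the counting in the listing above on two short aligned sequences. The name p_distance_sketch, the explicit gap_penalty keyword and the zero-denominator check are assumptions made for illustration; the documented function takes the penalty through **kwargs and has no such check.

```python
def p_distance_sketch(seq1, seq2, gap_penalty=0.0):
    """Uncorrected p-distance between two aligned sequences (hypothetical helper)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    gaps = 0
    for res1, res2 in zip(seq1, seq2):
        if res1 == "-" or res2 == "-":
            # A gap in either sequence (or both) counts as a gapped column.
            gaps += 1
        elif res1 == res2:
            matches += 1
    # Ungapped columns count fully, gapped columns are weighted by gap_penalty.
    denominator = (len(seq1) - gaps) + gaps * gap_penalty
    if denominator == 0:
        # Assumption: treat an alignment with nothing to score as an error.
        raise ValueError("no scorable positions")
    return 1.0 - float(matches) / denominator


if __name__ == "__main__":
    a = "ACGT-A"
    b = "ACCTTA"
    # 4 matches, 1 mismatch, 1 gapped column.
    print(p_distance_sketch(a, b))                   # approximately 0.2 (1 - 4/5)
    print(p_distance_sketch(a, b, gap_penalty=1.0))  # approximately 0.333 (1 - 4/6)
```

With a penalty of 0 the gapped column is ignored entirely; with a penalty of 1 it counts as a full mismatch.
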
def ptreegen.distance_functions.poisson_corrected(seq1, seq2, *args, **kwargs)

Poisson corrected p-distance.

For more info see: http://goo.gl/upr3wR

Parameters
seq1       first sequence
seq2       second sequence
*args      positional arguments
**kwargs   keyword arguments
Returns
distance as a single value

Definition at line 50 of file distance_functions.py.

References ptreegen.distance_functions.p_distance().


# Relies on the module-level "import math".
def poisson_corrected(seq1, seq2, *args, **kwargs):
    p_dist = p_distance(seq1, seq2, *args, **kwargs)
    # d = -ln(1 - p); undefined when p_dist reaches 1
    return -1 * math.log(1 - p_dist)
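
The correction in the listing above is d = -ln(1 - p). The sketch below applies it directly to a precomputed p-distance to show how the corrected value diverges from p as sequences become more different. The name poisson_corrected_sketch and the range check are illustrative assumptions and not part of ptreegen.

```python
import math


def poisson_corrected_sketch(p_dist):
    """Poisson correction of a p-distance (hypothetical helper)."""
    if not 0.0 <= p_dist < 1.0:
        # The logarithm is undefined once every scored position differs.
        raise ValueError("p-distance must lie in [0, 1)")
    return -math.log(1.0 - p_dist)


if __name__ == "__main__":
    # For small p the correction is close to p; it grows faster as p increases.
    for p in (0.05, 0.2, 0.5):
        print(p, poisson_corrected_sketch(p))
```
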