HFV Ebola sequence database

Overview of the PhyloPlace Service

The Phylogenetic Placement (PhyloPlace) services provided here include Pairwise Distance and Branching Index analyses. Each tool analyzes one query sequence at a time: the query is aligned with reference sequences and then analyzed. Both analysis methods use PAUP*. Pairwise Distance uses uncorrected distances, while Branching Index uses branch lengths from neighbor-joining trees built with the F84 substitution model (Felsenstein, 1984) and the BioNJ algorithm (Gascuel, 1997).

Pairwise Distance. This summarizes the distribution of pairwise distances among aligned sequences. For n sequences, there are n(n-1)/2 pairwise comparisons. The analysis type menu selects which pairs of sequences are compared. Clade-specific options report only distances that involve at least one sequence from the specified clade. For more detailed analysis and visualization, the results can be downloaded with the "Get Data" button.
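As a rough illustration of what an uncorrected pairwise distance is, the following Python sketch computes the fraction of differing sites for every unordered pair in a small alignment. It is not the PAUP* implementation used by the service. The function names, the toy sequences, and the choice to skip sites containing a gap or N are assumptions made for this example only.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected distance: fraction of compared sites that differ.

    Assumption for this sketch: sites where either sequence has a gap ('-')
    or an 'N' are skipped rather than counted.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def all_pairwise(seqs):
    """Return {(name_i, name_j): distance} for all n(n-1)/2 unordered pairs."""
    return {
        (i, j): p_distance(seqs[i], seqs[j])
        for i, j in combinations(sorted(seqs), 2)
    }


# Hypothetical toy alignment: three references plus one query.
aligned = {
    "ref1": "ACGTACGT",
    "ref2": "ACGTACGA",
    "ref3": "ACGAACGA",
    "query": "AC-TACGA",
}
distances = all_pairwise(aligned)
```

With four sequences, `distances` holds 4 × 3 / 2 = 6 entries, matching the n(n-1)/2 count given above.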

Distances are shown as a histogram, with the number of sequence pairs (y-axis) as a function of distance (x-axis). This distribution typically has three peaks (cf. Van Regenmortel 2007), corresponding to distances (1) within the same subtype, (2) between subtypes of the same genotype, and (3) between genotypes. Because the query sequence is compared with every reference sequence, the position of its distances within this distribution indicates how closely the query is related to sequences in the reference set.
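To make the histogram concrete, here is a minimal Python sketch that bins pairwise distances and keeps the pairs involving the query separate from reference-to-reference pairs. The bin width of 0.05, the function name, and the distance values are all hypothetical and chosen only for illustration; the service's actual binning is not described here.

```python
from collections import Counter


def bin_distances(pair_distances, query, bin_width=0.05):
    """Count sequence pairs per distance bin.

    Bin k covers [k * bin_width, (k + 1) * bin_width). Pairs that include
    the query are counted separately from reference-only pairs.
    """
    query_counts = Counter()
    reference_counts = Counter()
    for (a, b), d in pair_distances.items():
        # Round before truncating so values such as 0.15 / 0.05 do not
        # fall into the bin below because of floating-point error.
        k = int(round(d / bin_width, 9))
        if query in (a, b):
            query_counts[k] += 1
        else:
            reference_counts[k] += 1
    return dict(query_counts), dict(reference_counts)


# Hypothetical distances, not real results.
pair_distances = {
    ("ref1", "ref2"): 0.02,
    ("ref1", "ref3"): 0.12,
    ("ref2", "ref3"): 0.13,
    ("query", "ref1"): 0.03,
    ("query", "ref2"): 0.14,
    ("query", "ref3"): 0.31,
}
query_hist, reference_hist = bin_distances(pair_distances, "query")
```

Comparing the bins occupied in `query_hist` with the peaks of `reference_hist` is the numerical counterpart of reading the query's distances against the reference distribution.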

Figure: Example of pairwise distance analysis results

The example above shows results from "Type and Subtype" P-dist analysis. Note the tri-modal distribution of distances between Types (blue); within Types, between Subtypes (yellow); and within Types, within Subtypes (green). Distances associated with the query sequence are colored red.

Branching Index. This approach quantifies relatedness to known clades as a ratio of branch lengths at the point where the query sequence connects to the reference tree (Wilbe et al., 2003). Values range from 0 (unrelated) to 1 (perfectly related) and are compared with a threshold to decide whether the degree of relatedness is significant.

A Branching Index profile slides overlapping windows along the query sequence. Each window is 400 nt long, and successive windows are offset by 80 nt. A minimum sequence length of 200 nt is required. The analysis can take considerable time to complete, particularly for long sequences. The result is a profile of Branching Index values along the length of the query sequence. Line color indicates the predicted taxon, and a horizontal line separates significant (above) from insignificant (below) results (Wilbe et al. 2003; Hraber et al. 2008).
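The window layout described above can be sketched in Python as follows. The 400 nt window, 80 nt step, and 200 nt minimum come from the text. The function name, the treatment of sequences shorter than one window (analyzed as a single window), and the dropping of a trailing partial window are assumptions, since the service's handling of these cases is not specified here.

```python
def window_spans(seq_len, window=400, step=80, min_len=200):
    """Return (start, end) spans, 0-based and end-exclusive, for each window.

    Assumptions for this sketch: a sequence shorter than one window is
    analyzed as a single window, and a trailing partial window is dropped.
    """
    if seq_len < min_len:
        raise ValueError(f"sequence must be at least {min_len} nt long")
    if seq_len <= window:
        return [(0, seq_len)]
    return [(s, s + window) for s in range(0, seq_len - window + 1, step)]


spans = window_spans(1000)
```

For a 1000 nt query this yields eight windows, starting at positions 0, 80, 160, and so on up to 560. Adjacent windows share 320 nt, which is why neighboring points in the profile are strongly correlated.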

Figure: Example of Branching Index output

The example above illustrates results from Branching Index analysis of a genome sequence with accession number AY651061. Line color depicts the most closely related subtype clade in a phylogenetic tree. Putative recombination breakpoints are found where the value of the BI function is minimal. Multiple breakpoints between subtypes 1a and 1c are evident as alternating peaks in BI values that correspond to different subtypes.


Felsenstein J. (1984) Distance methods for inferring phylogenies: a justification. Evolution, 38:16-24.

Gascuel O. (1997) BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol Biol Evol, 14:685-695.

Hraber P, Kuiken C, Waugh M, Geer S, Bruno W, Leitner T. (2008) Classification of hepatitis C virus and human immunodeficiency virus-1 sequences with the branching index. Journal of General Virology, 89:2098-2107.

Van Regenmortel MHV. (2007) Virus species and virus identification: past and current controversies. Infection, Genetics, and Evolution, 7:133-144.

Wilbe K, Salminen M, Laukkanen T, McCutchan F, Ray SC, Albert J, Leitner T. (2003) Characterization of novel recombinant HIV-1 genomes using the branching index. Virology, 316:116-125.

Questions or comments? Contact us at hfv-info@lanl.gov