PhyloNode

class PhyloNode(*args, **kwargs)
ancestors()

Returns all ancestors back to the root. Dynamically calculated.

append(i)

Appends i to self.children, in-place, cleaning up refs.

ascii_art(show_internal=True, compact=False)

Returns a string containing an ascii drawing of the tree.

Parameters:
  • show_internal – includes internal edge names.

  • compact – use exactly one line per tip.

balanced()

Tree ‘rooted’ here with no neighbour having > 50% of the edges.

Usage:

Using a balanced tree can substantially improve performance of the likelihood calculations. Note that the resulting tree has a different orientation with the effect that specifying clades or stems for model parameterisation should be done using the ‘outgroup_name’ argument.

bifurcating(eps=None, constructor=None, name_unnamed=False)

Wraps multifurcating() with num=2, so that every node has at most two children.

child_groups()

Returns list containing lists of children sharing a state.

In other words, returns runs of tip and nontip children.
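The grouping can be sketched as follows. This is an illustration of the idea only, not the library's code: the children are represented by hypothetical (name, is_tip) pairs rather than PhyloNode objects.

```python
from itertools import groupby

def child_groups(children):
    """Group consecutive children that share the same tip/nontip state."""
    # children: list of (name, is_tip) pairs, a stand-in for real node objects
    return [list(run) for _, run in groupby(children, key=lambda c: c[1])]

children = [("a", True), ("b", True), ("x", False), ("c", True)]
groups = child_groups(children)
# groups holds three runs: two tips, one nontip, one tip
```

Only adjacent children are grouped, so the trailing tip "c" forms its own run rather than joining "a" and "b".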

compare_by_names(other)

Equality test for trees by name

compare_by_subsets(other, exclude_absent_taxa=False)

Returns fraction of overlapping subsets where self and other differ.

Other is expected to be a tree object compatible with PhyloNode.

Note: names present in only one of the two trees will count as mismatches: if you don’t want this behavior, strip out the non-matching tips first.
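A rough sketch of a subset-based comparison is given below. The formula (one minus twice the number of shared subsets over the total number of subsets) is an assumption chosen for illustration and may not match the library's exact calculation. Each tree is represented by a hypothetical set of frozensets of tip names.

```python
def subset_distance(subsets_a, subsets_b):
    """Fraction of clade subsets not shared by two trees (assumed formula)."""
    a, b = set(subsets_a), set(subsets_b)
    if not a and not b:
        return 0.0
    return 1 - 2 * len(a & b) / (len(a) + len(b))

# Each frozenset is the set of tip names below one internal node.
tree1 = {frozenset("ab"), frozenset("cd"), frozenset("abcd")}
tree2 = {frozenset("ab"), frozenset("abc"), frozenset("abcd")}

same = subset_distance(tree1, tree1)    # identical trees give 0.0
differ = subset_distance(tree1, tree2)  # two of three subsets shared
```

Under this formula a name found in only one tree changes every subset that contains it, which is why such names count as mismatches.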

compare_by_tip_distances(other, sample=None, dist_f=distance_from_r, shuffle_f=random.shuffle)

Compares self to other using tip-to-tip distance matrices.

Value returned is dist_f(m1, m2) for the two matrices. Default is to use the Pearson correlation coefficient, with +1 giving a distance of 0 and -1 giving a distance of +1 (the maximum possible value). Depending on the application, you might instead want to use distance_from_r_squared, which counts correlations of both +1 and -1 as identical (0 distance).

Note: automatically strips out the names that don’t match (this is necessary for this method because the distance between non-matching names and matching names is undefined in the tree where they don’t match, and because we need to reorder the names in the two trees to match up the distance matrices).
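The mapping from correlation to distance described above can be sketched in plain Python. The linear form (1 - r) / 2 is an assumption that matches the two stated endpoints; the function names are hypothetical, and the matrices are plain nested lists rather than numpy arrays.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def distance_from_r_sketch(m1, m2):
    """Map r over the flattened matrices onto [0, 1]: r=+1 -> 0, r=-1 -> 1."""
    flat1 = [value for row in m1 for value in row]
    flat2 = [value for row in m2 for value in row]
    return (1 - pearson(flat1, flat2)) / 2

m1 = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
m2 = [[2 * v for v in row] for row in m1]   # same shape, rescaled
m3 = [[-v for v in row] for row in m1]      # perfectly anti-correlated

close = distance_from_r_sketch(m1, m2)
far = distance_from_r_sketch(m1, m3)
```

Because the correlation ignores scale, two trees whose branch lengths differ by a constant factor come out at distance 0 under this measure.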

compare_name(other)

Compares TreeNode by name

copy(memo=None, _nil=None, constructor='ignored')

Returns a copy of self using an iterative approach

copy_recursive(memo=None, _nil=None, constructor='ignored')

Returns copy of self’s structure, including shallow copy of attrs.

constructor is ignored; required to support old tree unit tests.

copy_topology(constructor=None)

Copies only the topology and labels of a tree, not any extra data.

Useful when you want another copy of the tree with the same structure and labels, but want to e.g. assign different branch lengths and environments. Does not use deepcopy from the copy module, so _much_ faster than the copy() method.

deepcopy(memo=None, _nil=None, constructor='ignored')

Returns a copy of self using an iterative approach

descendant_array(tip_list=None)

Returns numpy array with nodes in rows and descendants in columns.

A value of 1 indicates that the tip in that column is a descendant of the node in that row. A value of 0 indicates that it is not.

Also returns a list of nodes in the same order as they are listed in the array.

tip_list is a list of the names of the tips that will be considered, in the order they will appear as columns in the final array. Internal nodes will appear as rows in preorder traversal order.
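The layout of the array can be sketched without numpy. The Node class and the descendant_rows function are hypothetical stand-ins, and treating the root as the first row is an assumption of this sketch.

```python
class Node:
    """Minimal stand-in for a tree node (not the library's class)."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

def tip_names(node):
    """Names of all tips at or below node."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in tip_names(child)]

def descendant_rows(root, tip_list):
    """Return (rows, nodes): one 0/1 row per internal node, in preorder."""
    rows, nodes = [], []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            below = set(tip_names(node))
            rows.append([1 if tip in below else 0 for tip in tip_list])
            nodes.append(node)
            # reversed so the leftmost child is visited first
            stack.extend(reversed(node.children))
    return rows, nodes

tree = Node("root", [Node("ab", [Node("a"), Node("b")]), Node("c")])
rows, nodes = descendant_rows(tree, ["a", "b", "c"])
row_names = [node.name for node in nodes]
```

Here the row for "ab" is [1, 1, 0]: tips a and b descend from it, tip c does not.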

distance(other)

Returns branch length between self and other.

extend(items)

Extends self.children by items, in-place, cleaning up refs.

get_connecting_edges(name1, name2)

returns a list of edges connecting two nodes.

If both are tips, the LCA is excluded from the result.

get_connecting_node(name1, name2)

Finds the last common ancestor of the two named edges.

get_distances(endpoints=None)

The distance matrix as a dictionary.

Usage:

Grabs the branch lengths (evolutionary distances) as a complete matrix (i.e. a,b and b,a).
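A minimal sketch of a complete distance dictionary follows. The tree is stored in a hypothetical child -> (parent, branch_length) mapping, and all helper names are illustrative; the sketch assumes non-negative branch lengths.

```python
from itertools import combinations

# Hypothetical tree: child -> (parent, branch_length); the root has no entry.
PARENTS = {
    "a": ("ab", 1.0),
    "b": ("ab", 2.0),
    "ab": ("root", 0.5),
    "c": ("root", 3.0),
}

def path_to_root(name):
    """Map name and each of its ancestors to the distance from name."""
    dists, total = {name: 0.0}, 0.0
    while name in PARENTS:
        parent, length = PARENTS[name]
        total += length
        dists[parent] = total
        name = parent
    return dists

def pair_distance(name1, name2):
    up1, up2 = path_to_root(name1), path_to_root(name2)
    # The closest shared ancestor minimises the summed path length.
    return min(up1[k] + up2[k] for k in up1 if k in up2)

def distance_dict(tips):
    """Store every pair in both orders, i.e. (a, b) and (b, a)."""
    result = {}
    for t1, t2 in combinations(tips, 2):
        result[(t1, t2)] = result[(t2, t1)] = pair_distance(t1, t2)
    return result

dists = distance_dict(["a", "b", "c"])
```

Storing both key orders means a lookup never needs to normalise the order of the two names.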

get_edge_names(tip1name, tip2name, clade=True, stem=False, outgroup_name=None)

Return the list of stem and/or sub tree (clade) edge name(s). This is done by finding the common intersection, and then getting the list of names. If the clade traverses the root, then use the outgroup_name argument to ensure valid specification.

Parameters:
  • tip1name, tip2name – names of the two edges

  • stem – whether the name of the clade stem edge is returned.

  • clade – whether the names of the edges within the clade are returned

  • outgroup_name – if provided the calculation is done on a version of the tree re-rooted relative to the provided tip.

Usage:

The returned list can be used to specify subtrees for special parameterisation. For instance, say you want to allow the primates to have a different value of a particular parameter. In this case, provide the results of this method to the parameter controller method set_param_rule() along with the parameter name, etc.

get_edge_vector(include_root=True)

Collect the list of edges in postfix order

Parameters:
  • include_root – whether the root edge is included

get_figure(style='square', **kwargs)

gets Dendrogram for plotting the phylogeny

Parameters:
  • style (string) – ‘square’, ‘angular’, ‘radial’ or ‘circular’

  • kwargs – arguments passed to Dendrogram constructor

get_max_tip_tip_distance()

Returns the max tip tip distance between any pair of tips

Returns (dist, tip_names, internal_node)

get_newick(with_distances=False, semicolon=True, escape_name=True, with_node_names=False)

Return the newick string for this tree.

Parameters:
  • with_distances – whether branch lengths are included.

  • semicolon – end tree string with a semicolon

  • escape_name – if any of these characters []’”(), appear in a node’s name, wrap the name in single quotes

  • with_node_names – includes internal node names (except ‘root’)

Notes

This method returns the Newick representation of this node and its descendants.
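How the with_distances and with_node_names options shape the output string can be sketched as follows. The Node class and the newick function are hypothetical, the sketch is recursive for brevity, and name escaping is left out.

```python
class Node:
    """Minimal stand-in for a tree node (not the library's class)."""
    def __init__(self, name=None, length=None, children=()):
        self.name, self.length = name, length
        self.children = list(children)

def newick(node, with_distances=False, semicolon=True,
           with_node_names=False, _top=True):
    if node.children:
        inner = ",".join(
            newick(child, with_distances, False, with_node_names, False)
            for child in node.children
        )
        text = f"({inner})"
        # internal names are optional, and the top node's name is never written
        if with_node_names and not _top and node.name:
            text += node.name
    else:
        text = node.name or ""
    if with_distances and not _top and node.length is not None:
        text += f":{node.length}"
    return text + (";" if _top and semicolon else "")

tree = Node("root", children=[
    Node("ab", 0.5, [Node("a", 1.0), Node("b", 2.0)]),
    Node("c", 3.0),
])
plain = newick(tree)
with_lengths = newick(tree, with_distances=True)
named = newick(tree, with_node_names=True)
```

For this tree, plain is "((a,b),c);" and with_lengths is "((a:1.0,b:2.0):0.5,c:3.0);".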

get_newick_recursive(with_distances=False, semicolon=True, escape_name=True, with_node_names=False)

Return the newick string for this edge.

Parameters:
  • with_distances – whether branch lengths are included.

  • semicolon – end tree string with a semicolon

  • escape_name – if any of these characters []’”(), appear in a node’s name, wrap the name in single quotes

  • with_node_names – includes internal node names (except ‘root’)

get_node_matching_name(name)
get_node_names(includeself=True, tipsonly=False)

Return a list of edges from this edge - may or may not include self. This node (or first connection) will be the first, and then they will be listed in the natural traverse order.

Parameters:
  • includeself (bool) – if False, self.name is excluded from the result

  • tipsonly (bool) – only tips returned

get_nodes_dict()

Returns a dict keyed by node name, value is node

Will raise TreeError if non-unique names are encountered

get_param_value(param, edge)

returns the parameter value for named edge

get_sub_tree(name_list, ignore_missing=False, keep_root=False, tipsonly=False)

A new instance of a sub tree that contains all the otus that are listed in name_list.

Parameters:
  • ignore_missing – if False, get_sub_tree will raise a ValueError if name_list contains names that aren’t nodes in the tree

  • keep_root – if False, the root of the subtree will be the last common ancestor of all nodes kept in the subtree. Root to tip distance is then (possibly) different from the original tree. If True, the root to tip distance remains constant, but root may only have one child node.

  • tipsonly – only tip names matching name_list are allowed

get_tip_names(includeself=False)

return the list of the names of all tips contained by this edge

get_xml()

Return XML formatted tree string.

index_in_parent()

Returns index of self in parent.

insert(index, i)

Inserts an item at specified position in self.children.

is_root()

Returns True if the current node is a root, i.e. has no parent.

is_tip()

Returns True if the current node is a tip, i.e. has no children.

isroot()

Returns True if root of a tree, i.e. no parent.

istip()

Returns True if is tip, i.e. no children.

iter_nontips(include_self=False)

Iterates over nontips descended from self, [] if none.

include_self, if True (default is False), will return the current node as part of the list of nontips if it is a nontip.

iter_tips(include_self=False)

Iterates over tips descended from self, [] if self is a tip.

last_common_ancestor(other)

Finds last common ancestor of self and other, or None.

Always tests by identity.

lca(other)

Finds last common ancestor of self and other, or None.

Always tests by identity.
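Testing by identity can be sketched with parent pointers. The Node class and the function name are hypothetical, and counting a node as its own ancestor is an assumption of this sketch.

```python
class Node:
    """Minimal stand-in with a parent pointer (not the library's class)."""
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent

def last_common_ancestor(a, b):
    """Deepest node on both root-ward paths, or None if the paths never meet."""
    seen = set()
    node = a
    while node is not None:
        seen.add(id(node))   # identity, so equal names do not collide
        node = node.parent
    node = b
    while node is not None:
        if id(node) in seen:
            return node
        node = node.parent
    return None

root = Node("root")
ab = Node("ab", root)
a, b, c = Node("a", ab), Node("b", ab), Node("c", root)
stranger = Node("a")   # same name as a, but in a different tree
```

With this setup, last_common_ancestor(a, stranger) is None even though both nodes are named "a", because the comparison never looks at names.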

property length
levelorder(include_self=True)

Performs levelorder iteration over tree

lowest_common_ancestor(tipnames)

Lowest common ancestor for a list of tipnames

This should be around O(H sqrt(n)), where H is height and n is the number of tips passed in.

make_tree_array(dec_list=None)

Makes an array with nodes in rows and descendants in columns.

A value of 1 indicates that the tip in that column is a descendant of the node in that row. A value of 0 indicates that it is not.

Also returns a list of nodes in the same order as they are listed in the array.

max_tip_tip_distance()

returns the max distance between any pair of tips

Also returns the tip names that it is between as a tuple

multifurcating(num, eps=None, constructor=None, name_unnamed=False)

Returns a new tree with every node having num or fewer children.

Parameters:
  • num (int) – the maximum number of children a node can have

  • eps (float) – default branch length to set if self or constructor is of PhyloNode type

  • constructor – a TreeNode or subclass constructor. If None, uses self

  • name_unnamed (bool) – names unnamed nodes

name_unnamed_nodes()

sets the Data property of unnamed nodes to an arbitrary value

Internal nodes are often unnamed and so this function assigns a value for referencing.

non_tip_children()

Returns direct children in self that have descendants.

nontips(include_self=False)

Returns nontips descended from self.

property parent

Accessor for parent.

If using an algorithm that accesses parent a lot, it will be much faster to access self._parent directly, but don’t do it if mutating self._parent! (or, if you must, remember to clean up the refs).

pop(index=-1)

Returns and deletes child of self at index (default: -1)

postorder(include_self=True)

performs postorder iteration over tree.

Notes

This is somewhat inelegant compared to saving the node and its index on the stack, but is 30% faster in the average case and 3x faster in the worst case (for a comb tree).

pre_and_postorder(include_self=True)

Performs iteration over tree, visiting node before and after.

preorder(include_self=True)

Performs preorder iteration over tree.

prune()

Reconstructs correct tree after nodes have been removed.

Internal nodes with only one child will be removed and new connections and Branch lengths will be made to reflect change.
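Collapsing single-child nodes while keeping path lengths intact can be sketched as follows. The Node class and the prune function are hypothetical, and the handling of a single-child root is an assumption of this sketch.

```python
class Node:
    """Minimal stand-in for a tree node (not the library's class)."""
    def __init__(self, name, length=0.0, children=()):
        self.name, self.length = name, length
        self.children = list(children)

def prune(node):
    """Collapse internal nodes with exactly one child, summing branch lengths."""
    node.children = [prune(child) for child in node.children]
    if len(node.children) == 1:
        child = node.children[0]
        child.length += node.length   # merged branch keeps the total path length
        return child
    return node

# "x" is left with a single child, as if its other child had been removed.
tree = Node("root", children=[
    Node("x", 0.5, [Node("a", 1.0)]),
    Node("b", 2.0),
])
pruned = prune(tree)
summary = [(child.name, child.length) for child in pruned.children]
```

After pruning, tip a hangs directly off the root with length 1.5, so its root-to-tip distance is unchanged.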

reassign_names(mapping, nodes=None)

Reassigns node names based on a mapping dict

Parameters:
  • mapping (dict) – old_name -> new_name

  • nodes – specific nodes for renaming (such as just tips, etc…)

remove(target)

Removes node by name instead of identity.

Returns True if node was present, False otherwise.

remove_deleted(is_deleted)

Removes all nodes where is_deleted tests true.

Internal nodes that have no children as a result of removing deleted are also removed.

remove_node(target)

Removes node by identity instead of value.

Returns True if node was present, False otherwise.

root()

Returns root of the tree self is in. Dynamically calculated.

root_at_midpoint()

return a new tree rooted at midpoint of the two tips farthest apart

This function does not preserve the internal node naming or structure, but does keep tip-to-tip distances correct. Uses unrooted_deepcopy().

rooted_at(edge_name)

Return a new tree rooted at the provided node.

Usage:

This can be useful for drawing unrooted trees with an orientation that reflects knowledge of the true root location.

rooted_with_tip(outgroup_name)

A new tree with the named tip as one of the root’s children

same_shape(other)

Ignores lengths and order, so trees should be sorted first

same_topology(other)

Tests whether two trees have the same topology.

scale_branch_lengths(max_length=100, ultrametric=False)

Scales BranchLengths in place to integers for ascii output.

Warning: tree might not be exactly the length you specify.

Set ultrametric=True if you want all the root-tip distances to end up precisely the same.

separation(other)

Returns number of edges separating self and other.

set_max_tip_tip_distance()

Propagate tip distance information up the tree

This method was originally implemented by Julia Goodrich with the intent of being able to determine max tip to tip distances between nodes on large trees efficiently. The code has been modified to track the specific tips the distance is between

set_param_value(param, edge, value)

Sets the value for param at named edge

set_tip_distances()

Sets distance from each node to the most distant tip.

siblings()

Returns all nodes that are children of the same parent as self.

Note: excludes self from the list. Dynamically calculated.

sorted(sort_order=None)

An equivalent tree sorted into a standard order. If sort_order is not specified then alphabetical order is used. At each node starting from root, the algorithm will try to put the descendant which contains the lowest scoring tip on the left.
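The ordering rule can be sketched on nested lists, where a string is a tip and a list is an internal node. This representation and the function name are hypothetical, and scoring a tip by its position in sort_order is an assumption of this sketch.

```python
def sorted_tree(tree, sort_order=None):
    """Return a copy with children ordered by their lowest scoring tip."""
    def score(tip):
        # alphabetical by default, else the position in sort_order
        return tip if sort_order is None else sort_order.index(tip)

    def visit(node):
        if isinstance(node, str):   # a tip
            return score(node), node
        kids = sorted((visit(child) for child in node), key=lambda pair: pair[0])
        # a subtree inherits the best (lowest) score found among its tips
        return kids[0][0], [subtree for _, subtree in kids]

    return visit(tree)[1]

tree = [["d", "b"], "c", ["e", "a"]]
alphabetical = sorted_tree(tree)
custom = sorted_tree(tree, sort_order=["c", "d", "b", "e", "a"])
```

In the alphabetical case the clade containing "a" moves to the left, giving [["a", "e"], ["b", "d"], "c"].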

subset()

Returns set of names that descend from specified node

subsets()

Returns all sets of names that come from specified node and its kids

tip_children()

Returns direct children of self that are tips.

tip_to_tip_distances(endpoints=None, default_length=1)

Returns distance matrix between all pairs of tips, and a tip order.

Warning: .__start and .__stop added to self and its descendants.

tip_order contains the actual node objects, not their names (may be confusing in some cases).

tips(include_self=False)

Returns tips descended from self, [] if self is a tip.

tips_within_distance(distance)

Returns tips within specified distance from self

Branch lengths of None will be interpreted as 0
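A sketch of the search is given below, treating missing branch lengths as 0 as described. The Node class and function are hypothetical; walking both down to children and up through the parent, and counting a tip at exactly the given distance, are assumptions of this sketch.

```python
class Node:
    """Minimal stand-in for a tree node (not the library's class)."""
    def __init__(self, name, length=None, children=()):
        self.name, self.length, self.parent = name, length, None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def tips_within_distance(start, distance):
    found, seen = [], {id(start)}
    pending = [(start, 0.0)]
    while pending:
        node, travelled = pending.pop()
        if not node.children and node is not start:
            found.append(node.name)
        # step down to each child, or up to the parent along this node's own branch
        steps = [(child, child.length) for child in node.children]
        if node.parent is not None:
            steps.append((node.parent, node.length))
        for neighbour, length in steps:
            total = travelled + (length or 0.0)   # a length of None counts as 0
            if id(neighbour) not in seen and total <= distance:
                seen.add(id(neighbour))
                pending.append((neighbour, total))
    return sorted(found)

a, b = Node("a", 1.0), Node("b", 2.0)
c = Node("c")   # branch length left as None
root = Node("root", children=[Node("ab", 0.5, [a, b]), c])
near_a = tips_within_distance(a, 2.0)
wider = tips_within_distance(a, 3.0)
```

Tip c is 1.5 from a here because its own missing branch length contributes nothing to the path.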

to_json()

returns json formatted string {‘newick’: with edges and distances, ‘edge_attributes’: }

to_rich_dict()

returns {‘newick’: with node names, ‘edge_attributes’: {‘tip1’: {‘length’: …}, …}}

total_descending_branch_length()

Returns total descending branch length from self

total_length()

returns the sum of all branch lengths in tree

traverse(self_before=True, self_after=False, include_self=True)

Returns iterator over descendants. Iterative: safe for large trees.

self_before includes each node before its descendants if True. self_after includes each node after its descendants if True. include_self includes the initial node if True.

self_before and self_after are independent. If neither is True, only terminal nodes will be returned.

Note that if self is terminal, it will only be included once even if self_before and self_after are both True.

This is a depth-first traversal. Since the trees are not binary, preorder and postorder traversals are possible, but inorder traversals would depend on the data in the tree and are not handled here.
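The interaction of the three flags can be sketched with an explicit stack. The Node class and the traverse function are hypothetical stand-ins and not the library's implementation.

```python
class Node:
    """Minimal stand-in for a tree node (not the library's class)."""
    def __init__(self, name, children=()):
        self.name, self.children = name, list(children)

def traverse(root, self_before=True, self_after=False, include_self=True):
    # Stack entries are (node, children_done); no recursion is involved.
    stack = [(root, False)]
    while stack:
        node, children_done = stack.pop()
        wanted = include_self or node is not root
        if not node.children:
            if wanted:
                yield node   # terminal nodes are yielded exactly once
            continue
        if children_done:
            if self_after and wanted:
                yield node
            continue
        if self_before and wanted:
            yield node
        stack.append((node, True))   # revisit after all descendants
        stack.extend((child, False) for child in reversed(node.children))

tree = Node("root", [Node("ab", [Node("a"), Node("b")]), Node("c")])
pre = [n.name for n in traverse(tree)]
post = [n.name for n in traverse(tree, self_before=False, self_after=True)]
tips_only = [n.name for n in traverse(tree, self_before=False)]
```

With both flags False only the terminal nodes a, b and c are produced, which matches the behaviour described above.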

traverse_recursive(self_before=True, self_after=False, include_self=True)

Returns iterator over descendants. IMPORTANT: read notes below.

traverse_recursive is slower than traverse, and can lead to stack errors. However, you _must_ use traverse_recursive if you plan to modify the tree topology as you walk over it (e.g. in post-order), because the iterative methods use their own stack that is not updated if you alter the tree.

self_before includes each node before its descendants if True. self_after includes each node after its descendants if True. include_self includes the initial node if True.

self_before and self_after are independent. If neither is True, only terminal nodes will be returned.

Note that if self is terminal, it will only be included once even if self_before and self_after are both True.

This is a depth-first traversal. Since the trees are not binary, preorder and postorder traversals are possible, but inorder traversals would depend on the data in the tree and are not handled here.

unrooted()

A tree with at least 3 children at the root.

unrooted_deepcopy(constructor=None, parent=None)
write(filename, with_distances=True, format=None)

Save the tree to filename

Parameters:
  • filename – path of the file the tree is saved to

  • with_distances – whether branch lengths are included in string.

  • format – default is newick; xml and json are the alternatives. If given, this argument overrides the filename suffix. All attributes are saved in the xml format.

Notes

Only the cogent3 json and xml tree formats are supported.