Trie

Also known as radix tree: A dictionary for bit-strings.

  • Comes from retrieval
  • A tree based on bitwise comparisons: Edge labelled with corresponding bit.
  • Similar to radix sort: MSD-Radix Sort and LSD-Radix Sort.

Assumption: Dictionary is prefix-free: no string is a prefix of another

  • Assumption satisfied if all strings have the same length
  • Assumption satisfied if all string end with ‘end-of-word’ character $

Then items (keys) are stored only in the leaf nodes.

  • start from the root and the most significant bit of
  • follow the link that corresponds to the current bit in return failure if the link is missing
  • return success if we reach a leaf (it must store )
  • else recurse on the new node and the next bit of

Trie Insert

  • Search for , this should be unsuccessful
  • Suppose we finish at a node that is missing a suitable child
    • Note: has extra bits left
  • Expand the trie from the node by adding necessary nodes that correspond to extra bits of

Does order of insertion matter? Will it give you the same compressed trie. No, it shouldn’t matter

Trie Delete

  • Search for
  • let be the leaf where is found
  • delete and all ancestors of until we reach an ancestor that has two children

Time Complexity On all operations length of the bit-string? Yes!

Questions

What is the runtime of searching for a string of size in an uncompressed trie that stores words over the alphabet ? (may assume that the children of a particular node are stored in increasing order by character in an array and that accessing a particular child takes time.)

  • , length of the string input, since all operations are , where x is the input.

Inserting the same set of keys into a compressed trie will always result in a trie of the same height, regardless of the order they are inserted.

  • True

The total number of nodes (given that we are storing words) in a standard trie is in

  • True, why is it true?
    1. Storing words: In a trie, each node represents a single character in the words being stored. The root node represents an empty string, and each subsequent node represents the next character in a word. So, the number of nodes needed to store words will be at least (one node for each character).
  1. : This represents the sum of the lengths of all the words being stored in the trie. In other words, it’s the total number of characters across all the words.