Implement Trie (Prefix Tree)

Hero Image

DT

Dhaval Trivedi

Co-founder, Airtribe

Problem Overview

The LeetCode problem "Implement Trie (Prefix Tree)" focuses on constructing and manipulating a trie data structure. A trie is an efficient, tree-based data structure commonly used for storing and searching strings, especially in scenarios involving prefixes. The task requires implementing the trie with the following operations:

  1. Insert: Insert a word into the trie.
  2. Search: Search for a complete word in the trie.
  3. StartsWith: Determine if there is any word in the trie that starts with a given prefix.

The problem provides a foundational understanding of how tries function and are implemented, particularly for applications involving autocomplete, spell-checking, and dictionary searches.

Understanding the Problem

To solve this problem, you must create a class Trie that includes methods to insert, search, and check for prefixes in a collection of strings. Each method needs to efficiently navigate the trie structure both in terms of time and memory.

Example

Assume the following sequence of operations:

  1. Insert "apple"
  2. Search "apple" (returns true)
  3. Search "app" (returns false)
  4. StartsWith "app" (returns true)
  5. Insert "app"
  6. Search "app" (returns true)

This sequence illustrates how insertions change the state of the trie and how search and prefix operations explore its structure.

Key DSA Concepts and Theory

Trie Data Structure

A trie (pronounced "try") is a type of search tree used to store a dynamic set or associative array where the keys are usually strings. Unlike a binary search tree, no node in the trie stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root node is associated with the empty string.

Advantages

  • Prefix Searches: Efficiently handles queries related to prefix searches.
  • Space Efficiency: Highly space-efficient if the stored strings share prefixes.
  • Auto-completion: Perfect base for implementing autocomplete features.

Trie Structure

Each node in a trie typically consists of:

  • A collection of child nodes.
  • A boolean flag isEndOfWord that marks the end of a particular string.

Solution Approach

To implement a trie, a class Trie is created with fundamental methods like insert, search, and startsWith. Each node in the trie is represented by a helper class TrieNode having an array of child nodes and a boolean to determine the end of a word.

Step-by-Step Approach

C++ Implementation

class TrieNode {
public:
    TrieNode* children[26];
    bool isEndOfWord;

    TrieNode() {
        for (int i = 0; i < 26; i++) {
            children[i] = nullptr;
        }
        isEndOfWord = false;
    }
};

class Trie {
private:
    TrieNode* root;

public:
    Trie() {
        root = new TrieNode();
    }

    void insert(string word) {
        TrieNode* currentNode = root;
        for (char c : word) {
            int index = c - 'a';
            if (currentNode->children[index] == nullptr) {
                currentNode->children[index] = new TrieNode();
            }
            currentNode = currentNode->children[index];
        }
        currentNode->isEndOfWord = true;
    }

    bool search(string word) {
        TrieNode* currentNode = root;
        for (char c : word) {
            int index = c - 'a';
            if (currentNode->children[index] == nullptr) {
                return false;
            }
            currentNode = currentNode->children[index];
        }
        return currentNode->isEndOfWord;
    }

    bool startsWith(string prefix) {
        TrieNode* currentNode = root;
        for (char c : prefix) {
            int index = c - 'a';
            if (currentNode->children[index] == nullptr) {
                return false;
            }
            currentNode = currentNode->children[index];
        }
        return true;
    }
};

Java Implementation

class TrieNode {
    TrieNode[] children;
    boolean isEndOfWord;

    TrieNode() {
        children = new TrieNode[26];
        isEndOfWord = false;
    }
}

class Trie {
    private TrieNode root;

    public Trie() {
        root = new TrieNode();
    }

    public void insert(String word) {
        TrieNode currentNode = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (currentNode.children[index] == null) {
                currentNode.children[index] = new TrieNode();
            }
            currentNode = currentNode.children[index];
        }
        currentNode.isEndOfWord = true;
    }

    public boolean search(String word) {
        TrieNode currentNode = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (currentNode.children[index] == null) {
                return false;
            }
            currentNode = currentNode.children[index];
        }
        return currentNode.isEndOfWord;
    }

    public boolean startsWith(String prefix) {
        TrieNode currentNode = root;
        for (char c : prefix.toCharArray()) {
            int index = c - 'a';
            if (currentNode.children[index] == null) {
                return false;
            }
            currentNode = currentNode.children[index];
        }
        return true;
    }
}

Time and Space Complexity Analysis

  • Insert Operation: The time complexity is O(m), where m is the length of the word being inserted. This is because we potentially need to insert up to m new nodes. The space complexity is O(m) for the newly inserted nodes.
  • Search Operation: The time complexity is O(m), where m is the length of the word being searched since it takes m steps to traverse the word. The space complexity is O(1) since no additional space is required beyond the current trie nodes.
  • StartsWith Operation: Similar to search, the time complexity is O(m), and the space complexity is O(1).

Common Mistakes to Avoid

  • Incorrect Node Initialization: Ensure that all node pointers are initialized correctly to avoid segmentation faults or null pointer exceptions.
  • Character Index Calculation: Remember to correctly calculate the index for child nodes using the formula c - 'a'.
  • Handling Case Sensitivity: By default, the given implementations assume lowercase alphabets. Adjust the node array size and index calculation if handling a broader set of characters.

Similar Problems on LeetCode

Additional Resources and References

By carefully implementing and understanding the trie operations, you can handle a wide variety of search-related problems efficiently, leveraging the strengths of this powerful data structure.