14.14 Word Break

14.14.1 Problem Metadata

14.14.2 Description

Given a string s and a dictionary wordDict, determine whether s can be segmented into a space-separated sequence of dictionary words.

14.14.3 Examples

Input: s = "leetcode", wordDict = ["leet","code"]
Output: true

Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
Output: false

14.14.4 Constraints

  • 1 <= s.length <= 300
  • 1 <= wordDict.length <= 1000
  • Words consist of lowercase letters

14.14.5 Solution 1 - Recursive Prefix Search (Top-Down DP with Memoization)

14.14.5.1 Walkthrough

This solution uses recursive exploration with memoization to determine if a string can be segmented into dictionary words. The approach starts from the beginning of the string and tries every possible prefix, recursively validating the remaining suffix.

Core Idea: - At each position, try to match prefixes of increasing length against the dictionary - If a prefix matches, recursively check if the remaining suffix can be segmented - Cache results to avoid recomputing the same subproblems

Recursive State Definition: - dfs(start) returns true if substring s[start...n-1] can be segmented into dictionary words

Base Case: - If start == n (reached end of string), return true - successfully segmented the entire string

Recursive Exploration: For each starting position start, iterate through all possible ending positions end: 1. Extract substring s[start:end] 2. Check if this substring exists in the dictionary 3. If yes, recursively check if dfs(end) returns true 4. If any combination works, return true 5. If no valid segmentation found, return false

Memoization Strategy: Use a Boolean[] array (not boolean[]) to cache results: - memo[start] == null → not computed yet - memo[start] == true → substring from start can be segmented - memo[start] == false → substring from start cannot be segmented

Why Boolean[] instead of boolean[]?

We use Boolean[] (object array) instead of boolean[] (primitive array) to distinguish between three states:

State Meaning Boolean[] boolean[]
Not computed Haven’t explored this position yet null false (ambiguous!)
Can segment Valid segmentation exists true true
Cannot segment No valid segmentation false false (ambiguous!)

With boolean[], we cannot differentiate between “not yet computed” and “computed to false”. Both would be false, causing incorrect cache hits. The Boolean[] array allows us to check if (memo[start] != null) to distinguish cached results from uncomputed states.

Visual Example with s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"]:

Dictionary as Set: {cats, dog, sand, and, cat}

Call dfs(0): "catsandog"
  ├─ Try prefix "c" [0:1] → not in dict, skip
  ├─ Try prefix "ca" [0:2] → not in dict, skip
  ├─ Try prefix "cat" [0:3] → in dict ✓ → recurse dfs(3)
  │   │
  │   └─ dfs(3): "sandog"
  │       ├─ Try prefix "s" [3:4] → not in dict, skip
  │       ├─ Try prefix "sa" [3:5] → not in dict, skip
  │       ├─ Try prefix "san" [3:6] → not in dict, skip
  │       ├─ Try prefix "sand" [3:7] → in dict ✓ → recurse dfs(7)
  │       │   │
  │       │   └─ dfs(7): "og"
  │       │       ├─ Try prefix "o" [7:8] → not in dict, skip
  │       │       ├─ Try prefix "og" [7:9] → not in dict, skip
  │       │       └─ No valid prefix → memo[7] = false, return false
  │       │
  │       ├─ Try prefix "sando" [3:8] → not in dict, skip
  │       ├─ Try prefix "sandog" [3:9] → not in dict, skip
  │       └─ No valid prefix → memo[3] = false, return false
  │
  ├─ Try prefix "cats" [0:4] → in dict ✓ → recurse dfs(4)
  │   │
  │   └─ dfs(4): "andog"
  │       ├─ Try prefix "a" [4:5] → not in dict, skip
  │       ├─ Try prefix "an" [4:6] → not in dict, skip
  │       ├─ Try prefix "and" [4:7] → in dict ✓ → recurse dfs(7)
  │       │   │
  │       │   └─ dfs(7): "og" → memo[7] = false (cached!), return false
  │       │
  │       ├─ Try prefix "ando" [4:8] → not in dict, skip
  │       ├─ Try prefix "andog" [4:9] → not in dict, skip
  │       └─ No valid prefix → memo[4] = false, return false
  │
  ├─ Try prefixes "catsa", "catsan", ..., "catsandog" → none in dict
  └─ No valid segmentation → memo[0] = false, return false

Result: false (cannot segment "catsandog")

Example with s = "leetcode", wordDict = ["leet", "code"]:

Dictionary as Set: {leet, code}

Call dfs(0): "leetcode"
  ├─ Try prefix "l" [0:1] → not in dict, skip
  ├─ Try prefix "le" [0:2] → not in dict, skip
  ├─ Try prefix "lee" [0:3] → not in dict, skip
  ├─ Try prefix "leet" [0:4] → in dict ✓ → recurse dfs(4)
  │   │
  │   └─ dfs(4): "code"
  │       ├─ Try prefix "c" [4:5] → not in dict, skip
  │       ├─ Try prefix "co" [4:6] → not in dict, skip
  │       ├─ Try prefix "cod" [4:7] → not in dict, skip
  │       ├─ Try prefix "code" [4:8] → in dict ✓ → recurse dfs(8)
  │       │   │
  │       │   └─ dfs(8): base case (index == length) → return true ✓
  │       │
  │       └─ memo[4] = true, return true ✓
  │
  └─ memo[0] = true, return true ✓

Result: true ("leet" + "code")

Key Memoization Benefits: - In the first example, when we reach dfs(7) from different paths, we use the cached false result - Without memoization, time complexity would be exponential O(2^n) - With memoization, each position is computed at most once → O(n^2)

14.14.5.2 Analysis

  • Time Complexity: Exponential without memoization; with memo, O(n^2) substring checks
  • Space Complexity: O(n) recursion depth

14.14.5.3 Implementation Steps

  1. Define dfs(start) returning true if substring from start can be segmented.
  2. Iterate end from start+1 to n, check whether s[start:end] is in the dictionary and dfs(end) is true.
  3. Memoize failed indices to avoid recomputation.

14.14.5.4 Code - Java

public boolean wordBreakRecursive(String s, List<String> wordDict) {
    Set<String> dict = new HashSet<>(wordDict);
    Boolean[] memo = new Boolean[s.length()];
    return dfs(s, 0, dict, memo);
}

private boolean dfs(String s, int start, Set<String> dict, Boolean[] memo) {
    if (start == s.length()) {
        return true;
    }
    if (memo[start] != null) {
        return memo[start];
    }
    for (int end = start + 1; end <= s.length(); end++) {
        if (dict.contains(s.substring(start, end)) && dfs(s, end, dict, memo)) {
            return memo[start] = true;
        }
    }
    return memo[start] = false;
}

14.14.6 Solution 2 - Bottom-Up Dynamic Programming

14.14.6.1 Walkthrough

Let dp[i] denote whether prefix s[0:i) can be segmented. For each ending index i, check dictionary words that could end at i and see if the preceding prefix was valid.

14.14.6.2 Analysis

  • Time Complexity: O(n * m), where m is dictionary size
  • Space Complexity: O(n)

14.14.6.3 Implementation Steps

  1. Initialize dp[0] = true.
  2. For end from 1 to n, iterate dictionary words.
  3. If word fits at the end and dp[end - word.length()] is true, set dp[end] = true.
  4. Return dp[n].

14.14.6.4 Code - Java

public boolean wordBreakDP(String s, List<String> wordDict) {
    boolean[] dp = new boolean[s.length() + 1];
    dp[0] = true;

    for (int end = 1; end <= s.length(); end++) {
        for (String word : wordDict) {
            int len = word.length();
            if (len <= end && dp[end - len] && s.substring(end - len, end).equals(word)) {
                dp[end] = true;
                break;
            }
        }
    }
    return dp[s.length()];
}

14.14.7 Solution 3 - Trie-Assisted DP

14.14.7.1 Walkthrough

Insert dictionary words into a trie. While scanning the string, whenever dp[i] is true, walk the trie starting at i to mark reachable dp[j]. This prunes dictionary iterations when many words share prefixes.

14.14.7.2 Analysis

  • Time Complexity: O(n * L) where L is average word length encountered
  • Space Complexity: O(total dictionary characters)

14.14.7.3 Implementation Steps

  1. Build a trie with isEnd flags.
  2. Initialize dp[0] = true.
  3. For each index i, if dp[i] is true, traverse trie edges following s[i...].
  4. Whenever a trie node is terminal at index j, set dp[j] = true.

14.14.7.4 Code - Java

class TrieNode {
    boolean isEnd;
    TrieNode[] next = new TrieNode[26];
}

class Trie {
    TrieNode root = new TrieNode();

    void insert(String word) {
        TrieNode node = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (node.next[idx] == null) {
                node.next[idx] = new TrieNode();
            }
            node = node.next[idx];
        }
        node.isEnd = true;
    }

    void advance(String s, int start, boolean[] dp) {
        TrieNode node = root;
        for (int i = start; i < s.length(); i++) {
            int idx = s.charAt(i) - 'a';
            if (node.next[idx] == null) {
                return;
            }
            node = node.next[idx];
            if (node.isEnd) {
                dp[i + 1] = true;
            }
        }
    }
}

public boolean wordBreakTrie(String s, List<String> wordDict) {
    Trie trie = new Trie();
    for (String word : wordDict) {
        trie.insert(word);
    }
    boolean[] dp = new boolean[s.length() + 1];
    dp[0] = true;
    for (int i = 0; i < s.length(); i++) {
        if (dp[i]) {
            trie.advance(s, i, dp);
        }
    }
    return dp[s.length()];
}