Word Break

Problem Overview
In this article, we will explore the "Word Break" problem from LeetCode, a classic example that deals with string manipulation and utilizes dynamic programming for efficient solutions. The aim is to determine if a given string can be segmented into a space-separated sequence of one or more dictionary words. Here is the LeetCode link to the problem.
Problem Statement:
Given a string s
and a dictionary of strings wordDict
, return true
if s
can be segmented into a space-separated sequence of one or more dictionary words.
Example:
Input: s = "leetcode", wordDict = ["leet", "code"]
Output: true
Input: s = "applepenapple", wordDict = ["apple", "pen"]
Output: true
Understanding the Problem
The task is straightforward: we have to verify if it's possible to break up the input string into smaller strings, each of which must be found in the provided wordDict
. The goal is then to return true
if such a segmentation is possible, otherwise, return false
.
A naive approach might try all possible segmentations, leading to inefficiency. Hence, a systematic, optimized approach using dynamic programming becomes essential.
Key DSA Concepts and Theory
Dynamic Programming
Dynamic Programming (DP) is an optimization technique that solves problems by breaking them down into simpler subproblems and storing the subproblem results to avoid recalculating them. This retains calculated results using either a top-down or bottom-up approach.
Memoization: Storing results of expensive function calls and returning the cached result when the same inputs occur again.
Tabulation: Solving subproblems first and using their solutions to build solutions to bigger subproblems. Often functions with overlapping subproblems and optimal substructure are candidates for dynamic programming.
In the Word Break problem, DP is used to check systematically if a subsection of the string can be formed using dictionary words, building up to solve the complete problem.
Solution Approach
The core idea is to use a boolean array dp
where dp[i]
is true if the string s
up to index i
can be segmented into dictionary words. This approach uses tabulation for dynamic programming.
Step-by-step Explanation:
Initialization:
- Create a boolean array
dp
of sizen+1
(wheren
is the length of the strings
) initialized tofalse
. - Set
dp[0]
totrue
because an empty string can always be segmented trivially.
- Create a boolean array
Iteration:
- Iterate over each index
i
from1
ton
. - For each
i
, iterate over each indexj
from0
toi
. - For each pair (
i
,j
), check ifdp[j]
istrue
and the substrings[j:i]
exists inwordDict
. If yes, setdp[i]
totrue
and break out of the loop as we found a valid segmentation.
- Iterate over each index
Result:
- The value at
dp[n]
will give the answer –true
if it is possible to segment the entire string, otherwisefalse
.
- The value at
C++ Code:
#include <iostream>
#include <vector>
#include <unordered_set>
using namespace std;
bool wordBreak(string s, vector<string>& wordDict) {
unordered_set<string> wordSet(wordDict.begin(), wordDict.end());
vector<bool> dp(s.size() + 1, false);
dp[0] = true;
for (int i = 1; i <= s.size(); ++i) {
for (int j = 0; j < i; ++j) {
if (dp[j] && wordSet.find(s.substr(j, i - j)) != wordSet.end()) {
dp[i] = true;
break;
}
}
}
return dp[s.size()];
}
Java Code:
import java.util.List;
import java.util.Set;
import java.util.HashSet;
public class WordBreak {
public boolean wordBreak(String s, List<String> wordDict) {
Set<String> wordSet = new HashSet<>(wordDict);
boolean[] dp = new boolean[s.length() + 1];
dp[0] = true;
for (int i = 1; i <= s.length(); i++) {
for (int j = 0; j < i; j++) {
if (dp[j] && wordSet.contains(s.substring(j, i))) {
dp[i] = true;
break;
}
}
}
return dp[s.length()];
}
}
Time and Space Complexity Analysis
Time Complexity:
O(n^2 * m)
, wheren
is the length of the strings
andm
is the maximum length of a word inwordDict
. This results from the nested loop over the string indices and the time taken for substring lookups and dictionary checks.Space Complexity:
O(n)
, due to thedp
array used to store solutions to subproblems.
Common Mistakes to Avoid
- Failing to initialize
dp[0]
: You must setdp[0] = true
to account for the possibility of forming an empty prefix. - Overlooking substring bounds: When using
s.substr(j, i-j)
, ensure the indices remain valid. - Inefficient lookups: Utilize a hash set for
wordDict
to enableO(1)
average time complexity lookups instead ofO(m)
.
Similar Problems on LeetCode
Additional Resources and References
- GeeksforGeeks - Dynamic Programming Introduction
- MIT OpenCourseWare - Introduction to Dynamic Programming
This article should provide a solid overview of solving the Word Break problem using dynamic programming techniques with an emphasis on understanding the underlying concepts and considerations involved.