Wednesday, April 16, 2014

[LeetCode] Longest Substring Without Repeating Characters

Problem Statement (link):
Given a string, find the length of the longest substring without repeating characters. For example, the longest substring without repeating letters for "abcabcbb" is "abc", which the length is 3. For "bbbbb" the longest substring is "b", with the length of 1.

Analysis:
The algorithm is straightforward. We start at the beginning of the string s, for char s[i], we find the longest substr, and move on to the next char s[i+1], repeat.

The interesting part is to choose the best data structure to store the substr's each char as we do need to compare the new char with chars we've seen. I chose unordered_map<string, int> to store and hope the hash map could give me O(1) access each time. I passed all the tests months ago, but now I re-submit the solution, it notified TLE for a rather long string. Apparently, there's collision that drove the complexity beyond O(n), where n is the string length.

Another data structure that fits this problem is array, which provides us O(1) access time. Also, remember that ASCII chars are indexed from 0 to 127. Thus, we could match the ASCII index for each char with our array map, which indicates if we have seen the char before.

Code:
Sol 1: Use unordered_map<string, int>
int lengthOfLongestSubstring(string s) {
    if (s.compare("")==0) return 0;
    int len = s.size();
    int maxLength = 1;
    unordered_map<string, int> table;
    int i = 0;
    while (i<len) {
        table.emplace(s.substr(i,1), i);  // i is the char's location
        for (int j = 1; j < len-i; j++) { // represent the length
            if (table.find(s.substr(i+j, 1)) != table.end()) {   // found the same char, break
                i=table.find(s.substr(i+j,1))->second;
                table.clear();
                break;
            }
            else { // not found - update max length
                maxLength=max(maxLength, j+1);
                table.emplace(s.substr(i+j, 1), i+j);
            }
       }
       i++;
    }
    return maxLength;
}

Sol 2: Use bool arr[128]
class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        if (s.empty()) return 0;
        int len=s.size();
        int maxLen=1;
        int i=0;
        while (i<len) {
            bool arr[128]={false}; // to record if any char appeared in ASCII, Unicode has 256 chars
            arr[s[i]]=true;
            for (int j=1; j<len-i; j++) {
                if (arr[s[i+j]]==true) { // found same char
                    break;
                }
                else {
                    maxLen=max(maxLen,j+1);
                    arr[s[i+j]]=true;
                }
            }
            i++;
        }
        return maxLen;
    }
};


Takeaways:
- hash based data structure cannot guarantee O(1) access. If possible, think about using other O(1) structure instead.

No comments:

Post a Comment