First solution worked, but was slow and used a lot of memory (5th percentile for both):
func lengthOfLongestSubstring(s string) int {
if s == "" {
return 0
}
// This looks like a sliding window.
longest := []byte{}
current := []byte{}
m := make(map[byte]int)
bs := []byte(s)
i := 0
scan := 0
for {
// scan from current point.
m[bs[scan]]++
if m[bs[scan]] < 2 {
// still no repeat, now add it
current = append(current, bs[scan])
scan++
if scan == len(bs) {
// no more chars to add, current finished
// and string finished
break
}
} else {
// repeat; restarting scan at next point
// for a new possible substring, see if
// the new substring is the new longest.
if len(current) > len(longest) {
longest = current
}
// in either case, we have to try for a new
// non-repeater in the next window
i++
if i == len(bs) {
// no new window left, finish up
break
}
scan = i
current = []byte{}
m = make(map[byte]int)
}
}
if len(current) > len(longest) {
longest = current
}
return len(longest)
}
So this is the right track — the map to manage the characters we’ve seen is definitely right, but there has to be a trick here — and that is to use the index of the character as the hash value, and use that to help us manage the sliding window.
func lengthOfLongestSubstring(s string) int {
if s == "" {
return 0
}
m := make(map[byte]int)
bs := []byte(s)
start := 0
maxLen := 0
currentLen := 0
// Iterate through the string and slide the window each
// time you find a duplicate.
// end is the end of current substring
for end := 0; end < len(bs); end++ {
// Check if the current character is a duplicate
if dupIndex, ok := m[bs[end]]; ok {
// New max?
if maxLen < end - start {
maxLen = end - start
}
// forget index for all characters before the dup
m = make(map[byte]int)
// Slide the window to just after the dup.
start = dupIndex + 1
currentLen = 0
end = start
}
// Add the current character to hashmap
m[bs[end]] = end
currentLen++
}
if currentLen > maxLen {
maxLen = currentLen
}
return maxLen
}
That’s a little better, but it’s still pretty slow. 13th percentile runtime, 20th percentile CPU. (If you use strings, the percentiles drop to 5 and 5. No better than the dumb solution!) Looking at the solutions, it is possible to do the whole thing with indexes…but I definitely do not grok it.
func lengthOfLongestSubstring(s string) int {
if s == "" {
return 0
}
if len(s) == 1 {
return 1
}
// brackets current longest found
startMax := 0
endMax := 0
// start of window
start := 0
dupIndex := 0
// Scan end pointer of window forward
for end := 1; end < len(s); end++ {
// The "I didn't find it" value
dupIndex = -1
// looking for duplicate inside window
for j := start; j < end; j++ {
if s[j] == s[end] {
// Found it
dupIndex = j
break
}
}
// Dup found inside window
if dupIndex >= 0 {
if (end - start - dupIndex) > (endMax - startMax) {
startMax = dupIndex
endMax = end
}
start = dupIndex + 1
} else {
// no dup in window
if (end - start + 1) > (endMax - startMax) {
startMax = start
endMax = end + 1
}
}
}
return endMax - startMax
}
71st percentile CPU, 96th percentile memory (memory makes sense, we barely use anything other than indexes into the string). However, the big winner is dynamic programming:
func lengthOfLongestSubstring(s string) int {
rv := 0
// array as hash for extra speed and no dynamic memory.
// create it, mark it all invalid.
last := [256]int{}
for i := range last {
last[i] = -1
}
// define min inline.
min := func(a, b int) int {
if a < b {
return a
}
return b
}
size := 0
for i, b := range []byte(s) {
size = min(size + 1, i - last[b])
last[b] = i
rv = rv + size - min(rv, size) // max(rv, size)
}
return rv
}
This is black magic and I do not understand it yet!
Leave a Reply
You must be logged in to post a comment.