The prompt for this problem was honestly kind of confusing. But, what I was able to understand was that we want to maximize the number of partitions we can make, but a letter can only appear in one partition. So for example, if we partition the string into two halves, both halves cannot have the same letter (let’s say a). So, the leetcode solution for this problem is different from what I came up with and seems like it would be the most efficient solution, however, I would argue my solution is more efficient. So, I used merge intervals to get my solution.
class Solution:
def partitionLabels(self, s: str) -> List[int]:
# T: O(n) S: O(1)
charMap = {}
# make intervals
for i in range(len(s)):
if s[i] not in charMap:
charMap[s[i]] = [i, i]
else:
charMap[s[i]][1] = i
# merge intervals
intervals = sorted(charMap.values())
res = [intervals[0]]
for i in range(1, len(intervals)):
i1, j1 = res[-1]
i2, j2 = intervals[i]
if j1 < i2: # disconnected intervals
res[-1] = j1 - i1 + 1
res.append(intervals[i])
else:
res[-1][1] = max(j1, j2)
res[-1] = res[-1][1] - res[-1][0] + 1
return res
The idea is to first find the first and last positions a character appears in the string, and that would be an interval. Then if we merge all the intervals we get our result. When we make the intervals, we go through the string, and when a character first appears we mark that as the first and last pointer for that character. Then when the character appears again we update the last pointer. This will create a hash map with all the characters in the string, with their appropriate intervals.
Note, that the constraint on the string is that it’s only lowercase English letters. This means that our hash map would at the max be 26 in length which would make the space complexity constant.
Next, we get the intervals by making an array of the values of the hash map, and then we sort it. We then use the merge intervals algorithm to merge the intervals, but with a modification. The result has to be the length of the interval, and we can do that in the same pass as the merging. When we merge, we can simply check if the previous interval and the current interval are disconnected or not, if they are not (meaning they are overlapping) we can merge the previous interval (in the res) and the current by updating the right pointer. If they are not overlapping, we add the new interval to res, but after we convert the previous interval to the length of the interval. Finally, we have to convert the last interval in res to its length, since that doesn’t convert in the for loop, then we return res.
In the leetcode discussion, I saw that people were labeling this approach as an O(nlog(n)) time approach, but that is incorrect, and in fact, this approach is faster. First of all the leetcode solution does 2 passes on the string so O(2n) time complexity, whereas this approach does only 1. We have to understand that each interval represents a lowercase English letter since it is derived from the hash map, so the max it can be is 26, which means that the correct time complexity of the merging would be O(26log(26) + 26), which simplifies to constant time. The space complexity for this approach we could say is worse since in the leetcode official solution it only stores the last pointer in the hash map, but here we store both the first and last, which would be using twice the space. However, at the end of the day, it’s still constant space. So, overall the time complexity is faster for this approach which is O(n) and the space complexity is O(1).