Top K Frequent Words - [C++,Python(3/1 lines)] Two methods: Sorting / Max heap
Problem Statement:
Given an array of strings words
and an integer k
, return the k
most frequent strings.
Return the answer sorted by the frequency from highest to lowest. Sort the words with the same frequency by their lexicographical order.
Example 1:
Input: words = ["i","love","leetcode","i","love","coding"], k = 2 Output: ["i","love"] Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.
Example 2:
Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4 Output: ["the","is","sunny","day"] Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with the number of occurrence being 4, 3, 2 and 1 respectively.
Constraints:
1 <= words.length <= 500
1 <= words[i].length <= 10
words[i]
consists of lowercase English letters.k
is in the range[1, The number of unique words[i]]
Follow-up: Could you solve it in O(n log(k))
time and O(n)
extra space?
Solution:
Sorting:
We implement a custom comparator for the condition “sort by frequency (descending) and for same frequency sort by string (ascending)”
bool comp(pair<string,int>a, pair<string,int>b)
{
if (a.second==b.second)
return a.first < b.first;
return a.second > b.second;
}
class Solution {
public:
vector<string> topKFrequent(vector<string>& words, int k)
{
unordered_map<string,int> H;
for (string w: words) H[w]++;
vector<pair<string,int>> vec;
for (auto p: H) vec.push_back({p.first,p.second});
sort(vec.begin(),vec.end(),comp);
vector<string>res;
for (int i=0; i<k; ++i) res.push_back(vec[i].first);
return res;
}
};
Same thing in Python3. Python has inbuilt collections.Counter
to get frequency map from array. Also Python’s sorting API accepts the use of 2 keys if the first key has equal value, hence we need not write a separate comparator function.
class Solution:
def topKFrequent(self, words: List[str], k: int) -> List[str]:
counter = Counter(words)
sorted_words = sorted(counter.keys(), key=lambda i: (-counter[i], i))
return sorted_words[:k]
If you “really” like compact albeit slightly difficult to read code, here is a one-liner for you (though I wont recommend it in real life):
def topKFrequent(self, words: List[str], k: int) -> List[str]:
return [i[0] for i in sorted(Counter(words).items(), key=lambda j: (-j[1],j[0]))][:k]
TC: O( n log(n))
SC: O(n)
Priority Queue
Instead of sorting we can maintain a priority queue of length K. If length exceeds K, we pop out the last item as per our condition.
bool comp(pair<string,int>a, pair<string,int>b)
{
if (a.second==b.second)
return a.first < b.first;
return a.second > b.second;
}
class Solution {
public:
vector<string> topKFrequent(vector<string>& words, int k)
{
unordered_map<string,int> H;
for (string w: words) H[w]++;
priority_queue<pair<string,int>, vector<pair<string,int>>, decltype(comp)*> pq(comp);
for (auto p: H)
{
pq.push({p.first,p.second});
if (pq.size()>k) pq.pop();
}
vector<string>res;
while (!pq.empty())
{
res.insert(res.begin(), pq.top().first);
pq.pop();
}
return res;
}
};
TC: O(n log(K))
SC: O(n)
If you have any questions dont hesitate to ask. Please upvote if you find this useful.