Friday, June 6, 2014

[LeetCode] Merge k Sorted List

Problem Statement (link):
Merge k sorted linked lists and return it as one sorted list. Analyze and describe its complexity.
Analysis:
There are three types of solutions:

Suppose the list has k linked list, the longest linked list is with length n.

1) Naive approach:
We compare each of the first k nodes' value, find the node with smallest value, pull that node out and add it to the new list. We repeat this process until all nodes are removed from the original list.

The time complexity is O(k*n*k) = O(n*k^2)

2) Similar idea to Merge sort
We merge every two linked lists in sequence, and repeat this merging until we are left only one linked list.
In run 1, we did k/2 pair merges and left with (k+1)/2 linked lists;
In run 2, we did k/4 pair merges and left with (k+1)/4 linked lists;
...

The time complexity is O(n*k*log k), the log k comes from the fact that we did log k merges in total.

The implementation of this algorithm is below in Sol 1.

3) Use container which auto-sort the incoming data. Three possible options we have are: multiset, heap, priority_queue. I choose to implement using priority_queue for no good reason.

The idea is simple: push each node in into a size k priority queue, each time we pop out the top node - the one with smallest value among all others, and connect it to output list. As we will go through each node once, and each push operation cost log k, the overall time complexity is O(n*k*log k).

Code:
Sol 1: Merge sort
/**
 * Definition for singly-linked list.
 * struct ListNode {
 *     int val;
 *     ListNode *next;
 *     ListNode(int x) : val(x), next(NULL) {}
 * };
 */

class Solution {
public:
    ListNode *mergeKLists(vector<ListNode *> &lists) {
        int k=lists.size();
        if (k==0) return NULL;
        return recur(lists, 0, k-1);
    }
    ListNode *recur(vector<ListNode *> &lists, int left, int right) {
        if (left<right) {
            int mid=(left+right)/2;
            return mergeTwoLists(recur(lists, left, mid), recur(lists, mid+1, right));
        }
        else
            return lists[left];
    }
    ListNode *mergeTwoLists(ListNode* head1, ListNode* head2) {
        if (!head1) return head2;
        if (!head2) return head1;

        ListNode *head=new ListNode(INT_MIN);
        ListNode *curr=head;

        while(head1 && head2) {
            if (head1->val<head2->val) {
                curr->next=head1;
                head1=head1->next;
            }
            else {
                curr->next=head2;
                head2=head2->next;
            }
            curr=curr->next;
        }
        if (head1) curr->next=head1;
        else if (head2) curr->next=head2;
        return head->next;
    }
};


Sol 2: Priority queue
/**
 * Definition for singly-linked list.
 * struct ListNode {
 *     int val;
 *     ListNode *next;
 *     ListNode(int x) : val(x), next(NULL) {}
 * };
 */

class Solution {
private:
    struct cmp {
        bool operator() (ListNode *n1, ListNode *n2) {
            return n1->val>n2->val;
        }
    };

public:
    ListNode *mergeKLists(vector<ListNode *> &lists) {
        if (lists.size()==0) return NULL;
        priority_queue<ListNode *, vector<ListNode *>, cmp> heap;

        // add all lists' heads in heap
        for (int i=0; i<lists.size(); i++)
            if (lists[i]) // skill NULL lists
                heap.push(lists[i]);

        // dummy head
        ListNode* head=new ListNode(INT_MAX);
        ListNode* curr=head;

        // pop and push
        while(!heap.empty()) {
            curr->next=heap.top();
            heap.pop();
            curr=curr->next;
            if (curr->next)
                heap.push(curr->next);
        }
        return head->next;
    }
};

Takeaways:
All the three data structures/abstract data type (ADT) are capable of storing items in sorted order. They allow user-defined comparison functions comp as well.


multiset
heap 
 priority_queue
Type
Container object (ADT)
A way of organizing items in the range
Container adaptor - the standard underlying container is vector
 Implementation 
Self-balanced BST 

 Similar to heap, call make_heap, push_heap, pop_heap to maintain heap properties
Complexity (insert, emplace, erase, find, push, pop etc.)
logarithmic in general, but amortized linear or constant in special cases indicated here
Up to linear in three times the range distance
One push_back call to underlying container, one push_heap call on the range *
Key ops
empty
begin, end
insert, emplace
erase
find
clear
make_heap
push_heap
pop_heap
sort_heap
emplace
empty
pop
push
size
top
Relatives
unordered_multiset - implemented using hash tables 


Other


Default comparison is less, i.e., greatest value at the top

* push_back: in vector case, time complexity is constant in general. But if reallocation happens, the reallocation can be up to linear on the range of the entire data size.
   push_heap: up to logarithmic time cost.


No comments:

Post a Comment