Friday, April 4, 2014

[c++] Parse/split string


Approach I - "strtok"
The tokenizing part of the following implementation is classic C-style code. "strtok" automatically skips runs of consecutive delimiter characters (a space in this example).

#include <string>
#include <stdio.h>
#include <iostream>
#include <stdlib.h>  // atoi
#include <cstring>   // strcpy
#include <vector>
using namespace std;

// Reads multi-line input where each line is a sequence of numbers
// separated by space characters.
int main() {
    string line;
    char* token;
    char s[1024];     // assumes each line has fewer than 1024 chars (strcpy does not check)
    vector<vector<int> > out;
    vector<int> outLine;
    while(getline(cin, line)) {
        outLine.clear();              // start each row empty
        strcpy(s, line.c_str());      // strtok modifies its input, so work on a copy
        token=strtok(s, " ");         // delimiter set: a single space
        while(token!=NULL) {
            outLine.push_back(atoi(token));
            token=strtok(NULL, " ");  // NULL continues scanning the same buffer
        }
        out.push_back(outLine);
    }
    return 0;
}
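
The fixed 1024-byte buffer is the fragile part of the code above. As a minimal sketch, the same "strtok" loop can run on a copy sized to the line. The helper name splitInts is hypothetical (it is not part of the original program), and the delimiter is still assumed to be a single space.

```cpp
#include <cstdlib>   // std::atoi
#include <cstring>   // std::strtok
#include <string>
#include <vector>

// Hypothetical helper (not from the original program):
// split one line of space-separated numbers into ints.
std::vector<int> splitInts(const std::string& line) {
    // strtok writes '\0' into its input, so tokenize a mutable,
    // NUL-terminated copy that is exactly as long as the line.
    std::vector<char> buf(line.begin(), line.end());
    buf.push_back('\0');

    std::vector<int> out;
    for (char* token = std::strtok(&buf[0], " ");
         token != NULL;
         token = std::strtok(NULL, " ")) {
        out.push_back(std::atoi(token));  // atoi yields 0 for non-numeric tokens
    }
    return out;
}
```

Calling splitInts on a line such as "1  2   3" gives the three numbers 1, 2 and 3, because "strtok" skips the repeated spaces. Note that "strtok" keeps internal state between calls, so this sketch is not safe to call from several threads at once.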

Approach II - a simpler version
We could replace the two while loops with the following code snippet.

char delim=' ';
string item;
while(getline(cin, item, delim)) {
    outLine.push_back(atoi(item.c_str()));  // atoi takes a const char*, not a std::string
}

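However, when given a custom delimiter, getline no longer treats '\n' as a separator. Unless a space precedes the line break, the last number of the previous line is read together with the '\n' and the first number of the current line as a single item, and atoi then converts only the leading number. This method also does not collapse multiple spaces: every extra space produces an empty item, which atoi turns into 0.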
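
One way around both limitations, offered here as a sketch rather than as part of the original approaches, is to read each line with plain getline and then extract the numbers with operator>> on a std::istringstream. Formatted extraction skips any run of whitespace, so multiple spaces are harmless, and line boundaries are preserved. The function name readRows is hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read every line of `in` as a row of ints.
std::vector<std::vector<int> > readRows(std::istream& in) {
    std::vector<std::vector<int> > out;
    std::string line;
    while (std::getline(in, line)) {      // one row per input line
        std::istringstream fields(line);
        std::vector<int> row;
        int value;
        while (fields >> value) {         // >> skips leading whitespace
            row.push_back(value);
        }
        out.push_back(row);               // an empty line gives an empty row
    }
    return out;
}
```

In a program like the one in Approach I, this would be called as readRows(cin). Unlike atoi, extraction stops at the first token that is not a number, so the rest of that line is dropped instead of being stored as 0.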
