Calculating Follow Sets of a Context Free Grammar

Having previously discussed the data structures needed for representing a context free grammar as well as a method for computing the 'First' set, in todays post I will cover constructing the 'Follow' set. The follow set, as it's name would imply, is the set of symbols which are located immediatly after a symbol from the first set. 

Untitled Document

Follow sets are used in the generation of LL(1) parse tables, as well as being used to generate the lookahead information for SLR & LALR parse tables. In short, if you're interested in the automatic generation of parsers, then you're going to want to understand how this construction is performed.

 Fixed Point Iteration

As was the case for computing the First set, calculating the follows set requires that your input grammar be in BNF form and is done via a fixed point iterative algorithm, that is: we continue propagating changes from iteration to iteration until no more changes are possible, at which point we are done.

void ComputeFollowSets::initFollows(Grammar& G) {
    for (Symbol t : G.terminals) {
        G.follow[t] = set<Symbol>();
    }
    for (Symbol nt : G.nonterminals) {
        G.follow[nt] = set<Symbol>();
    }
}

We begin by initializing an empty set for all of the symbols - both terminals as well as non-terminals. Next, we add the 'Goal' symbol to the follow set of the start symbol.

void compute(Grammar& G, Symbol start) {
    initFollows(G);
    G.follow[start].insert(GOAL);
    bool didchange = true;
    while (didchange) {
        didchange = false;
        for (auto currNonTerm : G.nonterminals) {
            if (calcFollow(G, currNonTerm))
                didchange = true;                
        }   
    }
}

We iterate over the set of non-terminals, re-calculating each symbols follow set until no more changes occur to any of them. This is done production-by-production for each non terminal symbol. Each production is scanned left to right, ignoring any action symbols we encounter, as they have no effect on the actual grammar itsself.  

bool ComputeFollowSets::calcFollow(Grammar& G, Symbol nonTerm) {
    bool didchange = false;
    for (auto prod : G.productions) {
        Symbol LHS = prod.first;
        ProductionSet RHS = prod.second;
        for (Production alt : RHS) {
            for (int i = 0; i < alt.rhs.size(); i++) {
                if (alt.rhs[i] == ACTSYM)
                    continue;
                if (alt.rhs[i] == nonTerm) {
                    if (symbolFoundOnRight(G, nonTerm, LHS, alt.rhs, i))
                        didchange = true;
                }
            }
        }
    }
    return didchange;
}

On each pass we go through the list of productions and check if the current production is nullable. A production is considered nullable if that by taking said production one could derive Epsilon. If we can derive Epsilon, then we add X's entire follow set to A's follow set.

bool ComputeFollowSets::symbolFoundOnRight(Grammar& G, Symbol currNonTerm, Symbol currLHS, SymbolString alt, int index) {
    bool betaEps = true;
    bool didchange = false;
    bool realSym = false;
    // X -> aAb
    for (Symbol Y : alt.subString(index+1)) {
        if (Y == ACTSYM)
            continue;
        realSym = true;
        for (Symbol s : G.firsts[Y]) {
            if (s != EPS) {
                if (G.follow[currNonTerm].insert(s).second)
                    didchange = true;
            }
        }
        if (G.firsts[Y].find(EPS) == G.firsts[Y].end()) {
            betaEps = false;
            break;
        }
    }

    //X -> aA
    if (!realSym || betaEps) {
        int pre = G.follow[currNonTerm].size();
        G.follow[currNonTerm].insert(G.follow[currLHS].begin(), G.follow[currLHS].end());
        if (pre != G.follow[currNonTerm].size())
            didchange = true;
    }
    return didchange;
}

 


Leave A Comment