Calculating Follow Sets of a Context Free Grammar
Having previously discussed the data structures needed for representing a context free grammar as well as a method for computing the 'First' set, in todays post I will cover constructing the 'Follow' set. The follow set, as it's name would imply, is the set of symbols which are located immediatly after a symbol from the first set.
Follow sets are used in the generation of LL(1) parse tables, as well as being used to generate the lookahead information for SLR & LALR parse tables. In short, if you're interested in the automatic generation of parsers, then you're going to want to understand how this construction is performed.
Fixed Point Iteration
As was the case for computing the First set, calculating the follows set requires that your input grammar be in BNF form and is done via a fixed point iterative algorithm, that is: we continue propagating changes from iteration to iteration until no more changes are possible, at which point we are done.
void ComputeFollowSets::initFollows(Grammar& G) {
for (Symbol t : G.terminals) {
G.follow[t] = set<Symbol>();
}
for (Symbol nt : G.nonterminals) {
G.follow[nt] = set<Symbol>();
}
}
We begin by initializing an empty set for all of the symbols - both terminals as well as non-terminals. Next, we add the 'Goal' symbol to the follow set of the start symbol.
void compute(Grammar& G, Symbol start) {
initFollows(G);
G.follow[start].insert(GOAL);
bool didchange = true;
while (didchange) {
didchange = false;
for (auto currNonTerm : G.nonterminals) {
if (calcFollow(G, currNonTerm))
didchange = true;
}
}
}
We iterate over the set of non-terminals, re-calculating each symbols follow set until no more changes occur to any of them. This is done production-by-production for each non terminal symbol. Each production is scanned left to right, ignoring any action symbols we encounter, as they have no effect on the actual grammar itsself.
bool ComputeFollowSets::calcFollow(Grammar& G, Symbol nonTerm) {
bool didchange = false;
for (auto prod : G.productions) {
Symbol LHS = prod.first;
ProductionSet RHS = prod.second;
for (Production alt : RHS) {
for (int i = 0; i < alt.rhs.size(); i++) {
if (alt.rhs[i] == ACTSYM)
continue;
if (alt.rhs[i] == nonTerm) {
if (symbolFoundOnRight(G, nonTerm, LHS, alt.rhs, i))
didchange = true;
}
}
}
}
return didchange;
}
On each pass we go through the list of productions and check if the current production is nullable. A production is considered nullable if that by taking said production one could derive Epsilon. If we can derive Epsilon, then we add X's entire follow set to A's follow set.
bool ComputeFollowSets::symbolFoundOnRight(Grammar& G, Symbol currNonTerm, Symbol currLHS, SymbolString alt, int index) {
bool betaEps = true;
bool didchange = false;
bool realSym = false;
// X -> aAb
for (Symbol Y : alt.subString(index+1)) {
if (Y == ACTSYM)
continue;
realSym = true;
for (Symbol s : G.firsts[Y]) {
if (s != EPS) {
if (G.follow[currNonTerm].insert(s).second)
didchange = true;
}
}
if (G.firsts[Y].find(EPS) == G.firsts[Y].end()) {
betaEps = false;
break;
}
}
//X -> aA
if (!realSym || betaEps) {
int pre = G.follow[currNonTerm].size();
G.follow[currNonTerm].insert(G.follow[currLHS].begin(), G.follow[currLHS].end());
if (pre != G.follow[currNonTerm].size())
didchange = true;
}
return didchange;
}
-
Calculating Follow Sets of a Context Free Grammar
-
Streaming Images from ESP32-CAM for viewing on a CYD-esp32
-
Constructing the Firsts Set of a Context Free Grammar
-
Inchworm Evaluation, Or Evaluating Prefix-Expressions with a Queue
-
Data Structures For Representing Context Free Grammar
-
A B Tree of Binary Search Trees
-
Implementing enhanced for loops in Bytecode
-
Top-Down Deletion for Red/Black Trees
-
Function Closures For Bytecode VMs: Heap Allocated Activation Records & Access Links
-
Pascal & Bernoulli & Floyd: Triangles
Leave A Comment