Monday, December 21, 2015

Simple symbol table with yacc

In the last post about yacc, we got yacc to evaluate expressions and print out the evaluated value.

In programs that involve setting and looking up variable values, there is usually some kind of symbol table. At the top of the yacc file, I've added the following global symbol table and related functions:

%{
#include <stdio.h>

#define TABLE_SIZE 26
#define UNSET -1

int SYMBOL_TABLE[TABLE_SIZE];
int lookup(int symbol_num);
void set(int symbol_num, int value);
void reset_table();

%}

In this example, the table symbol is just big enough to fit one variable per letter of the alphabet. The table is expected to be initialized with a special "unset" value before use.

The rule section of the yacc file has been modified to include an "assignment statement" type expression. When an assignment statement is encountered, the symbol table is modified. When a variable is encountered and is being used for its value, the value of the expression is looked up in the symbol table.

%%

start   : stmt { printf($1 ? "true\n" : "false\n"); }
        | expr { printf($1 ? "true\n" : "false\n"); }
        ;

expr    : NOT expr { $$ = ! $2; }
        | expr AND expr { $$ = ($1 && $3); }
        | expr OR expr { $$ = ($1 || $3); }
        | LPAREN expr RPAREN { $$ = $2; }
        | NAME { $$ = lookup($1); }
        | FALSE { $$ = 0; }
        | TRUE { $$ = 1; }
        ;

stmt    : NAME ASSIGN expr { set($1, $3); $$ = $3; }
        ;

%%

We're using two new tokens, so they have to be included in the lex file rules section:

=   { return ASSIGN; }
[a-z] { yylval = yytext[0] - 'a'; return NAME; }

The set and lookup functions still have to be defined, and they can be defined at the bottom of the yacc file after the rules section:

int lookup(int symbol_num) {
  printf("lookup %d\n", symbol_num);
  if (SYMBOL_TABLE[symbol_num] == UNSET ||
      symbol_num < 0 || symbol_num >= TABLE_SIZE) {
    return 0;  // false by default
  }
  return SYMBOL_TABLE[symbol_num];
}

void set(int symbol_num, int value) {
  printf("set %d %d\n", symbol_num, value);
  if (symbol_num < 0 || symbol_num >= TABLE_SIZE) {
    return;  // do nothing by default
  }
  SYMBOL_TABLE[symbol_num] = value;
}

After this is added, the parser also accepts assignment expressions and looks up variables in a table.

> a = not true      
false
> b = (true and true)
true
> a or b
true
> a and b
false

No comments:

Post a Comment