> true and false
false
> false or not false
true
> a = true, b = false, c = false
> (b or not a) and c
false
The first step in this task is to make a lexer to tokenize the input -- that is, to convert the input into a sequence of tokens. With lex, a lexer is defined in a special domain-specific language.

Example 1. File: logic.l

%%
not   { printf("NOT: %s\n", yytext); }
or    { printf("OR: %s\n", yytext); }
and   { printf("AND: %s\n", yytext); }
\(    { printf("LPAREN: %s\n", yytext); }
\)    { printf("RPAREN: %s\n", yytext); }
false { printf("FALSE: %s\n", yytext); }
true  { printf("TRUE: %s\n", yytext); }
.     ; // Ignore everything else.
%%
Things to note:

- The lex file contains a rules section, marked at the start and end with %%. In a minimal example like this one, the lex file can contain just a sequence of rules.
- Each rule consists of a regular expression followed by a C block containing the code to execute when the lexer finds input that matches that regular expression.
- Inside that C block, the string that matched the regular expression is available as yytext.
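For instance, a hypothetical rule (not part of logic.l) that matched runs of digits could print the matched text through yytext:

[0-9]+ { printf("NUMBER: %s\n", yytext); }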
Once we have a lex file like the one above, we can use lex to make a C source file from it:
$ lex logic.l
This creates a file called lex.yy.c (1791 lines!), which can then be compiled and run:

$ cc lex.yy.c -o lexer -ll

The -ll flag links the lex runtime library, which supplies a default main (and yywrap), so the generated scanner builds as a standalone program.
$ ./lexer
(true and false)
...
LPAREN: (
TRUE: true
AND: and
FALSE: false
RPAREN: )
^C
In this example, each of the rules in the lexer file prints some text to stdout and doesn't actually return a token type, but when a lex-generated lexer is used inside a parser, the block of code for each rule will typically return an integer representing the token type.
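As a rough sketch of that style, the rules from logic.l could return token codes instead of printing. The TOKEN_* names, their numbering, and the driver main below are assumptions for illustration, not from the original program; in a real yacc/bison setup the codes would come from the generated y.tab.h header and the parser itself would call yylex():

%{
#include <stdio.h>

/* Hypothetical token codes, chosen here for illustration. With
   yacc/bison these would come from the generated y.tab.h header. */
enum { TOKEN_NOT = 258, TOKEN_OR, TOKEN_AND, TOKEN_LPAREN,
       TOKEN_RPAREN, TOKEN_FALSE, TOKEN_TRUE };
%}
%%
not   { return TOKEN_NOT; }
or    { return TOKEN_OR; }
and   { return TOKEN_AND; }
\(    { return TOKEN_LPAREN; }
\)    { return TOKEN_RPAREN; }
false { return TOKEN_FALSE; }
true  { return TOKEN_TRUE; }
.     ; /* Ignore everything else. */
%%
/* Stand-in driver so this file runs on its own: print each token
   code as it is recognized. yylex() returns 0 at end of input. */
int main(void) {
    int tok;
    while ((tok = yylex()) != 0)
        printf("token code: %d\n", tok);
    return 0;
}

Built the same way (lex tokens.l, then cc lex.yy.c -o tokens -ll), each call to yylex() now returns the next token's code, which is exactly the interface a yacc-generated parser expects from its lexer.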