PostgreSQL Source Code (98): How lex and yacc Are Customized to Interact
1 Background 1: LEX %option prefix
PostgreSQL uses %option prefix="core_yy", which renames the following: yy_create_buffer, yy_delete_buffer, yy_flex_debug, yy_init_buffer, yy_flush_buffer, yy_load_buffer_state, yy_switch_to_buffer, yyin, yyleng, yylex, yylineno, yyout, yyrestart, yytext, yywrap, yyalloc, yyrealloc, yyfree.
So the yylex function that lex provides is named core_yylex in PG.
The flex manual describes the option as follows:
‘-PPREFIX, --prefix=PREFIX, %option prefix="PREFIX"’
changes the default ‘yy’ prefix used by flex for all globally-visible variable and function names to instead be ‘PREFIX’. For example, ‘--prefix=foo’ changes the name of yytext to footext. It also changes the name of the default output file from lex.yy.c to lex.foo.c. Here is a partial list of the names affected:
yy_create_buffer
yy_delete_buffer
yy_flex_debug
yy_init_buffer
yy_flush_buffer
yy_load_buffer_state
yy_switch_to_buffer
yyin
yyleng
yylex
yylineno
yyout
yyrestart
yytext
yywrap
yyalloc
yyrealloc
yyfree
(If you are using a C++ scanner, then only yywrap and yyFlexLexer are affected.) Within your scanner itself, you can still refer to the global variables and functions using either version of their name; but externally, they have the modified name.
This option lets you easily link together multiple flex programs into the same executable. Note, though, that using this option also renames yywrap(), so you now must either provide your own (appropriately-named) version of the routine for your scanner, or use %option noyywrap, as linking with ‘-lfl’ no longer provides one for you by default.
https://www.cs.virginia.edu/~cr4bd/flex-manual/Code_002dLevel-And-API-Options.html#Code_002dLevel-And-API-Options
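The renaming described above can be mimicked in plain C. The sketch below is a simplified illustration, not flex output: it assumes the rename is done with `#define` aliases (roughly what a generated scanner does), and the token number 258, the string "select" and the helper check_prefixed_names are all made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Inside the "scanner" the familiar yy* names are aliases for the
 * prefixed symbols, as with %option prefix="core_yy". */
#define yylex  core_yylex
#define yytext core_yytext

const char *yytext = "";          /* this actually defines core_yytext */

int yylex(void)                   /* this actually defines core_yylex */
{
    yytext = "select";            /* pretend we matched a keyword */
    return 258;                   /* hypothetical token number */
}

#undef yylex
#undef yytext

/* Outside the scanner only the prefixed names exist. */
void check_prefixed_names(void)
{
    assert(core_yylex() == 258);
    assert(strcmp(core_yytext, "select") == 0);
}
```

Because the linker only ever sees core_yylex and core_yytext, a second scanner with a different prefix could live in the same executable without symbol clashes.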
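2 Background 2: YACC %name-prefix
Both lex and yacc accept a prefix option that renames their built-in functions and variables, so that several parsers can coexist in one code base.
PostgreSQL uses %name-prefix="base_yy", which renames the following: yyparse, yylex, yyerror, yynerrs, yylval, yylloc, yychar and yydebug.
So the yylex function that the yacc-generated parser calls is actually base_yylex.
But lex provides core_yylex while yacc calls base_yylex, so how does the parser reach core_yylex? Section 3 answers this.
For reference, the Bison manual says: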
The renamed symbols include yyparse, yylex, yyerror, yynerrs, yylval, yylloc, yychar and yydebug.
If you use a push parser, yypush_parse, yypull_parse, yypstate, yypstate_new and yypstate_delete will also be renamed. The renamed macros include YYSTYPE, YYLTYPE, and YYDEBUG, which is treated specifically — more about this below.
https://www.gnu.org/software/bison/manual/bison.html#Multiple-Parsers
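The mismatch between the two prefixes can be sketched in a few lines of C. This is a toy under stated assumptions: the token numbers, the fixed token stream and the token-counting "parser" are invented for illustration (a real yyparse returns a status code, not a count), and only the names core_yylex, base_yylex and base_yyparse come from the text above.

```c
#include <assert.h>

/* Hypothetical stand-in for the scanner entry point that flex would
 * generate under %option prefix="core_yy". */
int core_yylex(void)
{
    static const int tokens[] = {258, 259, 0};   /* 0 marks end of input */
    static int next = 0;
    int tok = tokens[next];

    if (tok != 0)
        next++;
    return tok;
}

/* Hand-written bridge: the name the parser expects, forwarding to the
 * name the scanner provides. */
int base_yylex(void)
{
    return core_yylex();
}

/* A rename of this kind is what %name-prefix="base_yy" produces. */
#define yylex   base_yylex
#define yyparse base_yyparse

/* Toy "parser": counts tokens until end of input. */
int yyparse(void)
{
    int count = 0;

    while (yylex() != 0)          /* resolves to base_yylex() */
        count++;
    return count;
}

#undef yylex
#undef yyparse

void check_parser_reaches_scanner(void)
{
    assert(base_yyparse() == 2);  /* two tokens before end of input */
}
```

The point of the bridge function is that the two generated files never need to agree on a name; the hand-written wrapper is the only place that knows both.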
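3 yylex and yyparse
- yyparse is the yacc entry point; in PG it is called from raw_parser.
- yylex is the lex entry point. On the yacc side, PG defines a custom base_yylex function, which calls core_yylex to enter the lexer and fetch each token and its value.
3.1 Parameters of yylex and yyparse
First, note several important declarations in gram.y: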
gram.y
%pure-parser
%name-prefix="base_yy"
%locations
%parse-param {core_yyscan_t yyscanner}
%lex-param {core_yyscan_t yyscanner}
How these declarations shape the function signatures:
Declaration | yylex | yyparse |
---|---|---|
Initial form | int yylex() | int yyparse() |
After adding %parse-param and %lex-param | int yylex(core_yyscan_t yyscanner) | int yyparse(core_yyscan_t yyscanner) |
After adding %pure-parser | int yylex(YYSTYPE *lvalp, core_yyscan_t yyscanner) | int yyparse(core_yyscan_t yyscanner) |
After adding %locations | int yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner) | int yyparse(core_yyscan_t yyscanner) |
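What is YYSTYPE: in a traditional (non-pure) parser, the token's value is stored in the global variable yylval, whose type is YYSTYPE. With %pure-parser there is no global; the value is written through the lvalp argument instead.
What is YYLTYPE: likewise, the token's location is stored in the global variable yylloc, whose type is YYLTYPE; in a pure parser it is written through the llocp argument.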
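To make the final signature in the table concrete, here is a minimal hand-written lexer in the same calling style: value, location and scanner state all arrive as arguments and nothing is global. Everything in it is hypothetical (the TOY_ types, toy_scanner, the token numbers); it is a sketch of the calling convention, not of PostgreSQL's scanner.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef union { int ival; } TOY_YYSTYPE;       /* hypothetical value type */
typedef int TOY_YYLTYPE;                       /* hypothetical: byte offset */
typedef struct { const char *buf; size_t pos; } toy_scanner;

enum { TOY_EOF = 0, TOY_ICONST = 258 };

/* Pure-style lexer: no globals, all state is passed in. */
int toy_yylex(TOY_YYSTYPE *lvalp, TOY_YYLTYPE *llocp, toy_scanner *scanner)
{
    const char *s = scanner->buf;

    while (isspace((unsigned char) s[scanner->pos]))
        scanner->pos++;
    if (s[scanner->pos] == '\0')
        return TOY_EOF;

    *llocp = (TOY_YYLTYPE) scanner->pos;       /* where the token starts */
    if (isdigit((unsigned char) s[scanner->pos]))
    {
        int v = 0;

        while (isdigit((unsigned char) s[scanner->pos]))
            v = v * 10 + (s[scanner->pos++] - '0');
        lvalp->ival = v;                       /* semantic value */
        return TOY_ICONST;
    }
    return (unsigned char) s[scanner->pos++];  /* single-character token */
}

void toy_demo(void)
{
    toy_scanner sc = {"12 + 7", 0};
    TOY_YYSTYPE val;
    TOY_YYLTYPE loc;

    assert(toy_yylex(&val, &loc, &sc) == TOY_ICONST && val.ival == 12 && loc == 0);
    assert(toy_yylex(&val, &loc, &sc) == '+' && loc == 3);
    assert(toy_yylex(&val, &loc, &sc) == TOY_ICONST && val.ival == 7 && loc == 5);
    assert(toy_yylex(&val, &loc, &sc) == TOY_EOF);
}
```

Since each call receives its own scanner state, two independent inputs can be tokenized at the same time, which is the property a pure parser is after.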
Bison compiles the %union declaration in gram.y into the union YYSTYPE, whose first member is core_YYSTYPE core_yystype:
union YYSTYPE
{
#line 230 "gram.y"
core_YYSTYPE core_yystype;
/* these fields must match core_YYSTYPE: */
int ival;
char *str;
const char *keyword;
char chr;
bool boolean;
...
...
#line 604 "gram.h"
};
3.2 Call flow between yyparse and base_yylex
base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
...
cur_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
...
Here lvalp and &(lvalp->core_yystype) are the same address, because YYSTYPE is a union:
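The address equality follows from C's rule that every member of a union starts at the union's own address. The standalone sketch below checks it with simplified stand-ins: the toy_ types and functions are hypothetical and much smaller than the real YYSTYPE, and the values 42 and 258 are arbitrary.

```c
#include <assert.h>

/* Simplified stand-in for the scanner's value type. */
typedef union
{
    int         ival;
    char       *str;
    const char *keyword;
} toy_core_YYSTYPE;

/* Simplified stand-in for the parser's value type. */
typedef union
{
    toy_core_YYSTYPE core_yystype;
    int         ival;
    char       *str;
    const char *keyword;
    char        chr;
} toy_YYSTYPE;

/* "Scanner": only knows about the core type. */
int toy_core_yylex(toy_core_YYSTYPE *lvalp)
{
    lvalp->ival = 42;
    return 258;                   /* hypothetical token number */
}

/* "Parser-side" wrapper: passes down a pointer to the embedded member. */
int toy_base_yylex(toy_YYSTYPE *lvalp)
{
    /* Same storage, viewed through two types. */
    assert((void *) lvalp == (void *) &lvalp->core_yystype);
    return toy_core_yylex(&(lvalp->core_yystype));
}
```

After toy_base_yylex returns, the parser can read the value as lvalp->ival directly, because the scanner wrote into the very same bytes.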