一、词法文件结构
flex的词法输入文件.l 结构通过%%
分成三部分
-
声明
-
规则
-
c 代码
声明中 %{
C代码声明 %}
中包裹的内容将直接原样拷贝到生成的C代码中,
最后的段也是直接原样拷贝到生成的C代码中,
中间的规则段,每个规则由 模式
+ 动作
组成,模式一般都是正则表达式书写方式。
二、pg中的词法分析文件
2.1 声明段
src/backend/parser/scan.l
%{
/* LCOV_EXCL_START */
/* Avoid exit() on fatal scanner errors (a bit ugly -- see yy_fatal_error) */
#undef fprintf
#define fprintf(file, fmt, msg) fprintf_to_ereport(fmt, msg)
static void
fprintf_to_ereport(const char *fmt, const char *msg)
{
ereport(ERROR, (errmsg_internal("%s", msg)));
}
/*
* GUC variables. This is a DIRECT violation of the warning given at the
* head of gram.y, ie flex/bison code must not depend on any GUC variables;
* as such, changing their values can induce very unintuitive behavior.
* But we shall have to live with it until we can remove these variables.
*/
int backslash_quote = BACKSLASH_QUOTE_SAFE_ENCODING;
bool escape_string_warning = true;
bool standard_conforming_strings = true;
...
%}
...
%option nodefault
%option noinput
%option nounput
%option noyywrap
%option noyyalloc
%option noyyrealloc
%option noyyfree
%option warn
%option prefix="core_yy"
...
%%
2.2 规则段
{whitespace} {
/* ignore */
}
{xcstart} {
/* Set location in case of syntax error in comment */
SET_YYLLOC();
yyextra->xcdepth = 0;
BEGIN(xc);
/* Put back any characters past slash-star; see above */
yyless(2);
}
<xc>{
{xcstart} {
(yyextra->xcdepth)++;
/* Put back any characters past slash-star; see above */
yyless(2);
}
...
%%
/* LCOV_EXCL_STOP */
2.3 C code 段
/*
* Arrange access to yyextra for subroutines of the main yylex() function.
* We expect each subroutine to have a yyscanner parameter. Rather than
* use the yyget_xxx functions, which might or might not get inlined by the
* compiler, we cheat just a bit and cast yyscanner to the right type.
*/
#undef yyextra
#define yyextra (((struct yyguts_t *) yyscanner)->yyextra_r)
/* Likewise for a couple of other things we need. */
#undef yylloc
#define yylloc (((struct yyguts_t *) yyscanner)->yylloc_r)
#undef yyleng
#define yyleng (((struct yyguts_t *) yyscanner)->yyleng_r)
/*
* scanner_errposition
* Report a lexer or grammar error cursor position, if possible.
*
* This is expected to be used within an ereport() call, or via an error
* callback such as setup_scanner_errposition_callback(). The return value
* is a dummy (always 0, in fact).
*
* Note that this can only be used for messages emitted during raw parsing
* (essentially, scan.l, parser.c, and gram.y), since it requires the
* yyscanner struct to still be available.
*/
int
scanner_errposition(int location, core_yyscan_t yyscanner)
{
int pos;
if (location < 0)
return 0; /* no-op if location is unknown */
/* Convert byte offset to character number */
pos = pg_mbstrlen_with_len(yyextra->scanbuf, location) + 1;
/* And pass it to the ereport mechanism */
return errposition(pos);
}
...
2.4 最终生成的词法分析文件
src/backend/parser/scan.c
通过flex工具,对scan.l文件进行处理,最终将生成scan.c代码,用于后续对sql语言进行词法分析。