Overview
- TinyC basics
- Frontend
- Backend
TinyC basics
tokens in tinyC
According to their lexical characteristics, the tokens in tinyC are divided into the following three categories.
single char operator ( 15 kinds )
+ * - / % = , ; ! < > ( ) { }
double char operator ( 6 kinds ) and keywords ( 10 kinds )
<= >= == != && ||
void int while if else return break continue print readint
integer constant, string constant, identifier ( variable name and function name ) ( 3 kinds )
numbering principle
- single char op : the token number is the value of its character.
- others : the token numbers are assigned starting from 256.
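To make the numbering principle concrete, here is a minimal sketch in C. Only the starting value 256 and the names T_Int, T_Print and T_Identifier come from this post; the other names and the order of the entries are my assumptions. Starting at 256 keeps the named tokens from colliding with any single-byte character code.

```c
#include <string.h>

/* Hypothetical token numbers: only the starting value 256 comes from the text;
 * most names and the order of the entries are assumptions. */
enum TokenType {
    T_Le = 256, T_Ge, T_Eq, T_Ne, T_And, T_Or,        /* <= >= == != && || */
    T_Void, T_Int, T_While, T_If, T_Else, T_Return,
    T_Break, T_Continue, T_Print, T_ReadInt,          /* the 10 keywords */
    T_IntConstant, T_StringConstant, T_Identifier     /* the 3 remaining kinds */
};

/* Returns the token number of a single-char operator, which is simply its
 * character code, or -1 if c is not one of the 15 single-char operators. */
static int single_char_token(char c)
{
    /* strchr would also match the terminating '\0', so exclude it first. */
    if (c != '\0' && strchr("+*-/%=,;!<>(){}", c) != NULL)
        return (unsigned char)c;
    return -1;
}
```

With this layout a parser can compare a token number directly against a character literal such as '+' for the single-char operators, and against the enum names for everything else.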
notes
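When you are writing the rule for single char op, please pay attention to the character '-' in the RegExp. Inside a character class, an unescaped '-' between two characters denotes a range, so it has to be escaped:
(wrong) : {OPERATOR} {[+-*/%=,;<>(){}]}
(right) : {OPERATOR} {[+\-*/%=,;<>(){}]} <-- '-' must be escaped in the regular expression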
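Why the first form is wrong can be checked with a small sketch in C. This is a simplified model of how a character class body is read, not flex's real parser, and the function name find_reversed_range is made up for illustration. In the wrong version, "+-*" reads as a range from '+' (code 43) to '*' (code 42), which runs backwards.

```c
#include <stddef.h>

/* Scans the body of a character class (the text between '[' and ']').
 * Returns the index of the first unescaped '-' that forms a reversed range
 * (start code greater than end code), or -1 if there is none.
 * A backslash makes the following character a plain literal. */
static int find_reversed_range(const char *body)
{
    int have_prev = 0;          /* is there a literal that could start a range? */
    unsigned char prev = 0;     /* that literal */
    size_t i = 0;

    while (body[i] != '\0') {
        if (body[i] == '-' && have_prev && body[i + 1] != '\0') {
            /* Unescaped '-' between two characters: the range prev-end. */
            size_t j = i + 1;
            if (body[j] == '\\' && body[j + 1] != '\0')
                j++;                        /* range end is an escaped char */
            if (prev > (unsigned char)body[j])
                return (int)i;
            have_prev = 0;                  /* a range end cannot start a new range */
            i = j + 1;
        } else {
            if (body[i] == '\\' && body[i + 1] != '\0')
                i++;                        /* skip the backslash, keep the literal */
            prev = (unsigned char)body[i];
            have_prev = 1;
            i++;
        }
    }
    return -1;
}
```

Calling it on the wrong class body "+-*/%=,;<>(){}" reports the '-' at index 1, while the escaped body (written "+\\-*/%=,;<>(){}" as a C string) yields -1 because the '-' is now an ordinary character.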
Frontend
Grammar of the tools
flex
%{
bison
%{
Details
scanner.l
While implementing scanner.l, I found that the order of the rules matters: a different order can lead to different results or even errors! For example, the first order is like:
"int" { return T_Int; }
"print" { return T_Print; }
...
{IDENTIFIER} { _DUPTEXT; return T_Identifier; }
The second order is like:
{IDENTIFIER} { _DUPTEXT; return T_Identifier; }
...
"int" { return T_Int; }
"print" { return T_Print; }
The second one results in an error! The scanner no longer knows that int is a keyword; instead, it treats int as an identifier. The reason is that when two rules match text of the same length, flex chooses the rule listed first, so {IDENTIFIER} shadows every keyword rule that comes after it. Therefore, you need to put all keyword rules before the identifier rule!
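The effect of rule order can be reproduced with a tiny hand-written matcher in C. This is only a sketch of the matching policy (longest match wins, ties go to the earlier rule), not what flex generates. The identifier pattern [_a-zA-Z][_a-zA-Z0-9]*, the token numbers and the helper names are my assumptions.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Placeholder token numbers, chosen only for this illustration. */
enum { T_Int = 263, T_Print = 270, T_Identifier = 274 };

/* One scanner rule: a literal keyword, or the identifier pattern
 * when literal is NULL. */
struct Rule {
    const char *literal;
    int token;
};

/* Length of the text that rule matches at the start of s (0 = no match). */
static size_t match_len(const struct Rule *rule, const char *s)
{
    if (rule->literal != NULL) {
        size_t n = strlen(rule->literal);
        return strncmp(s, rule->literal, n) == 0 ? n : 0;
    }
    /* Assumed identifier pattern: [_a-zA-Z][_a-zA-Z0-9]* */
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;
    size_t n = 1;
    while (isalnum((unsigned char)s[n]) || s[n] == '_')
        n++;
    return n;
}

/* Returns the token of the winning rule at the start of s, or -1. */
static int first_token(const struct Rule *rules, size_t count, const char *s)
{
    size_t best_len = 0;
    int best_token = -1;
    for (size_t i = 0; i < count; i++) {
        size_t len = match_len(&rules[i], s);
        if (len > best_len) {   /* strict '>' keeps the earlier rule on a tie */
            best_len = len;
            best_token = rules[i].token;
        }
    }
    return best_token;
}

/* The two rule orders discussed above. */
static const struct Rule keywords_first[] = {
    { "int", T_Int }, { "print", T_Print }, { NULL, T_Identifier },
};
static const struct Rule identifier_first[] = {
    { NULL, T_Identifier }, { "int", T_Int }, { "print", T_Print },
};
```

For the input "int x", first_token returns T_Int with keywords_first but T_Identifier with identifier_first. For "integer" both orders return T_Identifier, because the identifier pattern matches more text than the keyword "int" does.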
Backend
NASM
The Netwide Assembler, an assembler for the x86 architecture.