The Four Stages of Compiling a C Program
Introduction of Compiler Design
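Inside Golang: Compiler Principles (走进Golang之编译器原理)
What Is Actually Inside an Object File? (1) (目标文件里面到底有什么(1)?)
Go: Overview of the Compiler
[Linux] A Brief Walk-through of Compiling, Linking, and Loading (【Linux】编译,链接,装载简单梳理)
How a Go Program Gets Running (Go 程序是怎样跑起来的)
What I Need to Know: Compiling, Linking, Loading, Running (我需要知道:编译、链接、装载、运行)
Compiling, Linking, Loading, and Running a Program (程序的编译、链接、装载与运行)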
Compiler Architecture
1 What is a Compiler?

hello.c
```c
/*
 * "Hello, World!": A classic.
 */
#include <stdio.h>

int
main(void)
{
    puts("Hello, World!");
    return 0;
}
```
- Preprocessing:
  - handles preprocessor directives;
  - pulls in the declarations referenced by `#include`;
  - removes comments.
```c
[lines omitted for brevity]
extern int __vsnprintf_chk (char * restrict, size_t,
       int, size_t, const char * restrict, va_list);
# 493 "/usr/include/stdio.h" 2 3 4
# 2 "hello_world.c" 2
int
main(void) {
    puts("Hello, World!");
    return 0;
}
```
- Compilation: translates the preprocessed source into assembly
```asm
    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 10
    .globl  _main
    .align  4, 0x90
_main:                                  ## @main
    .cfi_startproc
## BB#0:
    pushq   %rbp
Ltmp0:
    .cfi_def_cfa_offset 16
Ltmp1:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Ltmp2:
    .cfi_def_cfa_register %rbp
    subq    $16, %rsp
    leaq    L_.str(%rip), %rdi
    movl    $0, -4(%rbp)
    callq   _puts
    xorl    %ecx, %ecx
    movl    %eax, -8(%rbp)          ## 4-byte Spill
    movl    %ecx, %eax
    addq    $16, %rsp
    popq    %rbp
    retq
    .cfi_endproc
    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "Hello, World!"
.subsections_via_symbols
```
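- Assembly: translates the assembly code into a binary object file (for example, in ELF, the Executable and Linkable Format)
- Linking: combines the object file with the dependency code it references
  - Static linking: dependencies are merged into the executable at build time
  - Dynamic linking: dependencies are resolved when the program is loaded or run
    Advantages:
    1. Saves memory and disk space
    2. Allows hot updates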
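
The four stages above (preprocessing, compilation, assembly, linking) can each be run on their own through a compiler driver. The sketch below only builds the command lines and prints them; it does not execute anything. The driver name `cc`, the helper `stageCommands`, and the intermediate file names are assumptions chosen for illustration, while `-E`, `-S`, and `-c` are the usual gcc/clang flags for stopping after each stage.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// stageCommands returns one compiler-driver invocation per stage for the
// C source file src. The driver name "cc" and the intermediate file names
// are assumptions made for this sketch.
func stageCommands(src string) [][]string {
	base := strings.TrimSuffix(src, filepath.Ext(src))
	return [][]string{
		{"cc", "-E", src, "-o", base + ".i"},         // preprocess only
		{"cc", "-S", base + ".i", "-o", base + ".s"}, // compile to assembly
		{"cc", "-c", base + ".s", "-o", base + ".o"}, // assemble into an object file
		{"cc", base + ".o", "-o", base},              // link into an executable
	}
}

func main() {
	for _, cmd := range stageCommands("hello.c") {
		fmt.Println(strings.Join(cmd, " "))
	}
}
```

Running the printed commands in order should leave `hello.i`, `hello.s`, `hello.o`, and the final `hello` executable side by side, one artifact per stage.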
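Executable and Linkable Format (ELF)
ELF is a common, general-purpose format for object files and executables.

- file header: metadata
- sections (segments)
  - text: code
  - data: data (global variables and local static variables)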
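
To see the header and section layout described above in a real file, Go's standard `debug/elf` package can open an ELF file and list its sections. This is a minimal sketch: the helper name `sectionNames` is hypothetical, and the program assumes it is pointed at an ELF file (on macOS or Windows the executable uses a different format, so it just reports the error).

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// sectionNames opens the ELF file at path and returns the names of its
// sections (for example .text and .data).
func sectionNames(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	names := make([]string, 0, len(f.Sections))
	for _, s := range f.Sections {
		names = append(names, s.Name)
	}
	return names, nil
}

func main() {
	// Inspect the file named on the command line, or this program itself.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err != nil {
		fmt.Println("cannot locate a file to inspect:", err)
		return
	}
	names, err := sectionNames(path)
	if err != nil {
		// For example, the file is Mach-O or PE rather than ELF.
		fmt.Println("not a readable ELF file:", err)
		return
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```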
How a program is executed


- library
  - binary
  - precompiled code
  - contains a set of functions
- dynamic vs. static?
  - static: locked into a program at compile (link) time
  - dynamic: exists as a separate file

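The compilation process
- Build the abstract syntax tree (AST), then check and simplify it;
- Generate intermediate code from the AST, and from that the binary code.
Token list
The scanner produces <token-name, value> pairs: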
```go
package main
const s = "foo"
```
```text
PACKAGE(package)
IDENT(main)
CONST(const)
IDENT(s)
ASSIGN(=)
STRING("foo")
EOF()
```
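
A token list like the one above can be reproduced with the standard `go/scanner` package. The following is a sketch rather than the compiler's own code path: the helper `tokenize` and the file name `example.go` are made up for illustration, and the output format differs slightly from the listing (the scanner prints keywords and operators by their spelling). The real scanner also emits automatically inserted semicolons at line ends, which the sketch skips so the result lines up with the listing.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenize scans src and returns one entry per token: the token name,
// followed by its literal text when that adds information.
func tokenize(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // no error handler, comments skipped

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.SEMICOLON {
			continue // skip semicolons, including auto-inserted ones
		}
		entry := tok.String()
		if lit != "" && lit != entry {
			entry += " " + lit
		}
		out = append(out, entry)
		if tok == token.EOF {
			return out
		}
	}
}

func main() {
	for _, t := range tokenize("package main\nconst s = \"foo\"") {
		fmt.Println(t)
	}
}
```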
Building the AST (abstract syntax tree)

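Type checking
- Checks that the syntax tree obeys the language's rules, for example that a value is assigned to a variable of the correct type.
- Performs some other possible optimizations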
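
As a rough illustration of the type-checking step, the standard `go/types` package can check a parsed file and report a value assigned to the wrong type. This is a sketch using the library type checker, not the compiler's internal pass; the helper `typeCheck` and the file name `example.go` are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// typeCheck parses src and runs the go/types checker over it.
// It returns the first parse or type error, or nil if src is well typed.
func typeCheck(src string) error {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return err // syntax error: there is no AST to check
	}
	conf := types.Config{} // the zero value is enough for a file without imports
	_, err = conf.Check("main", fset, []*ast.File{file}, nil)
	return err
}

func main() {
	// A string constant is fine; assigning a string to an int is not.
	fmt.Println(typeCheck("package main\nconst s = \"foo\"\n"))
	fmt.Println(typeCheck("package main\nvar n int = \"foo\"\n"))
}
```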

```text
while b ≠ 0:
    if a > b:
        a := a - b
    else:
        b := b - a
return a
```
- What is an AST?
  - abstract syntax tree;
  - describes code as a tree;
  - used to generate assembly code.
[Diagram: a node linked to its child nodes]
```go
// Excerpt from Go's go/ast package.
package ast

import "go/token"

// All node types implement the Node interface.
type Node interface {
	Pos() token.Pos // position of first character belonging to the node
	End() token.Pos // position of first character immediately after the node
}

// All expression nodes implement the Expr interface.
type Expr interface {
	Node
	exprNode()
}

// All statement nodes implement the Stmt interface.
type Stmt interface {
	Node
	stmtNode()
}

// All declaration nodes implement the Decl interface.
type Decl interface {
	Node
	declNode()
}
```
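
To connect these interfaces to an actual tree, the sketch below parses the two-line program from the token example with `go/parser` and walks it with `ast.Inspect`, noting for each node whether it is an `Expr`, a `Stmt`, a `Decl`, or only a plain `Node`. The helper `describeNodes` and the file name `example.go` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// describeNodes parses src and returns one line per AST node, in the
// depth-first order used by ast.Inspect, noting which of the Expr, Stmt,
// or Decl interfaces the node implements.
func describeNodes(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	ast.Inspect(file, func(n ast.Node) bool {
		if n == nil {
			return false // Inspect passes nil after a node's children
		}
		kind := "Node"
		switch n.(type) {
		case ast.Expr:
			kind = "Expr"
		case ast.Stmt:
			kind = "Stmt"
		case ast.Decl:
			kind = "Decl"
		}
		out = append(out, fmt.Sprintf("%T (%s)", n, kind))
		return true
	})
	return out, nil
}

func main() {
	lines, err := describeNodes("package main\nconst s = \"foo\"\n")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

For this input the walk visits the file, the package name, the `const` declaration, its value spec, the identifier `s`, and the string literal, so the declaration shows up as a `Decl` and the identifiers and literal as `Expr` nodes.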