Go解析

SYJ-Re 看雪学苑 2022-07-01

收录于合集 #“雪花”创作激励计划 97个

本文为看雪论坛精华‍‍‍文章
看雪论坛作者ID：SYJ-Re

在IDA7.5有插件支持反编译Go， IDA7.6已经支持的情况下，为什么写这篇文章呢？就算能反编译, 也是需要对照汇编的，对于反编译出来的很多函数，在调试时也要很快的知道哪些才是真正存储数据的地方，增加分析效率。

一

Ready

Go源码下载：https://github.com/golang/go

main_main查找

这里简单提一下main_main的查找：

*有符号，直接ctrl+f，IDA中搜索main_main即可

*无符号：

字符串查找

runtime_main调用main_main

runtime_mainIDA反编译代码参考

void runtime_main(){ PVOID ArbitraryUserPointer; // rax __int64 v1; // rdx __int64 i; // rax __int64 v3; // [rsp+0h] [rbp-50h] __int64 v4; // [rsp+0h] [rbp-50h] __int64 v5; // [rsp+0h] [rbp-50h] __int64 v6; // [rsp+10h] [rbp-40h] __int64 v7; // [rsp+20h] [rbp-30h] BYREF __int64 v8; // [rsp+28h] [rbp-28h] __int64 v9; // [rsp+30h] [rbp-20h] __int128 v10; // [rsp+38h] [rbp-18h] void *retaddr; // [rsp+50h] [rbp+0h] BYREF while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) ) runtime_morestack_noctxt(); v10 = 0LL; HIBYTE(v7) = 0; v9 = *(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer; *(_QWORD *)(**(_QWORD **)(v9 + 48) + 304LL) = 0LL; runtime_maxstacksize = 1000000000LL; runtime_maxstackceiling = 2000000000LL; runtime_mainStarted = 1; _InterlockedExchange((volatile __int32 *)&unk_5683A0, 1); runtime_systemstack((__int64)&off_4D6020); ArbitraryUserPointer = NtCurrentTeb()->NtTib.ArbitraryUserPointer; ++*(_DWORD *)(*(_QWORD *)(*(_QWORD *)ArbitraryUserPointer + 48LL) + 580LL); v1 = *(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer; *(_QWORD *)(*(_QWORD *)(v1 + 48) + 312LL) = v1; *(_QWORD *)(v1 + 216) = *(_QWORD *)(v1 + 48); if ( *(_UNKNOWN **)(v9 + 48) != &runtime_m0 ) goto LABEL_28; byte_568678 = 1; runtime_nanotime1(v3); runtime_runtimeInitTime = v4; if ( !v4 ) {LABEL_27: runtime_throw((__int64)"nanotime returning zero", 23LL);LABEL_28: runtime_throw((__int64)"runtime.main not on m0", 22LL); runtime_deferreturn(v5); return; } if ( dword_5AB048 ) { qword_5AAE88 = *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 152LL); runtime_inittrace = 1; } runtime_doInit((__int64)&runtime__inittask); HIWORD(v7) = 257; *((_QWORD *)&v10 + 1) = &off_4D6028; *(_QWORD *)&v10 = (char *)&v7 + 6; runtime_gcenable(); v6 = runtime_makechan((__int64)&unk_4B2580, 0LL); if ( runtime_writeBarrier ) runtime_gcWriteBarrier(); else runtime_main_init_done = v6; if ( !runtime_iscgo ) goto LABEL_12; if ( !cgo_thread_start ) {LABEL_26: runtime_throw((__int64)"_cgo_thread_start missing", 25LL); goto LABEL_27; } if ( !cgo_notify_runtime_init_done ) { runtime_throw((__int64)"_cgo_notify_runtime_init_done missing", 37LL); goto LABEL_26; } runtime_startTemplateThread(); runtime_cgocall(cgo_notify_runtime_init_done, 0LL, v6);LABEL_12: runtime_doInit((__int64)&main__inittask); runtime_inittrace = 0; runtime_closechan(runtime_main_init_done); BYTE6(v7) = 0; runtime_unlockOSThread(); if ( !runtime_isarchive && !runtime_islibrary ) { main_main(); if ( runtime_runningPanicDefers ) { for ( i = 0LL; i < 1000 && runtime_runningPanicDefers; i = v8 + 1 ) { v8 = i; runtime_mcall(); } } if ( runtime_panicking ) v7 = runtime_gopark(0LL, 0LL, 4104, 1LL); runtime_exit(0); while ( 1 ) MEMORY[0] = 0; } HIBYTE(v7) = 0; runtime_main_func2(v10);}

可以利用查找那些runtime_main中的字符串在无符号的Go程序中找到runtime_main，从而找到main_main。

编译一份有符号的，同位数的Go可执行程序bindiff恢复符号找到runtime_main或一些关键函数。

notice：本文的数据均在64位环境下得出。

可关掉编译优化

-gcflags "-N -l" go build -o hello.exe -gcflags "-N -l" hello.go

二

Go数据结构解析

数值类型

字符串

Go程序会将字符串全部存放到一块连续的内存当中，使用的时候会利用runtime_convTstring来构造真正的字符串内存格式。

struct String{ char * strPtr; //指向目标字符串的开始地址 int64 size; //字符串大小}

数组

数组有三种存储位置，当数组内元素较少时可以直接存于栈上，较多时存于数据区，而当数据会被返回时会存于堆上。

func ArrDemo() *[3]int { a := [...]int{1, 2, 3} b := [...]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7} c := [...]int{1, 2, 3} if len(a) < len(b) {return &c} return nil }

变量a的汇编如下，它直接在栈上定义并初始化：

type arrayHeader struct { Data uintptr Len int}

变量b的汇编如下，它的初始值被定义在了数据段并进行拷贝初始化：

数据拷贝函数：

变量c的汇编如下，尽管它和a的值一样，但是它的地址是runtime.newobject函数新返回的，该函数传入的是数据的类型指针，它将在堆上申请空间存放对象实例，返回的是新的对象指针：

经常会遇到的情况是返回一个结构体变量，然后将其赋值给newobject申请的新变量上。

下面也是这类情况：

x := []int{1, 2, 3, 4, 5}

通常会有一个地址作为runtime_newobject的参数，这个地址存放的是我们将要构造的数组的总大小，比如这里qword_234820地址处存放的就是40(5 * 8)

runtime_newobject里面会调用runtime_mallocgc分配内存。

再之后会将runtime_newobject返回的指向这块内存的指针，存放到当前函数栈中的一个局部变量中，以后对这个数组的操作都是通过这个局部变量来的，比如这里的rsp+168。

再下方就是对那块内存赋值, 也就是对数组初始化

y := []float32{1000.0, 2.0, 3.4, 7.0, 50.0}

slice切片

内存结构

一个slice是一个数组某个部分的引用。

在内存中，它是一个包含3个域的结构体：

{ 指向slice中第一个元素的指针 slice的长度 slice的容量}

这里解释一下长度和容量的概念：

长度：下标操作的上界，如x[i]中i必须小于长度

容量：分割操作的上界，如x[i:j]中j不能大于容量

数组的slice并不会实际复制一份数据，它只是创建一个新的数据结构，包含了另外的一个指针，一个长度和一个容量数据。

比如上面的分割表达式x[1:3]并不分配更多的数据：它只是写了一个新的slice结构的属性来引用相同的存储数据。

在例子中，长度为2，只有y[0]和y[1]是有效的索引，但是容量为4，y[0:4]是一个有效的分割表达式。

这里先创建一个数组，然后rax存放的是构造好的数组的首地址，add rax, 8之后就会指向数组的下标1对应的元素，也就是3存放的地址，也就是我们切片[1:3]的第一个元素的指针，之后

.text:00000000004A79A8 mov [rsp], rax
.text:00000000004A79AC mov qword ptr [rsp+8], 2
.text:00000000004A79B5 mov qword ptr [rsp+16], 4

这里就在内存中构造出来了一个切片的结构。

调用了runtime_convTslice，返回值是一个指向新分配的内存的指针，这个新分配的内存存放的元素数量就是切片的容量，比如这里就存放了4个元素，[3, 5, 7, 11] (从切片的那个元素开始，往后取容量那么多个)，所以这里就是为什么y[0:4]是一个有效的分割表达式。

make

slice := make([]int, len)

slice1 := make([]int, 5)slice1[3] = 66

make的底层会调用runtime_makeslice分配Array，真正的切片结构是后面的runtime_convTslice时创建的。

// runtime_makeslicefunc makeslice(et *_type, len, cap int) unsafe.Pointer { mem, overflow := math.MulUintptr(et.size, uintptr(cap)) if overflow || mem > maxAlloc || len < 0 || len > cap { // NOTE: Produce a 'len out of range' error instead of a // 'cap out of range' error when someone does make([]T, bignumber). // 'cap out of range' is true too, but since the cap is only being // supplied implicitly, saying len is clearer. // See golang.org/issue/4085. mem, overflow := math.MulUintptr(et.size, uintptr(len)) if overflow || mem > maxAlloc || len < 0 { panicmakeslicelen() } panicmakeslicecap() } return mallocgc(mem, et, true)}

makeslice返回的是一个指向实际数据的指针（不含管理slice的结构体）相当于 malloc(sizeof(Type) * len)。

在访问slice中元素时，一般会检测下标是否小于len，如果越界则调用runtime_panicIndex

append/copy

slice1 = append(slice1, 123)

append 的时候会检测目标 slice1.len + 1 与 slice1.cap 的大小关系。

若 slice1.len + 1 > slice1.cap 则调用runtime_growslice扩容。

在对slice进行append等操作时，可能会造成slice的自动扩容。其扩容时的大小增长规则是：

如果新的大小是当前大小2倍以上，则大小增长为新大小
否则循环以下操作：如果当前大小小于1024，按每次2倍增长，否则每次按当前大小1/4增长。直到增长的大小超过或等于新大小。

copy 就是复制一个新的切片。

切片截取

myvar := slice1[1:3]

myvar的数据结构是新一个新的切片 struct。

map

内存结构

Go语言的map使用Hash表作为底层实现，可以在 $GOROOT/src/pkg/runtime/hashmap.goc 找到它的实现，其中 [runtime.hmap](https://draveness.me/golang/tree/runtime.hmap)是最核心的结构体，我们先来了解一下该结构体的内部字段：

type hmap struct { count int //map中键值对数量 flags uint8 //map当前是否处于写入状态登 B uint8 //2的B次幂表示当前map中桶的数量 noverflow uint16 //map中溢出桶的数量，当溢出桶太多时，map会进行等量扩容 hash0 uint32 //生成hash的随机数种子 buckets unsafe.Pointer //当前map对应的桶的指针

    oldbuckets unsafe.Pointer  //map扩容时指向旧桶的指针，当所有旧桶中的数据转移到新桶时，清空

    nevacuate  uintptr         //扩容时，用于标记当前旧桶中小于nevacute的数据都已经转移到了新桶

extra *mapextra //存储map的溢出桶} type mapextra struct { overflow *[]*bmap oldoverflow *[]*bmap nextOverflow *bmap}

这个hash结构使用的是一个可扩展哈希的算法，由hash值mod当前hash表大小决定某一个值属于哪个桶，而hash表大小是2的指数，即上面结构体中的2^B。每次扩容，会增大到上次大小的两倍。

结构体中有一个buckets和一个oldbuckets是用来实现增量扩容的。正常情况下直接使用buckets，而oldbuckets为空。如果当前哈希表正在扩容中，则oldbuckets不为空，并且buckets大小是oldbuckets大小的两倍。

具体的Bucket结构如下所示：

type bmap struct { tophash [8]uint8 //存储Hash值的高8位 data []byte //key value数据：key/key/key.../value/value/value... overflow *bmap //溢出bucket的地址}

BUCKETSIZE是用宏定义的8，每个bucket中存放最多8个key/value对, 如果多于8个，那么会申请一个新的bucket，并将它与之前的bucket链起来。

按key的类型采用相应的hash算法得到key的hash值。将hash值的低位当作Hmap结构体中buckets数组的index，找到key所在的bucket。将hash的高8位存储在了bucket的tophash中。注意，这里高8位不是用来当作key/value在bucket内部的offset的，而是作为一个主键，在查找时对tophash数组的每一项进行顺序匹配的。先比较hash值高位与bucket的tophash[i]是否相等，如果相等则再比较bucket的第i个的key与所给的key是否相等。如果相等，则返回其对应的value，反之，在overflow buckets中按照上述方法继续寻找。

data区存放的是key—value数据，其中keys放在一起，values放在一起，如此存储是为了节省字节对齐带来的空间浪费。例如map[int64]int8。

overflow指针指向的是下一个bucket，据此将所有冲突的键连接起来

逆向中map的常见函数

字典操作常见的会转换为如下函数：一般fastrand和makemap连用返回一个map，它为一个指针。

读字典时使用mapaccess1和mapaccess2，写字典时会使用mapassign函数，它返回一个地址，将value写入该地址，另外还比较常见的是对字典进行遍历，会使用mapiterinit和mapiternext配合。

func fastrand() uint32 func makemap(mapType *byte, hint int, mapbuf *any) (hmap map[any]any) func mapaccess1(mapType *byte, hmap map[any]any, key *any) (val *any) func mapaccess2(mapType *byte, hmap map[any]any, key *any) (val *any, pres bool) func mapassign(mapType *byte, hmap map[any]any, key *any) (val *any) func mapiterinit(mapType *byte, hmap map[any]any, hiter *any) func mapiternext(hiter *any)

mapaccess和mapassign的第一个参数代表字典的类型，因此能很容易知道字典操作参数和返回值的类型。

赋值和访问

赋值

func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer{}

3个参数，第三个参数是key的指针，rsp + 16，返回值是key对应的数据指针。

访问

func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {}

和赋值同理，区别是这个不会为不存在的 key 创建 key pair。

package main import "fmt" func main() { countryCapitalMap := map[string]string{"France": "Paris", "Italy": "Rome", "Japan": "Tokyo", "India": "New delhi"} fmt.Println("France首都是", countryCapitalMap ["France"])}

比如上方在执行完：

之后，map buckets是存放了一个键值对的，可以过去查看v5(h *hmap) ，0x000000C00014FE38来查看。

可以看见buckets unsafe.Pointer为byte_C00014FE68

前面8字节的是tophash数组，目前只存放了一个键值对，就只有一个存放了，hash(”France”)的高8bit，往下就是存放键值对。

package main import "fmt" func main() { var countryCapitalMap map[string]string /*创建集合, 默认map是nil*/ //如果不初始化 map，那么就会创建一个 nil map, nil map 不能用来存放键值对 countryCapitalMap = make(map[string]string) countryCapitalMap [ "France" ] = "巴黎" countryCapitalMap [ "Italy" ] = "罗马" countryCapitalMap [ "Japan" ] = "东京" countryCapitalMap [ "India " ] = "新德里"

    //或者：countryCapitalMap := map[string]string{"France": "Paris", "Italy": "Rome", "Japan": "Tokyo", "India": "New delhi"}

/*使用键输出地图值 */ for country := range countryCapitalMap { fmt.Println(country, "首都是", countryCapitalMap [country]) }}

void __cdecl main_main(){ int v0; // [rsp+10h] [rbp-230h] __int64 v1; // [rsp+20h] [rbp-220h] __int64 v2; // [rsp+20h] [rbp-220h] __int64 v3; // [rsp+20h] [rbp-220h] __int64 v4; // [rsp+20h] [rbp-220h] _QWORD *v5; // [rsp+30h] [rbp-210h] __int64 *v6; // [rsp+30h] [rbp-210h] __int64 v7; // [rsp+38h] [rbp-208h] __int64 v8; // [rsp+40h] [rbp-200h] __int64 v9; // [rsp+48h] [rbp-1F8h] __int64 v10; // [rsp+50h] [rbp-1F0h] __int64 v11; // [rsp+58h] [rbp-1E8h] __int64 v12; // [rsp+60h] [rbp-1E0h] _QWORD v13[5]; // [rsp+68h] [rbp-1D8h] BYREF __int64 v14; // [rsp+90h] [rbp-1B0h] BYREF __int128 v15; // [rsp+98h] [rbp-1A8h] BYREF __int128 v16; // [rsp+A8h] [rbp-198h] __int128 v17; // [rsp+B8h] [rbp-188h] __int64 v18[12]; // [rsp+C8h] [rbp-178h] BYREF char v19; // [rsp+128h] [rbp-118h] BYREF while ( (unsigned __int64)&v14 <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) ) runtime_morestack_noctxt(); v15 = 0LL; v16 = 0LL; v17 = 0LL; ((void (*)(void))loc_46651C)(); *(_QWORD *)&v16 = &v19; runtime_fastrand(); HIDWORD(v15) = v0; runtime_mapassign_faststr((__int64)&unk_4B7040, (__int64)&v15, (__int64)"France", 6LL); v5[1] = 6LL; if ( runtime_writeBarrier ) runtime_gcWriteBarrier(); else *v5 = "巴黎"; runtime_mapassign_faststr((__int64)&unk_4B7040, (__int64)&v15, (__int64)"Italy", 5LL); v5[1] = 6LL; if ( runtime_writeBarrier ) runtime_gcWriteBarrier(); else *v5 = "罗马"; runtime_mapassign_faststr((__int64)&unk_4B7040, (__int64)&v15, (__int64)"Japan", 5LL); v5[1] = 6LL; if ( runtime_writeBarrier ) runtime_gcWriteBarrier(); else *v5 = "东京"; v8 = runtime_mapassign_faststr((__int64)&unk_4B7040, (__int64)&v15, (__int64)"India ", 6LL); v5[1] = 9LL; if ( runtime_writeBarrier ) runtime_gcWriteBarrier(); else *v5 = "新德里"; ((void (*)(void))loc_466551)(); runtime_mapiterinit((__int64)&unk_4B7040, &v15, (__int64)v18); while ( v18[0] ) { v11 = *(_QWORD *)v18[0]; v10 = *(_QWORD *)(v18[0] + 8); runtime_convTstring(*(_QWORD *)v18[0], v10, v1); v12 = v2; v9 = runtime_mapaccess1_faststr((__int64)&unk_4B7040, (__int64)&v15, v11, v10, (__int64)v5); v7 = runtime_convTstring(*v6, v6[1], v3); v13[0] = &unk_4B3260; v13[1] = v12; v13[2] = &unk_4B3260; v13[3] = &off_4EF518; v13[4] = &unk_4B3260; v14 = v4; fmt_Fprintln((__int64)&go_itab__os_File_io_Writer, os_Stdout, (__int64)v13, 3LL, 3LL, v7, v8, v9); runtime_mapiternext((__int64)v18); }}

查找过程

根据key计算出hash值。
如果存在old table, 首先在old table中查找，如果找到的bucket已经evacuated，转到步骤3。反之，返回其对应的value。
在new table中查找对应的value。

这里一个细节需要注意一下。不认真看可能会以为低位用于定位bucket在数组的index，那么高位就是用于key/valule在bucket内部的offset。事实上高8位不是用作offset的，而是用于加快key的比较的。

do { //对每个桶b //依次比较桶内的每一项存放的tophash与所求的hash值高位是否相等 for(i = 0, k = b->data, v = k + h->keysize * BUCKETSIZE; i < BUCKETSIZE; i++, k += h->keysize, v += h->valuesize) { if(b->tophash[i] == top) { k2 = IK(h, k); t->key->alg->equal(&eq, t->key->size, key, k2); if(eq) { //相等的情况下再去做key比较... *keyp = k2; return IV(h, v); } } } b = b->overflow; //b设置为它的下一下溢出链} while(b != nil);

插入过程

根据key算出hash值，进而得出对应的bucket。
如果bucket在old table中，将其重新散列到new table中。
在bucket中，查找空闲的位置，如果已经存在需要插入的key，更新其对应的value。
根据table中元素的个数，判断是否grow table。
如果对应的bucket已经full，重新申请新的bucket作为overbucket。
将key/value pair插入到bucket中。

这里也有几个细节需要注意一下。

在扩容过程中，oldbucket是被冻结的，查找时会在oldbucket中查找，但不会在oldbucket中插入数据。如果在oldbucket是找到了相应的key，做法是将它迁移到新bucket后加入evalucated标记。并且还会额外的迁移另一个pair。

然后就是只要在某个bucket中找到第一个空位，就会将key/value插入到这个位置。也就是位置位于bucket前面的会覆盖后面的(类似于存储系统设计中做删除时的常用的技巧之一，直接用新数据追加方式写，新版本数据覆盖老版本数据)。

找到了相同的key或者找到第一个空位就可以结束遍历了。不过这也意味着做删除时必须完全的遍历bucket所有溢出链，将所有的相同key数据都删除。所以目前map的设计是为插入而优化的，删除效率会比插入低一些。

删除过程

删除元素实际上也是先查找元素，如果元素存在则把元素从相应的bucket中删除，如果不存在则什么也不做。

notice

Go调用汇编和C：https://tiancaiamao.gitbooks.io/go-internals/content/zh/03.1.html

三

G,M,P模型

在讲解函数调用之前，我们讲解一下GMP结构体。

并发：一个逻辑流的执行在时间上与另一个流重叠，叫并发流，多个流并发地执行被成为并发。

并行：如果两个流并发地运行在不同的处理器核或者计算机上，那么我们成为它们为并行地执行。

并行流是并发流的真子集。

多线程或多进程是并行的基本条件，但单线程也可以用协程(coroutine)做到并发。简单将Goroutine归纳为协程并不合适，因为它运行时会创建多个线程来执行并发任务，且任务单元可被调度到其它线程执行。这更像是多线程和协程的结合体，能最大限度提升执行效率，发挥多核处理器能力。

Go的并发实现非常简单，使用一个go关键字即可

package main import ( "fmt" "time") func say(s string) { for i := 0; i < 5; i++ { time.Sleep(100 * time.Millisecond) fmt.Println(s) }} func main() { go say("world") say("hello")}

Go语言虽然使用一个Go关键字即可实现并发编程，但Goroutine被调度到后端之后，具体的实现比较复杂。Go调度器组成。

G

G是Goroutine的缩写，相当于操作系统中的进程控制块，在这里就是Goroutine的控制结构，是对Goroutine的抽象。其中包括执行的函数指令及参数；G保存的任务对象；线程上下文切换，现场保护和现场恢复需要的寄存器(SP、IP)等信息。

type g struct { // Stack parameters. // stack describes the actual stack memory: [stack.lo, stack.hi). // stackguard0 is the stack pointer compared in the Go stack growth prologue. // It is stack.lo+StackGuard normally, but can be StackPreempt to trigger a preemption. // stackguard1 is the stack pointer compared in the C stack growth prologue. // It is stack.lo+StackGuard on g0 and gsignal stacks. // It is ~0 on other goroutine stacks, to trigger a call to morestackc (and crash). // 记录该goroutine使用的栈 stack stack // offset known to runtime/cgo //下面两个成员用于栈溢出检查，实现栈的自动伸缩，抢占调度也会用到stackguard0 stackguard0 uintptr // offset known to liblink stackguard1 uintptr // offset known to liblink _panic *_panic // innermost panic - offset known to liblink _defer *_defer // innermost defer // 此goroutine正在被哪个工作线程执行 m *m // current m; offset known to arm liblink //这个字段跟调度切换有关，G切换时用来保存上下文，保存什么，看下面gobuf结构体 sched gobuf syscallsp uintptr // if status==Gsyscall, syscallsp = sched.sp to use during gc syscallpc uintptr // if status==Gsyscall, syscallpc = sched.pc to use during gc stktopsp uintptr // expected sp at top of stack, to check in traceback param unsafe.Pointer // passed parameter on wakeup，wakeup唤醒时传递的参数 // 状态Gidle,Grunnable,Grunning,Gsyscall,Gwaiting,Gdead atomicstatus uint32 stackLock uint32 // sigprof/scang lock; TODO: fold in to atomicstatus goid int64 //schedlink字段指向全局运行队列中的下一个g， //所有位于全局运行队列中的g形成一个链表 schedlink guintptr waitsince int64 // approx time when the g become blocked waitreason waitReason // if status==Gwaiting，g被阻塞的原因 //抢占信号，stackguard0 = stackpreempt，如果需要抢占调度，设置preempt为true preempt bool // preemption signal, duplicates stackguard0 = stackpreempt paniconfault bool // panic (instead of crash) on unexpected fault address preemptscan bool // preempted g does scan for gc gcscandone bool // g has scanned stack; protected by _Gscan bit in status gcscanvalid bool // false at start of gc cycle, true if G has not run since last scan; TODO: remove? throwsplit bool // must not split stack raceignore int8 // ignore race detection events sysblocktraced bool // StartTrace has emitted EvGoInSyscall about this goroutine sysexitticks int64 // cputicks when syscall has returned (for tracing) traceseq uint64 // trace event sequencer tracelastp puintptr // last P emitted an event for this goroutine // 如果调用了 LockOsThread，那么这个 g 会绑定到某个 m 上 lockedm muintptr sig uint32 writebuf []byte sigcode0 uintptr sigcode1 uintptr sigpc uintptr // 创建这个goroutine的go表达式的pc gopc uintptr // pc of go statement that created this goroutine

        ancestors      *[]ancestorInfo // ancestor information goroutine(s) that created this goroutine (only used if debug.tracebackancestors)

startpc uintptr // pc of goroutine function racectx uintptr

        waiting        *sudog         // sudog structures this g is waiting on (that have a valid elem ptr); in lock order

cgoCtxt []uintptr // cgo traceback context labels unsafe.Pointer // profiler labels timer *timer // cached timer for time.Sleep,为 time.Sleep 缓存的计时器 selectDone uint32 // are we participating in a select and did someone win the race? // Per-G GC state // gcAssistBytes is this G's GC assist credit in terms of // bytes allocated. If this is positive, then the G has credit // to allocate gcAssistBytes bytes without assisting. If this // is negative, then the G must correct this by performing // scan work. We track this in bytes to make it fast to update // and check for debt in the malloc hot path. The assist ratio // determines how this corresponds to scan work debt. gcAssistBytes int64}

stack 描述了当前 goroutine 的栈内存范围[stack.lo, stack.hi)，其中 stack 的数据结构：

// Stack describes a Go execution stack.// The bounds of the stack are exactly [lo, hi),// with no implicit data structures on either side.// 描述 goroutine 执行栈// 栈边界为[lo, hi)，左包含右不包含，即 lo≤stack<hi// 两边都没有隐含的数据结构。type stack struct { lo uintptr //栈顶，指向内存低地址 hi uintptr //栈底，指向内存搞地址}

stackguard0 和 stackguard1 均是一个栈指针，用于扩容场景，前者用于 Go stack ，后者用于 C stack。

M

M是一个线程或称为Machine，所有M是有线程栈的。如果不对该线程栈提供内存的话，系统会给该线程栈提供内存(不同操作系统提供的线程栈大小不同)。当指定了线程栈，则M.stack→G.stack，M的PC寄存器指向G提供的函数，然后去执行。

type m struct { /* 1. 所有调用栈的Goroutine,这是一个比较特殊的Goroutine。 2. 普通的Goroutine栈是在Heap分配的可增长的stack,而g0的stack是M对应的线程栈。 3. 所有调度相关代码,会先切换到该Goroutine的栈再执行。 */ g0 *g curg *g // M当前绑定的结构体G // SP、PC寄存器用于现场保护和现场恢复 vdsoSP uintptr vdsoPC uintptr // 省略…}

P

P(Processor)是一个抽象的概念，并不是真正的物理CPU。所以当P有任务时需要创建或者唤醒一个系统线程来执行它队列里的任务。所以P/M需要进行绑定，构成一个执行单元。

P决定了同时可以并发任务的数量，可通过GOMAXPROCS限制同时执行用户级任务的操作系统线程。可以通过runtime.GOMAXPROCS进行指定。在Go1.5之后GOMAXPROCS被默认设置可用的核数，而之前则默认为1。

四

Go调度器调度过程

首先创建一个G对象，G对象保存到P本地队列或者是全局队列。

P此时去唤醒一个M。P继续执行它的执行序。

M寻找是否有空闲的P，如果有则将该G对象移动到它本身。

接下来M执行一个调度循环(调用G对象->执行->清理线程→继续找新的Goroutine执行)。

M执行过程中，随时会发生上下文切换。

当发生上下文切换时，需要对执行现场进行保护，以便下次被调度执行时进行现场恢复。

Go调度器M的栈保存在G对象上，只需要将M所需要的寄存器(SP、PC等)保存到G对象上就可以实现现场保护。当这些寄存器数据被保护起来，就随时可以做上下文切换了，在中断之前把现场保存起来。如果此时G任务还没有执行完，M可以将任务重新丢到P的任务队列，等待下一次被调度执行。当再次被调度执行时，M通过访问G的vdsoSP、vdsoPC寄存器进行现场恢复(从上次中断位置继续执行)。

后面分析函数的开头和结尾的时候可以对照。

五

连续栈

Go语言支持goroutine，每个goroutine需要能够运行，所以它们都有自己的栈。假如每个goroutine分配固定栈大小并且不能增长，太小则会导致溢出，太大又会浪费空间，无法存在许多的goroutine。

为了解决这个问题，goroutine可以初始时只给栈分配很小的空间，然后随着使用过程中的需要自动地增长。这就是为什么Go可以开千千万万个goroutine而不会耗尽内存。

Go1.3版本之后则使用的是continuous stack，下面将具体分析一下这种技术。

基本原理

每次执行函数调用时Go的runtime都会进行检测，若当前栈的大小不够用，则会触发“中断”，从当前函数进入到Go的运行时库，Go的运行时库会保存此时的函数上下文环境，然后分配一个新的足够大的栈空间，将旧栈的内容拷贝到新栈中，并做一些设置，使得当函数恢复运行时，函数会在新分配的栈中继续执行，仿佛整个过程都没发生过一样，这个函数会觉得自己使用的是一块大小“无限”的栈空间。

实际分析

Go语言和C不同，不是使用栈指针寄存器和栈基址寄存器确定函数的栈的。

在Go的运行时库中，每个goroutine对应一个结构体G，大致相当于进程控制块的概念。

这个结构体中存了stackbase和stackguard，用于确定这个goroutine使用的栈空间信息。

每个Go函数调用的前几条指令，先比较栈指针寄存器跟g->stackguard(偏移0x10)，检测是否发生栈溢出。

如果栈指针寄存器值超越了stackguard就需要扩展栈空间。

函数开头首先gs:0x28，获取到了g结构体地址

后面的g+0x10也就是g->stackguard的地址

所以是把rsp和g->stackguard的值比较

如果SP大于g->stackguard了，则会跳转到函数结尾调用runtime.morestack函数。函数开头几条指令的作用就是检测栈是否溢出。

runtime.morestack作用：

将一些信息存在M结构体中，这些信息包括当前栈桢，参数，当前函数调用，函数返回地址（两个返回地址，一个是runtime.morestack的函数地址，一个是f的返回地址）。通过这些信息可以把新栈和旧栈链起来。

void runtime.morestack() { if(g == g0) { panic(); } else { m->morebuf.gobuf_pc = getCallerCallerPC(); void *SP = getCallerSP(); m->morebuf.gobuf_sp = SP; m->moreargp = SP; m->morebuf.gobuf_g = g; m->morepc = getCallerPC(); void *g0 = m->g0; g = g0; setSP(g0->g_sched.gobuf_sp); runtime.newstack(); }}

六

函数调用(支持多返回值)

无论是 x86 还是 x86-64，都采用栈传递参数

返回值传递不通过eax/rax等寄存器，也是通过栈。

位置是最后一个参数的下面。例如最后一个参数的地址是 rsp + 0x8, 则:

第一个返回值：rsp + 0x10
第二个返回值：rsp + 0x18
以此类推

一个参数，多个返回值

package main import "fmt" func main() { var a, b int a, b = sayHello(123) a = a + 1 b = b + 2 fmt.Println(a, b)} func sayHello(a int)(i,j int){ i = a + 1 fmt.Println("execute half") j = a + 2 return}

多个参数，多个返回值

package main import ( "fmt") func main() { var a, b int a, b = sayHello(1234, 5678) a = a + 1 b = b + 2 fmt.Println(a, b)} func sayHello(a int, b int)(i,j int){ i = a + 1 fmt.Println("execute half") j = b + 2 return}

返回值都会存放到main函数的栈中，main的局部变量

七

写屏障

golang 的一种垃圾回收机制，逆向时可以认为这个表达式永真即可，不用关注runtime_gcWriteBarrier 分支。

参考：

https://tiancaiamao.gitbooks.io/go-internals/content/zh

https://blog.csdn.net/star_of_science/article/details/121802354
https://panda0s.top/2021/04/14/Golang-underlying-data-representaion/#Golang-underlying-data-representaion

https://draveness.me/golang/docs/part2-foundation/ch03-datastructure/golang-hashmap/

https://tiancaiamao.gitbooks.io/go-internals/content/zh/03.1.html
https://blog.csdn.net/further_eye/article/details/112506505#t18
https://www.1024sou.com/article/287980.html

https://segmentfault.com/a/1190000040933033

看雪ID：SYJ-Re

https://bbs.pediy.com/user-home-921830.htm

*本文由看雪论坛 SYJ-Re 原创，转载请注明来自看雪社区

# 往期推荐

1.[VNCTF2022]InterestingPHP复现

2.分析一道简单安卓中级题

3.由浅入深理解Kerberos协议

4.虎符网络安全赛道 2022-pwn-vdq-WP解题分析

5.为IDA架设私人lumen服务器

6.分析一个安卓简单CrackMe

球分享

球点赞

球在看

点击“阅读原文”，了解更多！

陈佩斯，这次真悬了！

不能返税、不能补贴，招商局长们怎么办？

2024年最佳公众号排行，不用瞎忙，关注它们你就成功了一半

微博遗存之七

刚刚，我国DUV光刻机实现里程碑式突破！

Go解析

Go源码下载：https://github.com/golang/go

main_main查找

数值类型

字符串

数组

slice切片

内存结构

make

append/copy

切片截取

map

内存结构

逆向中map的常见函数

赋值和访问

查找过程

插入过程

删除过程

G,M,P模型

在讲解函数调用之前，我们讲解一下GMP结构体。

G

M

P

Go调度器调度过程

连续栈

基本原理

实际分析

函数调用(支持多返回值)

一个参数，多个返回值

多个参数，多个返回值

您可能也对以下帖子感兴趣

陈佩斯，这次真悬了！

不能返税、不能补贴，招商局长们怎么办？

2024年最佳公众号排行，不用瞎忙，关注它们你就成功了一半

微博遗存之七

刚刚，我国DUV光刻机实现里程碑式突破！

生成图片，分享到微信朋友圈

Go解析

Go源码下载：https://github.com/golang/go

main_main查找

数值类型

字符串

数组

slice切片

内存结构

make

append/copy

切片截取

map

内存结构

逆向中map的常见函数

赋值和访问

查找过程

插入过程

删除过程

G,M,P模型

在讲解函数调用之前，我们讲解一下GMP结构体。

G

M

P

Go调度器调度过程

连续栈

基本原理

实际分析

函数调用(支持多返回值)

一个参数，多个返回值

多个参数，多个返回值

您可能也对以下帖子感兴趣