B-Tree索引代码流程分析

专栏内容：

postgresql内核源码分析
手写数据库toadb
并发编程

开源贡献：

toadb开源库

个人主页：我的主页
管理社区：开源数据库
座右铭：天行健，君子以自强不息；地势坤，君子以厚德载物.

概述

在postgresql最常用的索引就是btree，它支持范围和等值查询。

本文主要介绍btree的代码的入口，接口定义，主要涉及索引的查询，插入，删除，和数据的清理操作。

前言

索引是为了更快的找到实际数据表中的数据，那么索引键值就非常小，可以一次性从磁盘读取大量的索引数据。
但是有些索引值中存储了实际数据，与数据是一一对应的，就是密集型索引，而有一些索引并不存储实际数据，而是存储范围内的最大最小值，此类型索引叫做稀疏索引；

对于密集型索引，如主键，直接可以得到对应的数据位置或对应列的数据，btree算法就可以支持此类型的索引；
而稀疏索引，查到索引后，需要再遍历数据表，或者二级索引才能命中目标数据。

代码入口

postgresql中为了代码的解耦，定义了索引操作的结构体，基成员是一组统一的操作和标识选项；
对于btree的定义如下，可以在这里找到btree索引的操作接口名称，在实际实用的只是调用结构体的成员，也就是函数指针。

/*
 * Btree handler function: return IndexAmRoutine with access method parameters
 * and callbacks.
 */
Datum
bthandler(PG_FUNCTION_ARGS)
{
	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);

	amroutine->amstrategies = BTMaxStrategyNumber;
	amroutine->amsupport = BTNProcs;
	amroutine->amoptsprocnum = BTOPTIONS_PROC;
	amroutine->amcanorder = true;
	amroutine->amcanorderbyop = false;
	amroutine->amcanbackward = true;
	amroutine->amcanunique = true;
	amroutine->amcanmulticol = true;
	amroutine->amoptionalkey = true;
	amroutine->amsearcharray = true;
	amroutine->amsearchnulls = true;
	amroutine->amstorage = false;
	amroutine->amclusterable = true;
	amroutine->ampredlocks = true;
	amroutine->amcanparallel = true;
	amroutine->amcaninclude = true;
	amroutine->amusemaintenanceworkmem = false;
	amroutine->amsummarizing = false;
	amroutine->amparallelvacuumoptions =
		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
	amroutine->amkeytype = InvalidOid;

	amroutine->ambuild = btbuild;
	amroutine->ambuildempty = btbuildempty;
	amroutine->aminsert = btinsert;
	amroutine->ambulkdelete = btbulkdelete;
	amroutine->amvacuumcleanup = btvacuumcleanup;
	amroutine->amcanreturn = btcanreturn;
	amroutine->amcostestimate = btcostestimate;
	amroutine->amoptions = btoptions;
	amroutine->amproperty = btproperty;
	amroutine->ambuildphasename = btbuildphasename;
	amroutine->amvalidate = btvalidate;
	amroutine->amadjustmembers = btadjustmembers;
	amroutine->ambeginscan = btbeginscan;
	amroutine->amrescan = btrescan;
	amroutine->amgettuple = btgettuple;
	amroutine->amgetbitmap = btgetbitmap;
	amroutine->amendscan = btendscan;
	amroutine->ammarkpos = btmarkpos;
	amroutine->amrestrpos = btrestrpos;
	amroutine->amestimateparallelscan = btestimateparallelscan;
	amroutine->aminitparallelscan = btinitparallelscan;
	amroutine->amparallelrescan = btparallelrescan;

	PG_RETURN_POINTER(amroutine);
}

我们首先来看索引的基本操作，查询btgettuple，插入btinsert和删除。

索引查询

索引查询的调用栈

ExecIndexScan

在执行计划中会有索引查询的节点，如ExecIndexScan，发起索引查询，通过索引查找到数据表的tuple;

-> IndexNext

返回数据表的tuple，如果是稀疏索引，此处会进行二次查找;

-> index_getnext_slot

返回数据表的tuple，此处会使用索引找到的tid，在数据表中查找，并检查可见性，如果不可见，那继续查找下一条;

-> index_getnext_tid

返回索引键中的记录的tid;

->btgettuple

在索引中查找, 通过遍历比较，命中查找键对应的索引项

查找索引数据的基本流程

索引的查找大致分为两个步骤：

找到起始点，也就是查找键值
从起始点开始扫描，返回符合条件的索引项

代码分析

索引的查询入口函数是 btgettuple，下面是它的实现；

bool
btgettuple(IndexScanDesc scan, ScanDirection dir)
{
	BTScanOpaque so = (BTScanOpaque) scan->opaque;
	bool		res;

	/* btree indexes are never lossy */
	scan->xs_recheck = false;

	/*
	 * If we have any array keys, initialize them during first call for a
	 * scan.  We can't do this in btrescan because we don't know the scan
	 * direction at that time.
	 */
	if (so->numArrayKeys && !BTScanPosIsValid(so->currPos))
	{
		/* punt if we have any unsatisfiable array keys */
		if (so->numArrayKeys < 0)
			return false;

		_bt_start_array_keys(scan, dir);
	}

	/* This loop handles advancing to the next array elements, if any */
	do
	{
		/*
		 * If we've already initialized this scan, we can just advance it in
		 * the appropriate direction.  If we haven't done so yet, we call
		 * _bt_first() to get the first item in the scan.
		 */
		if (!BTScanPosIsValid(so->currPos))
			res = _bt_first(scan, dir);		
		else
		{
			/*
			 * Check to see if we should kill the previously-fetched tuple.
			 */
			if (scan->kill_prior_tuple)
			{
				/*
				 * Yes, remember it for later. (We'll deal with all such
				 * tuples at once right before leaving the index page.)  The
				 * test for numKilled overrun is not just paranoia: if the
				 * caller reverses direction in the indexscan then the same
				 * item might get entered multiple times. It's not worth
				 * trying to optimize that, so we don't detect it, but instead
				 * just forget any excess entries.
				 */
				if (so->killedItems == NULL)
					so->killedItems = (int *)
						palloc(MaxTIDsPerBTreePage * sizeof(int));
				if (so->numKilled < MaxTIDsPerBTreePage)
					so->killedItems[so->numKilled++] = so->currPos.itemIndex;
			}

			/*
			 * Now continue the scan.
			 */
			res = _bt_next(scan, dir);
		}

		/* If we have a tuple, return it ... */
		if (res)
			break;
		/* ... otherwise see if we have more array keys to deal with */
	} while (so->numArrayKeys && _bt_advance_array_keys(scan, dir));

	return res;
}

初始化查找点；从代码来看，进入循环后，先 BTScanPosIsValid(so->currPos) 判断currPos是否有效，也就是查找点是否已经初始化；如果没有初始化，则调用 _bt_first 进行初始化；
扫描索引项；初始化查找点后，调用 _bt_next 获取一条索引项数据，找到有效索引后就会返回；

索引插入

索引插入调用栈

从insert来看，调用路径如下

ExecInsert

SQL insert语句的执行入口函数

-> ExecInsertIndexTuples

如果当前表中建有索引，在表数据tuple插入后，调用此函数插入索引，有可能存在多个索引，循环对每个索引调用下级函数进行插入；

index_insert

索引插入的公共调用接口，实际调用对应索引的插入定义接口；

btinsert

btree索引插入的操作的入口函数；在此函数中，首先拼装一个索引tuple，然后调用下级函数进行插入；

_bt_doinsert

执行索引项的插入，会经过查找位置，检查唯一性，插入等一系列流程环节；

索引插入的基本流程

索引插入的大体流程主要有以下环节：

查找索引项插入的位置，因为btree是一个有序的树，所以先要找到插入的位置，保持顺序；此时会与索引查询类似，先初始化查找键，并找到查询点；
唯一性约束的检查，如果索引中属性列都为NULL，是不进行唯一性检查的；
索引的插入环节，调用_bt_insertonpg来完成，其中会有查找空闲空间，可能会索引分裂等；

代码分析

索引插入的入函数是 btinsert，实际执行是 _bt_doinsert，下面来看一下执行的代码流程；

bool
_bt_doinsert(Relation rel, IndexTuple itup,
			 IndexUniqueCheck checkUnique, bool indexUnchanged,
			 Relation heapRel)
{
	bool		is_unique = false;
	BTInsertStateData insertstate;
	BTScanInsert itup_key;
	BTStack		stack;
	bool		checkingunique = (checkUnique != UNIQUE_CHECK_NO);

	/* we need an insertion scan key to do our search, so build one */
	itup_key = _bt_mkscankey(rel, itup);

	if (checkingunique)
	{
		if (!itup_key->anynullkeys)
		{
			/* No (heapkeyspace) scantid until uniqueness established */
			itup_key->scantid = NULL;
		}
		else
		{
			checkingunique = false;
			/* Tuple is unique in the sense that core code cares about */
			Assert(checkUnique != UNIQUE_CHECK_EXISTING);
			is_unique = true;
		}
	}

	insertstate.itup = itup;
	insertstate.itemsz = MAXALIGN(IndexTupleSize(itup));
	insertstate.itup_key = itup_key;
	insertstate.bounds_valid = false;
	insertstate.buf = InvalidBuffer;
	insertstate.postingoff = 0;

search:
	stack = _bt_search_insert(rel, heapRel, &insertstate);

	if (checkingunique)
	{
		TransactionId xwait;
		uint32		speculativeToken;

		xwait = _bt_check_unique(rel, &insertstate, heapRel, checkUnique,
								 &is_unique, &speculativeToken);

		if (unlikely(TransactionIdIsValid(xwait)))
		{
			/* Have to wait for the other guy ... */
			_bt_relbuf(rel, insertstate.buf);
			insertstate.buf = InvalidBuffer;

			if (speculativeToken)
				SpeculativeInsertionWait(xwait, speculativeToken);
			else
				XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);

			/* start over... */
			if (stack)
				_bt_freestack(stack);
			goto search;
		}

		/* Uniqueness is established -- restore heap tid as scantid */
		if (itup_key->heapkeyspace)
			itup_key->scantid = &itup->t_tid;
	}

	if (checkUnique != UNIQUE_CHECK_EXISTING)
	{
		OffsetNumber newitemoff;

		CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate.buf));

		newitemoff = _bt_findinsertloc(rel, &insertstate, checkingunique,
									   indexUnchanged, stack, heapRel);
		_bt_insertonpg(rel, heapRel, itup_key, insertstate.buf, InvalidBuffer,
					   stack, itup, insertstate.itemsz, newitemoff,
					   insertstate.postingoff, false);
	}
	else
	{
		/* just release the buffer */
		_bt_relbuf(rel, insertstate.buf);
	}

	/* be tidy */
	if (stack)
		_bt_freestack(stack);
	pfree(itup_key);

	return is_unique;
}

代码流程如下:

初始化工作；初始化查找键；
查找插入位置；调用 _bt_search_insert 进行查询到一个有足够空闲空间的叶子节点page；
检查唯一性约束；检查唯一性约束，如果有冲突事务，则等待冲突事务执行完成后，再重新查询位置，再检查唯一性约束；然后对结果的判断checkUnique != UNIQUE_CHECK_EXISTING，如果违返那么插入结束；否则执行插入动作；
索引插入；先确定插入位置，再调用_bt_insertonpg；

索引删除

索引的更新，就是删除和插入操作，这里我们来看一下索引删除的概要流程。
对于数据表的tuple的删除，数据并没有真实删除，所以对应的索引项也不会删除，那么什么时候删除索引项呢？

删除索引基本流程

在进行vacuum 或进行 prune paga时，对于HOT链都会在每个page上留下最后一个数据元组，因为同一个page内的HOT链只对应一个索引项，留下这最后一个也是为了删除索引项。
当进行vacuum 索引时，就会通过这个dead tuple找到对应的索引项，先删除索引项，再删除dead tuple。
常常说索引的性能下降了，其实就是索引膨胀导致，也就是deadtuple变多，导致待删除索引项变多，查询效率大降低，同时也会带来索引IO的增加。

代码分析

vac_bulkdel_one_index

调用通用索引处理接口；

->index_bulk_delete

这里通用索引处理接口，其中调用对应索引的处理接口，这里是调用btree索引处理；

->btbulkdelete

btree对应的批量删除接口；避免退出的影响，在开始时会注册退出的回调函数，在解除共享内存前处理善后；然后调用 btvacuumscan 对所有page进行索引删除清理。

结尾

非常感谢大家的支持，在浏览的同时别忘了留下您宝贵的评论，如果觉得值得鼓励，请点赞，收藏，我会更加努力！

作者邮箱：study@senllang.onaliyun.com
如有错误或者疏漏欢迎指出，互相学习。

注：未经同意，不得转载！

postgresql 内核源码分析 btree索引的增删查代码基本原理流程分析，索引膨胀的原因在这里

B-Tree索引代码流程分析

概述

前言

代码入口

索引查询

索引查询的调用栈

查找索引数据的基本流程

代码分析

索引插入

索引插入调用栈

索引插入的基本流程

代码分析

索引删除

删除索引基本流程

代码分析

结尾

相关文章

嵌入式软件英语

菜鸟教程《Python 3 教程》笔记（8）：字典

时序预测 | MATLAB实现Attention-GRU时间序列预测(注意力机制融合门控循环单元,TPA-GRU)

【Vue2.0源码学习】生命周期篇-初始化阶段(initEvents)

C++设计模式_01_设计模式简介（多态带来的便利；软件设计的目标：复用）

电商数据接口API：品牌价格监控与数据分析的重要工具

Python绘图系统10：在父组件中使用子组件的函数

【算法训练-双指针】最长无重复子串（数组）