Skip to content

REVIEW 9 | Query Optimization

:material-circle-edit-outline: 约 606 个字 :material-clock-time-two-outline: 预计阅读时间 2 分钟

查询处理学了,继续学查询优化

这两章是考试重点,必考

  1. 典型的算子的算法,必考
  2. 关系代数表达式等价规则,考小题
  3. 规模估算,必考

An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated

image-20240514181706218

Generating Equivalent Expressions

等价变换规则有一堆,得记忆

  1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.

image-20240514182700845

  1. Selection operations are commutative(可交换的).

image-20240514182715201

  1. Only the last in a sequence of projection operations is needed, the others can be omitted(可省略的).

image-20240514182842338

  1. Selections can be combined with Cartesian products and theta joins
    • 因为我们希望尽量将选择转换为链接操作

image-20240514182908325

  1. Theta-join operations and natural joins are commutative(可交换的).

image-20240514183018878

  1. associative

    1. Natural join operations are associative(可结合的),可以左旋右旋(雾 1. image-20240514183103652
    2. Theta joins are associative in the following manner: 1. image-20240514183115866
  2. The selection operation distributes(分配)over the theta join operation under the following two conditions:

    1. When all the attributes in \(\theta_0\) involve only the attributes of one of the expressions (\(E_1\)) being joined.

image-20240514183216960

​ ii. When \(\theta_1\) involves only the attributes of \(E_1\) and \(\theta_2\) involves only the attributes of \(E_2\).

image-20240514183423653

  1. The projection operation distributes(分配) over the theta join operation as follows:
    1. if \(\theta\) involves only attributes from \(L_1 \cup L_2:\)

image-20240514183822733

image-20240514191428803

怎么这么多啊我超

image-20240514191510688

Statistics for Cost Estimation

下面这些信息对于估算有用

  • \(n_r\) 关系 \(r\) 中的元组个数
  • \(b_r\) 关系 \(r\) 占用的块个数
  • \(l_r\) 关系 \(r\) 有多长(大小)
  • \(f_r\) 关系 \(r\) 占用的块一个能放多少记录
  • \(V(A,r)\) 关系 \(r\) 在某个属性上有几个值
    • 比如性别有两个值

image-20240614110331408

Selection Size Estimation

没有特别说明就将属性值当成均匀分布

有多属性筛选就当成独立分布去计算

selectivity 中选率,即一个条件选出的tuple占总的多少

交集就是各中选率相乘

并集就是各中选率被1减后相乘,再整体被减1

Estimation of the Size of Joins

分情况

如果两者相交是A集合的key,则 join 得到的 tuple 数量小于等于B集合

两者相交是B集合关于A集合的 foreign key,则等于B集合

两者相交不是 key,则根据交的属性值域得到两个值,取小者

确切说是A和Btuple数量相乘,除以交属性值域

image-20240514194449445

Estimation of Number of Distinct Values

中间结果在某个属性上有几个值,比如 \(V(A,\sigma_\theta(r))\)

Choice of Evaluation Plans

我们下面的优化都是 Cost-based Optimizer

Join Order Optimization Algorithm