
Common Theoretical Proofs in Algorithm Analysis and Design


Hash Tables

Expected number of probes for Open-addressed hash table

Theorem Given an open-addressed hash table with load factor $\alpha=n/m<1$, the expected number of probes in an unsuccessful search is at most $\displaystyle\frac{1}{1-\alpha}$.

Proof Consider that $n$ keys have been mapped into $m$ slots of a hash table, so the load factor is $\alpha=\displaystyle\frac{n}{m}$.

  • For the first probe, a collision occurs with probability $\displaystyle\frac{n}{m}$, and then a second probe is needed;
  • For the second probe, a collision occurs with probability $\displaystyle\frac{n-1}{m-1}$, since the occupied slot found by the first probe is excluded, and then a third probe is needed;
  • Similarly, for the $i$-th probe, a collision occurs with probability $\displaystyle\frac{n-(i-1)}{m-(i-1)}$, and then the $(i+1)$-th probe is needed.

Thus, the expected number of probes can be calculated as follows:

$$\mathbb{E}[\text{probes}] = 1+\frac{n}{m}\left(1+\frac{n-1}{m-1}\left(1+\frac{n-2}{m-2}\left(\cdots\right)\right)\right) \leqslant 1+\alpha+\alpha^{2}+\alpha^{3}+\cdots = \frac{1}{1-\alpha},$$

where the inequality uses $\displaystyle\frac{n-i}{m-i}\leqslant\frac{n}{m}=\alpha$ for all $i\geqslant 0$ (since $n\leqslant m$).
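As an optional sanity check (not part of the original argument), the bound can be compared against a small simulation of the uniform-hashing model; the function names, parameters, and constants below are illustrative only.

```python
import random

def probes_one_search(m, n):
    """One unsuccessful search under the uniform-hashing model: slots are
    probed in a uniformly random order (no repetition) until an empty slot
    is found; n of the m slots are occupied."""
    slots = [True] * n + [False] * (m - n)   # True = occupied slot
    random.shuffle(slots)                    # random probe order
    probes = 0
    for occupied in slots:
        probes += 1
        if not occupied:                     # empty slot ends the search
            break
    return probes

def estimate_expected_probes(m=1000, alpha=0.9, trials=5000):
    """Average probe count over many simulated unsuccessful searches."""
    n = int(alpha * m)
    return sum(probes_one_search(m, n) for _ in range(trials)) / trials

if __name__ == "__main__":
    alpha = 0.9
    print("simulated average :", estimate_expected_probes(alpha=alpha))
    print("bound 1/(1-alpha) :", 1 / (1 - alpha))
```

For $\alpha=0.9$ the simulated average should come out just below the bound $1/(1-0.9)=10$.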

Tree

HEIGHT OF A RANDOMLY BUILT BINARY SEARCH TREE

Proof Let $X_n$ be the height of a randomly built binary search tree on $n$ nodes, and let $Y_n=2^{X_n}$ be its exponential height. If the root node has rank $k$ (i.e., the root holds the $k$-th smallest key), then its left subtree contains $k-1$ nodes and its right subtree contains $n-k$ nodes, so we get

$$X_n = 1 + \max(X_{k-1},\, X_{n-k})$$

since the height of the tree is one larger than the larger of the heights of the root's two subtrees. Hence we get

$$Y_n = 2^{X_n} = 2\cdot\max(Y_{k-1},\, Y_{n-k}).$$

Let the indicator variable $Z_{n,k}$ indicate whether the root has rank $k$. Since each of the $n$ ranks is equally likely to be the root's rank, we have

$$\Pr[Z_{n,k}=1]=\mathbb{E}[Z_{n,k}]=\frac{1}{n}.$$

So the exponential height $Y_n$ can be written as

$$Y_n=\sum_{k=1}^{n} Z_{n,k}\cdot 2\max(Y_{k-1},\, Y_{n-k}).$$

Then the expected value of $Y_n$ can be bounded as follows:

$$\mathbb{E}[Y_n]=\sum_{k=1}^{n}\mathbb{E}\bigl[Z_{n,k}\cdot 2\max(Y_{k-1},Y_{n-k})\bigr]=\frac{2}{n}\sum_{k=1}^{n}\mathbb{E}\bigl[\max(Y_{k-1},Y_{n-k})\bigr]\leqslant\frac{2}{n}\sum_{k=1}^{n}\bigl(\mathbb{E}[Y_{k-1}]+\mathbb{E}[Y_{n-k}]\bigr)=\frac{4}{n}\sum_{i=0}^{n-1}\mathbb{E}[Y_i],$$

where the second equality uses the fact that $Z_{n,k}$ is independent of $Y_{k-1}$ and $Y_{n-k}$, and the inequality uses $\max(a,b)\leqslant a+b$ for nonnegative $a,b$.

We can prove from this recurrence that $\mathbb{E}[Y_n]\leqslant cn^3$ for some constant $c>0$. Then, by Jensen's inequality $f(\mathbb{E}[X])\leqslant \mathbb{E}[f(X)]$ applied to the convex function $f(x)=2^{x}$, since $Y_n=f(X_n)=2^{X_n}$, we get

$$2^{\mathbb{E}[X_n]}\leqslant\mathbb{E}\bigl[2^{X_n}\bigr]=\mathbb{E}[Y_n]\leqslant cn^3.$$

Thus, taking logarithms, we can conclude that

$$\mathbb{E}[X_n]\leqslant\log_2\bigl(cn^3\bigr)=3\log_2 n+\log_2 c=\mathcal{O}(\log_2 n).$$
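As a side note, the bound $\mathbb{E}[Y_n]\leqslant cn^{3}$ claimed above can be checked by the substitution method: assuming $\mathbb{E}[Y_i]\leqslant ci^{3}$ for all $i<n$ and plugging this into the recurrence gives

$$\mathbb{E}[Y_n]\leqslant\frac{4}{n}\sum_{i=0}^{n-1}ci^{3}\leqslant\frac{4c}{n}\int_{0}^{n}x^{3}\,dx=\frac{4c}{n}\cdot\frac{n^{4}}{4}=cn^{3},$$

provided the constant $c$ is chosen large enough to cover the base cases.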

HEIGHT BOUND

Theorem An $n$-node RB tree has height $h\leqslant 2\log_2(n+1)$.

Proof We claim that the subtree rooted at any node $x$ contains at least $2^{bh(x)}-1$ internal nodes, where $bh(x)$ denotes the black-height of the subtree rooted at $x$. We prove the claim by induction on the height of $x$.

Base step: If $x$ has height $0$, then $x$ is a leaf (NIL), so $bh(x)=0$ and its subtree contains $2^{bh(x)}-1=0$ internal nodes.

Inductive step: If $x$ has positive height, consider the number of internal nodes of the tree rooted at $x$. Each child of $x$ is either red, in which case its black-height equals $bh(x)$, or black, in which case its black-height equals $bh(x)-1$. That is to say,

$$bh(x.left)\geqslant bh(x)-1 \quad\text{and}\quad bh(x.right)\geqslant bh(x)-1.$$

Thus, by the inductive hypothesis applied to the two children, the number of internal nodes rooted at $x$ (denoted $N_x$) satisfies

$$N_x\geqslant\bigl(2^{bh(x)-1}-1\bigr)+\bigl(2^{bh(x)-1}-1\bigr)+1=2^{bh(x)}-1.$$

Moreover, since at least half of the nodes on any path from $x$ down to a leaf (not counting $x$ itself) must be black, the height $h$ of the subtree rooted at $x$ satisfies

$$bh(x)\geqslant\frac{h}{2}.$$

Applying the claim to the root of the whole tree, $n\geqslant 2^{bh(x)}-1$, so $bh(x)\leqslant\log_2(n+1)$, and hence

$$\frac{h}{2}\leqslant bh(x)\leqslant\log_2(n+1).$$

Ultimately we get $h\leqslant 2\log_2(n+1)$.
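As a quick numeric illustration of the bound (the value $n=10^{6}$ is chosen arbitrarily):

$$h\leqslant 2\log_2\bigl(10^{6}+1\bigr)\approx 39.9,$$

so a red-black tree storing one million keys has height at most $39$.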

Skip List

The optimal height of a skip list

Theorem The optimal height of an $n$-element skip list is $\log_2 n$.

Proof Consider a skip list with $k$ levels $L_1,\dots,L_k$, where $L_k$ is the bottom list containing all $n$ elements and each $L_{i-1}$ is a sublist of $L_i$. In the worst case, searching for an element costs

$$T=|L_1|+\frac{|L_2|}{|L_1|}+\cdots+\frac{|L_k|}{|L_{k-1}|},$$

since we scan at most $|L_1|$ elements in the top list and, after dropping from $L_{i-1}$ down to $L_i$, at most $\displaystyle\frac{|L_i|}{|L_{i-1}|}$ elements of $L_i$ lie between two consecutive elements of $L_{i-1}$ when the elements are evenly spread.

Thus, by the AM–GM inequality,

$$T\geqslant k\cdot\sqrt[k]{|L_1|\cdot\frac{|L_2|}{|L_1|}\cdots\frac{|L_k|}{|L_{k-1}|}}=k\cdot n^{\frac{1}{k}}.$$

The equality holds when $\displaystyle|L_1|=\frac{|L_2|}{|L_1|}=\cdots=\frac{|L_k|}{|L_{k-1}|} = n^{\frac{1}{k}}$. Moreover, in that case

$$T=k\cdot n^{\frac{1}{k}}.$$

Taking the per-level ratio $n^{\frac{1}{k}}=2$, i.e. the ideal height $k=\log_2 n$, gives $T=2\log_2 n$.
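As a worked example (with $n=1024$ chosen purely for illustration): taking $k=\log_2 1024=10$ levels with $|L_1|=2$, $|L_2|=4$, $\dots$, $|L_{10}|=1024$, every term of the search cost equals $2$, so

$$T=\underbrace{2+2+\cdots+2}_{10\ \text{terms}}=2\log_2 1024=20.$$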

The height of a skip list

Theorem With high probability, an $n$-element skip list has $\mathcal{O}(\log_2 n)$ levels.

Proof Let $A$ denote the event that an $n$-element skip list has at most $c\log_2 n$ levels. A fixed element rises above level $c\log_2 n$ only if its coin comes up HEADS $c\log_2 n$ times in a row, which happens with probability $\left(\frac{1}{2}\right)^{c\log_2 n}=\frac{1}{n^{c}}$. By the union bound over all $n$ elements,

$$\Pr[\bar{A}]\leqslant n\cdot\frac{1}{n^{c}}=\frac{1}{n^{c-1}}.$$

Then we get that $A$ occurs with probability at least $\displaystyle1-\frac{1}{n^{c-1}}$, i.e. with high probability.
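As an optional illustration (not part of the proof), the level bound can be observed empirically with a short simulation; the function name and the constants `n` and `trials` below are illustrative only.

```python
import math
import random

def skiplist_levels(n):
    """Simulate the number of levels of a randomized skip list on n elements:
    each element is promoted one level for every consecutive HEADS it flips,
    and the list height is the maximum level reached by any element."""
    max_level = 1
    for _ in range(n):
        level = 1
        while random.random() < 0.5:      # HEADS -> promote to the next level
            level += 1
        max_level = max(max_level, level)
    return max_level

if __name__ == "__main__":
    n, trials = 1 << 16, 50
    observed = [skiplist_levels(n) for _ in range(trials)]
    print("log2(n)         =", math.log2(n))      # 16.0
    print("observed levels :", min(observed), "-", max(observed))
```

With $n=2^{16}$ the observed number of levels typically stays within a few levels of $\log_2 n=16$, consistent with the $\mathcal{O}(\log_2 n)$ bound.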

The time cost of search in skip list

Theorem With high probability, each search in an $n$-element skip list costs $\mathcal{O}(\log_2 n)$.

Proof We prove this by a “backward search” argument. Specifically, we trace the search path for element $x$ backwards, starting at $x$ and stopping at $-\infty$. When travelling through a node $y$ on the search path:

  • if $y$ was promoted to the level above (its coin flip was HEADS), we go up;
  • if $y$ was not promoted (its coin flip was TAILS), we go left.

Since an $n$-element skip list has $\mathcal{O}(\log_2 n)$ levels with high probability, we go up at most $c\log_2 n$ times. Next we show that the number of coin flips needed to collect $c\log_2 n$ HEADS is $\Theta(\log_2 n)$.

Specifically, consider $10c\log_2 n$ coin flips. The probability of seeing at most $c\log_2 n$ HEADS, i.e. at least $9c\log_2 n$ TAILS, satisfies

$$\Pr[\text{at most } c\log_2 n \text{ HEADS}]\leqslant\binom{10c\log_2 n}{9c\log_2 n}\left(\frac{1}{2}\right)^{9c\log_2 n}=\binom{10c\log_2 n}{c\log_2 n}\cdot 2^{-9c\log_2 n}\leqslant(10e)^{c\log_2 n}\cdot 2^{-9c\log_2 n}=\frac{1}{n^{c(9-\log_2(10e))}},$$

where the first inequality is a union bound over the choices of the $9c\log_2 n$ TAILS positions and the last inequality uses $\binom{a}{b}\leqslant\left(\frac{ea}{b}\right)^{b}$.

That is to say, with probability at least $\displaystyle1-\frac{1}{n^{c(9-\log_2(10e))}}$, flipping the coin $10c\log_2 n$ times yields more than $c\log_2 n$ HEADS. Hence, with high probability, the number of flips is $\mathcal{O}(\log_2 n)$.

Moreover, collecting $c\log_2 n$ HEADS requires at least $c\log_2 n$ flips, i.e. the number of flips is $\Omega(\log_2 n)$.

In brief, the number of coin flips, and therefore the length of the search path, is $\Theta(\log_2 n)$ with high probability.
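Finally, as an optional illustration of the coin-flipping argument (again not part of the proof), a short simulation of the number of flips needed to collect $c\log_2 n$ HEADS; the constants $n=2^{16}$ and $c=2$ are illustrative only.

```python
import math
import random

def flips_until_heads(target_heads):
    """Flip a fair coin until `target_heads` HEADS have been seen and return
    the total number of flips; in the backward-search argument each flip is
    one step of the search path (HEADS = go up, TAILS = go left)."""
    heads = flips = 0
    while heads < target_heads:
        flips += 1
        if random.random() < 0.5:          # HEADS
            heads += 1
    return flips

if __name__ == "__main__":
    n, c = 1 << 16, 2
    target = int(c * math.log2(n))         # the walk ends after c*log2(n) HEADS
    samples = [flips_until_heads(target) for _ in range(1000)]
    print("HEADS needed  :", target)
    print("average flips :", sum(samples) / len(samples))   # about 2 * target
    print("maximum flips :", max(samples), "  vs 10*c*log2(n) =", 10 * target)
```

The average number of flips comes out around $2c\log_2 n$, and the maximum over many runs should stay well below the $10c\log_2 n$ threshold used in the proof.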

