Hash Tables
Expected number of probes for Open-addressed hash table
Given an open-addressed hash table with load factor $\alpha=n/m<1$, the expected number of probes in an unsuccessful search is at most $\displaystyle\frac{1}{1-\alpha}$.
Prove Consider that $n$ keys have been mapped into $m$ slots in a hash table, then the load factor $\alpha$ is $\displaystyle\frac{n}{m}$.
- For the first probe, the conflict occurs with probability $\frac{n}{m}$, then the second probe is needed;
- For the second probe, the conflict occurs with probability $\displaystyle\frac{n-1}{m-1}$ since the key in the occupied slot in the first probe is excluded, then the third probe is needed;
- Similarly, for the i-th probe, the conflict occurs with probability $\displaystyle\frac{n-(i-1)}{m-(i-1)}$, then the (i+1)-th probe is needed.
Thus, the expected number of probes could be calculated as follow:
Tree
HEIGHT OF A RANDOMLY BUILT BINARY SEARCH TREE
Prove Let $X_n$ be the height of a randomly built binary search tree on $n$ nodes, then $Y_n$ be the exponential height of the tree. Specifically, $Y_n=2^{X_n}$. If the root node has rank $k$, then we get
since the height of the root is always one larger than the maximum height of its children. Hence we get
Let indicator variable $Z_{nk}$ indicate whether $X_k$ is the root. Thus,
So the exponential height $Y_n$ could be written as
Then the expected value of $Y_n$ is expressed as follow
We can prove that $\mathbb{E}[Y_n]\leqslant cn^3$. According to $f(\mathbb{E}[X])\leqslant \mathbb{E}(f(X))$, since $Y_n=f(X_n)=2^{X_n}$, we get
Thus, we could conclude that
HEIGHT BOUND
Theorem $n$-nodes RB tree has a height $h<2\log_2(n+1)$.
Prove We claim that a subtree rooted at $x$ contains at least $2^{bh(x)}-1$ internal nodes. Here $bh(x)$ denotes the black height of subtree rooted at $x$.
Base step: If $x$ is root, then $bh(x)=1$, the tree has $2^{bh(x)}-1=1$ nodes.
Inductive step: If $x$ is not root, then consider the number of internal nodes of the tree rooted as $x$. Apparently, we have
That is to say, $bh(x) \geqslant bh(x.left)$
Thus, the number of internal nodes rooted at $x$ (marked as $N_x$) is
Moreover, the height $h$ of the subtree rooted at $x$ satisfies that
Since the claim”$n \geqslant 2^{bh(x)}-1$”, we get $bh(x) \leqslant \log_2(n+1)$, hence
Ultimately we get $h\leqslant 2\log_2(n+1)$.
Skip List
The optimal height of a skip list
Theorem The optimal height of a $n$-elements skip list is $\log_2n$.
Prove For the worst case, searching an element cost times:
Thus,
The equation holds when $\displaystyle|L_1|=\frac{|L_2|}{|L_1|}=\cdots=\frac{|L_k|}{|L_{k-1}|} = n^{\frac{1}{k}}$. Moreover,
The ideal value of $k$ is $\log_2n$, so that $T=2\log_2n$.
The height of a skip list
Theorem With high probability, an n-elements skip list has $\mathcal{O}(\log_2n)$ levels.
Prove We denote an n-elements skip list has at most $c\log_2n$ levels as Event $A$. Then we get
Then we get that A occurs with high probability at least $\displaystyle1-\frac{1}{n^{c-1}}$.
The time cost of search in skip list
Theorem With high probability, each search in a n-elements skip list cost $\mathcal{O}(\log_2n)$.
Prove We prove this by “search backward”. Specifically, we search the element $x$ by starting at $x$ and stopping at $-\infin$. When travelling through $y$ in the searching path
- if $y$ is promoted (HEAD), we then go up;
- if $y$ is not promoted (TAIL), we then go left.
Since an n-elements skip list has $\mathcal{O}(\log_2n)$ levels with high probability, we go up for at most $c\log_2n$ times. Next we demonstrate that the flipping times is $\Theta(\log_2n)$ util $c\log_2n$ HEADs.
Specifically, we make $10c\log_2n$ times of flipping coin. Thus,
That is to say, $10c\log_2n$ times of flipping coin results in at most $c\log_2n$ HEADs with high probability at least $\displaystyle\frac{1}{n^{c(9-\log_2(10e))}}$. Generally, we get the flipping times is $\mathcal{O}(\log_2n)$.
Moreover, getting $c\log_2n$ HEADs needs at least $c\log_2n$ times of flipping, i.e. the flipping times is $\mathcal{\Omega}(\log_2n)$.
In brief, the times of flipping coin is $\Theta(\log_2n)$.