http://radford.edu/~nokie/classes/360/dp-opt-bst.html

Overview


Optimal Binary Search Trees - Problem

    • Problem:
      • Sorted set of keys $k_1, k_2, \ldots, k_n$
      • Key probabilities: $p_1, p_2, \ldots, p_n$
      • What tree structure has lowest expected cost?
      • Cost of searching for node $k_i$: $\mathrm{cost}(k_i) = \mathrm{depth}(k_i) + 1$

$$
\begin{aligned}
\text{Expected cost of tree} &= \sum_{i=1}^{n} \mathrm{cost}(k_i)\, p_i \\
&= \sum_{i=1}^{n} \bigl(\mathrm{depth}(k_i) + 1\bigr)\, p_i \\
&= \sum_{i=1}^{n} \mathrm{depth}(k_i)\, p_i + \sum_{i=1}^{n} p_i \\
&= \Bigl(\sum_{i=1}^{n} \mathrm{depth}(k_i)\, p_i\Bigr) + 1
\end{aligned}
$$

The last step uses the fact that the probabilities sum to 1.


Optimal BST - Example

    • Example:
      • Probability table ($p_i$ is the probability of key $k_i$):

        i      1      2      3      4      5
        k_i    k1     k2     k3     k4     k5
        p_i    0.25   0.20   0.05   0.20   0.30

      • Two BSTs (each tree is written level by level, with the children of each node in brackets)
      • Given: $k_1 < k_2 < k_3 < k_4 < k_5$
      • Tree 1:
        • k2 / [k1, k4] / [nil, nil], [k3, k5]
        • cost = 0(0.20) + 1(0.25+0.20) + 2(0.05+0.30) + 1 = 1.15 + 1 = 2.15
      • Tree 2:
        • k2 / [k1, k5] / [nil, nil], [k4, nil] / [nil, nil], [nil, nil], [k3, nil], [nil, nil]
        • cost = 0(0.20) + 1(0.25+0.30) + 2(0.20) + 3(0.05) + 1 = 1.10 + 1 = 2.10
      • Notice that the deeper tree (Tree 2) has the lower expected cost


Optimal BST - DP Approach

    • An optimal BST $T$ must have a subtree $T'$ for keys $k_i \ldots k_j$ which is optimal for those keys
      • Cut and paste proof: if $T'$ is not optimal, improving it will improve $T$, a contradiction

    • Algorithm for finding the optimal tree for sorted, distinct keys $k_i \ldots k_j$:
      • For each possible root $k_r$ with $i \le r \le j$:
        • Make optimal subtree for $k_i, \ldots, k_{r-1}$
        • Make optimal subtree for $k_{r+1}, \ldots, k_j$
      • Select the root that gives the best total tree
    • Formula: $e(i,j)$ = expected number of comparisons for the optimal tree for keys $k_i \ldots k_j$

$$
e(i,j) =
\begin{cases}
0 & \text{if } i = j + 1 \\
\min\limits_{i \le r \le j} \bigl\{ e(i, r-1) + e(r+1, j) + w(i,j) \bigr\} & \text{if } i \le j
\end{cases}
$$

  • where $w(i,j) = \sum_{k=i}^{j} p_k$ is the increase in cost if $k_i \ldots k_j$ becomes a subtree of a node (every key in it moves one level deeper)

  • Work bottom up and remember solution
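
As a small worked instance of the recurrence, here are the first few table entries for the example probabilities $p_1 = 0.25$ and $p_2 = 0.20$, taking $w(i,j)$ to be the sum of $p_i$ through $p_j$ and the empty-range base cases $e(1,0) = e(2,1) = e(3,2) = 0$:

```latex
\begin{aligned}
e(1,1) &= e(1,0) + e(2,1) + w(1,1) = 0 + 0 + 0.25 = 0.25 \\
e(2,2) &= e(2,1) + e(3,2) + w(2,2) = 0 + 0 + 0.20 = 0.20 \\
e(1,2) &= \min\{\, e(1,0) + e(2,2),\; e(1,1) + e(3,2) \,\} + w(1,2) \\
       &= \min\{\, 0.20,\; 0.25 \,\} + 0.45 = 0.65
\end{aligned}
```

The minimum is reached at $r = 1$, so the best tree for $k_1, k_2$ alone puts the more probable key $k_1$ at the root.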


Optimal BST - Algorithm and Performance

    • Brute Force: try all tree configurations
      • $\Omega(4^n / n^{3/2})$ different BSTs with $n$ nodes
    • DP: bottom up with table: for all possible contiguous sequences of keys and all possible roots, compute optimal subtrees
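      for i in 1 .. n+1 loop                 -- Base case: empty sequences cost 0
        e(i, i-1) := 0.0;
      end loop;
      for size in 1 .. n loop                -- All sizes of sequences
        for i in 1 .. n-size+1 loop          -- All starting points of sequences
          j := i + size - 1;
          e(i, j) := float'last;             -- Largest float, acts as infinity
          for r in i .. j loop               -- All roots of sequence ki .. kj
            t := e(i, r-1) + e(r+1, j) + w(i, j);
            if t < e(i, j) then
              e(i, j) := t;
              root(i, j) := r;
            end if;
          end loop;
        end loop;
      end loop;
    • $\Theta(n^3)$, provided each w(i, j) is available in constant time (for example, from prefix sums of the probabilities)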
    • Can, of course, also use (memoized) recursion

http://www.geeksforgeeks.org/dynamic-programming-set-24-optimal-binary-search-tree/

Dynamic Programming | Set 24 (Optimal Binary Search Tree)

Given a sorted array keys[0..n-1] of search keys and an array freq[0..n-1] of frequency counts, where freq[i] is the number of searches for keys[i], construct a binary search tree of all the keys such that the total cost of all the searches is as small as possible.

Let us first define the cost of a BST. The cost of a BST node is the level of that node multiplied by its frequency, and the cost of the tree is the sum over all nodes. The level of the root is 1.

    Example 1
    Input: keys[] = {10, 12}, freq[] = {34, 50}
    There can be following two possible BSTs
        10                       12
          \                     /
           12                 10
          I                     II
    Frequency of searches of 10 and 12 are 34 and 50 respectively.
    The cost of tree I is 34*1 + 50*2 = 134
    The cost of tree II is 50*1 + 34*2 = 118

    Example 2
    Input: keys[] = {10, 12, 20}, freq[] = {34, 8, 50}
    There can be following possible BSTs
        10            12            20        10          20
          \          /  \          /            \        /
           12      10    20      12              20    10
             \                  /               /        \
              20              10              12          12
         I            II           III         IV          V
    Among all possible BSTs, cost of the fifth BST is minimum.
    Cost of the fifth BST is 1*50 + 2*34 + 3*8 = 142

1) Optimal Substructure:
The optimal cost for freq[i..j] can be calculated recursively using the following formula, where optCost(i, j) is 0 whenever j < i:

$$
\mathrm{optCost}(i, j) = \sum_{k=i}^{j} \mathrm{freq}[k] + \min_{r=i}^{j} \bigl[ \mathrm{optCost}(i, r-1) + \mathrm{optCost}(r+1, j) \bigr]
$$

We need to calculate optCost(0, n-1) to find the result.

The idea of the formula is simple: we try every node in turn as the root (r varies from i to j in the second term). When the r-th node is the root, we recursively calculate the optimal cost for keys i to r-1 and for keys r+1 to j.
We add the sum of the frequencies from i to j (the first term in the formula) because every search goes through the root, so one comparison is made for every search.

2) Overlapping Subproblems
The following recursive implementation simply follows the recursive structure described above.

    // A naive recursive implementation of optimal binary search tree problem
    #include <stdio.h>
    #include <limits.h>

    // A utility function to get sum of array elements freq[i] to freq[j]
    int sum(int freq[], int i, int j);

    // A recursive function to calculate cost of optimal binary search tree
    int optCost(int freq[], int i, int j)
    {
        // Base cases
        if (j < i)      // If there are no elements in this subarray
            return 0;
        if (j == i)     // If there is one element in this subarray
            return freq[i];

        // Get sum of freq[i], freq[i+1], ... freq[j]
        int fsum = sum(freq, i, j);

        // Initialize minimum value
        int min = INT_MAX;

        // One by one consider all elements as root and recursively find cost
        // of the BST, compare the cost with min and update min if needed
        for (int r = i; r <= j; ++r)
        {
            int cost = optCost(freq, i, r-1) + optCost(freq, r+1, j);
            if (cost < min)
                min = cost;
        }

        // Return minimum value
        return min + fsum;
    }

    // The main function that calculates minimum cost of a Binary Search Tree.
    // It mainly uses optCost() to find the optimal cost.
    int optimalSearchTree(int keys[], int freq[], int n)
    {
        // Here array keys[] is assumed to be sorted in increasing order.
        // If keys[] is not sorted, then add code to sort keys, and rearrange
        // freq[] accordingly.
        return optCost(freq, 0, n-1);
    }

    // A utility function to get sum of array elements freq[i] to freq[j]
    int sum(int freq[], int i, int j)
    {
        int s = 0;
        for (int k = i; k <= j; k++)
            s += freq[k];
        return s;
    }

    // Driver program to test above functions
    int main()
    {
        int keys[] = {10, 12, 20};
        int freq[] = {34, 8, 50};
        int n = sizeof(keys)/sizeof(keys[0]);
        printf("Cost of Optimal BST is %d ", optimalSearchTree(keys, freq, n));
        return 0;
    }

Output:

    Cost of Optimal BST is 142

The time complexity of the naive recursive approach above is exponential, because the function computes the same subproblems again and again. Consider the recursion tree for freq[1..4]: optCost(2, 3) is needed once when index 1 is the root (through optCost(2, 4)) and again when index 4 is the root (through optCost(1, 3)).

Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the optimal BST problem has both properties of a dynamic programming problem. As in other typical Dynamic Programming (DP) problems, recomputation of the same subproblems can be avoided by constructing a temporary array cost[][] in a bottom-up manner.

Dynamic Programming Solution
The following is a C/C++ implementation of the optimal BST problem using Dynamic Programming. We use an auxiliary array cost[n][n] to store the solutions of subproblems; cost[0][n-1] will hold the final result.

The challenge in the implementation is the fill order: all diagonal values must be filled first, then the values on the line just above the diagonal, and so on. In other words, we must first fill all cost[i][i] values, then all cost[i][i+1] values, then all cost[i][i+2] values. So how do we fill the 2D array in this manner? The idea is the same as in the Matrix Chain Multiplication problem: we use a variable 'L' for the chain length and increment 'L' one step at a time. We calculate the column number 'j' from the values of 'i' and 'L'.

    // Dynamic Programming code for Optimal Binary Search Tree Problem
    #include <stdio.h>
    #include <limits.h>

    // A utility function to get sum of array elements freq[i] to freq[j]
    int sum(int freq[], int i, int j);

    /* A Dynamic Programming based function that calculates minimum cost of
       a Binary Search Tree. */
    int optimalSearchTree(int keys[], int freq[], int n)
    {
        /* Create an auxiliary 2D matrix to store results of subproblems */
        int cost[n][n];

        /* cost[i][j] = Optimal cost of binary search tree that can be
           formed from keys[i] to keys[j].
           cost[0][n-1] will store the resultant cost */

        // For a single key, cost is equal to frequency of the key
        for (int i = 0; i < n; i++)
            cost[i][i] = freq[i];

        // Now we need to consider chains of length 2, 3, ... .
        // L is chain length.
        for (int L = 2; L <= n; L++)
        {
            // i is row number in cost[][]. The last valid start is n-L,
            // so that j = i+L-1 never exceeds n-1.
            for (int i = 0; i <= n-L; i++)
            {
                // Get column number j from row number i and chain length L
                int j = i+L-1;
                cost[i][j] = INT_MAX;

                // Try making all keys in interval keys[i..j] as root
                for (int r = i; r <= j; r++)
                {
                    // c = cost when keys[r] becomes root of this subtree
                    int c = ((r > i) ? cost[i][r-1] : 0) +
                            ((r < j) ? cost[r+1][j] : 0) +
                            sum(freq, i, j);
                    if (c < cost[i][j])
                        cost[i][j] = c;
                }
            }
        }
        return cost[0][n-1];
    }

    // A utility function to get sum of array elements freq[i] to freq[j]
    int sum(int freq[], int i, int j)
    {
        int s = 0;
        for (int k = i; k <= j; k++)
            s += freq[k];
        return s;
    }

    // Driver program to test above functions
    int main()
    {
        int keys[] = {10, 12, 20};
        int freq[] = {34, 8, 50};
        int n = sizeof(keys)/sizeof(keys[0]);
        printf("Cost of Optimal BST is %d ", optimalSearchTree(keys, freq, n));
        return 0;
    }

Output:

    Cost of Optimal BST is 142

Notes
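1) The time complexity of the above solution is O(n^4): there are O(n^3) combinations of i, j and r, and each one calls sum(), which takes O(n) time. The time complexity can be reduced to O(n^3) by pre-calculating the sums of frequencies (for example, with a prefix-sum array) instead of calling sum() again and again.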

2) In the above solutions, we have computed the optimal cost only. The solutions can be easily modified to store the structure of the BST as well. We can create another auxiliary n x n array, root[][], and store the chosen 'r' for each (i, j) in the innermost loop; the tree is then rebuilt by starting from root[0][n-1] and recursing into the left and right key ranges.

