Over the past few days I have been reading some papers on sampling matrices; here is a brief summary.

  • FAST MONTE CARLO ALGORITHMS FOR MATRICES I: APPROXIMATING MATRIX MULTIPLICATION

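The algorithm approximates the matrix product AB by CR, where C consists of c sampled and rescaled columns of A and R consists of the corresponding rescaled rows of B.

In the notation used here, A with superscript i_t, written A^{(i_t)}, denotes the i_t-th column of A, and B with subscript i_t, written B_{(i_t)}, denotes the i_t-th row of B.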
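As a hedged reconstruction, the estimator that the test code in this post implements can be written as follows. The probabilities p_k are read off from `Generate_P` and the scaling from `Generate_S`; the notation is an assumption and may differ from the paper's.

```latex
\begin{align*}
p_k &= \frac{\|A^{(k)}\|_2 \,\|B_{(k)}\|_2}{\sum_{j=1}^{n} \|A^{(j)}\|_2 \,\|B_{(j)}\|_2},
  \qquad k = 1, \dots, n, \\
CR &= \sum_{t=1}^{c} \frac{1}{c\, p_{i_t}}\, A^{(i_t)} B_{(i_t)},
  \qquad i_1, \dots, i_c \ \text{drawn i.i.d. with } \Pr[i_t = k] = p_k, \\
\mathbb{E}[CR] &= \sum_{t=1}^{c} \sum_{k=1}^{n} p_k \,\frac{1}{c\, p_k}\, A^{(k)} B_{(k)}
  = \sum_{k=1}^{n} A^{(k)} B_{(k)} = AB .
\end{align*}
```

The last line shows that the estimator is unbiased for any choice of probabilities with p_k > 0.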

For the choice of c and the error estimates, refer to the paper.
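For orientation, here is a minimal sketch of how c could be chosen. It assumes my reading of the paper's main bound, which should be checked against the paper: with near-optimal probabilities (parameter beta), E‖AB − CR‖_F² ≤ ‖A‖_F²‖B‖_F² / (beta·c), and with probability at least 1 − delta, ‖AB − CR‖_F ≤ eta / sqrt(beta·c) · ‖A‖_F‖B‖_F, where eta = 1 + sqrt((8/beta)·log(1/delta)). The function names are hypothetical, and treating `Thelta` in the test code as delta is an assumption.

```python
import math


def eta(beta, delta):
    # Constant from the high-probability bound (same formula as Yita in the test code).
    return 1.0 + math.sqrt((8.0 / beta) * math.log(1.0 / delta))


def c_for_expected_error(eps, beta=1.0):
    # Smallest c with 1 / sqrt(beta * c) <= eps (bound on the root-mean-square error).
    return math.ceil(1.0 / (beta * eps ** 2))


def c_for_high_probability(eps, beta=1.0, delta=0.25):
    # Smallest c with eta / sqrt(beta * c) <= eps (holds with probability >= 1 - delta).
    return math.ceil(eta(beta, delta) ** 2 / (beta * eps ** 2))


eps, beta, delta = 1 / 5, 1.0, 1 / 4
c_mean = c_for_expected_error(eps, beta)
c_whp = c_for_high_probability(eps, beta, delta)
print(c_mean, c_whp)
```

Under these assumptions the high-probability requirement is larger than the expectation-based one by a factor of eta squared, so it can easily exceed the inner dimension n for small matrices.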

Below is a small test.

Code:

    import numpy as np


    def Generate_P(A, B):
        """Sampling probabilities P, with P[k] proportional to |A[:, k]| * |B[k, :]|."""
        # A is m x n and B is n x p, so the inner dimensions must agree.
        if A.ndim != 2 or B.ndim != 2 or A.shape[1] != B.shape[0]:
            raise ValueError('Bad matrices: the columns of A must match the rows of B')
        P_A = np.linalg.norm(A, axis=0)  # Euclidean norm of each column of A
        P_B = np.linalg.norm(B, axis=1)  # Euclidean norm of each row of B
        P = P_A * P_B
        return P / np.sum(P)


    def Generate_S(n, c, P):
        """Build the n x c sampling matrix S (a simplified version of the algorithm)."""
        S = np.zeros((n, c))
        T = np.random.choice(n, size=c, replace=True, p=P)  # sampled indices i_t
        for i in range(c):
            S[T[i], i] = 1 / np.sqrt(c * P[T[i]])
        return S


    def Summary(times, n, c, P, A, B, A_F, B_F, AB):
        """Run the sampler `times` times and print the error statistics."""
        # A and B are passed in explicitly instead of being read from globals.
        row = '{0:^15.5f} {1:^15.5f} {2:^15.5f} {3:^15.5f} {4:^15.5f} {5:^15.3%} {6:^15.3%}'
        rule = ' '.join(['-' * 15] * 7)
        print('{0:^15} {1:^15} {2:^15} {3:^15} {4:^15} {5:^15} {6:^15}'.format(
            'A_F', 'B_F', 'NEW_F', 'A_F * B_F', 'AB_F', 'RATIO', 'RATIO2'))
        print(rule)
        A_F_B_F = A_F * B_F
        AB_F = np.linalg.norm(AB)  # Frobenius norm of the exact product
        Max, Min = -np.inf, np.inf
        Max2 = Min2 = Max_NEW_F = Min_NEW_F = 0.0
        Mean_NEW_F = Mean_ratio = Mean_ratio2 = 0.0
        for i in range(times):
            S = Generate_S(n, c, P)
            CR = np.dot(A.dot(S), (S.T).dot(B))  # C = AS, R = S^T B
            NEW = AB - CR
            NEW_F = np.linalg.norm(NEW)  # Frobenius norm of the error
            ratio = NEW_F / A_F_B_F      # error relative to ||A||_F * ||B||_F
            ratio2 = NEW_F / AB_F        # error relative to ||AB||_F
            Mean_NEW_F += NEW_F
            Mean_ratio += ratio
            Mean_ratio2 += ratio2
            if ratio > Max:
                Max = ratio
                Max2 = ratio2
                Max_NEW_F = NEW_F
            if ratio < Min:
                Min = ratio
                Min2 = ratio2
                Min_NEW_F = NEW_F
            print(row.format(A_F, B_F, NEW_F, A_F_B_F, AB_F, ratio, ratio2))
        Mean_NEW_F = Mean_NEW_F / times
        Mean_ratio = Mean_ratio / times
        Mean_ratio2 = Mean_ratio2 / times
        print(rule)
        print(row.format(A_F, B_F, Mean_NEW_F, A_F_B_F, AB_F, Mean_ratio, Mean_ratio2))
        print(rule)
        print('Count: {0} times'.format(times))
        print('Max_ratio: {0:<15.3%} Min_ratio: {1:<15.3%}'.format(Max, Min))
        print('Max_ratio2: {0:<15.3%} Min_ratio2: {1:<15.3%}'.format(Max2, Min2))
        print('Max_NEW_F: {0:<15.5f} Min_NEW_F: {1:<15.5f}'.format(Max_NEW_F, Min_NEW_F))


    # Matrix dimensions; the entries are drawn from a uniform distribution.
    m = 47
    n = 120
    p = 55
    A = np.random.rand(m, n) * 100
    B = np.random.rand(n, p) * 100

    # Parameters used to choose c; see the paper.
    Thelta = 1/4
    Belta = 1
    Yita = 1 + np.sqrt(8 / Belta * np.log(1 / Thelta))  # computed as in the paper, unused below
    e = 1/5
    c = int(1 / (Belta * e ** 2)) + 1
    P = Generate_P(A, B)

    # Analysis of the results.
    AB = A.dot(B)
    A_F = np.linalg.norm(A)  # Frobenius norm of A
    B_F = np.linalg.norm(B)  # Frobenius norm of B
    times = 1000
    Summary(times, n, c, P, A, B, A_F, B_F, AB)
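The explicit matrix S is convenient for matching the notation, but it is mostly zeros. As a minimal sketch (the helper name `sampled_product` is hypothetical), the same CR can be formed by indexing the sampled columns of A and rows of B directly; the check at the end compares it with the explicit A S Sᵀ B form.

```python
import numpy as np


def sampled_product(A, B, idx, P):
    """Form CR from the sampled indices without building S."""
    c = len(idx)
    scale = 1.0 / (c * P[idx])   # weight 1 / (c * p_{i_t}) for each sample
    C = A[:, idx] * scale        # sampled columns of A, rescaled column by column
    R = B[idx, :]                # matching rows of B
    return C @ R


rng = np.random.default_rng(0)
A = rng.random((47, 120)) * 100
B = rng.random((120, 55)) * 100

# Probabilities proportional to column norm of A times row norm of B.
P = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
P = P / P.sum()

c = 25
idx = rng.choice(A.shape[1], size=c, replace=True, p=P)

# Explicit sampling matrix, for comparison only.
S = np.zeros((A.shape[1], c))
S[idx, np.arange(c)] = 1.0 / np.sqrt(c * P[idx])
CR_explicit = (A @ S) @ (S.T @ B)
CR_direct = sampled_product(A, B, idx, P)

print(np.allclose(CR_explicit, CR_direct))
```

The direct form touches only c columns of A and c rows of B, which is the point of the method when n is large.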

Rough results:

Using about half of the original dimension costs roughly 17% error.

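When the matrices are generated from a normal distribution instead, I found that the standard normal case performs very poorly. My guess was floating-point rounding error, but a likelier cause is cancellation: with zero-mean entries, ‖AB‖_F is much smaller than ‖A‖_F‖B‖_F, so an error of the same absolute size looks large relative to ‖AB‖_F. As the mean increases, the results become similar to the "uniform" case, or even better (in the Frobenius-norm sense).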
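A quick way to probe this observation is to measure the error against both ‖A‖_F‖B‖_F and ‖AB‖_F for each distribution. The sketch below is only illustrative: the helper `mean_relative_errors`, the choice c = 25, the 200 trials, and the normal(10, 1) case are all assumptions, not the settings behind the numbers quoted above.

```python
import numpy as np


def mean_relative_errors(A, B, c, trials, rng):
    """Mean ||AB - CR||_F relative to ||A||_F * ||B||_F and relative to ||AB||_F."""
    AB = A @ B
    P = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    P = P / P.sum()
    errs = np.empty(trials)
    for t in range(trials):
        idx = rng.choice(A.shape[1], size=c, replace=True, p=P)
        CR = (A[:, idx] / (c * P[idx])) @ B[idx, :]   # rescaled columns times rows
        errs[t] = np.linalg.norm(AB - CR)
    err = errs.mean()
    return err / (np.linalg.norm(A) * np.linalg.norm(B)), err / np.linalg.norm(AB)


rng = np.random.default_rng(0)
m, n, p, c, trials = 47, 120, 55, 25, 200

# Each entry maps a label to a function drawing a matrix of the given shape.
cases = {
    'uniform [0, 100)': lambda shape: rng.random(shape) * 100,
    'normal(0, 1)': lambda shape: rng.normal(0.0, 1.0, shape),
    'normal(10, 1)': lambda shape: rng.normal(10.0, 1.0, shape),
}

results = {}
for name, draw in cases.items():
    A, B = draw((m, n)), draw((n, p))
    results[name] = mean_relative_errors(A, B, c, trials, rng)
    print('{0:<18} ratio = {1:.3%}  ratio2 = {2:.3%}'.format(name, *results[name]))
```

Comparing the two columns for the zero-mean case against the others shows whether the poor result comes from the sampling itself or from the size of ‖AB‖_F used as the reference.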
