Euclidean distance score:


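Take the items that people have rated in common as the axes, plot each person on the chart according to their ratings, and look at how far apart they are. To get the distance, take the difference along each axis, square it, add the squares together, and take the square root of the sum. The function below turns that distance d into a similarity of 1/(1+d), so identical ratings score 1 and the score falls toward 0 as the distance grows.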
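Written out as a formula, the calculation looks as follows. The symbols are introduced here for illustration only: $x_i$ and $y_i$ stand for the two people's ratings of the $i$-th item they have both rated, and $n$ is the number of such items. The similarity on the right is the form the code in this section returns.

```latex
d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2},
\qquad
\mathrm{sim}(x, y) = \frac{1}{1 + d(x, y)}
```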

# -*- coding: UTF-8 -*-
from math import sqrt

# A dictionary of movie critics and their ratings of a small set of films
critics={'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
 'Just My Luck': 3.0, 'Superman Returns': 3.5, 'You, Me and Dupree': 2.5,
 'The Night Listener': 3.0},
'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
 'Just My Luck': 1.5, 'Superman Returns': 5.0, 'The Night Listener': 3.0,
 'You, Me and Dupree': 3.5},
'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
 'Superman Returns': 3.5, 'The Night Listener': 4.0},
'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
 'The Night Listener': 4.5, 'Superman Returns': 4.0,
 'You, Me and Dupree': 2.5},
'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
 'Just My Luck': 2.0, 'Superman Returns': 3.0, 'The Night Listener': 3.0,
 'You, Me and Dupree': 2.0},
'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
 'The Night Listener': 3.0, 'Superman Returns': 5.0, 'You, Me and Dupree': 3.5},
'Toby': {'Snakes on a Plane':4.5,'You, Me and Dupree':1.0,'Superman Returns':4.0}}

# Returns a distance-based similarity score for person1 and person2
def sim_distance(prefs,person1,person2):
    # Collect the items both people have rated
    si={}
    for item in prefs[person1]:
        if item in prefs[person2]:
            si[item]=1

    # If they have no ratings in common, return 0
    if len(si)==0: return 0

    # Add up the squares of all the differences
    sum_of_squares=sum([pow(prefs[person1][item]-prefs[person2][item],2)
                        for item in prefs[person1] if item in prefs[person2]])

    return 1/(1+sqrt(sum_of_squares))

print(sim_distance(critics,'Lisa Rose','Gene Seymour'))
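To make the arithmetic concrete, the sketch below recomputes the score for Lisa Rose and Gene Seymour step by step from the six films they both rated. The ratings are copied from the critics dictionary; the variable names (`lisa`, `gene`, `squared_diffs`) are illustrative and not part of the original code.

```python
from math import sqrt

# Ratings copied from the critics dictionary
lisa = {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
        'Superman Returns': 3.5, 'You, Me and Dupree': 2.5, 'The Night Listener': 3.0}
gene = {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5, 'Just My Luck': 1.5,
        'Superman Returns': 5.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 3.5}

# Films rated by both critics
shared = [film for film in lisa if film in gene]

# Squared difference along each axis (one axis per shared film)
squared_diffs = {film: (lisa[film] - gene[film]) ** 2 for film in shared}

total = sum(squared_diffs.values())   # 0.25 + 0 + 2.25 + 2.25 + 1.0 + 0 = 5.75
distance = sqrt(total)                # Euclidean distance between the two critics
similarity = 1 / (1 + distance)       # 1 for identical ratings, toward 0 as distance grows

print(total, round(distance, 4), round(similarity, 4))
```

This prints `5.75 2.3979 0.2943`: the two critics disagree mainly on Just My Luck and Superman Returns, which together contribute 4.5 of the 5.75.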

Pearson correlation score:

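Mick LaSalle gave Superman Returns a 3 and Gene Seymour gave it a 5, so the film is placed at (3, 5) on the chart. The chart also shows a straight line.

The Pearson correlation coefficient measures how well two sets of data fit a straight line.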

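As a rule of thumb, the absolute value of the coefficient is read as follows:

0.8-1.0: very strong correlation

0.6-0.8: strong correlation

0.4-0.6: moderate correlation

0.2-0.4: weak correlation

0.0-0.2: very weak or no correlation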
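The bands above can be turned into a small lookup. This is only a sketch: `describe_strength` is a hypothetical helper, not part of the original code, and because the listed ranges share their endpoints, it assigns a boundary value such as 0.8 to the stronger band, which is an arbitrary choice.

```python
def describe_strength(r):
    """Map a correlation coefficient to the rule-of-thumb label for |r|.

    Hypothetical helper for illustration; boundary values go to the
    stronger band because the listed ranges overlap at their edges.
    """
    strength = abs(r)   # the sign only gives the direction, not the strength
    if strength >= 0.8:
        return "very strong"
    if strength >= 0.6:
        return "strong"
    if strength >= 0.4:
        return "moderate"
    if strength >= 0.2:
        return "weak"
    return "very weak or none"

print(describe_strength(0.396))
print(describe_strength(-0.9))
```

The first call falls in the 0.2-0.4 band, and the second shows that a strongly negative coefficient is still a strong correlation.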

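Best-fit line: the line that comes as close as possible to all the points on the chart.

The Pearson score also corrects for grade inflation: if one critic consistently gives higher scores than another, the two can still be perfectly correlated, as long as the difference between their scores stays consistent.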
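A small experiment illustrates the grade-inflation point. The ratings here are made up for the purpose (they are not from the critics data), and the two helper functions work on plain lists rather than on the nested dictionary used elsewhere in this post.

```python
from math import sqrt

def euclidean_similarity(xs, ys):
    """Distance-based score 1/(1+d) for two equal-length rating lists."""
    return 1 / (1 + sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys))))

def pearson(xs, ys):
    """Pearson correlation via deviations from the mean (0 if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mean_x) ** 2 for x in xs) *
               sum((y - mean_y) ** 2 for y in ys))
    return num / den if den else 0

# Made-up ratings: the second critic has the same taste
# but scores every film exactly one point higher.
strict = [1.0, 2.0, 3.0, 4.0]
generous = [x + 1.0 for x in strict]

print(euclidean_similarity(strict, generous))
print(pearson(strict, generous))
```

The distance-based score comes out at about 0.33 because every rating differs by one point, while the Pearson score is 1.0 because the two lists rise and fall together.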

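Pearson product-moment correlation coefficient:

  Mathematical properties:

  $$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E\big((X-\mu_X)(Y-\mu_Y)\big)}{\sigma_X \sigma_Y}$$

     where E denotes the expected value and cov denotes the covariance.

     Since $\mu_X = E(X)$ and $\sigma_X^2 = E(X^2) - E^2(X)$, and likewise for Y, this can also be written as

  $$\rho_{X,Y} = \frac{E(XY) - E(X)E(Y)}{\sqrt{E(X^2) - E^2(X)}\,\sqrt{E(Y^2) - E^2(Y)}}$$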

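  The correlation coefficient is defined only when both variables have a non-zero standard deviation. By the Cauchy-Schwarz inequality, its absolute value never exceeds 1. As the linear relationship between the two variables gets stronger, the coefficient approaches 1 or -1. When one variable increases as the other increases, the coefficient is greater than 0; when one increases as the other decreases, it is less than 0. When the two variables are independent, the coefficient is 0, but the converse does not hold, because the coefficient only reflects whether the two variables are linearly related. For example, let X be uniformly distributed on the interval [-1, 1] and let Y = X². Y is completely determined by X, so X and Y are not independent, yet their correlation coefficient is 0; in other words, they are uncorrelated. When X and Y follow a joint normal distribution, independence and uncorrelatedness are equivalent.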
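The Y = X² example can be checked numerically. The sketch below assumes an evenly spaced grid on [-1, 1] as a stand-in for the uniform distribution (a simplification chosen here so the result is deterministic), and uses a hypothetical list-based `pearson` helper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation via deviations from the mean (0 if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mean_x) ** 2 for x in xs) *
               sum((y - mean_y) ** 2 for y in ys))
    return num / den if den else 0

# Evenly spaced points on [-1, 1], symmetric around 0
xs = [i / 100 for i in range(-100, 101)]
# Y is fully determined by X, so the two are certainly not independent
ys = [x * x for x in xs]

r = pearson(xs, ys)
print(abs(r) < 1e-9)   # the linear correlation vanishes up to rounding error
```

The positive and negative halves of the grid cancel in the covariance, so the coefficient is zero even though knowing X tells you Y exactly.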

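  Given two variables X and Y, the Pearson correlation coefficient between them can be computed with any of the following formulas:

  Formula 1:

  $$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E(XY) - E(X)E(Y)}{\sqrt{E(X^2) - E^2(X)}\,\sqrt{E(Y^2) - E^2(Y)}}$$

  Formula 2:

  $$\rho_{X,Y} = \frac{N\sum XY - \sum X \sum Y}{\sqrt{N\sum X^2 - \left(\sum X\right)^2}\,\sqrt{N\sum Y^2 - \left(\sum Y\right)^2}}$$

  Formula 3:

  $$\rho_{X,Y} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2}\,\sqrt{\sum (Y - \bar{Y})^2}}$$

  Formula 4:

  $$\rho_{X,Y} = \frac{\sum XY - \dfrac{\sum X \sum Y}{N}}{\sqrt{\left(\sum X^2 - \dfrac{\left(\sum X\right)^2}{N}\right)\left(\sum Y^2 - \dfrac{\left(\sum Y\right)^2}{N}\right)}}$$

  The four formulas above are equivalent. E is the expected value, cov is the covariance, and N is the number of values each variable takes.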
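The equivalence can be checked numerically. The sketch below evaluates four algebraic forms of the coefficient on Lisa Rose's and Gene Seymour's six shared ratings: an expectation form, the same form scaled by N², a deviations-from-the-mean form, and a raw-sums form. The variable names are illustrative, and the forms are labeled by their shape rather than by formula number.

```python
from math import sqrt

# Ratings of the six films both critics reviewed, in the same film order
# (values copied from the critics dictionary)
x = [2.5, 3.5, 3.0, 3.5, 2.5, 3.0]   # Lisa Rose
y = [3.0, 3.5, 1.5, 5.0, 3.5, 3.0]   # Gene Seymour
n = len(x)

sx, sy = sum(x), sum(y)
sxx = sum(a * a for a in x)
syy = sum(b * b for b in y)
sxy = sum(a * b for a, b in zip(x, y))
mx, my = sx / n, sy / n

# Expectation form: E(XY) - E(X)E(Y) over the product of standard deviations
r_expectation = (sxy / n - mx * my) / (
    sqrt(sxx / n - mx ** 2) * sqrt(syy / n - my ** 2))

# Same expression with numerator and denominator multiplied by N^2
r_scaled = (n * sxy - sx * sy) / (
    sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))

# Deviations from the mean
r_deviation = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (
    sqrt(sum((a - mx) ** 2 for a in x)) * sqrt(sum((b - my) ** 2 for b in y)))

# Raw sums, i.e. the N^2-scaled form divided through by N
r_sums = (sxy - sx * sy / n) / sqrt((sxx - sx ** 2 / n) * (syy - sy ** 2 / n))

results = [r_expectation, r_scaled, r_deviation, r_sums]
print([round(r, 6) for r in results])
```

All four values agree to within floating-point rounding, at roughly 0.396. In practice the deviation form tends to be the most robust numerically, since the other three subtract large, nearly equal sums.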

Code using Formula 1:

# -*- coding: UTF-8 -*-
from math import sqrt

# A dictionary of movie critics and their ratings of a small set of films
critics={'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
 'Just My Luck': 3.0, 'Superman Returns': 3.5, 'You, Me and Dupree': 2.5,
 'The Night Listener': 3.0},
'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
 'Just My Luck': 1.5, 'Superman Returns': 5.0, 'The Night Listener': 3.0,
 'You, Me and Dupree': 3.5},
'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
 'Superman Returns': 3.5, 'The Night Listener': 4.0},
'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
 'The Night Listener': 4.5, 'Superman Returns': 4.0,
 'You, Me and Dupree': 2.5},
'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
 'Just My Luck': 2.0, 'Superman Returns': 3.0, 'The Night Listener': 3.0,
 'You, Me and Dupree': 2.0},
'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
 'The Night Listener': 3.0, 'Superman Returns': 5.0, 'You, Me and Dupree': 3.5},
'Toby': {'Snakes on a Plane':4.5,'You, Me and Dupree':1.0,'Superman Returns':4.0}}

# Returns the Pearson correlation coefficient for p1 and p2
def sim_pearson(prefs,p1,p2):
    # Collect the items both people have rated
    si={}
    for item in prefs[p1]:
        if item in prefs[p2]:
            si[item]=1

    # Number of shared items
    n=len(si)

    # If they have no ratings in common, return 0
    if n==0: return 0

    # Add up all the preferences
    sum1=sum([prefs[p1][it] for it in si])
    sum2=sum([prefs[p2][it] for it in si])

    # Sum up the squares
    sum1Sq=sum([pow(prefs[p1][it],2) for it in si])
    sum2Sq=sum([pow(prefs[p2][it],2) for it in si])

    # Sum up the products
    pSum=sum([prefs[p1][it]*prefs[p2][it] for it in si])

    # Calculate the Pearson score: E(XY)-E(X)E(Y) over the product of
    # the two standard deviations
    num=pSum/n-(sum1*sum2)/(n*n)
    den=sqrt((sum1Sq/n-pow(sum1,2)/(n*n))*(sum2Sq/n-pow(sum2,2)/(n*n)))
    if den==0:
        return 0
    r=num/den
    return r

print(sim_pearson(critics,'Lisa Rose','Gene Seymour'))
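One natural use of a similarity score is to rank everyone against a given person. The sketch below does this for Toby. It makes several assumptions: `rank_similar` is a hypothetical helper, `pearson` is a compact deviation-form variant rather than the function above, and `ratings` is a trimmed copy of the critics data limited to the three films Toby rated. The trimming does not change the scores, because only films shared with Toby enter the calculation.

```python
from math import sqrt

# Trimmed copy of the critics data: only the three films Toby has rated
ratings = {
    'Lisa Rose': {'Snakes on a Plane': 3.5, 'Superman Returns': 3.5, 'You, Me and Dupree': 2.5},
    'Gene Seymour': {'Snakes on a Plane': 3.5, 'Superman Returns': 5.0, 'You, Me and Dupree': 3.5},
    'Michael Phillips': {'Snakes on a Plane': 3.0, 'Superman Returns': 3.5},
    'Claudia Puig': {'Snakes on a Plane': 3.5, 'Superman Returns': 4.0, 'You, Me and Dupree': 2.5},
    'Mick LaSalle': {'Snakes on a Plane': 4.0, 'Superman Returns': 3.0, 'You, Me and Dupree': 2.0},
    'Jack Matthews': {'Snakes on a Plane': 4.0, 'Superman Returns': 5.0, 'You, Me and Dupree': 3.5},
    'Toby': {'Snakes on a Plane': 4.5, 'Superman Returns': 4.0, 'You, Me and Dupree': 1.0},
}

def pearson(prefs, p1, p2):
    """Pearson correlation over the items both people rated (0 if undefined)."""
    shared = [item for item in prefs[p1] if item in prefs[p2]]
    n = len(shared)
    if n == 0:
        return 0
    mean1 = sum(prefs[p1][it] for it in shared) / n
    mean2 = sum(prefs[p2][it] for it in shared) / n
    num = sum((prefs[p1][it] - mean1) * (prefs[p2][it] - mean2) for it in shared)
    den = sqrt(sum((prefs[p1][it] - mean1) ** 2 for it in shared) *
               sum((prefs[p2][it] - mean2) ** 2 for it in shared))
    return num / den if den else 0

def rank_similar(prefs, person, similarity=pearson):
    """Hypothetical helper: everyone else, sorted from most to least similar."""
    scores = [(similarity(prefs, person, other), other)
              for other in prefs if other != person]
    return sorted(scores, reverse=True)

ranking = rank_similar(ratings, 'Toby')
for score, name in ranking:
    print(round(score, 3), name)
```

With these ratings, Lisa Rose comes out closest to Toby, followed by Mick LaSalle and Claudia Puig, while Michael Phillips, who shares only two films with Toby and rates them in the opposite order, ends up at -1.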
