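Pros: high accuracy, insensitive to outliers, no assumptions about the input data.
Cons: high computational complexity and high space complexity.
Applicable data types: numeric and nominal values.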
 
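General workflow:
    (1). Collect data (for example by web scraping).
    (2). Prepare the data: convert it into a structured format.
    (3). Analyze the data.
    (4). Test the algorithm (mainly by computing the model's error rate).
    (5). Use the algorithm.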
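Step (4) boils down to classifying samples whose labels are already known and counting the mismatches. A minimal sketch, assuming a hypothetical `error_rate` helper and a stand-in `toy_predict` classifier (neither name comes from the original post):

```python
def error_rate(predict, samples, labels):
    """Fraction of samples whose predicted label differs from the true label.

    `predict` is any callable that maps one sample to a label (hypothetical).
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    errors = sum(1 for x, y in zip(samples, labels) if predict(x) != y)
    return errors / len(samples)


# Toy check with a stand-in "classifier" that labels numbers by sign.
def toy_predict(x):
    return "pos" if x >= 0 else "neg"


rate = error_rate(toy_predict, [-2, -1, 3, 4], ["neg", "pos", "pos", "pos"])
print(rate)
```

Only the second sample is misclassified here, so one error out of four samples is counted.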
 
The k-nearest neighbors (kNN) algorithm classifies data by measuring the distance between feature values.
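The code in this post measures that distance as the Euclidean distance. For two samples $x$ and $y$ with $n$ features each (the symbols are notation chosen here for illustration, not the original post's):

```latex
d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
```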
 
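How it works: there is a training sample set in which every sample carries a label, so the class of each sample is known.
When new unlabeled data arrives, each of its features is compared with the corresponding features of the samples in the training set.
The algorithm then extracts the class labels of the k most similar samples (the nearest neighbors) in the training set; k is usually no greater than 20.
The class that occurs most often among those k samples becomes the class of the new data.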
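The procedure above can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and a hypothetical `knn_predict` helper; it is a compact variant of the idea, not the listing given later in this post.

```python
from collections import Counter

import numpy as np


def knn_predict(query, samples, labels, k):
    """Return the majority label among the k training samples nearest to query."""
    samples = np.asarray(samples, dtype=float)
    query = np.asarray(query, dtype=float)
    # Broadcasting subtracts the query from every row at once.
    distances = np.sqrt(((samples - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# The same four toy samples that the post's createDataSet() returns.
samples = [[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]]
labels = ["A", "A", "B", "B"]
print(knn_predict([1.0, 1.2], samples, labels, k=3))
print(knn_predict([0.0, 0.4], samples, labels, k=3))
```

With k=3, the first query has two "A" neighbors and one "B" neighbor, and the second has two "B" neighbors and one "A" neighbor, so the majority vote decides each case.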
 
 
Data normalization is needed only when the features differ widely in numeric range yet are considered equally important.
 
[Image: equation computing the distance between two samples]
 
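In the equation above, the attribute with the largest numeric differences has the greatest influence on the result, simply because frequent flyer miles are much larger than the other feature values. We nevertheless consider the three features equally important, so they should carry equal weight.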
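One common way to give features equal weight is min-max scaling, which maps every column to the range [0, 1]. The post does not show its normalization code, so the sketch below is an assumption: the `min_max_normalize` name and the sample values are hypothetical.

```python
import numpy as np


def min_max_normalize(data):
    """Scale each column of data to [0, 1] via (value - min) / (max - min)."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    # Avoid division by zero for constant columns; they map to 0.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (data - col_min) / safe_range, col_range, col_min


# Made-up rows: three features on very different scales.
raw = [[40000.0, 8.0, 1.0], [14000.0, 7.0, 1.5], [26000.0, 1.0, 0.5]]
normalized, ranges, minimums = min_max_normalize(raw)
print(normalized)
```

The returned ranges and minimums are kept so that a new sample can be scaled with the same parameters before it is classified.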
 
Straight to the code:
import operator
from os import listdir

import matplotlib.pyplot as plot
from numpy import array, tile, zeros


def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # Euclidean distance from inX to every training sample
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat ** 2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances ** 0.5
    # Sort the distances in ascending order and return the sample indices
    sortedDistIndicies = distances.argsort()
    # Empty dictionary that counts the votes for each label
    classCount = {}
    for i in range(k):
        # sortedDistIndicies[0] is the index of the nearest sample,
        # labels[sortedDistIndicies[0]] is the label of that sample
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    # Sort by vote count, descending; sortedClassCount[0][0] is the label
    # with the most votes among the k neighbors
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    print(sortedClassCount[0][0])
    return sortedClassCount[0][0]


# Create the data set
def createDataSet():
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels


# Plot the points
def draw(xs, ys):
    fig = plot.figure()
    # 221 splits the figure into a 2x2 grid and draws in the first cell
    # (counting left to right, top to bottom)
    ax = fig.add_subplot(221)
    # ax.scatter(xs, ys) takes the x coordinates and the y coordinates of all points
    ax.scatter(xs, ys)
    plot.show()


def firstTest():
    test1 = (1.0, 1.2)
    test2 = (0.0, 0.4)
    dataset, labels = createDataSet()
    conclusion1 = classify0(test1, dataset, labels, 3)
    conclusion2 = classify0(test2, dataset, labels, 3)
    print(str(test1) + " is classified as class " + conclusion1)
    print(str(test2) + " is classified as class " + conclusion2)


# Read a 32*32 matrix of digits into a 1*1024 vector
def img2vector(filename):
    returnVect = zeros((1, 1024))
    # The with block closes the file even if parsing fails
    with open(filename) as fr:
        for i in range(32):
            lineStr = fr.readline()
            for j in range(32):
                returnVect[0, 32 * i + j] = int(lineStr[j])
    return returnVect


def handwritingClassTest():
    hwLabels = []
    # Load the training set
    trainingFileList = listdir('digits/trainingDigits')
    # Number of training samples
    m = len(trainingFileList)
    # Matrix with m rows and 1024 columns
    trainingMat = zeros((m, 1024))
    # The number to the left of the underscore in the file name is the label
    for i in range(m):
        fileNameStr = trainingFileList[i]
        fileStr = fileNameStr.split('.')[0]
        # Class label
        classNumStr = int(fileStr.split('_')[0])
        hwLabels.append(classNumStr)
        trainingMat[i, :] = img2vector('digits/trainingDigits/%s' % fileNameStr)
    testFileList = listdir('digits/testDigits')
    errorCount = 0.0
    mTest = len(testFileList)
    for i in range(mTest):
        fileNameStr = testFileList[i]
        fileStr = fileNameStr.split('.')[0]  # take off .txt
        classNumStr = int(fileStr.split('_')[0])
        vectorUnderTest = img2vector('digits/testDigits/%s' % fileNameStr)
        classifierResult = classify0(vectorUnderTest, trainingMat, hwLabels, 3)
        print("the classifier came back with: %d, the real answer is: %d" % (classifierResult, classNumStr))
        if classifierResult != classNumStr:
            errorCount += 1.0
    print("\nthe total number of errors is: %d" % errorCount)
    print("\nthe total error rate is: %f" % (errorCount / float(mTest)))


# Entry point: call the module's functions
if __name__ == "__main__":
    # group, label = createDataSet()
    # # group[:, 0] is column 0 of every row
    # draw(group[:, 0], group[:, 1])
    # # print(group[:, 0])
    # firstTest()
    handwritingClassTest()

Training and test data sets: https://gitee.com/lcl1993213/plist
