Algorithm:

Refrence from one ICML15 paper: Word Mover's Distance.

1. First use Google's word2vec tool to get distributed word representing aka. word vectors.

2. Then use earth mover's distance as similarity measure metric.

3. Solve the EMD problem as transportation problem by Hungarian Algorithm.


Outcome:

Result looks not bad, but still have ways to improve the precision.

For example: use n-gram to keep a little bit sentence structure.

Distributed Sentence Similarity Base on Word Mover's Distance的更多相关文章

  1. 唐诗掠影:基于词移距离(Word Mover's Distance)的唐诗诗句匹配实践

    词移距离(Word Mover's Distance)是在词向量的基础上发展而来的用来衡量文档相似性的度量.   词移距离的具体介绍参考http://blog.csdn.net/qrlhl/artic ...

  2. [LeetCode] Sentence Similarity II 句子相似度之二

    Given two sentences words1, words2 (each represented as an array of strings), and a list of similar ...

  3. 734. Sentence Similarity 有字典数组的相似句子

    [抄题]: Given two sentences words1, words2 (each represented as an array of strings), and a list of si ...

  4. [LeetCode] 737. Sentence Similarity II 句子相似度之二

    Given two sentences words1, words2 (each represented as an array of strings), and a list of similar ...

  5. LeetCode 734. Sentence Similarity

    原题链接在这里:https://leetcode.com/problems/sentence-similarity/ 题目: Given two sentences words1, words2 (e ...

  6. [LeetCode] 737. Sentence Similarity II 句子相似度 II

    Given two sentences words1, words2 (each represented as an array of strings), and a list of similar ...

  7. [LeetCode] 734. Sentence Similarity 句子相似度

    Given two sentences words1, words2 (each represented as an array of strings), and a list of similar ...

  8. Comparing Sentence Similarity Methods

    Reference:Comparing Sentence Similarity Methods,知乎.

  9. 论文阅读笔记: Multi-Perspective Sentence Similarity Modeling with Convolution Neural Networks

    论文概况 Multi-Perspective Sentence Similarity Modeling with Convolution Neural Networks是处理比较两个句子相似度的问题, ...

随机推荐

  1. Sqlserver数据库分页查询

    Sqlserver数据库分页查询一直是Sqlserver的短板,闲来无事,想出几种方法,假设有表ARTICLE,字段ID.YEAR...(其他省略),数据53210条(客户真实数据,量不大),分页查询 ...

  2. Java API —— List接口&ListIterator接口

    1.List接口概述         有序的 collection(也称为序列).此接口的用户可以对列表中每个元素的插入位置进行精确地控制.用户可以根据元素的整数索引(在列表中的位置)访问元素,并搜索 ...

  3. epoll和poll效率差异

    http://blog.163.com/sky20081816@126/blog/static/164761023201073033517435/ 百度“epoll和poll”

  4. [转]useradd 与adduser的区别

    转自:Deit_Aaron的专栏 添加用户:useradd -m 用户名  然后设置密码  passwd 用户名 删除用户:userdel  -r  用户名 1. 在root权限下,useradd只是 ...

  5. Android加速度传感器实现“摇一摇”,带手机振动

    由于代码有点多,所以就分开写了,注释还算详细,方便学习 Activity package com.lmw.android.test;   import android.app.Activity; im ...

  6. truncate table 和delete

    delete table 和 truncate table 使用delete语句删除数据的一般语法格式: delete [from] {table_name.view_name} [where< ...

  7. ZJOI2006物流运输

    唉,没想出来…… 注意到预处理的作用.还有CLJ大牛说的话:这么小的数据,想干什么都可以. SPFA预处理+DP 够经典 var f:..,..]of longint; a:..,..]of bool ...

  8. ti processor sdk linux am335x evm setup.sh hacking

    #!/bin/sh # # ti processor sdk linux am335x evm setup.sh hacking # 说明: # 本文主要对TI的sdk中的setup.sh脚本进行解读 ...

  9. BUFFER CACHE之主要的等待事件

    原因:资源紧张,等待其释放. 原因的原因:1. lgwr和DBWn进程写太慢:2. Buffer和latch不可用 原因的原因的原因:全表扫描.library cache latches数太多等. 视 ...

  10. shell中for循环总结

    关于shell中的for循环用法很多,一直想总结一下,今天网上看到上一篇关于for循环用法的总结,感觉很全面,所以就转过来研究研究,嘿嘿... 1. for((i=1;i<=10;i++));d ...