在MYSQL中使用全文索引(FULLTEXT index)

MYSQL的一个很有用的特性是使用全文索引(FULLTEXT index)查找文本的能力.目前只有使用MyISAM类型表的时候有效(MyISAM是默认的表类型,如果你不知道使用的是什么类型的表,那很可能就是 MyISAM).全文索引可以建立在TEXT,CHAR或者VARCHAR类型的字段,或者字段组合上.我们将建立一个简单的表用来解释各种特性.
简单用法(MATCH()函数)对3.23.23以后的版本有效,复杂的用法(IN BOOLEAN MODE修饰语)对4以后的版本有效,本文的第一部分着重简单用法,第二部分讲复杂用法.
一个简单的表 
我们将在整个过程中使用下面的表.
CREATE TABLE fulltext_sample(copy TEXT,FULLTEXT(copy)) TYPE=MyISAM; 
如果你没有把默认的表类型设置成MyISAM以外的类型那么TYPE=MyISAM可以省略.建表之后,向其中填充一些数据,例如:
INSERT INTO fulltext_sample VALUES
('It appears good from here'),
('The here and the past'),
('Why are we hear'),
('An all-out alert'),
('All you need is love'),
('A good alert');
 
如果你已经建立好了一个表,你可以使用ALTER TABLE(就像CREATE INDEX语句一样)语句添加一个全文索引,例如:
ALTER TABLE fulltext_sample ADD FULLTEXT(copy) 
查找文本 
全文索引搜索的语法很简单,你只要MATCH字段,AGAINST你要查找的文本,例如:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('love');
+----------------------+
| copy |
+----------------------+
| All you need is love |
+----------------------+
 
在全文索引上进行搜索是不区分大小写的,因此下面的语句也可以正常运行:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('LOVE');
+----------------------+
| copy |
+----------------------+
| All you need is love |
+----------------------+
 
全文索引通常用来搜索自然语言文本,例如报纸文章,网页内容等等.因此MySQL为这类搜索添加了很多特性.MySQL不索引任何长度小于等于3的文本, 也不索引有50%机会出现的单词.这意味着如果你的表少于2条记录,基于全文索引的搜索不会返回任何东西.将来,MySQL会使这项功能更灵活,但是现在 它应该可以适合大部分自然语言的使用.如果你的数据库中的大部分记录都包含”music”,你很可能不希望返回这些记录,你可以使用IN BOOLEAN MODE修饰符来获得50%左右的阀值,见本文第二部分.
结果将按照关联性从高到底的顺序返回.
主要特性 
下面是标准的全文索引搜索的主要特性:
1.排除重复词语
2.排除长度小于4的词语
3.排除在多于一半记录中出现的词语(就是说只要要有3条记录)
4.带连字符的词语被认为两个词语
5.结果按照关联度降序返回
6.忽略列表中的词语也被从搜索结果中排除.忽略列表基于普通的英文单词,因此如果你的数据用作不同的目的,你可能希望改变忽略列表.不幸的是,这样作并 不容易.你需要编辑文件myisam/ft_static.c,重新编辑MySQL,并重建索引!这里有一个忽略列表.注意,这些在不同的版本里有所更 改.
忽略列表
"a", "a's", "able", "about", "above", "according", "accordingly", "across", "actually", "after", "afterwards", "again", "against", "ain't", "all", "allow", "allows", "almost", "alone", "along", "already", "also", "although", "always", "am", "among", "amongst", "an", "and", "another", "any", "anybody", "anyhow", "anyone", "anything", "anyway", "anyways", "anywhere", "apart", "appear", "appreciate", "appropriate", "are", "aren't", "around", "as", "aside", "ask", "asking", "associated", "at", "available", "away", "awfully", "b", "be", "became", "because", "become", "becomes", "becoming", "been", "before", "beforehand", "behind", "being", "believe", "below", "beside", "besides", "best", "better", "between", "beyond", "both", "brief", "but", "by", "c", "c'mon", "c's", "came", "can", "can't", "cannot", "cant", "cause", "causes", "certain", "certainly", "changes", "clearly", "co", "com", "come", "comes", "concerning", "consequently", "consider", "considering", "contain", "containing", "contains", "corresponding", "could", "couldn't", "course", "currently", "d", "definitely", "described", "despite", "did", "didn't", "different", "do", "does", "doesn't", "doing", "don't", "done", "down", "downwards", "during", "e", "each", "edu", "eg", "eight", "either", "else", "elsewhere", "enough", "entirely", "especially", "et", "etc", "even", "ever", "every", "everybody", "everyone", "everything", "everywhere", "ex", "exactly", "example", "except", "f", "far", "few", "fifth", "first", "five", "followed", "following", "follows", "for", "former", "formerly", "forth", "four", "from", "further", "furthermore", "g", "get", "gets", "getting", "given", "gives", "go", "goes", "going", "gone", "got", "gotten", "greetings", "h", "had", "hadn't", "happens", "hardly", "has", "hasn't", "have", "haven't", "having", "he", "he's", "hello", "help", "hence", "her", "here", "here's", "hereafter", "hereby", "herein", "hereupon", "hers", "herself", "hi", "him", "himself", "his", "hither", "hopefully", "how", "howbeit", "however", "i", "i'd", "i'll", "i'm", "i've", "ie", "if", "ignored", "immediate", "in", "inasmuch", "inc", "indeed", "indicate", "indicated", "indicates", "inner", "insofar", "instead", "into", "inward", "is", "isn't", "it", "it'd", "it'll", "it's", "its", "itself", "j", "just", "k", "keep", "keeps", "kept", "know", "knows", "known", "l", "last", "lately", "later", "latter", "latterly", "least", "less", "lest", "let", "let's", "like", "liked", "likely", "little", "look", "looking", "looks", "ltd", "m", "mainly", "many", "may", "maybe", "me", "mean", "meanwhile", "merely", "might", "more", "moreover", "most", "mostly", "much", "must", "my", "myself", "n", "name", "namely", "nd", "near", "nearly", "necessary", "need", "needs", "neither", "never", "nevertheless", "new", "next", "nine", "no", "nobody", "non", "none", "noone", "nor", "normally", "not", "nothing", "novel", "now", "nowhere", "o", "obviously", "of", "off", "often", "oh", "ok", "okay", "old", "on", "once", "one", "ones", "only", "onto", "or", "other", "others", "otherwise", "ought", "our", "ours", "ourselves", "out", "outside", "over", "overall", "own", "p", "particular", "particularly", "per", "perhaps", "placed", "please", "plus", "possible", "presumably", "probably", "provides", "q", "que", "quite", "qv", "r", "rather", "rd", "re", "really", "reasonably", "regarding", "regardless", "regards", "relatively", "respectively", "right", "s", "said", "same", "saw", "say", "saying", "says", "second", "secondly", "see", "seeing", "seem", "seemed", "seeming", "seems", "seen", "self", "selves", "sensible", "sent", "serious", "seriously", "seven", "several", "shall", "she", "should", "shouldn't", "since", "six", "so", "some", "somebody", "somehow", "someone", "something", "sometime", "sometimes", "somewhat", "somewhere", "soon", "sorry", "specified", "specify", "specifying", "still", "sub", "such", "sup", "sure", "t", "t's", "take", "taken", "tell", "tends", "th", "than", "thank", "thanks", "thanx", "that", "that's", "thats", "the", "their", "theirs", "them", "themselves", "then", "thence", "there", "there's", "thereafter", "thereby", "therefore", "therein", "theres", "thereupon", "these", "they", "they'd", "they'll", "they're", "they've", "think", "third", "this", "thorough", "thoroughly", "those", "though", "three", "through", "throughout", "thru", "thus", "to", "together", "too", "took", "toward", "towards", "tried", "tries", "truly", "try", "trying", "twice", "two", "u", "un", "under", "unfortunately", "unless", "unlikely", "until", "unto", "up", "upon", "us", "use", "used", "useful", "uses", "using", "usually", "v", "value", "various", "very", "via", "viz", "vs", "w", "want", "wants", "was", "wasn't", "way", "we", "we'd", "we'll", "we're", "we've", "welcome", "well", "went", "were", "weren't", "what", "what's", "whatever", "when", "whence", "whenever", "where", "where's", "whereafter", "whereas", "whereby", "wherein", "whereupon", "wherever", "whether", "which", "while", "whither", "who", "who's", "whoever", "whole", "whom", "whose", "why", "will", "willing", "wish", "with", "within", "without", "won't", "wonder", "would", "would", "wouldn't", "x", "y", "yes", "yet", "you", "you'd", "you'll", "you're", "you've", "your", "yours", "yourself", "yourselves", "z", "zero", 
让我们看一下其中的一些词.如果你懒的输入,但是想查找”love”这个词,象下面这样:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('lov');
Empty set (0.00 sec)
 
什么都没返回,因为全文索引只包含完整的单词,不是部分单词.如果想得到返回,你必须把单词写完整,就像第一个例子里一样.
就像我们提过的,连字符单词在全文索引中被排除(它们被作为单独的单词索引),因此下面的语句什么都不返回:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('all-out');
Empty set (0.00 sec)
 
很不幸,两个单词都小于4个字符,因此单独搜索时也不会出现,而且通常的搜索中也不会出现.本文的第二部分中使用BOOLEAN MODE搜索可以搜索部分的或者包含连字符的单词.
你也可以一次搜索多个单词,用逗号分隔.下面的例子查找包含”here”和”appears”的记录:
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('here','appears');
Empty set (0.01 sec)
 
出乎意料这个语句没有返回.但是仔细看看忽略列表,这个词被列在其中,因此被从索引中排除了.忽略列表可能是人们解释MySQL全文索引没有生效的通常原因.如果你的查询返回了一个结果,那么你的版本的MySQL的忽略列表不包含”here”这个词.
关联度 
下面的例子说明记录返回的优先级
mysql> SELECT * FROM fulltext_sample WHERE MATCH(copy) AGAINST('good,alert');
+---------------------------+
| copy |
+---------------------------+
| A good alert |
| It appears good from here |
| An all-out alert |
+---------------------------+
 
记录”A good alert”首先出现,因为它同时包含要搜索的两个词.你不必相信我-只需要看看MySQL在结果中显示的优先级.简单的在字段列表中重复MATCH()函数,例如:
mysql> SELECT copy,MATCH(copy) AGAINST('good,alert') AS relevance
FROM fulltext_sample WHERE MATCH(copy) AGAINST('good,alert');
+---------------------------+------------------+
| copy | relevance |
+---------------------------+------------------+
| A good alert | 1.3551264824316 |
| An all-out alert | 0.68526663197496 |
| It appears good from hear | 0.67003110026735 |
+---------------------------+------------------+
 
关联度的计算非常复杂,它基于索引中单词的数量,记录中不同单词的个数,索引和返回结果中单词的总数,以及单词的重要程度.这个数字可能在你的MySQL版本中有所不同,MySQL偶尔会强化计算逻辑.
对大多数应用来说标准的全文索引搜索非常有用而充分,MySQL 4让它更加强大.

在MYSQL中运用全文索引(FULLTEXT index)的更多相关文章

  1. MySQL使用全文索引(fulltext index)---高性能

    转载地址:https://blog.csdn.net/u011734144/article/details/52817766/ 1.创建全文索引(FullText index) 旧版的MySQL的全文 ...

  2. MySQL使用全文索引(fulltext index)

    1.创建全文索引(FullText index) 旧版的MySQL的全文索引只能用在MyISAM表格的char.varchar和text的字段上. 不过新版的MySQL5.6.24上InnoDB引擎也 ...

  3. MySQL中的索引提示Index Hint

    MySQL数据库支持索引提示(INDEX HINT)显式的高速优化器使用了哪个索引.以下是可能需要用到INDEX HINT的情况 a)MySQL数据库的优化器错误的选择了某个索引,导致SQL运行很慢. ...

  4. MySQL中的全文索引

    之前曾经发表了一篇关于SQL Server全文索引的文章.现在将MySQL全文索引的配置过程记录一下. Step1:创建Student表 CREATE TABLE `student` ( `id` I ...

  5. CNS数据库网站只用mysql自带的fulltext index功能就可实现了。

    1)编辑脚本script.sql如下 ALTER TABLE `table_name` ADD FULLTEXT (`column_name`) 2)在mysql console下输入命令 SOURC ...

  6. MYSQL中的普通索引,主健,唯一,全文索引区别

    MYSQL索引用来快速地寻找那些具有特定值的记录,所有MySQL索引都以B-树的形式保存.如果没有索引,执行查询时MySQL必须从第一个记录开始扫描整个表的所有记录,直至找到符合要求的记录.表里面的记 ...

  7. mysql中,主键与普通索引

    一.什么是索引?索引用来快速地寻找那些具有特定值的记录,所有MySQL索引都以B-树的形式保存.如果没有索引,执行查询时MySQL必须从第一个记录开始扫描整个表的所有记录,直至找到符合要求的记录.表里 ...

  8. mysql中主外键关系

    一.外键: 1.什么是外键 2.外键语法 3.外键的条件 4.添加外键 5.删除外键 1.什么是外键: 主键:是唯一标识一条记录,不能有重复的,不允许为空,用来保证数据完整性 外键:是另一表的主键, ...

  9. mysql|中主外键关系(转)

    http://my.oschina.net/liting/blog/356150 一.外键: 1.什么是外键 2.外键语法 3.外键的条件 4.添加外键 5.删除外键 1.什么是外键: 主键:是唯一标 ...

随机推荐

  1. BZOJ3670:[NOI2014]动物园

    浅谈\(KMP\):https://www.cnblogs.com/AKMer/p/10438148.html 题目传送门:https://lydsy.com/JudgeOnline/problem. ...

  2. 分布式缓存系统 Memcached 数据存储slab与hashtable

    缓存数据以item为基本单元,以双链表形式存放在对应级别大小的slabclass结构的chunk中.同时该item还存放在链式hashtable中bucket中,用于提供快速查找的索引. 首先是理解缓 ...

  3. window.open()打开新窗口被拦截

    window.open( url )是常用的打开新页面的方法,一般都没有问题,但是如果在ajax回调方法里面使用就会被浏览器拦截,因为在浏览器安全机制中,页面弹窗必须是由用户触发的才是安全弹窗,比如说 ...

  4. Python类(一)-实例化一个类

    #-*- coding:utf-8 -*- __author__ = "MuT6 Sch01aR" class Person(): n = 123 #类变量 def __init_ ...

  5. pcs与crmsh命令比较

    一.概念 1.crmsh This project is not part of the GNU Project. Pacemaker command line interface for manag ...

  6. EF中新建表和关联表的方法

    以机场表为例 private static AIRPORT_HELIPORTManager AirportHeliportManager => ManagerFactory.Instance.A ...

  7. PHP_File文件操作简单常用函数

    php测试文件 <?php header("Content-type:text/html;charest=utf-8");$fileDir='Upload/File/cont ...

  8. Net Framework 4.0 和.Net Framework 4.0 Client Profile

    Net Framework 4.0 和.Net Framework 4.0 Client Profile区别: .Net Framework 4.0毫无疑问就像是.Net Framework 2.0一 ...

  9. 页面中CSS的四种引入方式的介绍与比较

    转自:https://blog.csdn.net/qq_38689666/article/details/79039392 一:行内式 <p style="color:red" ...

  10. 利用jquery mobiscroll插件选择日期、select、treeList的具体运用

    体验更优排版请移步原文:http://blog.kwin.wang/programming/jquery-mobiscroll-select-treeList.html mobiscroll是个很好用 ...