A Walkthrough of the Examine Search Feature in Umbraco
Reposted from: http://24days.in/umbraco/2013/getting-started-with-examine/
Every time I read the word Examine or Lucene, it is combined with some crazy data extravaganza that sounds magical but requires 2 strong men, a Tesla Roadster, some squirrels (N amount) and 400 man-hours to get done.
So once upon a time I got a call from a customer: "We want some simple search thingy on our page", and I was like "sure thing, we've got packages for that". I installed a brilliant package and they were happy, until they figured out they wanted typo handling when their customers searched for products, and the speed was also so-so because of the number of nodes. So I set out to find another way around, and I found it by doing something quite simple with Examine.
I'm no Examine/Lucene ninja, but I hope my 2 cents here can get people started on playing around on their own, because with Examine there are so many possibilities for using your content (sorting tons of content based on different parameters, searching etc.).
So the idea here is to give you a little intro to what is needed to get started with Examine, with a very hands-on approach :)
Examine (based on Lucene) is an indexing/search engine that takes our content/data and puts it into a "phonebook" of sorts (the index), so that we can search/look up through it with blazing speed even on large amounts of data/content.
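To make the "phonebook" idea a bit more concrete, here is a toy sketch of an inverted index in C#. It is only an illustration of the concept and not how Lucene actually stores its data; the class name `TinyIndex` and its methods are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy inverted index: maps each word to the ids of the documents containing it.
// Hypothetical helper for illustration only, not part of Examine or Lucene.
public static class TinyIndex
{
    public static Dictionary<string, HashSet<int>> Build(IDictionary<int, string> docs)
    {
        var index = new Dictionary<string, HashSet<int>>(StringComparer.OrdinalIgnoreCase);
        foreach (var doc in docs)
        {
            // Very naive tokenizing: split on spaces and a little punctuation.
            var words = doc.Value.Split(new[] { ' ', ',', '.' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (var word in words)
            {
                HashSet<int> ids;
                if (!index.TryGetValue(word, out ids))
                {
                    ids = new HashSet<int>();
                    index[word] = ids;
                }
                ids.Add(doc.Key);
            }
        }
        return index;
    }

    // Looking up a word is a single dictionary hit instead of a scan of every document.
    public static IEnumerable<int> Lookup(Dictionary<string, HashSet<int>> index, string word)
    {
        HashSet<int> ids;
        return index.TryGetValue(word, out ids) ? ids.OrderBy(id => id) : Enumerable.Empty<int>();
    }
}
```

The point is simply that the expensive work (reading all the content) happens once when the index is built, so each search afterwards is a cheap lookup.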
So we want to do 2 things: first we are gonna have a look at the configuration (what sort of stuff do we want in our index?), and secondly we want to make it searchable.
Part 1 : The configuration
So in every Umbraco installation there is a folder called "Config". It's filled with (you guessed it) configuration files that do tons of different stuff to your solution, and don't worry, it's not dangerous to play around in here.
Two of these files we want to pop open and poke around inside.
/Config/ExamineIndex.config
/Config/ExamineSettings.config
ExamineIndex.config
In this file we want to define a new indexset, and the indexset contains the info on which doctypes and fields we want to index. For the example it could look something like this:
    <IndexSet SetName="MySearch" IndexPath="~/App_Data/ExamineIndexes/MySearch/">
        <IndexAttributeFields>
            <add Name="id" />
            <add Name="nodeName" />
            <add Name="updateDate" />
            <add Name="writerName" />
            <add Name="nodeTypeAlias" />
        </IndexAttributeFields>
        <IndexUserFields>
            <add Name="bodyText" />
            <add Name="siteName" />
        </IndexUserFields>
        <IncludeNodeTypes>
            <add Name="umbHomePage" />
            <add Name="umbNewsItem" />
            <add Name="umbTextPage" />
        </IncludeNodeTypes>
    </IndexSet>
This block is simply placed below the other <IndexSet> elements already in the file.
SetName is the reference, or the alias if you like, that we want to remember when we're gonna call the index from our providers.
IndexAttributeFields defines the default Umbraco fields that every node contains, such as name, nodetype and more.
IndexUserFields holds the aliases of the custom fields you have added to your doctypes.
IncludeNodeTypes holds the aliases of the doctypes you want to search through.
So now we have defined an indexset that takes 3 doctypes and looks for 2 properties. This is what we can search later on.
That was one file down and one to go.
ExamineSettings.config
So inside the ExamineSettings.config file we want to do 2 things: add an index provider and add a search provider. These two handle, you guessed it, indexing our data/content and giving us the option to search it.
Index provider
    <add name="MySearchIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
         supportUnpublished="false"
         supportProtected="true"
         interval=""
         analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
         indexSet="MySearch"/>
Paste it in just before
</providers>
</ExamineIndexProviders>
We have different settings here: should it index unpublished or protected nodes, and how often? One important thing is that indexSet equals the SetName that we defined inside ExamineIndex.config right before.
Search provider
The next thing we need to config is the search provider.
    <add name="MySearchSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
         analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
         indexSet="MySearch"
         enableLeadingWildcards="true"/>
This should just be added right before
</providers>
</ExamineSearchProviders>
Again there are some settings on how the search works. We won't talk about them since we just run with the defaults, but again we want to reference the SetName of our index set and enable leading wildcards.
If you want to dive deeper into the different params and get a better understanding of all this craziness, I recommend that you read this one:
http://umbraco.com/follow-us/blog-archive/2011/9/16/examining-examine.aspx
Part 2 : Let the search begin
So now we have configured all the stuff that we need and can start by cooking up a new razor macro called something as magical as "Search" which we can embed into our awesome search template.
So now we want to create a SIMPLE search that can handle some spelling mistakes and does the "basic" job we want.
First solution
So let's look at how this can be done in a simple example first:
    @inherits umbraco.MacroEngines.DynamicNodeContext
    @using Examine.LuceneEngine.SearchCriteria
    @{
        if (!string.IsNullOrEmpty(Request.QueryString["search"]))
        {
            //Fetching what eva searchterm some bloke is throwin' our way
            var q = Request.QueryString["search"];
            //Fetching our SearchProvider by giving it the name of our searchprovider
            var Searcher = Examine.ExamineManager.Instance.SearchProviderCollection["MySearchSearcher"];
            //Searching and ordering the result by score, and we only want to get the results that has a minimum of 0.05(scale is up to 1.)
            var searchResults = Searcher.Search(q, true).OrderByDescending(x => x.Score).TakeWhile(x => x.Score > 0.05f);
            //Printing the results
            <ul>
                @foreach (var item in searchResults)
                {
                    var node = Model.NodeById(item.Fields["id"]);
                    <li>
                        <a href="@node.NiceUrl">
                            @node.Name
                        </a>
                    </li>
                }
            </ul>
        }
    }
Just to do a quick rundown of the code, we do 3 things.
We fetch the term some dude just searched for from our query string. We select which search provider to use (the one we set up in the config files, remember), and last we search. Oh yeah, and print out the results (4 things, sorry).
When we do the "search" we ask for results ordered by "score". Every time we search with Lucene, each result comes back with a score telling how close it was to our search. I'm also asking for only the items above a specific threshold, so we ensure some kind of quality in our results.
So now when the customer searches for "geting started" instead of "getting started" it still finds the results.
Quite simple.
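The "order by score, then cut off at a threshold" step can be tried out on its own. Below is a small self-contained sketch using plain numbers instead of real search results; `ScoreFilter` and `KeepGoodScores` are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that mimics the OrderByDescending + TakeWhile chain from the macro.
public static class ScoreFilter
{
    public static List<float> KeepGoodScores(IEnumerable<float> scores, float threshold)
    {
        return scores
            .OrderByDescending(s => s)      // best match first
            .TakeWhile(s => s > threshold)  // stop at the first score that is too low
            .ToList();
    }
}
```

Calling `ScoreFilter.KeepGoodScores(new[] { 0.02f, 0.9f, 0.3f, 0.04f }, 0.05f)` keeps 0.9 and 0.3. Note that TakeWhile stops at the first item that fails the test, so it only works as a filter here because the list was sorted descending first.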
Little bonus: Also notice that we have something called "fields" where I'm fetching the ID of the node. But inside fields we can add all sorts of properties, so if you just needed to display the name of a page and the URL, we don't need to look up the node to fetch it, we could save some machine powa' by just adding the properties to the list of fields. That is what we did inside examineindex.config.
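As a rough sketch of that bonus, the helper below reads the page name straight from the indexed fields instead of loading the node. It assumes "nodeName" is in the index set (it is in the config above). The class `ResultFields` is made up for this example, and `umbraco.library.NiceUrl` is used for the URL on the assumption that you are on an Umbraco version that still ships it.

```csharp
using Examine;

// Hypothetical helper, not part of Examine or Umbraco.
public static class ResultFields
{
    // Reads the name from the index; returns an empty string if "nodeName" was not indexed.
    public static string NameOf(SearchResult result)
    {
        string name;
        return result.Fields.TryGetValue("nodeName", out name) ? name : string.Empty;
    }

    // The URL is not one of the fields in our index set, so we resolve it from the node id.
    public static string UrlOf(SearchResult result)
    {
        return umbraco.library.NiceUrl(result.Id);
    }
}
```

In the macro, the loop body could then print `ResultFields.NameOf(item)` and `ResultFields.UrlOf(item)` without calling Model.NodeById for every hit.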
So this one would probably do it for most simple sites but let's say you wanna dive a bit deeper, you want to control which fields it should search through, and if some fields are more important than others.
Second solution
Let's go a bit deeper down the rabbit hole with this one, so let's do something similar just where we have a few more options.
In Lucene we can build our own queries for content. This can be done in 2 ways, and I will show the "Fluent" (chaining) way, while the other one is writing raw Lucene queries (you can look into this through some of the links at the bottom).
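For comparison, the raw route could look roughly like the sketch below. It assumes a single-word search term and the "MySearchSearcher" provider from the config above; the class name `RawQueryExample` and the boost value 2 are made up for illustration. In Lucene query syntax, `^` boosts a term and `~` makes it fuzzy.

```csharp
using Examine;
using Lucene.Net.QueryParsers;

// Hypothetical helper showing a raw Lucene query through Examine.
public static class RawQueryExample
{
    public static ISearchResults Search(string term)
    {
        var searcher = ExamineManager.Instance.SearchProviderCollection["MySearchSearcher"];
        var criteria = searcher.CreateSearchCriteria();

        // Escape Lucene's special characters so user input cannot break the query.
        // Assumption: term is a single word; a phrase would need quoting or splitting.
        var safeTerm = QueryParser.Escape(term);

        // nodeName hits count double (^2), bodyText is matched fuzzily (~).
        var raw = string.Format("nodeName:{0}^2 bodyText:{0}~", safeTerm);

        return searcher.Search(criteria.RawQuery(raw));
    }
}
```

The fluent way tends to be easier to read and harder to get wrong, which is why the rest of this post sticks with it.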
So this next example is quite similar to the first one but instead we are controlling which fields we wanna look into and if a field is more important.
    @inherits umbraco.MacroEngines.DynamicNodeContext
    @using Examine.LuceneEngine.SearchCriteria
    @{
        if (!string.IsNullOrEmpty(Request.QueryString["search"]))
        {
            //Fetching what eva searchterm some bloke is throwin' our way
            var q = Request.QueryString["search"].Trim();
            //Fetching our SearchProvider by giving it the name of our searchprovider
            var Searcher = Examine.ExamineManager.Instance.SearchProviderCollection["MySearchSearcher"];
            var searchCriteria = Searcher.CreateSearchCriteria(Examine.SearchCriteria.BooleanOperation.Or);
            //Boost needs a value; 2 is just an example, tune it to taste
            var query = searchCriteria.Field("nodeName", q.Boost(2)).Or().Field("bodyText", q.Fuzzy());
            //Searching and ordering the results by score, and we only want the results that have a minimum score of 0.05 (scale is up to 1)
            var searchResults = Searcher.Search(query.Compile()).OrderByDescending(x => x.Score).TakeWhile(x => x.Score > 0.05f);
            //Printing the results
            <ul>
                @foreach (var item in searchResults)
                {
                    var node = Model.NodeById(item.Fields["id"]);
                    <li>
                        <a href="@node.NiceUrl">
                            @node.Name
                        </a>
                    </li>
                }
            </ul>
        }
    }
So the difference here is that we can now control which fields we look at.
We do this through something we call a SearchCriteria. Inside the SearchCriteria we tell it which fields we want to look at, like "nodeName" and "bodyText".
We are also telling the search that if it finds something in nodeName, that's more important than a hit in bodyText, since we are giving it a "boost". A boost takes a value that buffs the "score" of a result item.
We are also saying that we want to look at the bodyText, and we are adding a "fuzzy" option to it. This tells the search it should match on items that "look like" the search term. This is where we get some sort of spelling help.
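Lucene's fuzzy matching is based on edit distance (how many single-character changes it takes to turn one word into another). The sketch below is a plain Levenshtein implementation to show the idea. It is not Lucene's own code, and the name `FuzzyDemo` is made up for this example.

```csharp
using System;

// Hypothetical demo class: classic Levenshtein edit distance using two rows.
public static class FuzzyDemo
{
    public static int EditDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Turning an empty string into the first j characters of b takes j inserts.
        for (var j = 0; j <= b.Length; j++) previous[j] = j;

        for (var i = 1; i <= a.Length; i++)
        {
            current[0] = i; // deleting the first i characters of a
            for (var j = 1; j <= b.Length; j++)
            {
                var cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1), // insert or delete
                    previous[j - 1] + cost);                       // substitute or keep
            }
            var swap = previous; previous = current; current = swap;
        }
        return previous[b.Length];
    }
}
```

`FuzzyDemo.EditDistance("geting", "getting")` returns 1 (one missing "t"), which is why a typo like that can still count as "looks like it".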
The whole idea here is that we can keep on adding fields, and we can also add conditions that must be true (being inside a certain date range, only looking at specific nodetypes etc.).
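As one possible sketch of such an extra condition, the helper below searches both fields fuzzily but only accepts hits of one doctype. It assumes the "MySearchSearcher" provider and the "umbNewsItem" doctype from the config above; the class name `NewsSearch` is made up for this example.

```csharp
using Examine;
using Examine.LuceneEngine.SearchCriteria;
using Examine.SearchCriteria;

// Hypothetical helper: fuzzy search limited to news items.
public static class NewsSearch
{
    public static ISearchResults Search(string term)
    {
        var searcher = ExamineManager.Instance.SearchProviderCollection["MySearchSearcher"];

        // Default to AND so that every condition we chain on must hold.
        var criteria = searcher.CreateSearchCriteria(BooleanOperation.And);

        var query = criteria
            // The term may match in either field...
            .GroupedOr(new[] { "nodeName", "bodyText" }, term.Fuzzy())
            // ...but the hit must be a news item.
            .And()
            .NodeTypeAlias("umbNewsItem");

        return searcher.Search(query.Compile());
    }
}
```

Grouping the field checks first and then adding the doctype condition keeps the "either field" part and the "must be this doctype" part clearly separated, which is easier to reason about than a long chain of mixed And() and Or() calls.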