How to Use Lucene.NET with Windows Azure SQL Database
http://social.technet.microsoft.com/wiki/contents/articles/2367.how-to-use-lucene-net-with-windows-azure-sql-database.aspx
Table of Contents
- Summary
- Lucene.NET
- The Azure Library for Lucene.NET
- Using Lucene.NET to index SQL Database
- References
Summary
Lucene.NET is a .NET implementation of the Lucene full-text search engine. This article describes how you can use Lucene.NET to index text data stored in Windows Azure SQL Database, and then perform searches against that data.
NOTE: This does not provide an integrated full-text search experience like the full-text search in SQL Server. Lucene.NET is an external process that is queried separately from SQL Database.
NOTE: This article relies on the Azure Library for Lucene.NET (https://azuredirectory.codeplex.com/ ) to store the Lucene.NET catalog in a Windows Azure storage blob.
Prerequisites
- Windows Azure account (offers and purchasing information at http://www.microsoft.com/windowsazure/offers/default.aspx)
- Visual Studio 2010
- Lucene.NET (http://lucenenet.apache.org/; both binary and source project are available)
- Azure Library for Lucene.NET (https://azuredirectory.codeplex.com/)
To use the Azure Library for Lucene.NET and Lucene.NET from a Visual Studio project, you must add a reference to both the AzureDirectory project or assembly and the Lucene.NET project or assembly. You must also add the following using statements to your project:
using Lucene.Net;
using Lucene.Net.Store;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Documents;
using Lucene.Net.Util;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Store.Azure;
Lucene.NET
Lucene.NET is a .NET implementation of Lucene (http://lucene.apache.org/) and provides full-text indexing and search of documents. Documents are composed of multiple fields and do not have a predefined schema. When performing a query against the index, you can search across multiple fields within a document. Lucene.NET doesn't integrate directly with SQL Database; instead, you must query the database yourself and construct a Document from each result, which Lucene.NET then adds to its catalog. For more information on Lucene.NET, see http://lucenenet.apache.org/.
The Azure Library for Lucene.NET
This library allows you to expose blob storage as a Lucene.Net.Store.Directory object, which Lucene.NET uses as persistent storage for its catalog. More information on the Azure Library for Lucene.NET, as well as the latest version, can be found on the project homepage at https://azuredirectory.codeplex.com/.
The current version of the Azure Library (as of 22 May 2013) may require modification before using it in your solution. Specifically:
- Opening the project may launch the Visual Studio project conversion wizard.
- The reference to Microsoft.WindowsAzure.Storage may need to be deleted and recreated to point to the most recent version of the assembly.
- There are several Debug.WriteLine statements that should be converted to Trace.WriteLine or another member of the Trace class, as documented at http://msdn.microsoft.com/en-us/library/ff966484.aspx. If you are not interested in diagnostic output, you can simply remove the Debug.WriteLine statements.
Using the Library
The following code creates an AzureDirectory object and uses it as a parameter when creating the IndexWriter:
AzureDirectory azureDirectory = new AzureDirectory(
    CloudStorageAccount.FromConfigurationSetting(
        "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"),
    "TestCatalog");
IndexWriter indexWriter = new IndexWriter(azureDirectory, new StandardAnalyzer(), true);
Using Lucene.NET to index SQL Database
As mentioned previously, Lucene.NET is not integrated directly with SQL Database and is based on indexing 'documents' that contain multiple fields. In order to index data from SQL Database, you must query the database and create a new Document object for each row. Individual columns can then be added to the Document. The following code illustrates querying a SQL Database that contains information on individual bloggers, and then adding the ID and Bio column information to the Lucene index using an IndexWriter and Document:
// Requires, in addition to the Lucene.NET namespaces listed earlier:
// using System.Data; using System.Data.SqlClient;
// sqlConnString is assumed to hold the SQL Database connection string.

// Create the AzureDirectory against blob storage and create a catalog named 'Catalog'
AzureDirectory azureDirectory = new AzureDirectory(
    CloudStorageAccount.FromConfigurationSetting(
        "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"),
    "Catalog");
IndexWriter indexWriter = new IndexWriter(azureDirectory, new StandardAnalyzer(), true);
indexWriter.SetRAMBufferSizeMB(10.0);
indexWriter.SetUseCompoundFile(false);
indexWriter.SetMaxMergeDocs(10000);
indexWriter.SetMergeFactor(100);

// Create a DataSet and fill it from SQL Database
DataSet ds = new DataSet();
using (SqlConnection sqlCon = new SqlConnection(sqlConnString))
{
    sqlCon.Open();
    SqlCommand sqlCmd = new SqlCommand();
    sqlCmd.Connection = sqlCon;
    sqlCmd.CommandType = CommandType.Text;
    // Only get the minimum fields we need; Bio to index, Id so search results
    // can look up the record in SQL Database
    sqlCmd.CommandText = "select Id, Bio from bloggers";
    SqlDataAdapter sqlAdap = new SqlDataAdapter(sqlCmd);
    sqlAdap.Fill(ds);
}

// Check the table count first; indexing Tables[0] on an empty DataSet would throw
if (ds.Tables.Count > 0)
{
    DataTable dt = ds.Tables[0];
    foreach (DataRow dr in dt.Rows)
    {
        // Create the Document object
        Document doc = new Document();
        foreach (DataColumn dc in dt.Columns)
        {
            // Populate the document with the column name and value from our query
            doc.Add(new Field(
                dc.ColumnName,
                dr[dc.ColumnName].ToString(),
                Field.Store.YES,
                Field.Index.TOKENIZED));
        }
        // Write the Document to the catalog
        indexWriter.AddDocument(doc);
    }
}

// Close the writer
indexWriter.Close();
Note: The above sample returns all rows and adds them to the catalog. In a production application you will most likely only want to add new or updated rows.
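One way to restrict indexing to new or updated rows is to filter the query on a modification timestamp. The following is only a minimal sketch under stated assumptions: the LastModified column is hypothetical (the bloggers table in this article is only known to have Id and Bio), it is assumed to be a datetime2 column, and the IncrementalIndexQuery class and lastIndexedUtc value are illustrative names, not part of Lucene.NET or the Azure Library.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public static class IncrementalIndexQuery
{
    // Builds a parameterized command that returns only rows changed after the
    // last indexing run. 'LastModified' is a hypothetical column; substitute
    // whatever change-tracking column your table actually has.
    public static SqlCommand Create(SqlConnection connection, DateTime lastIndexedUtc)
    {
        SqlCommand cmd = new SqlCommand(
            "select Id, Bio from bloggers where LastModified > @since",
            connection);
        cmd.CommandType = CommandType.Text;
        // DateTime2 is used so that DateTime.MinValue is a valid 'index everything' value.
        cmd.Parameters.Add("@since", SqlDbType.DateTime2).Value = lastIndexedUtc;
        return cmd;
    }
}
```

The returned command can be passed to a SqlDataAdapter in place of the "select Id, Bio from bloggers" command used in the sample. You would also need to store the time of each indexing run somewhere durable, and a row that was updated rather than added would need its earlier document replaced in the catalog instead of being added a second time.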
Searching the Lucene.NET catalog
After you have added documents to the catalog, you can perform a search against them using the IndexSearcher. The following example illustrates how to search the catalog for a term contained in the 'Bio' field and return the Id of each matching document:
// searchString is assumed to hold the user's query text, and TextBox_Results
// is assumed to be a text box control that displays the results.

// Create the AzureDirectory for blob storage
AzureDirectory azureDirectory = new AzureDirectory(
    CloudStorageAccount.FromConfigurationSetting(
        "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"),
    "Catalog");

// Create the IndexSearcher
IndexSearcher indexSearcher = new IndexSearcher(azureDirectory);

// Create the QueryParser, setting the default search field to 'Bio'
QueryParser parser = new QueryParser("Bio", new StandardAnalyzer());

// Create a query from the Parser
Query query = parser.Parse(searchString);

// Retrieve matching hits
Hits hits = indexSearcher.Search(query);

// Loop through the matching hits, retrieving the document
for (int i = 0; i < hits.Length(); i++)
{
    // Retrieve the string value of the 'Id' field from the hits.Doc(i) document
    TextBox_Results.Text += "Id: " + hits.Doc(i).GetField("Id").StringValue() + "\n";
}

// Close the searcher once the results have been read
indexSearcher.Close();
Based on the Id, you can perform a query against SQL Database to return additional fields from the matching record.
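As a rough sketch of that follow-up lookup, the code below takes the Id string returned by the search and fetches the full row with a parameterized query. The BloggerLookup class and its method names are illustrative assumptions, not part of the original sample. The Id is passed as a string because that is how it comes back from the catalog, and SQL Server is assumed to convert it to the column's actual type.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class BloggerLookup
{
    // Builds a parameterized command for one blogger row. A parameter is used
    // instead of string concatenation so the Id value cannot alter the SQL text.
    public static SqlCommand CreateCommand(SqlConnection connection, string id)
    {
        SqlCommand cmd = new SqlCommand(
            "select * from bloggers where Id = @id", connection);
        cmd.CommandType = CommandType.Text;
        cmd.Parameters.Add("@id", SqlDbType.NVarChar, 64).Value = id;
        return cmd;
    }

    // Returns the matching row (if any) in a DataTable.
    public static DataTable Load(string sqlConnString, string id)
    {
        DataTable table = new DataTable();
        using (SqlConnection conn = new SqlConnection(sqlConnString))
        using (SqlCommand cmd = CreateCommand(conn, id))
        using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
        {
            // Fill opens the connection if it is closed and closes it afterwards.
            adapter.Fill(table);
        }
        return table;
    }
}
```

For example, inside the results loop you could call BloggerLookup.Load(sqlConnString, hits.Doc(i).GetField("Id").StringValue()) and read the other columns from the first row of the returned table. For large result sets, a single query that fetches all matching Ids at once would likely be more efficient than one query per hit.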
References
- https://azuredirectory.codeplex.com/
- http://www.logiclabz.com/c/create-lucene-index-in-c-for-given-sql-stored-procedure.aspx
- http://lucene.apache.org/
- http://lucenenet.apache.org/
- http://www.ifdefined.com/blog/post/2009/02/Full-Text-Search-in-ASPNET-using-LuceneNET.aspx
- http://blogs.msdn.com/b/windows-azure-support/archive/2010/11/01/how-to-use-lucene-net-in-windows-azure.aspx