Apache Solr vs Elasticsearch
http://solr-vs-elasticsearch.com/
The Feature Smackdown
API
Feature | Solr 6.2.1 | ElasticSearch 5.0 |
---|---|---|
Format | XML, CSV, JSON | JSON |
HTTP REST API | ![]() | ![]() |
Binary API | ![]() | ![]() |
JMX support | ![]() | ![]() |
Official client libraries | Java | Java, Groovy, PHP, Ruby, Perl, Python, .NET, Javascript Official list of clients |
Community client libraries | PHP, Ruby, Perl, Scala, Python, .NET, Javascript, Go, Erlang, Clojure | Clojure, Cold Fusion, Erlang, Go, Groovy, Haskell, Java, JavaScript, .NET, OCaml, Perl, PHP, Python, R, Ruby, Scala, Smalltalk, Vert.x Complete list |
3rd-party product integration (open-source) | Drupal, Magento, Django, ColdFusion, Wordpress, OpenCMS, Plone, Typo3, ez Publish, Symfony2, Riak (via Yokozuna) | Drupal, Django, Symfony2, Wordpress, CouchBase |
3rd-party product integration (commercial) | DataStax Enterprise Search, Cloudera Search, Hortonworks Data Platform, MapR | SearchBlox, Hortonworks Data Platform, MapR etc Complete list |
Output | JSON, XML, PHP, Python, Ruby, CSV, Velocity, XSLT, native Java | JSON, XML/HTML (via plugin) |
Infrastructure
Feature | Solr 6.2.1 | ElasticSearch 5.0 |
---|---|---|
Master-slave replication | ![]() | ![]() |
Integrated snapshot and restore | Filesystem | Filesystem, AWS Cloud Plugin for S3 repositories, HDFS Plugin for Hadoop environments, Azure Cloud Plugin for Azure storage repositories |
Indexing
Feature | Solr 6.2.1 | ElasticSearch 5.0 |
---|---|---|
Data Import | DataImportHandler - JDBC, CSV, XML, Tika, URL, Flat File | [DEPRECATED in 2.x] Rivers modules - ActiveMQ, Amazon SQS, CouchDB, Dropbox, DynamoDB, FileSystem, Git, GitHub, Hazelcast, JDBC, JMS, Kafka, LDAP, MongoDB, neo4j, OAI, RabbitMQ, Redis, RSS, Sofa, Solr, St9, Subversion, Twitter, Wikipedia |
ID field for updates and deduplication | ![]() | ![]() |
DocValues | ![]() | ![]() |
Partial Doc Updates | ![]() | ![]() |
Custom Analyzers and Tokenizers | ![]() | ![]() |
Per-field analyzer chain | ![]() | ![]() |
Per-doc/query analyzer chain | ![]() | ![]() |
Index-time synonyms | ![]() | ![]() |
Query-time synonyms | ![]() | ![]() |
Multiple indexes | ![]() | ![]() |
Near-Realtime Search/Indexing | ![]() | ![]() |
Complex documents | ![]() | ![]() |
Schemaless | ![]() | ![]() |
Multiple document types per schema | ![]() | ![]() |
Online schema changes | ![]() | ![]() |
Apache Tika integration | ![]() | ![]() |
Dynamic fields | ![]() | ![]() |
Field copying | ![]() | ![]() |
Hash-based deduplication | ![]() | ![]() |
Searching
Feature | Solr 6.2.1 | ElasticSearch 5.0 |
---|---|---|
Lucene Query parsing | ![]() | ![]() |
Structured Query DSL | ![]() | ![]() |
Span queries | ![]() | ![]() |
Spatial/geo search | ![]() | ![]() |
Multi-point spatial search | ![]() | ![]() |
Faceting | ![]() | ![]() |
Advanced Faceting | ![]() | ![]() |
Geo-distance Faceting | ![]() | ![]() |
Pivot Facets | ![]() | ![]() |
More Like This | ![]() | ![]() |
Boosting by functions | ![]() | ![]() |
Boosting using scripting languages | ![]() | ![]() |
Push Queries | ![]() | ![]() |
Field collapsing/Results grouping | ![]() | ![]() |
Query Re-Ranking | ![]() | ![]() |
Index-based Spellcheck | ![]() | ![]() |
Wordlist-based Spellcheck | ![]() | ![]() |
Autocomplete | ![]() | ![]() |
Query elevation | ![]() | ![]() |
Intra-index joins | ![]() | ![]() |
Inter-index joins | ![]() | ![]() |
Resultset Scrolling | ![]() | ![]() |
Filter queries | ![]() | ![]() |
Filter execution order | ![]() | ![]() |
Alternative QueryParsers | ![]() | ![]() |
Negative boosting | ![]() | ![]() |
Search across multiple indexes | ![]() | ![]() |
Result highlighting | ![]() | ![]() |
Custom Similarity | ![]() | ![]() |
Searcher warming on index reload | ![]() | ![]() |
Term Vectors API | ![]() | ![]() |
Customizability
Feature | Solr 6.2.1 | ElasticSearch 5.0 |
---|---|---|
Pluggable API endpoints | ![]() | ![]() |
Pluggable search workflow | ![]() | ![]() |
Pluggable update workflow | ![]() | ![]() |
Pluggable Analyzers/Tokenizers | ![]() | ![]() |
Pluggable QueryParsers | ![]() | ![]() |
Pluggable Field Types | ![]() | ![]() |
Pluggable Function queries | ![]() | ![]() |
Pluggable scoring scripts | ![]() | ![]() |
Pluggable hashing | ![]() | ![]() |
Pluggable webapps | ![]() | ![]() |
Automated plugin installation | ![]() | ![]() |
Distributed
Feature | Solr 6.2.1 | ElasticSearch 5.0 |
---|---|---|
Self-contained cluster | ![]() | ![]() |
Automatic node discovery | ![]() | ![]() |
Partition tolerance | ![]() | ![]() |
Automatic failover | ![]() | ![]() |
Automatic leader election | ![]() | ![]() |
Shard replication | ![]() | ![]() |
Sharding | ![]() | ![]() |
Automatic shard rebalancing | ![]() | ![]() |
Change # of shards | ![]() | ![]() |
Shard splitting | ![]() | ![]() |
Relocate shards and replicas | ![]() | ![]() |
Control shard routing | ![]() | ![]() |
Pluggable shard/replica assignment | ![]() | ![]() |
Consistency | Indexing requests are synchronous with replication. An indexing request won't return until all replicas respond. There is no check for downed replicas; they will catch up when they recover. When new replicas are added, they won't start accepting and responding to requests until they have finished replicating the index. | Replication between nodes is synchronous by default, thus ES is consistent by default, but it can be set to asynchronous on a per-document indexing basis. Index writes can be configured to fail if there are not sufficient active shard replicas. The default is quorum, but all or one are also available. |
Misc
Feature | Solr 6.2.1 | ElasticSearch 5.0 |
---|---|---|
Web Admin interface | ![]() | ![]() |
Visualisation | Banana (Port of Kibana) | Kibana |
Hosting providers | WebSolr, Searchify, Hosted-Solr, IndexDepot, OpenSolr, gotosolr | Found, ObjectRocket, bonsai.io, Indexisto, qbox.io, IndexDepot, Compose.io, Sematext Logsene |
Thoughts...
I'm embedding my answer to this "Solr-vs-Elasticsearch" Quora question verbatim here:
1. Elasticsearch was born in the age of REST APIs. If you love REST APIs, you'll probably feel more at home with ES from the get-go. I don't actually think it's 'cleaner' or 'easier to use', but just that it is more aligned with web 2.0 developers' mindsets.
2. Elasticsearch's Query DSL syntax is really flexible and it's pretty easy to write complex queries with it, though it does border on being verbose. Solr doesn't have an equivalent, last I checked. Having said that, I've never found Solr's query syntax wanting, and I've always been able to easily write a custom SearchComponent if needed (more on this later).
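To make the flexibility-versus-verbosity trade-off concrete, here is a minimal sketch that puts a Lucene-style query string next to a roughly equivalent Query DSL body. The field names (`title`, `year`, `status`) are hypothetical, and the two forms are not guaranteed to score identically; the point is only the difference in shape.

```python
import json

# Hypothetical fields, chosen purely for illustration.
# Lucene query-string syntax, as accepted by Solr's standard query parser.
lucene_query = "title:solr AND year:[2014 TO 2016] AND NOT status:draft"

# A roughly equivalent Elasticsearch Query DSL body: more structure,
# more lines, but each clause is an addressable piece of data.
dsl_query = {
    "query": {
        "bool": {
            "must": [{"match": {"title": "solr"}}],
            "filter": [{"range": {"year": {"gte": 2014, "lte": 2016}}}],
            "must_not": [{"term": {"status": "draft"}}],
        }
    }
}

print(lucene_query)
print(json.dumps(dsl_query, indent=2))
```

The query string fits on one line, while the DSL body takes a dozen, which is the verbosity mentioned above; in exchange, the DSL clauses can be added, removed, or nested programmatically.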
3. I find Elasticsearch's documentation to be pretty awful. It doesn't help that some examples in the documentation are written in YAML and others in JSON. I wrote an ES code parser once to auto-generate documentation from Elasticsearch's source and found a number of discrepancies between code and what's documented on the website, not to mention a number of undocumented/alternative ways to specify the same config key.
By contrast, I've found Solr to be consistent and really well-documented. I've found pretty much everything I've wanted to know about querying and updating indices without having to dig into code much. Solr's schema.xml and solrconfig.xml are *extensively* documented with most if not all commonly used configurations.
4. Whilst what Rick says about ES being mostly ready to go out-of-box is true, I think that is also a possible problem with ES. Many users don't take the time to do the most simple config (e.g. type mapping) of ES because it 'just works' in dev, and end up running into issues in production.
And once you do have to do config, then I personally prefer Solr's config system over ES'. Long JSON config files can get overwhelming because of JSON's lack of support for comments. Yes, you can use YAML, but it's annoying and confusing to go back and forth between YAML and JSON.
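As an illustration of the "most simple config" referred to in point 4, the sketch below builds an index-creation body with an explicit type mapping instead of relying on dynamic mapping. It assumes the ES 5.0 layout compared on this page (a named document type under `mappings`, and the `text`/`keyword` field types); later ES versions changed how types work. The helper name, the `product` type, and the fields are hypothetical.

```python
import json


def build_index_body(doc_type, properties, shards=1, replicas=1):
    """Return a create-index body with explicit settings and mappings.

    Hypothetical helper: it only assembles the JSON structure and
    does not talk to a cluster.
    """
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": {doc_type: {"properties": properties}},
    }


index_body = build_index_body(
    "product",
    {
        "name": {"type": "text"},      # analyzed, full-text searchable
        "sku": {"type": "keyword"},    # exact-match only, not analyzed
        "price": {"type": "float"},
        "created": {"type": "date"},
    },
)

print(json.dumps(index_body, indent=2))
```

Declaring `sku` as `keyword` up front is the kind of decision that dynamic mapping would otherwise make for you in dev, and that is painful to change once production data has been indexed.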
5. If your own app works/thinks in JSON, then without a doubt go for ES because ES thinks in JSON too. Solr merely supports it as an afterthought. ES has a number of nice JSON-related features such as parent-child and nested docs that make it a very natural fit. Parent-child joins are awkward in Solr, and I don't think there's a Solr equivalent for ES Inner hits.
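A small sketch of the nested-document and inner-hits features mentioned in point 5, assuming the ES 5.0 mapping layout. The `post` type and the `comments.author` / `comments.text` fields are hypothetical; the bodies are only constructed here, not sent to a cluster.

```python
import json

# Mapping: each comment is indexed as its own hidden nested document,
# so author and text are matched within the same comment.
nested_mapping = {
    "mappings": {
        "post": {
            "properties": {
                "title": {"type": "text"},
                "comments": {
                    "type": "nested",
                    "properties": {
                        "author": {"type": "keyword"},
                        "text": {"type": "text"},
                    },
                },
            }
        }
    }
}

# Query: find posts with a comment by "alice" mentioning "solr", and ask
# for the matching comments themselves via inner_hits.
nested_query = {
    "query": {
        "nested": {
            "path": "comments",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"comments.author": "alice"}},
                        {"match": {"comments.text": "solr"}},
                    ]
                }
            },
            "inner_hits": {},
        }
    }
}

print(json.dumps(nested_query, indent=2))
```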
6. ES doesn't require ZooKeeper for its 'elastic' features, which is nice because I personally find ZK unpleasant. As a result, though, ES does have issues with split-brain scenarios (google 'elasticsearch split-brain' or see this: Elasticsearch Resiliency Status).
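The usual mitigation for split-brain in the ES versions discussed here is to require a majority of master-eligible nodes before a master can be elected, via the `discovery.zen.minimum_master_nodes` setting. The sketch below only computes that majority; the function name is hypothetical, and it assumes you know how many master-eligible nodes your cluster has.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Return the majority quorum: floor(N / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for n in (1, 2, 3, 5):
    print(n, "master-eligible nodes -> quorum of", minimum_master_nodes(n))
```

Note that with two master-eligible nodes the quorum is also two, so losing either node leaves the cluster without a master; this is why an odd number of master-eligible nodes is generally preferred.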
7. Overall, from working with clients as a Solr/Elasticsearch consultant, I've found that developer preferences tend to end up along language party lines: if you're a Java/C# developer, you'll be pretty happy with Solr. If you live in JavaScript or Ruby, you'll probably love Elasticsearch. If you're on Python or PHP, you'll probably be fine with either.
Something to add about this: ES doesn't have a very elegant Java API IMHO (you'll basically end up using REST because it's less painful), whereas Solrj is very satisfactory and more efficient than Solr's REST API. If you're primarily a Java dev team, do take this into consideration for your sanity. There's no scenario in which constructing JSON in Java is fun/simple, whereas in Python it's absolutely pain-free, and believe me, if you have a non-trivial app, your ES JSON query strings will be works of art.
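To show why building ES queries is comfortable in Python, here is a minimal sketch of a query builder that assembles a body from optional search parameters using plain dicts and lists. The function name and the `description`, `tags`, and `price` fields are hypothetical.

```python
import json


def build_search(text=None, tags=(), min_price=None, size=10):
    """Assemble an ES search body from optional criteria (illustrative only)."""
    # Full-text clause, or match everything when no text is given.
    must = [{"match": {"description": text}}] if text else [{"match_all": {}}]

    # Non-scoring filter clauses, added only for the criteria supplied.
    filters = [{"term": {"tags": tag}} for tag in tags]
    if min_price is not None:
        filters.append({"range": {"price": {"gte": min_price}}})

    bool_query = {"must": must}
    if filters:
        bool_query["filter"] = filters
    return {"size": size, "query": {"bool": bool_query}}


print(json.dumps(build_search(text="solr", tags=["search"], min_price=5), indent=2))
```

Because the query is ordinary data, conditionally adding a clause is a list append rather than string concatenation or a builder-object chain.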
8. ES doesn't have in-built support for pluggable 'SearchComponents', to use Solr's terminology. SearchComponents are (for me) a pretty indispensable part of Solr for anyone who needs to do anything customized and in-depth with search queries.
Yes of course, in ES you can just implement your own RestHandler, but that's just not the same as being able to plug into and rewire the way search queries are handled and parsed.
9. Whichever way you go, I highly suggest you choose a client library which is as 'close to the metal' as you can get. Both ES and Solr have *really* simple search and update APIs. If a client library introduces an additional DSL layer in an attempt to 'simplify', I suggest you think long and hard about using it, as it's likely to complicate matters in the long run, and make debugging and asking for help on SO more problematic.
In particular, if you're using Rails + Solr, consider using rsolr/rsolr instead of sunspot/sunspot if you can help it. ActiveRecord is complex code and sufficiently magical. The last thing you want is more magic on top of that.
---
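As a rough illustration of how thin a 'close to the metal' client can be, the sketch below builds a Solr select URL and an ES search request using only the Python standard library. Nothing is sent over the network. The helper names, host/port values, and the `products` collection/index are assumptions for the example; the `/solr/<collection>/select` and `/<index>/_search` paths are the standard search endpoints.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def solr_select_url(base_url, collection, q, fq=(), rows=10):
    """Build a Solr /select URL; repeated fq parameters become filter queries."""
    params = [("q", q), ("rows", rows), ("wt", "json")]
    params += [("fq", f) for f in fq]
    return f"{base_url}/solr/{collection}/select?{urlencode(params)}"


def es_search_request(base_url, index, body):
    """Build (but do not send) a POST request to an ES _search endpoint."""
    return Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


solr_url = solr_select_url(
    "http://localhost:8983", "products", "title:solr", fq=["year:2016"]
)
es_request = es_search_request(
    "http://localhost:9200", "products", {"query": {"match": {"title": "solr"}}}
)

print(solr_url)
print(es_request.get_method(), es_request.full_url)
```

Passing either object to `urllib.request.urlopen` would execute the search. What you see in the URL or body is exactly what the server receives, which is what makes debugging and asking for help easier.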
To conclude, ES and Solr have more or less feature-parity and from a feature standpoint, there's rarely one reason to go one way or the other (unless your app lives/breathes JSON). Performance-wise, they are also likely to be quite similar (I'm sure there are exceptions to the rule. ES' relatively new autocomplete implementation, for example, is a pretty dramatic departure from previous Lucene/Solr implementations, and I suspect it produces faster responses at scale).
ES does offer less friction from the get-go and you feel like you have something working much quicker, but I find this to be illusory. Any time gained in this stage is lost when figuring out how to properly configure ES because of poor documentation - an inevitability when you have a non-trivial application.
Solr encourages you to understand a little more about what you're doing, and the chance of you shooting yourself in the foot is somewhat lower, mainly because you're forced to read and modify the 2 well-documented XML config files in order to have a working search app.
---
EDIT on Nov 2015:
ES has been gradually distinguishing itself from Solr when it comes to data analytics. I think it's fair to attribute this to the immense traction of the ELK stack in the logging, monitoring and analytic space. My guess is that this is where Elastic (the company) gets the majority of its revenue, so it makes perfect sense that ES (the product) reflects this.
We see this manifesting primarily in the form of aggregations, which are a more flexible and nuanced replacement for facets. Read more about aggregations here: Migrating to aggregations
Aggregations have been out for a while now (since 1.0), but with the recently released ES 2.0 come pipeline aggregations, which let you compute aggregations such as derivatives, moving averages, and series arithmetic on the results of other aggregations. Very cool stuff, and Solr simply doesn't have an equivalent. More on pipeline aggregations here: Out of this world aggregations
If you're currently using or contemplating using Solr in an analytics app, it is worth your while to look into ES aggregation features to see if you need any of them.
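For a sense of what a pipeline aggregation looks like, here is a minimal sketch of a request body that sums a value per month and then takes the derivative of those sums, followed by a tiny pure-Python function mimicking what the derivative computes over the bucket values. The `date` and `price` fields, the aggregation names, and the sample values are hypothetical.

```python
import json

# Monthly buckets -> sum per bucket -> derivative of that sum across buckets.
pipeline_body = {
    "size": 0,  # only aggregation results are wanted, no hits
    "aggs": {
        "sales_per_month": {
            "date_histogram": {"field": "date", "interval": "month"},
            "aggs": {
                "total_sales": {"sum": {"field": "price"}},
                # Pipeline aggregation: reads the sibling metric by name.
                "sales_change": {"derivative": {"buckets_path": "total_sales"}},
            },
        }
    },
}


def derivative(values):
    """First difference between consecutive buckets; the first has none."""
    return [
        None if i == 0 else values[i] - values[i - 1]
        for i in range(len(values))
    ]


print(json.dumps(pipeline_body, indent=2))
print(derivative([550, 60, 375]))  # [None, -490, 315]
```

The key idea is `buckets_path`: the derivative does not look at documents at all, only at the output of another aggregation, which is what distinguishes pipeline aggregations from facet-style counting.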
Resources
- My other sites may be of interest if you're new to Lucene, Solr and Elasticsearch:
- The Solr wiki and the Elasticsearch Guide are your friends.
Contribute
If you see any mistakes, or would like to append to the information on this webpage, you can clone the GitHub repo for this site with:
git clone https://github.com/superkelvint/solr-vs-elasticsearch
and submit a pull request.