[rabbitmq-discuss] Exactly Once Delivery
[rabbitmq-discuss] Exactly Once Delivery http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2010-August/008272.html
John Apps johndapps at gmail.com
Thu Aug 5 14:00:11 BST 2010
Matthew,

an excellent response, and thank you for it! Yes, difficult it is!

It raises a somewhat philosophical discussion around where the onus is
placed in terms of guaranteeing such things as 'guaranteed once', i.e., on
the client side or on the server side? The JMS standard offers guaranteed
once, whereby the onus is on the server (JMS implementation) and not on the
client.

What I am trying to say is that, in my opinion, client programs should be as
'simple' as possible, with the servers doing all the hard work. This is what
the JMS standard forces on implementors and, perhaps to a lesser extent
today, so does AMQP.

Note: the word 'server' is horribly overloaded these days. It is used here
to indicate the software with which clients, producers and consumers,
communicate.

Oh well, off to librabbitMQ and some example programs written in COBOL...

Cheers, John

On Thu, Aug 5, 2010 at 13:22, Matthew Sackman <matthew at rabbitmq.com> wrote:

> Hi Mike,
>
> On Tue, Aug 03, 2010 at 04:43:56AM -0400, Mike Petrusis wrote:
> > In reviewing the mailing list archives, I see various threads which state
> > that ensuring "exactly once" delivery requires deduplication by the
> > consumer. For example the following:
> >
> > "Exactly-once requires coordination between consumers, or idempotency,
> > even when there is just a single queue. The consumer, broker or network
> > may die during the transmission of the ack for a message, thus causing
> > retransmission of the message (which the consumer has already seen and
> > processed) at a later point."
> > http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-July/004237.html
> >
> > In the case of competing consumers which pull messages from the same
> > queue, this will require some sort of shared state between consumers to
> > de-duplicate messages (assuming the consumers are not idempotent).
> >
> > Our application is using RabbitMQ to distribute tasks across multiple
> > workers residing on different servers; this adds to the cost of sharing
> > state between the workers.
> >
> > Another message in the email archive mentions that "You can guarantee
> > exactly-once delivery if you use transactions, durable queues and exchanges,
> > and persistent messages, but only as long as any failing node eventually
> > recovers."
>
> All the above is sort of wrong. You can never *guarantee* exactly once
> (there's always some argument about whether receiving message duplicates
> but relying on idempotency is achieving exactly once. I don't feel it
> does, and this should become clearer as to why further on...)
>
> The problem is publishers. If the server on which RabbitMQ is running
> crashes after committing a transaction containing publishes, it's
> possible the commit-ok message may get lost. Thus the publishers still
> think they need to republish, so they wait until the broker comes back up
> and then republish. This can happen an infinite number of times: the
> publishers connect, start a transaction, publish messages, commit the
> transaction and then the commit-ok gets lost and so the publishers
> repeat the process.
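
To make the republish loop above concrete, here is a minimal sketch. It does not talk to RabbitMQ at all: `FakeBroker` and `publish_until_confirmed` are hypothetical names invented for illustration, and the "lost commit-ok" is simulated by raising `ConnectionError` after the broker has already stored the messages.

```python
class FakeBroker:
    """Hypothetical in-memory stand-in for a broker (not a RabbitMQ API)."""

    def __init__(self, lose_first_n_confirms):
        self.queue = []
        self._to_lose = lose_first_n_confirms

    def commit(self, messages):
        # The transaction really is committed on the broker side...
        self.queue.extend(messages)
        # ...but the confirmation may never reach the publisher.
        if self._to_lose > 0:
            self._to_lose -= 1
            raise ConnectionError("commit-ok lost")


def publish_until_confirmed(broker, messages):
    """Republish until a commit is confirmed; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        try:
            broker.commit(messages)
            return attempts
        except ConnectionError:
            # The publisher cannot tell "commit failed" from "confirm lost",
            # so the only safe move from its point of view is to try again.
            continue


broker = FakeBroker(lose_first_n_confirms=2)
attempts = publish_until_confirmed(broker, ["order-42"])
print(attempts, broker.queue)
```

With two confirmations lost, the publisher needs three attempts and the queue ends up holding three copies of the same message, which is why the consumers have to deal with duplicates.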
>
> As a result, on the clients, you need to detect duplicates. Now this is
> really a barrier to making all operations idempotent. The problem is
> that you never know how many copies of a message there will be. Thus you
> never know when it's safe to remove messages from your dedup cache. Now
> things like Redis apparently have the means to delete entries after an
> amount of time, which would at least allow you to avoid the database
> eating up all the RAM in the universe, but there's still the possibility
> that after the entry's been deleted, another duplicate will come along
> which you now won't detect as a duplicate.
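
The expiry problem described above can be shown with a small sketch. This is an assumption-laden illustration, not Redis: `TtlDedupCache` is a hypothetical in-process class, and a fake clock is injected so the example runs instantly.

```python
import time


class TtlDedupCache:
    """Hypothetical dedup cache that forgets message ids after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._seen = {}  # msg_id -> time the id was first recorded

    def _expire(self):
        now = self.clock()
        stale = [m for m, t in self._seen.items() if now - t >= self.ttl_seconds]
        for msg_id in stale:
            del self._seen[msg_id]

    def first_time(self, msg_id):
        """Return True if msg_id is not currently remembered, and record it."""
        self._expire()
        if msg_id in self._seen:
            return False
        self._seen[msg_id] = self.clock()
        return True


# A fake clock keeps the example deterministic.
now = [0.0]
cache = TtlDedupCache(ttl_seconds=60, clock=lambda: now[0])

results = []
results.append(cache.first_time("msg-1"))  # first copy: processed
now[0] = 10.0
results.append(cache.first_time("msg-1"))  # duplicate inside the TTL: caught
now[0] = 100.0
results.append(cache.first_time("msg-1"))  # duplicate after expiry: missed
print(results)  # [True, False, True]
```

The last `True` is the failure mode: memory stays bounded, but a late duplicate is treated as a new message.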
>
> This isn't just a problem with RabbitMQ - in any messaging system, if
> any message can be lost, you cannot achieve exactly once semantics. The
> best you can hope for is a probability of a large number of 9s that you
> will be able to detect all the duplicates. But that's the best you can
> achieve.
>
> Scaling horizontally is thus more tricky because, as you say, you may
> now have multiple consumers which each receive one copy of a message.
> Thus the dedup database would have to be distributed. With high message
> rates, this might well become prohibitive because of the amount of
> network traffic due to transactions between the consumers.
>
> > What's the recommended way to deal with the potential of duplicate
> > messages?
>
> Currently, there is no "recommended" way. If you have a single consumer,
> it's quite easy - something like tokyocabinet should be more than
> sufficiently performant. For multiple consumers, you're currently going
> to have to look at some sort of distributed database.
>
> > Is this a rare enough edge case that most people just ignore it?
>
> No idea. But one way of making your life easier is for the producer to
> send slightly different messages on every republish (they would still
> obviously need to have the same msg id). That way, if you detect a msg
> with "republish count" == 0, then you know it's the first copy, so you
> can insert async into your shared database and then act on the message.
> You only need to do a query on the database whenever you receive a msg
> with "republish count" > 0 - thus you can tune your database for
> inserts and hopefully save some work - the common case will then be the
> first case, and lookups will be exceedingly rare.
>
> The question then is: if you've received a msg with republish count > 0
> but there are no entries in the database, what do you do? It shouldn't
> have overtaken the first publish (though if consumers disconnected without
> acking, or requeued messages, it could have), but you need to cause some
> sort of synchronise operation between all the consumers to ensure none
> are in the process of adding to the database - it all gets a bit hairy
> at this point.
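
A rough single-consumer sketch of the "republish count" idea follows. Everything here is hypothetical: the `Consumer` class, the `republish_count` argument (in practice it would presumably travel as a message header set by the producer), and the plain `set` standing in for the shared database. The branch for "republished but not in the database" simply acts on the message, which is only defensible with one consumer; with several, that branch is exactly where the synchronisation problem described above lives.

```python
class Consumer:
    """Hypothetical consumer using a producer-supplied republish count."""

    def __init__(self):
        self.seen = set()    # stand-in for the shared dedup database
        self.lookups = 0     # how many database queries were needed
        self.acted_on = []   # bodies we actually processed

    def handle(self, msg_id, republish_count, body):
        if republish_count == 0:
            # Common case: first copy, so insert without querying.
            self.seen.add(msg_id)
            self.acted_on.append(body)
            return "processed"

        # Rare case: a republished copy, so a lookup is required.
        self.lookups += 1
        if msg_id in self.seen:
            return "duplicate"

        # Republished, yet no record exists. With a single consumer we can
        # act on it; with competing consumers this is the racy, hairy case.
        self.seen.add(msg_id)
        self.acted_on.append(body)
        return "processed-after-lookup"


consumer = Consumer()
outcomes = [
    consumer.handle("a", 0, "task a"),  # first publish
    consumer.handle("a", 1, "task a"),  # republish of a message already seen
    consumer.handle("b", 1, "task b"),  # republish whose first copy never arrived
    consumer.handle("b", 2, "task b"),  # further republish
]
print(outcomes, consumer.lookups)
```

Only the republished copies cost a lookup; the first copy of "a" is inserted blind.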
>
> Thus if your message rate is low, you're much safer doing the insert and
> select on every message. If that's too expensive, you're going to have
> to think very hard indeed about how to avoid races between different
> consumers thinking they're both/all responsible for acting on the same
> message.
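
One way to do "insert and select on every message" without a race between the two steps is to let a uniqueness constraint decide who wins. The sketch below uses Python's standard `sqlite3` module with an in-memory database purely as a stand-in; it is not the tokyocabinet setup mentioned earlier, and a real multi-worker deployment would need a database all consumers can reach. The table name `seen` and the function `claim` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seen (msg_id TEXT PRIMARY KEY)")


def claim(conn, msg_id):
    """Return True if this call recorded msg_id first, False if it was a duplicate."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen (msg_id) VALUES (?)", (msg_id,)
        )
    # rowcount is 1 when the row was inserted, 0 when the primary key
    # already existed and the insert was ignored.
    return cur.rowcount == 1


claims = [claim(conn, "m1"), claim(conn, "m1"), claim(conn, "m2")]
print(claims)  # [True, False, True]
```

A consumer would act on a message only when `claim` returns `True`. This still leaves open what happens if the consumer crashes after claiming but before finishing the work, so it narrows the problem rather than solving it.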
>
> This stuff isn't easy.
>
> Matthew
---
John Apps
(49) 171 869 1813