A Bug in MemCache Distributed Caching

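Memcached's distributed caching strategy is not implemented on the server side: the servers are unaware of one another. Distribution is implemented in the client code (Memcached.ClientLibrary) by mapping each cache key to a server. The basic idea is to compute a hash of the cache key and take it modulo the number of servers; the key is then assigned to the server whose index equals the result.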
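The key-to-server mapping described above can be sketched in a few lines. This is a minimal illustration, not the library's code: the class name BucketPicker and the method PickBucket are hypothetical, and the hash value is assumed to have been computed elsewhere.

```csharp
using System;

// Hypothetical helper illustrating "hash modulo server count".
public static class BucketPicker
{
    // Maps a (possibly negative) hash value onto an index in [0, serverCount).
    public static int PickBucket(int hashValue, int serverCount)
    {
        if (serverCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(serverCount));

        // In C#, % keeps the sign of the dividend, so a negative hash
        // gives a negative remainder that must be shifted back into range.
        int bucket = hashValue % serverCount;
        if (bucket < 0)
            bucket += serverCount;
        return bucket;
    }
}
```

With three servers, PickBucket(7, 3) gives index 1 and PickBucket(-7, 3) gives index 2. Every client must compute the same hash value for a key, otherwise the same key lands on different servers.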

The core of how Memcached.ClientLibrary computes the hashcode for a cache key is shown below:

 /// <summary>
 /// Returns appropriate SockIO object given
 /// string cache key and optional hashcode.
 ///
 /// Tries to get SockIO from pool.  Fails over
 /// to additional pools in event of server failure.
 /// </summary>
 /// <param name="key">hashcode for cache key</param>
 /// <param name="hashCode">if not null, then the int hashcode to use</param>
 /// <returns>SockIO obj connected to server</returns>
 public SockIO GetSock(string key, object hashCode)
 {
     string hashCodeString = "<null>";
     if(hashCode != null)
         hashCodeString = hashCode.ToString();
 
     if(Log.IsDebugEnabled)
     {
         Log.Debug(GetLocalizedString("cache socket pick").Replace("$$Key$$", key).Replace("$$HashCode$$", hashCodeString));
     }
 
     if (key == null || key.Length == 0)
     {
         if(Log.IsDebugEnabled)
         {
             Log.Debug(GetLocalizedString("null key"));
         }
         return null;
     }
 
     if(!_initialized)
     {
         if(Log.IsErrorEnabled)
         {
             Log.Error(GetLocalizedString("get socket from uninitialized pool"));
         }
         return null;
     }
 
     // if no servers return null
     if(_buckets.Count == 0)
         return null;
 
     // if only one server, return it
     if(_buckets.Count == 1)
         return GetConnection((string)_buckets[0]);
 
     int tries = 0;
 
     // generate hashcode
     int hv;
     if(hashCode != null)
     {
         hv = (int)hashCode;
     }
     else
     {
 
         // NATIVE_HASH = 0
         // OLD_COMPAT_HASH = 1
         // NEW_COMPAT_HASH = 2
         switch(_hashingAlgorithm)
         {
             case HashingAlgorithm.Native:
                 hv = key.GetHashCode();
                 break;
 
             case HashingAlgorithm.OldCompatibleHash:
                 hv = OriginalHashingAlgorithm(key);
                 break;
 
             case HashingAlgorithm.NewCompatibleHash:
                 hv = NewHashingAlgorithm(key);
                 break;
 
             default:
                 // use the native hash as a default
                 hv = key.GetHashCode();
                 _hashingAlgorithm = HashingAlgorithm.Native;
                 break;
         }
     }
 
     // keep trying different servers until we find one
     while(tries++ <= _buckets.Count)
     {
         // get bucket using hashcode
         // get one from factory
         int bucket = hv % _buckets.Count;
         if(bucket < 0)
             bucket += _buckets.Count;
 
         SockIO sock = GetConnection((string)_buckets[bucket]);
 
         if(Log.IsDebugEnabled)
         {
             Log.Debug(GetLocalizedString("cache choose").Replace("$$Bucket$$", _buckets[bucket].ToString()).Replace("$$Key$$", key));
         }
 
         if(sock != null)
             return sock;
 
         // if we do not want to failover, then bail here
         if(!_failover)
             return null;
 
         // if we failed to get a socket from this server
         // then we try again by adding an incrementer to the
         // current key and then rehashing
         switch(_hashingAlgorithm)
         {
             case HashingAlgorithm.Native:
                 hv += ((string)("" + tries + key)).GetHashCode();
                 break;
 
             case HashingAlgorithm.OldCompatibleHash:
                 hv += OriginalHashingAlgorithm("" + tries + key);
                 break;
 
             case HashingAlgorithm.NewCompatibleHash:
                 hv += NewHashingAlgorithm("" + tries + key);
                 break;
 
             default:
                 // use the native hash as a default
                 hv += ((string)("" + tries + key)).GetHashCode();
                 _hashingAlgorithm = HashingAlgorithm.Native;
                 break;
         }
     }
 
     return null;
 }

Core code that resolves a cache key to a server

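The source (the switch on _hashingAlgorithm, roughly lines 62-82 of the listing above) shows that there are three algorithms for computing the hashcode:

(1) HashingAlgorithm.Native: uses .NET's built-in hash algorithm. It is fast but may be incompatible with other clients, for example when the cache has to be shared with Java or Ruby clients;

(2) HashingAlgorithm.OldCompatibleHash: compatible with other clients, but slow;

(3) HashingAlgorithm.NewCompatibleHash: compatible with other clients, and reportedly fast.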

Further analysis shows that Memcached.ClientLibrary hashes cache keys with HashingAlgorithm.Native by default, and HashingAlgorithm.Native computes the hashcode as "hv = key.GetHashCode()", that is, with the GetHashCode() method built into the .NET string type.

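This is where the bug surfaces. Microsoft's documentation for GetHashCode (http://msdn.microsoft.com/zh-cn/library/system.object.gethashcode.aspx) states: "the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value this method returns may differ between .NET Framework versions and platforms, such as 32-bit and 64-bit platforms". In other words, string.GetHashCode() is not guaranteed to return the same hash for the same string on different platforms. The same cache key can therefore produce different hashcodes on different application servers, so the data for a single key may end up cached on different MemCache servers. Data consistency can no longer be guaranteed, and code that clears cache entries may also fail to take effect.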

// 64-bit, .NET 4.0
[__DynamicallyInvokable, ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail), SecuritySafeCritical]
public unsafe override int GetHashCode()
{
    if (HashHelpers.s_UseRandomizedStringHashing)
    {
        return string.InternalMarvin32HashString(this, this.Length, 0L);
    }
    IntPtr arg_25_0;
    IntPtr expr_1C = arg_25_0 = this;
    if (expr_1C != 0)
    {
        arg_25_0 = (IntPtr)((int)expr_1C + RuntimeHelpers.OffsetToStringData);
    }
    char* ptr = arg_25_0;
    int num = 5381;
    int num2 = num;
    char* ptr2 = ptr;
    int num3;
    while ((num3 = (int)(*(ushort*)ptr2)) != 0)
    {
        num = ((num << 5) + num ^ num3);
        num3 = (int)(*(ushort*)(ptr2 + (IntPtr)2 / 2));
        if (num3 == 0)
        {
            break;
        }
        num2 = ((num2 << 5) + num2 ^ num3);
        ptr2 += (IntPtr)4 / 2;
    }
    return num + num2 * 1566083941;
}
 
// 64-bit, .NET 2.0
// string
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
public unsafe override int GetHashCode()
{
    IntPtr arg_0F_0;
    IntPtr expr_06 = arg_0F_0 = this;
    if (expr_06 != 0)
    {
        arg_0F_0 = (IntPtr)((int)expr_06 + RuntimeHelpers.OffsetToStringData);
    }
    char* ptr = arg_0F_0;
    int num = 5381;
    int num2 = num;
    char* ptr2 = ptr;
    int num3;
    while ((num3 = (int)(*(ushort*)ptr2)) != 0)
    {
        num = ((num << 5) + num ^ num3);
        num3 = (int)(*(ushort*)(ptr2 + (IntPtr)2 / 2));
        if (num3 == 0)
        {
            break;
        }
        num2 = ((num2 << 5) + num2 ^ num3);
        ptr2 += (IntPtr)4 / 2;
    }
    return num + num2 * 1566083941;
}
 
// 32-bit, .NET 4.0
[__DynamicallyInvokable, ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail), SecuritySafeCritical]
public unsafe override int GetHashCode()
{
    if (HashHelpers.s_UseRandomizedStringHashing)
    {
        return string.InternalMarvin32HashString(this, this.Length, 0L);
    }
    IntPtr arg_25_0;
    IntPtr expr_1C = arg_25_0 = this;
    if (expr_1C != 0)
    {
        arg_25_0 = (IntPtr)((int)expr_1C + RuntimeHelpers.OffsetToStringData);
    }
    char* ptr = arg_25_0;
    int num = 352654597;
    int num2 = num;
    int* ptr2 = (int*)ptr;
    int i;
    for (i = this.Length; i > 2; i -= 4)
    {
        num = ((num << 5) + num + (num >> 27) ^ *ptr2);
        num2 = ((num2 << 5) + num2 + (num2 >> 27) ^ ptr2[(IntPtr)4 / 4]);
        ptr2 += (IntPtr)8 / 4;
    }
    if (i > 0)
    {
        num = ((num << 5) + num + (num >> 27) ^ *ptr2);
    }
    return num + num2 * 1566083941;
}

Implementations of GetHashCode in several versions
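To make the difference concrete, here is a hedged sketch that re-implements the non-randomized 64-bit and 32-bit paths in safe managed code, so both can be run side by side on one machine. The class and method names (LegacyStringHash, Hash64, Hash32) are hypothetical. The constants follow the .NET Framework 4.x string hashing code as I understand it. The sketch assumes keys contain no embedded NUL characters and does not model the randomized Marvin32 path.

```csharp
using System;

// Hypothetical re-implementation of the two legacy string hash variants.
public static class LegacyStringHash
{
    // Character at index i, or 0 where the native code would read the terminator.
    private static int CharAt(string s, int i) => i < s.Length ? s[i] : 0;

    // Two UTF-16 chars packed little-endian into one int, as the 32-bit code reads them.
    private static int Pair(string s, int i) => CharAt(s, i) | (CharAt(s, i + 1) << 16);

    // 64-bit variant: one char at a time, alternating between two accumulators.
    public static int Hash64(string s)
    {
        unchecked
        {
            int h1 = 5381, h2 = h1;
            for (int i = 0; i < s.Length; i += 2)
            {
                h1 = ((h1 << 5) + h1) ^ s[i];
                if (i + 1 >= s.Length) break;
                h2 = ((h2 << 5) + h2) ^ s[i + 1];
            }
            return h1 + h2 * 1566083941;
        }
    }

    // 32-bit variant: two chars at a time, with an extra rotate-like term.
    public static int Hash32(string s)
    {
        unchecked
        {
            int h1 = (5381 << 16) + 5381, h2 = h1;
            int pos = 0, len = s.Length;
            while (len > 2)
            {
                h1 = ((h1 << 5) + h1 + (h1 >> 27)) ^ Pair(s, pos);
                h2 = ((h2 << 5) + h2 + (h2 >> 27)) ^ Pair(s, pos + 2);
                pos += 4;
                len -= 4;
            }
            if (len > 0)
                h1 = ((h1 << 5) + h1 + (h1 >> 27)) ^ Pair(s, pos);
            return h1 + h2 * 1566083941;
        }
    }
}
```

Calling Hash64 and Hash32 on the same cache key and reducing each result modulo the server count shows whether a 32-bit and a 64-bit client would pick the same server. Because the two variants use different seeds and mixing steps, their results will generally disagree.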

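The fix is to avoid MemCache's default hash algorithm. There are two ways to do so:

(1) When initializing the MemCache server pool, select one of the other hash algorithms that ship with the client library, for example "this.pool.HashingAlgorithm = HashingAlgorithm.OldCompatibleHash;".

(2) Implement a custom hash algorithm and pass the hash value when calling set(), get(), delete() and similar methods; these methods have overloads that accept a hashcode.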
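For option (2), any hash that depends only on the bytes of the key will do. The sketch below uses 32-bit FNV-1a over the UTF-8 encoding of the key. This is a technique swapped in for illustration, not an algorithm taken from Memcached.ClientLibrary, and the class name StableKeyHash is hypothetical.

```csharp
using System.Text;

// Hypothetical platform-independent hash for cache keys (32-bit FNV-1a).
public static class StableKeyHash
{
    public static int Compute(string key)
    {
        unchecked
        {
            uint hash = 2166136261;              // FNV-1a 32-bit offset basis
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;                       // mix in the next byte
                hash *= 16777619;                // multiply by the FNV 32-bit prime
            }
            return (int)hash;                    // reinterpret as a signed int
        }
    }
}
```

The value returned by StableKeyHash.Compute(key) would then be passed to the hashcode-accepting overloads mentioned in option (2). Because the result is the same on 32-bit and 64-bit processes and across .NET versions, every client resolves a given key to the same server.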

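References: "Analysis of how the Memcached client distributes cached data across multiple servers" (repost); memcached client - memcacheddotnet (Memcached.ClientLibrary) 1.1.5; "Distributed implementation of memcache"; "Object.GetHashCode Method"; "On the feasibility of using a HashCode as a key".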
