Apache Ignite Study Notes (6): Using Entry Processors in Ignite

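In the earlier articles we already used two different ways to access data in Ignite. The first, introduced in the first article, is SQL through a JDBC client; the next article will show how to run SQL through the Ignite API without JDBC. The other way is what I call the cache API, that is, reading and writing data with get/put. Ignite implements the JCache (JSR 107) standard, so beyond the basic cache operations this article also covers some of the atomic cache operations and the use of EntryProcessor.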

Cache API

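Ignite offers a Map-like API for working with cached data. The difference is that Ignite distributes the map's contents across multiple nodes and guarantees that the operations are safe across threads and processes. We can simply call get/put from several nodes to read and write the cache, and leave data synchronization, concurrency control, and other hard problems to Ignite. Besides get/put, Ignite provides further atomic operations and asynchronous variants, such as getAndPutIfAbsent, getAndPutAsync, putIfAbsent, putIfAbsentAsync, getAndReplace, and getAndReplaceAsync; the complete list is in the IgniteCache API documentation.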
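To make the return values of these operations concrete, here is a minimal single-process sketch. It is not Ignite code: a JDK `ConcurrentHashMap` stands in for the distributed cache, and the class name `AtomicOpsSketch` and the sample keys are made up for illustration. The wrapper methods only mirror the shape of the corresponding cache calls (previous value vs. boolean result, blocking vs. future).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Local stand-in only: a ConcurrentHashMap plays the role of the cache.
class AtomicOpsSketch {
    final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

    // Shape of getAndPutIfAbsent: stores the value only if the key is absent
    // and returns the previous value (null if there was none).
    String getAndPutIfAbsent(String key, String value) {
        return cache.putIfAbsent(key, value);
    }

    // Shape of JCache putIfAbsent: reports whether the value was stored.
    boolean putIfAbsent(String key, String value) {
        return cache.putIfAbsent(key, value) == null;
    }

    // Shape of getAndReplace: replaces only an existing mapping and returns
    // the old value; for an absent key it returns null and stores nothing.
    String getAndReplace(String key, String value) {
        return cache.replace(key, value);
    }

    // Async flavour: the caller receives a future instead of blocking.
    CompletableFuture<String> getAndPutAsync(String key, String value) {
        return CompletableFuture.supplyAsync(() -> cache.put(key, value));
    }

    public static void main(String[] args) {
        AtomicOpsSketch s = new AtomicOpsSketch();
        System.out.println(s.getAndPutIfAbsent("Markham", "Ontario"));
        System.out.println(s.putIfAbsent("Markham", "Quebec"));
        System.out.println(s.getAndReplace("Toronto", "Ontario"));
        System.out.println(s.getAndPutAsync("Markham", "ON").join());
    }
}
```

In a real cluster the same calls go through `IgniteCache`, and the async variants return an Ignite future rather than a `CompletableFuture`.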

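Ignite also supports the entry processor defined by the JCache standard. I have not read the JCache definition of an entry processor closely, but going by the Ignite documentation and my own experience, an entry processor has the following properties and advantages over plain get/put: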

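Compared with basic operations such as get/put, an entry processor lets us implement more complex update logic. For example, we can read a value from the cache, run some custom computation on it, and then write the result back.
Like get/put/putIfAbsent, everything done inside an entry processor is atomic: the operations it defines either all succeed or all fail. Without an entry processor, achieving the same effect means locking the cache entry to be updated, updating it, and finally releasing the lock. With an entry processor we can concentrate on the update logic and not worry about locking and unlocking.
An entry processor runs directly on the data node. In a distributed cache, when the new value has to be computed from data already in the cache, the cached data often has to travel between nodes. An entry processor instead serializes the operation and sends it to the node that holds the data, which is more efficient than serializing the cached data itself.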
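The difference between "lock, update, unlock" and handing the update logic to the cache can be sketched in a single JVM. This is only an analogy, not Ignite code: `ConcurrentHashMap.compute()` plays the role of an entry processor because it applies a function to one key atomically, and a `ReentrantLock` plays the role of the explicit lock. The class name `ReadModifyWriteSketch` and its method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class ReadModifyWriteSketch {
    static final String KEY = "Markham";

    // Manual approach: lock, read, compute, write, unlock.
    static long withExplicitLock(int threads, int perThread) {
        ConcurrentHashMap<String, Long> cache = new ConcurrentHashMap<>();
        cache.put(KEY, 0L);
        ReentrantLock lock = new ReentrantLock();
        runInParallel(threads, perThread, () -> {
            lock.lock();
            try {
                cache.put(KEY, cache.get(KEY) + 1);
            } finally {
                lock.unlock();
            }
        });
        return cache.get(KEY);
    }

    // Entry-processor style: pass the update function to the map, which
    // applies it atomically for that key.
    static long withAtomicUpdate(int threads, int perThread) {
        ConcurrentHashMap<String, Long> cache = new ConcurrentHashMap<>();
        cache.put(KEY, 0L);
        runInParallel(threads, perThread,
                () -> cache.compute(KEY, (k, v) -> v == null ? 1L : v + 1));
        return cache.get(KEY);
    }

    // Runs the same update perThread times on each of the given threads.
    private static void runInParallel(int threads, int perThread, Runnable update) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    update.run();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("updates did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withExplicitLock(2, 10_000));
        System.out.println(withAtomicUpdate(2, 10_000));
    }
}
```

Both variants end with the same count; the second one simply has no lock management in the calling code, which is the point of the second bullet above.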

Entry Processor code example

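Let's rework the earlier example to see how to implement and invoke an entry processor in Ignite. The cache key is still the city name, but the value is no longer just the name of the province the city belongs to; it is an instance of a City class. Here is the definition of City: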

public class City {
    private String cityName;
    private String provinceName;
    private long population;

    public City(String cityName, String provinceName, long population) {
        this.cityName = cityName;
        this.provinceName = provinceName;
        this.population = population;
    }

    ...
}
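The elided part of the class is not shown in the original. The main program calls getCityName(), getPopulation(), setPopulation(), and toString() on it, so a class offering those methods might look roughly like the sketch below. Everything here is an assumption made for illustration: the class name `CitySketch`, the getProvinceName() accessor, and the toString() format are not taken from the original code.

```java
// Hypothetical stand-in for the elided accessors; the method names follow
// the calls made by the main program, the rest is assumed.
class CitySketch {
    private final String cityName;
    private final String provinceName;
    private long population;

    CitySketch(String cityName, String provinceName, long population) {
        this.cityName = cityName;
        this.provinceName = provinceName;
        this.population = population;
    }

    String getCityName() {
        return cityName;
    }

    String getProvinceName() {
        return provinceName;
    }

    long getPopulation() {
        return population;
    }

    void setPopulation(long population) {
        this.population = population;
    }

    @Override
    public String toString() {
        return "City{cityName=" + cityName
                + ", provinceName=" + provinceName
                + ", population=" + population + "}";
    }

    public static void main(String[] args) {
        CitySketch markham = new CitySketch("Markham", "Ontario", 0);
        markham.setPopulation(markham.getPopulation() + 1);
        System.out.println(markham);
    }
}
```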

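The City class has a population field that holds the city's population. The main program starts several threads that repeatedly update the population of the same city through an entry processor. Each entry processor invocation does something very simple: it reads the current population, adds 1, and writes the new value back to the cache. Here is the main program: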

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

import javax.cache.processor.EntryProcessor;
import javax.cache.processor.EntryProcessorException;
import javax.cache.processor.MutableEntry;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class IgniteEntryProcessorExample {
    public static void main(String[] args) {
        // start an ignite cluster
        Ignite ignite = startCluster(args);

        // create a partitioned cache named "CITY" with one backup copy
        CacheConfiguration<String, City> cacheCfg = new CacheConfiguration<>();
        cacheCfg.setName("CITY");
        cacheCfg.setCacheMode(CacheMode.PARTITIONED);
        cacheCfg.setBackups(1);
        IgniteCache<String, City> cityProvinceCache = ignite.getOrCreateCache(cacheCfg);

        // let's create a city and put it in the cache
        City markham = new City("Markham", "Ontario", 0);
        cityProvinceCache.put(markham.getCityName(), markham);
        System.out.println("Insert " + markham.toString());

        // submit two tasks to increase population
        ExecutorService service = Executors.newFixedThreadPool(2);
        IncreaseCityPopulationTask task1 = new IncreaseCityPopulationTask(cityProvinceCache, markham.getCityName(), 10000);
        IncreaseCityPopulationTask task2 = new IncreaseCityPopulationTask(cityProvinceCache, markham.getCityName(), 20000);
        Future<?> result1 = service.submit(task1);
        Future<?> result2 = service.submit(task2);
        System.out.println("Submit two tasks to increase the population");

        // wait until both tasks have finished
        service.shutdown();
        try {
            service.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        // get the population and check whether it is 30000
        City city = cityProvinceCache.get(markham.getCityName());
        if (city.getPopulation() != 30000) {
            System.out.println("Oops, the population is " + city.getPopulation() + " instead of 30000");
        } else {
            System.out.println("Yeah, the population is " + city.getPopulation());
        }
    }

    public static class IncreaseCityPopulationTask implements Runnable {
        private IgniteCache<String, City> cityProvinceCache;
        private String cityName;
        // number of times this task increments the population by 1
        private long population;

        public IncreaseCityPopulationTask(IgniteCache<String, City> cityProvinceCache,
                                          String cityName, long population) {
            this.cityProvinceCache = cityProvinceCache;
            this.cityName = cityName;
            this.population = population;
        }

        @Override
        public void run() {
            long p = 0;
            while (p++ < population) {
                // each invoke() atomically reads the city, adds 1, and writes it back
                cityProvinceCache.invoke(cityName, new EntryProcessor<String, City, Object>() {
                    @Override
                    public Object process(MutableEntry<String, City> mutableEntry, Object... objects)
                            throws EntryProcessorException {
                        City city = mutableEntry.getValue();
                        if (city != null) {
                            city.setPopulation(city.getPopulation() + 1);
                            mutableEntry.setValue(city);
                        }
                        return null;
                    }
                });
            }
        }
    }

    private static Ignite startCluster(String[] args) {
      ...
    }
}

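At the top of main(), as in the earlier examples, we start an Ignite node and create a cache named "CITY". The cache key is the city name (a String) and the cache value is a City instance.
Next we create a City instance named "Markham" whose initial population is 0, and put it in the cache.
We then create 2 threads. Each one runs the run() method of an IncreaseCityPopulationTask; the only difference is the number of increments given at construction time, 10000 for one task and 20000 for the other.
Once both tasks have finished, we read the "Markham" instance back from the cache and check whether its final population is 30000. If the sequence performed by each thread (read the cache, increase the population, write the cache) is atomic, the final result must be 30000.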
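The body of run() shows how an entry processor is actually used. Calling cityProvinceCache.invoke() invokes the entry processor. The first argument of invoke() is the key of the entry the processor operates on. The second argument is an entry processor instance, which must implement the process() method of the EntryProcessor interface. Further arguments may follow the second one; they are passed on to process() when it is called.
In process(), the first parameter, mutableEntry, carries the key and value that process() operates on, available through MutableEntry.getKey() and MutableEntry.getValue() (if the key has no value in the cache, getValue() returns null). The objects parameter that follows holds the arguments passed to invoke() beyond the key and the EntryProcessor.
An entry processor can implement fairly complex logic and then call MutableEntry.setValue() to modify the value. To delete the value, call MutableEntry.remove().
While an entry processor is being invoked, the corresponding key in the cache is locked, so different entry processors on the same key are mutually exclusive. This is what guarantees that all operations inside one entry processor are atomic.
One more thing to keep in mind: the operations in an entry processor need to be stateless, because the same entry processor may be executed more than once, on both the primary and the backup nodes. The operations must depend only on the current value in the cache. If they also depend on parameters or state of the local node, running the entry processor on different nodes can write inconsistent values to the cache. See the documentation of invoke() for details.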
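The example above passes no extra arguments to invoke(), always returns null, and never removes an entry. To show those three features together without needing a cluster, here is a toy in-memory model. It is a sketch under stated assumptions, not Ignite or JCache code: `ToyEntry`, `ToyProcessor`, `ToyCache`, and `InvokeSketch` are hypothetical names that only mimic the shape of MutableEntry, EntryProcessor, and invoke(), and per-key atomicity is borrowed from `ConcurrentHashMap.compute()`.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical single-process stand-ins shaped like JCache's MutableEntry
// and EntryProcessor; they are not the real interfaces.
interface ToyEntry<K, V> {
    K getKey();
    V getValue();           // null when the key has no value
    void setValue(V value);
    void remove();
}

interface ToyProcessor<K, V, T> {
    T process(ToyEntry<K, V> entry, Object... arguments);
}

class ToyCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();

    V get(K key) { return map.get(key); }
    void put(K key, V value) { map.put(key, value); }

    // Runs the processor atomically for one key; the extra arguments are
    // handed to process() unchanged and its return value is returned.
    <T> T invoke(K key, ToyProcessor<K, V, T> processor, Object... arguments) {
        Object[] result = new Object[1];
        map.compute(key, (k, current) -> {
            Slot<K, V> slot = new Slot<>(k, current);
            result[0] = processor.process(slot, arguments);
            return slot.value; // returning null removes the mapping
        });
        @SuppressWarnings("unchecked")
        T typed = (T) result[0];
        return typed;
    }

    private static final class Slot<K, V> implements ToyEntry<K, V> {
        private final K key;
        private V value;
        Slot(K key, V value) { this.key = key; this.value = value; }
        public K getKey() { return key; }
        public V getValue() { return value; }
        public void setValue(V value) { this.value = value; }
        public void remove() { this.value = null; }
    }
}

class InvokeSketch {
    // Adds the delta passed as the first extra argument and returns the new
    // population. It depends only on the entry and its arguments (stateless).
    static final ToyProcessor<String, Long, Long> ADD = (entry, arguments) -> {
        Long current = entry.getValue();
        if (current == null) {
            return null; // key absent: leave the cache untouched
        }
        long updated = current + (Long) arguments[0];
        entry.setValue(updated);
        return updated;
    };

    // Removes the entry and returns the value it had.
    static final ToyProcessor<String, Long, Long> REMOVE = (entry, arguments) -> {
        Long old = entry.getValue();
        entry.remove();
        return old;
    };

    public static void main(String[] args) {
        ToyCache<String, Long> cache = new ToyCache<>();
        cache.put("Markham", 100L);
        System.out.println(cache.invoke("Markham", ADD, 5L));
        System.out.println(cache.invoke("Toronto", ADD, 5L));
        System.out.println(cache.invoke("Markham", REMOVE));
        System.out.println(cache.get("Markham"));
    }
}
```

Passing the delta as an argument instead of reading it from a field of the surrounding object keeps the processor free of local state, which fits the stateless requirement described above.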

Summary

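This article covered operations on an Ignite cache beyond the basic put/get, such as asynchronous operations and entry processors. The complete code and Maven project for the example used in this article can be found here.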

In the next article, we will look at how to use Ignite's SQL API to query and modify a cache.