ES系列(七):多节点任务的分发与收集实现
我们知道,当我们对es发起search请求或其他操作时,往往都是随机选择一个coordinator发起请求。而这请求,可能是该节点能处理,也可能是该节点不能处理的,也可能是需要多节点共同处理的,可以说是情况比较复杂。
所以,coordinator的重要工作是,做请求分发与结果收集。那么,如何高性能和安全准确地实现这一功能则至关重要。而这,也许诸君各有思路,孰优孰劣不访一起来探讨探讨!
1. 请求分发的简单思路
我们这里所说的请求分发,一般是针对多个网络节点而言的。那么,如何将请求发往多节点,并在最终将结果合并起来呢?
害,无脑的先来一个。同步请求各节点,当第一个节点响应后,再向第二个节点发起请求,以此类推,直到所有节点请求完成,然后再将结果聚合起来。就完成了需求了,不费吹灰之力。简单不?
无脑处理自有无脑处理的缺点。依次请求各节点,无法很好利用系统的分布式特点,变并行为串行了,好不厉害。另外,对于当前请求,当其未处理完成这所有节点的分发收集工作时,当前线程将会一直被占用。从而,下游请求将无法再接入,从而将你了并发能力,使其与线程池大小同日而语了。这可不好。
我们依次想办法优化下。
首先,我们可以将串行分发请求变成并行分发,即可以使用多线程,向多节点发起请求,当某线程处理完成时,就返回结果。使用类似于CountDownLatch的同步工具,保证所有节点都处理完成后,再由外单主线程进行结果合并操作。
以上优化,看起来不错,避免了同步的性能问题。但是,当有某个节点响应非常慢时,它将阻塞后续节点的工作,从而使整个请求变慢,从而同样变成线程池的大小即是并发能力的瓶颈。可以说,治标不治本。
再来,继续优化。我们可以释放掉主线程的持有,让每个分发线程处理完成当前任务时,都去检查任务队列,是否已完成。如果未完成则忽略,如果已完成,则启动合并任务。
看起来不错,已经有完全并发样子了。但还能不能再优化?各节点的分发,同样是同步请求,虽然处理简单,但在这server响应期间,该线程仍是无法被使用的,如果类似请求过多,则必然是不小的消耗。如果能将单节点的请求,能够做到异步处理,那样岂不完美?但这恐怕不好做吧!不过,终归是一个不错的想法了。
2. es中search的多节点分发收集
我们以search的分发收集为出发点,观看es如何办成这件。原因是search在es中最为普遍与经典,虽说不得每个地方实现都一样,但至少参考意义还是有的。故以search为切入点。search的框架工作流程,我们之前已经研究过,本节就直接以核心开始讲解,它是在 TransportSearchAction.executeRequest() 中的。
// org.elasticsearch.action.search.TransportSearchAction#executeRequest
private void executeRequest(Task task, SearchRequest searchRequest,
SearchAsyncActionProvider searchAsyncActionProvider, ActionListener<SearchResponse> listener) {
final long relativeStartNanos = System.nanoTime();
final SearchTimeProvider timeProvider =
new SearchTimeProvider(searchRequest.getOrCreateAbsoluteStartMillis(), relativeStartNanos, System::nanoTime);
ActionListener<SearchSourceBuilder> rewriteListener = ActionListener.wrap(source -> {
if (source != searchRequest.source()) {
// only set it if it changed - we don't allow null values to be set but it might be already null. this way we catch
// situations when source is rewritten to null due to a bug
searchRequest.source(source);
}
final ClusterState clusterState = clusterService.state();
final SearchContextId searchContext;
final Map<String, OriginalIndices> remoteClusterIndices;
if (searchRequest.pointInTimeBuilder() != null) {
searchContext = SearchContextId.decode(namedWriteableRegistry, searchRequest.pointInTimeBuilder().getId());
remoteClusterIndices = getIndicesFromSearchContexts(searchContext, searchRequest.indicesOptions());
} else {
searchContext = null;
remoteClusterIndices = remoteClusterService.groupIndices(searchRequest.indicesOptions(),
searchRequest.indices(), idx -> indexNameExpressionResolver.hasIndexAbstraction(idx, clusterState));
}
OriginalIndices localIndices = remoteClusterIndices.remove(RemoteClusterAware.LOCAL_CLUSTER_GROUP_KEY);
if (remoteClusterIndices.isEmpty()) {
executeLocalSearch(
task, timeProvider, searchRequest, localIndices, clusterState, listener, searchContext, searchAsyncActionProvider);
} else {
// 多节点数据请求
if (shouldMinimizeRoundtrips(searchRequest)) {
// 通过 parentTaskId 关联所有子任务
final TaskId parentTaskId = task.taskInfo(clusterService.localNode().getId(), false).getTaskId();
ccsRemoteReduce(parentTaskId, searchRequest, localIndices, remoteClusterIndices, timeProvider,
searchService.aggReduceContextBuilder(searchRequest),
remoteClusterService, threadPool, listener,
(r, l) -> executeLocalSearch(
task, timeProvider, r, localIndices, clusterState, l, searchContext, searchAsyncActionProvider));
} else {
AtomicInteger skippedClusters = new AtomicInteger(0);
// 直接分发多shard请求到各节点
collectSearchShards(searchRequest.indicesOptions(), searchRequest.preference(), searchRequest.routing(),
skippedClusters, remoteClusterIndices, remoteClusterService, threadPool,
ActionListener.wrap(
searchShardsResponses -> {
// 当所有节点都响应后,再做后续逻辑处理,即此处的后置监听
final BiFunction<String, String, DiscoveryNode> clusterNodeLookup =
getRemoteClusterNodeLookup(searchShardsResponses);
final Map<String, AliasFilter> remoteAliasFilters;
final List<SearchShardIterator> remoteShardIterators;
if (searchContext != null) {
remoteAliasFilters = searchContext.aliasFilter();
remoteShardIterators = getRemoteShardsIteratorFromPointInTime(searchShardsResponses,
searchContext, searchRequest.pointInTimeBuilder().getKeepAlive(), remoteClusterIndices);
} else {
remoteAliasFilters = getRemoteAliasFilters(searchShardsResponses);
remoteShardIterators = getRemoteShardsIterator(searchShardsResponses, remoteClusterIndices,
remoteAliasFilters);
}
int localClusters = localIndices == null ? 0 : 1;
int totalClusters = remoteClusterIndices.size() + localClusters;
int successfulClusters = searchShardsResponses.size() + localClusters;
// 至于后续搜索实现如何,不在此间
executeSearch((SearchTask) task, timeProvider, searchRequest, localIndices, remoteShardIterators,
clusterNodeLookup, clusterState, remoteAliasFilters, listener,
new SearchResponse.Clusters(totalClusters, successfulClusters, skippedClusters.get()),
searchContext, searchAsyncActionProvider);
},
listener::onFailure));
}
}
}, listener::onFailure);
if (searchRequest.source() == null) {
rewriteListener.onResponse(searchRequest.source());
} else {
Rewriteable.rewriteAndFetch(searchRequest.source(), searchService.getRewriteContext(timeProvider::getAbsoluteStartMillis),
rewriteListener);
}
}
可以看到,es的search功能,会被划分为几种类型,有点会走集群分发,而有的则不需要。我们自然是希望走集群分发的,所以,只需看 collectSearchShards() 即可。这里面其实就是对多个集群节点的依次请求,当然还有结果收集。
// org.elasticsearch.action.search.TransportSearchAction#collectSearchShards
static void collectSearchShards(IndicesOptions indicesOptions, String preference, String routing, AtomicInteger skippedClusters,
Map<String, OriginalIndices> remoteIndicesByCluster, RemoteClusterService remoteClusterService,
ThreadPool threadPool, ActionListener<Map<String, ClusterSearchShardsResponse>> listener) {
// 使用该计数器进行结果控制
final CountDown responsesCountDown = new CountDown(remoteIndicesByCluster.size());
final Map<String, ClusterSearchShardsResponse> searchShardsResponses = new ConcurrentHashMap<>();
final AtomicReference<Exception> exceptions = new AtomicReference<>();
// 迭代各节点,依次发送请求
for (Map.Entry<String, OriginalIndices> entry : remoteIndicesByCluster.entrySet()) {
final String clusterAlias = entry.getKey();
boolean skipUnavailable = remoteClusterService.isSkipUnavailable(clusterAlias);
Client clusterClient = remoteClusterService.getRemoteClusterClient(threadPool, clusterAlias);
final String[] indices = entry.getValue().indices();
ClusterSearchShardsRequest searchShardsRequest = new ClusterSearchShardsRequest(indices)
.indicesOptions(indicesOptions).local(true).preference(preference).routing(routing);
// 向集群中 clusterAlias 异步发起请求处理 search
clusterClient.admin().cluster().searchShards(searchShardsRequest,
new CCSActionListener<ClusterSearchShardsResponse, Map<String, ClusterSearchShardsResponse>>(
clusterAlias, skipUnavailable, responsesCountDown, skippedClusters, exceptions, listener) {
@Override
void innerOnResponse(ClusterSearchShardsResponse clusterSearchShardsResponse) {
// 每次单节点响应时,将结果存放到 searchShardsResponses 中
searchShardsResponses.put(clusterAlias, clusterSearchShardsResponse);
} @Override
Map<String, ClusterSearchShardsResponse> createFinalResponse() {
// 所有节点都返回时,将结果集返回
return searchShardsResponses;
}
}
);
}
}
// org.elasticsearch.client.support.AbstractClient.ClusterAdmin#searchShards
@Override
public void searchShards(final ClusterSearchShardsRequest request, final ActionListener<ClusterSearchShardsResponse> listener) {
// 发起请求 indices:admin/shards/search_shards, 其对应处理器为 TransportClusterSearchShardsAction
execute(ClusterSearchShardsAction.INSTANCE, request, listener);
}
以上是es向集群中多节点发起请求的过程,其重点在于所有的请求都是异步请求,即向各节点发送完成请求后,当前线程即为断开状态。这就体现了无阻塞的能力了,以listner形式进行处理后续业务。这对于发送自然没有问题,但如何进行结果收集呢?实际上就是通过listner来处理的。在远程节点响应后,listener.onResponse()将被调用。
2.1. 多节点响应结果处理
这是我们本文讨论的重点。前面我们看到es已经异步发送请求出去了(且不论其如何发送),所以如何收集结果也很关键。而es中的做法则很简单,使用一个 ConcurrentHashMap 收集每个结果,一个CountDown标识是否已处理完成。
// org.elasticsearch.action.search.TransportSearchAction.CCSActionListener#CCSActionListener
CCSActionListener(String clusterAlias, boolean skipUnavailable, CountDown countDown, AtomicInteger skippedClusters,
AtomicReference<Exception> exceptions, ActionListener<FinalResponse> originalListener) {
this.clusterAlias = clusterAlias;
this.skipUnavailable = skipUnavailable;
this.countDown = countDown;
this.skippedClusters = skippedClusters;
this.exceptions = exceptions;
this.originalListener = originalListener;
} // 成功时的响应
@Override
public final void onResponse(Response response) {
// inner响应为将结果放入 searchShardsResponses 中
innerOnResponse(response);
// maybeFinish 则进行结果是否完成判定,如果完成,则调用回调方法,构造结果
maybeFinish();
} private void maybeFinish() {
// 使用一个 AtomicInteger 进行控制
if (countDown.countDown()) {
Exception exception = exceptions.get();
if (exception == null) {
FinalResponse response;
try {
// 创建响应结果,此处 search 即为 searchShardsResponses
response = createFinalResponse();
} catch(Exception e) {
originalListener.onFailure(e);
return;
}
// 成功响应回调,实现结果收集后的其他业务处理
originalListener.onResponse(response);
} else {
originalListener.onFailure(exceptions.get());
}
}
}
// CountDown 实现比较简单,只有最后一个返回true, 其他皆为false, 即实现了 At Most Once 语义
/**
* Decrements the count-down and returns <code>true</code> iff this call
* reached zero otherwise <code>false</code>
*/
public boolean countDown() {
assert originalCount > 0;
for (;;) {
final int current = countDown.get();
assert current >= 0;
if (current == 0) {
return false;
}
if (countDown.compareAndSet(current, current - 1)) {
return current == 1;
}
}
}
可见,ES中的结果收集,是以一个 AtomicInteger 实现的CountDown来处理的,当所有节点都响应时,就处理最终结果,否则将每个节点的数据放入ConcurrentHashMap中暂存起来。
而通过一个Client通用的异步调用框架,实现多节点的异步提交。整个节点响应以 CCSActionListener 作为接收者。可以说是比较简洁的了,好像也没有我们前面讨论的复杂性。因为:大道至简。
2.2. 异步提交请求实现
我们知道,如果本地想实现异步提交请求,只需使用另一个线程或者线程池技术,即可实现。而对于远程Client的异步提交,则还需要借助于外部工具了。此处借助于Netty的channel.write()实现,节点响应时再回调回来,从而恢复上下文。整个过程,没有一点阻塞同步,从而达到了高效的处理能力,当然还有其他的一些异常处理,自不必说。
具体样例大致如下:因最终的处理器是以 TransportClusterSearchShardsAction 进行处理的,所以直接转到 TransportClusterSearchShardsAction。
// org.elasticsearch.action.admin.cluster.shards.TransportClusterSearchShardsAction
public class TransportClusterSearchShardsAction extends
TransportMasterNodeReadAction<ClusterSearchShardsRequest, ClusterSearchShardsResponse> { private final IndicesService indicesService; @Inject
public TransportClusterSearchShardsAction(TransportService transportService, ClusterService clusterService,
IndicesService indicesService, ThreadPool threadPool, ActionFilters actionFilters,
IndexNameExpressionResolver indexNameExpressionResolver) {
super(ClusterSearchShardsAction.NAME, transportService, clusterService, threadPool, actionFilters,
ClusterSearchShardsRequest::new, indexNameExpressionResolver, ClusterSearchShardsResponse::new, ThreadPool.Names.SAME);
this.indicesService = indicesService;
} @Override
protected ClusterBlockException checkBlock(ClusterSearchShardsRequest request, ClusterState state) {
return state.blocks().indicesBlockedException(ClusterBlockLevel.METADATA_READ,
indexNameExpressionResolver.concreteIndexNames(state, request));
} @Override
protected void masterOperation(final ClusterSearchShardsRequest request, final ClusterState state,
final ActionListener<ClusterSearchShardsResponse> listener) {
ClusterState clusterState = clusterService.state();
String[] concreteIndices = indexNameExpressionResolver.concreteIndexNames(clusterState, request);
Map<String, Set<String>> routingMap = indexNameExpressionResolver.resolveSearchRouting(state, request.routing(), request.indices());
Map<String, AliasFilter> indicesAndFilters = new HashMap<>();
Set<String> indicesAndAliases = indexNameExpressionResolver.resolveExpressions(clusterState, request.indices());
for (String index : concreteIndices) {
final AliasFilter aliasFilter = indicesService.buildAliasFilter(clusterState, index, indicesAndAliases);
final String[] aliases = indexNameExpressionResolver.indexAliases(clusterState, index, aliasMetadata -> true, true,
indicesAndAliases);
indicesAndFilters.put(index, new AliasFilter(aliasFilter.getQueryBuilder(), aliases));
} Set<String> nodeIds = new HashSet<>();
GroupShardsIterator<ShardIterator> groupShardsIterator = clusterService.operationRouting()
.searchShards(clusterState, concreteIndices, routingMap, request.preference());
ShardRouting shard;
ClusterSearchShardsGroup[] groupResponses = new ClusterSearchShardsGroup[groupShardsIterator.size()];
int currentGroup = 0;
for (ShardIterator shardIt : groupShardsIterator) {
ShardId shardId = shardIt.shardId();
ShardRouting[] shardRoutings = new ShardRouting[shardIt.size()];
int currentShard = 0;
shardIt.reset();
while ((shard = shardIt.nextOrNull()) != null) {
shardRoutings[currentShard++] = shard;
nodeIds.add(shard.currentNodeId());
}
groupResponses[currentGroup++] = new ClusterSearchShardsGroup(shardId, shardRoutings);
}
DiscoveryNode[] nodes = new DiscoveryNode[nodeIds.size()];
int currentNode = 0;
for (String nodeId : nodeIds) {
nodes[currentNode++] = clusterState.getNodes().get(nodeId);
}
listener.onResponse(new ClusterSearchShardsResponse(groupResponses, nodes, indicesAndFilters));
}
}
// doExecute 在父类中完成
// org.elasticsearch.action.support.master.TransportMasterNodeAction#doExecute
@Override
protected void doExecute(Task task, final Request request, ActionListener<Response> listener) {
ClusterState state = clusterService.state();
logger.trace("starting processing request [{}] with cluster state version [{}]", request, state.version());
if (task != null) {
request.setParentTask(clusterService.localNode().getId(), task.getId());
}
new AsyncSingleAction(task, request, listener).doStart(state);
} // org.elasticsearch.action.support.master.TransportMasterNodeAction.AsyncSingleAction#doStart
AsyncSingleAction(Task task, Request request, ActionListener<Response> listener) {
this.task = task;
this.request = request;
this.listener = listener;
this.startTime = threadPool.relativeTimeInMillis();
} protected void doStart(ClusterState clusterState) {
try {
final DiscoveryNodes nodes = clusterState.nodes();
if (nodes.isLocalNodeElectedMaster() || localExecute(request)) {
// check for block, if blocked, retry, else, execute locally
final ClusterBlockException blockException = checkBlock(request, clusterState);
if (blockException != null) {
if (!blockException.retryable()) {
listener.onFailure(blockException);
} else {
logger.debug("can't execute due to a cluster block, retrying", blockException);
// 重试处理
retry(clusterState, blockException, newState -> {
try {
ClusterBlockException newException = checkBlock(request, newState);
return (newException == null || !newException.retryable());
} catch (Exception e) {
// accept state as block will be rechecked by doStart() and listener.onFailure() then called
logger.trace("exception occurred during cluster block checking, accepting state", e);
return true;
}
});
}
} else {
ActionListener<Response> delegate = ActionListener.delegateResponse(listener, (delegatedListener, t) -> {
if (t instanceof FailedToCommitClusterStateException || t instanceof NotMasterException) {
logger.debug(() -> new ParameterizedMessage("master could not publish cluster state or " +
"stepped down before publishing action [{}], scheduling a retry", actionName), t);
retryOnMasterChange(clusterState, t);
} else {
delegatedListener.onFailure(t);
}
});
// 本地节点执行结果,直接以异步线程处理即可
threadPool.executor(executor)
.execute(ActionRunnable.wrap(delegate, l -> masterOperation(task, request, clusterState, l)));
}
} else {
if (nodes.getMasterNode() == null) {
logger.debug("no known master node, scheduling a retry");
retryOnMasterChange(clusterState, null);
} else {
DiscoveryNode masterNode = nodes.getMasterNode();
final String actionName = getMasterActionName(masterNode);
// 发送到master节点,以netty作为通讯工具,完成后回调 当前listner
transportService.sendRequest(masterNode, actionName, request,
new ActionListenerResponseHandler<Response>(listener, responseReader) {
@Override
public void handleException(final TransportException exp) {
Throwable cause = exp.unwrapCause();
if (cause instanceof ConnectTransportException ||
(exp instanceof RemoteTransportException && cause instanceof NodeClosedException)) {
// we want to retry here a bit to see if a new master is elected
logger.debug("connection exception while trying to forward request with action name [{}] to " +
"master node [{}], scheduling a retry. Error: [{}]",
actionName, nodes.getMasterNode(), exp.getDetailedMessage());
retryOnMasterChange(clusterState, cause);
} else {
listener.onFailure(exp);
}
}
});
}
}
} catch (Exception e) {
listener.onFailure(e);
}
}
可见,es中确实有两种异步的提交方式,一种是当前节点就是执行节点,直接使用线程池提交;另一种是远程节点则起网络调用,最终如何实现异步且往下看。
// org.elasticsearch.transport.TransportService#sendRequest
public final <T extends TransportResponse> void sendRequest(final DiscoveryNode node, final String action,
final TransportRequest request,
final TransportRequestOptions options,
TransportResponseHandler<T> handler) {
final Transport.Connection connection;
try {
// 假设不是本节点,则获取远程的一个 connection, channel
connection = getConnection(node);
} catch (final NodeNotConnectedException ex) {
// the caller might not handle this so we invoke the handler
handler.handleException(ex);
return;
}
sendRequest(connection, action, request, options, handler);
}
// org.elasticsearch.transport.TransportService#getConnection
/**
* Returns either a real transport connection or a local node connection if we are using the local node optimization.
* @throws NodeNotConnectedException if the given node is not connected
*/
public Transport.Connection getConnection(DiscoveryNode node) {
if (isLocalNode(node)) {
return localNodeConnection;
} else {
return connectionManager.getConnection(node);
}
} // org.elasticsearch.transport.TransportService#sendRequest
/**
* Sends a request on the specified connection. If there is a failure sending the request, the specified handler is invoked.
*
* @param connection the connection to send the request on
* @param action the name of the action
* @param request the request
* @param options the options for this request
* @param handler the response handler
* @param <T> the type of the transport response
*/
public final <T extends TransportResponse> void sendRequest(final Transport.Connection connection, final String action,
final TransportRequest request,
final TransportRequestOptions options,
final TransportResponseHandler<T> handler) {
try {
final TransportResponseHandler<T> delegate;
if (request.getParentTask().isSet()) {
// If the connection is a proxy connection, then we will create a cancellable proxy task on the proxy node and an actual
// child task on the target node of the remote cluster.
// ----> a parent task on the local cluster
// |
// ----> a proxy task on the proxy node on the remote cluster
// |
// ----> an actual child task on the target node on the remote cluster
// To cancel the child task on the remote cluster, we must send a cancel request to the proxy node instead of the target
// node as the parent task of the child task is the proxy task not the parent task on the local cluster. Hence, here we
// unwrap the connection and keep track of the connection to the proxy node instead of the proxy connection.
final Transport.Connection unwrappedConn = unwrapConnection(connection);
final Releasable unregisterChildNode = taskManager.registerChildConnection(request.getParentTask().getId(), unwrappedConn);
delegate = new TransportResponseHandler<T>() {
@Override
public void handleResponse(T response) {
unregisterChildNode.close();
handler.handleResponse(response);
} @Override
public void handleException(TransportException exp) {
unregisterChildNode.close();
handler.handleException(exp);
} @Override
public String executor() {
return handler.executor();
} @Override
public T read(StreamInput in) throws IOException {
return handler.read(in);
} @Override
public String toString() {
return getClass().getName() + "/[" + action + "]:" + handler.toString();
}
};
} else {
delegate = handler;
}
asyncSender.sendRequest(connection, action, request, options, delegate);
} catch (final Exception ex) {
// the caller might not handle this so we invoke the handler
final TransportException te;
if (ex instanceof TransportException) {
te = (TransportException) ex;
} else {
te = new TransportException("failure to send", ex);
}
handler.handleException(te);
}
} // org.elasticsearch.transport.TransportService#sendRequestInternal
private <T extends TransportResponse> void sendRequestInternal(final Transport.Connection connection, final String action,
final TransportRequest request,
final TransportRequestOptions options,
TransportResponseHandler<T> handler) {
if (connection == null) {
throw new IllegalStateException("can't send request to a null connection");
}
DiscoveryNode node = connection.getNode(); Supplier<ThreadContext.StoredContext> storedContextSupplier = threadPool.getThreadContext().newRestorableContext(true);
ContextRestoreResponseHandler<T> responseHandler = new ContextRestoreResponseHandler<>(storedContextSupplier, handler);
// TODO we can probably fold this entire request ID dance into connection.sendReqeust but it will be a bigger refactoring
final long requestId = responseHandlers.add(new Transport.ResponseContext<>(responseHandler, connection, action));
final TimeoutHandler timeoutHandler;
if (options.timeout() != null) {
timeoutHandler = new TimeoutHandler(requestId, connection.getNode(), action);
responseHandler.setTimeoutHandler(timeoutHandler);
} else {
timeoutHandler = null;
}
try {
if (lifecycle.stoppedOrClosed()) {
/*
* If we are not started the exception handling will remove the request holder again and calls the handler to notify the
* caller. It will only notify if toStop hasn't done the work yet.
*/
throw new NodeClosedException(localNode);
}
if (timeoutHandler != null) {
assert options.timeout() != null;
timeoutHandler.scheduleTimeout(options.timeout());
}
connection.sendRequest(requestId, action, request, options); // local node optimization happens upstream
} catch (final Exception e) {
// usually happen either because we failed to connect to the node
// or because we failed serializing the message
final Transport.ResponseContext<? extends TransportResponse> contextToNotify = responseHandlers.remove(requestId);
// If holderToNotify == null then handler has already been taken care of.
if (contextToNotify != null) {
if (timeoutHandler != null) {
timeoutHandler.cancel();
}
// callback that an exception happened, but on a different thread since we don't
// want handlers to worry about stack overflows. In the special case of running into a closing node we run on the current
// thread on a best effort basis though.
final SendRequestTransportException sendRequestException = new SendRequestTransportException(node, action, e);
final String executor = lifecycle.stoppedOrClosed() ? ThreadPool.Names.SAME : ThreadPool.Names.GENERIC;
threadPool.executor(executor).execute(new AbstractRunnable() {
@Override
public void onRejection(Exception e) {
// if we get rejected during node shutdown we don't wanna bubble it up
logger.debug(
() -> new ParameterizedMessage(
"failed to notify response handler on rejection, action: {}",
contextToNotify.action()),
e);
}
@Override
public void onFailure(Exception e) {
logger.warn(
() -> new ParameterizedMessage(
"failed to notify response handler on exception, action: {}",
contextToNotify.action()),
e);
}
@Override
protected void doRun() throws Exception {
contextToNotify.handler().handleException(sendRequestException);
}
});
} else {
logger.debug("Exception while sending request, handler likely already notified due to timeout", e);
}
}
}
// org.elasticsearch.transport.RemoteConnectionManager.ProxyConnection#sendRequest
@Override
public void sendRequest(long requestId, String action, TransportRequest request, TransportRequestOptions options)
throws IOException, TransportException {
connection.sendRequest(requestId, TransportActionProxy.getProxyAction(action),
TransportActionProxy.wrapRequest(targetNode, request), options);
}
// org.elasticsearch.transport.TcpTransport.NodeChannels#sendRequest
@Override
public void sendRequest(long requestId, String action, TransportRequest request, TransportRequestOptions options)
throws IOException, TransportException {
if (isClosing.get()) {
throw new NodeNotConnectedException(node, "connection already closed");
}
TcpChannel channel = channel(options.type());
outboundHandler.sendRequest(node, channel, requestId, action, request, options, getVersion(), compress, false);
}
// org.elasticsearch.transport.OutboundHandler#sendRequest
/**
* Sends the request to the given channel. This method should be used to send {@link TransportRequest}
* objects back to the caller.
*/
void sendRequest(final DiscoveryNode node, final TcpChannel channel, final long requestId, final String action,
final TransportRequest request, final TransportRequestOptions options, final Version channelVersion,
final boolean compressRequest, final boolean isHandshake) throws IOException, TransportException {
Version version = Version.min(this.version, channelVersion);
OutboundMessage.Request message = new OutboundMessage.Request(threadPool.getThreadContext(), features, request, version, action,
requestId, isHandshake, compressRequest);
ActionListener<Void> listener = ActionListener.wrap(() ->
messageListener.onRequestSent(node, requestId, action, request, options));
sendMessage(channel, message, listener);
}
// org.elasticsearch.transport.OutboundHandler#sendMessage
private void sendMessage(TcpChannel channel, OutboundMessage networkMessage, ActionListener<Void> listener) throws IOException {
MessageSerializer serializer = new MessageSerializer(networkMessage, bigArrays);
SendContext sendContext = new SendContext(channel, serializer, listener, serializer);
internalSend(channel, sendContext);
}
private void internalSend(TcpChannel channel, SendContext sendContext) throws IOException {
channel.getChannelStats().markAccessed(threadPool.relativeTimeInMillis());
BytesReference reference = sendContext.get();
// stash thread context so that channel event loop is not polluted by thread context
try (ThreadContext.StoredContext existing = threadPool.getThreadContext().stashContext()) {
channel.sendMessage(reference, sendContext);
} catch (RuntimeException ex) {
sendContext.onFailure(ex);
CloseableChannel.closeChannel(channel);
throw ex;
}
}
// org.elasticsearch.transport.netty4.Netty4TcpChannel#sendMessage
@Override
public void sendMessage(BytesReference reference, ActionListener<Void> listener) {
// netty 发送数据,异步回调,完成异步请求
channel.writeAndFlush(Netty4Utils.toByteBuf(reference), addPromise(listener, channel)); if (channel.eventLoop().isShutdown()) {
listener.onFailure(new TransportException("Cannot send message, event loop is shutting down."));
}
}
简单说,就是依托于netty的pipeline机制以及eventLoop实现远程异步请求,至于具体实现如何,请参考之前文章或各网文。
本文单讨论如题话题,可大可小,通过思路罗列与es的实现参考,相信定能为大家带来一些碰撞的火花。
ES系列(七):多节点任务的分发与收集实现的更多相关文章
- ES系列七、ES-倒排索引详解
1.单词——文档矩阵 单词-文档矩阵是表达两者之间所具有的一种包含关系的概念模型,图3-1展示了其含义.图3-1的每列代表一个文档,每行代表一个单词,打对勾的位置代表包含关系. 图3-1 单词-文档矩 ...
- ES系列目录
ES系列一.CentOS7安装ES 6.3.1 ES系列二.CentOS7安装ES head6.3.1 ES系列三.基本知识准备 ES系列四.ES6.3常用api之文档类api ES系列五.ES6.3 ...
- Alamofire源码解读系列(七)之网络监控(NetworkReachabilityManager)
Alamofire源码解读系列(七)之网络监控(NetworkReachabilityManager) 本篇主要讲解iOS开发中的网络监控 前言 在开发中,有时候我们需要获取这些信息: 手机是否联网 ...
- ES系列十七、logback+ELK日志搭建
一.ELK应用场景 在复杂的企业应用服务群中,记录日志方式多种多样,并且不易归档以及提供日志监控的机制.无论是开发人员还是运维人员都无法准确的定位服务.服务器上面出现的种种问题,也没有高效搜索日志内容 ...
- ES系列十六、集群配置和维护管理
一.修改配置文件 1.节点配置 1.vim elasticsearch.yml # ======================== Elasticsearch Configuration ===== ...
- struts2官方 中文教程 系列七:消息资源文件
介绍 在本教程中,我们将探索使用Struts 2消息资源功能(也称为 resource bundles 资源绑定).消息资源提供了一种简单的方法,可以将文本放在一个视图页面中,通过应用程序,创建表单字 ...
- 计算广告CTR预估系列(七)--Facebook经典模型LR+GBDT理论与实践
计算广告CTR预估系列(七)--Facebook经典模型LR+GBDT理论与实践 2018年06月13日 16:38:11 轻春 阅读数 6004更多 分类专栏: 机器学习 机器学习荐货情报局 版 ...
- Keil MDK STM32系列(七) STM32F4基于HAL的PWM和定时器
Keil MDK STM32系列 Keil MDK STM32系列(一) 基于标准外设库SPL的STM32F103开发 Keil MDK STM32系列(二) 基于标准外设库SPL的STM32F401 ...
- WCF编程系列(七)信道及信道工厂
WCF编程系列(七)信道及信道工厂 信道及信道栈 前面已经提及过,WCF中客户端与服务端的交互都是通过消息来进行的.消息从客户端传送到服务端会经过多个处理动作,在WCF编程模型中,这些动作是按层 ...
随机推荐
- Consul 服务的注册和发现
Consul 是Hashicorp公司推出的开源工具,用于实现分布式系统的服务发现与配置.Consul是分布式的,高可用的,可横向扩展的. Consul 的主要特点有: Service Disc ...
- python函数默认值只初始化一次
当在函数中定义默认值时,值初始化只会进行一次,就是执行到def methodname时执行.看下面代码: from datetime import datetime def test(t=dateti ...
- [Java] 数据分析--数据预处理
数据结构 键-值对:HashMap 1 import java.io.File; 2 import java.io.FileNotFoundException; 3 import java.util. ...
- [Python] 可变/不可变类型 & 参数传递
与c/c++不同,Python/Java中的变量都是引用类型,没有值类型 Python赋值语句由三部分构成,例如: int a = 1 类型 标识 值 标识(identity):用于唯一标识 ...
- Docker——JVM 感知容器的 CPU 和 Memory 资源限制
前言 对于那些在Java应用程序中使用Docker的CPU和内存限制的人来说,可能会遇到一些挑战.特别是CPU限制,因为JVM在内部透明地设置GC线程和JIT编译器线程的数量. 这些可以通过命令行选项 ...
- DS1302应用电路
http://www.diangon.com/wenku/rd/danpianji/201501/00017904.html
- jenkins部署vue项目
一.新建自由风格的项目 二.配置项目 三.部分部署脚本 #!/bin/bashecho $PATHnpm config set proxy nullnpm config set https-proxy ...
- java并发编程工具类JUC第一篇:BlockingQueue阻塞队列
Java BlockingQueue接口java.util.concurrent.BlockingQueue表示一个可以存取元素,并且线程安全的队列.换句话说,当多线程同时从 JavaBlocking ...
- sentinel入门
sentinel下载:sentinel-dashboard-1.8.1.jar 下载完成后进入sentinel-dashboard-1.8.1.jar的文件夹,在cmd中输入java -jar sen ...
- Tengine AIFramework框架
Tengine AIFramework框架 在开源大势下,以数据.算力.算法为三驾马车的人工智能实现了初级阶段的产业化落地.任何一个技术领域成熟的标志是从应用到平台的成功迭代,AI 也不例外,最终引导 ...