HttpClient exception:ExceptionType:System.Threading.Tasks.TaskCanceledException: The operation was canceled. ---> System.IO.IOException: Unable to read data from the transport connection: Operation ca

error msg:

System.Threading.Tasks.TaskCanceledException: The operation was canceled. ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled. ---> System.Net.Sockets.SocketException: Operation canceled --- End of inner exception stack trace --- at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error) at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token) at System.Net.Http.HttpConnection.ReadAsync(Memory`1 destination) at System.Net.Http.HttpConnection.RawConnectionStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken) at System.Net.Security.SslStreamInternal.<FillBufferAsync>g__InternalFillBufferAsync|38_0[TReadAdapter](TReadAdapter adap, ValueTask`1 task, Int32 min, Int32 initial) at System.Net.Security.SslStreamInternal.ReadAsyncInternal[TReadAdapter](TReadAdapter adapter, Memory`1 buffer) at System.Net.Http.HttpConnection.FillAsync() at System.Net.Http.HttpConnection.ReadNextResponseHeaderLineAsync(Boolean foldedHeadersAllowed) at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken) --- End of inner exception stack trace ---

相关bug

Most / all of these requests were using POST operations (or other methods that have method bodies).
We suspect, but aren't sure, that this might be related to connection pooling in some manner.

That error message is for SocketError.OperationAborted, which likely means the socket was closed while the send or receive was in progress.

Is there anything here that we can look at?
To reproduce this we need to schedule > 5 runs of 4,000+ tests that takes an hour plus.
In such scenario, we typically see 1 - 3 such failures using SocketsHttpHandler.

Note that we never use SocketsHttpHandler directly, so I don't think that we improperly using it.
We got through HttpClient, a shared instance that is re-used among threads, as recommended.

The code is actually doing something like:

try

{

    return await httpHandler.SendAsync(...);

}

catch(HttpRequestException e)

{

}

So we can catch the exception and then check state / logging / something.

Anything you can recommend that we'll look at to get to the root cause?

Is it possible you're hitting a timeout? It shouldn't manifest this way, but it would give us a place to start looking. For example, if you change your HttpClient's Timeout to Timeout.InfiniteTimeSpan, does this still happen?

Or if it's potentially related to connection pooling, what happens if you change the SocketHttpHandler's PooledConnectionLifetime and PooledConnectionIdleTimeout to infinite... does it still repro?

And... I have a repro. On my machine, this reproduce the scenario instantly.
You were correct about the timeout causing this.

using System;

using System.Net.Http;

using System.Threading.Tasks;

namespace ConsoleApp15

{

    class Program

    {

        static async Task Main(string[] args)

        {

            // http://slowwly.robertomurray.co.uk/delay/3000/url/http://www.google.co.uk

            var client = new HttpClient(new SocketsHttpHandler

            {

            })

            {

                Timeout = TimeSpan.FromMilliseconds(250)

            };

            var tasks = new[]

            {

                Run(client),

                Run(client),

                Run(client),

                Run(client),

                Run(client),

                Run(client),

                Run(client),

                Run(client),

                Run(client),

            };

            await Task.WhenAll(tasks);

            Console.WriteLine("Done");

        }

        public static async Task Run(HttpClient client)

        {

            for (int i = 0; i < 25; i++)

            {

                try

                {

                    await client.GetStringAsync("http://slowwly.robertomurray.co.uk/delay/3000/url/http://www.google.co.uk");

                    Console.WriteLine("Should timeout");

                }

                catch (Exception e)

                {

                }

                await client.GetStringAsync("http://slowwly.robertomurray.co.uk/delay/100/url/http://www.google.co.uk");

            }

        }

    }

}

That's hitting a timeout and throwing an OperationCanceledException... that's expected.

What's not expected is getting an IOException (not wrapped in an OperationCanceledException) out of SendAsync and when nothing on the request is being canceled / timing out.

I captured System.Net logs and Wireshark traces of another failure. Here's where I see the failure in Wireshark:

1990370 214.439692    127.0.0.1 → 127.0.0.1    TCP 84 51686 → 51687 [FIN, ACK] Seq=1 Ack=10334254 Win=525568 Len=0

1990371 214.439736    127.0.0.1 → 127.0.0.1    TCP 84 51687 → 51686 [ACK] Seq=10334254 Ack=2 Win=525568 Len=0

1990408 274.618073    127.0.0.1 → 127.0.0.1    TCP 84 51687 → 51686 [RST, ACK] Seq=10334254 Ack=2 Win=0 Len=0

The RST looks unusual to me, especially since it's occurring after the connection should have been closed. The timestamp on the RST makes it look like is was sent 60 seconds after the connection was closed, but I'm still investigating whether or not that's accurate or just an issue with the logs I collected.

I'm going to try capture another network trace to see if the issue occurs after a FIN again, since that could be valuable info.

We are seeing the exact same thing when running our applications on Linux. A temporary "workaround" for us has been to set the timeout to 100 seconds(it's not an acceptable work around).

When we have the timeout to something more reasonable, like 10 seconds or something, we get the error every 500 request or something like that.

The httpendpoint that we are calling are under our control and when the error happens, we can't even see that the request has reached the service. So it seems like the request doesn't leave the box.
We are using dotnet core 2.1.

I've started a stopwatch for each requests and then Im catching TaskCanceledException and logs the following(this is when we have the timeout set to 15 seconds):

Task was cancelled after 15000ms System.Threading.Tasks.TaskCanceledException:
The operation was canceled. ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled. ---> System.Net.Sockets.SocketException: Operation canceled
--- End of inner exception stack trace ---
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error)
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)
at System.Net.Security.SslStreamInternal.g__InternalFillBufferAsync|38_0[TReadAdapter](TReadAdapter adap, ValueTask`1 task, Int32 min, Int32 initial)

Thanks for the additional info. It's possible the behavior you're seeing is related, but it sounds more like an actual timeout. In this case there are two differences from what I've observed. (1) in all other cases I've seen the connection is actually sending the data, as observed via wireshark and (2) in other cases the failure occurs well before a timeout should have occurred.

There's an open issue right now (#20296) tracking the fact that we throw TaskCanceledException on an actual request timeout. I think that's probably the actual root cause here.

If you think that your issue isn't actually being caused by timeouts though, feel free to open a new issue with all of the info you shared here.

10.

Interesting development. We have encountered this issue after adding to our test suite a test that spawns a process with powershell script (very simple one - writing 1 line to a file). ~30 seconds after this tests there is a failure in 1-3 random subsequent tests.

Because of that we thought that this might be related to processes inheriting handles from the parent one and it appears that we are right.

We have made a PR that unskipps those tests and switches the implementation of Process to the one that sets 'inherit handles' to 'false'. After doing so everything seems to be fine, it looks like the issue is with handle inheritance and the behavior of those handles after that child process ends.

The PR in question is here: ravendb/ravendb#8028

Our idea is that, at least as a workaround, it would be good to be able to set the

retVal = Interop.Kernel32.CreateProcess(

                                null,                // we don't need this since all the info is in commandLine

                                commandLine,         // pointer to the command line string

                                ref unused_SecAttrs, // address to process security attributes, we don't need to inherit the handle

                                ref unused_SecAttrs, // address to thread security attributes.

                                true,                // handle inheritance flag

                                creationFlags,       // creation flags

                                (IntPtr)environmentBlockPtr, // pointer to new environment block

                                workingDirectory,    // pointer to current directory name

                                ref startupInfo,     // pointer to STARTUPINFO

                                ref processInfo      // pointer to PROCESS_INFORMATION

                            );

11.

That's very interesting -- thanks for sharing that info. I briefly looked into the possibility that this was related to socket handles a few weeks ago after observing similar behavior with the h2spec process spawned by the Kestrel tests. It's probably worth looking into in more detail if you're seeing the same behavior.

12.

Note that as of 3.0, socket handles are no longer inherited: #32903

13.

that would probably solve the issue, but:

Can this be applied to 2.1?

Can you add private bool _inheritHandles = true field to Process.cs that we would be able to set via reflection to false at least?

14.

If that actually "fixes" it, I have trouble believing it's anything other than masking a symptom, and we should actually find the root cause and fix that.

Unless we believe the root cause really is that the socket handle is getting inherited and somehow that child process then closing is causing problems for the socket? In which case we should be able to repro that pretty easily, and if that's actually the problem, then I agree it would be a candidate for back porting.

15.

Fixing the root cause would be best, but till then could you consider applying point 2. so we can workaround this?

16.

Applying it where... as a servicing fix to 2.1? That's unlikely.

17.

Adding that private field in Process.cs (probably Process.Windows.cs) and using it instead of

would be a problem?

18.

The bar for servicing fixes is very high. You're suggesting adding unexposed, unvalidated, functionality-accessible-only-via-reflection-and-that-we-encourage-developers-to-use-via-reflection as a servicing fix. Hence "unlikely". If you need it immediately, you can of course make the change in your own local fork.

Let's get to the bottom of the problem and then figure out the right fix.

19.

I agree with @stephentoub.
I feel we are getting closer. @rmkerr invested already weeks of his time trying to get to the bottom of this (incl. extensive discussion with winsock team in Windows). Let's see if this new information can help root cause it and address the root cause.
If it doesn't help too much, we can think about either porting the change #32903 to 2.1, or doing something to enable people work around the suspectible root cause.

20.

I've been trying to build a repro that launches a process and then has several sockets sending and receiving data, but I have not been able to force the issue to occur. @ppekrol, would it be possible to set up a project with just the affected test and the code you're using to spawn the process?

If not I'll put some more time into the repro when I'm back in the office next week.

21.

Hi @rmkerr
Apologies for the delay, I really doubt that this would be doable. We already spend significant amount of hours to isolate the issue, create a smaller subset of tests, but for an unknown reason, it only appears for us on a larger set of tests e.g. our full suite. I do not have an idea what else could be done to isolate this. We tried moving that from SlowTests project (takes around 20 min to complete) to FastsTests (around 90 seconds), but it does not reproduce then.

22.

No problem @ppekrol. The info you shared is still valuable. I'm back in the office today, and will spend some time trying to cut down the size of the repro.

23.

I'm able to reproduce the behavior described by @ppekrol with a modified version of the kestrel test suite.

To start, I disabled all of the tests except for the H2Spec tests (which are launches as a separate process) andConnectionNotClosedWhenClientSatisfiesMinimumDataRateGivenLargeResponseHeaders, which was the test that I observed failing the most frequently. In order to reproduce the issue more consistently, I set up the ConnectionNotClosed... test to run five times.

I was able to further minimize the repro by replacing the H2Spec tests with the following code:

[Fact]

public void LaunchProcess()

{

    for (int i = 0; i < 100; ++i)

    {

        //Thread.Sleep(100);

        Process.Start("Powershell.exe", @"""C:\Tools\HelloWorld.ps1""").WaitForExit();

    }

}

The targeted script just prints "hello world" to the console.

When running the test suites after this modification, I was able to reproduce the failure on 6 out of 10 runs. After replacing Process.Start with Thread.Sleep (as commented out above), the tests passed on 10 out of 10 runs.

Obviously this is pretty rough testing, and there isn't a lot of data. But I do think the behavior is interesting.

HttpClient exception:ExceptionType:System.Threading.Tasks.TaskCanceledException: The operation was canceled. ---> System.IO.IOException: Unable to read data from the transport connection: Operation ca的更多相关文章

.Net多线程编程—System.Threading.Tasks.Parallel
System.Threading.Tasks.Parallel类提供了Parallel.Invoke,Parallel.For,Parallel.ForEach这三个静态方法. 1 Parallel. ...
转载 Net多线程编程—System.Threading.Tasks.Parallel
.Net多线程编程—System.Threading.Tasks.Parallel System.Threading.Tasks.Parallel类提供了Parallel.Invoke,Paral ...
System.Threading.Tasks.Task 引起的 IIS 应用池崩溃
接口服务运行一段时间后,IIS应用池就会突然挂掉,事件查看日志,会有事件日志Event ID为5011的错误为应用程序池“PokeIn”提供服务的进程在与 Windows Process Activ ...
System.Threading.Tasks.Task引起的IIS应用程序池崩溃
问题现象 IIS应用程序池崩溃(Crash)的特征如下: 1. 从客户端看,浏览器一直处于连接状态,Web服务器无响应. 2. 从服务器端看(Windows Server 2008 + IIS 7.0 ...
一、并行编程 - 数据并行 System.Threading.Tasks.Parallel 类
一.并行概念 1.并行编程在.NET 4中的并行编程是依赖Task Parallel Library(后面简称为TPL) 实现的.在TPL中,最基本的执行单元是task(中文可以理解为"任 ...
System.Threading.Tasks.Task 任务引起的IIS应用程序池崩溃
转载:http://www.cnblogs.com/aaa6818162/p/4421305.html 问题现象 IIS应用程序池崩溃(Crash)的特征如下: 1. 从客户端看,浏览器一直处于连接状 ...
using System.Threading.Tasks;
using System.Threading.Tasks; .Net并行库介绍——Task1
vs2013c#测试using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Threading.Tasks; namespace ConsoleApplication1_CXY { class Program { stati
首先安装Unit Test Generator.方法为:工具->扩展和更新->联机->搜索“图标为装有蓝色液体的小试管.Unit Test Generator”, 编写代码,生成一个 ...
System.Threading.Tasks并发和异步代码使用
main.cs System.Threading.Tasks.Parallel.For(0, 10, i => { TestLock test ...

随机推荐

2018-2019-2 网络对抗技术 20165336 Exp4 恶意代码分析
2018-2019-2 网络对抗技术 20165336 Exp4 恶意代码分析 1.实践目标 1.1是监控你自己系统的运行状态,看有没有可疑的程序在运行. 1.2是分析一个恶意软件,就分析Exp2或E ...
004-数据结构之ADT-栈与队列【数组方式实现】
一.ADT 在介绍抽象数据类型的时候,先看看什么是数据类型,在Java中可能首先会想到像 int,double这样的词,这是Java中的基本数据类型,一个数据类型会涉及到两件事: ①.拥有特定特征的数 ...
音乐类产品——“网易云音乐”app交互原型模板（免费使用）
网易云音乐虽是一款音乐app,但有人说它也是社交界的一股清流以及一匹黑马.音乐带给人的感染,激发着很多人在这里表达着他们的情绪和心声.网易云音乐上的真实用户点评,不仅被印在地铁的广告牌上,还在朋友圈频 ...
DOS特殊字符转义方法
http://www.robvanderwoude.com/escapechars.php 期望得到的字符转义后字符说明 % %% May not always be required in do ...
Windows SFTP 的安装
用于Windows系统的免费SFTP服务器-Free SFTP Servers 前不久,有人问我:“怎么从 Linux 系统传文件到 Windows 服务器,不能用 FTP 协议.” 文件数量不大.用 ...
adobe air for ios 例子
越南soha 平台ios ane http://yun.baidu.com/s/1o65G9XS 里面有oc源码,as库,以及打包生成ane,测试ane 全过程
iOS 开发笔记 - 导航到地图
导航到地图,已经不是什么新鲜事了.网上有好多参考的资料,我总结出只需要两步第一步:在info中加上支持的各平台比如:iosamap高德地图.comgooglemaps谷歌地图.baidumap百度 ...
Vue2leaflet 替换国内地图api,带{z}/{x}/{y}形式的
参考:https://www.cnblogs.com/gispathfinder/p/9535685.html Vue2leaflet安装后,默认自带的地图URL如下 url:'http://{s}. ...
HTML5的优点与缺点？
优点:a.网络标准统一.HTML5本身是由W3C推荐出来的. b.多设备.跨平台 c.即时更新,提高可用性和改进用户的友好体验: d.有几个新的标签,这将有助于开发人员定义重要的内容: e.可以给站点 ...
POJ 3162 bit区间查询最值+树形DP
POJ 3162 『题目链接』POJ 3162 『题目类型』bit区间查询最值+树形DP ✡Problem: 一棵n个节点的树.wc爱跑步,跑n天,第i天从第i个节点开始跑步,每次跑到距第i个节点最远 ...

HttpClient exception:ExceptionType:System.Threading.Tasks.TaskCanceledException: The operation was canceled. ---> System.IO.IOException: Unable to read data from the transport connection: Operation ca

HttpClient exception:ExceptionType:System.Threading.Tasks.TaskCanceledException: The operation was canceled. ---> System.IO.IOException: Unable to read data from the transport connection: Operation ca的更多相关文章

随机推荐

热门专题