Introduction

Span<T> is a new type we are adding to the platform to represent contiguous regions of arbitrary memory, with performance characteristics on par with T[]. Its APIs are similar to the array, but unlike arrays, it can point to either managed or native memory, or to memory allocated on the stack.

  // managed memory
  var arrayMemory = new byte[100];
  var arraySpan = new Span<byte>(arrayMemory);

  // native memory
  var nativeMemory = Marshal.AllocHGlobal(100);
  Span<byte> nativeSpan;
  unsafe {
      nativeSpan = new Span<byte>(nativeMemory.ToPointer(), 100);
  }
  SafeSum(nativeSpan);
  Marshal.FreeHGlobal(nativeMemory);

  // stack memory
  Span<byte> stackSpan;
  unsafe {
      byte* stackMemory = stackalloc byte[100];
      stackSpan = new Span<byte>(stackMemory, 100);
  }
  SafeSum(stackSpan);

As such, Span<T> is an abstraction over all types of memory available to .NET programs.

  // this method does not care what kind of memory it works on
  static ulong SafeSum(Span<byte> bytes) {
      ulong sum = 0;
      for (int i = 0; i < bytes.Length; i++) {
          sum += bytes[i];
      }
      return sum;
  }

When wrapping an array, Span<T> is not limited to pointing to the first element of the array. It can point to any sub-range. In other words, it supports slicing.

  var array = new byte[] { 1, 2, 3 };
  var slice = new Span<byte>(array, start:1, length:2);
  Console.WriteLine(slice[0]); // prints 2
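
A slice is a view over the original memory, not a copy. The following is a minimal sketch of that property, assuming the Span<T> type as it ships in the System namespace of current .NET (the draft API in this document may differ in details):

```csharp
using System;

// The array that owns the memory.
var array = new byte[] { 1, 2, 3 };

// A view over elements 1 and 2 of the array; no bytes are copied.
var slice = new Span<byte>(array, 1, 2);

// Writing through the slice modifies the underlying array.
slice[0] = 42;

Console.WriteLine(array[1]);     // prints 42
Console.WriteLine(slice.Length); // prints 2
```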

API Surface

The full API surface of Span<T> is not yet finalized, but the main APIs we will expose are the following:

  public struct Span<T> {
      public Span(T[] array);
      public Span(T[] array, int index);
      public Span(T[] array, int index, int length);
      public unsafe Span(void* memory, int length);

      public static implicit operator Span<T>(ArraySegment<T> arraySegment);
      public static implicit operator Span<T>(T[] array);

      public int Length { get; }
      public ref T this[int index] { get; }

      public Span<T> Slice(int index);
      public Span<T> Slice(int index, int length);
      public bool TryCopyTo(T[] destination);
      public bool TryCopyTo(Span<T> destination);

      public T[] ToArray();
  }
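
As a rough illustration of how these members fit together, here is a short sketch written against the Span<T> type available in current .NET, which is assumed here to match the members listed above closely enough for the purpose. The variable names are illustrative only.

```csharp
using System;

// Implicit conversion from T[] to Span<T>.
Span<int> source = new int[] { 10, 20, 30, 40 };

// The indexer returns a ref, so compound assignment updates the element in place.
source[1] += 5; // source is now 10, 25, 30, 40

// Slice(index) takes the tail; Slice(index, length) takes a window.
Span<int> tail = source.Slice(2);      // 30, 40
Span<int> window = source.Slice(1, 2); // 25, 30

// TryCopyTo returns false instead of throwing when the destination is too small.
Span<int> tooSmall = new int[1];
bool copiedToSmall = window.TryCopyTo(tooSmall);
Span<int> bigEnough = new int[2];
bool copiedToBig = window.TryCopyTo(bigEnough);

// ToArray allocates a new array holding a copy of the span's contents.
int[] copy = tail.ToArray();

Console.WriteLine(copiedToSmall); // prints False
Console.WriteLine(copiedToBig);   // prints True
Console.WriteLine(copy.Length);   // prints 2
```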

In addition, we will provide a read-only version of Span<T>. The ReadOnlySpan<T> is required to represent slices of immutable and read-only structures, e.g. System.String slices. This is discussed below.

Scenarios

Span<T> is a small, but critical, building block for a much larger effort to provide .NET APIs to enable development of high scalability server applications.

The .NET Framework design philosophy has focused almost solely on productivity for developers writing application software. In addition, many of the Framework’s design decisions were made assuming Windows client-server applications circa 1999. This design philosophy is a big part of .NET’s success as .NET is universally viewed as a very high productivity platform.

But the landscape has shifted since our platform was conceived almost 20 years ago. We now target non-Windows operating systems, our developers write cloud hosted services demanding different tradeoffs than client-server applications, the state of the art patterns have moved away from once popular technologies like XML, UTF16, SOAP (to name a few), and the hardware running today’s software is very different than what was available 20 years ago.

When we analyze the gaps we have today and the requirements of today’s high scale servers, we realize that we need to provide modern no-copy, low-allocation, and UTF8 data transformation APIs that are efficient, reliable, and easy to use. Prototypes of such APIs are available in the corefxlab repository, and Span<T> is one of the fundamental building blocks for these APIs.

Data Pipelines

Modern servers are often designed as pipelines (frequently reactive ones) of components that transform byte buffers. For example, such a pipeline in a web server might consist of the following transformations: socket fills in a buffer -> HTTP parsing -> decompression -> Base 64 decoding -> routing -> HTML writing -> HTML escaping -> HTTP writing -> compression -> socket writing.

Span<byte> is very useful for implementing transformation routines of such data pipelines. First, Span<T> allows the server to freely switch between managed and native buffers depending on the situation or settings. For example, Windows RIO sockets work best with native buffers, and libuv Kestrel works best with pinned managed arrays. Secondly, it allows complicated transformation algorithms to be implemented in safe code without the need to resort to using raw pointers. Lastly, the fact that Span<T> is sliceable allows the pipeline to abstract away the physical chunks of buffers, treating them as uniform logical chunks relevant to that particular section of the pipeline.

The stack-only nature of Span<T> (see more on this below) allows pooled memory to be safely returned to the pool after the transformation pipeline completes, and allows the pipeline to pass only the relevant slice of the buffer to each transformation routine/component. In other words, Span<T> aids in lifetime management of pooled buffers, which is critical to the performance of today's servers.

Discontinuous Buffers

As alluded to before, data pipelines often process data in chunks as they arrive at a socket. This creates problems for data transformation routines such as parsing, which often have to deal with data that can reside in two or more buffers. For example, there might be a need to parse an integer residing partially in one buffer and partially in another. Since spans can abstract stack memory, they can solve this problem in a very elegant and performant way, as illustrated in the following routine from the ASP.NET Channels pipeline (full source):

  public unsafe static uint GetUInt32(this ReadableBuffer buffer) {
      ReadOnlySpan<byte> textSpan;

      if (buffer.IsSingleSpan) { // if data in single buffer, it’s easy
          textSpan = buffer.First.Span;
      }
      else if (buffer.Length < 128) { // else, consider temp buffer on stack
          var data = stackalloc byte[128];
          var destination = new Span<byte>(data, 128);
          buffer.CopyTo(destination);
          textSpan = destination.Slice(0, buffer.Length);
      }
      else {
          // else pay the cost of allocating an array
          textSpan = new ReadOnlySpan<byte>(buffer.ToArray());
      }

      uint value;
      var utf8Buffer = new Utf8String(textSpan);
      // yet the actual parsing routine is always the same and simple
      if (!PrimitiveParser.TryParseUInt32(utf8Buffer, out value)) {
          throw new InvalidOperationException();
      }
      return value;
  }
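
The same idea can be sketched without the prototype types. The example below is not the Channels code: it assumes only two chunks, uses the Utf8Parser class from System.Buffers.Text in current .NET in place of PrimitiveParser, and ParseAcrossChunks and ParseUtf8 are hypothetical helper names introduced for illustration.

```csharp
using System;
using System.Buffers.Text;

// Parses UTF-8 digits into a uint; the bytes may live on the heap or on the stack.
static uint ParseUtf8(ReadOnlySpan<byte> text)
{
    if (!Utf8Parser.TryParse(text, out uint value, out int bytesConsumed)
        || bytesConsumed != text.Length)
    {
        throw new InvalidOperationException("Input is not a valid unsigned integer.");
    }
    return value;
}

// Hypothetical helper: the digits may be split across two chunks.
static uint ParseAcrossChunks(ReadOnlySpan<byte> first, ReadOnlySpan<byte> second)
{
    // All data in a single chunk: parse it in place.
    if (second.IsEmpty) return ParseUtf8(first);

    int total = first.Length + second.Length;

    // Small inputs are stitched together on the stack; larger ones pay for an array.
    Span<byte> scratch = total <= 128 ? stackalloc byte[128] : new byte[total];
    first.CopyTo(scratch);
    second.CopyTo(scratch.Slice(first.Length));

    // The parsing step is the same regardless of where the bytes live.
    return ParseUtf8(scratch.Slice(0, total));
}

byte[] chunk1 = { (byte)'1', (byte)'2' };
byte[] chunk2 = { (byte)'3', (byte)'4' };
uint parsed = ParseAcrossChunks(chunk1, chunk2);
Console.WriteLine(parsed); // prints 1234
```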

Non-Allocating Substring

Modern server protocols are more often than not text-based, and so it's not surprising that such servers often create and manipulate lots of strings.

One of the most common basic string operations is string slicing. Currently, System.String.Substring is the main .NET API for creating slices of a string, but the API is inefficient as it creates a new string to represent the slice and copies the characters from the original string to the new string slice. Because of this inefficiency, high performance servers avoid this API in their internals where they can, and pay the cost only in their publicly facing APIs.

ReadOnlySpan<char> could be a much more efficient standard representation of a subsection of a string:

  public struct ReadOnlySpan<T> {
      public ReadOnlySpan(T[] array);
      public ReadOnlySpan(T[] array, int index);
      public ReadOnlySpan(T[] array, int index, int length);
      public unsafe ReadOnlySpan(void* memory, int length);

      public int Length { get; }
      public T this[int index] { get; }

      public ReadOnlySpan<T> Slice(int index);
      public ReadOnlySpan<T> Slice(int index, int count);

      public bool TryCopyTo(T[] destination);
      public bool TryCopyTo(Span<T> destination);
  }

  ReadOnlySpan<char> lengthText = "content-length:123".Slice(15);
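
To make the contrast with Substring concrete, here is a small sketch. It assumes current .NET, where the string extension method is named AsSpan rather than the Slice shown in the proposal above.

```csharp
using System;

string header = "content-length:123";

// Substring allocates a new string and copies the characters into it.
string copied = header.Substring(15);

// AsSpan creates a view over the same characters without allocating.
ReadOnlySpan<char> lengthText = header.AsSpan(15);

Console.WriteLine(copied);            // prints 123
Console.WriteLine(lengthText.Length); // prints 3
```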

Parsing

Currently, the .NET parsing APIs require the exact string representing the text being parsed to be passed as the argument to the APIs:

  int value1 = int.Parse("123");
  int value2 = int.Parse("content-length:123".Substring(15)); // this allocates

We do not have APIs that can parse a slice of a string or a text buffer without the need to first allocate a substring representing the text being parsed. ReadOnlySpan<char>-based APIs, together with non-allocating substring APIs discussed above, could solve this problem:

  public static class SpanParsingExtensions {
      public static bool TryParse(this ReadOnlySpan<char> text, out int value);
  }

  "content-length:123".Slice(15).TryParse(out int value);
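
For comparison, current .NET offers an int.TryParse overload that accepts a ReadOnlySpan<char>, which gives the same effect as the extension method proposed above. A minimal sketch, assuming that overload and the AsSpan string extension:

```csharp
using System;

string header = "content-length:123";

// The span overload of int.TryParse reads the digits in place; no substring is allocated.
bool ok = int.TryParse(header.AsSpan(15), out int contentLength);

Console.WriteLine(ok);            // prints True
Console.WriteLine(contentLength); // prints 123
```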

The API can be further improved (by taking a ReadOnlySpan<byte>) to parse text buffers regardless of the encoding (e.g. UTF8):

  public static class SpanParsingExtensions {
      public static bool TryParse(this ReadOnlySpan<byte> text, EncodingData encoding, out int value);
  }

  var byteArray = new byte[] { 49, 50, 51 };
  var bytesFromStringSlice = "content-length:123".Slice(15).As<byte>();
  var bytesFromUtf8StringSlice = new Utf8String("content-length:123").Slice(15);

  byteArray.TryParse(EncodingData.Utf8, out int value1);
  bytesFromStringSlice.TryParse(EncodingData.Utf16, out int value2);
  bytesFromUtf8StringSlice.TryParse(EncodingData.Utf8, out int value3);
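
A UTF8-only version of this can be sketched with the Utf8Parser class from System.Buffers.Text in current .NET. This is an assumption relative to the proposal: Utf8Parser has no encoding parameter and handles only UTF8 input.

```csharp
using System;
using System.Buffers.Text;
using System.Text;

// "content-length:123" as UTF-8 bytes, e.g. as received from a socket.
byte[] utf8Header = Encoding.UTF8.GetBytes("content-length:123");

// Slice off the digits and parse them directly from the byte buffer.
ReadOnlySpan<byte> digits = utf8Header.AsSpan(15);
bool parsedOk = Utf8Parser.TryParse(digits, out int length, out int bytesConsumed);

Console.WriteLine(parsedOk);      // prints True
Console.WriteLine(length);        // prints 123
Console.WriteLine(bytesConsumed); // prints 3
```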

Formatting

Similarly, formatting (the reverse of parsing) can be done elegantly and efficiently on existing memory buffers backed by slices of arrays, native buffers, and stack-allocated arrays. For example, the following routine from corefxlab formats an integer (as UTF8 text) into an arbitrary byte buffer:

  public static bool TryFormat(this int value, Span<byte> buffer, out int bytesWritten);
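
A routine of this shape is available in current .NET as Utf8Formatter.TryFormat in System.Buffers.Text. The sketch below uses it (rather than the corefxlab extension method) to format into a stack-allocated buffer:

```csharp
using System;
using System.Buffers.Text;
using System.Text;

// Format into a stack-allocated buffer; the digits never touch the heap.
Span<byte> buffer = stackalloc byte[16];
bool formatted = Utf8Formatter.TryFormat(12345, buffer, out int bytesWritten);

// Only the first bytesWritten bytes hold the formatted text.
string text = Encoding.UTF8.GetString(buffer.Slice(0, bytesWritten));
Console.WriteLine(text); // prints 12345

// A destination that is too small makes the call return false instead of throwing.
Span<byte> tiny = stackalloc byte[2];
bool fits = Utf8Formatter.TryFormat(12345, tiny, out _);
Console.WriteLine(fits); // prints False
```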

Buffer Pooling

Span<T> can be used to pool memory from a large single buffer allocated on the native heap. This decreases [pointless] work the GC needs to perform to manage pooled buffers, which never get collected anyway, but often need to be permanently pinned, which is bad for the system. Also, the fact that native memory does not move lowers the cost of interop and the cost of pool related error checking (e.g. checking if a buffer is already returned to the pool).

Separately, the stack-only nature of Span<T> makes lifetime management of pooled memory more reliable; it helps in avoiding use-after-free errors with pooled memory. Without Span<T>, it’s often not clear when a pooled buffer that was passed to a separate module can be returned to the pool, as the module could be holding on to the buffer for later use. With Span<T>, the server pipeline can be sure that there are no more references to the buffer after the stack pops to the frame that first allocated the span and passed it down to other modules.
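
The lifetime argument can be sketched as follows. Note the substitution: this example rents from the managed ArrayPool<T> in System.Buffers rather than from a native-heap pool as described above, and FillWith and SumOf are hypothetical stand-ins for pipeline components.

```csharp
using System;
using System.Buffers;

// Hypothetical pipeline stages: each one only sees the span it is handed.
static void FillWith(Span<byte> destination, byte value) => destination.Fill(value);

static int SumOf(ReadOnlySpan<byte> data)
{
    int sum = 0;
    foreach (byte b in data) sum += b;
    return sum;
}

byte[] rented = ArrayPool<byte>.Shared.Rent(64); // may return a larger array
int total;
try
{
    // Hand out only the 64 bytes that were asked for, not the whole rented array.
    Span<byte> window = rented.AsSpan(0, 64);
    FillWith(window, 2);
    total = SumOf(window);
}
finally
{
    // The callees received spans, which cannot be stored on the heap,
    // so nothing they keep can reference the buffer once they return.
    ArrayPool<byte>.Shared.Return(rented);
}

Console.WriteLine(total); // prints 128
```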

Native code interop

Today, unmanaged buffers passed across the unmanaged-to-managed boundary are frequently copied to byte[] to allow safe access from managed code. Span<T> can eliminate the need to copy in many such scenarios.

Secondly, a number of performance critical APIs in the Framework take unsafe pointers as input. Examples include Encoding.GetChars or Buffer.MemoryCopy. Over time, we should provide more safe APIs that use Span<T>, which will allow more code to compile as safe but still preserve its performance characteristics.

Requirements

To support the scenarios described above, Span<T> must meet the following requirements:

  1. Ability to wrap managed and native memory
  2. Performance characteristics on par with arrays
  3. Be memory-safe

Design/Representation

We will provide two different implementations of Span<T>:

  • Fast Span<T> (available on runtimes with special support for spans)
  • Slow Span<T> (available on all current .NET runtimes, even existing ones, e.g. .NET 4.5)

The fast implementation will rely on "ref field" support and will look as follows:

  public struct Span<T> {
      internal ref T _pointer;
      internal int _length;
  }

A prototype of such fast Span<T> can be found here. Through the magic of the "ref field", it can support slicing without requiring a strong pointer to the root of the sliced object. The GC is able to trace the interior pointer, keep the root object alive, and update the interior pointer if the object is relocated during a collection.

A different representation will be implemented for platforms that don’t support ref fields (interior pointers):

  public struct Span<T> {
      internal IntPtr _pointer;
      internal object _relocatableObject;
      internal int _length;
  }

A prototype of this design can be found here. In this representation, the Span<T>'s indexer will add the _pointer and the address of _relocatableObject before accessing items in the Span. This will make the accessor slower, but it will ensure that when the GC moves the sliced object (e.g. array) in memory, the indexer still accesses the right memory location. Note that if the Span wraps a managed object, the _pointer field will be the offset from the object's root to the object's data slice, but if the Span wraps native memory, the _pointer will point to the memory and the _relocatableObject will be set to null (zero). In either case, adding the pointer and the address of the object (null == 0) results in the right "effective" address.

Struct Tearing

Struct tearing is a threading issue that affects all structs larger than what can be atomically updated on the target processor architecture. For example, some 64-bit processors can only update one 64-bit aligned memory block atomically. This means that some processors won’t be able to update both the _pointer and the _length fields of the Span atomically. This in turn means that the following code might result in another thread observing _pointer and _length fields belonging to two different spans (the original one and the one being assigned to the field):

  internal class Buffer {
      Span<byte> _memory = new byte[1024];

      public void Resize(int newSize) {
          _memory = new byte[newSize]; // this will not update atomically
      }

      public byte this[int index] => _memory[index]; // this might see partial update
  }

For most structs, tearing is at most a correctness bug and can be dealt with by making the fields (typed as the tearable struct type) non-public and synchronizing access to them. But since Span needs to be as fast as the array, access to the field cannot be synchronized. Also, because of the fact that Span accesses (and writes to) memory directly, having the _pointer and the _length be out of sync could result in memory safety being compromised.

The only other way (besides synchronizing access, which would not be practical) to avoid this issue is to make Span a stack-only type, i.e. its instances can reside only on the stack (which is accessed by one thread).

Span<T> will be stack-only

Span<T> will be a stack-only type; more precisely, it will be a by-ref type (just like its field in the fast implementation). This means that Spans cannot be boxed, cannot appear as a field of a non-stack-only type, and cannot be used as a generic argument. However, Span<T> can be used as a type of method arguments or return values.

We chose to make Span<T> stack-only as it solves several problems:

  • Efficient representation and access: Span<T> can be just a managed pointer and a length.
  • Efficient GC tracking: limits the number of interior pointers that the GC has to track. Tracking of interior pointers in the heap during GC would be pretty expensive.
  • Safe concurrency (struct tearing discussed above): Span<T> assignment does not have to be atomic. Atomic assignment would be required for storing Span<T> on the heap to avoid data tearing issues.
  • Safe lifetime: Safe code cannot create dangling pointers by storing a Span<T> on the heap when it points to unmanaged memory or stack memory. The unsafe stack frame responsible for creating an unsafe Span is responsible for ensuring that it won’t escape the scope.
  • Reliable buffer pooling: buffers can be rented from a pool, wrapped in spans, the spans passed to user code, and when the stack unwinds, the program can reliably return the buffer to the pool as it can be sure that there are no outstanding references to the buffer.

The fast representation makes the type instances automatically stack-only, i.e. the constraint will be enforced by the CLR. This restriction should also be enforced by managed language compilers and/or analyzers for better developer experience. For the slow Span<T>, language compiler checks and/or analyzers are the only option (as the runtimes won't enforce the stack-only restriction).

Memory<T>

As alluded to above, in the upcoming months, many data transformation components in .NET (e.g. Base64Encoding, compressions, formatting, parsing) will provide APIs operating on memory buffers. We will do this work to develop no-copy/low-allocation end-to-end data pipelines, like the ASP.NET Channels. These APIs will use a collection of types, including, but not limited to, Span<T>, to represent various data pipeline primitives and exchange types.

This new collection of types must be usable by two distinct sets of customers:

  • Productivity developers (99% case): these are the developers who use LINQ, async, lambdas, etc., and often for good reasons care more about productivity than squeezing the last cycles out of some low level transformation routines.
  • Low level developers (1% case): our library and framework authors for whom performance is a critical aspect of their work.

Even though the goals of each group are different, they rely on each other to be successful. One is a necessary consumer of the other.

A stack-only type with the associated trade-offs is great for low level developers writing data transformation routines. Productivity developers, writing apps, may not be so thrilled when they realize that when using stack-only types, they lose many of the language features they rely on to get their jobs done (e.g. async await). And so, a stack-only type simply can’t be the primary exchange type we recommend for high level developers/scenarios/APIs.

For the whole platform to be successful, we must add an exchange type, currently called Memory<T>, that can be used with the full power of the language, i.e. it’s not stack-only. Memory<T> can be seen as a “promise” of a Span. It can be freely used in generics, stored on the heap, used with async await, and all the other language features we all love. When Memory<T> is finally ready to be manipulated by a data transformation routine, it will be temporarily converted to a span (the promise will be realized), which will provide much more efficient (remember "on par with array") access to the buffer's data.
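
The "promise" relationship can be sketched with the Memory<T> and ReadOnlyMemory<T> types as they exist in current .NET, which is an assumption relative to the prototype referenced in this document. Sum and SumLaterAsync are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

// Synchronous worker: operates on a span for fast, array-like access.
static int Sum(ReadOnlySpan<byte> bytes)
{
    int sum = 0;
    foreach (byte b in bytes) sum += b;
    return sum;
}

// An async method cannot keep a span alive across an await,
// but ReadOnlyMemory<T> is an ordinary struct and can live on the heap.
static async Task<int> SumLaterAsync(ReadOnlyMemory<byte> memory)
{
    await Task.Yield();      // the memory value survives the suspension
    return Sum(memory.Span); // the promise is realized only when the data is needed
}

Memory<byte> buffer = new byte[] { 1, 2, 3, 4 };
int result = await SumLaterAsync(buffer.Slice(1, 3));
Console.WriteLine(result); // prints 9
```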

See a prototype of Memory<T> here. Note that the prototype is currently not tearing safe. We will make it safe in the upcoming weeks.

Other Random Thoughts

Optimizations

We need to enable the existing array bounds check optimizations for Span<T> (in both the static compiler and the JIT) to make its performance on par with arrays. Longer term, we should optimize struct passing and construction to make slicing operations on Spans more efficient. Today, we recommend that Spans are sliced only when a shorter span needs to be passed to a different routine. Within a single routine, code should do index arithmetic to access subranges of spans.

Conversions

Span<T> will support reinterpret cast conversions to Span<byte>. It will also support casts between arbitrary primitive types, but those casts will be considered unsafe. The reason for this limitation is that some processors don’t support efficient unaligned memory access.

A prototype of such API can be found here, and the API can be used as follows:

  var bytes = new Span<byte>(buffer);
  var characters = bytes.Cast<byte, char>();
  if (char.IsLower(characters[0])) { ... }
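
In current .NET, reinterpret casts of this kind are exposed as static methods on MemoryMarshal in System.Runtime.InteropServices rather than as the Cast extension method shown above. A minimal sketch under that assumption:

```csharp
using System;
using System.Runtime.InteropServices;

// Four bytes, enough for two UTF-16 code units.
Span<byte> bytes = new byte[4];

// Reinterpret the same memory as chars; no data is copied.
Span<char> characters = MemoryMarshal.Cast<byte, char>(bytes);
characters[0] = 'h';
characters[1] = 'i';

Console.WriteLine(characters.Length);           // prints 2
Console.WriteLine(char.IsLower(characters[0])); // prints True

// Going the other way: view a span of primitives as raw bytes.
Span<int> numbers = new int[] { 1, 2, 3 };
Span<byte> raw = MemoryMarshal.AsBytes(numbers);
Console.WriteLine(raw.Length);                  // prints 12
```

Which of the two bytes of each char holds the character code depends on the endianness of the machine, so code that inspects the reinterpreted bytes individually should not assume a particular order.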

Platform Support Plans

We want to ship Span<T> as a NuGet fat package available for .NET Standard 1.1 and above. Runtimes that support by-ref fields and returns will get the fast ref-field Span<T>. Other runtimes will get the slower three-field Span<T>.

Relationship to Array Slicing

Since Span<T> will be a stack-only type, it’s not well suited as the general representation of an array slice. When an array is sliced, the majority of our users expect the result to be either an array, or at least a type that is very similar to the array (e.g. ArraySegment<T>). We will design array slicing separately from Span<T>.

Covariance

Unlike T[], Span<T> will not support covariant casts, i.e. casting Span<Subtype> to Span<Basetype>. Because of that, we won’t be doing covariance checks when storing references in Span<T> instances.
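
The difference can be observed in current .NET, where (as assumed in this sketch) the Span<T> constructor rejects an array whose runtime element type differs from T, while arrays defer the check to every store:

```csharp
using System;

// Array covariance: a string[] can be referenced as an object[].
object[] objects = new string[] { "a", "b" };

// Because of that, every store into the array needs a runtime type check.
bool arrayStoreFailed = false;
try { objects[0] = 42; }
catch (ArrayTypeMismatchException) { arrayStoreFailed = true; }

// The span constructor rejects the covariant array up front,
// so stores through a span need no per-element type check.
bool spanCreationFailed = false;
try { _ = new Span<object>(objects); }
catch (ArrayTypeMismatchException) { spanCreationFailed = true; }

Console.WriteLine(arrayStoreFailed);   // prints True
Console.WriteLine(spanCreationFailed); // prints True
```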

Language Support

Separately from this document, we are exploring language features to better support Span<T>:

  1. Enforcement of Stack-Only Type Restrictions

    Span<T> and ReadOnlySpan<T> will be included in the set of built-in stack-only types. Any other struct containing one of these will be transitively considered a stack-only type. The compiler will report an error if a stack-only type is used in a disallowed context, e.g. used as a type argument or placed on the heap (boxed, passed to an asynchronous call, used as a field of a class, etc.).

  2. Language Support for pinning

      Span<byte> buffer = ...
      fixed (byte* pBuffer = buffer) {
          ...
      }

  3. Slicing syntax

    The C# compiler will add slicing syntax, and Memory<T>, Span<T>, and ReadOnlySpan<T> will support it. The details are TBD, but imagine that a[1..5] calls a.Slice(1, 5) or a.Slice(new Range(1, 5)).

      Span<T> span = ...
      var slice = span[1..5];

  4. Safe Span<T> stackalloc

      void SafeMethod() {
          var buffer = stackalloc Span<byte>(128);
          PrimitiveFormatter.TryFormat(buffer, DateTime.Now, ...);
      }

  5. Primitive constraint

    Some Span<T> operations are valid only for so-called primitive type arguments. For example, the reinterpret cast operation.

    We are exploring adding the ability to constrain type parameters to primitive types, i.e. types that are bit blittable. The cast operation would constrain its type parameters as follows:

      public static Span<U> Cast<T, U>(this Span<T> slice) where T:primitive where U:primitive
      { ... }

Open Issues

  1. Detailed design of Memory<T>
    • Representation
    • Operations
    • Lifetime/Pinning
  2. Span<T> API design details
    • Type of the Length property
    • Namespace and type name
  3. Details of runtime optimizations
