.NET性能优化-为结构体数组使用StructLinq

前言

本系列的主要目的是告诉大家在遇到性能问题时，有哪些方案可以去优化；并不是要求大家一开始就使用这些方案来提升性能。

在之前几篇文章中，有很多网友就有一些非此即彼的观念，在实际中，处处都是开发效率和性能之间取舍的艺术。《计算机编程艺术》一书中提到过早优化是万恶之源，在进行性能优化时，你必须要问自己几个问题，看需不要进行性能优化。

优化的成本高么？
如果立刻开始优化会带来什么影响？
因为对任务目标的影响或是兴趣等其他原因而关注这个问题？
任务目标影响有多大？
随着硬件性能提升或者框架版本升级，优化的结果会不会过时？
如果不进行优化或延迟优化的进行会带来什么负面的影响？
如果不进行优化或延迟优化，相应的时间或成本可以完成什么事情，是否更有价值？

如果评估下来，还是优化的利大于弊，而且在合理的时间范围内，那么就去做；如果觉得当前应用的QPS不高、用户体验也还好、内存和CPU都有空余，那么就放一放，主要放在二八法则中能为你创建80%价值的事情上。但是大家要记住过早优化是万恶之源不是写垃圾代码的借口。

回到正题，在上篇文章《使用结构体替代类》中有写在缓存和大数据量计算时使用结构体有诸多的好处，最后关于计算性能的例子中，我使用的是简单的for循环语句，但是在C#中我们使用LINQ多于使用for循环。有小伙伴就问了两个问题：

平时使用的LINQ对于结构体是值传递还是引用传递？
如果是值传递，那么有没有办法改为引用传递？达到更好性能？

针对这两个问题特意写一篇回答一下，字数不多，几分钟就能阅读完。

Linq是值传递

在.NET平台上，默认对于值类型的方法传参都是值传递，除非在方法参数上指定ref，才能变为引用传递。

同样，在LINQ实现的Where、Select、Take众多方法中，也没有加入ref关键字，所以在LINQ中全部都是值传递，如果结构体Size大于8byte（当前平台的指针大小），那么在调用方法时，结构体的速度要慢于引用传递的类。

比如我们编写如下代码，使用常见的Linq API进行数据的结构化查询，分别使用结构体和类，看看效果，数组数据量为1w。

public class SomeClass

{

    public int Value1; public int Value2;

    public float Value3; public double Value4;

    public string? Value5; public decimal Value6;

    public DateTime Value7; public TimeOnly Value8;

    public DateOnly Value9;

}  

public struct SomeStruct

{

    public int Value1; public int Value2;

    public float Value3; public double Value4;

    public string? Value5; public decimal Value6;

    public DateTime Value7; public TimeOnly Value8;

    public DateOnly Value9;

}

[MemoryDiagnoser]

[Orderer(SummaryOrderPolicy.FastestToSlowest)]

public class Benchmark

{

    private static readonly SomeClass[] ClassArray;

    private static readonly SomeStruct[] StructArray;  

    static Benchmark()

    {

        var baseTime = DateTime.Now;

        ClassArray = new SomeClass[10000];

        StructArray = new SomeStruct[10000];

        for (int i = 0; i < 10000; i++)

        {

            var item = new SomeStruct

            {

                Value1 = i, Value2 = i, Value3 = i,

                Value4 = i, Value5 = i.ToString(),

                Value6 = i, Value7 = baseTime.AddHours(i),

                Value8 = TimeOnly.MinValue, Value9 = DateOnly.MaxValue

            };

            StructArray[i] = item;

            ClassArray[i] = new SomeClass

            {

                Value1 = i, Value2 = i, Value3 = i,

                Value4 = i, Value5 = i.ToString(),

                Value6 = i, Value7 = baseTime.AddHours(i),

                Value8 = TimeOnly.MinValue, Value9 = DateOnly.MaxValue

            };

        }

    }  

    [Benchmark(Baseline = true)]

    public decimal Class()

    {

        return ClassArray.Where(x => x.Value1 > 5000)

            .Where(x => x.Value3 > 5000)

            .Where(x => x.Value7 > DateTime.MinValue)

            .Where(x => x.Value5 != string.Empty)

            .Where(x => x.Value6 > 1)

            .Where(x => x.Value8 > TimeOnly.MinValue)

            .Where(x => x.Value9 > DateOnly.MinValue)

            .Skip(100)

            .Take(10000)

            .Select(x => x.Value6)

            .Sum();

    }

    [Benchmark]

    public decimal Struct()

    {

        return StructArray.Where(x => x.Value1 > 5000)

            .Where(x => x.Value3 > 5000)

            .Where(x => x.Value7 > DateTime.MinValue)

            .Where(x => x.Value5 != string.Empty)

            .Where(x => x.Value6 > 1)

            .Where(x => x.Value8 > TimeOnly.MinValue)

            .Where(x => x.Value9 > DateOnly.MinValue)

            .Skip(100)

            .Take(10000)

            .Select(x => x.Value6)

            .Sum();

    }

}

Benchmakr的结果如下，大家看到在速度上有5倍的差距，结构体由于频繁装箱内存分配的也更多。

那么注定没办开开心心的在结构体上用LINQ了吗？那当然不是，引入我们今天要给大家介绍的项目。

使用StructLinq

首先来介绍一下StructLinq，在C#中用结构体实现LINQ，以大幅减少内存分配并提高性能。引入IRefStructEnumerable，以提高元素为胖结构体(胖结构体是指结构体大小大于16Byte)时的性能。

引入StructLinq

这个库已经分发在 NuGet上。可以直接通过下面的命令安装 StructLinq :

PM> Install-Package StructLinq

简单使用

下方就是一个简单的使用，用来求元素和。唯一不同的地方就是需要调用ToStructEnumerable方法。

using StructLinq;

int[] array = new [] {1, 2, 3, 4, 5};

int result = array

                .ToStructEnumerable()

                .Where(x => (x & 1) == 0, x=>x)

                .Select(x => x *2, x => x)

                .Sum();

x=>x用于避免装箱（和分配内存），并帮助泛型参数推断。你也可以通过对Where和Select函数使用结构来提高性能。

性能

所有的跑分结果你可以在这里找到. 举一个例子，下方代码的Linq查询:

   list

     .Where(x => (x & 1) == 0)

     .Select(x => x * 2)

     .Sum();

可以被替换为下面的代码:

  list

    .ToStructEnumerable()

    .Where(x => (x & 1) == 0)

    .Select(x => x * 2)

    .Sum();

或者你想零分配内存，可以像下面一样写（类型推断出来，没有装箱）:

 list

   .ToStructEnumerable()

   .Where(x => (x & 1) == 0, x=>x)

   .Select(x => x * 2, x=>x)

   .Sum(x=>x);

如果想要零分配和更好的性能，可以像下面一样写：

  var where = new WherePredicate();

  var select = new SelectFunction();

  list

    .ToStructEnumerable()

    .Where(ref @where, x => x)

    .Select(ref @select, x => x, x => x)

    .Sum(x => x);

上方各个代码的Benchmark结果如下所示:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042

Intel Core i7-8750H CPU 2.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores

.NET Core SDK=5.0.101

  [Host]     : .NET Core 5.0.1 (CoreCLR 5.0.120.57516, CoreFX 5.0.120.57516), X64 RyuJIT

  DefaultJob : .NET Core 5.0.1 (CoreCLR 5.0.120.57516, CoreFX 5.0.120.57516), X64 RyuJIT

Method	Mean	Error	StdDev	Ratio	Gen 0	Gen 1	Gen 2	Allocated
LINQ	65.116 μs	0.6153 μs	0.5756 μs	1.00	-	-	-	152 B
StructLinqWithDelegate	26.146 μs	0.2402 μs	0.2247 μs	0.40	-	-	-	96 B
StructLinqWithDelegateZeroAlloc	27.854 μs	0.0938 μs	0.0783 μs	0.43	-	-	-	-
StructLinqZeroAlloc	6.872 μs	0.0155 μs	0.0137 μs	0.11	-	-	-	-

StructLinq在这些场景里比默认的LINQ实现快很多。

在上文场景中使用

我们也把上面的示例代码使用StructLinq改写一下。

// 引用类型使用StructLinq

[Benchmark]

public double ClassStructLinq()

{

    return ClassArray

        .ToStructEnumerable()

        .Where(x => x.Value1 > 5000)

        .Where(x => x.Value3 > 5000)

        .Where(x => x.Value7 > DateTime.MinValue)

        .Where(x => x.Value5 != string.Empty)

        .Where(x => x.Value6 > 1)

        .Where(x => x.Value8 > TimeOnly.MinValue)

        .Where(x => x.Value9 > DateOnly.MinValue)

        .Skip(100)

        .Take(10000)

        .Select(x => x.Value4)

        .Sum(x => x);

}  

// 结构体类型使用StructLinq

[Benchmark]

public double StructLinq()

{

    return StructArray

        .ToStructEnumerable()

        .Where(x => x.Value1 > 5000)

        .Where(x => x.Value3 > 5000)

        .Where(x => x.Value7 > DateTime.MinValue)

        .Where(x => x.Value5 != string.Empty)

        .Where(x => x.Value6 > 1)

        .Where(x => x.Value8 > TimeOnly.MinValue)

        .Where(x => x.Value9 > DateOnly.MinValue)

        .Skip(100)

        .Take(10000)

        .Select(x => x.Value4)

        .Sum(x => x);

}  

// 结构体类型 StructLinq 零分配

[Benchmark]

public double StructLinqZeroAlloc()

{

    return StructArray

        .ToStructEnumerable()

        .Where(x => x.Value1 > 5000, x=> x)

        .Where(x => x.Value3 > 5000, x => x)

        .Where(x => x.Value7 > DateTime.MinValue, x => x)

        .Where(x => x.Value5 != string.Empty, x => x)

        .Where(x => x.Value6 > 1, x => x)

        .Where(x => x.Value8 > TimeOnly.MinValue, x => x)

        .Where(x => x.Value9 > DateOnly.MinValue, x => x)

        .Skip(100)

        .Take(10000)

        .Select(x => x.Value4, x => x)

        .Sum(x => x);

}  

// 结构体类型 StructLinq 引用传递

[Benchmark]

public double StructLinqRef()

{

    return StructArray

        .ToRefStructEnumerable()  // 这里使用的是ToRefStructEnumerable

        .Where((in SomeStruct x) => x.Value1 > 5000)

        .Where((in SomeStruct x) => x.Value3 > 5000)

        .Where((in SomeStruct x) => x.Value7 > DateTime.MinValue)

        .Where((in SomeStruct x) => x.Value5 != string.Empty)

        .Where((in SomeStruct x) => x.Value6 > 1)

        .Where((in SomeStruct x) => x.Value8 > TimeOnly.MinValue)

        .Where((in SomeStruct x) => x.Value9 > DateOnly.MinValue)

        .Skip(100)

        .Take(10000)

        .Select((in SomeStruct x) => x.Value4)

        .Sum(x => x);

}  

// 结构体类型 StructLinq 引用传递 零分配

[Benchmark]

public double StructLinqRefZeroAlloc()

{

    return StructArray

        .ToRefStructEnumerable()

        .Where((in SomeStruct x) => x.Value1 > 5000, x=> x)

        .Where((in SomeStruct x) => x.Value3 > 5000, x=> x)

        .Where((in SomeStruct x) => x.Value7 > DateTime.MinValue, x=> x)

        .Where((in SomeStruct x) => x.Value5 != string.Empty, x=> x)

        .Where((in SomeStruct x) => x.Value6 > 1, x => x)

        .Where((in SomeStruct x) => x.Value8 > TimeOnly.MinValue, x=> x)

        .Where((in SomeStruct x) => x.Value9 > DateOnly.MinValue, x=> x)

        .Skip(100, x => x)

        .Take(10000, x => x)

        .Select((in SomeStruct x) => x.Value4, x=> x)

        .Sum(x => x, x=>x);

}  

// 结构体 直接for循环

[Benchmark]

public double StructFor()

{

    double sum = 0;

    int skip = 100;

    int take = 10000;

    for (int i = 0; i < StructArray.Length; i++)

    {

        ref var x = ref StructArray[i];

        if(x.Value1 <= 5000) continue;

        if(x.Value3 <= 5000) continue;

        if(x.Value7 <= DateTime.MinValue) continue;

        if(x.Value5 == string.Empty) continue;

        if(x.Value6 <= 1) continue;

        if(x.Value8 <= TimeOnly.MinValue) continue;

        if(x.Value9 <= DateOnly.MinValue) continue;

        if(i < skip) continue;

        if(i >= skip + take) break;

        sum += x.Value4;

    }  

    return sum;

}

最后的Benchmark结果如下所示。

从以上Benchmark结果可以得出以下结论：

类和结构体都可以使用StructLinq来减少内存分配。
类和结构体使用StructLinq都会导致代码跑的更慢。
结构体类型使用StructLinq的引用传递模式可以获得5倍的性能提升，比引用类型更快。
无论是LINQ还是StructLinq由于本身的复杂性，性能都没有For循环来得快。

总结

在已经用上结构体的高性能场景，其实不建议使用LINQ了，因为LINQ本身它性能就存在瓶颈，它主要就是为了提升开发效率。建议直接使用普通循环。

如果一定要使用，那么建议大于8byte的结构体使用StructLinq的引用传递模式(ToRefStructEnumerable)，这样可以把普通LINQ结构体的性能提升5倍以上，也能几乎不分配额外的空间。