victoriaMetrics中的一些Sao操作

快速获取当前时间

victoriaMetrics中有一个fasttime库，用于快速获取当前的Unix时间，实现其实挺简单，就是在后台使用一个goroutine不断以1s为周期刷新表示当前时间的变量currentTimestamp，获取的时候直接原子加载该变量即可。其性能约是time.Now()的8倍。

其核心方式就是将主要任务放到后台运行，通过一个中间变量来传递运算结果，以此来通过异步的方式提升性能，但需要业务能包容一定的精度偏差。

func init() {

	go func() {

		ticker := time.NewTicker(time.Second)

		defer ticker.Stop()

		for tm := range ticker.C {

			t := uint64(tm.Unix())

			atomic.StoreUint64(&currentTimestamp, t)

		}

	}()

}

var currentTimestamp = uint64(time.Now().Unix())

// UnixTimestamp returns the current unix timestamp in seconds.

//

// It is faster than time.Now().Unix()

func UnixTimestamp() uint64 {

	return atomic.LoadUint64(&currentTimestamp)

}

计算结构体的哈希值

hashUint64函数中使用xxhash.Sum64计算了结构体Key的哈希值。通过unsafe.Pointer将指针转换为*[]byte类型，byte数组的长度为unsafe.Sizeof(*k)，unsafe.Sizeof()返回结构体的字节大小。

如果一个数据为固定的长度，如h的类型为uint64，则可以直接指定长度为8进行转换，如：bp:=([8]byte)(unsafe.Pointer(&h))

需要注意的是unsafe.Sizeof()返回的是数据结构的大小而不是其指向内容的数据大小，如下返回的slice大小为24，为slice首部数据结构SliceHeader的大小，而不是其引用的数据大小(可以使用len获取slice引用的数据大小)。此外如果结构体中有指针，则转换成的byte中存储的也是指针存储的地址。
slice := []int{1,2,3,4,5,6,7,8,9,10}

fmt.Println(unsafe.Sizeof(slice)) //24

type Key struct {

	Part interface{}

	Offset uint64

}

func (k *Key) hashUint64() uint64 {

	buf := (*[unsafe.Sizeof(*k)]byte)(unsafe.Pointer(k))

	return xxhash.Sum64(buf[:])

}

将字符串添加到已有的[]byte中

使用如下方式即可：

str := "1231445"

arr := []byte{1, 2, 3}

arr = append(arr, str...)

将int64的数组转换为byte数组

直接操作了底层的SliceHeader

func int64ToByteSlice(a []int64) (b []byte) {

   sh := (*reflect.SliceHeader)(unsafe.Pointer(&b))

   sh.Data = uintptr(unsafe.Pointer(&a[0]))

   sh.Len = len(a) * int(unsafe.Sizeof(a[0]))

   sh.Cap = sh.Len

   return

}

并发访问的sync.WaitGroup

并发访问的sync.WaitGroup的目的是为了在运行时添加需要等待的goroutine

// WaitGroup wraps sync.WaitGroup and makes safe to call Add/Wait

// from concurrent goroutines.

//

// An additional limitation is that call to Wait prohibits further calls to Add

// until return.

type WaitGroup struct {

	sync.WaitGroup

	mu sync.Mutex

}

// Add registers n additional workers. Add may be called from concurrent goroutines.

func (wg *WaitGroup) Add(n int) {

	wg.mu.Lock()

	wg.WaitGroup.Add(n)

	wg.mu.Unlock()

}

// Wait waits until all the goroutines call Done.

//

// Wait may be called from concurrent goroutines.

//

// Further calls to Add are blocked until return from Wait.

func (wg *WaitGroup) Wait() {

	wg.mu.Lock()

	wg.WaitGroup.Wait()

	wg.mu.Unlock()

}

// WaitAndBlock waits until all the goroutines call Done and then prevents

// from new goroutines calling Add.

//

// Further calls to Add are always blocked. This is useful for graceful shutdown

// when other goroutines calling Add must be stopped.

//

// wg cannot be used after this call.

func (wg *WaitGroup) WaitAndBlock() {

	wg.mu.Lock()

	wg.WaitGroup.Wait()

	// Do not unlock wg.mu, so other goroutines calling Add are blocked.

}

// There is no need in wrapping WaitGroup.Done, since it is already goroutine-safe.

时间池

高频次创建timer会消耗一定的性能，为了减少某些情况下的性能损耗，可以使用sync.Pool来回收利用创建的timer

// Get returns a timer for the given duration d from the pool.

//

// Return back the timer to the pool with Put.

func Get(d time.Duration) *time.Timer {

	if v := timerPool.Get(); v != nil {

		t := v.(*time.Timer)

		if t.Reset(d) {

			logger.Panicf("BUG: active timer trapped to the pool!")

		}

		return t

	}

	return time.NewTimer(d)

}

// Put returns t to the pool.

//

// t cannot be accessed after returning to the pool.

func Put(t *time.Timer) {

	if !t.Stop() {

		// Drain t.C if it wasn't obtained by the caller yet.

		select {

		case <-t.C:

		default:

		}

	}

	timerPool.Put(t)

}

var timerPool sync.Pool

访问限速

victoriaMetrics的vminsert作为vmagent和vmstorage之间的组件，接收vmagent的流量并将其转发到vmstorage。在vmstorage卡死、处理过慢或下线的情况下，有可能会导致无法转发流量，进而造成vminsert CPU和内存飙升，造成组件故障。为了防止这种情况，vminsert使用了限速器，当接收到的流量激增时，可以在牺牲一部分数据的情况下保证系统的稳定性。

victoriaMetrics的源码中对限速器有如下描述：

Limit the number of conurrent f calls in order to prevent from excess memory usage and CPU thrashing

限速器使用了两个参数：maxConcurrentInserts和maxQueueDuration，前者给出了突发情况下可以处理的最大请求数，后者给出了某个请求的最大超时时间。需要注意的是Do(f func() error)是异步执行的，而ch又是全局的，因此会异步等待其他请求释放资源(struct{})。

可以看到限速器使用了指标来指示当前的限速状态。同时使用cgroup.AvailableCPUs()*4 (即runtime.GOMAXPROCS(-1)*4)来设置默认的maxConcurrentInserts长度。

当该限速器用在处理如http请求时，该限速器并不能限制底层上送的请求，其限制的是对请求的处理。在高流量业务处理中，这也是最消耗内存的地方，通常包含数据读取、内存申请拷贝等。底层的数据受/proc/sys/net/core/somaxconn和socket缓存区的限制。

var (

	maxConcurrentInserts = flag.Int("maxConcurrentInserts", cgroup.AvailableCPUs()*4, "The maximum number of concurrent inserts. Default value should work for most cases, "+

		"since it minimizes the overhead for concurrent inserts. This option is tigthly coupled with -insert.maxQueueDuration")

	maxQueueDuration = flag.Duration("insert.maxQueueDuration", time.Minute, "The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts")

)

// ch is the channel for limiting concurrent calls to Do.

var ch chan struct{}

// Init initializes concurrencylimiter.

//

// Init must be called after flag.Parse call.

func Init() {

	ch = make(chan struct{}, *maxConcurrentInserts) //初始化limiter，最大突发并行请求量为maxConcurrentInserts

}

// Do calls f with the limited concurrency.

func Do(f func() error) error {

	// Limit the number of conurrent f calls in order to prevent from excess

	// memory usage and CPU thrashing.

	select {

	case ch <- struct{}{}: //在channel中添加一个元素，表示开始处理一个请求

		err := f() //阻塞等大请求处理结束

		<-ch //请求处理完之后释放channel中的一个元素，释放出的空间可以用于处理下一个请求

		return err

	default:

	}

    //如果当前达到处理上限maxConcurrentInserts，则需要等到其他Do(f func() error)释放资源。

	// All the workers are busy.

	// Sleep for up to *maxQueueDuration.

	concurrencyLimitReached.Inc()

	t := timerpool.Get(*maxQueueDuration) //获取一个timer，设置等待超时时间为 maxQueueDuration

	select {

	case ch <- struct{}{}: //在maxQueueDuration时间内等待其他请求释放资源，如果获取到资源，则回收timer，继续处理

		timerpool.Put(t)

		err := f()

		<-

		return err

	case <-t.C: //在maxQueueDuration时间内没有获取到资源，定时器超时后回收timer，丢弃请求并返回错误信息

		timerpool.Put(t)

		concurrencyLimitTimeout.Inc()

		return &httpserver.ErrorWithStatusCode{

			Err: fmt.Errorf("cannot handle more than %d concurrent inserts during %s; possible solutions: "+

				"increase `-insert.maxQueueDuration`, increase `-maxConcurrentInserts`, increase server capacity", *maxConcurrentInserts, *maxQueueDuration),

			StatusCode: http.StatusServiceUnavailable,

		}

	}

}

var (

	concurrencyLimitReached = metrics.NewCounter(`vm_concurrent_insert_limit_reached_total`)

	concurrencyLimitTimeout = metrics.NewCounter(`vm_concurrent_insert_limit_timeout_total`)

	_ = metrics.NewGauge(`vm_concurrent_insert_capacity`, func() float64 {

		return float64(cap(ch))

	})

	_ = metrics.NewGauge(`vm_concurrent_insert_current`, func() float64 {

		return float64(len(ch))

	})

)

优先级控制

victoriaMetrics的pacelimiter库实现了优先级控制。主要方法由Inc、Dec和WaitIfNeeded。低优先级任务需要调用WaitIfNeeded方法，如果此时有高优先级任务(调用Inc方法)，则低优先级任务需要等待高优先级任务结束(调用Dec方法)之后才能继续执行。

// PaceLimiter throttles WaitIfNeeded callers while the number of Inc calls is bigger than the number of Dec calls.

//

// It is expected that Inc is called before performing high-priority work,

// while Dec is called when the work is done.

// WaitIfNeeded must be called inside the work which must be throttled (i.e. lower-priority work).

// It may be called in the loop before performing a part of low-priority work.

type PaceLimiter struct {

	mu          sync.Mutex

	cond        *sync.Cond

	delaysTotal uint64

	n           int32

}

// New returns pace limiter that throttles WaitIfNeeded callers while the number of Inc calls is bigger than the number of Dec calls.

func New() *PaceLimiter {

	var pl PaceLimiter

	pl.cond = sync.NewCond(&pl.mu)

	return &pl

}

// Inc increments pl.

func (pl *PaceLimiter) Inc() {

	atomic.AddInt32(&pl.n, 1)

}

// Dec decrements pl.

func (pl *PaceLimiter) Dec() {

	if atomic.AddInt32(&pl.n, -1) == 0 {

		// Wake up all the goroutines blocked in WaitIfNeeded,

		// since the number of Dec calls equals the number of Inc calls.

		pl.cond.Broadcast()

	}

}

// WaitIfNeeded blocks while the number of Inc calls is bigger than the number of Dec calls.

func (pl *PaceLimiter) WaitIfNeeded() {

	if atomic.LoadInt32(&pl.n) <= 0 {

		// Fast path - there is no need in lock.

		return

	}

	// Slow path - wait until Dec is called.

	pl.mu.Lock()

	for atomic.LoadInt32(&pl.n) > 0 {

		pl.delaysTotal++

		pl.cond.Wait()

	}

	pl.mu.Unlock()

}

// DelaysTotal returns the number of delays inside WaitIfNeeded.

func (pl *PaceLimiter) DelaysTotal() uint64 {

	pl.mu.Lock()

	n := pl.delaysTotal

	pl.mu.Unlock()

	return n

}