http://blog.dubbelboer.com/

Date: 09 Apr 2012
Author: Erik Dubbelboer

SYN cookies

So one day I noticed /var/log/syslog on one of our servers was filled with the following message:

TCP: Possible SYN flooding on port 80. Sending cookies.

This message can come from a SYN DDoS, but in our case it was caused by the number of new connections one of our applications was receiving. The syslog message is emitted when the SYN backlog of a socket is full.

The kernel documentation has the following to say about SYN cookies:

Note, that syncookies is fallback facility.
It MUST NOT be used to help highly loaded servers to stand
against legal connection rate. If you see SYN flood warnings
in your logs, but investigation shows that they occur
because of overload with legal connections, you should tune
another parameters until this warning disappear.
See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.

syncookies seriously violate TCP protocol, do not allow
to use TCP extensions, can result in serious degradation
of some services (f.e. SMTP relaying), visible not by you,
but your clients and relays, contacting you. While you see
SYN flood warnings in logs not being really flooded, your server
is seriously misconfigured.

To fix this problem I started by increasing the net.ipv4.tcp_max_syn_backlog kernel parameter. On our Ubuntu system the default was 2048 so I changed it to 4096 and restarted our application.

Nothing changed and the flooding messages kept being emitted. I tried tuning some more parameters like tcp_synack_retries and netdev_max_backlog, but nothing helped. Finally a friend pointed out that I could look at the actual kernel source to find out why the message was still being emitted.

Going back to the source

The obvious place to start this search was the function that actually emitted the message.

I found the function in net/ipv4/tcp_ipv4.c:

static void syn_flood_warning(const struct sk_buff *skb)
{
    const char *msg;

#ifdef CONFIG_SYN_COOKIES
    if (sysctl_tcp_syncookies)
        msg = "Sending cookies";
    else
#endif
        msg = "Dropping request";

    pr_info("TCP: Possible SYN flooding on port %d. %s.\n",
            ntohs(tcp_hdr(skb)->dest), msg);
}

This function was only being called from one point, another function in net/ipv4/tcp_ipv4.c:

int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
{
    // ...

    if (inet_csk_reqsk_queue_is_full(sk) && !isn) {
        if (net_ratelimit())
            syn_flood_warning(skb);

        // ...
    }

    // ...
}

This function is called every time a new connection is set up. The net_ratelimit() is only there to prevent the syslog from being flooded with messages. So I followed the inet_csk_reqsk_queue_is_full() function to see what caused it to return true.

The function is defined in include/net/inet_connection_sock.h:

static inline int inet_csk_reqsk_queue_is_full(const struct sock *sk)
{
    return reqsk_queue_is_full(&inet_csk(sk)->icsk_accept_queue);
}

It simply calls another function in include/net/request_sock.h:

static inline int reqsk_queue_is_full(const struct request_sock_queue *queue)
{
    return queue->listen_opt->qlen >> queue->listen_opt->max_qlen_log;
}

So this is the actual check to see if the backlog is full. At first glance the function looks a bit strange. The queue length is bit shifted by a max length variable. qlen is an integer, so shifting it to the right will decrease its value. max_qlen_log is the base 2 logarithm of the max queue length (which can only be a power of 2). So when qlen is smaller than the maximum queue length, all 1 bits will be shifted out and the return value will be 0. When qlen is equal to or larger than the maximum, it will still have bits set to 1 and the function will return a positive number.
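To make the shift trick concrete, here is a small user-space sketch of the same check. The function name queue_is_full and the demo helper are made up for illustration; only the expression qlen >> max_qlen_log is taken from the kernel code above.

```c
#include <stdio.h>

/* User-space imitation of the check in reqsk_queue_is_full().
 * qlen is the number of pending requests, max_qlen_log is log2 of the
 * queue capacity. Returns 0 while qlen < (1 << max_qlen_log) and a
 * positive number once the queue is full. */
int queue_is_full(int qlen, unsigned int max_qlen_log)
{
    return qlen >> max_qlen_log;
}

/* Prints the result of the check for a few queue lengths, assuming a
 * capacity of 1 << 8 = 256 entries. */
void demo_queue_is_full(void)
{
    const unsigned int max_qlen_log = 8;
    const int samples[] = { 0, 100, 255, 256, 300, 512 };

    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        printf("qlen=%3d -> %d\n", samples[i],
               queue_is_full(samples[i], max_qlen_log));
}
```

With a capacity of 256, every qlen up to 255 gives 0, while 256 and 300 give 1 and 512 gives 2. The caller only cares whether the result is non-zero.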

So where is max_qlen_log set?

Size of the backlog

Searching for an assignment of max_qlen_log I found only one place.

net/core/request_sock.c:

int reqsk_queue_alloc(struct request_sock_queue *queue,
                      unsigned int nr_table_entries)
{
    // ...

    nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog);
    nr_table_entries = max_t(u32, nr_table_entries, 8);
    nr_table_entries = roundup_pow_of_two(nr_table_entries + 1);

    // ...

    for (lopt->max_qlen_log = 3;
         (1 << lopt->max_qlen_log) < nr_table_entries;
         lopt->max_qlen_log++);

    // ...
}

The reqsk_queue_alloc() function is called each time a new socket starts listening for connections. As you can see from the code, max_qlen_log will depend on the nr_table_entries argument.

First nr_table_entries is bound to the [8, sysctl_max_syn_backlog] range. This is the first sign of a kernel parameter that I tried to tune, namely net.ipv4.tcp_max_syn_backlog. Apparently it has some effect on the backlog size, but it only specifies a maximum, not the actual value like many resources would have you believe.

Then nr_table_entries is incremented by 1 (which still seems strange to me, see below) and rounded up to the next power of 2. The for loop then sets max_qlen_log to the base 2 logarithm of nr_table_entries.
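The sizing steps can be replayed in user space to see which table size a given argument produces. This is only a sketch: roundup_pow2, struct syn_table and size_syn_table are made-up names standing in for the kernel's roundup_pow_of_two() and the relevant part of reqsk_queue_alloc(), and the backlog parameter is the value that arrives at reqsk_queue_alloc(), not necessarily what the application asked for.

```c
#include <stdint.h>

/* Round v up to the next power of two (v itself if it already is one).
 * Stand-in for the kernel's roundup_pow_of_two(); assumes 1 <= v <= 2^31. */
uint32_t roundup_pow2(uint32_t v)
{
    uint32_t p = 1;

    while (p < v)
        p <<= 1;
    return p;
}

/* Result of the sizing: number of table entries and its base 2 logarithm. */
struct syn_table {
    uint32_t entries;
    unsigned int max_qlen_log;
};

/* Replays the steps from reqsk_queue_alloc(): clamp to
 * [8, max_syn_backlog], add 1, round up to a power of two and
 * derive max_qlen_log from the result. */
struct syn_table size_syn_table(uint32_t backlog, uint32_t max_syn_backlog)
{
    struct syn_table t;
    uint32_t n = backlog;

    if (n > max_syn_backlog)
        n = max_syn_backlog;
    if (n < 8)
        n = 8;
    n = roundup_pow2(n + 1);

    t.entries = n;
    for (t.max_qlen_log = 3; (1u << t.max_qlen_log) < n; t.max_qlen_log++)
        ;
    return t;
}
```

For example, size_syn_table(128, 2048) yields 256 entries with max_qlen_log 8, and size_syn_table(1000000, 4096) yields 8192 entries with max_qlen_log 13, because the argument is first cut down to 4096.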

So from this function I found the maximum size of the backlog is bound by net.ipv4.tcp_max_syn_backlog but the actual size is determined by the nr_table_entries argument. Time to find out where reqsk_queue_alloc() is called.

The only place it is called from is in net/ipv4/inet_connection_sock.c:

int inet_csk_listen_start(struct sock *sk, const int nr_table_entries)
{
    // ...

    reqsk_queue_alloc(&icsk->icsk_accept_queue, nr_table_entries);

    // ...
}

The inet_csk_listen_start() function does nothing with the nr_table_entries argument and is itself called in net/ipv4/af_inet.c:

int inet_listen(struct socket *sock, int backlog)
{
    // ...

    err = inet_csk_listen_start(sk, backlog);

    // ...
}

This function doesn’t change the backlog size either. It is not called directly in the source but is assigned to a function pointer inside the inet_stream_ops struct in net/ipv4/af_inet.c.

The pointer in the struct is called in net/socket.c:

SYSCALL_DEFINE2(listen, int, fd, int, backlog)
{
    // ...
    int somaxconn;

    // ...

    somaxconn = sock_net(sock->sk)->core.sysctl_somaxconn;
    if ((unsigned)backlog > somaxconn)
        backlog = somaxconn;

    // ...

    sock->ops->listen(sock, backlog);

    // ...
}

Now this is the actual listen() syscall which has a backlog argument. This function also seems to put an upper limit on the backlog size. sock_net(sock->sk)->core.sysctl_somaxconn is another kernel parameter controlled by net.core.somaxconn. It defaults to the SOMAXCONN macro which equals 128 on our system.
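The clamp in the syscall is small enough to try out on its own. The sketch below copies the comparison into a user-space function; clamp_backlog is a made-up name, and somaxconn is passed in as a plain argument instead of being read from the network namespace.

```c
/* Mirrors the clamp in the listen() syscall shown above. Because of the
 * cast to unsigned, a negative backlog compares as a huge value and is
 * clamped to somaxconn as well. */
int clamp_backlog(int backlog, int somaxconn)
{
    if ((unsigned)backlog > (unsigned)somaxconn)
        backlog = somaxconn;
    return backlog;
}
```

So clamp_backlog(1000000, 128) gives 128, while clamp_backlog(128, 4096) stays at 128: raising net.core.somaxconn alone changes nothing when the application itself asks for a small backlog.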

128 is quite low so I increased it to 4096 as well. I restarted our application, but to my surprise the flooding message still kept being emitted.

Luckily the application we are using is open source. So I opened up the source and found the application calling listen() with a backlog of, again, SOMAXCONN. After changing this to 1000000 (why not set it very high and let the kernel parameters limit it?), recompiling and restarting the application, the message finally stopped.
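For reference, this is roughly what such a listen() call looks like in a small C program. It is a minimal sketch and not the code of the application mentioned above: open_listener is a hypothetical helper, and binding to the loopback address is only an assumption to keep the example harmless.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Opens a TCP listener on 127.0.0.1:port and asks for the given backlog.
 * Port 0 lets the kernel pick a free port. Returns the listening fd, or
 * -1 on error. The kernel silently limits the backlog to its own
 * maximums, so a very large value is accepted. */
int open_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling open_listener(0, 1000000) instead of open_listener(0, SOMAXCONN) leaves the final size to net.core.somaxconn and net.ipv4.tcp_max_syn_backlog.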

Note: kernel 3.3 does exactly the same as 2.6 (on which this post is based)

A reasonable backlog size

In the reqsk_queue_alloc() function you can see that an array of request_sock * pointers with nr_table_entries elements is allocated. On a 64 bit system the size of a request_sock is 56 bytes. Together with the 8 bytes for the pointer that makes around 64 bytes per entry, so 4096 entries would only take up 0.25 MB. 4096 should be enough for most servers, but you can see that increasing it even more wouldn’t be a problem.

Keep in mind that setting the backlog to 4096 will actually make it 8192 entries big. This is because of the strange + 1 in nr_table_entries = roundup_pow_of_two(nr_table_entries + 1);. This is the reason that software like nginx, redis and apache all set the backlog to 511.

Conclusion

The main thing I learned from this all is that using open source software allows you to track and fix problems that closed source software would not.

Also, fixing the SYN flooding problem requires you to modify net.ipv4.tcp_max_syn_backlog, net.core.somaxconn and the backlog size passed to the listen() syscall.
