Application-Layer Protocol Series (2): HTTP Server, Parsing the HTTP Protocol

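The previous article, "An nginx-Style HTTP Server: Design and Implementation (1): Multi-Process and Multiplexed I/O", implemented a high-concurrency server modelled on nginx. That server only listened on a port and received data; it did not parse the HTTP protocol. This article explains how to parse it.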

If we visit the HTTP server built earlier with a browser, the terminal prints output like the following:

GET / HTTP/1.1
Host: 127.0.0.1:8080
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: zh-CN,zh;q=0.8

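According to material available online, an HTTP request consists of three parts: the request line, a number of header fields, and the request body. The request line contains the HTTP method, the requested path and the HTTP version. The header fields describe the request in more detail; each one is written as field name, colon, space, field value.

The request body carries the data that the client sends to the server.

The request line and the header fields follow one another directly, each line ending in \r\n. The header fields and the request body are separated by one empty line, so the header section ends with the sequence \r\n\r\n.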
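The three-part layout described above can be sketched as a small standalone function. This is only an illustration under assumptions: `ParsedRequest` and `parseRequest` are hypothetical names that do not appear in the server's code, and the sketch expects the whole request to be available in one string.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical container for the three parts of a request.
struct ParsedRequest {
    std::string method, path, version;           // request line
    std::map<std::string, std::string> headers;  // header fields
    std::string body;                            // may be empty
};

// Splits a raw request into request line, header fields and body.
// Returns false if the empty line that ends the header section is
// missing or the request line does not have three parts.
bool parseRequest(const std::string &raw, ParsedRequest &out) {
    const std::string::size_type headEnd = raw.find("\r\n\r\n");
    if (headEnd == std::string::npos) return false;
    out.body = raw.substr(headEnd + 4);

    std::istringstream head(raw.substr(0, headEnd));
    std::string line;
    if (!std::getline(head, line)) return false;
    std::istringstream requestLine(line);
    // operator>> stops at whitespace, which includes the trailing '\r'.
    if (!(requestLine >> out.method >> out.path >> out.version)) return false;

    while (std::getline(head, line)) {
        // getline splits on '\n', so strip the '\r' that is left behind.
        if (!line.empty() && line[line.size() - 1] == '\r') {
            line.erase(line.size() - 1);
        }
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;  // not a header field
        const std::string::size_type valueStart =
            line.find_first_not_of(" \t", colon + 1);
        out.headers[line.substr(0, colon)] =
            valueStart == std::string::npos ? std::string()
                                            : line.substr(valueStart);
    }
    return true;
}
```

Splitting on the first colon only matters for values such as `Host: 127.0.0.1:8080`, which contain a colon themselves.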

With this in mind, we can start parsing the received HTTP request. This article uses two classes, CHttpRequest and CHttpResponse, to do so.

First, modify the handleRequest method from the previous article:

// Handle one HTTP request on an accepted connection.
bool handleRequest(int connFd) {
    if (connFd<=0) return false;
    // Receive buffer; one byte is reserved for the terminating '\0'.
    char buff[4096];
    // Read the HTTP header.
    int len = (int)recv(connFd, buff, sizeof(buff)-1, 0);
    if (len<=0) {
        return false;
    }
    buff[len] = '\0';
    std::cout<<buff<<std::endl;
    // Stack objects, so nothing is leaked when the function returns.
    CHttpRequest httpRequest;
    httpRequest.handleRequest(buff);
    CHttpResponse httpResponse(&httpRequest);
    bool result = httpResponse.response(connFd);
    // Return true only if the client asked for keep-alive and the response was sent.
    std::string transformConnection(httpRequest.connection);
    std::transform(transformConnection.begin(), transformConnection.end(), transformConnection.begin(), ::tolower);
    // The value was lowercased above, so compare against a lowercase literal.
    return transformConnection == "keep-alive" && result;
}

This code receives the HTTP header into a 4096-byte buffer and, once the data has arrived, passes it to CHttpRequest for parsing.

Here is the code of CHttpRequest:

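#include "CHttpRequest.h"
#include "define.h"
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <sstream>

using namespace std;

CHttpRequest::CHttpRequest() {
    connection = "Close";
    modifiedTime = "";
    fileStart = 0;
    fileEnd = 0;
    // Map each supported field name to the member function that parses it.
    fieldMap[TS_HTTP_HEADER_CONNECTION] = &CHttpRequest::handleConnection;
    fieldMap[TS_HTTP_HEADER_AUTHORIZATION] = &CHttpRequest::handleAuthorization;
    fieldMap[TS_HTTP_HEADER_RANGE] = &CHttpRequest::handleRange;
    fieldMap[TS_HTTP_HEADER_IF_MOD_SINCE] = &CHttpRequest::handleIfModSince;
}

void CHttpRequest::handleRequest(char *header) {
    stringstream stream;
    stream<<header;
    int count = 0;
    string line;
    // std::getline has no fixed line-length limit. A char[1024] buffer would set
    // failbit on a longer line and the original loop would then never terminate.
    while (getline(stream, line)) {
        // Remove the trailing \r, but only if there is one.
        if (!line.empty() && line[line.length()-1] == '\r') {
            line.erase(line.length()-1);
        }
        if (line.empty()) {
            // Skip empty lines before the request line; an empty line
            // after it ends the header section.
            if (count == 0) continue;
            break;
        }
        stringstream lineStream;
        lineStream<<line;
        // First line: method, path and version.
        if (count == 0) {
            lineStream>>method;
            lineStream>>path;
            lineStream>>version;
        }else {
            // First token of the line; it still includes the colon.
            string fieldName;
            lineStream>>fieldName;
            // count() avoids inserting an entry for every unknown field name.
            if (fieldMap.count(fieldName)) {
                void(CHttpRequest::*func)(char*) = fieldMap[fieldName];
                // Skip the field name and the space that follows it.
                string::size_type offset = fieldName.length()+1;
                string value = line.length()>offset ? line.substr(offset) : string();
                // Handlers such as handleRange modify the buffer (strtok),
                // so pass a mutable copy.
                (this->*func)(&value[0]);
            }
        }
        count++;
    }
}

void CHttpRequest::handleConnection(char *field) {
    if (ENABLE_KEEP_ALIVE) {
        connection = string(field);
    }
}

void CHttpRequest::handleAuthorization(char *field) {
    char authName[10], authInfo[256];
    // Field widths keep sscanf from writing past the end of the buffers.
    if (sscanf(field, "%9s %255s", authName, authInfo) == 2) {
        authorize = string(authInfo);
    }
}

void CHttpRequest::handleRange(char *field) {
    // Only the "bytes=start-end" and "bytes=start-" forms are handled.
    // A suffix range such as "bytes=-500" is not: strtok skips the leading
    // '-' and the value is read as a start offset.
    if (strstr(field, "bytes=")==field) {
        char *start = strtok(field+strlen("bytes="), "-");
        fileStart = start==NULL?0:atol(start);
        char *end = strtok(NULL, "-");
        fileEnd = end==NULL?0:atol(end);
    }
}

void CHttpRequest::handleIfModSince(char *field) {
    modifiedTime = string(field);
}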

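To keep header parsing efficient, this article follows an approach similar to nginx and stores each field name together with its parsing function in a map (nginx uses a hash table; it is simplified to a map here).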
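The dispatch idea can be shown in isolation. The following is a minimal sketch, assuming a hypothetical class `HeaderDispatcher` with two handlers; it is not the article's CHttpRequest, and it uses field names without the trailing colon for simplicity.

```cpp
#include <map>
#include <string>

// Hypothetical class: maps header field names to member functions.
class HeaderDispatcher {
    typedef void (HeaderDispatcher::*Handler)(const std::string &);
    std::map<std::string, Handler> handlers;

    void onConnection(const std::string &value) {
        keepAlive = (value == "keep-alive");
    }
    void onRange(const std::string &value) { range = value; }

public:
    bool keepAlive;
    std::string range;

    HeaderDispatcher() : keepAlive(false) {
        handlers["Connection"] = &HeaderDispatcher::onConnection;
        handlers["Range"] = &HeaderDispatcher::onRange;
    }

    // Looks up the handler for a field and calls it.
    // Returns false when no handler is registered for that name.
    bool dispatch(const std::string &name, const std::string &value) {
        std::map<std::string, Handler>::const_iterator it = handlers.find(name);
        if (it == handlers.end()) return false;
        (this->*(it->second))(value);
        return true;
    }
};
```

Supporting a new header then only takes one handler function and one line in the constructor, and unknown headers are ignored without a chain of string comparisons. std::map lookups are O(log n); std::unordered_map would be closer to the hash table that nginx is said to use.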

Once parsing is complete, CHttpResponse is called to build the response. The CHttpResponse code is as follows:

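#include "CHttpResponse.h"
#include "CHttpRequest.h"
#include "define.h"
#include <sys/socket.h>
#include <string.h>
#include <ctime>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

#define HTTP_RESPONSE_404 "<html><head><title>404 Not Found</title></head><body><h1>Not Found</h1><p>The requested URL was not found on this server.</p></body></html>"

// Format a time_t as an HTTP date, e.g. "Sun, 06 Nov 1994 08:49:37 GMT".
std::string getStringFromTime(time_t time) {
    char timeBuff[64];
    struct tm tm = *gmtime(&time);
    // HTTP dates always end in the literal "GMT"; the output of %Z is platform-dependent.
    strftime(timeBuff, sizeof timeBuff, "%a, %d %b %Y %H:%M:%S GMT", &tm);
    return std::string(timeBuff);
}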

CHttpResponse::CHttpResponse(CHttpRequest *request) {
    m_request = request;
    // Initialise the status for every method. Methods other than GET and HEAD
    // are not handled here and leave the status unset.
    m_statusCode = 0;
    if (m_request->method.compare(TS_HTTP_METHOD_GET_S)==0 || m_request->method.compare(TS_HTTP_METHOD_HEAD_S)==0) {
        std::string path = ROOT_PATH;
        if (m_request->path.compare("/")==0) {
            path += ROOT_HTML;
        }else {
            path += m_request->path;
        }
        // If the file exists
        if (isFileExist(path.c_str())) {
            // If the client sent If-Modified-Since, compare it with the file time
            if (!m_request->modifiedTime.empty()) {
                time_t time = fileModifiedTime(path.c_str());
                if (getStringFromTime(time) == m_request->modifiedTime) {
                    m_statusCode = TS_HTTP_STATUS_NOT_MODIFIED;
                    m_statusMsg = TS_HTTP_STATUS_NOT_MODIFIED_S;
                }
            }
            // No status chosen yet, so the file (or part of it) has to be sent
            if (m_statusCode == 0) {
                if (m_request->fileStart || m_request->fileEnd) {
                    long long fileSize = getFileSize(path.c_str());
                    // The range is satisfiable if both offsets lie inside the file
                    // and the start does not come after a requested end
                    if (m_request->fileStart<fileSize && m_request->fileEnd<fileSize
                        && (m_request->fileEnd==0 || m_request->fileStart<=m_request->fileEnd)) {
                        m_statusCode = TS_HTTP_STATUS_PARTIAL_CONTENT;
                        m_statusMsg = TS_HTTP_STATUS_PARTIAL_CONTENT_S;
                        m_sendFilePath = path;
                    }else {
                        m_statusCode = TS_HTTP_STATUS_REQUEST_RANGE_NOT_SATISFIABLE;
                        m_statusMsg = TS_HTTP_STATUS_REQUEST_RANGE_NOT_SATISFIABLE_S;
                    }
                }else {
                    m_statusCode = TS_HTTP_STATUS_OK;
                    m_statusMsg = TS_HTTP_STATUS_OK_S;
                    m_sendFilePath = path;
                }
            }
        } else {
            m_statusCode = TS_HTTP_STATUS_NOT_FOUND;
            m_statusMsg = TS_HTTP_STATUS_NOT_FOUND_S;
            m_sendStr = HTTP_RESPONSE_404;
        }
    }
}

bool CHttpResponse::response(int connFd) {
    bool result = true;
    std::stringstream responseStream;
    // Status line
    responseStream<<m_request->version<<" "<<m_statusMsg<<"\r\n";
    // Date
    responseStream<<"Date: "<<getStringFromTime(time(0))<<"\r\n";
    // Server name
    responseStream<<"Server: "<<SERVER_NAME<<"\r\n";
    // Keep-alive
    responseStream<<"Connection: "<<m_request->connection<<"\r\n";
    // Content length
    long long contentLength = 0;
    long long fileSize = 0;
    long long rangeEnd = 0;   // offset of the last byte to send (inclusive)
    if (!m_sendFilePath.empty()) {
        fileSize = getFileSize(m_sendFilePath.c_str());
        // fileEnd == 0 means no end offset was requested: send up to the last byte
        rangeEnd = m_request->fileEnd ? m_request->fileEnd : fileSize - 1;
        contentLength = rangeEnd - m_request->fileStart + 1;
    } else if (!m_sendStr.empty()) {
        contentLength = m_sendStr.length();
    }
    if (contentLength) {
        responseStream<<"Content-Length: "<<contentLength<<"\r\n";
    }
    // Last modified
    if (!m_sendFilePath.empty()) {
        responseStream<<"Last-Modified: "<<getStringFromTime(fileModifiedTime(m_sendFilePath.c_str()))<<"\r\n";
        responseStream<<"Accept-Ranges: "<<"bytes"<<"\r\n";
    }
    // Content type, looked up by file extension
    if (!m_sendFilePath.empty()) {
        // The extension is whatever follows the last '.'; std::string avoids
        // copying the path into a fixed-size buffer.
        std::string::size_type dot = m_sendFilePath.rfind('.');
        std::string ext = dot==std::string::npos ? std::string() : m_sendFilePath.substr(dot+1);
        for (int i=0; i<38; i++) {
            if (strcmp(mmt[i].ext, ext.c_str())==0) {
                responseStream<<"Content-Type: "<<mmt[i].type<<"\r\n";
                break;
            }
        }
    }
    // Status-specific fields
    switch (m_statusCode) {
        case TS_HTTP_STATUS_UNAUTHORIZED:
            responseStream<<"WWW-Authenticate: Basic realm=\"zhaoxy.com\"\r\n";
            break;
        case TS_HTTP_STATUS_FOUND:
            responseStream<<"Location: /index.html\r\n";
            break;
        case TS_HTTP_STATUS_PARTIAL_CONTENT:
            // Format: bytes first-last/total, with inclusive offsets
            responseStream<<"Content-Range: "<<"bytes "<<m_request->fileStart<<"-"<<rangeEnd<<"/"<<fileSize<<"\r\n";
            break;
        default:
            break;
    }
    // Empty line that separates the header from the body
    responseStream<<"\r\n";
    // Send the response header
    std::string responseStr = responseStream.str();
    std::cout<<responseStr<<std::endl;
    send(connFd, responseStr.c_str(), responseStr.length(), 0);
    // Send the content, unless this is a HEAD request
    if (m_request->method.compare(TS_HTTP_METHOD_HEAD_S)!=0) {
        if (!m_sendFilePath.empty()) {
            // Binary mode, so that the file bytes are sent unchanged
            std::ifstream file(m_sendFilePath.c_str(), std::ios::in | std::ios::binary);
            file.seekg(m_request->fileStart, std::ifstream::beg);
            char buf[1024];
            long long remaining = contentLength;
            while (remaining > 0 && file) {
                std::streamsize want = remaining < (long long)sizeof(buf) ? (std::streamsize)remaining : (std::streamsize)sizeof(buf);
                file.read(buf, want);
                // Send only the bytes that were actually read
                std::streamsize got = file.gcount();
                if (got <= 0) break;
                // send() may accept fewer bytes than requested, so loop until the chunk is out
                std::streamsize sent = 0;
                while (sent < got) {
                    int n = (int)send(connFd, buf+sent, (size_t)(got-sent), 0);
                    if (n <= 0) {
                        std::cout<<"ERROR writing to socket"<<std::endl;
                        result = false;
                        break;
                    }
                    sent += n;
                }
                if (!result) break;
                remaining -= got;
            }
            file.close();
        }else {
            send(connFd, m_sendStr.c_str(), m_sendStr.length(), 0);
        }
    }
    return result;
}

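This code supports resumable downloads (Range requests) as well as the Last-Modified and Authorization fields. The logic is not explained in detail here; if you have questions, feel free to leave a comment.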
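As an illustration of the arithmetic behind resumable downloads, here is a minimal sketch of resolving a single byte range against a file size. `ByteRange`, `isDigits` and `resolveRange` are hypothetical names and not part of the server's code; unlike the parser above, the sketch also accepts suffix ranges such as `bytes=-200`.

```cpp
#include <cstdlib>
#include <string>

struct ByteRange {
    long long start;  // offset of the first byte, inclusive
    long long end;    // offset of the last byte, inclusive
};

// True if s is non-empty and consists of decimal digits only.
static bool isDigits(const std::string &s) {
    if (s.empty()) return false;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] < '0' || s[i] > '9') return false;
    }
    return true;
}

// Resolves a single "bytes=..." range against a file of fileSize bytes.
// Returns false for unsupported syntax or an unsatisfiable range.
bool resolveRange(const std::string &value, long long fileSize, ByteRange &out) {
    const std::string prefix = "bytes=";
    if (fileSize <= 0 || value.compare(0, prefix.size(), prefix) != 0) return false;
    const std::string spec = value.substr(prefix.size());
    const std::string::size_type dash = spec.find('-');
    if (dash == std::string::npos) return false;
    const std::string first = spec.substr(0, dash);
    const std::string last = spec.substr(dash + 1);

    if (first.empty()) {  // "bytes=-N": the final N bytes of the file
        if (!isDigits(last)) return false;
        const long long n = std::atoll(last.c_str());
        if (n <= 0) return false;
        out.start = n >= fileSize ? 0 : fileSize - n;
        out.end = fileSize - 1;
        return true;
    }
    if (!isDigits(first) || (!last.empty() && !isDigits(last))) return false;
    out.start = std::atoll(first.c_str());
    // "bytes=N-" runs to the end of the file; an end past the file is clamped.
    out.end = last.empty() ? fileSize - 1 : std::atoll(last.c_str());
    if (out.end >= fileSize) out.end = fileSize - 1;
    return out.start <= out.end;
}
```

With a resolved range, Content-Length is `end - start + 1` and the Content-Range value is `bytes start-end/fileSize`.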

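The code of this HTTP server has been uploaded to GitHub and can be downloaded directly.

If you found this article helpful, an upvote would be appreciated. Thanks :)

Personal blog: http://blog.csdn.net/zhaoxy2850

Article URL: http://blog.csdn.net/zhaoxy_thu/article/details/24716221

Please credit the source when reposting. Thanks!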
