ETag & If-None-Match: Topic Notes

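I. Overview
Put plainly, caching means keeping something you have already fetched somewhere as close to you as possible, so that the next time you need it you do not trek all the way back to the origin (somewhere far away) but pick it up nearby, saving both time and money (the bus fare adds up). Web caching works on the same principle: the first time you visit a URL, the representations involved (HTML pages, images, JavaScript files and so on) are stored somewhere close to you, so that the next time you need them there is no long journey back to the origin server. The advantages follow directly:

　　1. Lower network latency and faster page loads, which improves the user experience (the copy is fetched nearby, so the trip is shorter than going to a distant server);

　　2. Lower bandwidth consumption (again, the copy is fetched nearby);

　　3. Requests served from a cache never reach the origin server, which reduces the load on it.

So where does a web cache keep these representations? Let's look at the kinds of cache to find out.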

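II. Types of web cache

--Database cache--:

When a web application's data model is complex and its tables keep growing, query results can be cached in memory. The next identical query is answered straight from the in-memory cache, which speeds up the response.

--CDN cache--:

Roughly speaking, a CDN sits in front of the site: a web request passes through it first, and it works out which route to the representations is shortest and fastest. It is deployed by the site's administrators, who can also place frequently requested representations in the CDN so that responses come back even faster.

--Proxy server cache--:

A proxy server cache works much like the browser cache described next. The difference is scale: it serves a large number of users rather than one, so the same copy is reused many times, which makes it very effective at cutting response time and bandwidth use.

--Browser cache--:

In short, every browser implements an HTTP cache. When the browser talks to a server over HTTP, it caches responses according to a set of rules agreed with the server. This is especially useful when you press the Back or Forward button.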

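III. How web caching works

A mechanism is simply an agreement between two parties that spells out who does what and when. Web caching is no different: something has to tell a request when to use the cached copy and when to fetch the representation from the server. The rules come from two places: the HTTP protocol (HTTP/1.0 and HTTP/1.1) and conventions defined by the site's own administrators. Setting the site-specific conventions aside, let's look at the caching mechanisms defined by HTTP.

By the way, caching can also be controlled with a <meta> tag in the <head> of an HTML document, for example:

<meta http-equiv="Pragma" content="no-cache"/>

However, only some browsers honor it, and proxy servers ignore it (the tag lives inside the HTML itself, which proxies almost never parse).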

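--HTTP caching mechanisms--

HTTP caching is driven mainly by headers. The relevant response headers are Expires, Cache-Control, Last-Modified and ETag; the request headers If-Modified-Since and If-None-Match carry the matching validators back to the server.

1. Expires

Defined in HTTP/1.0. It tells the browser that until the given time it may take the resource (representation) straight from its cache without going to the server.

Note: Expires is an absolute time, expressed in Greenwich Mean Time (GMT) rather than local time, so it only works well when the client and server clocks agree.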
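As a rough illustration of the GMT requirement, here is a minimal Java sketch that formats an Expires value one hour after a given instant. The class and method names (ExpiresHeaderExample, expiresAfter) are made up for this example, and the timestamp is borrowed from the sample response shown later in this post.

```java
import java.time.Duration;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class ExpiresHeaderExample {

    // HTTP dates are written in a fixed-length form and always in GMT,
    // e.g. "Fri, 30 Oct 1998 14:19:41 GMT". Locale.US keeps the day and
    // month names in English regardless of the server's default locale.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);

    /** Formats an Expires value that lies ttl after the given instant. */
    static String expiresAfter(ZonedDateTime now, Duration ttl) {
        // Convert to UTC first so a local-time input cannot leak into the header.
        return HTTP_DATE.format(now.plus(ttl).withZoneSameInstant(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        ZonedDateTime now = ZonedDateTime.of(1998, 10, 30, 13, 19, 41, 0, ZoneOffset.UTC);
        System.out.println("Expires: " + expiresAfter(now, Duration.ofHours(1)));
    }
}
```

Because the value is an absolute time, a client whose clock runs fast or slow will expire the entry too early or too late, which is one reason max-age (a relative number of seconds) is usually preferred.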

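2. Cache-Control

Defined in HTTP/1.1. It is more specific and fine-grained than Expires, so where it is supported Expires can be left out.

If both are set, Cache-Control (its max-age directive) takes precedence over Expires.

The commonly used Cache-Control response directives mean the following:

　　(1) max-age: how long the resource (representation) may be cached, in seconds;

　　(2) s-maxage: same as max-age, but applies only to shared caches such as proxy servers;

　　(3) public: the response may be stored by any cache;

　　(4) private: the response is intended for a single user and must not be stored by shared caches such as proxy servers;

　　(5) no-cache: the cache must revalidate with the server before every reuse of the stored response. The server decides whether the resource has changed: if so it returns the new content, otherwise it returns 304 Not Modified. The name is easy to misread as "do not cache". In fact a response with Cache-Control: no-cache may be stored; the cache just has to check its validity with the server each time before handing it to the client (browser);

　　(6) no-store: no cache may store the response at all (this is the directive that really means "do not cache").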
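To make the directives above concrete, here is a small Java sketch of how a cache might read them. It is a simplification under stated assumptions, not a full implementation: the names (CacheControlExample, parse, freshnessSeconds) are invented for this example, quoted directive arguments are not handled, and a response without max-age is simply treated as "revalidate first" instead of falling back to Expires or heuristics.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class CacheControlExample {

    /** Splits a Cache-Control value into lower-cased directive -> argument ("" if none). */
    static Map<String, String> parse(String headerValue) {
        Map<String, String> directives = new LinkedHashMap<>();
        for (String part : headerValue.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            int eq = token.indexOf('=');
            String name = (eq < 0 ? token : token.substring(0, eq)).trim().toLowerCase(Locale.ROOT);
            String value = eq < 0 ? "" : token.substring(eq + 1).trim();
            directives.put(name, value);
        }
        return directives;
    }

    /**
     * Returns the number of seconds the response may be reused without revalidation,
     * 0 when it must be revalidated before each reuse, or -1 when this cache must not store it.
     */
    static long freshnessSeconds(String headerValue, boolean sharedCache) {
        Map<String, String> d = parse(headerValue);
        // no-store forbids storing anywhere; private forbids storing in shared caches only.
        if (d.containsKey("no-store") || (sharedCache && d.containsKey("private"))) {
            return -1;
        }
        // no-cache allows storing but requires revalidation before every reuse.
        if (d.containsKey("no-cache")) {
            return 0;
        }
        // s-maxage overrides max-age, but only for shared caches.
        if (sharedCache && d.containsKey("s-maxage")) {
            return Long.parseLong(d.get("s-maxage"));
        }
        return d.containsKey("max-age") ? Long.parseLong(d.get("max-age")) : 0;
    }

    public static void main(String[] args) {
        String header = "public, max-age=60, s-maxage=600";
        System.out.println("browser cache: " + freshnessSeconds(header, false));
        System.out.println("proxy cache:   " + freshnessSeconds(header, true));
    }
}
```

The sharedCache flag is what separates a browser cache from a proxy or CDN cache here: the same header value yields different lifetimes depending on who is reading it.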

3. ETag & If-None-Match

A sample response that carries an ETag header:

HTTP/1.1 200 OK
Date: Fri, 30 Oct 1998 13:19:41 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
ETag: "3e86-410-3596fbbc"
Content-Length: 1040
Content-Type: text/html

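ETag was introduced in HTTP/1.1. The server generates it and returns it to the client.

On your first request, the server returns an ETag with the response.

When you request the same resource again, the client sends an If-None-Match header whose value is that ETag (set by the requesting client).

The server then compares the ETag sent by the client with the one for its current representation.

If they match, the If-None-Match condition evaluates to false and the server returns 304 Not Modified with no body; the client keeps using its local copy (there is nothing to send, since the data has not changed).

If they do not match, the condition evaluates to true and the server returns 200 with the full body and the new ETag, which the client parses as usual.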
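The exchange above can be sketched from the server's side as a single decision. This is a minimal illustration with hypothetical names (IfNoneMatchExample, statusFor), not how any particular server implements it. It assumes a GET request, uses the weak comparison that If-None-Match calls for (the W/ prefix is ignored), and splits the header on commas, which is a simplification because an entity tag may itself contain a comma.

```java
import java.util.Arrays;

public class IfNoneMatchExample {

    /** Drops the weak prefix so that W/"abc" and "abc" compare as equal. */
    static String opaqueTag(String etag) {
        String t = etag.trim();
        return t.startsWith("W/") ? t.substring(2) : t;
    }

    /**
     * Picks the status code for a GET, given the request's If-None-Match value
     * (null when the header is absent) and the ETag of the current representation.
     */
    static int statusFor(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null) {
            return 200; // unconditional request: send the full body
        }
        if (ifNoneMatch.trim().equals("*")) {
            return 304; // "*" matches any current representation
        }
        String current = opaqueTag(currentEtag);
        boolean matched = Arrays.stream(ifNoneMatch.split(","))
                .map(IfNoneMatchExample::opaqueTag)
                .anyMatch(current::equals);
        // A match means the client's copy is still good: answer 304 without a body.
        return matched ? 304 : 200;
    }

    public static void main(String[] args) {
        String current = "\"3e86-410-3596fbbc\"";
        System.out.println(statusFor(null, current));
        System.out.println(statusFor("\"3e86-410-3596fbbc\"", current));
        System.out.println(statusFor("\"some-older-tag\"", current));
    }
}
```

Note that the comparison is done on the quoted string as a whole, which is why the client has to send the tag back exactly as it received it.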

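In short:
ETag (entity tag): typically a hash of the resource representation, although clients should treat it as an opaque value.
It is a marker generated by the server that identifies whether the returned content has changed.

ETag also takes priority over Last-Modified: when a request carries both If-None-Match and If-Modified-Since, the server evaluates If-None-Match and ignores If-Modified-Since.

Tip: the ETag placed in If-None-Match must be identical to the value the server returned, double quotes included.

ETag generation logic in Spring:
org.springframework.web.filter.ShallowEtagHeaderFilter#generateETagHeaderValue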

    /**
     * Generate the ETag header value from the given response body byte array.
     * <p>The default implementation generates an MD5 hash.
     * @param inputStream the response body as an InputStream
     * @param isWeak whether the generated ETag should be weak
     * @return the ETag header value
     * @see org.springframework.util.DigestUtils
     */
    protected String generateETagHeaderValue(InputStream inputStream, boolean isWeak) throws IOException {
        // length of W/ + " + 0 + 32bits md5 hash + "
        StringBuilder builder = new StringBuilder(37);
        if (isWeak) {
            builder.append("W/");
        }
        builder.append("\"0");
        DigestUtils.appendMd5DigestAsHex(inputStream, builder);
        builder.append('"');
        return builder.toString();
    }
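For experimenting outside Spring, the same shape of value can be produced with the JDK alone. The sketch below is an approximation written for this post, not Spring's code: the class and method names (ShallowEtagSketch, etagFor) are hypothetical, and it takes a byte array rather than an InputStream to keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ShallowEtagSketch {

    /** Builds an ETag of the form ["W/"] + quote + "0" + 32 hex characters + quote. */
    static String etagFor(byte[] body, boolean isWeak) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("MD5").digest(body);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform is required to provide.
            throw new IllegalStateException(e);
        }
        StringBuilder builder = new StringBuilder(37);
        if (isWeak) {
            builder.append("W/");
        }
        builder.append("\"0");
        for (byte b : digest) {
            // Mask to an unsigned value so negative bytes still print as two hex digits.
            builder.append(String.format("%02x", b & 0xff));
        }
        builder.append('"');
        return builder.toString();
    }

    public static void main(String[] args) {
        byte[] body = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        System.out.println("ETag: " + etagFor(body, false));
        System.out.println("ETag: " + etagFor(body, true));
    }
}
```

MD5 is used here only as a change detector for the response body, not for security; any stable digest of the bytes would serve the same purpose.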

Related 3xx status codes:
301 Moved Permanently
302 Found
304 Not Modified

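4. Last-Modified & If-Modified-Since

Last-Modified works like ETag, except that it records when the resource was last modified on the server; the client sends that time back in If-Modified-Since. Compared with ETag it has these shortcomings:

　　(1) Last-Modified has one-second resolution, so if a file is modified several times within the same second it cannot tell the versions apart;

　　(2) Files that are regenerated periodically get a new Last-Modified even when their content is unchanged, so the cached copy cannot be reused;

　　(3) The server may not obtain the modification time accurately, or its clock may disagree with that of a proxy server.

ETag, by contrast, is a unique identifier for the resource on the server side, generated either automatically by the server or by the developer, and so gives more precise control over caching.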
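The one-second limitation in point (1) is easy to reproduce. In this small Java sketch (the class name and timestamps are made up for illustration), two modifications 800 milliseconds apart format to the same Last-Modified value, so a client revalidating with If-Modified-Since could not tell them apart.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class LastModifiedResolutionExample {

    // HTTP dates carry no fraction of a second, so anything below one second is dropped.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                    .withZone(ZoneOffset.UTC);

    /** Formats a modification time the way it would appear in a Last-Modified header. */
    static String lastModified(Instant modifiedAt) {
        return HTTP_DATE.format(modifiedAt);
    }

    public static void main(String[] args) {
        Instant firstEdit = Instant.parse("1998-06-29T02:28:12.100Z");
        Instant secondEdit = Instant.parse("1998-06-29T02:28:12.900Z");

        System.out.println("Last-Modified: " + lastModified(firstEdit));
        System.out.println("Last-Modified: " + lastModified(secondEdit));
        System.out.println("same header value: "
                + lastModified(firstEdit).equals(lastModified(secondEdit)));
    }
}
```

An ETag derived from the content would differ between the two versions, which is the practical advantage described above.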

IV. Further reading

[1] "Caching Tutorial"

RFC 7232, HTTP/1.1 Conditional Requests, June 2014:
   This method relies on the fact that if two different responses were
   sent by the origin server during the same second, but both had the
   same Last-Modified time, then at least one of those responses would
   have a Date value equal to its Last-Modified time.  The arbitrary
   60-second limit guards against the possibility that the Date and
   Last-Modified values are generated from different clocks or at
   somewhat different times during the preparation of the response.  An
   implementation MAY use a value larger than 60 seconds, if it is
   believed that 60 seconds is too short.

2.3. ETag

   The "ETag" header field in a response provides the current entity-tag
   for the selected representation, as determined at the conclusion of
   handling the request.  An entity-tag is an opaque validator for
   differentiating between multiple representations of the same
   resource, regardless of whether those multiple representations are
   due to resource state changes over time, content negotiation
   resulting in multiple representations being valid at the same time,
   or both.  An entity-tag consists of an opaque quoted string, possibly
   prefixed by a weakness indicator.
 
     ETag       = entity-tag
     entity-tag = [ weak ] opaque-tag
     weak       = %x57.2F ; "W/", case-sensitive
     opaque-tag = DQUOTE *etagc DQUOTE
     etagc      = %x21 / %x23-7E / obs-text
                ; VCHAR except double quotes, plus obs-text

      Note: Previously, opaque-tag was defined to be a quoted-string
      ([RFC2616], Section 3.11); thus, some recipients might perform
      backslash unescaping.  Servers therefore ought to avoid backslash
      characters in entity tags.

   An entity-tag can be more reliable for validation than a modification
   date in situations where it is inconvenient to store modification
   dates, where the one-second resolution of HTTP date values is not
   sufficient, or where modification dates are not consistently
   maintained.

   Examples:
     ETag: "xyzzy"
     ETag: W/"xyzzy"
     ETag: ""

https://tools.ietf.org/html/rfc7232#section-2.3

The ETag or entity tag is part of HTTP, the protocol for the World Wide Web. It is one of several mechanisms that HTTP provides for web cache validation, which allows a client to make conditional requests. This allows caches to be more efficient, and saves bandwidth, as a web server does not need to send a full response if the content has not changed. ETags can also be used for optimistic concurrency control, as a way to help prevent simultaneous updates of a resource from overwriting each other.

An ETag is an opaque identifier assigned by a web server to a specific version of a resource found at a URL. If the resource representation at that URL ever changes, a new and different ETag is assigned. Used in this manner ETags are similar to fingerprints, and they can be quickly compared to determine whether two representations of a resource are the same.
https://en.wikipedia.org/wiki/HTTP_ETag

17.13 ETag support
An ETag (entity tag) is an HTTP response header returned by an HTTP/1.1 compliant web server used to determine change in content at a given URL. It can be considered to be the more sophisticated successor to the Last-Modified header. When a server returns a representation with an ETag header, the client can use this header in subsequent GETs, in an If-None-Match header. If the content has not changed, the server returns 304: Not Modified.

Support for ETags is provided by the Servlet filter ShallowEtagHeaderFilter. It is a plain Servlet Filter, and thus can be used in combination with any web framework. The ShallowEtagHeaderFilter filter creates so-called shallow ETags (as opposed to deep ETags, more about that later). The filter caches the content of the rendered JSP (or other content), generates an MD5 hash over that, and returns that as an ETag header in the response. The next time a client sends a request for the same resource, it uses that hash as the If-None-Match value. The filter detects this, renders the view again, and compares the two hashes. If they are equal, a 304 is returned. This filter will not save processing power, as the view is still rendered. The only thing it saves is bandwidth, as the rendered response is not sent back over the wire.

You configure the ShallowEtagHeaderFilter in web.xml:

<filter>
  <filter-name>etagFilter</filter-name>
  <filter-class>org.springframework.web.filter.ShallowEtagHeaderFilter</filter-class>
</filter>
 
<filter-mapping>
  <filter-name>etagFilter</filter-name>
  <servlet-name>petclinic</servlet-name>
</filter-mapping>

https://docs.spring.io/spring-framework/docs/3.2.0.M2/reference/html/mvc.html#mvc-exceptionhandlers

http://www.spring4all.com/article/1553
Configuring it with Java config:

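import javax.servlet.Filter; // jakarta.servlet.Filter on Spring 6 / Spring Boot 3

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.filter.ShallowEtagHeaderFilter;

@Configuration
public class WebConfig {

  // Exposes the filter as a bean so the container can register it.
  @Bean
  public Filter shallowEtagHeaderFilter() {
    return new ShallowEtagHeaderFilter();
  }
}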

https://stackoverflow.com/questions/26151057/add-a-servlet-filter-in-a-spring-boot-application
https://stackoverflow.com/questions/30350137/generating-etag-using-spring-boot
https://javadeveloperzone.com/spring-boot/spring-boot-etag-header-example/
