Replacing response body content and URLs with ngx_http_substitutions_filter_module

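Summary

Sometimes you need to reverse-proxy a site with Nginx and rewrite the response body content and URLs, using either HttpSubModule or ngx_http_substitutions_filter_module.
The built-in HttpSubModule matches only one fixed string per sub_filter rule and does not support regular expressions, whereas the third-party ngx_http_substitutions_filter_module can match several patterns with a single rule.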


Article URL: https://www.old.i4t.com/3530.html
Category: Nginx
Date: 2019-01-10 01:12:04



I. ngx_http_sub_module

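ngx_http_sub_module is a filter that rewrites strings in the response body, for example replacing every 123 in the response with 321. The module ships with the nginx source but is not built by default; to enable it, add the configure parameter --with-http_sub_module.
Official documentation: http://nginx.org/en/docs/http/ngx_http_sub_module.html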

1. Install nginx with sub_module

    # 1. Install dependencies
    yum install -y gcc glibc gcc-c++ openssl-devel pcre-devel lua-devel libxml2 libxml2-devel libxslt-devel perl-ExtUtils-Embed GeoIP GeoIP-devel GeoIP-data

    # 2. Create the nginx user and download the source package
    useradd -M -s /sbin/nologin nginx
    wget http://nginx.org/download/nginx-1.14.2.tar.gz
    tar xf nginx-1.14.2.tar.gz && cd nginx-1.14.2

    # 3. Build with the sub module
    ./configure --prefix=/usr/local/nginx-1.14 --user=nginx --group=nginx --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module
    make && make install

    # 4. Create a symlink and start nginx
    ln -s /usr/local/nginx-1.14 /usr/local/nginx
    /usr/local/nginx/sbin/nginx

    # Confirm that the module was compiled in
    [root@i4t ~]# /usr/local/nginx/sbin/nginx -V
    nginx version: nginx/1.14.2
    built by gcc 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)
    built with OpenSSL 1.0.2k-fips 26 Jan 2017
    TLS SNI support enabled
    configure arguments: --prefix=/usr/local/nginx-1.14 --user=nginx --group=nginx --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module

2. Create a test file and test

Edit nginx.conf:

    [root@i4t conf]# cat nginx.conf
    worker_processes 1;
    events {
        worker_connections 1024;
    }
    http {
        include mime.types;
        default_type application/octet-stream;
        sendfile on;
        keepalive_timeout 65;
        server {
            listen 80;
            server_name localhost;
            location / {
                root html;
                index index.html index.htm;
                # sub_filter matches fixed strings only, so the '|' here would be taken literally:
                #sub_filter 'http://nginx.com|http://nginx.org' http://www.old.i4t.com;
                sub_filter 'http://nginx.org' http://www.old.i4t.com;
                sub_filter 'http://nginx.com' http://www.old.i4t.com;
                sub_filter_types *;
                sub_filter_once off;
            }
        }
    }
    [root@i4t ~]# /usr/local/nginx/sbin/nginx -s reload
Note: sub_filter cannot combine several patterns in one rule. It matches fixed strings only and takes no flags argument, so the following form is not valid:

    sub_filter 'http://nginx.com|http://nginx.org' 'https://img.old.i4t.com' ir;

Write one directive per string instead:

    sub_filter 'http://nginx.com' 'https://img.old.i4t.com';
    sub_filter 'http://nginx.org' 'https://img.old.i4t.com';

Then write the test page into the site directory:

    [root@i4t ~]# echo "http://nginx.com http://nginx.org" >/usr/local/nginx/html/index.html

Access test

If several patterns are combined in one sub_filter rule, the substitution does not take effect.

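sub_filter directive reference

sub_filter string replacement;
Replaces string with replacement. Matching ignores case, and the directive operates on the text left by the previous processing step.

sub_filter_last_modified on | off; default: off
Controls whether the Last-Modified header of the original response is kept. With the default, off, the header is removed, which prevents caching; on preserves it.

sub_filter_once on | off; default: on
Whether each sub_filter string is replaced only once or every time it occurs. The default replaces only the first occurrence.

sub_filter_types mime-type ...; default: text/html
Replacement is applied only to responses with the listed MIME types, in addition to text/html. The special value * matches any MIME type.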
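
The four directives above can be combined in a single location. The fragment below is a minimal sketch rather than a tested production configuration; https://www.example.com is a placeholder replacement target that does not come from the article, and using several sub_filter directives at one level assumes nginx 1.9.4 or later.

```nginx
location / {
    root  html;
    index index.html index.htm;

    # One fixed string per directive (no regular expressions, no flags).
    sub_filter 'http://nginx.org' 'https://www.example.com';   # placeholder target
    sub_filter 'http://nginx.com' 'https://www.example.com';   # placeholder target

    sub_filter_once          off;        # replace every occurrence, not just the first
    sub_filter_types         text/css;   # text/html is always processed in addition
    sub_filter_last_modified on;         # keep the original Last-Modified header
}
```

After changing the configuration, reload nginx and request the page with curl to confirm that both strings are rewritten.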

Official documentation: http://nginx.org/en/docs/http/ngx_http_sub_module.html

II. ngx_http_substitutions_filter_module

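Official description
nginx_substitutions_filter is a filter module that can perform both regular expression and fixed string substitutions on response bodies. It is quite different from Nginx's native substitution module: it scans the output chain buffers and matches strings line by line.

GitHub: https://github.com/yaoweibin/ngx_http_substitutions_filter_module

Notes:
ngx_http_substitutions_filter_module refers to the third-party nginx module substitutions4nginx (formerly hosted on Google Code, now on GitHub).
HttpSubModule refers to the official nginx module enabled with the --with-http_sub_module option.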

2.1 Compile and install Nginx with the module

    # 1. Install dependencies
    yum install -y gcc glibc gcc-c++ openssl-devel pcre-devel lua-devel libxml2 libxml2-devel libxslt-devel perl-ExtUtils-Embed GeoIP GeoIP-devel GeoIP-data

    # 2. Create the nginx user and download the source package
    useradd -M -s /sbin/nologin nginx
    wget http://nginx.org/download/nginx-1.14.2.tar.gz
    tar xf nginx-1.14.2.tar.gz

    # 3. Download the module (the archive is saved as "master")
    wget https://codeload.github.com/yaoweibin/ngx_http_substitutions_filter_module/zip/master
    unzip master

    # 4. Build with the ngx_http_substitutions_filter_module module
    cd nginx-1.14.2
    ./configure --prefix=/usr/local/nginx-1.14 --user=nginx --group=nginx --with-http_ssl_module --with-http_stub_status_module --add-module=/root/ngx_http_substitutions_filter_module-master
    make && make install

    # 5. Start nginx and check the module
    ln -s /usr/local/nginx-1.14 /usr/local/nginx
    /usr/local/nginx/sbin/nginx
    [root@i4t nginx-1.14.2]# /usr/local/nginx/sbin/nginx -V
    nginx version: nginx/1.14.2
    built by gcc 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)
    built with OpenSSL 1.0.2k-fips 26 Jan 2017
    TLS SNI support enabled
    configure arguments: --prefix=/usr/local/nginx-1.14 --user=nginx --group=nginx --with-http_ssl_module --with-http_stub_status_module --add-module=/root/ngx_http_substitutions_filter_module-master

2.2 Test the Nginx substitutions module

1. Edit the nginx configuration file:

    [root@i4t ~]# cat /usr/local/nginx/conf/nginx.conf
    worker_processes 1;
    events {
        worker_connections 1024;
    }
    http {
        include mime.types;
        default_type application/octet-stream;
        sendfile on;
        keepalive_timeout 65;
        server {
            listen 80;
            server_name localhost;
            location / {
                root html;
                index index.html index.htm;
                # i = case-insensitive, r = treat the pattern as a regular expression
                subs_filter 'http://nginx.com|http://nginx.org' 'https://img.old.i4t.com' ir;
                #subs_filter http://nginx.org https://img.old.i4t.com;
            }
        }
    }
    [root@i4t ~]# /usr/local/nginx/sbin/nginx -s reload
Note: with this module, several patterns can be combined into one rule by using a regular expression (the r flag):

    subs_filter 'http://nginx.com|http://nginx.org' 'https://img.old.i4t.com' ir;

One rule per string also works:

    subs_filter 'http://nginx.com' 'https://img.old.i4t.com';
    subs_filter 'http://nginx.org' 'https://img.old.i4t.com';

2. Write the test page into the site directory:

    [root@i4t ~]# echo "http://nginx.com http://nginx.org" >/usr/local/nginx/html/index.html

Access test: without substitution the page shows nginx.com and nginx.org; after substitution both are replaced.
The browser may serve a cached copy, so use curl to check the result.

2.3 Module usage

Example

    location / {
        subs_filter_types text/html text/css text/xml;
        subs_filter st(\d*).example.com $1.example.com ir;
        subs_filter a.example.com s.example.com;
        subs_filter http://$host https://$host;
    }
Directives
    *   subs_filter_types

    *   subs_filter

   subs_filter_types
    syntax: *subs_filter_types mime-type [mime-types] *

    default: *subs_filter_types text/html*

    context: *http, server, location*

    *subs_filter_types* is used to specify which content types should be
    checked for *subs_filter*, in addition to *text/html*. The default is
    only *text/html*.

    This module just works with plain text. If the response is compressed,
    it can't uncompress the response and will ignore this response. This
    module can be compatible with gzip filter module. But it will not work
    with proxy compressed response. You can disable the compressed response
    like this:

    proxy_set_header Accept-Encoding "";
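
    In a reverse-proxy setup, that header line sits next to proxy_pass. The fragment below is a minimal sketch under stated assumptions: the backend address 127.0.0.1:8080 is hypothetical, and the replacement rule simply reuses the URLs from the test earlier in this article.

```nginx
location / {
    proxy_pass       http://127.0.0.1:8080;   # hypothetical backend address
    proxy_set_header Accept-Encoding "";      # ask the backend for an uncompressed body

    subs_filter_types text/css text/xml;      # checked in addition to text/html
    subs_filter 'http://nginx.org' 'https://img.old.i4t.com';

    gzip on;   # per the README, nginx's own gzip filter remains compatible
}
```

    With this layout the backend sends plain text, the substitution runs on it, and nginx can still compress the rewritten response for the client.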

   subs_filter
    syntax: *subs_filter source_str destination_str [gior] *

    default: *none*

    context: *http, server, location*

    *subs_filter* allows replacing a source string (regular expression or
    fixed) in the nginx response with a destination string. Variables in
    the matching text are only available in fixed string mode, which means
    the matching text cannot contain variables if it is a regular
    expression. The substitution text may contain variables. More than one
    substitution rule per location is supported.
    The meanings of the third argument's flags are:

    *   *g*(default): Replace all the match strings.

    *   *i*: Perform a case-insensitive match.

    *   *o*: Just replace the first one.

    *   *r*: The pattern is treated as a regular expression, default is
        fixed string.
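
    The flags can be combined in the third argument. The rules below are an illustrative sketch only: the host names and the "Example Corp" string are made up for the example, and the dots in the regular expression are escaped so that they match literal dots.

```nginx
location / {
    subs_filter_types text/css;

    # Fixed string with the default flag g: replace every match.
    subs_filter 'a.example.com' 's.example.com';

    # Fixed string, case-insensitive (i), replacing only the first match (o).
    subs_filter 'Example Corp' 'Example Inc' io;

    # Regular expression (r) with a capture group; $1 holds the captured digits.
    subs_filter 'st(\d*)\.example\.com' '$1.example.com' ir;
}
```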

   subs_filter_bypass
    syntax: *subs_filter_bypass $variable1 ...*

    default: *none*

    context: *http, server, location*

    You can specify several variables with this directive. If at least one
    of the variables is not empty and is not equal to '0', this substitution
    filter will be disabled.
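
    One way to drive subs_filter_bypass is a map block in the http context. This is a sketch under assumptions, not a recommended setup: the query argument raw, the cookie nosubs, and the variable name $subs_bypass are all hypothetical, and the directive requires module version 0.6.4 or later according to the change log.

```nginx
# http context
map $arg_raw $subs_bypass {   # $arg_raw: hypothetical ?raw=... query argument
    default 0;                # 0 keeps the filter enabled
    1       1;                # ?raw=1 disables it
}

server {
    listen 80;

    location / {
        subs_filter 'http://nginx.org' 'https://img.old.i4t.com';

        # The filter is skipped if either variable is non-empty and not '0'.
        subs_filter_bypass $subs_bypass $cookie_nosubs;
    }
}
```

    Requesting a page with ?raw=1, or with a nosubs cookie set to any value other than 0, would then return the unmodified body.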

  Installation
    To install, get the source with git:

    git clone https://github.com/yaoweibin/ngx_http_substitutions_filter_module.git

    and then compile nginx with the following option:

    ./configure --add-module=/path/to/module

  Known issue
    *   Can't substitute the response header.

  CHANGES
    Changes with nginx_substitutions_filter 0.6.4 2014-02-15

    *   Now non-200 response will work

    *   added the subs_filter_bypass directive

    Changes with nginx_substitutions_filter 0.6.2 2012-08-26

    *   fixed a bug of buffer overlap

    *   fixed a bug with last zero buffer

    Changes with nginx_substitutions_filter 0.6.0 2012-06-30

    *   refactor this module

    Changes with nginx_substitutions_filter 0.5.2 2010-08-11

    *   do many optimizing for this module

    *   fix a bug of buffer overlap

    *   fix a segment fault bug when output chain return NGX_AGAIN.

    *   fix a bug about last buffer with no linefeed. This may cause segment
        fault. Thanks for Josef Fröhle

    Changes with nginx_substitutions_filter 0.5 2010-04-15

    *   refactor the source structure, create branches of dev

    *   fix a bug of small chunk of buffers causing lose content

    *   fix the bug of last_buf and the nginx's compatibility above 0.8.25

    *   fix a bug with unwanted capture config error in fix string
        substitution

    *   add feature of regex captures

    Changes with nginx_substitutions_filter 0.4 2009-12-23

    *   fix many bugs

    Changes with nginx_substitutions_filter 0.3 2009-02-04

    *   Initial public release

  Reporting a bug
    Questions/patches may be directed to Weibin Yao, yaoweibin@gmail.com.

  Copyright & License
    This module is licensed under the BSD license.

    Copyright (C) 2014 by Weibin Yao <yaoweibin@gmail.com>.

    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions are
    met:

    *   Redistributions of source code must retain the above copyright
        notice, this list of conditions and the following disclaimer.

    *   Redistributions in binary form must reproduce the above copyright
        notice, this list of conditions and the following disclaimer in the
        documentation and/or other materials provided with the distribution.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
    IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
    TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
    PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
    TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
    PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
    LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
    NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
    SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


新闻联播老司机
