Oracle 11g Data Masking

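Preface

Recently the developers asked for a copy of the production data to be loaded into the test database.

For production data security, the exported data must be encrypted or masked first, and we are permitted to do so.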

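On encryption and masking

My personal understanding is as follows.

Encryption transforms data according to a set of rules. The original data can be recovered by applying those rules in reverse (decryption), and it may also be recovered by an attacker who breaks them.

Masking also transforms data according to certain rules, but the transformation is irreversible and the original content is lost.

Encrypted data is necessarily masked, but masked data is not the same as encrypted data.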
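To make the distinction concrete, here is a minimal sketch. It assumes EXECUTE privilege on DBMS_CRYPTO (a package the original article does not use), and the key and sample value are made up for illustration. The AES round trip recovers the input, while the MD5 digest has no matching call to reverse it. For a small input domain a digest can still be guessed by brute force, so treat this as an illustration of the concept only.

```sql
set serveroutput on
declare
    -- Hypothetical 16-byte key and sample value, for illustration only.
    l_key    raw(16)     := utl_raw.cast_to_raw('0123456789abcdef');
    l_plain  raw(100)    := utl_raw.cast_to_raw('13800000000');
    l_typ    pls_integer := dbms_crypto.encrypt_aes128
                          + dbms_crypto.chain_cbc
                          + dbms_crypto.pad_pkcs5;
    l_enc    raw(200);
    l_dec    raw(200);
    l_hash   raw(16);
begin
    -- Encryption is reversible for anyone holding the key.
    l_enc := dbms_crypto.encrypt(src => l_plain, typ => l_typ, key => l_key);
    l_dec := dbms_crypto.decrypt(src => l_enc,   typ => l_typ, key => l_key);
    dbms_output.put_line('decrypted: ' || utl_raw.cast_to_varchar2(l_dec));

    -- A hash is one-way: there is no corresponding "unhash" call.
    l_hash := dbms_crypto.hash(src => l_plain, typ => dbms_crypto.hash_md5);
    dbms_output.put_line('md5:       ' || rawtohex(l_hash));
end;
/
```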

This article discusses masking.

Environment setup

SYS@zkm> drop table scott.test purge;

Table dropped.

SYS@zkm> create table scott.test as select level id from dual connect by level<=15;

Table created.

SYS@zkm> desc scott.test

 Name                                      Null?    Type

 ----------------------------------------- -------- ----------------------------

 ID                                                 NUMBER

SYS@zkm> drop table zkm.test purge;

Table dropped.

SYS@zkm> create table zkm.test as select * from scott.test where 1=2;

Table created.

SYS@zkm> desc zkm.test

 Name                                      Null?    Type

 ----------------------------------------- -------- ----------------------------

 ID                                                 NUMBER

SYS@zkm>

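The basic requirement is to export the data of scott.test in masked form and import it into zkm.test.

Method 1

Create a function based on the MD5 algorithm.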

create or replace function fn_md5(input_string VARCHAR2) return varchar2
IS
    -- Convert the input to RAW; sized at the PL/SQL maximum so long strings do not overflow.
    raw_input    RAW(32767) := UTL_RAW.CAST_TO_RAW(input_string);
    -- Receives the 16-byte MD5 digest.
    checksum_raw RAW(2048);
BEGIN
    sys.dbms_obfuscation_toolkit.MD5(input => raw_input, checksum => checksum_raw);
    -- Return the digest as a 32-character hexadecimal string.
    return rawtohex(checksum_raw);
END;
/

SYS@zkm> select id,fn_md5(id) string from scott.test;

        ID STRING

---------- --------------------------------------------------

         1 C4CA4238A0B923820DCC509A6F75849B

         2 C81E728D9D4C2F636F067F89CC14862C

         3 ECCBC87E4B5CE2FE28308FD9F2A7BAF3

         4 A87FF679A2F3E71D9181A67B7542122C

         5 E4DA3B7FBBCE2345D7772B0674A318D5

         6 1679091C5A880FAF6FB5E6087EB1B2DC

         7 8F14E45FCEEA167A5A36DEDD4BEA2543

         8 C9F0F895FB98AB9159F51FD0297E236D

         9 45C48CCE2E2D7FBDEA1AFC51C7C6AD26

        10 D3D9446802A44259755D38E6D163E820

        11 6512BD43D9CAA6E02C990B0A82652DCA

        ID STRING

---------- --------------------------------------------------

        12 C20AD4D76FE97759AA27A0C99BFF6710

        13 C51CE410C124A10E0DB5E4B97FC2AF39

        14 AAB3238922BCC25A6F606EB525FFDC56

        15 9BF31C7FF062936A96D3C8BD1F8F2FF3

15 rows selected.

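The drawbacks are obvious: the value gets longer, and the type is no longer NUMBER.

The next step is to create a new table (another drawback) to hold the converted data, and then export that table with Data Pump.

However, the id column of zkm.test then cannot be of type NUMBER, otherwise the import fails. This method is therefore barely workable when the table has non-string columns.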
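The staging step could look like the following sketch. The table name scott.test_md5 is hypothetical, and it assumes fn_md5 is callable by the current user (it was created under SYS in this session). The cast keeps the new column at VARCHAR2(32) instead of the much wider default a function result would get.

```sql
-- Hypothetical staging table; the masked column becomes VARCHAR2, not NUMBER.
create table scott.test_md5 as
select cast(fn_md5(id) as varchar2(32)) as id
  from scott.test;

-- Confirm the new column type before exporting the staging table with Data Pump.
desc scott.test_md5
```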

Method 2

Use the built-in function translate.

SYS@zkm> select id,translate(id,'0123456789','abcdefghijk') string from scott.test;

        ID STRING

---------- --------------------------------------------------

         1 b

         2 c

         3 d

         4 e

         5 f

         6 g

         7 h

         8 i

         9 j

        10 ba

        11 bb

        ID STRING

---------- --------------------------------------------------

        12 bc

        13 bd

        14 be

        15 bf

15 rows selected.

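As before, a new table (a drawback) is needed to hold the converted data, which is then exported with Data Pump.

If a numeric column is converted to a character type, the same problem as in method 1 arises. It is therefore better to map numbers to numbers, for example: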

SYS@zkm> select id,translate(id,'0123456789','1564719874') string,to_number(translate(id,'0123456789','1564719874')) string2 from scott.test;

        ID STRING             STRING2

---------- --------------- ----------

         1 5                        5

         2 6                        6

         3 4                        4

         4 7                        7

         5 1                        1

         6 9                        9

         7 8                        8

         8 7                        7

         9 4                        4

        10 51                      51

        11 55                      55

        ID STRING             STRING2

---------- --------------- ----------

        12 56                      56

        13 54                      54

        14 57                      57

        15 51                      51

15 rows selected.

Note that translate still returns a string, so to_number is needed to convert the result back to a number.
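With a digit-substitution map it is worth checking whether two source values collapse to the same masked value, which matters if the column is a key. The query below is a sketch; the two mapping strings are assumptions chosen for illustration. The target string repeats some digits, so collisions are expected, and every row the query returns is one.

```sql
-- Illustrative mapping (an assumption): each digit in the first string
-- is replaced by the digit at the same position in the second string.
select masked_id, count(*) as source_rows
  from (select to_number(translate(id, '0123456789', '1564719874')) as masked_id
          from scott.test)
 group by masked_id
having count(*) > 1
 order by masked_id;
```

A target string that is a permutation of the ten digits avoids this for values of equal length, although a leading digit mapped to 0 is dropped by to_number and shortens the value.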

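Method 3

Use the REMAP_DATA parameter of expdp.

For detailed usage, see the REMAP_DATA entry in the official Oracle documentation.

This method is slightly more involved than methods 1 and 2, but because the data volume is fairly large, it is the most convenient one here.

First, build a table with a column that holds Chinese text.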

SYS@zkm> create table scott.test2 (name varchar2(20));

Table created.

SYS@zkm> insert into scott.test2 values('数据库');

1 row created.

SYS@zkm> insert into scott.test2 values('绑定变量');

1 row created.

SYS@zkm> insert into scott.test2 values('执行计划');

1 row created.

SYS@zkm> commit;

Commit complete.

SYS@zkm> select * from scott.test2;

NAME

------------------------------------------------------------

数据库

绑定变量

执行计划

SYS@zkm> create table zkm.test2 as select * from scott.test2 where 1=2;

Table created.

Results of applying methods 1 and 2 to this table:

SYS@zkm> select name,fn_md5(name) string from scott.test2;

NAME       STRING

---------- -----------------------------------------------------------------

数据库     8B2F1D29A87AB8968601C2AB7D9084B9

绑定变量   FCE81E294017C008C111B61164728CFA

执行计划   A394306DB20856B8BF68FA197DC5DAD4

SYS@zkm> col name for a15

SYS@zkm> set line 500

SYS@zkm> col string for a50

SYS@zkm> select name,translate(name,'数据库定变量执行计划绑','之后返回的类型还是字符') string from scott.test2;

NAME            STRING

--------------- --------------------------------------------------

数据库          之后返

绑定变量        符回的类

执行计划        型还是字

SYS@zkm> select name,translate(name,'数据库定变量执行计划绑','哈哈哈哈哈哈哈哈哈哈嗝') string from scott.test2;

NAME            STRING

--------------- --------------------------------------------------

数据库          哈哈哈

绑定变量        嗝哈哈哈

执行计划        哈哈哈哈

SYS@zkm>
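translate only rewrites characters that appear in its second argument; anything else passes through unchanged. The sketch below uses a made-up input containing two characters (索引) that are not in the mapping, to show that they would leak into the masked output.

```sql
-- '索' and '引' are not in the from-string, so translate leaves them as-is.
select translate('数据库索引', '数据库定变量执行计划绑', '之后返回的类型还是字符') as string
  from dual;
-- Expected value: 之后返索引
```

For real data the mapping would therefore need to cover every character that can occur in the column.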

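Using remap_data requires a package and package body for it to call. Two cases are mainly considered:

string columns (whether or not they contain Chinese)
numeric columns

Two further constraints are also taken into account:

the data type stays the same
the string length stays the same

The package and package body are created as follows: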

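create or replace package scott.pkg_remap
is
    -- Masks NUMBER columns; the input value is ignored.
    function fn_numeral(input_string number) return number;
    -- Masks VARCHAR2 columns by character substitution.
    function f_remap_varchar(p_varchar varchar2) return varchar2;
end;
/

create or replace package body scott.pkg_remap
is
    -- Returns a random integer from 1 to 99999, regardless of the input.
    function fn_numeral(input_string number) return number as
    begin
        return floor(dbms_random.value(1, 100000));
    end;

    -- Replaces each character of the first list with the character at the
    -- same position in the second list; the character count is unchanged.
    function f_remap_varchar(p_varchar varchar2) return varchar2 as
    begin
        return translate(p_varchar,'数据库定变量执行计划绑','之后返回的类型还是字符');
    end;
end;
/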

The block above is a template for copying. The SQL*Plus session that creates it:

SYS@zkm> create or replace package scott.pkg_remap

  2  is

  3     function fn_numeral(input_string number) return number;

  4     function f_remap_varchar(p_varchar varchar2) return varchar2;

  5  end;

  6  /

Package created.

SYS@zkm>

SYS@zkm>

SYS@zkm> create or replace package body scott.pkg_remap

  2  is

  3  function fn_numeral(input_string number) return number as

  4     begin

  5     return floor(dbms_random.value(1, 100000));

  6     end;

  7  function f_remap_varchar(p_varchar varchar2) return varchar2 as

  8     begin

  9             return translate(p_varchar,'数据库定变量执行计划绑','之后返回的类型还是字符');

 10     end;

 11  end;

 12  /

Package body created.
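Before running expdp, the two functions can be called directly from SQL to preview what REMAP_DATA will write. A quick sketch against the tables created earlier:

```sql
-- Preview the numeric remap; values differ on every run because of dbms_random.
select id, scott.pkg_remap.fn_numeral(id) as masked_id
  from scott.test
 where rownum <= 5;

-- Preview the string remap and confirm the character length is unchanged.
select name,
       scott.pkg_remap.f_remap_varchar(name)         as masked_name,
       length(name)                                  as len_before,
       length(scott.pkg_remap.f_remap_varchar(name)) as len_after
  from scott.test2;
```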

Export and import.

Since the id column of scott.test is of type NUMBER, pkg_remap.fn_numeral is used.

[oracle@oracle ~]$ expdp \'/ as sysdba\' directory=dirtmp dumpfile=test.dmp logfile=test.log cluster=n tables=scott.test remap_data=scott.test.id:scott.pkg_remap.fn_numeral reuse_dumpfiles=y

Export: Release 11.2.0.4. - Production on Mon May  :: 

Copyright (c) , , Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4. - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

Starting "SYS"."SYS_EXPORT_TABLE_01":  "/******** AS SYSDBA" directory=dirtmp dumpfile=test.dmp logfile=test.log cluster=n tables=scott.test remap_data=scott.test.id:scott.pkg_remap.fn_numeral reuse_dumpfiles=y

Estimate in progress using BLOCKS method...

Processing object type TABLE_EXPORT/TABLE/TABLE_DATA

Total estimation using BLOCKS method:  KB

Processing object type TABLE_EXPORT/TABLE/TABLE

. . exported "SCOTT"."TEST"                              5.125 KB       rows

Master table "SYS"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded

******************************************************************************

Dump file set for SYS.SYS_EXPORT_TABLE_01 is:

  /home/oracle/test.dmp

Job "SYS"."SYS_EXPORT_TABLE_01" successfully completed at Mon May  ::  elapsed  ::

[oracle@oracle ~]$ impdp \'/ as sysdba\' directory=dirtmp dumpfile=test.dmp logfile=imptest.log cluster=n remap_schema=scott:zkm TABLE_EXISTS_ACTION=truncate                             

Import: Release 11.2.0.4. - Production on Mon May  :: 

Copyright (c) , , Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4. - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

Master table "SYS"."SYS_IMPORT_FULL_01" successfully loaded/unloaded

Starting "SYS"."SYS_IMPORT_FULL_01":  "/******** AS SYSDBA" directory=dirtmp dumpfile=test.dmp logfile=imptest.log cluster=n remap_schema=scott:zkm TABLE_EXISTS_ACTION=truncate

Processing object type TABLE_EXPORT/TABLE/TABLE

Table "ZKM"."TEST" exists and has been truncated. Data will be loaded but all dependent metadata will be skipped due to table_exists_action of truncate

Processing object type TABLE_EXPORT/TABLE/TABLE_DATA

. . imported "ZKM"."TEST"                                5.125 KB       rows

Job "SYS"."SYS_IMPORT_FULL_01" successfully completed at Mon May  ::  elapsed  ::

Querying zkm.test gives the following result:

SYS@zkm> select * from zkm.test;

        ID

----------

     87848

      3399

     49478

     86510

     85267

     82478

     19478

     80623

      2317

     93170

     89452

        ID

----------

       571

     90558

     86931

     68415

15 rows selected.

As shown, the data has been masked successfully.
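Because fn_numeral ignores its argument and returns a random number, the same source id maps to a different value on every call, so ids that must still match across tables will not. If that matters, a deterministic function is one option. The sketch below is an assumption, not part of the original package: pkg_remap_det and fn_numeral_det are hypothetical names, and it relies on DBMS_UTILITY.GET_HASH_VALUE, which returns a value between base and base + hash_size - 1.

```sql
create or replace package scott.pkg_remap_det
is
    function fn_numeral_det(p_id number) return number;
end;
/

create or replace package body scott.pkg_remap_det
is
    -- The same input always yields the same output, in the range 1..100000.
    -- Different inputs can still collide, so this is not suitable for unique keys as-is.
    function fn_numeral_det(p_id number) return number as
    begin
        if p_id is null then
            return null;
        end if;
        return dbms_utility.get_hash_value(to_char(p_id), 1, 100000);
    end;
end;
/
```

It would be referenced the same way as before, for example remap_data=scott.test.id:scott.pkg_remap_det.fn_numeral_det.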

Export and import.

Since the name column of scott.test2 is of type VARCHAR2, pkg_remap.f_remap_varchar is used.

[oracle@oracle ~]$ expdp \'/ as sysdba\' directory=dirtmp dumpfile=test.dmp logfile=test.log cluster=n tables=scott.test2 remap_data=scott.test2.name:scott.pkg_remap.f_remap_varchar reuse_dumpfiles=y

Export: Release 11.2.0.4. - Production on Mon May  :: 

Copyright (c) , , Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4. - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

Starting "SYS"."SYS_EXPORT_TABLE_01":  "/******** AS SYSDBA" directory=dirtmp dumpfile=test.dmp logfile=test.log cluster=n tables=scott.test2 remap_data=scott.test2.name:scott.pkg_remap.f_remap_varchar reuse_dumpfiles=y

Estimate in progress using BLOCKS method...

Processing object type TABLE_EXPORT/TABLE/TABLE_DATA

Total estimation using BLOCKS method:  KB

Processing object type TABLE_EXPORT/TABLE/TABLE

. . exported "SCOTT"."TEST2"                             5.039 KB        rows

Master table "SYS"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded

******************************************************************************

Dump file set for SYS.SYS_EXPORT_TABLE_01 is:

  /home/oracle/test.dmp

Job "SYS"."SYS_EXPORT_TABLE_01" successfully completed at Mon May  ::  elapsed  ::

[oracle@oracle ~]$ impdp \'/ as sysdba\' directory=dirtmp dumpfile=test.dmp logfile=imptest.log cluster=n remap_schema=scott:zkm TABLE_EXISTS_ACTION=truncate

Import: Release 11.2.0.4. - Production on Mon May  :: 

Copyright (c) , , Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4. - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

Master table "SYS"."SYS_IMPORT_FULL_01" successfully loaded/unloaded

Starting "SYS"."SYS_IMPORT_FULL_01":  "/******** AS SYSDBA" directory=dirtmp dumpfile=test.dmp logfile=imptest.log cluster=n remap_schema=scott:zkm TABLE_EXISTS_ACTION=truncate

Processing object type TABLE_EXPORT/TABLE/TABLE

Table "ZKM"."TEST2" exists and has been truncated. Data will be loaded but all dependent metadata will be skipped due to table_exists_action of truncate

Processing object type TABLE_EXPORT/TABLE/TABLE_DATA

. . imported "ZKM"."TEST2"                               5.039 KB        rows

Job "SYS"."SYS_IMPORT_FULL_01" successfully completed at Mon May  ::  elapsed  ::

Querying zkm.test2 gives the following result:

SYS@zkm> select * from zkm.test2;

NAME

------------------------------------------------------------

之后返

符回的类

型还是字

As shown, the data has been masked successfully.
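The translate mapping in f_remap_varchar only covers the eleven characters present in the sample rows. For real data, where the characters are not known in advance, a function that does not depend on a fixed mapping may be easier to maintain. A minimal sketch, with hypothetical names, that replaces every character with an asterisk while keeping the character count:

```sql
create or replace package scott.pkg_remap_star
is
    function f_mask_all(p_varchar varchar2) return varchar2;
end;
/

create or replace package body scott.pkg_remap_star
is
    -- Returns a string of '*' with the same number of characters as the input.
    -- length(null) is null, so a null input stays null.
    function f_mask_all(p_varchar varchar2) return varchar2 as
    begin
        return rpad('*', length(p_varchar), '*');
    end;
end;
/

-- Quick check against the sample table.
select name, scott.pkg_remap_star.f_mask_all(name) as masked_name
  from scott.test2;
```

Since the asterisk is a single-byte character, the masked value never needs more bytes than the original and still fits the column.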

Method 4

In Oracle 12c, EM (Enterprise Manager) integrates a data masking feature.

Since I have no 12c production environment, this method is not discussed here.
