We started the production database upgrade as planned. The environment is 64-bit Linux. uname output:

Linux 2.6.18-308.el5 #1 SMP Fri Jan 27 17:17:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

The database is at 10.2.0.5.0

and is to be upgraded to 11.2.0.2.0.

The latest PSU had already been applied in advance:

Interim patches (1) :

Patch 16056267 : applied on Thu Oct 03 16:01:47 ICT 2013

Unique Patch ID: 15994298

Patch description: "Database Patch Set Update : 11.2.0.2.10 (16056267)"

Created on 12 Mar 2013, 10:08:56 hrs PST8PDT

Sub-patch 14727315; "Database Patch Set Update : 11.2.0.2.9 (14727315)"

Sub-patch 14275621; "Database Patch Set Update : 11.2.0.2.8 (14275621)"

Sub-patch 13923804; "Database Patch Set Update : 11.2.0.2.7 (13923804)"

Sub-patch 13696224; "Database Patch Set Update : 11.2.0.2.6 (13696224)"

The detailed upgrade steps are covered in another post.

https://www.cnblogs.com/kingle-study/p/11017573.html

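Below are the problems we ran into during the upgrade. Although we had rehearsed the upgrade twice in the test environment, production still produced quite a few problems, and some of the ORA-600 errors left even Oracle's own support engineers at a loss. With persistence, we still got the database upgraded from 10g to 11g on schedule.

1. Errors while fixing invalid components

How to fix invalid components is described in another post.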

http://space.itpub.net/?uid-23718752-action-viewspace-itemid-766215

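Some components in production were INVALID. They were not in use, so there had been no impact so far, but they had to be fixed before the upgrade. While fixing them, we hit an ORA error.

Following the usual procedure, we first removed the invalid JVM, but the following problem appeared during the re-install.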

SQL> /
COMP_ID COMP_NAME STATUS VERSION
------------------------------ -------------------------------------------------- ----------- ------------------------------
CATALOG Oracle Database Catalog Views VALID 10.2.0.5.0
CATPROC Oracle Database Packages and Types VALID 10.2.0.5.0
OWM Oracle Workspace Manager VALID 10.2.0.5.0
XDB Oracle XML Database VALID 10.2.0.5.0
ODM Oracle Data Mining VALID 10.2.0.5.0
CONTEXT Oracle Text VALID 10.2.0.5.0
JAVAVM JServer JAVA Virtual Machine VALID 10.2.0.5.0
XML Oracle XDK REMOVED 10.2.0.5.0
CATJAVA Oracle Database Java Packages VALID 10.2.0.5.0
APS OLAP Analytic Workspace VALID 10.2.0.5.0
XOQ Oracle OLAP API VALID 10.2.0.5.0
COMP_ID COMP_NAME STATUS VERSION
------------------------------ -------------------------------------------------- ----------- ------------------------------
AMD OLAP Catalog VALID 10.2.0.5.0
12 rows selected.
SQL> BEGIN
2 IF dbms_registry.status('CATJAVA') IS NULL THEN
3 RAISE_APPLICATION_ERROR(-20000,
4 'CATJAVA has not been loaded into the database.');
5 END IF;
6 IF dbms_registry.is_loaded('JAVAVM') != 1 THEN
7 RAISE_APPLICATION_ERROR(-20000,
8 'JServer is not operational in the database; ' ||
9 'JServer is required to remove CATJAVA from the database.');
10 END IF;
11 END;
12 /
BEGIN
*
ERROR at line 1:
ORA-20000: JServer is not operational in the database; JServer is required to remove CATJAVA from the database.
ORA-06512: at line 7

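Hitting this late on upgrade night left everyone at a loss. Eventually we found some relevant steps on Metalink:

the components need to be re-marked as loaded and valid in the registry, after which the remove and re-install can be retried.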

SQL> execute sys.dbms_registry.loaded('JAVAVM');
PL/SQL procedure successfully completed.
SQL> execute sys.dbms_registry.loaded('CATJAVA');
PL/SQL procedure successfully completed.
SQL> execute sys.dbms_registry.valid('JAVAVM');
PL/SQL procedure successfully completed.
SQL> execute sys.dbms_registry.valid('CATJAVA');
PL/SQL procedure successfully completed.
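To confirm the registry state after each of these steps, the component listing shown earlier can be reproduced with a query along the following lines. This is a minimal sketch: the post does not show the statement behind the `/`, so the column list, the ordering, and the SQL*Plus formatting are assumptions.

```sql
-- List every registered component with its status and version.
-- Assumption: run from SQL*Plus as a DBA user; the COLUMN widths are illustrative only.
COLUMN comp_id   FORMAT a30
COLUMN comp_name FORMAT a50
COLUMN status    FORMAT a11
COLUMN version   FORMAT a30

SELECT comp_id, comp_name, status, version
  FROM dba_registry
 ORDER BY comp_id;

-- Show only the components that still need attention
-- (for example REMOVED, INVALID, or LOADED).
SELECT comp_id, status
  FROM dba_registry
 WHERE status <> 'VALID';
```

Running the second query before and after the `dbms_registry` calls makes it easy to see whether JAVAVM and CATJAVA actually changed state.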

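2. DB startup upgrade failed

After switching the Oracle home from 10g to 11g, we tried to bring the instance up in STARTUP UPGRADE mode and, unexpectedly, hit the following error.

startup upgrade

ORA-00600: internal error code, arguments: [kccsbck_first], [], [], [], [], [], [], [], [], [], [],

Solution:

This error generally indicates that Oracle believes the database is already mounted by another instance, which is consistent with leftover processes from the other home. In the end, we first made sure no ora_ processes were running under the 11g environment, then switched back to the 10g environment and checked for ora_ processes there; any that remain must be stopped. After switching back to the 11g home, the startup succeeded.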

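3. Timezone issue

The timezone file in 10gR2 (10.2.0.5.0) is version 4, whereas in 11g it is version 14. Make sure it is upgraded to version 14, otherwise many strange problems can appear. This matters most in time-sensitive systems, such as an ordering system, where it has to be correct.

There is a simple way to test whether the timezone upgrade succeeded:

1) select * from v$timezone_file;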

2) exp with consistent=y in table mode
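The first check can be made slightly more informative by also looking at the DST bookkeeping the database keeps about itself. The following is a sketch, assuming an 11.2 database and a DBA session; the `DST_%` properties are not mentioned in the original post.

```sql
-- Timezone file currently loaded by the instance.
-- After a successful upgrade to 11.2.0.2 the post expects VERSION = 14.
SELECT filename, version
  FROM v$timezone_file;

-- DST upgrade state recorded in the database (11.2 and later).
-- DST_PRIMARY_TT_VERSION should match the version above, and
-- DST_UPGRADE_STATE should read NONE once the upgrade has finished.
SELECT property_name, SUBSTR(property_value, 1, 30) AS value
  FROM database_properties
 WHERE property_name LIKE 'DST_%'
 ORDER BY property_name;
```

If `DST_UPGRADE_STATE` shows anything other than `NONE`, the timezone upgrade was started but not completed, which is worth ruling out before investigating export errors.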

I upgraded the timezone following the planned steps, but exp with consistent=y still failed.

Without consistent=y the export worked fine, and import had no problems either.

 exp APP_ROLLBK/APP_ROLLBK file=test.dmp tables=AAAAA  consistent=y
Export: Release 11.2.0.2.0 - Production on Tue Oct 8 08:30:08 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in UTF-8 character set and UTF8 NCHAR character set
About to export specified tables via Conventional Path ...
. . exporting table AAAAA 76 rows exported
EXP-00008: ORACLE error 1466 encountered
ORA-01466: unable to read data - table definition has changed
Export terminated successfully with warnings.

To narrow it down, I tested whether expdp had the same problem. expdp worked normally:

expdp XXXX/XXX DIRECTORY=DATA_PUMP_DIR DUMPFILE=test.dmp tables=AAAAA consistent=y

The rollback deadline was getting closer and the problem was still unsolved. We ran the tests below at Oracle's suggestion, but none of them helped.

What we tried at Oracle's suggestion:

1. We created a new table and exported it using the command below:

exp xxxx consistent=y …

The issue still occurred.

2. We exported from schema1 (without consistent=y), imported into schema2, and exported from schema2 with consistent=y.

The issue still occurred.

We then consulted a senior expert in the company, who offered the following advice:

From Oracle MOS doc # 1011680.102 –

    The scenario you report usually occurs when some change has been made to the System date or FIXED_DATE in the init.ora.

The problem happens because the creation timestamp of objects is greater than the System time.

o If you are using FIXED_DATE, you have to add a few minutes (or a few hours) to the TIME.

DBA: We do not set FIXED_DATE manually; it is left at its default.

o If not, it could be that the System date of the OS is incorrect. If this is the case, you have to do another full export without consistent=y and recreate the database.

DBA: We checked the hardware clock and the system date; they are synchronized:

hwclock;date

Tue 08 Oct 2013 01:18:18 PM ICT -0.267508 seconds

Tue Oct 8 13:18:17 ICT 2013

You can run this query to identify the wrongly-defined objects:

select to_char (created, 'dd-mm-yyyy hh24:mi:ss') "CREATION TIME", object_name, object_type, object_id from dba_objects where created > sysdate;

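DBA: We ran the query above and it returned no rows.

Finally, Oracle support wanted to try re-running the timezone upgrade, the first step of which is purge recyclebin;

After that step, we tried exp consistent=y again and, to our surprise, it worked!

Oracle's final reply regarding the recycle bin was as follows:

Reference note: Complete Checklist for Manual Upgrades to 11gR2 (Doc ID 837570.1)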

Action :

#sqlplus / as sysdba

SQL> purge recyclebin ;

EXP-00008: ORACLE error 1466 encountered

ORA-01466: unable to read data - table definition has changed

Export terminated successfully with warnings.

This is a time-based read consistency error for a database object, such as a table or index.

Either of the following may have happened: The query was parsed and executed with a snapshot older than the time the object was changed. The creation time-stamp of the object is greater than the current system time. This happens, for example, when the system time is set to a time earlier than the creation time of the object.

The output of the pre-upgrade script (utlu112i.sql) recommends that the recycle bin be empty prior to upgrading.
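Given that recommendation, it may be worth checking the recycle bin explicitly before the upgrade rather than discovering it afterwards. The sketch below is an assumption-laden illustration: it uses `PURGE DBA_RECYCLEBIN`, which empties the recycle bin for all users and is broader than the `purge recyclebin` quoted above (that form only purges the current user's recycle bin).

```sql
-- Count recycle bin objects per owner (requires DBA privileges).
SELECT owner, COUNT(*) AS object_count
  FROM dba_recyclebin
 GROUP BY owner
 ORDER BY owner;

-- Empty the recycle bin for all users; must be run as SYSDBA.
-- Note: dropped objects can no longer be restored with FLASHBACK ... TO BEFORE DROP.
PURGE DBA_RECYCLEBIN;

-- Verify: expected to return 0 if nothing was dropped in the meantime.
SELECT COUNT(*) AS remaining_objects
  FROM dba_recyclebin;
```

Running the first query both before the upgrade and again before the first `exp consistent=y` test would have shown whether leftover recycle bin objects were present in this case.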

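4. exp with consistent=y in schema mode

Once the third problem was solved, the fourth resolved itself, and we continued with the remaining steps. Before the fix, the schema-mode export had been failing like this: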

 exp prdrefwork/dba4true file=test.dmp consistent=y
Export: Release 11.2.0.2.0 - Production on Tue Oct 8 07:58:49 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in UTF-8 character set and UTF8 NCHAR character set
. exporting pre-schema procedural objects and actions
EXP-00008: ORACLE error 1466 encountered
ORA-01466: unable to read data - table definition has changed
ORA-06512: at "SYS.DBMS_AW_EXP", line 818
ORA-06512: at line 1
EXP-00083: The previous problem occurred when calling SYS.DBMS_AW_EXP.schema_info_exp
. exporting foreign function library names for user xxxxx
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions for user xxxx
About to export xxxxxx objects ..

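5. Errors in the alert log

With the upgrade basically complete, I checked the alert log and found the errors below. I was a little nervous at first, but after ruling out other possibilities it came down to the timezone upgrade; once that was fixed, these errors no longer appeared.

Archived Log entry 6765 added for thread 1 sequence 6728 ID 0xb8c6d509 dest 1:
Tue Oct 08 04:51:01 2013
ORA-1466 (RO Tx began: 10/07/2013 21:51:00, Last DDL: 10/08/2013 02:07:25, Curr Time: 10/07/2013 21:51:00)
Tue Oct 08 06:04:15 2013
The value (158) of MAXTRANS parameter ignored.

One more thing to watch: if the 10g database has external tables, take care when upgrading to 11g. The external tables may well break (for example, because of a directory change), which causes Oracle's internal jobs to raise many errors and fail.

6. ORA-00020 raised after the DB upgrade

The upgrade work was basically done, but before I could rest, the app team reported that the database had run out of connections. The alert log showed the entries below, and when I tried to log in as sysdba myself, even that failed because no processes were available.

Archived Log entry 6793 added for thread 1 sequence 6756 ID 0xb8c6d509 dest 1:
Wed Oct 09 00:31:24 2013
ORA-00020: maximum number of processes 5000 exceeded
ORA-20 errors will not be written to the alert log for the next minute. Please look at trace files to see all the ORA-20 errors.
Wed Oct 09 00:32:33 2013
ORA-00020: maximum number of processes 5000 exceeded
ORA-20 errors will not be written to the alert log for the next minute. Please look at trace files to see all the ORA-20 errors.
Wed Oct 09 00:35:03 2013

After waiting a while I managed to log in and found a large number of INACTIVE sessions, all with a null sql_id, which shows that those sessions were no longer executing any SQL or waiting on a transaction. Their program all pointed to a single client; from past experience, this means the application was not releasing its connections. As a temporary workaround, we had them kill all the inactive sessions coming from that client so that the other flows could run.

After the kill, the system was basically normal for a few hours, but the problem remained a time bomb, and nobody felt at ease without knowing the root cause. A few hours later, my monitoring showed the session count suddenly jump from 800 to more than 3000, and it kept climbing.

As the session count had jumped from 800 to 3458, I quickly emailed the team and asked them to check which jobs/daemons were running during that period.

The customer's feedback was vague: they said many jobs had been started, which made the problem hard to pin down.

That made things hard for us, but fortunately I have a development background. I got the code from them, set aside the irrelevant parts, and worked through it bit by bit. There were about 40 relevant jobs in total; when I reached the 20th, I found that its connection handling was clearly wrong: it opened a large number of connections in a loop without closing them properly. That pinpointed the problem.

This put us back on the front foot. The customer, somewhat embarrassed, hurried to fix the issue.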

STATUS          CNT
-------- ----------
KILLED 122
ACTIVE 38
INACTIVE 3298
----------
sum 3458
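A status breakdown like the one above, plus the "which client is leaking" view described earlier, can be obtained with queries along these lines. This is a sketch, not the statements actually used that night: the post does not show them, and the filter on `type = 'USER'` and the look at `v$resource_limit` are assumptions added for illustration.

```sql
-- Session count per status, as in the table above.
SELECT status, COUNT(*) AS cnt
  FROM v$session
 GROUP BY status
 ORDER BY cnt DESC;

-- Which clients hold inactive sessions that are not running any SQL.
-- A single machine/program dominating this list points at a connection leak.
SELECT machine, program, COUNT(*) AS cnt
  FROM v$session
 WHERE status = 'INACTIVE'
   AND sql_id IS NULL
   AND type = 'USER'
 GROUP BY machine, program
 ORDER BY cnt DESC;

-- Current and peak usage against the limits behind ORA-00020.
SELECT resource_name, current_utilization, max_utilization, limit_value
  FROM v$resource_limit
 WHERE resource_name IN ('processes', 'sessions');
```

Sampling the first and third queries on a schedule is one way to catch a jump such as 800 to 3458 while it is happening and to correlate it with the jobs running at that time.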

7. Re-create the user_synonyms view

Reposted from: https://cloud.tencent.com/developer/column/2127

Will be removed on request in case of infringement.
