19 December, 2007

Using an Index for a NOT EQUALS Query

Laurent Schneider has posted an example of when you _can_ use an Index to drive a NOT EQUALS query.
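As an illustration of the general idea (a minimal sketch using a hypothetical table T -- not necessarily Laurent's exact case), a not-equals predicate can be opened up into two ranges that an index can serve, provided COL is NOT NULL :

select * from t where col != 'X' ;             -- typically full-scans
select * from t where col < 'X' or col > 'X' ; -- can use an index range scan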

Always Explicitly Convert DataTypes

Good programming practice [whether in Oracle or any other software environment] is to NOT rely on implicit conversion of datatypes, particularly if you are working with multiple languages or going through an upgrade to a new version.
Tom Kyte always recommends Explicit Conversion of DataTypes. However, he acknowledges an occasion when he did not follow his own advice -- and was embarrassed in front of his audience.
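A minimal sketch (hypothetical table and columns) of the danger : if EMP_NO is a VARCHAR2 column, the first query below is silently rewritten as TO_NUMBER(EMP_NO) = 123, which disables a normal index on EMP_NO and can even raise ORA-01722 on bad data.

select * from emp_list where emp_no = 123 ;    -- implicit conversion
select * from emp_list where emp_no = '123' ;  -- explicit, index-friendly
select * from emp_list
where hire_date = to_date('01-10-2007','DD-MM-YYYY') ; -- never rely on NLS defaults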

27 November, 2007

Are ANALYZE and DBMS_STATS also DDLs ?

Most of us know that DDLs always issue an implicit commit. That means that any transaction that was active before the DDL was issued gets COMMITted {even if the DDL itself subsequently fails}. Therefore, we ought to be [very] careful when mixing DDLs with DMLs (ie normal INSERT/UPDATE/DELETE transactions).

Common understanding is that CREATE, ALTER and DROP statements are DDLs. However, we might not always realise that implicit commits might be issued by other statements as well --- eg by ANALYZE or an EXECUTE of DBMS_STATS !

Here's a very short example :


SQL>
SQL> REM Demonstrate that ANALYZE and DBMS_STATS calls
SQL> rem like other DDLs also cause implicit commits
SQL>
SQL>
SQL> create table test_stats_ddl (col1 varchar2(5));
Table created.
SQL>
SQL> insert into test_stats_ddl values ('a');
1 row created.
SQL>
SQL> rollback ;
Rollback complete.
SQL>
SQL> select * from test_stats_ddl ;
no rows selected
SQL>
SQL> REM the above should show that the INSERT did get Rolledback
SQL>
SQL> rem lets try another transaction
SQL>
SQL> insert into test_stats_ddl values ('b');
1 row created.
SQL>
SQL> analyze table test_stats_ddl compute statistics;
Table analyzed.
SQL>
SQL> rollback;
Rollback complete.
SQL>
SQL> select * from test_stats_ddl;
COL1
-----
b
SQL>
SQL> REM did the INSERT of 'b' get Rolledback ? No !
SQL> rem so the analyze did an implicit commit
SQL>
SQL> REM what about dbms_stats ?
SQL>
SQL> insert into test_stats_ddl values ('c');
1 row created.
SQL>
SQL> exec dbms_stats.gather_table_stats('','test_stats_ddl');
PL/SQL procedure successfully completed.
SQL>
SQL> rollback;
Rollback complete.
SQL>
SQL> select * from test_stats_ddl;
COL1
-----
b
c
SQL>
SQL> select last_analyzed, num_rows from user_tables where table_name = 'TEST_STATS_DDL';
LAST_ANAL NUM_ROWS
--------- ----------
27-NOV-07 2
SQL>
SQL> REM so, once again, the insert did get committed
SQL>
SQL> REM let's try some "ordinary" DDL
SQL>
SQL> insert into test_stats_ddl values ('d');
1 row created.
SQL>
SQL> create table test_2 as select * from test_stats_ddl where 1=2;
Table created.
SQL>
SQL> rollback;
Rollback complete.
SQL>
SQL> select * from test_stats_ddl;
COL1
-----
b
c
d
SQL>
SQL> select * from test_2 ;
no rows selected
SQL>
SQL> REM it looks like the create table also did an implicit commit !

18 October, 2007

Flush Buffer_Cache -- when Tracing doesn't show anything happening

I believe that the "ALTER SYSTEM FLUSH BUFFER_CACHE" command is very sparingly used. After all, who wants to flush out active (or "hot") blocks and have to incur physical reads all over again ? I guess the expectation is that this command is used in Benchmarking only.
However, recently, I was able to use this command to diagnose a seemingly hanging query. V$SESSION_WAIT wasn't reporting any new wait (it was reporting only the previous wait -- 'SQL*Net more data from client' -- incurred in the session before the query began executing), and a 10046 level 8 trace showed nothing after the PARSE call. That left us puzzled. Why wasn't the 10046 trace showing any activity ?
I then started querying V$SESSTAT and found that "consistent gets" was slowly increasing. An explain plan showed Nested Loop joins. The two tables and indexes involved were very small.
Aah ! "very small tables", I thought. Most likely they are in the Buffer Cache.
I issued an ALTER SYSTEM FLUSH BUFFER_CACHE and all of a sudden we saw a few dozen 'db file sequential read' entries in the 10046 trace file. After that short spurt, the trace file went back to "sleep". V$SESSION_WAIT now showed that the last wait was 'db file sequential read'.

One case where the FLUSH BUFFER_CACHE was the right thing to do. No one begrudged having to reload the SGA --- after we had identified the bad SQL and spent a few hours on it, we tuned it so that a 2 hour batch job completed in 3 minutes.
We finally "tuned" the INSERT..AS SELECT.. by using "_complex_view_merging"=FALSE, which we then replaced with NO_MERGE Hints in the individual subselects inside the query. See the sketch below.

Two things had held us up for a short while :
a) 'SQL*Net more data from client' in V$SESSION_WAIT, with a continuously increasing SECONDS_IN_WAIT, seemed to indicate that there was a network problem -- this was a wrong interpretation, as the STATE was actually "WAITED SHORT TIME", so the continuously increasing SECONDS_IN_WAIT was only a "counter" since the last wait had really occurred [for another discussion on how we might misread V$SESSION_WAIT, see Tanel Poder's blog entry]
b) The 10046 trace file wasn't indicating any activity -- there were no entries after the PARSE call for that particular SQL.
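For reference, a minimal sketch of the diagnostic sequence (the SID of 12 is hypothetical) :

REM Poll the "hung" session's logical reads ; if the counter keeps climbing
REM while the 10046 trace is silent, the session is spinning on cached blocks.
select n.name, s.value
from v$sesstat s, v$statname n
where s.statistic# = n.statistic#
and s.sid = 12
and n.name = 'consistent gets' ;

REM Flushing the cache forces physical reads, which then show up as
REM 'db file sequential read' waits in the trace.
alter system flush buffer_cache ;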

13 October, 2007

Inserts waiting on Locks ? Inserts holding Locks ?

When could we have a situation where a user says that his transaction is "slow" when it is merely inserting rows into a table -- an insert which used to complete in less than a minute but has now been "hanging" for a long time ? (He runs an "INSERT INTO TABLE A SELECT * FROM ....") You'd think that if the INSERT uses a SELECT query to build the data, it is most likely the SELECT that is slow and that you need to tune the SELECT (and worse, where the SELECT is based on a distributed query, you try to tune joins across databases).

But stop and think. If this INSERT was always much faster ("till yesterday !" he says) and is "hanging" now, you should look at where the user's session is waiting.
Here's a case study (which I developed and tested on my Home PC) where an INSERT might wait ("forever") because it is actually attempting to insert a duplicate value (where there is a Unique Index) and is waiting on a "lock". Inserts should not need to wait on Locks. Moreover, if the duplicate value already exists in the table, the INSERT should immediately fail with an ORA-00001 error. Yet we have a case where the INSERT waits on a lock !

On the flip side, you expect that an INSERT does not need to "hold a lock", because it is not updating data that someone else might attempt to update -- it is, after all, "creating new data that no one else knows about yet and, therefore, no one else needs to reference yet !".
In the case study, you see how Session 43 seems to hold a Lock and Session 45 does actually wait on a Lock. I've also suggested how you can extend the case study by creating a session where the second INSERT is based on a SELECT, and you (or your DBA) would attempt to tune the SELECT if you don't look at session waits.
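A minimal two-session sketch (hypothetical table T, not the case study itself) of how this arises :

REM Session 1 :
create table t (id number) ;
create unique index t_u1 on t(id) ;
insert into t values (1) ;   -- not yet committed

REM Session 2 :
insert into t values (1) ;   -- "hangs" on a TX enqueue : it fails with
                             -- ORA-00001 only if Session 1 commits, and
                             -- succeeds if Session 1 rolls back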

21 September, 2007

More on Bind Variable Peeking and Execution Plans

Here's how Tom Kyte used the example of the mystery of why a query with SQL_TRACE enabled seemed to perform better than the same query by itself --- because it was being parsed as two different queries, and Bind Variable Peeking saw different values for each.
In response to the many comments about bind variable peeking, he further reviews some of the options.

10 September, 2007

ATOMIC_REFRESH=>FALSE causes TRUNCATE and INSERT behaviour in 10g ??

The 10g Data Warehousing documentation, when speaking of maintaining MViews, inadvertently seems to imply that even FAST Refreshes can benefit from TRUNCATE.
Immediately after the text about the ATOMIC_REFRESH parameter (" If set to FALSE, Oracle can optimize refresh by using parallel DML and truncate DDL on a materialized views."), come the words :
For example, to perform a fast refresh on the materialized view cal_month_sales_mv, the DBMS_MVIEW package would be called as follows:
DBMS_MVIEW.REFRESH('CAL_MONTH_SALES_MV', 'F', '', TRUE, FALSE, 0,0,0, FALSE);
seeming to imply that even a FAST Refresh can use a TRUNCATE and INSERT -- which, certainly, cannot be the case.

Furthermore, when I tested (10.2.0.1 on Windows) refreshing 1 and 2 MVs together, I found that the TRUNCATE would occur only if a single MV was being refreshed. If 2 MVs were being refreshed together, even with ATOMIC_REFRESH set to FALSE, I saw both undergoing DELETEs and INSERTs.

These were my test SQLs :

set echo on
set timing on
spool Test_MV_Complete_Refresh
drop materialized view my_dummy_mv;
drop materialized view my_mv_fast;
drop materialized view my_mv_complete;
drop materialized view my_mv2_complete;
drop table my_source_tab purge;
create table my_source_tab as select * from dba_objects where 1=2;
alter table my_source_tab nologging;
insert /*+ APPEND */ into my_source_tab select * from dba_objects;
commit;
delete my_source_tab where object_id is null ;
alter table my_source_tab modify (object_id not null);
create unique index my_source_tab_u1 on my_source_tab(object_id);
alter table my_source_tab add primary key (object_id);
create snapshot log on my_source_tab ;
create materialized view my_mv_fast
refresh fast on demand
as select * from my_source_tab;
create materialized view my_mv_complete
refresh complete on demand
as select * from my_source_tab;
create materialized view my_mv2_complete
refresh complete on demand
as select * from my_source_tab;
create materialized view my_dummy_mv
refresh complete
as select * from dual
where dummy='X' and rownum=1
/
exec dbms_mview.refresh('MY_MV_FAST','F');
exec dbms_mview.refresh('MY_MV_COMPLETE','C');
exec dbms_mview.refresh('MY_MV_COMPLETE,MY_DUMMY_MV','C');
exec dbms_mview.refresh('MY_MV_COMPLETE,MY_MV2_COMPLETE','C');
alter session set tracefile_identifier='T_M_C_R';
alter session set sql_trace=TRUE;
exec dbms_mview.refresh('MY_MV_FAST','F',atomic_refresh=>FALSE);
exec dbms_mview.refresh('MY_MV_COMPLETE','C',atomic_refresh=>FALSE);
exec dbms_mview.refresh('MY_MV_COMPLETE,MY_DUMMY_MV','C',atomic_refresh=>FALSE);
exec dbms_mview.refresh('MY_MV_COMPLETE,MY_MV2_COMPLETE','C',atomic_refresh=>FALSE);
spool off
exit

28 August, 2007

When "COST" doesn't indicate true load

Here is an example where reading "COST" alone can be misleading.


SQL>
SQL> REM ===========================================================================
SQL> REM Discover the table TEST_TABLE
SQL> REM This is a copy of DBA_OBJECTS, inserted twice
SQL>
SQL> desc test_table;
Name Null? Type
----------------------------------------- -------- ----------------------------
OWNER VARCHAR2(30)
OBJECT_NAME VARCHAR2(128)
SUBOBJECT_NAME VARCHAR2(30)
OBJECT_ID NUMBER
DATA_OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(19)
CREATED DATE
LAST_DDL_TIME DATE
TIMESTAMP VARCHAR2(19)
STATUS VARCHAR2(7)
TEMPORARY VARCHAR2(1)
GENERATED VARCHAR2(1)
SECONDARY VARCHAR2(1)
SQL>
SQL> column index_name format a30
SQL> column column_name format a30
SQL> column column_position format 9999 hea 'Posn'
SQL>
SQL> REM Identify Indexes present
SQL> select index_name, column_name, column_position
2 from user_ind_columns
3 where table_name = 'TEST_TABLE'
4 order by index_name, column_position
5 /
INDEX_NAME COLUMN_NAME Posn
------------------------------ ------------------------------ -----
TEST_TABLE_NDX_1 CREATED 1
Elapsed: 00:00:00.39
SQL>
SQL> exec dbms_stats.gather_table_stats('HEMANT','TEST_TABLE',cascade=>TRUE);
Elapsed: 00:00:42.51
SQL> select count(*) from test_table;
COUNT(*)
----------
103114
Elapsed: 00:00:00.65
SQL> select blocks, num_rows from user_tables where table_name = 'TEST_TABLE';
BLOCKS NUM_ROWS
---------- ----------
1460 103114
Elapsed: 00:00:00.14
SQL> select leaf_blocks, num_rows from user_indexes where index_name = 'TEST_TABLE_NDX_1';
LEAF_BLOCKS NUM_ROWS
----------- ----------
273 103114
Elapsed: 00:00:00.17
SQL>
SQL> REM ===========================================================================
SQL> REM Get the query execution plan for MIN queries
SQL> set autotrace on
SQL>
SQL> select min(created) from test_table
2 /
MIN(CREAT
---------
30-AUG-05
Elapsed: 00:00:00.12
Execution Plan
----------------------------------------------------------
Plan hash value: 108964483
-----------------------------------------------------------------------------------------------
Id   Operation                    Name              Rows   Bytes  Cost (%CPU)  Time
-----------------------------------------------------------------------------------------------
  0  SELECT STATEMENT                                   1      8     328   (3)  00:00:04
  1   SORT AGGREGATE                                    1      8
  2    INDEX FULL SCAN (MIN/MAX)  TEST_TABLE_NDX_1   103K   805K
-----------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
420 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> REM --> that was only 2 Block Gets
SQL>
SQL> REM ===========================================================================
SQL> REM Get the query execution plan for MAX queries
SQL> set autotrace on
SQL>
SQL> select max(created) from test_table
2 /
MAX(CREAT
---------
12-AUG-07
Elapsed: 00:00:00.17
Execution Plan
----------------------------------------------------------
Plan hash value: 108964483
-----------------------------------------------------------------------------------------------
Id   Operation                    Name              Rows   Bytes  Cost (%CPU)  Time
-----------------------------------------------------------------------------------------------
  0  SELECT STATEMENT                                   1      8     328   (3)  00:00:04
  1   SORT AGGREGATE                                    1      8
  2    INDEX FULL SCAN (MIN/MAX)  TEST_TABLE_NDX_1   103K   805K
-----------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
420 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> REM --> Again, only 2 Block Gets
SQL> REM ===========================================================================
SQL> REM Get Min and Max together
SQL> set autotrace on
SQL>
SQL> select min(created), max(created) from test_table
2 /
MIN(CREAT MAX(CREAT
--------- ---------
30-AUG-05 12-AUG-07
Elapsed: 00:00:02.28
Execution Plan
----------------------------------------------------------
Plan hash value: 711311523
---------------------------------------------------------------------------------
Id   Operation           Name        Rows   Bytes  Cost (%CPU)  Time
---------------------------------------------------------------------------------
  0  SELECT STATEMENT                   1       8     328   (3)  00:00:04
  1   SORT AGGREGATE                    1       8
  2    TABLE ACCESS FULL  TEST_TABLE  103K    805K     328   (3)  00:00:04
---------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1445 consistent gets
1440 physical reads
0 redo size
490 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> REM --> That took 1,445 Block Gets -- at the same "COST" of 328 !
SQL>
SQL>
SQL> REM ===========================================================================
SQL> REM Try using a UNION
SQL> set autotrace on
SQL>
SQL> select min(created) Min_date, NULL Max_date from test_table
2 union
3 select NULL Min_date, max(created) Max_date from test_table
4 /
MIN_DATE MAX_DATE
--------- ---------
30-AUG-05
12-AUG-07
Elapsed: 00:00:00.25
Execution Plan
----------------------------------------------------------
Plan hash value: 1754591915
-----------------------------------------------------------------------------------------------------------
Id   Operation                     Name              Rows   Bytes  TempSpc  Cost (%CPU)  Time
-----------------------------------------------------------------------------------------------------------
  0  SELECT STATEMENT                                    2     16            1426  (52)  00:00:18
  1   SORT UNIQUE                                        2     16    6513K   1426  (52)  00:00:18
  2    UNION-ALL
  3     SORT AGGREGATE                                   1      8             713   (4)  00:00:09
  4      INDEX FULL SCAN (MIN/MAX)  TEST_TABLE_NDX_1  103K   805K
  5     SORT AGGREGATE                                   1      8             713   (4)  00:00:09
  6      INDEX FULL SCAN (MIN/MAX)  TEST_TABLE_NDX_1  103K   805K
-----------------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
520 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
2 rows processed
SQL>
SQL> REM --> Only 4 Block Gets, although the "COST" is 1,426
SQL>
SQL> REM ===========================================================================
SQL> REM Try using Inline Views to get the two values together in single row
SQL> set autotrace on
SQL>
SQL> select a.Min_Date, b.Max_date from
2 (select min(created) Min_date from test_table) a,
3 (select max(created) Max_date from test_table) b
4 /
MIN_DATE MAX_DATE
--------- ---------
30-AUG-05 12-AUG-07
Elapsed: 00:00:00.06
Execution Plan
----------------------------------------------------------
Plan hash value: 1527082337
-------------------------------------------------------------------------------------------------
Id   Operation                     Name              Rows   Bytes  Cost (%CPU)  Time
-------------------------------------------------------------------------------------------------
  0  SELECT STATEMENT                                    1     18     655   (2)  00:00:08
  1   NESTED LOOPS                                       1     18     655   (2)  00:00:08
  2    VIEW                                              1      9     328   (3)  00:00:04
  3     SORT AGGREGATE                                   1      8
  4      INDEX FULL SCAN (MIN/MAX)  TEST_TABLE_NDX_1  103K   805K
  5    VIEW                                              1      9     328   (3)  00:00:04
  6     SORT AGGREGATE                                   1      8
  7      INDEX FULL SCAN (MIN/MAX)  TEST_TABLE_NDX_1  103K   805K
-------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
482 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> REM --> Only 4 Block Gets, although the "COST" is 655
SQL>
SQL>
SQL> spool off

24 August, 2007

NULLs are not Indexed, Right ? NOT !

Update 09-Feb-08 :
See MetaLink Note#551754.1 "Query with ROWNUM Condition On a Table Which Has Index on Column "Constant" Returns No Rows" for a caveat. (Bug#6737251). The recommendation there is to NOT use a Constant in an Index.

We all know very well that NULLs are not indexed and that a query for NULL values in a column cannot use the index on that column -- right ? However, there is a way to index a column with NULL values.
I read about this method a few times on different blogs in the past week. So I decided to create my own test case.
What I do is to add a CONSTANT to the column being indexed. Since the Constant is NOT NULL but is the second column in the index, the NULLs of our actual column have to get included in the index.
This method fails on 10.2.0.1 if the constant is a "1". See the references in my example below :

SQL>
SQL> REM ===========================================================================
SQL> REM Discover the table TEST_TABLE and how many NULLs does the STATUS column have
SQL> REM -- FYI, this is a prebuilt copy of DBA_OBJECTS
SQL> REM where I have updated STATUS to NULL for 29 rows
SQL>
SQL> desc test_table;
Name Null? Type
----------------------------------------- -------- ----------------------------
OWNER VARCHAR2(30)
OBJECT_NAME VARCHAR2(128)
SUBOBJECT_NAME VARCHAR2(30)
OBJECT_ID NUMBER
DATA_OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(19)
CREATED DATE
LAST_DDL_TIME DATE
TIMESTAMP VARCHAR2(19)
STATUS VARCHAR2(7)
TEMPORARY VARCHAR2(1)
GENERATED VARCHAR2(1)
SECONDARY VARCHAR2(1)
SQL>
SQL> column index_name format a30
SQL> column column_name format a30
SQL> column column_position format 9999 hea 'Posn'
SQL>
SQL> REM The index that is currently present is not usable in our test queries later
SQL> select index_name, column_name, column_position
2 from user_ind_columns
3 where table_name = 'TEST_TABLE'
4 order by index_name, column_position
5 /
INDEX_NAME COLUMN_NAME Posn
------------------------------ ------------------------------ -----
TEST_TBL_NDX LAST_DDL_TIME 1
Elapsed: 00:00:00.21
SQL>
SQL> select decode(status,'VALID','VALID',NULL,'Is Null'),
2 count(*)
3 from test_table
4 group by decode(status,'VALID','VALID',NULL,'Is Null')
5 /
DECODE( COUNT(*)
------- ----------
VALID 51528
Is Null 29
Elapsed: 00:00:00.17
SQL>
SQL> REM ===========================================================================
SQL> REM Get the query execution plan for queries where status is NULL
SQL> exec dbms_stats.gather_table_stats(ownname=>'HEMANT',tabname=>'TEST_TABLE',cascade=>TRUE);
Elapsed: 00:00:04.43
SQL>
SQL> set autotrace on
SQL>
SQL> select count(*) from test_table where status is NULL
2 /
COUNT(*)
----------
29
Elapsed: 00:00:00.31
Execution Plan
----------------------------------------------------------
Plan hash value: 711311523

---------------------------------------------------------------------------------
Id    Operation           Name        Rows   Bytes  Cost (%CPU)  Time
---------------------------------------------------------------------------------
   0  SELECT STATEMENT                   1       6     165   (3)  00:00:02
   1   SORT AGGREGATE                    1       6
*  2    TABLE ACCESS FULL  TEST_TABLE   16      96     165   (3)  00:00:02
---------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("STATUS" IS NULL)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
714 consistent gets
405 physical reads
0 redo size
411 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> REM We saw above that it was doing an FTS and 714 block gets
SQL> REM ===========================================================================
SQL>
SQL> REM ===========================================================================
SQL> REM Create an Index on status and try again
SQL> set autotrace off
SQL>
SQL> create index test_table_status_ndx1 on test_table(status)
2 /
Elapsed: 00:00:00.50
SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>'HEMANT',tabname=>'TEST_TABLE',cascade=>TRUE);
Elapsed: 00:00:03.71
SQL>
SQL> set autotrace on
SQL>
SQL> select count(*) from test_table where status is NULL
2 /
COUNT(*)
----------
29
Elapsed: 00:00:00.29
Execution Plan
----------------------------------------------------------
Plan hash value: 711311523

---------------------------------------------------------------------------------
Id    Operation           Name        Rows   Bytes  Cost (%CPU)  Time
---------------------------------------------------------------------------------
   0  SELECT STATEMENT                   1       6     165   (3)  00:00:02
   1   SORT AGGREGATE                    1       6
*  2    TABLE ACCESS FULL  TEST_TABLE   43     258     165   (3)  00:00:02
---------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("STATUS" IS NULL)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
714 consistent gets
709 physical reads
0 redo size
411 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> REM Still doing an FTS and 714 block gets because index test_table_status_ndx1 does
SQL> REM not capture NULLs !
SQL> REM ===========================================================================
SQL>
SQL> REM ===========================================================================
SQL> REM Create a "concatenated" index on status and try again
SQL> REM -- my concatenated index, as if an FBI, uses a constant "A"
SQL> REM why did I use "A" instead of 1 ?
SQL> REM my first few tests with "1" kept reporting ORA-600 [qkssao2s1] errors on dbms_stats
SQL> REM -- although the Execution Plan WAS working
SQL> REM If I used "2" as the constant, I got worse errors and disconnections
SQL> REM So I decided to use an Alpha character, as STATUS is VARCHAR2
SQL> REM For more information on qkssao2s1 and constant "1" see Note#5005939.8
SQL>
SQL> create index test_table_status_ndx2 on test_table(status,'A')
2 /
Elapsed: 00:00:00.48
SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>'HEMANT',tabname=>'TEST_TABLE',cascade=>TRUE);
Elapsed: 00:00:05.39
SQL>
SQL> set autotrace on
SQL>
SQL> select count(*) from test_table where status is NULL
2 /
COUNT(*)
----------
29
Elapsed: 00:00:00.64
Execution Plan
----------------------------------------------------------
Plan hash value: 180495830

--------------------------------------------------------------------------------------------
Id    Operation          Name                    Rows   Bytes  Cost (%CPU)  Time
--------------------------------------------------------------------------------------------
   0  SELECT STATEMENT                              1       6       2   (0)  00:00:01
   1   SORT AGGREGATE                               1       6
*  2    INDEX RANGE SCAN  TEST_TABLE_STATUS_NDX2   45     270       2   (0)  00:00:01
--------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("STATUS" IS NULL)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
411 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> REM Aah ! Now we see the Index test_table_status_ndx2 being used to get NULLs
SQL> REM on status. NULLs *are* in this index because status is only the leading column
SQL> REM of the index and the succeeding column is a not null
SQL> REM The performance gain comes from having to do 2 block gets instead of 714
SQL> REM ===========================================================================
SQL> REM ===========================================================================
SQL> set autotrace off
SQL> drop index test_table_status_ndx1;
Elapsed: 00:00:00.59
SQL> drop index test_table_status_ndx2;
Elapsed: 00:00:00.31
SQL> spool off

18 August, 2007

How does the CBO select an Index ?

Don Seiler has posted his experience with a particularly troublesome issue with the CBO.
Why was it selecting "FOO_IDX" over "FOO_PK" ? See
die Seilerwerks: Dr. Statslove or: How I Learned to Stop Guessing and Love the 10053 Trace
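For reference, a minimal sketch of capturing a 10053 (CBO) trace for one hard parse (table and predicate are hypothetical) :

alter session set events '10053 trace name context forever, level 1' ;
select count(*) from foo where bar = 1 ;   -- must be a hard parse to be traced
alter session set events '10053 trace name context off' ;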

13 August, 2007

NLS_DATE_FORMAT


Friday afternoon, I was working with an application team to "tune" the most expensive SQL on the system -- an hourly MV refresh query.
The query was re-written (excluding data that was not needed) and ran much faster. Next, I ran the refresh from the database server just prior to submitting a job for the hourly refresh. And then I got a shock. The query ran for 39 minutes from my Unix telnet session although I had just seen it complete in less than 3 minutes on the desktop. We tested the same thing twice as "DBMS_MVIEW.REFRESH" calls.

I went back to the developer's desktop. He was using a non-Oracle client and I suspected that it was passing some directives ("alter session" commands). I couldn't find any such directives in the setup. I then looked at parameters with a SHOW PARAMETER. There was no difference between the listing from the desktop and the one from the Unix telnet session. I went back to my PC, fired up a SQLPlus client and tried the DBMS_MVIEW.REFRESH. Sure enough, it ran in less than 2 minutes.

I traced the executions from the server and the client. Rather than looking at the EXPLAIN PLAN, I looked at the ROW SOURCE OPERATIONS. I found that there was a difference in the actual execution. (If I were to run tkprof with the explain option, the server-side Explain would still use the server-side NLS_DATE_FORMAT on both trace files, I suspect.)

I knew that I should suspect NLS_DATE_FORMAT because the server-side init.ora had a date format of DD-MON-YYYY HH24:MI:SS. But I saw no Registry settings on either desktop that would have set NLS_DATE_FORMAT to DD-MON-RR. Logically, the server-side format should apply (and that is what SHOW PARAMETER also seemed to show -- I saw the same value from both the client and the telnet session).

We then tested DBMS_JOB.SUBMIT and DBMS_JOB.RUN. Again (and not too surprisingly now !), the execution of the job submitted from the client was faster than the same job definition (but a different job id) from the server. NLS_ENV in DBA_JOBS showed that NLS_DATE_FORMAT for the job submitted from the client was DD-MON-RR.

So that meant :
a. The Windows clients (whether the non-Oracle tool or my desktop's SQLPlus) were using DD-MON-RR and overriding the server-side NLS_DATE_FORMAT. Since I could not find a client-side registry entry that explicitly sets this, it would mean that the client *defaults* to DD-MON-RR
b. SHOW PARAMETER doesn't always show the current values being used by the session

And, quite obviously, different NLS_DATE_FORMATs meant that the results of the query could very well vary ! That is quite important to know.
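A minimal sketch of a check that would have shortened the investigation : SHOW PARAMETER reports the instance parameter, while the session's effective setting is visible in NLS_SESSION_PARAMETERS :

select parameter, value
from nls_session_parameters
where parameter = 'NLS_DATE_FORMAT' ;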


Back at home during the weekend, I decided to test all of this again.

First, I create a "Test" Table.

Here I create a table with data from DBA_OBJECTS and modify the "LAST_DDL_TIME" values so that I have a date column with very many different date values.
=================================================================
SQL>
SQL> drop table hemant.test_table ;

Table dropped.

SQL>
SQL> create table hemant.test_table nologging as
2 select * from dba_objects
3 union
4 select * from dba_objects
5 /

Table created.

SQL>
SQL> alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';

Session altered.

SQL>
SQL> update hemant.test_table
2 set last_ddl_time = (last_ddl_time-365)+rownum/100;

51557 rows updated.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> create index hemant.test_tbl_ndx on hemant.test_table(last_ddl_time) nologging;

Index created.

SQL>
SQL> exec dbms_stats.gather_table_stats(ownname=>'HEMANT',tabname=>'TEST_TABLE',cascade=>TRUE);

PL/SQL procedure successfully completed.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> select to_char(min(last_ddl_time),'DD-MON-YYYY') from test_table ;

TO_CHAR(MIN
-----------
29-OCT-2001

SQL> select to_char(max(last_ddl_time),'DD-MON-YYYY') from test_table ;

TO_CHAR(MAX
-----------
17-OCT-2007

SQL>
SQL>

=================================================================

Next, I run my test script.

I find that :
a) even though the server-side NLS_DATE_FORMAT is DD-MON-YYYY ..., the DBMS_JOB call actually inserts the setting of DD-MON-RR when I submit my first MV Refresh Job. I am quite sure that I have not set this in my Windows Registry.
b) when I query by a two-digit year, Oracle does a Full Table Scan and retrieves all the records -- thus it is NOT behaving with "RR" ??
c) if I explicitly change my NLS_DATE_FORMAT to DD-MON-RR, the same query converts to an Index Range Scan and finds only the specific rows that match the specified date range.

What is the Moral Of The Story ?
Be careful with NLS_DATE_FORMATs. Pay particular attention to whether your client is defaulting to DD-MON-RR when your server is set to some other value. There are application scenarios where you would *want* different NLS_DATE_FORMATs (eg in a global database with clients using different formats). Remember that data entry and translation are handled by the NLS_% parameters. Verify that you are really seeing the data that you want to see. A minimal sketch of the safe coding style follows.
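This sketch uses the same TEST_TABLE as the test script below : never compare a DATE column to a bare string whose meaning depends on NLS_DATE_FORMAT -- '01-OCT-07' is year 2007 under DD-MON-RR but year 0007 under DD-MON-YYYY :

REM ambiguous -- the result depends on the session's NLS_DATE_FORMAT :
select count(*) from test_table where last_ddl_time > '01-OCT-07' ;

REM unambiguous -- identical in every client :
select count(*) from test_table
where last_ddl_time > to_date('01-OCT-2007','DD-MON-YYYY') ;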
=================================================================

SQL>
SQL> set pages600
SQL> set linesize 110
SQL>
SQL> col owner format a15
SQL> col object_name format a30 trunc
SQL> col name format a18
SQL> col value format a22
SQL>
SQL> select name, value from v$parameter where name = 'nls_date_format';

NAME VALUE
------------------ ----------------------
nls_date_format DD-MON-YYYY HH24:MI:SS

SQL>
SQL> show parameter nls_date_format

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
nls_date_format string DD-MON-YYYY HH24:MI:SS
SQL>
SQL> select to_char(min(last_ddl_time),'DD-MON-YYYY') from test_table ;

TO_CHAR(MIN
-----------
29-OCT-2001

SQL> select to_char(max(last_ddl_time),'DD-MON-YYYY') from test_table ;

TO_CHAR(MAX
-----------
17-OCT-2007

SQL>
SQL> set autotrace traceonly explain
SQL>
SQL> select owner, count(*)
2 from test_table
3 where last_ddl_time
4 > sysdate-365
5 group by owner
6 /

Execution Plan
----------------------------------------------------------
Plan hash value: 2200191467

---------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
---------------------------------------------------------------------------------
0 SELECT STATEMENT 26 364 172 (7) 00:00:03
1 HASH GROUP BY 26 364 172 (7) 00:00:03
* 2 TABLE ACCESS FULL TEST_TABLE 19370 264K 169 (5) 00:00:03
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("LAST_DDL_TIME">SYSDATE@!-365)

SQL>
SQL> select owner, count(*)
2 from test_table
3 where last_ddl_time
4 > '01-OCT-07'
5 group by owner
6 /

Execution Plan
----------------------------------------------------------
Plan hash value: 2200191467

---------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
---------------------------------------------------------------------------------
0 SELECT STATEMENT 26 364 174 (8) 00:00:03
1 HASH GROUP BY 26 364 174 (8) 00:00:03
* 2 TABLE ACCESS FULL TEST_TABLE 51556 704K 167 (4) 00:00:03
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("LAST_DDL_TIME">'01-OCT-07')

SQL>
SQL>
SQL> set autotrace off
SQL> drop materialized view test_mv;

Materialized view dropped.

SQL> create materialized view test_mv
2 build deferred
3 REFRESH WITH ROWID
4 as
5 select * from test_table
6 where last_ddl_time
7 > '01-OCT-07'
8 /

Materialized view created.

SQL>
SQL> variable jno number;
SQL> exec dbms_job.submit(:jno,'DBMS_MVIEW.REFRESH(''TEST_MV'',''C'');',sysdate+1,'sysdate+1/24');

PL/SQL procedure successfully completed.

SQL>
SQL> select * from user_jobs order by job;

JOB LOG_USER PRIV_USER SCHEMA_USER
---------- ------------------------------ ------------------------------ ------------------------------
LAST_DATE LAST_SEC THIS_DATE THIS_SEC NEXT_DATE NEXT_SEC TOTAL_TIME B
--------- -------- --------- -------- --------- -------- ---------- -
INTERVAL
--------------------------------------------------------------------------------------------------------------
FAILURES
----------
WHAT
--------------------------------------------------------------------------------------------------------------
NLS_ENV
--------------------------------------------------------------------------------------------------------------
MISC_ENV INSTANCE
---------------------------------------------------------------- ----------
131 HEMANT HEMANT HEMANT
14-AUG-07 22:28:37 0 N
sysdate+1/24

DBMS_MVIEW.REFRESH('TEST_MV','C');
NLS_LANGUAGE='AMERICAN' NLS_TERRITORY='AMERICA' NLS_CURRENCY='$' NLS_ISO_CURRENCY='AMERICA' NLS_NUMERIC_CHARAC
TERS='.,' NLS_DATE_FORMAT='DD-MON-RR' NLS_DATE_LANGUAGE='AMERICAN' NLS_SORT='BINARY'
0102000200000000 0


SQL>
SQL>
SQL> alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';

Session altered.

SQL> show parameter nls_date_format;

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
nls_date_format string DD-MON-YYYY HH24:MI:SS
SQL>
SQL> set autotrace traceonly explain
SQL>
SQL> select owner, count(*)
2 from test_table
3 where last_ddl_time
4 > '01-OCT-07'
5 group by owner
6 /

Execution Plan
----------------------------------------------------------
Plan hash value: 2200191467

---------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
---------------------------------------------------------------------------------
0 SELECT STATEMENT 26 364 174 (8) 00:00:03
1 HASH GROUP BY 26 364 174 (8) 00:00:03
* 2 TABLE ACCESS FULL TEST_TABLE 51556 704K 167 (4) 00:00:03
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("LAST_DDL_TIME">'01-OCT-07')

SQL>
SQL>
SQL> set autotrace off
SQL> drop materialized view test_mv_2;

Materialized view dropped.

SQL> create materialized view test_mv_2
2 build deferred
3 REFRESH WITH ROWID
4 as
5 select * from test_table
6 where last_ddl_time
7 > '01-OCT-07'
8 /

Materialized view created.

SQL>
SQL> variable jno number;
SQL> exec dbms_job.submit(:jno,'DBMS_MVIEW.REFRESH(''TEST_MV_2'',''C'');',sysdate+1,'sysdate+1/24');

PL/SQL procedure successfully completed.

SQL>
SQL> select * from user_jobs order by job;

JOB LOG_USER PRIV_USER SCHEMA_USER
---------- ------------------------------ ------------------------------ ------------------------------
LAST_DATE LAST_SEC THIS_DATE THIS_SEC NEXT_DATE NEXT_SEC TOTAL_TIME B
-------------------- -------- -------------------- -------- -------------------- -------- ---------- -
INTERVAL
--------------------------------------------------------------------------------------------------------------
FAILURES
----------
WHAT
--------------------------------------------------------------------------------------------------------------
NLS_ENV
--------------------------------------------------------------------------------------------------------------
MISC_ENV INSTANCE
---------------------------------------------------------------- ----------
131 HEMANT HEMANT HEMANT
14-AUG-2007 22:28:37 22:28:37 0 N
sysdate+1/24

DBMS_MVIEW.REFRESH('TEST_MV','C');
NLS_LANGUAGE='AMERICAN' NLS_TERRITORY='AMERICA' NLS_CURRENCY='$' NLS_ISO_CURRENCY='AMERICA' NLS_NUMERIC_CHARAC
TERS='.,' NLS_DATE_FORMAT='DD-MON-RR' NLS_DATE_LANGUAGE='AMERICAN' NLS_SORT='BINARY'
0102000200000000 0

132 HEMANT HEMANT HEMANT
14-AUG-2007 22:28:41 22:28:41 0 N
sysdate+1/24

DBMS_MVIEW.REFRESH('TEST_MV_2','C');
NLS_LANGUAGE='AMERICAN' NLS_TERRITORY='AMERICA' NLS_CURRENCY='$' NLS_ISO_CURRENCY='AMERICA' NLS_NUMERIC_CHARAC
TERS='.,' NLS_DATE_FORMAT='DD-MON-YYYY HH24:MI:SS' NLS_DATE_LANGUAGE='AMERICAN' NLS_SORT='BINARY'
0102000200000000 0


SQL>
SQL> show parameter nls_date_format

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
nls_date_format string DD-MON-YYYY HH24:MI:SS
SQL>
SQL> set autotrace on
SQL>
SQL> select owner, count(*)
2 from test_table
3 where last_ddl_time
4 > '01-OCT-07'
5 group by owner
6 /

OWNER COUNT(*)
--------------- ----------
MDSYS 885
DMSYS 189
TSMSYS 3
CTXSYS 339
FLOWS_FILES 12
HR 34
OLAPSYS 720
OUTLN 8
PUBLIC 20073
EXFSYS 281
HEMANT 25
SCOTT 6
SYSTEM 454
DBSNMP 46
OE 127
ORDPLUGINS 10
ORDSYS 1669
PM 26
SH 306
SYSMAN 1321
BI 8
IX 53
XDB 682
FLOWS_010600 1111
SI_INFORMTN_SCH 8
EMA

SYS 22918
WMSYS 242

27 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2200191467

---------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
---------------------------------------------------------------------------------
0 SELECT STATEMENT 26 364 174 (8) 00:00:03
1 HASH GROUP BY 26 364 174 (8) 00:00:03
* 2 TABLE ACCESS FULL TEST_TABLE 51556 704K 167 (4) 00:00:03
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("LAST_DDL_TIME">'01-OCT-07')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
714 consistent gets
704 physical reads
0 redo size
990 bytes sent via SQL*Net to client
391 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
27 rows processed

SQL>
SQL>
SQL> alter session set nls_date_format='DD-MON-RR';

Session altered.

SQL>
SQL> show parameter nls_date_format

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
nls_date_format string DD-MON-YYYY HH24:MI:SS
SQL>
SQL> select owner, count(*)
2 from test_table
3 where last_ddl_time
4 > '01-OCT-07'
5 group by owner
6 /

OWNER COUNT(*)
--------------- ----------
SYS 55


Execution Plan
----------------------------------------------------------
Plan hash value: 3358789471

---------------------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
---------------------------------------------------------------------------------------------
0 SELECT STATEMENT 26 364 57 (2) 00:00:01
1 HASH GROUP BY 26 364 57 (2) 00:00:01
2 TABLE ACCESS BY INDEX ROWID TEST_TABLE 729 10206 56 (0) 00:00:01
* 3 INDEX RANGE SCAN TEST_TBL_NDX 729 3 (0) 00:00:01
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("LAST_DDL_TIME">'01-OCT-07')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
470 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL>
SQL> alter session set "_GBY_HASH_AGGREGATION_ENABLED"=FALSE ;

Session altered.

SQL>
SQL> select owner, count(*)
2 from test_table
3 where last_ddl_time
4 > '01-OCT-07'
5 group by owner
6 /

OWNER COUNT(*)
--------------- ----------
SYS 55


Execution Plan
----------------------------------------------------------
Plan hash value: 2247708158

---------------------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
---------------------------------------------------------------------------------------------
0 SELECT STATEMENT 26 364 57 (2) 00:00:01
1 SORT GROUP BY 26 364 57 (2) 00:00:01
2 TABLE ACCESS BY INDEX ROWID TEST_TABLE 729 10206 56 (0) 00:00:01
* 3 INDEX RANGE SCAN TEST_TBL_NDX 729 3 (0) 00:00:01
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("LAST_DDL_TIME">'01-OCT-07')


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
470 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL>

=================================================================

Shared Nothing or Shared Disks (RAC) ?

Kevin Closson has written a (what some call "long") post on Shared Nothing (EnterpriseDB, DB2) v Shared Disks (Oracle RAC). Read it if you work with RAC or DB2 or both. Having read that, also read modred's response (on the bonglonglong blog) All In the Assumptions.

08 August, 2007

LGWR and 'log file sync waits'

Kevin Closson has posted an insightful note on LGWR -- why the wait event 'log file sync' sometimes does not mean Redo Log File Write (ie I/O) waits. LGWR spends time waiting on writes that it has posted, but also in posting the sessions that are waiting on *it* to tell them that the writes have completed. It is this second part of LGWR's time that is the "processing" time -- and it may not be getting enough CPU !
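A minimal sketch of a quick check along those lines : compare the foreground commit wait against LGWR's own I/O wait. If 'log file sync' time far exceeds 'log file parallel write' time, the gap is LGWR's processing / CPU-scheduling time, not disk time :

select event, total_waits, time_waited
from v$system_event
where event in ('log file sync','log file parallel write') ;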

19 July, 2007

Using ARRAYSIZE to reduce RoundTrips and number of FETCH calls

I have been able to use ARRAYSIZE in SQLPlus to fetch more rows at each call and reduce the number of round-trips between the SQLPlus Client and the database. [see example below]

However, this does not work when fetching rows from one database to another via a Database Link. (SET ARRAYSIZE is an SQL*Plus command, not an Oracle Database / SQL command).

If I were to fetch a few tens of thousands or hundreds of thousands of rows across a DBLink -- eg for a Materialized View refresh -- how can I tune the size of each fetch and reduce the number of round-trips between the two databases ?
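For reference, a minimal sketch of the SQLPlus settings behind the second trace shown below :

set arraysize 100
set pagesize 600
alter session set sql_trace = true ;
select * from test_txn_table ;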


For Example : In SQLPlus I run the query

====================================================
select *
from
test_txn_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.03 368 85 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 10284 0.82 4.42 1853 12300 0 154239
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10286 0.84 4.45 2221 12385 0 154239


Rows Row Source Operation
------- ---------------------------------------------------
154239 TABLE ACCESS FULL TEST_TXN_TABLE (cr=12300 pr=1853 pw=0 time=797235 us)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 10284 0.00 0.02
db file scattered read 143 0.48 3.29
SQL*Net message from client 10284 0.35 151.98
db file sequential read 15 0.08 0.20


I see 10,284 round-trips between the client SQLPlus and the server Database.
(and the corresponding count of FETCH calls)
====================================================

====================================================
If I re-run it with ARRAYSIZE 100 and PAGESIZE 600 (Pagesize to
reduce the overhead that SQLPlus spends in formatting page and column
titles every 16 lines ) I get

select *
from
test_txn_table


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.33 348 186 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1544 0.67 2.81 1851 3680 0 154239
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1546 0.67 3.14 2199 3866 0 154239



Rows Row Source Operation
------- ---------------------------------------------------
154239 TABLE ACCESS FULL TEST_TXN_TABLE (cr=3680 pr=1851 pw=0 time=466682 us)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 1544 0.00 0.00
db file scattered read 140 0.08 2.12
SQL*Net message from client 1544 0.68 55.60
SQL*Net more data to client 3976 0.00 0.08
db file sequential read 9 0.05 0.09

I now have only 1,544 round trips
====================================================

How can I get similar savings when the query is to fetch data from one
database to another via a DBLink ?

29 June, 2007

Read Consistency across Statements

A recent blog posting on SET TRANSACTION READ ONLY reminded me of a note that I had written in 2004 [note the references to AUM because most of my databases were still 8i].

Oracle uses Rollback/Undo segments to ensure Read Consistency. Thus, if your query started at time t0, and it takes 30 minutes to run [ie till t30], you expect to see the data as of time t0 throughout the duration of the query. If another user were to issue an update at time t5 you would continue to see the data as of time t4 [or t0] even if your query comes to the respective row at, say, time t25. Read-consistency is provided by Oracle by ensuring that the pre-update image of the row is available in a Rollback/Undo segment [If that other user had committed at time t15, the information may or may not be still available in the Rollback/Undo segment at t25 as undo information relating to a committed update can be overwritten by other transactions -- in which case your query may encounter the "ORA-01555 'snapshot too old or rollback segment too small'" error]. All of this is common knowledge.

However, we sometimes forget that Read Consistency is at the Statement level, NOT the Transaction level. Thus, if you have two different SELECTs reading the same data, one after the other, and a second user updates *and* commits his update after the end of the first SELECT but before the beginning of the second SELECT, your second SELECT would *NOT* see the same data as the first SELECT.


Here are a few examples :

In the Query session :
1* select * from t1
18:10:12 SQL> /
COL1 COL2
---------- -----
1 a
2 b

In the Update session :
18:10:10 SQL> update t1 set col1=3,col2='c' where col1=2
18:10:41 2 /
1 row updated.

Back in the Query session :
1* select * from t1
18:12:38 SQL> /
COL1 COL2
---------- -----
1 a
2 b

So far, so good. As the Update session hasn't yet committed, I can still see the data.
18:13:32 SQL> l
1* select * from t1
18:13:33 SQL> /
COL1 COL2
---------- -----
1 a
2 b
18:13:34 SQL>

Here, I commit my update :
18:14:00 SQL> commit;
Commit complete.
18:14:01 SQL>

The Query session, now re-running the same SQL, sees the new data :
18:15:08 SQL> l
1* select * from t1
18:15:09 SQL> /
COL1 COL2
---------- -----
1 a
3 c
18:15:09 SQL>

As expected !
But, wait .... What if my Query session was actually a long-running report, consisting of multiple SQLs ? If, say, the first SQL fetched a count of records in a table and printed that at the beginning of the report, and the last SQL also fetched a count and printed it again, we might see different counts at the beginning and the end of the report !!
Here's an example of how you could get inconsistent data across different fetches in the same PLSQL block :

In the Query session :
18:53:35 SQL> l
1 declare
2 i number;
3 c1 number;
4 c2 varchar2(10);
5 cursor cursor1 is select col1, col2 from t1;
6 begin
7 i :=1;
8 loop
9 exit when i > 5;
10 open cursor1;
11 loop
12 fetch cursor1 into c1,c2;
13 exit when cursor1%NOTFOUND;
14 dbms_output.put_line('Got the values '||c1||' and '||c2||' in run '||i||
' at '||to_char(sysdate,'HH24:MI:SS'));
15 end loop;
16 close cursor1;
17 dbms_lock.sleep(3);
18 i := i+1;
19 end loop;
20* end;
18:53:36 SQL> /
Got the values 1 and a in run 1 at 18:53:37
Got the values 3 and c in run 1 at 18:53:37
Got the values 1 and a in run 2 at 18:53:40
Got the values 26 and z in run 2 at 18:53:40
Got the values 1 and a in run 3 at 18:53:44
Got the values 26 and z in run 3 at 18:53:44
Got the values 1 and a in run 4 at 18:53:47
Got the values 26 and z in run 4 at 18:53:47
Got the values 1 and a in run 5 at 18:53:50
Got the values 26 and z in run 5 at 18:53:50
PL/SQL procedure successfully completed.
18:53:53 SQL>

In the Update session :
18:53:32 SQL> l
1* update t1 set col1=26, col2='z' where col1=3
18:53:38 SQL> /
1 row updated.
18:53:39 SQL> commit;
Commit complete.
18:53:40 SQL>

As the update was committed at 18:53:39, the Query session started seeing the new data thereafter, even though it was still within one PLSQL Block, within one Loop -- because a new SELECT was executed each time.
So how do we ensure that we get Read Consistent data across statements ? We can use the SET TRANSACTION READ ONLY command.

In the Query session :
19:00:35 SQL> set transaction read only;
Transaction set.
19:00:48 SQL> l
1* set transaction read only
19:00:49 SQL> get afiedt.buf
1 declare
2 i number;
3 c1 number;
4 c2 varchar2(10);
5 cursor cursor1 is select col1, col2 from t1;
6 begin
7 i :=1;
8 loop
9 exit when i > 5;
10 open cursor1;
11 loop
12 fetch cursor1 into c1,c2;
13 exit when cursor1%NOTFOUND;
14 dbms_output.put_line('Got the values '||c1||' and '||c2||' in run '||i||
' at '||to_char(sysdate,'HH24:MI:SS'));
15 end loop;
16 close cursor1;
17 dbms_lock.sleep(3);
18 i := i+1;
19 end loop;
20* end;
19:00:59 SQL> /
Got the values 1 and a in run 1 at 19:01:02
Got the values 26 and z in run 1 at 19:01:02
Got the values 1 and a in run 2 at 19:01:06
Got the values 26 and z in run 2 at 19:01:06
Got the values 1 and a in run 3 at 19:01:09
Got the values 26 and z in run 3 at 19:01:09
Got the values 1 and a in run 4 at 19:01:12
Got the values 26 and z in run 4 at 19:01:12
Got the values 1 and a in run 5 at 19:01:15
Got the values 26 and z in run 5 at 19:01:15
PL/SQL procedure successfully completed.
19:01:18 SQL> select * from t1;
COL1 COL2
---------- -----
1 a
26 z
19:01:28 SQL> commit;
Commit complete.
19:01:43 SQL> select * from t1;
COL1 COL2
---------- -----
1 a
10 j
19:01:48 SQL>

In the Update session :
19:00:01 SQL> update t1 set col1=10, col2='j' where col1=26
19:01:03 2 /
1 row updated.
19:01:04 SQL> commit;
Commit complete.
19:01:05 SQL>

Although I had updated and committed at 19:01:04, my Query session continued to see the "old" or "Consistent" data, even at 19:01:18. I ended my "Read Consistent Across Statements" view by ending my pseudo-transaction in the Query session {It can be ended with either the COMMIT or the ROLLBACK keyword -- as we have not really done any DML in this session it makes no difference whether we issue a COMMIT or a ROLLBACK}. Only after that, could I see the updated records. The SET TRANSACTION READ ONLY "pseudo-transaction" does NOT allow DML [e.g INSERT or UPDATE] statements. Such DML would get an ORA-01456 error :
" 01456, 00000, "may not perform insert/delete/update operation inside a READ ONLY
transaction"
// *Cause: A non-DDL insert/delete/update or select for update operation
// was attempted
// *Action: commit (or rollback) transaction, and re-execute"


The SET TRANSACTION READ ONLY is useful when you have a report consisting of multiple statements and you need to ensure that all executions see the same Read-Consistent image. What is the downside ? You are more likely to hit ORA-01555 errors, as Oracle does not guarantee preservation of the pre-image of updated data in the Rollback segments once an updating transaction has committed. Oracle 9i with Automatic Undo Management and UNDO_RETENTION helps alleviate this problem by making a "best" attempt to keep data available in the Undo Segment for the duration specified by UNDO_RETENTION. Non-AUM 9i and 8i and below do not guarantee any retention of pre-update images in the Rollback Segments. The DBA might create more and larger rollback segments to reduce the probability {but not guarantee} of pre-update information being overwritten in the Rollback Segments.
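A minimal sketch of checking the retention target (in seconds) that AUM will make a best-effort attempt to honour :

show parameter undo_retention
select value from v$parameter where name = 'undo_retention' ;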

27 June, 2007

Some observations from the latest Oracle-HP Benchmark

A quick read of the latest Oracle-HP Benchmark (world record TPC-C, using Single Processor, Oracle Enterprise Linux, Oracle 10g SE-One) shows some curious things :
1. Oracle has priced the Xeon QuadCore Processor at the rate of 1 Processor, based on the single socket, justified as "When licensing Oracle programs with Standard Edition One or Standard Edition in the product name, a processor is counted equivalent to an occupied socket", for a 3-year licence. Thus, Oracle used the combination of "Processor, not Core" for SE/SE-One and 50% of List price for a 3-Year Licence.
2. This provides us a hint that we could go for a 2 QuadCore Processor server and pay for only 2 CPUs for the SE-One licence.
The licence costing for 1 Processor for 3 years at 50% did help Oracle achieve "a new world record for price/performance".

3. For the benchmark, there were only 2 Online Redo Logs, but of 12,817MB each ! Huge ! Obviously, log_checkpoint_interval and log_checkpoint_timeout were both 0 -- checkpointing only at the 12GB log switch.
4. DB_BLOCK_SIZE was 2048, not the "standard" 8192.
5. DB_CACHE_SIZE was 2000M but DB_KEEP_CACHE_SIZE was 16,500M.
6. The Benchmark used "ANALYZE TABLE/INDEX ESTIMATE STATISTICS" instead of the normally recommended "DBMS_STATS..."
7. Database Recovery was using the SQLPlus "RECOVER" command, _not_ with RMAN.
8. "Demonstration" of recovery from loss of a log file was actually demonstrating that "
system continued running because the logs are mirrored" . They didn't have to show Recovery in case of loss of online/active redo logs !

25 June, 2007

A Bug in OWI

Tanel Poder has discovered a bug in how Oracle reports wait events -- when he was testing a program reading from an External Table, V$SESSION_WAIT was showing neither the correct STATE nor the correct EVENT.

06 June, 2007

Oracle Books in the Library

I was going through the books in the Geography section of the local library. An Apress book stood out. It was "Pro Oracle Spatial". I wonder if I should tell the librarians that they have wrongly shelved the book OR that they have correctly classified it.

31 May, 2007

AUTOALLOCATE and Undo Segments

Recently I had a query from another DBA who asked why a monitoring script had reported, early in the morning, "Threshold : Max_Free less than Twice the largest Extent. Potential error if the DataFile cannot AutoExtend", listing two Undo Segments with 64MB extents in the UNDOTBS1 tablespace. Although DBA_EXTENTS did show two Undo Segments with extents of 64MB, the question was whether the next extent would really be bigger than 64MB -- "it should be 64K. because the unit of the tablespace "undotbs1" is 64K. parameters "next" & "min_extents" are obsolete in LMT."

My response was : "AUTOALLOCATE means that the Next ExtentSize is automatically determined. Segments start with 64KB extents, then automatically switch to 1MB, 8MB and finally 64MB extents as they grow. That is why it is important to know the size of the last (largest) extent. If the largest extent is already 64MB, then the next WILL be 64MB. At 07:12 in the morning, segments _SYSSMU8$ and _SYSSMU9$ were already using 64MB extents. Since AUM periodically drops and recreates Undo Segments, they may get created with 64KB extents. As a Segment grows for a large transaction, its Next Extent will autoallocate to 1MB, 8MB and 64MB extents. Do NOT confuse LMT and AUTOALLOCATE. They are two different concepts."
I was able to produce a listing showing Undo Segments with 64KB, 1MB and 8MB extents (by the time I ran the query, there were no 64MB extents around -- because of the "Automatic" Undo Management that Oracle does). The alert had merely warned that if a long-running transaction had continued to grow and the undo tablespace datafile could not autoextend, the transaction might have failed -- just as would happen with a table or index segment in a normal tablespace whose datafile cannot extend to allocate the next extent. To further test this, I ran a few SQLs on my PC.

UNDO SEGMENTS BEFORE RUNNING MY TRANSACTIONS :

SQL> select segment_name, bytes/1024, blocks , count(*)
2 from dba_extents
3 where tablespace_name = 'UNDOTBS1'
4 group by segment_name, bytes/1024, blocks
5 order by 1
6 /
SEGMENT_NAME BYTES/1024 BLOCKS COUNT(*)
--------------- ---------- ---------- ----------
_SYSSMU1$ 64 8 2
1024 128 6
_SYSSMU10$ 64 8 16
1024 128 6
_SYSSMU2$ 64 8 115
1024 128 28
_SYSSMU3$ 64 8 2
1024 128 7
_SYSSMU4$ 64 8 2
1024 128 1
_SYSSMU5$ 64 8 2
1024 128 4
_SYSSMU6$ 64 8 25
1024 128 23
_SYSSMU7$ 64 8 3
1024 128 9
_SYSSMU8$ 64 8 2
1024 128 9
_SYSSMU9$ 64 8 127
1024 128 21
20 rows selected.
SQL> spool off


RUNNING MY TRANSACTIONS (and validating that Oracle does allocate a different Undo Segment for each new transaction) :


SQL>
SQL> REM Begin Transactions here ========================================================
SQL>
SQL>
SQL> drop table hemant.test_txn_table ;
Table dropped.
SQL>
SQL> create table hemant.test_txn_table as
2 select * from dba_objects
3 union
4 select * from dba_objects
5 /
Table created.
SQL>
SQL> update hemant.test_txn_table
2 set object_id=object_id+1, owner=substr(owner,1,3)||'_HKC' ;
51538 rows updated.
SQL>
SQL> select t.xidusn, r.name, t.start_time, t.used_ublk, t.used_urec
2 from v$transaction t, v$rollname r
3 where t.xidusn=r.usn
4 /
XIDUSN NAME START_TIME USED_UBLK USED_UREC
---------- ------------------------------ -------------------- ---------- ----------
6 _SYSSMU6$ 05/31/07 23:01:20 635 51538
SQL> commit;
Commit complete.
SQL>
SQL> create index hemant.test_txn_tbl_ndx on hemant.test_txn_table(owner);
Index created.
SQL>
SQL> delete hemant.test_txn_table
2 where owner = 'SYS_HKC';
24692 rows deleted.
SQL>
SQL> select t.xidusn, r.name, t.start_time, t.used_ublk, t.used_urec
2 from v$transaction t, v$rollname r
3 where t.xidusn=r.usn
4 /
XIDUSN NAME START_TIME USED_UBLK USED_UREC
---------- ------------------------------ -------------------- ---------- ----------
7 _SYSSMU7$ 05/31/07 23:01:28 771 24823
SQL> rollback;
Rollback complete.
SQL>
SQL> update hemant.test_txn_table
2 set object_id=object_id+1, owner=substr(owner,1,3)||'_HKC2'
3 where owner = 'SYS_HKC';
24692 rows updated.
SQL>
SQL> select t.xidusn, r.name, t.start_time, t.used_ublk, t.used_urec
2 from v$transaction t, v$rollname r
3 where t.xidusn=r.usn
4 /
XIDUSN NAME START_TIME USED_UBLK USED_UREC
---------- ------------------------------ -------------------- ---------- ----------
9 _SYSSMU9$ 05/31/07 23:01:40 833 74076
SQL> commit;
Commit complete.
SQL>
SQL> delete hemant.test_txn_table
2 where rowid in (select rowid from hemant.test_txn_table where rownum <>
SQL> select t.xidusn, r.name, t.start_time, t.used_ublk, t.used_urec
2 from v$transaction t, v$rollname r
3 where t.xidusn=r.usn
4 /
XIDUSN NAME START_TIME USED_UBLK USED_UREC
---------- ------------------------------ -------------------- ---------- ----------
5 _SYSSMU5$ 05/31/07 23:01:50 5 248
SQL> commit;
Commit complete.
SQL>
SQL> delete hemant.test_txn_table
2 where rowid in (select rowid from hemant.test_txn_table where rownum <>
SQL> select t.xidusn, r.name, t.start_time, t.used_ublk, t.used_urec
2 from v$transaction t, v$rollname r
3 where t.xidusn=r.usn
4 /
XIDUSN NAME START_TIME USED_UBLK USED_UREC
---------- ------------------------------ -------------------- ---------- ----------
1 _SYSSMU1$ 05/31/07 23:01:51 3 126
SQL> commit;
Commit complete.
SQL>
SQL> insert into hemant.test_txn_table
2 select * from dba_objects
3 union
4 select * from dba_objects
5 union
6 select * from hemant.test_txn_table ;
102888 rows created.
SQL>
SQL> select t.xidusn, r.name, t.start_time, t.used_ublk, t.used_urec
2 from v$transaction t, v$rollname r
3 where t.xidusn=r.usn
4 /
XIDUSN NAME START_TIME USED_UBLK USED_UREC
---------- ------------------------------ -------------------- ---------- ----------
6 _SYSSMU6$ 05/31/07 23:01:51 294 5186
SQL> commit;
Commit complete.
SQL> spool off



UNDO SEGMENTS *AFTER* RUNNING THOSE TRANSACTIONS :


SQL> select segment_name, bytes/1024, blocks , count(*)
2 from dba_extents
3 where tablespace_name = 'UNDOTBS1'
4 group by segment_name, bytes/1024, blocks
5 order by 1
6 /
SEGMENT_NAME BYTES/1024 BLOCKS COUNT(*)
--------------- ---------- ---------- ----------
_SYSSMU1$ 64 8 2
1024 128 6
_SYSSMU10$ 64 8 16
1024 128 6
_SYSSMU2$ 64 8 107
1024 128 26
_SYSSMU3$ 64 8 2
1024 128 7
_SYSSMU4$ 64 8 33
1024 128 7
_SYSSMU5$ 64 8 2
1024 128 1
_SYSSMU6$ 64 8 25
1024 128 23
_SYSSMU7$ 64 8 3
1024 128 9
_SYSSMU8$ 64 8 2
1024 128 9
_SYSSMU9$ 64 8 127
1024 128 21
20 rows selected.
SQL>


Thus, we can see that different Undo Segments grew at different rates. Extent sizes vary from 64KB to 1MB in the above example, but can be 8MB and 64MB as well.
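To watch AUTOALLOCATE extent sizing in action on an ordinary segment, here is a simple sketch (the tablespace name, datafile path and table are made up for illustration) :

create tablespace test_auto
datafile '/u01/oradata/test_auto01.dbf' size 512m
extent management local autoallocate;

create table hemant.test_grow tablespace test_auto
as select * from dba_objects;

-- repeat a few times to grow the segment well past its first megabyte
insert into hemant.test_grow select * from hemant.test_grow;
commit;

select bytes/1024 as kb, count(*)
from dba_extents
where segment_name = 'TEST_GROW'
group by bytes/1024
order by 1;

The reported extent sizes should step up from 64KB to 1MB as the segment grows (and on to 8MB and 64MB for really large segments).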

28 May, 2007

Where's the Problem ? Not in the Database !

Here is a post by Herod T about how the problem can sometimes be somewhere far removed from the database server. In fact, reading the comment hidden in the application code, it seems the problem was "known" at some time in the past but no solution was implemented then. Users just lived with the problem for a few months.

22 May, 2007

RollForward from a Cold Backup

As promised earlier, here is how I do a RollForward from a Cold Backup :

FIRST :
This is the status of the Database *BEFORE* the Cold Backup :
SQL>
SQL> col name format a45
SQL> select to_char(sysdate,'DD-MON-RR HH24:MI') from dual;
TO_CHAR(SYSDATE
---------------
22-MAY-07 22:16
SQL>
SQL> select group#, status from v$log order by status;
GROUP# STATUS
---------- ----------------
2 CURRENT
3 INACTIVE
1 INACTIVE
SQL>
SQL> select recid, name, sequence#, to_char(first_time,'DD-MON-RR HH24:MI') ,blocks*block_size
2 from v$archived_log
3 where first_time > trunc(sysdate)
4 order by first_time;
RECID NAME SEQUENCE#
---------- --------------------------------------------- ----------
TO_CHAR(FIRST_T BLOCKS*BLOCK_SIZE
--------------- -----------------
21 C:\OR10G2DB\ARCH\ARC00001_0623282965.001 1
22-MAY-07 22:09 512
22 C:\OR10G2DB\ARCH\ARC00002_0623282965.001 2
22-MAY-07 22:09 175104
SQL>
SQL> select dbid, name, checkpoint_change#, current_scn from v$database;
DBID NAME CHECKPOINT_CHANGE#
---------- --------------------------------------------- ------------------
CURRENT_SCN
-----------
138573118 OR10G2DB 812793
812933
SQL>
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
SQL> spool off
========================================================================
After the shutdown, I do a COLD BACKUP
{including "online" Redo Log files -- I'll explain why later} of the database.
Once the Cold Backup is completed, I proceed to restart the database and execute some transactions :
--------------------------------------------------------------------------------------------------------------------------
SQL>
SQL> col name format a45
SQL> select to_char(sysdate,'DD-MON-RR HH24:MI') from dual;
TO_CHAR(SYSDATE
---------------
22-MAY-07 22:22
SQL>
SQL> select group#, status from v$log order by status;
GROUP# STATUS
---------- ----------------
2 CURRENT
3 INACTIVE
1 INACTIVE
SQL>
SQL> select recid, name, sequence#, to_char(first_time,'DD-MON-RR HH24:MI') ,blocks*block_size
2 from v$archived_log
3 where first_time > trunc(sysdate)
4 order by first_time;
RECID NAME SEQUENCE#
---------- --------------------------------------------- ----------
TO_CHAR(FIRST_T BLOCKS*BLOCK_SIZE
--------------- -----------------
21 C:\OR10G2DB\ARCH\ARC00001_0623282965.001 1
22-MAY-07 22:09 512
22 C:\OR10G2DB\ARCH\ARC00002_0623282965.001 2
22-MAY-07 22:09 175104
SQL>
SQL> select dbid, name, checkpoint_change#, current_scn from v$database;
DBID NAME CHECKPOINT_CHANGE#
---------- --------------------------------------------- ------------------
CURRENT_SCN
-----------
138573118 OR10G2DB 813231
813465
SQL>
SQL> REM Begin Transactions here =================
SQL>
SQL>
SQL> drop table hemant.test_txn_table ;
Table dropped.
SQL>
SQL> create table hemant.test_txn_table as
2 select * from dba_objects
3 union
4 select * from dba_objects
5 /
Table created.
SQL>
SQL> update hemant.test_txn_table
2 set object_id=object_id+1, owner=substr(owner,1,3)||'_HKC' ;
51542 rows updated.
SQL> commit;
Commit complete.
SQL>
SQL> create index hemant.test_txn_tbl_ndx on hemant.test_txn_table(owner);
Index created.
SQL>
SQL> delete hemant.test_txn_table
2 where owner = 'SYS_HKC';
24691 rows deleted.
SQL> rollback;
Rollback complete.
SQL>
SQL> update hemant.test_txn_table
2 set object_id=object_id+1, owner=substr(owner,1,3)'_HKC2'
3 where owner = 'SYS_HKC';
24691 rows updated.
SQL> commit;
Commit complete.
SQL>
SQL> create table last_txn_time (txn_time ) as select sysdate from dual;
Table created.
SQL>
SQL> alter system switch logfile;
System altered.
SQL>
SQL> select group#, status from v$log order by status;
GROUP# STATUS
---------- ----------------
2 ACTIVE
3 ACTIVE
1 CURRENT
SQL>
SQL> select recid, name, sequence#, to_char(first_time,'DD-MON-RR HH24:MI') ,blocks*block_size
2 from v$archived_log
3 where first_time > trunc(sysdate)
4 order by first_time;
RECID NAME SEQUENCE#
---------- --------------------------------------------- ----------
TO_CHAR(FIRST_T BLOCKS*BLOCK_SIZE
--------------- -----------------
21 C:\OR10G2DB\ARCH\ARC00001_0623282965.001 1
22-MAY-07 22:09 512
22 C:\OR10G2DB\ARCH\ARC00002_0623282965.001 2
22-MAY-07 22:09 175104
23 C:\OR10G2DB\ARCH\ARC00003_0623282965.001 3
22-MAY-07 22:10 8186880
RECID NAME SEQUENCE#
---------- --------------------------------------------- ----------
TO_CHAR(FIRST_T BLOCKS*BLOCK_SIZE
--------------- -----------------
24 C:\OR10G2DB\ARCH\ARC00004_0623282965.001 4
22-MAY-07 22:22 8120320
25 C:\OR10G2DB\ARCH\ARC00005_0623282965.001 5
22-MAY-07 22:22 8120320
26 C:\OR10G2DB\ARCH\ARC00006_0623282965.001 6
22-MAY-07 22:22 8120320
RECID NAME SEQUENCE#
---------- --------------------------------------------- ----------
TO_CHAR(FIRST_T BLOCKS*BLOCK_SIZE
--------------- -----------------
27 C:\OR10G2DB\ARCH\ARC00007_0623282965.001 7
22-MAY-07 22:23 8120320
28 C:\OR10G2DB\ARCH\ARC00008_0623282965.001 8
22-MAY-07 22:23 8120320
8 rows selected.
SQL>
SQL> select dbid, name, checkpoint_change#, current_scn from v$database;
DBID NAME CHECKPOINT_CHANGE#
---------- --------------------------------------------- ------------------
CURRENT_SCN
-----------
138573118 OR10G2DB 814864
815153
SQL>
SQL> spool off
============================================================================
I have deliberately created an Index and done DELETEs, INSERTs and ROLLBACKs along with the UPDATEs so as to generate noticeable Redo (ArchiveLogs).
The latest ArchiveLog is at least ARC00008 (we shall see later that there were two more Archives before the shutdown).

I next restore the Cold Backup _including_ the "online" Redo Log files to verify that the Cold Backup is consistent :
-------------------------------------------------------------------------------------------------------------------------------
SQL>
SQL> col name format a45
SQL> select to_char(sysdate,'DD-MON-RR HH24:MI') from dual;
TO_CHAR(SYSDATE
---------------
22-MAY-07 22:42
SQL>
SQL> select group#, status from v$log order by status;
GROUP# STATUS
---------- ----------------
2 CURRENT
3 INACTIVE
1 INACTIVE
SQL>
SQL> select recid, name, sequence#, to_char(first_time,'DD-MON-RR HH24:MI') ,blocks*block_size
2 from v$archived_log
3 where first_time > trunc(sysdate)
4 order by first_time;
RECID NAME SEQUENCE#
---------- --------------------------------------------- ----------
TO_CHAR(FIRST_T BLOCKS*BLOCK_SIZE
--------------- -----------------
21 C:\OR10G2DB\ARCH\ARC00001_0623282965.001 1
22-MAY-07 22:09 512
22 C:\OR10G2DB\ARCH\ARC00002_0623282965.001 2
22-MAY-07 22:09 175104
SQL>
SQL> select dbid, name, checkpoint_change#, current_scn from v$database;
DBID NAME CHECKPOINT_CHANGE#
---------- --------------------------------------------- ------------------
CURRENT_SCN
-----------
138573118 OR10G2DB 813231
813379
SQL>
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
SQL> spool off
============================================================================
As you can see, the ControlFile that is restored (from the Cold Backup) is NOT aware of ArchiveLogs beyond ARC00002. Note the SCNs : the CHECKPOINT_CHANGE# of 813231 is ahead of the 812793 seen just before the shutdown, because the shutdown for the Cold Backup itself performed a checkpoint. The CURRENT_SCN here is 813379 against the 813465 seen after the first restart -- SCNs get incremented at every startup, shutdown and OPEN, so two OPENs from the same backup image end up at slightly different Current SCNs.
As I have actually incremented SCNs and OPENed the database, I cannot really use this copy. I only did an OPEN to verify that the Restore was consistent and usable.
I will now Restore the Cold Backup *again*.
------------------------------------------------------------------------------------------------------------------------------
Whether I restore with or without the "online" Redo Log files this time, it doesn't matter --
because I will be doing an OPEN RESETLOGS that is necessitated by the USING BACKUP CONTROLFILE.
---------------------------------------------------------------------------------------------------------------------------------
Here is how I do the RollForward : Notice how the Rollforward is from ARC00003 onwards.
The "using backup controlfile" is important !
See my posting at http://www.freelists.org/archives/oracle-l/05-2007/msg00440.html
---------------------------------------------------------------------------------------------------------------------------------
SQL> startup mount
ORACLE instance started.
Total System Global Area 83886080 bytes
Fixed Size 1247420 bytes
Variable Size 54527812 bytes
Database Buffers 25165824 bytes
Redo Buffers 2945024 bytes
Database mounted.
SQL> recover database using backup controlfile until cancel;
ORA-00279: change 813230 generated at 05/22/2007 22:16:15 needed for thread 1
ORA-00289: suggestion : C:\OR10G2DB\ARCH\ARC00003_0623282965.001
ORA-00280: change 813230 for thread 1 is in sequence #3

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 813860 generated at 05/22/2007 22:22:49 needed for thread 1
ORA-00289: suggestion : C:\OR10G2DB\ARCH\ARC00004_0623282965.001
ORA-00280: change 813860 for thread 1 is in sequence #4
ORA-00278: log file 'C:\OR10G2DB\ARCH\ARC00003_0623282965.001' no longer needed
for this recovery

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 814287 generated at 05/22/2007 22:22:50 needed for thread 1
ORA-00289: suggestion : C:\OR10G2DB\ARCH\ARC00005_0623282965.001
ORA-00280: change 814287 for thread 1 is in sequence #5
ORA-00278: log file 'C:\OR10G2DB\ARCH\ARC00004_0623282965.001' no longer needed
for this recovery

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 814599 generated at 05/22/2007 22:22:59 needed for thread 1
ORA-00289: suggestion : C:\OR10G2DB\ARCH\ARC00006_0623282965.001
ORA-00280: change 814599 for thread 1 is in sequence #6
ORA-00278: log file 'C:\OR10G2DB\ARCH\ARC00005_0623282965.001' no longer needed
for this recovery

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 814620 generated at 05/22/2007 22:23:01 needed for thread 1
ORA-00289: suggestion : C:\OR10G2DB\ARCH\ARC00007_0623282965.001
ORA-00280: change 814620 for thread 1 is in sequence #7
ORA-00278: log file 'C:\OR10G2DB\ARCH\ARC00006_0623282965.001' no longer needed
for this recovery

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 814656 generated at 05/22/2007 22:23:06 needed for thread 1
ORA-00289: suggestion : C:\OR10G2DB\ARCH\ARC00008_0623282965.001
ORA-00280: change 814656 for thread 1 is in sequence #8
ORA-00278: log file 'C:\OR10G2DB\ARCH\ARC00007_0623282965.001' no longer needed
for this recovery

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 814864 generated at 05/22/2007 22:23:07 needed for thread 1
ORA-00289: suggestion : C:\OR10G2DB\ARCH\ARC00009_0623282965.001
ORA-00280: change 814864 for thread 1 is in sequence #9
ORA-00278: log file 'C:\OR10G2DB\ARCH\ARC00008_0623282965.001' no longer needed
for this recovery

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 815052 generated at 05/22/2007 22:23:12 needed for thread 1
ORA-00289: suggestion : C:\OR10G2DB\ARCH\ARC00010_0623282965.001
ORA-00280: change 815052 for thread 1 is in sequence #10
ORA-00278: log file 'C:\OR10G2DB\ARCH\ARC00009_0623282965.001' no longer needed
for this recovery

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
ORA-00279: change 815152 generated at 05/22/2007 22:23:14 needed for thread 1
ORA-00289: suggestion : C:\OR10G2DB\ARCH\ARC00011_0623282965.001
ORA-00280: change 815152 for thread 1 is in sequence #11
ORA-00278: log file 'C:\OR10G2DB\ARCH\ARC00010_0623282965.001' no longer needed
for this recovery

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
CANCEL
Media recovery cancelled.
SQL> alter database open resetlogs;
Database altered.
SQL> select to_char(txn_time,'DD-MON-RR HH24:MI:SS') from last_txn_time;
TO_CHAR(TXN_TIME,'
------------------
22-MAY-07 22:23:12
SQL> select group#, status from v$log order by status;
GROUP# STATUS
---------- ----------------
2 CURRENT
3 UNUSED
1 UNUSED
SQL>
SQL> select recid, name, sequence#, to_char(first_time,'DD-MON-RR HH24:MI') ,blocks*block_size
2 from v$archived_log
3 where first_time > trunc(sysdate)
4 order by first_time;
RECID
----------
NAME
--------------------------------------------------------------------------------
SEQUENCE# TO_CHAR(FIRST_T BLOCKS*BLOCK_SIZE
---------- --------------- -----------------
25
C:\OR10G2DB\ARCH\ARC00001_0623282965.001
1 22-MAY-07 22:09 512
21
C:\OR10G2DB\ARCH\ARC00001_0623282965.001
1 22-MAY-07 22:09 512
RECID
----------
NAME
--------------------------------------------------------------------------------
SEQUENCE# TO_CHAR(FIRST_T BLOCKS*BLOCK_SIZE
---------- --------------- -----------------
22
C:\OR10G2DB\ARCH\ARC00002_0623282965.001
2 22-MAY-07 22:09 175104
23
C:\OR10G2DB\ARCH\ARC00002_0623282965.001
RECID
----------
NAME
--------------------------------------------------------------------------------
SEQUENCE# TO_CHAR(FIRST_T BLOCKS*BLOCK_SIZE
---------- --------------- -----------------
2 22-MAY-07 22:09 175104
24
C:\OR10G2DB\ARCH\ARC00003_0623282965.001
3 22-MAY-07 22:10 373760
SQL> select dbid, name, checkpoint_change#, current_scn from v$database;
DBID NAME CHECKPOINT_CHANGE# CURRENT_SCN
---------- --------- ------------------ -----------
138573118 OR10G2DB 815154 815445
SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
============================================================================
During the RollForward, I applied ArchiveLogs up to ARC00010. I then queried my data (select to_char(txn_time,'DD-MON-RR HH24:MI:SS') from last_txn_time -- I could have queried "test_txn_table" as well) and verified that I do have the data created by my transactions *after* the Cold Backup.
You can also see that although the Log Sequence Numbers have got Reset, the SCNs do *NOT* get Reset. 10g's "new" feature of being able to roll forward through a RESETLOGS works because the ArchiveLog filenames contain a Reset_ID as well -- it has nothing to do with SCNs. SCNs are always incremented (SCNs, like time, can only go in one direction -- and although we can actually use Server OS commands to reset Server Clocks, Oracle relies on SCNs, not timestamps, for ordering transactions).

NOTE : This method of "RollForward" recovery is also covered in Oracle MetaLink Note#271916.1
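In outline, the RollForward demonstrated above is just this sequence :

startup mount
recover database using backup controlfile until cancel;
-- accept each suggested ArchiveLog name (or type AUTO),
-- then type CANCEL after the last available ArchiveLog
alter database open resetlogs;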

15 May, 2007

SQL Statement Execution Times

Daniel Fink has posted a blog entry about how Execution Statistics for long-running SQLs might not be visible in StatsPack reports if the SQLs complete only after the StatsPack Snapshots. The general advice is to NOT run StatsPack Snapshots at long intervals ("intervals of 8 hours are meaningless", we frequently hear). However, if there was a long-running SQL that began before the first StatsPack Snapshot and ended after the last Snapshot, that SQL wouldn't be seen in the StatsPack Report at all. You would be left wondering "if the server was so busy [and 'sar' did show 'busy'ness], why don't I see any significant SQLs in the Report ?". You might well miss the SQLs accounting for most of the buffer gets, disk reads and CPU time.

14 May, 2007

"Recovery" in Cold Backup

In a recent email discussion thread on ORACLE-L, I outlined methods of "recovery" from loss of ControlFile(s) and Online Redo Logs IF the last shutdown had been a Normal Shutdown (eg if the ControlFiles and/or RedoLog files have been deleted after the shutdown). Such a strategy could also be used where the database has been cloned by copying only the datafiles from a cold backup (shutdown normal/immediate).
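For instance, if only the Online Redo Logs are missing after a clean shutdown, a sketch of one such method (not the exact commands from that thread) is :

startup mount
recover database until cancel;
-- after a clean shutdown, no redo actually needs to be applied : type CANCEL
alter database open resetlogs;   -- the Online Redo Logs get recreated

If the ControlFiles are also missing, a CREATE CONTROLFILE ... RESETLOGS script (eg one generated earlier with ALTER DATABASE BACKUP CONTROLFILE TO TRACE) would be needed before the OPEN RESETLOGS.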

01 May, 2007

Stress Testing

An analogy of a "stress" test that should NOT be run.

Suppose your current system is a jar containing 1 million beans. The mouth of the jar is small. Generally, each user expects to fetch 100 beans. The way he does it is to put his hand in and pick up one bean at a time. Thus, each user has to put his hand into the jar 100 times. Because the mouth of the jar is small, only 2 or 3 users can put in their hands simultaneously. Other users have to keep waiting. It is possible that the first user is unable to rapidly put his hand into the jar 100 times because, on occasion, someone else's hand is in the jar. Therefore, some portion of his operation's time will be spent "waiting".

Now, suppose the new system is still a jar containing 1 million beans, but the mouth of the jar is much bigger. This means that even if each user has to put his hand in 100 times, more hands can go in simultaneously. Fewer users are left waiting to put their hands into the jar, and the probability that the first user has to wait because the mouth of the jar doesn't have enough free space is lower. He still takes the same amount of time to put in his hand and remove a bean 100 times, BUT he spends *less* time waiting for the mouth of the jar to be free. However, if your "stress test" is such that the user expects to fetch 1,000 beans [instead of 100 beans], he now has to put his hand into the jar 1,000 times ! That user will say "the system is slower".


The beans in the jar is the data.
The mouth of the jar is the system capacity [throughput in terms of the I/O calls that can be handled, given the storage bandwidth and response time, and the CPU calls that can be handled, given the number and speed of the CPUs]
The hand is the CPU or I/O call that the user makes.
The "single bean" is because each call will fetch a finite amount -- eg an Indexed Read will read 1 block at a time [if it has to read 1,000 datablocks, it has to make 1,000 {actually 2,000+} I/O calls].

The "stress test" should be :
a) If normally 5 users run reports, then have 10 users run reports
b) If the user fetches 100 rows, then he must still be fetching 100 rows -- if he fetches more rows than he normally does, then whatever the system, he still has to make that many more CPU or I/O calls

25 April, 2007

Recovery without UNDO Tablespace DataFiles

If you encounter a situation where an UNDO datafile is missing from a cloned database, you can actually drop and recreate the UNDOTBS tablespace with these steps (sketched below) :
a) Shutdown immediate
b) Edit the init.ora to unset "undo_management=AUTO" {Oracle then ignores the setting of "undo_tablespace"}
c) Startup
d) Drop Tablespace UNDOTBS
e) Create UNDO Tablespace UNDOTBS datafile ...
f) Shutdown
g) Reset "undo_management=AUTO"
h) Startup Database
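A sketch of those steps (the tablespace name, datafile path and size here are only illustrative) :

-- after editing the init.ora to comment out undo_management=AUTO :
startup
drop tablespace undotbs including contents and datafiles;
create undo tablespace undotbs
datafile '/u01/oradata/undotbs01.dbf' size 500m autoextend on;
shutdown immediate
-- restore undo_management=AUTO (and undo_tablespace=UNDOTBS) in the init.ora, then :
startup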

19 April, 2007

Programs that expect strings to be of a certain length !

Kevin Closson has posted on his blog a description of how the Oracle OUI encounters segmentation faults when trying to handle a very long Linux kernel name. Apparently, the OUI expects a length of not more than 30 characters, but Kevin's team had been building Linux servers with longer names. Now, 30 characters is quite long enough, wouldn't you say ?
This is an example of a mismatch between two different programs. Linux's "uname" call does not specify lengths for the uname structures, but Oracle, going by experience from hundreds of thousands of installations over the years, does not expect a name longer than 30 characters. I wouldn't fault Oracle !

18 April, 2007

UNDO and REDO for INSERTs and DELETEs

Last Sunday, I wrote a note {for developers and analysts within my organisation} on the impact of INSERT and DELETE operations on UNDO and REDO requirements. As the note also included actual SQL statements on a demo table, I could show the Undo size from V$TRANSACTION and the Redo size from V$SESSTAT.

I included a few options (CTAS, APPEND, NOLOGGING, plain INSERT) and scenarios (new table, without indexes, with 2 indexes). The presence of indexes did make a significant difference to the total REDO.


Learnings :
The UNDO for a DELETE is more than that for an INSERT.
That is because the UNDO for a DELETE is effectively a re-INSERT of the whole row. So, it has to preserve the values of each of the columns of the row to be able to "Undo" the DELETE.
Conversely, the UNDO for an INSERT is a Delete of the row by ROWID -- which does not have to preserve the values being deleted.

A DELETE causes more REDO than an INSERT.
Since the UNDO is itself also preserved in the REDO stream, the larger UNDO of a DELETE enlarges its REDO as well.

The presence of Indexes also increases the amount of UNDO and REDO.
If the Table has one or more Indexes present, an INSERT also has to update the Index(es). Similarly, a DELETE has to remove entries from the Index(es). Like the Table, the Index(es) is/are also protected by UNDO and REDO.

DELETEs always cause REDO (ie they are always "logging").
There are NOLOGGING methods of achieving INSERTs, but DELETEs are always normal DML protected by Redo {except when the same effect is achieved by executing TRUNCATEs}. See {link to 9i Doc on nologging} for information on NOLOGGING operations.

Normal DML always causes REDO.
Even if a Table has been created or configured with the NOLOGGING option, all normal DML -- including INSERTs -- still causes logging. Again, see {link to 9i Doc on nologging}.
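All of these observations are easy to reproduce. A sketch of the measurement itself -- run a candidate INSERT or DELETE and, before committing, query (the first query assumes you are measuring your own session) :

-- Undo consumed by my current transaction :
select t.used_ublk, t.used_urec
from v$transaction t, v$session s
where t.addr = s.taddr
and s.sid = (select distinct sid from v$mystat);

-- Redo generated by my session so far (compare the value before and after the DML) :
select n.name, m.value
from v$mystat m, v$statname n
where m.statistic# = n.statistic#
and n.name = 'redo size';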


Practical Uses :
When doing large DELETEs, consider if you can drop one or some of the Indexes on
the Table to reduce the REDO

If doing a very large purge of a Table, it can be faster to use this method (sketched at the end of this post) :
Create an empty new Table and set it to NOLOGGING
Use an INSERT with an APPEND hint to copy the selective data from the original Table (the one to be purged) -- ie copy the data that is to be retained
Build Indexes on the new Table (or build them after the rename in the next step)
Drop the original Table and rename the new Table to take the name of the original Table
Build Indexes if not done earlier
Create Triggers
Recompile dependent objects

When "bulk-copying" data from between two existing Tables (or to a new Table), you can use the same method as above

Rebuild Indexes after bulk deletes

Use the NOLOGGING option when rebuilding Indexes

After all such activities are completed, reset the Tables and Indexes to LOGGING
(although normal DML would, in any case, ignore the NOLOGGING attribute on the Table, a table which appears to be NOLOGGING can worry a DBA who knows something about NoLogging but not enough !)

Avoid using NOLOGGING operations when Online Backups are running (particularly on OLTP)

Identify all scheduled jobs and adhoc executions which use NOLOGGING, and track them in case a Restoration and Recovery of the database needs to be executed -- a Recovery through a NOLOGGING job would mark the affected blocks as "logically corrupt", requiring manual re-execution of the NOLOGGING job.
Critical databases could be set to FORCE LOGGING. It is also good to periodically check for NoLogging operations (query V$DATAFILE for UNRECOVERABLE_CHANGE# and UNRECOVERABLE_TIME, as shown below).
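Here is the large-purge method from the list above as a sketch (the table, column and index names are only illustrative) :

create table big_table_new nologging
as select * from big_table where 1=2;

insert /*+ APPEND */ into big_table_new
select * from big_table
where txn_date >= add_months(sysdate,-12);  -- ie the data to be RETAINED
commit;

drop table big_table;
rename big_table_new to big_table;

create index big_table_ndx on big_table(txn_date) nologging;

alter table big_table logging;
alter index big_table_ndx logging;
-- then recreate Triggers, recompile dependent objects and take a fresh Backup

And the periodic check for NoLogging operations mentioned above :

select file#, unrecoverable_change#, unrecoverable_time
from v$datafile
where unrecoverable_change# > 0;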