Sunday, March 23, 2008

A simple index test using Oracle 10g free edition

Background

I downloaded oracle 10g express edition from http://www.oracle.com/technology/products/database/xe/index.html.

I have a Pentium 4 winxp machine with 1gb ram and 320 gb hdd.

The download and install was a breeze.

It installs as a service and does start up with windows. However it can be stopped from starting up with windows, by using the services tool in the control panel.

When up and running the oracle 10g took around 250 mb ram on my pc.

I do not have toad, but the 10g express edition install comes with SQL command line and also a HTML based UI. The HTML UI though not comparable to Toad, does work alright with the basic tasks.

I wanted to test out a single column index and see if it mad any difference in searches.

The database and data setup

The test table and sequence is as below:-


Sequence details

CREATE SEQUENCE "TASK_LIST_SEQ"
MINVALUE 1 MAXVALUE 999999999999999999999999999
INCREMENT BY 1 START WITH 1 CACHE 20 NOORDER NOCYCLE
/

Table details

CREATE TABLE "TASK_LIST"
( "TASK_ID" NUMBER(18,0) NOT NULL ENABLE,
"TASK_NAME" VARCHAR2(100) NOT NULL ENABLE,
"TASK_STATUS" NUMBER(1,0) NOT NULL ENABLE,
CONSTRAINT "TASK_LIST_PK" PRIMARY KEY ("TASK_ID") ENABLE
)
/
CREATE OR REPLACE TRIGGER "BI_TASK_LIST"
before insert on "TASK_LIST"
for each row
begin
select "TASK_LIST_SEQ".nextval into :NEW.TASK_ID from dual;
end;

/
ALTER TRIGGER "BI_TASK_LIST" ENABLE
/


1) The table holds tasks with their id's, name and status.
2) The id is the primary key.
3) The sequence helps with the id generation automatically with the help of a trigger.
4) The column of interest is TASK_STATUS. I have decided to have status of 1 as complete and 0 as not complete.

Now we will do the following with the help of a pl/sql block.

1) Insert 200000 tasks with status 1(complete)
2) Insert 30000 tasks with status 0(not complete).

The code to do so is as below:-


--Insert 200000 tasks with status 1(complete)
declare
task_name_var varchar2(100);
begin
for i in 1..200000 loop
task_name_var := 'Task ' || i;
--DBMS_OUTPUT.PUT_LINE('task_name_var : ' || task_name_var);
Insert into
task_list(task_name, task_status)
values(task_name_var,1);
end loop;
end;

--Insert 30000 tasks with status 0(not complete)
declare
task_name_var varchar2(100);
begin
for i in 200001..230000 loop
task_name_var := 'Task ' || i;
--DBMS_OUTPUT.PUT_LINE('task_name_var : ' || task_name_var);
Insert into
task_list(task_name, task_status)
values(task_name_var,1);
end loop;
end;


The Tests without any index

Test not complete tasks

do a select count for all task not complete

select count(*) from task_list tl where tl.task_status = 0;

The explain plan (click on the image for greater resolution)



Test complete tasks

do a select count for all task complete

select count(*) from task_list tl where tl.task_status =1;

The explain plan (click on the image for greater resolution)



The Tests with index

Add the Index

create index "TASK_STATUS_IDX1"
on "TASK_LIST" ("TASK_STATUS")
/

Test not complete tasks

do a select count for all task not complete

select count(*) from task_list tl where tl.task_status = 0;

The explain plan (click on the image for greater resolution)



Test complete tasks

do a select count for all task complete

select count(*) from task_list tl where tl.task_status =1;

The explain plan (click on the image for greater resolution)



Conclusion

Note:- The number of rows for task complete(task_status - 1) was 200000 and for task not complete(task_status - 0) was 30000.

1) Without the index the time taken by both searches was 3 and cost was 234
2) With the index the time taken by the task complete(task_status - 1) search was 2 and cost was 131
3) With the index the time taken by the task not complete(task_status - 0) search was 1 and cost was 61.
4) The index on task_status column did improve performance.
5) The performance gain was greater for the task_status value with less number of rows i.e. task not complete(task_status - 0).