Data Modeling Data Warehousing Data Quality Oracle Database Erwin BI FAQ: Oracle SQL PL Sql Questions Answers

What are the indexes in Oracle ?

An index is a schema object that contains an entry for each value that appears in the indexed column(s) of the table or cluster and provides direct, fast access to rows. Oracle Database supports several types of index:
Normal indexes. (By default, Oracle Database creates B-tree indexes.)
Bitmap indexes, which store rowids associated with a key value as a bitmap
Partitioned indexes, which consist of partitions containing an entry for each value that appears in the indexed column(s) of the table
Function-based indexes, which are based on expressions. They enable you to construct queries that evaluate the value returned by an expression, which in turn may include built-in or user-defined functions.
Domain indexes, which are instances of an application-specific index of type indextype

Q)What is difference between IN and NOT IN ?

SQL: "IN" Function
________________________________________
The IN function helps reduce the need to use multiple OR conditions.
The syntax for the IN function is:
SELECT columns
FROM tables
WHERE column1 in (value1, value2, .... value_n);
This SQL statement will return the records where column1 is value1, value2..., or value_n. The IN function can be used in any valid SQL statement - select, insert, update, or delete.

Example #1
The following is an SQL statement that uses the IN function:
SELECT *
FROM suppliers
WHERE supplier_name in ( 'IBM', 'Hewlett Packard', 'Microsoft');
This would return all rows where the supplier_name is either IBM, Hewlett Packard, or Microsoft. Because the * is used in the select, all fields from the suppliers table would appear in the result set.
It is equivalent to the following statement:
SELECT *
FROM suppliers
WHERE supplier_name = 'IBM'
OR supplier_name = 'Hewlett Packard'
OR supplier_name = 'Microsoft';
As you can see, using the IN function makes the statement easier to read and more efficient.

Example #2
You can also use the IN function with numeric values.
SELECT *
FROM orders
WHERE order_id in (10000, 10001, 10003, 10005);
This SQL statement would return all orders where the order_id is either 10000, 10001, 10003, or 10005.
It is equivalent to the following statement:
SELECT *
FROM orders
WHERE order_id = 10000
OR order_id = 10001
OR order_id = 10003
OR order_id = 10005;

Example #3 using "NOT IN"
The IN function can also be combined with the NOT operator.
For example,
SELECT *
FROM suppliers
WHERE supplier_name not in ( 'IBM', 'Hewlett Packard', 'Microsoft');
This would return all rows where the supplier_name is neither IBM, Hewlett Packard, or Microsoft. Sometimes, it is more efficient to list the values that you do not want, as opposed to the values that you do want.

Q)Understanding how NULLS affect the IN,NOT IN , EXISTS and NOT EXISTS?

Answer 1:

On the surface, it may appear that the SQL clauses IN and EXISTS are interchangeable. However, they're quite different in how they handle NULL values and may give different results. The problem comes from the fact that, in an Oracle database, a NULL value means unknown, so any comparison or operation against a NULL value is also NULL, and any test that returns NULL is always ignored. For example, neither one of these queries return any rows:

select 'true' from dual where 1 = null; = No Result
select 'true' from dual where 1 != null; = No Result

The value 1 is neither equal nor not equal to NULL. Only IS NULL would return true on a NULL value and return a row.

select 'true' from dual where 1 is null; ; = No Result
select 'true' from dual where null is null; Return ‘true’

When you use IN, you're telling SQL to take a value and compare it against every value or set of values in a list using =. If any NULL values exist, a row will not be returned--even if both values are NULL.

select 'true' from dual where null in (null); = No Result
select 'true' from dual where (null,null) in ((null,null)); = No Result
select 'true' from dual where (1,null) in ((1,null)); = No Result

An IN is functionally equivalent to the = ANY clause:

select 'true' from dual where null = ANY (null); ); = No Result
select 'true' from dual where (null,null) = ANY ((null,null)); ); = No Result
select 'true' from dual where (1,null) = ANY ((1,null)); ); = No Result

When you use an equivalent form of EXISTS, SQL counts rows and ignores the value(s) in the subquery--even if you return NULL.

select 'true' from dual where exists (select null from dual); Return True
select 'true' from dual where exists (select 0 from dual where null is null); Return True

The IN and EXISTS are logically the same. The IN clause compares values returned by the subquery and filters out rows in the outer query; the EXISTS clause compares values and filters out rows inside the subquery. In the case of NULL values, the resulting set of rows is the same.

select ename from emp where empno in (select mgr from emp); Return Manager name
select ename from emp e where exists (select 0 from emp where mgr = e.empno); Return
Manager name

But problems arise when the logic is reversed to use NOT IN and NOT EXISTS, which return different sets of rows (the first query returns 0 rows; the second returns the intended data--they aren't the same query):
select ename from emp where empno not in (select mgr from emp); No rows returned because subquery return NULL

select ename from emp e where not exists (select 0 from emp where mgr =
e.empno);
The NOT IN clause is virtually equivalent to comparing each value with = and failing if any test is FALSE or NULL. For example:

select 'true' from dual where 1 not in (null,2);
select 'true' from dual where 1 != null and 1 != 2;
select 'true' from dual where (1,2) not in ((2,3),(2,null));
select 'true' from dual where (1,null) not in ((1,2),(2,3));

These queries don't return any rows. The second is more obvious, 1 != NULL is NULL, so the whole WHERE condition is false for that row. While these would work:
select 'true' from dual where 1 not in (2,3);
select 'true' from dual where 1 != 2 and 1 != 3;
You can still use the NOT IN query from before, as long as you prevent NULL from being returned in the results (again, these both work, but I'm assuming empno is not null, which is a good assumption in this case):
select ename from emp where empno not in (select mgr from emp where mgr is not
null);
select ename from emp where empno not in (select nvl(mgr,0) from emp);
By understanding the difference between IN, EXISTS, NOT IN, and NOT EXISTS, you can avoid a very common problem when NULLs appear in the data of a subquery.

Answer2 :

Resulting in:
EMP_NBR EMP_NAME MGR_NBR
1 DON 5
2 HARI 5
3 RAMESH 5
4 JOE 5
5 DENNIS NULL
6 NIMISH 5
7 JESSIE 5
8 KEN 5
9 AMBER 5
10 JIM 5
Now, the aim is to find all those employees who are not managers. Let’s see how we can achieve that by using the “NOT IN” vs the “NOT EXISTS” clause.
NOT IN
SQL> select count(*) from emp_master where emp_nbr not in ( select mgr_nbr from emp_master );
COUNT(*)
———-
0
This means that everyone is a manager…hmmm, I wonder whether anything ever gets done in that case
NOT EXISTS
SQL> select count(*) from emp_master T1 where not exists ( select 1 from emp_master T2 where t2.mgr_nbr = t1.emp_nbr );
COUNT(*)
———-
9
Now there are 9 people who are not managers. So, you can clearly see the difference that NULL values make and since NULL != NULL in SQL, the NOT IN clause does not return any records back. (in MS SQL Server, depending upon the ANSI NULLS setting, the behavior can be altered but this post only talks about the behavior that is same in Oracle, DB2 LUW and MS SQL Server).
Performance implications:
When using “NOT IN”, the query performs nested full table scans, whereas for “NOT EXISTS”, query can use an index within the sub-query.
Another Optional Method
Another way of doing this is to use an outer join and check for NULL values in the other table:
SELECT COUNT(*)
FROM EMP_MASTER T1
LEFT OUTER JOIN EMP_MASTER T2
ON T1.EMP_NBR = T2.MGR_NBR
WHERE T2.MGR_NBR IS NULL
/
Of course, there should be other selection criteria as well (possibly a range search criteria, an equality SARG (searchable argument) criteria etc.) to help improve the selectivity besides just the NOT EXISTS clause.

Answer 3:

What's the difference between NOT IN and NOT EXISTS in SQL? Which is better to use in SQL queries for the performance aspects?

> EXPERT RESPONSE

For the first question, let's assume we're talking about the difference in syntax between NOT IN and NOT EXISTS when both have subqueries.

Here are examples of the syntax:

select client_name
from clients
where client_id not in
( select client_id
from clientwarnings )
select client_name
from clients as x
where not exists
( select client_id
from clientwarnings
where client_id = x.client_id )
In the NOT IN example, the subquery identifies a set of client_ids, against which each client_id in the clients table can be compared. In the NOT EXISTS example, the subquery is correlated and can be thought of as doing a lookup for each client_id in the clients table.

Your second question asks which is preferred for performance reasons. My answer is: it's your call. Don't trust your instinct, because it might be wrong.

My guess would be that the NOT IN query has to build a list for the subquery, which is slow, but then it can do in-memory compares, which is fast, while the NOT EXISTS query can use matching indexes, which are faster than building a memory table ... As you can see, I am easily confused.

And what about the LEFT OUTER JOIN testing for IS NULL? A perfectly good alternate solution:

select c.client_name
from clients as c
left outer
join clientwarnings as cw
on c.client_id = cw.client_id
where cw.client_id is null
Answer 4 :
When coding a SQL statement with tables in master-detail relationships, it's common to have to decide whether to write the query using the WHERE EXISTS (. . .) clause or the WHERE value IN (. . .) clause. You may resist using WHERE EXISTS because it has the awkward syntax of returning a value, which you always ignore.
However, there's a difference when using rule-based optimization. You can determine the performance of a rule-based query by understanding which table is the driving table and how many rows each part returns.
When you write a query using the IN clause, you're telling the rule-based optimizer that you want the inner query to drive the outer query (think: IN = inside to outside). For example, to query the 14-row EMP table for the direct reports to the employee KING, you could write the following:
select ename from emp e
where mgr in (select empno from emp where ename = 'KING');
Here's the EXPLAIN PLAN for this query:
OBJECT OPERATION
---------- ----------------------------------------
SELECT STATEMENT()
NESTED LOOPS()
EMP TABLE ACCESS(FULL)
EMP TABLE ACCESS(BY INDEX ROWID)
PK_EMP INDEX(UNIQUE SCAN)
This query is virtually equivalent to this:
select e1.ename from emp e1,(select empno from emp where ename = 'KING') e2
where e1.mgr = e2.empno;
You can write the same query using EXISTS by moving the outer query column to a subquery condition, like this:
select ename from emp e
where exists (select 0 from emp where e.mgr = empno and ename = 'KING');
When you write EXISTS in a where clause, you're telling the optimizer that you want the outer query to be run first, using each value to fetch a value from the inner query (think: EXISTS = outside to inside).
The EXPLAIN PLAN result for the query is:
OBJECT OPERATION
---------- ----------------------------------------
SELECT STATEMENT()
FILTER()
EMP TABLE ACCESS(FULL)
EMP TABLE ACCESS(BY INDEX ROWID)
PK_EMP INDEX(UNIQUE SCAN)
This is virtually similar to the PL/SQL code:
set serveroutput on;
declare
l_count integer;
begin
for e in (select mgr,ename from emp) loop
select count(*) into l_count from emp
where e.mgr = empno and ename = 'KING';
if l_count != 0 then
dbms_output.put_line(e.ename);
end if;
end loop;
end;
To determine which clause offers better performance in rule-based optimization, consider how many rows the inner query will return in comparison to the outer query. In many cases, EXISTS is better because it requires you to specify a join condition, which can invoke an INDEX scan. However, IN is often better if the results of the subquery are very small. You usually want to run the query that returns the smaller set of results first.
Some people avoid the EXISTS clause because of the requirement to return a result from the query--even though the result is never used. Depending on personal style, people often use 'X,' 1, 0, or null. From looking at the EXPLAIN PLAN output, it appears that the optimizer throws out whatever value you enter and uses 0 all the time. Many developers get into the habit of always entering some constant value.

Data Modeling Data Warehousing Data Quality Oracle Database Erwin BI FAQ

Friday, 8 January 2010

Oracle SQL PL Sql Questions Answers

No comments:

Blog Archive

About Me