الانتقال إلى المحتوى

شرح UNION ALL و UNION و MINUS و INTERSECT


omar-alreyati

Recommended Posts

الاخوة الاعزاء : تحية طيبة وبعد

أرجو منكم التكرم بشرح UNION ALL و UNION و MINUS و INTERSECT ان أمكن مع توضيح الفائده من استخدامه مع الامثلة التي توضح ذلك .

راجياً منكم الرد بالسرعة الممكنة .
وشكراً

رابط هذا التعليق
شارك

السلام عليكم
UNION - Combine the unique rows returned by 2 SELECT statements


UNION ALL - Combine the rows returned by 2 SELECT statements (including all duplicates)

INTERSECT - Return only those rows that are in *both* SELECT statements

MINUS - Return the rows that are in the first SELECT but not the second

To combine more than two SELECTs simply nest the expressions

SELECT expr1 UNION (SELECT expr2 UNION SELECT expr3)

رابط هذا التعليق
شارك

اليك الشرح مع امثلة

UNION ALL

The UNION ALL operator merges the result sets of two component queries. This operation returns rows retrieved by either of the component queries. The following example illustrates the UNION ALL operation: 

SELECT CUST_NBR, NAME
FROM CUSTOMER WHERE REGION_ID = 5
UNION ALL
SELECT C.CUST_NBR, C.NAME
FROM CUSTOMER C
WHERE C.CUST_NBR IN (SELECT O.CUST_NBR
                    FROM CUST_ORDER O, EMPLOYEE E
                    WHERE O.SALES_EMP_ID = E.EMP_ID AND E.LNAME = 'MARTIN');

 CUST_NBR NAME
---------- ------------------------------
        1 Cooper Industries
        2 Emblazon Corp.
        3 Ditech Corp.
        4 Flowtech Inc.
        5 Gentech Industries
        4 Flowtech Inc.
        8 Zantech Inc.

7 rows selected.

As we can see from the result set, there is one customer, which is retrieved by both the SELECTs, and therefore appears twice in the result set. The UNION ALL operator simply merges the output of its component queries, without caring about any duplicates in the final result set.



UNION

The UNION operator returns all distinct rows retrieved by two component queries. The UNION operation eliminates duplicates while merging rows retrieved by either of the component queries. The following example illustrates the UNION operation: 

SELECT CUST_NBR, NAME
FROM CUSTOMER WHERE REGION_ID = 5
UNION
SELECT C.CUST_NBR, C.NAME
FROM CUSTOMER C
WHERE C.CUST_NBR IN (SELECT O.CUST_NBR
                    FROM CUST_ORDER O, EMPLOYEE E
                    WHERE O.SALES_EMP_ID = E.EMP_ID AND E.LNAME = 'MARTIN');

 CUST_NBR NAME
---------- ------------------------------
        1 Cooper Industries
        2 Emblazon Corp.
        3 Ditech Corp.
        4 Flowtech Inc.
        5 Gentech Industries
        8 Zantech Inc.

6 rows selected.

This query is a modification of the previous query; the keywords UNION ALL have been replaced with UNION. Notice that the result set contains only distinct rows (no duplicates). To eliminate duplicate rows, a UNION operation needs to do some extra tasks as compared to the UNION ALL operation. These extra tasks include sorting and filtering the result set. If we observe carefully, we will notice that the result set of the UNION ALL operation is not sorted, whereas the result set of the UNION operation is sorted. These extra tasks introduce a performance overhead to the UNION operation. A query involving UNION will take extra time compared to the same query with UNION ALL, even if there are no duplicates to remove. Therefore, unless we have a valid need to retrieve only distinct rows, we should use UNION ALL instead of UNION for better performance.



INTERSECT

INTERSECT returns only the rows retrieved by both component queries. Compare this with UNION, which returns the rows retrieved by any of the component queries. If UNION acts like 'OR', INTERSECT acts like 'AND'. For example: 

SELECT CUST_NBR, NAME
FROM CUSTOMER WHERE REGION_ID = 5
INTERSECT
SELECT C.CUST_NBR, C.NAME
FROM CUSTOMER C
WHERE C.CUST_NBR IN (SELECT O.CUST_NBR
                    FROM CUST_ORDER O, EMPLOYEE E
                    WHERE O.SALES_EMP_ID = E.EMP_ID AND E.LNAME = 'MARTIN');

 CUST_NBR NAME
---------- ------------------------------
        4 Flowtech Inc.

As we saw earlier, "Flowtech Inc." was the only customer retrieved by both SELECT statements. Therefore, the INTERSECT operator returns just that one row.



MINUS

MINUS returns all rows from the first SELECT that are not also returned by the second SELECT. For example: 

SELECT CUST_NBR, NAME
FROM CUSTOMER WHERE REGION_ID = 5
MINUS
SELECT C.CUST_NBR, C.NAME
FROM CUSTOMER C
WHERE C.CUST_NBR IN (SELECT O.CUST_NBR
                    FROM CUST_ORDER O, EMPLOYEE E
                    WHERE O.SALES_EMP_ID = E.EMP_ID AND E.LNAME = 'MARTIN');

 CUST_NBR NAME
---------- ------------------------------
        1 Cooper Industries
        2 Emblazon Corp.
        3 Ditech Corp.
        5 Gentech Industries

You might wonder why we don't see "Zantech Inc." in the output. An important thing to note here is that the execution order of component queries in a set operation is from top to bottom. The results of UNION, UNION ALL, and INTERSECT will not change if we alter the ordering of component queries. However, the result of MINUS will be different if we alter the order of the component queries. If we rewrite the previous query by switching the positions of the two SELECTs, we get a completely different result: 

SELECT C.CUST_NBR, C.NAME
FROM CUSTOMER C
WHERE C.CUST_NBR IN (SELECT O.CUST_NBR
                    FROM CUST_ORDER O, EMPLOYEE E
                    WHERE O.SALES_EMP_ID = E.EMP_ID AND E.LNAME = 'MARTIN')
MINUS
SELECT CUST_NBR, NAME
FROM CUSTOMER WHERE REGION_ID = 5;

 CUST_NBR NAME
---------- ------------------------------
        8 Zantech Inc.

The row for "Flowtech Inc." is returned by both queries, so in our first MINUS example the first component query adds "Flowtech Inc." to the result set while the second component query removes it. The second example turns the MINUS operation around. The first component query adds "Flowtech Inc." and "Zantech Inc." to the result set. The second component query specifies rows to subtract. One of the rows to subtract is "Flowtech Inc.", leaving "Zantech Inc." as the sole remaining row. 

 In a MINUS operation, rows may be returned by the second SELECT that are not also returned by the first. These rows are not included in the output. 

Using Set Operations to Compare Two Tables
Developers, and even DBAs, occasionally need to compare the contents of two tables to determine whether the tables contain the same data. The need to do this is especially common in test environments, as developers may want to compare a set of data generated by a program under test with a set of "known good" data. Comparison of tables is also useful for automated testing purposes, when we have to compare actual results with a given set of expected results. SQL's set operations provide an interesting solution to this problem of comparing two tables. 

The following query uses both MINUS and UNION ALL to compare two tables for equality. The query depends on each table having either a primary key or at least one unique index. 

(SELECT * FROM CUSTOMER_KNOWN_GOOD MINUS
SELECT * FROM CUSTOMER_TEST) UNION ALL
(SELECT * FROM CUSTOMER_TEST MINUS
SELECT * FROM CUSTOMER_KNOWN_GOOD); Let's talk a bit about how this query works. We can look at it as the union of two compound queries. The parentheses ensure that both MINUS operations take place first before the UNION ALL operation is performed. The result of the first MINUS query will be those rows in CUSTOMER_KNOWN_GOOD that are not also in CUSTOMER_TEST. The result of the second MINUS query will be those rows in CUSTOMER_TEST that are not also in CUSTOMER_KNOWN_GOOD. The UNION ALL operator simply combines these two result sets for convenience. If no rows are returned by this query, then we know that both tables have identical rows. Any rows returned by this query represent differences between the CUSTOMER_TEST and CUSTOMER_KNOWN_GOOD tables.

If the possibility exists for one or both tables to contain duplicate rows, we must use a more general form of this query in order to test two tables for equality. This more general form uses row counts to detect duplicates:

(SELECT C1.*,COUNT(*) 
FROM CUSTOMER_KNOWN_GOOD GROUP BY C1.CUST_NBR, C1.NAME...
MINUS
SELECT C2.*, COUNT(*)
FROM CUSTOMER_TEST C2
GROUP BY C2.CUST_NBR, C2.NAME...)
UNION ALL
(SELECT C3.*,COUNT(*) 
FROM CUSTOMER_TEST C3
GROUP BY C3.CUST_NBR, C3.NAME...
MINUS
SELECT C4.*, COUNT(*)
FROM CUSTOMER_KNOWN_GOOD C4
GROUP BY C4.CUST_NBR, C4.NAME...)

This query is getting complex! The GROUP BY clause (see Chapter 4) for each SELECT must list all columns for the table being selected. Any duplicate rows will be grouped together, and the count will reflect the number of duplicates. If the number of duplicates is the same in both tables, the MINUS operations will cancel those rows out. If any rows are different, or if any occurrence counts are different, the resulting rows will be reported by the query. 

Let's look at an example to illustrate how this query works. We'll start with the following tables and data:

DESC CUSTOMER_KNOWN_GOOD
Name                         Null?    Type
---------------------------- -------- ----------------
CUST_NBR                     NOT NULL NUMBER(5)
NAME                         NOT NULL VARCHAR2(30)

SELECT * FROM CUSTOMER_KNOWN_GOOD; CUST_NBR NAME
----------- ------------------------------
         1 Sony
         1 Sony
         2 Samsung
         3 Panasonic
         3 Panasonic
         3 Panasonic

6 rows selected.

DESC CUSTOMER_TEST
Name                         Null?    Type
---------------------------- -------- ----------------
CUST_NBR                     NOT NULL NUMBER(5)
NAME                         NOT NULL VARCHAR2(30)

SELECT * FROM CUSTOMER_TEST; CUST_NBR NAME
----------- ------------------------------
         1 Sony
         1 Sony
         2 Samsung
         2 Samsung
         3 Panasonic

As we can see the CUSTOMER_KNOWN_GOOD and CUSTOMER_TEST tables have the same structure, but different data. Also notice that none of these tables has a primary or unique key; there are duplicate records in both. The following SQL will compare these two tables effectively:

(SELECT C1.*, COUNT(*)
FROM CUSTOMER_KNOWN_GOOD C1
GROUP BY C1.CUST_NBR, C1.NAME
MINUS
SELECT C2.*, COUNT(*)
FROM CUSTOMER_TEST C2
GROUP BY C2.CUST_NBR, C2.NAME)
UNION ALL
(SELECT C3.*, COUNT(*)
FROM CUSTOMER_TEST C3
GROUP BY C3.CUST_NBR, C3.NAME
MINUS
SELECT C4.*, COUNT(*)
FROM CUSTOMER_KNOWN_GOOD C4
GROUP BY C4.CUST_NBR, C4.NAME);

  CUST_NBR NAME                             COUNT(*)
----------- ------------------------------ ----------
         2 Samsung                                 1
         3 Panasonic                               3
         2 Samsung                                 2
         3 Panasonic                               1

These results indicate that one table (CUSTOMER_KNOWN_GOOD) has one record for "Samsung", whereas the second table (CUSTOMER_TEST) has two records for the same customer. Also, one table (CUSTOMER_KNOWN_GOOD) has three records for "Panasonic", whereas the second table (CUSTOMER_TEST) has one record for the same customer. Both the tables have the same number of rows (two) for "Sony", and therefore "Sony" doesn't appear in the output.

TIP:   Duplicate rows are not possible in tables that have a primary key or at least one unique index. Use the short form of the table comparison query for such tables.



Using NULLs in Compound Queries

We discussed union compatibility conditions at the beginning of this chapter. The union compatibility issue gets interesting when NULLs are involved. As we know, NULL doesn't have a datatype, and NULL can be used in place of a value of any datatype. If we purposely select NULL as a column value in a component query, Oracle no longer has two datatypes to compare in order to see whether the two component queries are compatible. For character columns, this is no problem. For example:

SELECT 1 NUM, 'DEFINITE' STRING FROM DUAL UNION
SELECT 2 NUM, NULL STRING FROM DUAL; NUM STRING
---------- --------
1 DEFINITE
2

Notice that Oracle considers the character string 'DEFINITE' from the first component query to be compatible with the NULL value supplied for the corresponding column in the second component qery. However, if a NUMBER or a DATE column of a component query is set to NULL, we must explicitly tell Oracle what "flavor" of NULL to use. Otherwise, we'll encounter errors. For example:

SELECT 1 NUM, 'DEFINITE' STRING FROM DUAL UNION
SELECT NULL NUM, 'UNKNOWN' STRING FROM DUAL; SELECT 1 NUM, 'DEFINITE' STRING FROM DUAL *
ERROR at line 1:
ORA-01790: expression must have same datatype as corresponding expression

Note that the use of NULL in the second component query causes a datatype mismatch between the first column of the first component query, and the first column of the second component query. Using NULL for a DATE column causes the same problem, as in the following example:

SELECT 1 NUM, SYSDATE DATES FROM DUAL UNION
SELECT 2 NUM, NULL DATES FROM DUAL; SELECT 1 NUM, SYSDATE DATES FROM DUAL *
ERROR at line 1:
ORA-01790: expression must have same datatype as corresponding expression

In these cases, we need to cast the NULL to a suitable datatype to fix the problem, as in the following examples:

SELECT 1 NUM, 'DEFINITE' STRING FROM DUAL UNION
SELECT TO_NUMBER(NULL) NUM, 'UNKNOWN' STRING FROM DUAL; NUM STRING
---------- --------
1 DEFINITE
UNKNOWN

SELECT 1 NUM, SYSDATE DATES FROM DUAL UNION
SELECT 2 NUM, TO_DATE(NULL) DATES FROM DUAL; NUM DATES
---------- ---------
1 06-JAN-02
2

This problem of union compatibility when using NULLs is encountered in Oracle8i. However, there is no such problem in Oracle9i, as we can see in the following examples generated from an Oracle9i database:

SELECT 1 NUM, 'DEFINITE' STRING FROM DUAL UNION
SELECT NULL NUM, 'UNKNOWN' STRING FROM DUAL; NUM STRING
---------- --------
1 DEFINITE
UNKNOWN

SELECT 1 NUM, SYSDATE DATES FROM DUAL UNION
SELECT 2 NUM, NULL DATES FROM DUAL; NUM DATES
---------- ---------
1 06-JAN-02
2

Oracle9i is smart enough to know which flavor of NULL to use in a compound query.

Rules and Restrictions on Set Operations
Other than the union compatibility conditions discussed at the beginning of the chapter, there are some other rules and restrictions that apply to the set operations. These rules and restrictions are as follows:

Column names for the result set are derived from the first SELECT:

SELECT CUST_NBR "Customer ID", NAME "Customer Name"
FROM CUSTOMER WHERE REGION_ID = 5
UNION
SELECT C.CUST_NBR, C.NAME
FROM CUSTOMER C
WHERE C.CUST_NBR IN (SELECT O.CUST_NBR
FROM CUST_ORDER O, EMPLOYEE E
WHERE O.SALES_EMP_ID = E.EMP_ID AND E.LNAME = 'MARTIN');

Customer ID Customer Name
----------- ----------------------
1 Cooper Industries
2 Emblazon Corp.
3 Ditech Corp.
4 Flowtech Inc.
5 Gentech Industries
8 Zantech Inc.

6 rows selected.

Although both SELECTs use column aliases, the result set takes the column names from the first SELECT. The same thing happens when we create a view based on a set operation. The column names in the view are taken from the first SELECT:

CREATE VIEW V_TEST_CUST AS SELECT CUST_NBR "Customer ID", NAME "Customer Name"
FROM CUSTOMER WHERE REGION_ID = 5
UNION
SELECT C.CUST_NBR, C.NAME
FROM CUSTOMER C
WHERE C.CUST_NBR IN (SELECT O.CUST_NBR
FROM CUST_ORDER O, EMPLOYEE E
WHERE O.SALES_EMP_ID = E.EMP_ID AND E.LNAME = 'MARTIN');

View created.

DESC V_TEST_CUST
Name Null? Type
------------------------------- -------- ----
Customer_ID NUMBER
Customer_Name VARCHAR2(45)

If we want to use ORDER BY in a query involving set operations, we must place the ORDER BY at the end of the entire statement. The ORDER BY clause can appear only once at the end of the compound query. The component queries can't have individual ORDER BY clauses. For example:

SELECT CUST_NBR, NAME
FROM CUSTOMER WHERE REGION_ID = 5
UNION
SELECT EMP_ID, LNAME
FROM EMPLOYEE WHERE LNAME = 'MARTIN'
ORDER BY CUST_NBR;

CUST_NBR NAME
---------- ---------------------
1 Cooper Industries
2 Emblazon Corp.
3 Ditech Corp.
4 Flowtech Inc.
5 Gentech Industries
7654 MARTIN 6 rows selected.

Note that the column name used in the ORDER BY clause of this query is taken from the first SELECT. We couldn't order these results by EMP_ID. If we attempt to ORDER BY EMP_ID, we will get an error, as in the following example:

SELECT CUST_NBR, NAME
FROM CUSTOMER WHERE REGION_ID = 5
UNION
SELECT EMP_ID, LNAME
FROM EMPLOYEE WHERE LNAME = 'MARTIN'ORDER BY EMP_ID;
ORDER BY EMP_ID
*
ERROR at line 8:
ORA-00904: invalid column name

The ORDER BY clause doesn't recognize the column names of the second SELECT. To avoid confusion over column names, it is a common practice to ORDER BY column positions:

SELECT CUST_NBR, NAME
FROM CUSTOMER WHERE REGION_ID = 5
UNION
SELECT EMP_ID, LNAME
FROM EMPLOYEE WHERE LNAME = 'MARTIN'
ORDER BY 1;

CUST_NBR NAME
---------- ---------------------
1 Cooper Industries
2 Emblazon Corp.
3 Ditech Corp.
4 Flowtech Inc.
5 Gentech Industries
7654 MARTIN 6 rows selected.

TIP: Unlike ORDER BY, we can use GROUP BY and HAVING clauses in component queries.

Component queries are executed from top to bottom. If we want to alter the sequence of execution, use parentheses appropriately. For example:

SELECT * FROM SUPPLIER_GOOD UNION
SELECT * FROM SUPPLIER_TEST MINUS
SELECT * FROM SUPPLIER; SUPPLIER_ID NAME
----------- --------------------------
4 Toshiba

Oracle performs the UNION between SUPPLIER_GOOD and SUPPLIER_TEST first, and then performs the MINUS between the result of the UNION and the SUPPLIER table. If we want the MINUS between SUPPLIER_TEST and SUPPLIER to be performed first, and then the UNION between SUPPLIER_GOOD and the result of MINUS, we must use parentheses to indicate so:

SELECT * FROM SUPPLIER_GOOD UNION
(SELECT * FROM SUPPLIER_TEST MINUS
SELECT * FROM SUPPLIER); SUPPLIER_ID NAME
----------- -------------------------
1 Sony
2 Samsung
3 Panasonic
4 Toshiba

The parentheses in this query forces the MINUS to be performed before the UNION. Notice the difference in the result as compared to the previous example.

The following list summarizes some simple rules, restrictions, and notes that don't require examples:

Set operations are not permitted on columns of type BLOB, CLOB, BFILE, and VARRAY, nor are set operations permitted on nested table columns.

Since UNION, INTERSECT, and MINUS operators involve sort operations, they are not allowed on LONG columns. However, UNION ALL is allowed on LONG columns.

Set operations are not allowed on SELECT statements containing TABLE collection expressions.

SELECT statements involved in set operations can't use the FOR UPDATE clause.

The number and size of columns in the SELECT list of component queries are limited by the block size of the database. The total bytes of the columns SELECTed can't exceed one database block.

تم تعديل بواسطة Admin05
رابط هذا التعليق
شارك

  • بعد 11 سنة...

انضم إلى المناقشة

يمكنك المشاركة الآن والتسجيل لاحقاً. إذا كان لديك حساب, سجل دخولك الآن لتقوم بالمشاركة من خلال حسابك.

زائر
أضف رد على هذا الموضوع...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   تمت استعادة المحتوى السابق الخاص بك.   مسح المحرر

×   You cannot paste images directly. Upload or insert images from URL.

جاري التحميل
×
×
  • أضف...

برجاء الإنتباه

بإستخدامك للموقع فأنت تتعهد بالموافقة على هذه البنود: سياسة الخصوصية