The PostgreSQL UNION
clause is used to combine the results of two or more SELECT
statements without returning any duplicate rows.
The basic rules to combine two or more queries using the UNION
join are as follows:
The number and order of columns of all queries must be the same
The data types of the columns on involving table in each query must be same or compatible
Usually, the returned column names are taken from the first query
By default, the UNION
join behaves like DISTINCT
, that is, eliminates the duplicate rows; however, using the ALL
keyword with the UNION
join returns all rows, including the duplicates, as shown in the following example:
SELECT <column_list> FROM table WHERE condition GROUP BY <column_list> [HAVING ] condition UNION SELECT <column_list> FROM table WHERE condition GROUP BY <column_list> [HAVING ] condition ORDER BY column list;
The queries are all executed independently, but their output is merged. The Union operator may place rows in the first query, before, after, or in between the rows in the result set of the second query. To sort the records in a combined result set, you can use ORDER BY
.
Let's consider an example where you combine the data of customers belonging to two different sites. The table structure of both the tables is the same, but they have data of the customers from two different sites:
SELECT customer_id,customer_name,location_id FROM customer_site1 UNION SELECT customer_id,customer_name,location_id FROM customer_site2 ORDER BY customer_name asc;
Both the SELECT
queries would run individually, combine the result set, remove the duplicates (as we are using UNION
), and sort the result set according to the condition, which is customer_name
in this case.