In the previous section, we saw subqueries that only returned a single result because an aggregate function was used in the subquery. Subqueries can also return zero or more rows.
Subqueries that return multiple rows can be used with the IN, ANY, ALL, or SOME operators. We can also negate the condition, for example with NOT IN.
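As a minimal sketch of how these operators treat a multi-row subquery, the query below evaluates each of them against the same set of rows. The employee names, salaries, and the dept30 helper are invented for illustration and are defined inline, so nothing is created in the database.

```sql
-- Hypothetical sample data, defined inline so nothing is created in the database.
WITH employee(last_name, salary, department_id) AS (
    VALUES ('Adams', 4000, 10),
           ('Baker', 6000, 10),
           ('Clark', 3000, 30),
           ('Davis', 5000, 30)
),
dept30 AS (
    -- Source of the multi-row subquery: the salaries 3000 and 5000.
    SELECT salary FROM employee WHERE department_id = 30
)
SELECT last_name,
       salary,
       salary IN     (SELECT salary FROM dept30) AS in_dept30,     -- equals at least one row
       salary > ANY  (SELECT salary FROM dept30) AS above_any,     -- greater than at least one row
       salary > ALL  (SELECT salary FROM dept30) AS above_all,     -- greater than every row
       salary NOT IN (SELECT salary FROM dept30) AS not_in_dept30  -- equals no row
FROM employee
ORDER BY last_name;
```

With this data, only Baker (6000) satisfies > ALL, and Clark (3000) is the only row that fails > ANY. SOME can be written in place of ANY with the same meaning.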
A subquery that references one or more columns from its containing SQL statement is called a correlated subquery. Unlike non-correlated subqueries that are executed exactly once prior to the execution of a containing statement, a correlated subquery is executed once for each candidate row in the intermediate result set of the containing query.
The following statement illustrates the syntax of a correlated subquery:
SELECT column1, column2, ... FROM table1 o WHERE column1 operator (SELECT column1 FROM table2 WHERE column2 = o.column4);
PostgreSQL passes the value of column4 from the outer table (table1, aliased as o) to the inner query, where it is compared to column2 of table2. The inner query then fetches column1 from table2, and that value is compared to column1 of the outer table using the given operator. If the expression evaluates to true, the row is returned; otherwise, it does not appear in the output.
Correlated subqueries can cause performance problems, because the subquery is executed for every row of the outer query. How much this matters depends on the data involved. One way to keep such a query efficient is to compute the inner result once, for example in a temporary table, and compare against that.
Let's try to find all the employees who earn more than the average salary in their department:
SELECT last_name, salary, department_id FROM employee e WHERE salary > (SELECT AVG(salary) FROM employee WHERE department_id = e.department_id);
For each row of the employee table, the value of department_id is passed into the inner query. Suppose the department_id of the first row is 30: the inner query then computes the average salary for department_id = 30. If the salary of that row is greater than this average, the expression evaluates to true and the row appears in the output.
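To make the walkthrough concrete, here is a small sketch that runs the correlated query on invented data and then tries the temporary-table idea mentioned earlier. The rows, the dept_avg table name, and the alias e are assumptions for illustration. Everything runs inside a transaction that is rolled back, so no tables are left behind.

```sql
BEGIN;

-- Hypothetical sample data in a temporary table.
CREATE TEMP TABLE employee (
    last_name     text,
    salary        numeric,
    department_id integer
);

INSERT INTO employee VALUES
    ('Adams', 4000, 10), ('Baker', 6000, 10),
    ('Clark', 3000, 30), ('Davis', 5000, 30), ('Evans', 4000, 30);

-- Correlated form: the department average is looked up for each outer row.
SELECT e.last_name, e.salary, e.department_id
FROM employee e
WHERE e.salary > (SELECT AVG(salary) FROM employee WHERE department_id = e.department_id)
ORDER BY e.last_name;

-- Precomputed form: each department average is calculated once, then joined.
CREATE TEMP TABLE dept_avg AS
SELECT department_id, AVG(salary) AS avg_salary
FROM employee
GROUP BY department_id;

SELECT e.last_name, e.salary, e.department_id
FROM employee e
JOIN dept_avg d ON d.department_id = e.department_id
WHERE e.salary > d.avg_salary
ORDER BY e.last_name;

ROLLBACK;  -- undoes everything, including the temporary tables
```

The averages are 5000 for department 10 and 4000 for department 30, so both queries return Baker and Davis. Evans earns exactly the department average and is excluded by the strict comparison. Whether the precomputed form is actually faster depends on the data and the plan PostgreSQL chooses; comparing the two with EXPLAIN ANALYZE is a reasonable way to check.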
The EXISTS condition is used in combination with a subquery, and is considered to be met if the subquery returns at least one row. It can be used in a SELECT, INSERT, UPDATE, or DELETE statement. If a subquery returns any rows at all, the EXISTS subquery is true, and the NOT EXISTS subquery is false.
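As a sketch of EXISTS outside a SELECT, the statement below uses it in a DELETE. The products and inventory tables, their columns, and the rows are hypothetical. The whole example runs in a transaction that is rolled back at the end.

```sql
BEGIN;

-- Hypothetical tables and rows, created only for this transaction.
CREATE TEMP TABLE products  (product_id integer, product_name text);
CREATE TEMP TABLE inventory (product_id integer, quantity integer);

INSERT INTO products  VALUES (1, 'widget'), (2, 'gadget'), (3, 'gizmo');
INSERT INTO inventory VALUES (1, 10), (1, 5), (3, 0);

-- Delete every product that has at least one inventory row with quantity 0.
DELETE FROM products
WHERE EXISTS (SELECT 1
              FROM inventory
              WHERE inventory.product_id = products.product_id
                AND inventory.quantity = 0);

-- Product 3 is gone; products 1 and 2 remain.
SELECT product_id, product_name FROM products ORDER BY product_id;

ROLLBACK;  -- undoes the delete and removes the temporary tables
```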
The syntax for the PostgreSQL EXISTS condition is as follows:
WHERE EXISTS ( subquery );
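Here, subquery is a SELECT statement that often starts with SELECT * rather than a list of expressions or column names. Because only the existence of returned rows matters and the column result of the subquery is not relevant, a common convention is to write SELECT 1 instead of SELECT *. This mainly makes the intent explicit; in PostgreSQL, any performance difference between the two is typically negligible.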
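The point that the subquery's select list does not affect the result can be checked with a small sketch. The products and inventory data below is invented and defined inline, so nothing is created in the database.

```sql
-- Hypothetical sample data, defined inline.
WITH products(product_id, product_name) AS (
    VALUES (1, 'widget'), (2, 'gadget'), (3, 'gizmo')
),
inventory(product_id, quantity) AS (
    VALUES (1, 10), (1, 5), (3, 0)
)
SELECT p.product_id,
       -- Same condition written with two different select lists.
       EXISTS (SELECT * FROM inventory i WHERE i.product_id = p.product_id) AS exists_star,
       EXISTS (SELECT 1 FROM inventory i WHERE i.product_id = p.product_id) AS exists_one
FROM products p
ORDER BY p.product_id;
```

Both columns hold the same value on every row: true for products 1 and 3, false for product 2.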
SQL statements that use the EXISTS condition can be inefficient, because the subquery is conceptually re-run for every row in the outer query's table. Many such queries can be written in other ways, such as with joins, that do not use the EXISTS condition.
Let's look at the following example, a SELECT statement that uses the PostgreSQL EXISTS condition:
SELECT * FROM products WHERE EXISTS (SELECT 1 FROM inventory WHERE products.product_id = inventory.product_id);
This EXISTS condition example returns all records from the products table where there is at least one record in the inventory table with the matching product_id. We used SELECT 1 in the subquery because the column result set is not relevant to the EXISTS condition (only the existence of a returned row matters).
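For comparison, the same result can be obtained with a join instead of EXISTS. This is only a sketch on invented inline data; the aliases p and i and the product_name column are assumptions for illustration.

```sql
-- Hypothetical sample data, defined inline.
WITH products(product_id, product_name) AS (
    VALUES (1, 'widget'), (2, 'gadget'), (3, 'gizmo')
),
inventory(product_id, quantity) AS (
    VALUES (1, 10), (1, 5), (3, 0)
)
-- DISTINCT is needed because product 1 matches two inventory rows,
-- and a plain join would return it twice.
SELECT DISTINCT p.product_id, p.product_name
FROM products p
JOIN inventory i ON i.product_id = p.product_id
ORDER BY p.product_id;
```

This returns products 1 and 3, the same rows the EXISTS version selects from this data. The join is not guaranteed to be faster: PostgreSQL's planner can often execute an EXISTS subquery as a semi-join on its own, so it is worth comparing the plans with EXPLAIN before rewriting a query.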
The EXISTS condition can also be combined with the NOT operator, for example:
SELECT * FROM products WHERE NOT EXISTS (SELECT 1 FROM inventory WHERE products.product_id = inventory.product_id);
This NOT EXISTS example returns all records from the products table where there are no records in the inventory table for the given product_id.
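A join-based way to express the same "no matching row" condition is a LEFT JOIN that keeps only the unmatched rows. The sketch below uses invented inline data; the aliases and the product_name column are assumptions for illustration.

```sql
-- Hypothetical sample data, defined inline.
WITH products(product_id, product_name) AS (
    VALUES (1, 'widget'), (2, 'gadget'), (3, 'gizmo')
),
inventory(product_id, quantity) AS (
    VALUES (1, 10), (1, 5), (3, 0)
)
-- Unmatched products get NULLs in the inventory columns,
-- so filtering on a NULL join key keeps exactly those products.
SELECT p.product_id, p.product_name
FROM products p
LEFT JOIN inventory i ON i.product_id = p.product_id
WHERE i.product_id IS NULL;
```

With this data, only product 2 is returned, matching what the NOT EXISTS query selects.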