# Advanced SQL

SQL is composed of:

- Data Manipulation Language (DML)
- Data Definition Language (DDL)
- Data Control Language (DCL)

It also includes:

- View definition
- Integrity & Referential Constraints
- Transactions

> **Note**: SQL is based on **bags** (duplicates allowed) and **sets** (no duplicates).

## Aggregations

Functions that return a single value from a bag of tuples:

- `AVG()`: return the average value
- `MIN()`: return the minimum value
- `MAX()`: return the maximum value
- `SUM()`: return the sum of values
- `COUNT()`: count the number of values

> **Note**: aggregate functions can appear only in the `SELECT` output list (and in the `HAVING` and `ORDER BY` clauses), not directly in `WHERE`.

> **Note**: `COUNT`, `SUM`, `AVG` support `DISTINCT`.

### Group By

Project tuples into subsets and calculate aggregates against each subset.

> **Note**: Non-aggregated values in the `SELECT` output list must appear in the `GROUP BY` clause.

```sql
-- table_name and column_name are placeholders
SELECT table_name.column_name, COUNT(*)
FROM table_name
GROUP BY table_name.column_name;
```

### Having

Filters results based on an aggregate computation. It acts like a `WHERE` clause for a `GROUP BY`.

```sql
SELECT COUNT(1) AS cnt
FROM table_name
GROUP BY table_name.column_name
HAVING COUNT(1) > 0; -- standard SQL does not allow the output alias here
```

## String Operations

> **Note**: in the SQL standard, strings are case sensitive and delimited with single quotes (`'`). Some DBMSs deviate from this.

`LIKE` is used for string matching: `%` matches any substring (including the empty string) while `_` matches any single character.

```sql
SELECT * FROM table_name
WHERE table_name.column_name LIKE '%@c_';
```

Other common string functions/operators:

- `UPPER()`: convert string to uppercase
- `LOWER()`: convert string to lowercase
- `||`: concatenate two strings

```sql
SELECT * FROM table_name
WHERE table_name.column_name = LOWER(table_name.column_name || '-suffix');
```

## DateTime Operations

DateTime functions (names and availability vary across DBMSs):

- `NOW()`: get current timestamp
- `DATE(string)`: convert string to date
- `UNIX_TIMESTAMP()`: convert to unix epoch

## Output Redirection

Store query results in another table, provided that the number and types of the columns match.

```sql
SELECT * INTO new_table FROM table_name; -- write result into new table (must not exist)
INSERT INTO existing_table (SELECT * FROM table_name); -- write result into existing table
```

## Output Control

`ORDER BY` sorts the results based on one or more attributes.

```sql
SELECT * FROM table_name
ORDER BY table_name.column_name ASC; -- default direction

SELECT * FROM table_name
ORDER BY table_name.column_name DESC;

SELECT * FROM table_name
ORDER BY table_name.column_name ASC, table_name.other_column DESC;
```

`LIMIT` limits the number of tuples returned in the output. An `OFFSET` can be set to return a range.

```sql
SELECT * FROM table_name
LIMIT 10;

SELECT * FROM table_name
LIMIT 10 OFFSET 5;
```

## Nested Queries

Inner queries can appear (almost) anywhere in the query. They are often difficult to optimize.

> **Note**: inner queries can reference attributes and tables of the outer query but not vice-versa.

```sql
SELECT * FROM table_name
WHERE table_name.column_name IN (SELECT ...);

SELECT (SELECT other_table.column_name FROM other_table WHERE ...)
FROM table_name
WHERE condition;
```

Nested query operators:

- `ALL()`: must satisfy expression for all rows in sub-query.
- `ANY()`: must satisfy expression for at least one row in sub-query.
- `IN`: equivalent to `=ANY()`.
- `EXISTS`: at least one row is returned.

## Window Functions

Perform a calculation across a set of tuples related to a single row. Like an aggregation, but the tuples are not grouped into a single output tuple.

The `OVER` keyword specifies how to group together tuples when computing the window function.

```sql
SELECT ..., FUNC_NAME() OVER (
    PARTITION BY column_name, ...
    ORDER BY column_name, ...
)
FROM table_name;
```

Special window functions:

- `ROW_NUMBER()`: number of the current row
- `RANK()`: order position of the current row

## Common Table Expressions (CTE)

Provides a way to write auxiliary statements for a larger query. Alternative to nested queries and views.

```sql
WITH cte_name (column_name, ...) AS (SELECT ...) -- temporary table from query result
SELECT column_name, ... FROM cte_name;
```

> **Note**: CTEs can be recursive with the `RECURSIVE` keyword.
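The CTE forms above can be tried end to end with a minimal sketch like the following. It assumes Python's bundled `sqlite3` module as the DBMS, and the `enrolled` table, its columns, and the CTE names `course_stats` and `counter` are hypothetical, chosen only for illustration. Other systems may differ in details.

```python
import sqlite3

# In-memory database with a hypothetical `enrolled` table (illustrative names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrolled (sid INTEGER, cid TEXT, grade REAL)")
conn.executemany(
    "INSERT INTO enrolled VALUES (?, ?, ?)",
    [(1, "db", 4.0), (2, "db", 3.0), (3, "os", 2.0)],
)

# Non-recursive CTE: name an aggregate result, then query it like a table.
course_avg = conn.execute(
    """
    WITH course_stats (cid, avg_grade) AS (
        SELECT cid, AVG(grade) FROM enrolled GROUP BY cid
    )
    SELECT cid, avg_grade FROM course_stats ORDER BY cid
    """
).fetchall()

# Recursive CTE: the base case yields 1; the recursive step adds 1 while n < 5.
counter = conn.execute(
    """
    WITH RECURSIVE counter (n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM counter WHERE n < 5
    )
    SELECT n FROM counter ORDER BY n
    """
).fetchall()

print(course_avg)  # [('db', 3.5), ('os', 2.0)]
print(counter)     # [(1,), (2,), (3,), (4,), (5,)]
conn.close()
```

The recursive form needs a terminating condition (`WHERE n < 5` here); without one the query would keep producing rows.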