# Advanced SQL
SQL is composed of:
- Data Manipulation Language (DML)
- Data Definition Language (DDL)
- Data Control Language (DCL)
It also includes:
- View definition
- Integrity & Referential Constraints
- Transactions
> **Note**: SQL is based on **bags** (duplicates allowed) and **sets** (no duplicates).
## Aggregations
Functions that return a single value from a bag of tuples:
- `AVG()`: return the average value
- `MIN()`: return the minimum value
- `MAX()`: return the maximum value
- `SUM()`: return the sum of values
- `COUNT()`: count the number of values
> **Note**: aggregate functions can only appear in the `SELECT` output list (and in `HAVING` or `ORDER BY`), never in `WHERE`.
> **Note**: `COUNT`, `SUM`, `AVG` support `DISTINCT`.
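
As a minimal sketch of the functions above, the query below runs them over a hypothetical `student(sid, name, gpa)` table. The table, its columns, and its rows are assumptions made up for illustration; the inline `WITH ... VALUES` definition is only there to keep the snippet self-contained, and PostgreSQL/SQLite-style syntax is assumed.

```sql
-- Hypothetical data: three students, two of them with the same GPA.
WITH student(sid, name, gpa) AS (
  VALUES (1, 'ann', 3.5), (2, 'bob', 3.0), (3, 'cy', 3.5)
)
SELECT COUNT(*)            AS num_students,  -- 3 rows
       COUNT(DISTINCT gpa) AS distinct_gpas, -- 2 distinct values (3.0 and 3.5)
       MIN(gpa)            AS min_gpa,       -- 3.0
       MAX(gpa)            AS max_gpa,       -- 3.5
       SUM(gpa)            AS total_gpa,     -- 10.0
       AVG(gpa)            AS avg_gpa        -- about 3.33
FROM student;
```

The whole bag of three tuples collapses into a single output row.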
### Group By
Project tuples into subsets and calculate aggregates against each subset.
> **Note**: Non-aggregated values in the `SELECT` output list must appear in the `GROUP BY` clause.
```sql
SELECT table_name.column_name, COUNT(*) FROM table_name
GROUP BY table_name.column_name;
```
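
A concrete sketch of grouping, assuming a hypothetical `enrolled(sid, cid, grade)` table defined inline (names and rows are invented for illustration; PostgreSQL/SQLite-style syntax assumed):

```sql
-- Hypothetical data: two students in 'db', one in 'os'.
WITH enrolled(sid, cid, grade) AS (
  VALUES (1, 'db', 4.0), (2, 'db', 3.0), (1, 'os', 2.0)
)
SELECT cid,                         -- non-aggregated, so it must be in GROUP BY
       COUNT(*)   AS num_students,  -- computed once per course
       AVG(grade) AS avg_grade
FROM enrolled
GROUP BY cid
ORDER BY cid;
-- Expected groups: 'db' has 2 rows (average 3.5), 'os' has 1 row (average 2.0).
```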
### Having
Filters groups based on an aggregate computation, like a `WHERE` clause for a `GROUP BY`.
```sql
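SELECT table_name.column_name, COUNT(1) AS cnt FROM table_name
GROUP BY table_name.column_name
HAVING COUNT(1) > 0; -- repeat the aggregate: output aliases in HAVING are not portable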
```
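
The difference between `WHERE` and `HAVING` can be sketched on a hypothetical `enrolled(sid, cid, grade)` table (invented names and rows, defined inline; PostgreSQL/SQLite-style syntax assumed):

```sql
WITH enrolled(sid, cid, grade) AS (
  VALUES (1, 'db', 4.0), (2, 'db', 3.0), (1, 'os', 2.0)
)
SELECT cid, COUNT(*) AS num_students
FROM enrolled
WHERE grade >= 2.5       -- row filter, applied before grouping (drops the 'os' row)
GROUP BY cid
HAVING COUNT(*) > 1;     -- group filter, applied after aggregation
-- Only the 'db' group remains, with num_students = 2.
```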
## String Operations
> **Note**: strings are case sensitive and defined with single-quotes (`'`).
`LIKE` is used for string matching: `%` matches any substring (including empty strings) while `_` matches any single character.
```sql
SELECT * FROM table_name
WHERE table_name.column_name LIKE '%@c_';
```
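
To see what the pattern `'%@c_'` actually accepts, here is a small sketch over a hypothetical `student(name, login)` table (invented data, defined inline; PostgreSQL/SQLite-style syntax assumed):

```sql
WITH student(name, login) AS (
  VALUES ('ann', 'ann@cs'), ('bob', 'bob@ece'), ('cy', 'cy@cx')
)
SELECT name FROM student
WHERE login LIKE '%@c_';
-- Matches 'ann@cs' and 'cy@cx': anything, then '@c', then exactly one character.
-- 'bob@ece' is rejected because '@' is not followed by 'c'.
```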
Other common string functions/operators:
- `UPPER()`: convert string to uppercase
- `LOWER()`: convert string to lowercase
- `||` is used to concatenate two strings.
```sql
SELECT * FROM table_name
WHERE table_name.column_a = LOWER(table_name.column_b || '-suffix');
```
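
A short sketch of these functions on a hypothetical `student(name)` table (invented data, defined inline). It assumes a DBMS where `||` is string concatenation, such as PostgreSQL or SQLite; MySQL treats `||` as logical OR by default and uses `CONCAT()` instead.

```sql
WITH student(name) AS (
  VALUES ('Ann'), ('BOB')
)
SELECT name,
       UPPER(name)          AS upper_name, -- 'ANN', 'BOB'
       LOWER(name)          AS lower_name, -- 'ann', 'bob'
       LOWER(name) || '@cs' AS login       -- 'ann@cs', 'bob@cs'
FROM student;
```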
## DateTime Operations
DateTime functions (names and behavior vary widely between DBMSs):
- `NOW()`: get current timestamp
- `DATE('string')`: convert string to date
- `UNIX_TIMESTAMP()`: convert to unix epoch
## Output Redirection
Store query results in another table, provided that the number and types of the columns match.
```sql
SELECT * INTO new_table FROM table_name; -- write result into new table (must not exist)
INSERT INTO existing_table (SELECT * FROM table_name); -- write result into existing table
```
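
Both forms can be sketched end to end as follows. The `course`, `course_copy`, and `course_archive` tables are hypothetical, and PostgreSQL syntax is assumed: `SELECT ... INTO` is not available everywhere (other systems typically offer `CREATE TABLE ... AS SELECT` instead).

```sql
-- Hypothetical source table.
CREATE TABLE course (cid INT, name VARCHAR(32));
INSERT INTO course VALUES (1, 'databases'), (2, 'compilers');

-- New table: course_copy is created by the statement and must not exist yet.
SELECT * INTO course_copy FROM course;

-- Existing table: the target must already have compatible columns.
CREATE TABLE course_archive (cid INT, name VARCHAR(32));
INSERT INTO course_archive
SELECT * FROM course WHERE cid = 1;  -- parentheses around the SELECT are optional

SELECT COUNT(*) FROM course_copy;     -- 2
SELECT COUNT(*) FROM course_archive;  -- 1
```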
## Output Control
`ORDER BY` sorts the results by one or more attributes.
```sql
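SELECT * FROM table_name ORDER BY table_name.column_a ASC; -- default direction
SELECT * FROM table_name ORDER BY table_name.column_a DESC;
SELECT * FROM table_name ORDER BY table_name.column_a ASC, table_name.column_b DESC;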
```
`LIMIT` caps the number of tuples returned in the output. An `OFFSET` skips rows first, so the two together return a range.
```sql
SELECT * FROM table_name LIMIT 10;
SELECT * FROM table_name LIMIT 10 OFFSET 5;
```
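
Sorting and range selection are usually combined, since `LIMIT` without `ORDER BY` returns an arbitrary subset. A minimal sketch on a hypothetical `student(name, gpa)` table (invented data, defined inline; PostgreSQL/SQLite-style syntax assumed):

```sql
WITH student(name, gpa) AS (
  VALUES ('ann', 3.5), ('bob', 3.0), ('cy', 3.9), ('dee', 3.5)
)
SELECT name, gpa FROM student
ORDER BY gpa DESC, name ASC  -- ties on gpa are broken alphabetically
LIMIT 2 OFFSET 1;            -- skip the first row, return the next two
-- Sorted order is cy, ann, dee, bob, so the result is ann and dee.
```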
## Nested Queries
Inner queries can appear (almost) anywhere in the query. They are often difficult to optimize.
> **Note**: inner queries can reference attributes and tables of the outer query but not vice-versa.
```sql
SELECT * FROM table_name WHERE table_name.column_name IN (SELECT ...);
SELECT (SELECT other_table.column_name FROM other_table WHERE ...) FROM table_name WHERE condition;
```
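
The sketch below shows both positions at once: a subquery in the `WHERE` clause and a correlated subquery in the output list that references the outer row. The `student` and `enrolled` tables are hypothetical and defined inline; PostgreSQL/SQLite-style syntax is assumed.

```sql
WITH student(sid, name) AS (
  VALUES (1, 'ann'), (2, 'bob'), (3, 'cy')
),
enrolled(sid, cid) AS (
  VALUES (1, 'db'), (1, 'os'), (2, 'db')
)
SELECT s.name,
       -- Inner query references s.sid from the outer query (correlated).
       (SELECT COUNT(*) FROM enrolled AS e WHERE e.sid = s.sid) AS num_courses
FROM student AS s
WHERE s.sid IN (SELECT sid FROM enrolled);  -- keep only enrolled students
-- Result: ann with 2 courses, bob with 1; cy is filtered out.
```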
Nested query operators:
- `ALL()`: must satisfy expression for all rows in sub-query.
- `ANY()`: must satisfy expression for at least one row in sub-query.
- `IN`: equivalent to `=ANY()`.
- `EXISTS`: at least one row is returned.
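
The four operators can be compared side by side by evaluating each as a boolean column. This is a sketch on hypothetical `student` and `enrolled` tables (invented data, defined inline) and assumes PostgreSQL, since not every DBMS supports `ALL`/`ANY` or boolean expressions in the output list.

```sql
WITH student(sid, name, gpa) AS (
  VALUES (1, 'ann', 3.5), (2, 'bob', 3.0), (3, 'cy', 3.9)
),
enrolled(sid, cid) AS (
  VALUES (1, 'db'), (2, 'db')
)
SELECT s.name,
       s.gpa >= ALL (SELECT gpa FROM student)  AS has_top_gpa,   -- true only for cy
       s.sid = ANY (SELECT sid FROM enrolled)  AS enrolled_any,  -- true for ann, bob
       s.sid IN (SELECT sid FROM enrolled)     AS enrolled_in,   -- same as = ANY
       EXISTS (SELECT 1 FROM enrolled AS e
               WHERE e.sid = s.sid)            AS enrolled_exists -- same rows again
FROM student AS s;
```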
## Window Functions
Perform a calculation across a set of tuples related to a single row. Like an aggregation but tuples are not grouped into a single output tuple.
The `OVER` keyword specifies how to group together tuples when computing the window function.
```sql
SELECT ..., FUNC_NAME(...)
  OVER (
    PARTITION BY column_name, ...
    ORDER BY column_name, ...
  )
FROM table_name;
```
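
To make the contrast with `GROUP BY` concrete, the sketch below attaches a per-course average to every row instead of collapsing each course into one row. The `enrolled(sid, cid, grade)` table is hypothetical and defined inline; a DBMS with window function support is assumed (for example PostgreSQL or a recent SQLite).

```sql
WITH enrolled(sid, cid, grade) AS (
  VALUES (1, 'db', 4.0), (2, 'db', 3.0), (1, 'os', 2.0)
)
SELECT sid, cid, grade,
       AVG(grade) OVER (PARTITION BY cid) AS course_avg  -- one window per course
FROM enrolled;
-- All three input rows are kept: both 'db' rows carry course_avg 3.5,
-- the 'os' row carries 2.0.
```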
Special window functions:
- `ROW_NUMBER()`: number of the current row within its partition
- `RANK()`: order position of the current row; tied rows share a rank, so it needs an `ORDER BY` in the window
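
The two functions only differ when there are ties, which the following sketch tries to show. The `enrolled(sid, grade)` table is hypothetical and defined inline; a DBMS with window function support is assumed.

```sql
WITH enrolled(sid, grade) AS (
  VALUES (1, 4.0), (2, 4.0), (3, 3.0)
)
SELECT sid, grade,
       ROW_NUMBER() OVER (ORDER BY grade DESC, sid) AS row_num,    -- 1, 2, 3
       RANK()       OVER (ORDER BY grade DESC)      AS grade_rank  -- 1, 1, 3
FROM enrolled
ORDER BY row_num;
-- Students 1 and 2 tie on grade: they get distinct row numbers but the same rank,
-- and the next rank skips to 3.
```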
## Common Table Expressions (CTE)
Provides a way to write auxiliary statements for a larger query. Alternative to nested queries and views.
```sql
WITH cte_name (column_name, ...) AS (SELECT ...) -- temporary table from query result
SELECT column_name, ... FROM cte_name;
```
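
As a sketch of a CTE standing in for a nested query, the example below first names the maximum GPA and then uses it like a table. Both `student` and `top_gpa` are hypothetical names; `student` is itself defined inline to keep the snippet self-contained (PostgreSQL/SQLite-style syntax assumed).

```sql
WITH student(name, gpa) AS (
  VALUES ('ann', 3.5), ('bob', 3.0), ('cy', 3.9)
),
top_gpa(max_gpa) AS (
  SELECT MAX(gpa) FROM student  -- a later CTE can read an earlier one
)
SELECT name FROM student, top_gpa
WHERE gpa = max_gpa;            -- returns cy
```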
> **Note**: CTEs can be recursive with the `RECURSIVE` keyword.
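
A minimal recursive sketch that generates the numbers 1 to 5. The CTE name `counter` and its column `n` are made up for illustration.

```sql
WITH RECURSIVE counter(n) AS (
  SELECT 1                                -- base case: seed row
  UNION ALL
  SELECT n + 1 FROM counter WHERE n < 5   -- recursive step, stops once n reaches 5
)
SELECT n FROM counter;                    -- 1, 2, 3, 4, 5
```

Without the `WHERE n < 5` condition the recursion would never terminate.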