dev-notes/docs/databases/database-systems/databases.md
2024-12-20 00:27:25 +01:00

# Databases
A **database** is an organized collection of interrelated data that models some aspect of the real world. Databases are the core component of most computer applications.
A **DBMS** (database management system) is the software that allows applications to store and analyze information in a database.
A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases.
A **data model** is a collection of concepts for describing the data in a database. A **schema** is a description of a particular collection of data, using a given data model.
## Relational Model
The relational model was proposed by Ted Codd in 1970. It is an abstraction intended to reduce the maintenance burden of DBMS software:
- store database in simple data structures.
- access data through high level language.
- physical storage left to the implementation.
Concepts:
- **Structure**: the definition of relations and their contents.
- **Integrity**: ensures that the database's contents satisfy constraints.
- **Manipulation**: how to access and modify a database's content.
### Relations
A **relation** (aka **table**) is an _unordered_ set that contains the relationship of the _attributes_ (aka _fields_) that represent entities.
The **domain** of an attribute is the set of possible values that the attribute can take.
A **tuple** (aka **record**) is a set of attribute values in a relation. Values are (normally) atomic/scalar. The special value `NULL` is a member of every domain.
### Primary Keys (PK)
A relation's **primary key** uniquely identifies a single tuple. Some DBMSs automatically create an internal primary key if one is not provided.
### Foreign Keys (FK)
A **foreign key** specifies that an attribute from one relation has to map to a tuple in another relation.
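
As a minimal sketch of how primary and foreign keys behave in practice, the snippet below uses Python's built-in `sqlite3` module. The `artist` and `album` tables and all their values are hypothetical, chosen only for illustration. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other DBMSs may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: each album must reference an existing artist.
conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE album ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " artist_id INTEGER NOT NULL REFERENCES artist(id))"
)
conn.execute("INSERT INTO artist (id, name) VALUES (1, 'Example Artist')")
conn.execute("INSERT INTO album (id, title, artist_id) VALUES (10, 'Example Album', 1)")


def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Reusing primary key 1 violates uniqueness.
duplicate_pk_ok = try_insert("INSERT INTO artist (id, name) VALUES (1, 'Another Artist')")
# artist_id 99 does not map to any tuple in artist.
dangling_fk_ok = try_insert("INSERT INTO album (id, title, artist_id) VALUES (11, 'Orphan', 99)")

print(duplicate_pk_ok, dangling_fk_ok)
```

Both inserts are rejected: the first because the primary key must identify a single tuple, the second because the foreign key has no tuple to map to.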
## Data Manipulation Language (DML)
The **data manipulation language** describes how to _store_ and _retrieve_ information from a database.
Kinds of DMLs:
- **Procedural**: the query specifies the (high-level) strategy that the DBMS should use to find the desired result. (Relational Algebra)
- **Non-Procedural**: the query specifies only what data is wanted and not how to find it. (Relational Calculus)
## Relational Algebra
Relational algebra is a set of fundamental operations for retrieving and manipulating tuples in a relation. Each operator takes one or more relations as input and outputs a new relation, so operators can be chained together to build more complex operations.
Relational algebra describes the steps needed to obtain a particular result. The order of the steps does influence the performance of the complete operation.
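
To illustrate why the order of steps matters, here is a small sketch in plain Python that models relations as lists of dictionaries. The relations `R` and `S`, their attributes, and the helper functions `select` and `natural_join` are all hypothetical. Both plans return the same result, but filtering first keeps the intermediate input to the join much smaller.

```python
# Hypothetical relations: R(a_id, b_id) with 9 tuples, S(b_id, val) with 3 tuples.
R = [{"a_id": i, "b_id": i % 3} for i in range(9)]
S = [{"b_id": j, "val": j * 10} for j in range(3)]


def select(rel, pred):
    """Selection: keep only the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]


def natural_join(left, right):
    """Naive nested-loop natural join on all attributes the tuples share."""
    out = []
    for lt in left:
        for rt in right:
            common = lt.keys() & rt.keys()
            if all(lt[k] == rt[k] for k in common):
                out.append({**lt, **rt})
    return out


# Plan 1: join everything, then filter.
joined = natural_join(R, S)
plan1 = select(joined, lambda t: t["b_id"] == 2)

# Plan 2: filter S first, then join.
filtered = select(S, lambda t: t["b_id"] == 2)
plan2 = natural_join(R, filtered)

print(len(joined), len(filtered), plan1 == plan2)
```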
Fundamental Operators:
- Selection (`σ`)
- Projection (`π`)
- Union (`U`)
- Intersection (`∩`)
- Difference (`-`)
- Product (`x`)
- Natural Join (`|X|`)
Extra Operators:
- Rename (`ρ`)
- Assignment (`R ← S`)
- Duplicate Elimination
- Aggregation (`γ`)
- Sorting (`τ`)
- Division (`÷`)
> **Note**: relational algebra operates on sets. A set is an unordered collection of unique values.
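
Several of the extra operators have common SQL counterparts. The sketch below, using Python's built-in `sqlite3` module, shows one plausible mapping for duplicate elimination (`DISTINCT`), rename (`AS`), aggregation (`GROUP BY`), and sorting (`ORDER BY`). The relation `R(a_id, b_id)` and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a_id TEXT, b_id INTEGER)")
conn.executemany(
    "INSERT INTO R VALUES (?, ?)",
    [("a1", 101), ("a2", 102), ("a2", 102), ("a3", 104)],  # note the duplicate row
)

# Duplicate elimination: DISTINCT collapses the repeated ('a2', 102) row.
distinct_rows = conn.execute(
    "SELECT DISTINCT a_id, b_id FROM R ORDER BY a_id"
).fetchall()

# Rename: AS gives an output attribute a new name.
renamed_columns = [
    col[0] for col in conn.execute("SELECT a_id AS label FROM R").description
]

# Aggregation: GROUP BY combined with an aggregate function such as COUNT.
counts = conn.execute(
    "SELECT a_id, COUNT(*) FROM R GROUP BY a_id ORDER BY a_id"
).fetchall()

# Sorting: ORDER BY imposes an order on the output tuples.
sorted_rows = conn.execute(
    "SELECT a_id, b_id FROM R ORDER BY b_id DESC, a_id"
).fetchall()

print(distinct_rows, renamed_columns, counts, sorted_rows, sep="\n")
```

SQL tables are bags rather than sets by default, which is why the duplicate row can be stored at all and why `DISTINCT` is needed to recover set semantics.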
### Selection (`σ`)
Choose a subset of tuples from a relation that satisfies a selection predicate. Predicates act as filters that retain only the tuples fulfilling their qualifying requirements. Multiple predicates can be combined using conjunctions and disjunctions.
Syntax: `σ_predicate(R)`
```sql
SELECT * FROM R WHERE <predicate>;
```
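A concrete instance of the template above, sketched with Python's built-in `sqlite3` module. The relation `R(a_id, b_id)` and its rows are hypothetical; the three queries show a single predicate, a conjunction, and a disjunction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a_id TEXT, b_id INTEGER)")
conn.executemany(
    "INSERT INTO R VALUES (?, ?)",
    [("a1", 101), ("a2", 102), ("a2", 103), ("a3", 104)],
)

# Single predicate: a_id = 'a2'
single = conn.execute(
    "SELECT * FROM R WHERE a_id = 'a2' ORDER BY b_id"
).fetchall()

# Conjunction: a_id = 'a2' AND b_id > 102
conjunction = conn.execute(
    "SELECT * FROM R WHERE a_id = 'a2' AND b_id > 102"
).fetchall()

# Disjunction: a_id = 'a1' OR b_id > 103
disjunction = conn.execute(
    "SELECT * FROM R WHERE a_id = 'a1' OR b_id > 103 ORDER BY b_id"
).fetchall()

print(single, conjunction, disjunction, sep="\n")
```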
### Projection (`π`)
Generate a relation with tuples that contain only the specified attributes. Projection can also rearrange the attribute ordering and manipulate values.
Syntax: `π_A1,A2,...,An(R)`
```sql
SELECT <attributes> FROM R;
```
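The following sketch, using Python's built-in `sqlite3` module, shows projection reordering attributes and computing a derived value. The relation `R(a_id, b_id)`, its rows, and the alias `b_offset` are hypothetical. It also shows that a plain SQL `SELECT` keeps duplicates, so `DISTINCT` is needed to match the set semantics of relational algebra.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a_id TEXT, b_id INTEGER)")
conn.executemany(
    "INSERT INTO R VALUES (?, ?)",
    [("a1", 101), ("a2", 102), ("a2", 103), ("a3", 104)],
)

# Reorder attributes (b_id first) and manipulate a value (b_id - 100).
cursor = conn.execute(
    "SELECT b_id - 100 AS b_offset, a_id FROM R WHERE a_id = 'a2' ORDER BY b_offset"
)
columns = [col[0] for col in cursor.description]
projected = cursor.fetchall()

# Projecting onto a_id alone: SQL keeps the duplicate 'a2' unless DISTINCT is used.
bag = conn.execute("SELECT a_id FROM R").fetchall()
as_set = conn.execute("SELECT DISTINCT a_id FROM R ORDER BY a_id").fetchall()

print(columns, projected, len(bag), as_set, sep="\n")
```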
### Union (`U`)
Generate a relation that contains all tuples that appear in at least one of the input relations.
Syntax: `(R U S)`
```sql
(SELECT * FROM R) UNION (SELECT * FROM S);  -- use UNION ALL to keep duplicates
```
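SQL offers two union variants: `UNION` removes duplicates (set semantics), while `UNION ALL` keeps them (bag semantics). The sketch below contrasts the two using Python's built-in `sqlite3` module. The relations `R` and `S` and their rows are hypothetical. The operands are written without parentheses because SQLite's grammar does not accept them around the parts of a compound `SELECT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a_id TEXT, b_id INTEGER)")
conn.execute("CREATE TABLE S (a_id TEXT, b_id INTEGER)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [("a1", 101), ("a2", 102), ("a3", 103)])
conn.executemany("INSERT INTO S VALUES (?, ?)", [("a3", 103), ("a4", 104), ("a5", 105)])

# Set union: ('a3', 103) appears in both inputs but only once in the output.
union_rows = conn.execute(
    "SELECT a_id, b_id FROM R UNION SELECT a_id, b_id FROM S ORDER BY a_id"
).fetchall()

# Bag union: the shared tuple is kept twice.
union_all_rows = conn.execute(
    "SELECT a_id, b_id FROM R UNION ALL SELECT a_id, b_id FROM S ORDER BY a_id"
).fetchall()

print(union_rows)
print(union_all_rows)
```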
### Intersection (`∩`)
Generate a relation that contains only the tuples that appear in both of the input relations.
Syntax: `(R ∩ S)`
```sql
(SELECT * FROM R) INTERSECT (SELECT * FROM S);
```
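A small sketch of intersection using Python's built-in `sqlite3` module, with hypothetical relations `R` and `S`. It also checks that the operand order does not change the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a_id TEXT, b_id INTEGER)")
conn.execute("CREATE TABLE S (a_id TEXT, b_id INTEGER)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [("a1", 101), ("a2", 102), ("a3", 103)])
conn.executemany("INSERT INTO S VALUES (?, ?)", [("a3", 103), ("a4", 104), ("a5", 105)])

# Only ('a3', 103) appears in both relations.
r_and_s = conn.execute(
    "SELECT a_id, b_id FROM R INTERSECT SELECT a_id, b_id FROM S"
).fetchall()
s_and_r = conn.execute(
    "SELECT a_id, b_id FROM S INTERSECT SELECT a_id, b_id FROM R"
).fetchall()

print(r_and_s, s_and_r)
```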
### Difference (`-`)
Generate a relation that contains only the tuples that appear in the first and not the second of the input relations.
Syntax: `(R - S)`
```sql
(SELECT * FROM R) EXCEPT (SELECT * FROM S);
```
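Unlike union and intersection, difference depends on operand order. The sketch below shows this with Python's built-in `sqlite3` module and hypothetical relations `R` and `S`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a_id TEXT, b_id INTEGER)")
conn.execute("CREATE TABLE S (a_id TEXT, b_id INTEGER)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [("a1", 101), ("a2", 102), ("a3", 103)])
conn.executemany("INSERT INTO S VALUES (?, ?)", [("a3", 103), ("a4", 104), ("a5", 105)])

# R - S: tuples of R that do not appear in S.
r_minus_s = conn.execute(
    "SELECT a_id, b_id FROM R EXCEPT SELECT a_id, b_id FROM S ORDER BY a_id"
).fetchall()

# S - R: swapping the operands gives a different relation.
s_minus_r = conn.execute(
    "SELECT a_id, b_id FROM S EXCEPT SELECT a_id, b_id FROM R ORDER BY a_id"
).fetchall()

print(r_minus_s)
print(s_minus_r)
```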
### Product (`x`)
Generate a relation that contains all possible combinations of tuples from the input relations.
Syntax: `(R x S)`
```sql
SELECT * FROM R CROSS JOIN S;
SELECT * FROM R, S;
```
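The two spellings above are equivalent. As a quick check, the sketch below runs both with Python's built-in `sqlite3` module on hypothetical relations `R` (2 tuples) and `S` (3 tuples), which should yield 2 × 3 = 6 combined tuples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a_id TEXT, b_id INTEGER)")
conn.execute("CREATE TABLE S (a_id TEXT, b_id INTEGER)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [("a1", 101), ("a2", 102)])
conn.executemany("INSERT INTO S VALUES (?, ?)", [("a3", 103), ("a4", 104), ("a5", 105)])

# Explicit CROSS JOIN and the comma form produce the same combinations.
cross_rows = conn.execute("SELECT * FROM R CROSS JOIN S").fetchall()
comma_rows = conn.execute("SELECT * FROM R, S").fetchall()

# Each output tuple is an R tuple concatenated with an S tuple.
print(len(cross_rows), sorted(cross_rows) == sorted(comma_rows))
```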
### Natural Join (`|X|`)
Generate a relation that contains all tuples that are a combination of two tuples (one from each relation) with common value(s) for one or more attributes.
> **Note**: the matching values must be in attributes that have the same name in both relations.
Syntax: `(R |X| S)`
```sql
SELECT * FROM R NATURAL JOIN S;
```
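A minimal sketch of a natural join using Python's built-in `sqlite3` module. The relations `R(a_id, b_id)` and `S(a_id, b_id, val)` and their rows are hypothetical. The join matches on the attributes the two relations share by name (`a_id` and `b_id`), and the shared attributes appear only once in the output. An equivalent `JOIN ... USING` form is included for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (a_id TEXT, b_id INTEGER)")
conn.execute("CREATE TABLE S (a_id TEXT, b_id INTEGER, val TEXT)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [("a1", 101), ("a2", 102), ("a3", 103)])
conn.executemany(
    "INSERT INTO S VALUES (?, ?, ?)",
    [("a3", 103, "XXX"), ("a4", 104, "YYY"), ("a5", 105, "ZZZ")],
)

# Natural join: matches on every attribute name common to R and S.
cursor = conn.execute("SELECT * FROM R NATURAL JOIN S")
columns = [col[0] for col in cursor.description]
natural_rows = cursor.fetchall()

# Same join with the shared attributes listed explicitly.
using_rows = conn.execute("SELECT * FROM R JOIN S USING (a_id, b_id)").fetchall()

print(columns, natural_rows, using_rows, sep="\n")
```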