mongodb: mongod & logging notes

This commit is contained in:
Marcello 2022-09-08 00:16:32 +02:00
parent 197703eb04
commit 8f08bd96a4

# MongoDB

The database is a container of **collections**. The collections are containers of **documents**.
The documents are _schema-less_, that is, they have a dynamic structure that can change between documents in the same collection.

## Data Types

| Type              | Document                                         | Function                |
| ----------------- | ------------------------------------------------ | ----------------------- |
| Text              | `"Text"`                                         |                         |
| Boolean           | `true`                                           |                         |
| Number            | `42`                                             |                         |
| ObjectId          | `"_id": {"$oid": "<id>"}`                        | `ObjectId("<id>")`      |
| ISODate           | `"<key>": {"$date": "YYYY-MM-DDThh:mm:ss.sssZ"}` | `ISODate("YYYY-MM-DD")` |
| Timestamp         |                                                  | `Timestamp(11421532)`   |
| Embedded Document | `{"a": {...}}`                                   |                         |
| Embedded Array    | `{"b": [...]}`                                   |                         |

It's mandatory for each document to have a unique field `_id`.
MongoDB automatically creates an `ObjectId()` if it's not provided.
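As an illustrative sketch (the field names here are invented), a single document can mix several of these types:

```json
{
    "_id": { "$oid": "63190e934d27a31ed9e5b269" },
    "name": "Wooden Table",
    "available": true,
    "quantity": 42,
    "addedOn": { "$date": "2022-09-08T00:16:32.000Z" },
    "dimensions": { "width": 80, "height": 75 },
    "tags": ["furniture", "wood"]
}
```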
## Databases & Collections Usage

To create a database it is sufficient to switch to a non-existing one with `use <database>` (implicit creation).
The database is not actually created until a document is inserted.

```sh
show dbs  # list all databases
use <database>  # use a particular database
show collections  # list all collections for the current database

db.dropDatabase()  # delete current database

db.createCollection(name, {options})  # explicit collection creation
db.<collection>.insertOne({document})  # implicit collection creation
```
## Operators (MQL Syntax)

```json
/* --- Update Operators --- */
{ "$inc": { "<key>": "<increment>", ... } }  // increment value
{ "$set": { "<key>": "<value>", ... } }  // set value
{ "$push": { "<key>": "<value>", ... } }  // add a value to an array field or turn field into array

/* --- Query Operators --- */
{ "<key>": { "$in": [ "<value_1>", "<value_2>", ...] } }  // membership
{ "<key>": { "$nin": [ "<value_1>", "<value_2>", ...] } }  // negated membership
{ "<key>": { "$exists": true } }  // field exists

/* --- Comparison Operators (DEFAULT: $eq) --- */
{ "<key>": { "$gt": "<value>" }}  // >
{ "<key>": { "$gte": "<value>" }}  // >=
{ "<key>": { "$lt": "<value>" }}  // <
{ "<key>": { "$lte": "<value>" }}  // <=
{ "<key>": { "$eq": "<value>" }}  // ==
{ "<key>": { "$ne": "<value>" }}  // !=

/* --- Logic Operators (DEFAULT: $and) --- */
{ "$and": [ { "<expression>" }, ...] }
{ "$or": [ { "<expression>" }, ...] }
{ "$nor": [ { "<expression>" }, ...] }
{ "$not": { "<expression>" } }

/* --- Array Operators --- */
{ "<key>": { "$all": ["<value>", "<value>", ...] } }  // field contains all values
{ "<key>": { "$size": "<value>" } }  // array has the given length
{ "<array-key>": { "$elemMatch": { "<item-key>": "<expression>" } } }  // elements in array must match an expression

/* --- REGEX Operator --- */
{ "<key>": { "$regex": "/pattern/", "$options": "<options>" } }
{ "<key>": { "$regex": "pattern", "$options": "<options>" } }
{ "<key>": { "$regex": "/pattern/<options>" } }
{ "<key>": "/pattern/<options>" }
```
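As a sketch of how these operators compose in a single filter document (collection and field names are invented):

```sh
# users between 18 and 30 (conditions on the same key combine implicitly)
db.users.find({ "age": { "$gte": 18, "$lte": 30 } })

# favorites must contain BOTH values, and name must match a case-insensitive regex
db.users.find({
    "favorites": { "$all": ["pizza", "sushi"] },
    "name": { "$regex": "^al", "$options": "i" }
})

# explicit $or between expressions
db.users.find({ "$or": [ { "age": { "$lt": 18 } }, { "verified": false } ] })
```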
### Expressive Query Operator

> **Note**: `$<key>` is used to access the value of the field dynamically

```json
{ "$expr": { "<expression>" } }  // aggregation expression, variables, conditional expressions
{ "$expr": { "$<comparison_operator>": [ "$<key>", "$<key>" ] } }  // compare field values (operators use aggregation syntax)
```
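For instance, `$expr` can compare two fields of the same document, which plain query operators cannot do. A sketch, assuming a hypothetical collection with `spent` and `budget` fields:

```sh
# documents that spent more than their budget
db.companies.find({ "$expr": { "$gt": [ "$spent", "$budget" ] } })
```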
## Mongo Query Language (MQL)

### Insertion

It's possible to insert a single document with the command `insertOne()` or multiple documents with `insertMany()`.

Insertion results:

- error -> rollback
- success -> the entire document gets saved

```sh
# explicit collection creation, all options are optional
db.createCollection( <name>,
   {
     capped: <boolean>,
     autoIndexId: <boolean>,
     size: <number>,
     max: <number>,
     storageEngine: <document>,
     validator: <document>,
     validationLevel: <string>,
     validationAction: <string>,
     indexOptionDefaults: <document>,
     viewOn: <string>,
     pipeline: <pipeline>,
     collation: <document>,
     writeConcern: <document>
   }
)

db.createCollection("name", { capped: true, size: max_bytes, max: max_docs_num })  # creation of a capped collection
# SIZE: int - will be rounded to a multiple of 256

# implicit creation at doc insertion
db.<collection>.insertOne({ document }, options)  # insert a document in a collection
db.<collection>.insertMany([ { document }, { document }, ... ], options)  # insert multiple docs
db.<collection>.insertMany([ { document }, { document } ], { "ordered": false })  # allow unordered insertion: only the documents that cause errors won't be inserted
```

> **Note**: If `insertMany()` fails, the already inserted documents are not rolled back, but all the subsequent ones (even the correct ones) will not be inserted.
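A sketch of the difference, assuming an empty hypothetical `items` collection in each case:

```sh
# ordered (default): the duplicate _id stops the batch, so only { "_id": 1 } is saved
db.items.insertMany([ { "_id": 1 }, { "_id": 1 }, { "_id": 2 } ])

# unordered: the duplicate is reported as an error, but { "_id": 2 } is still inserted
db.items.insertMany([ { "_id": 1 }, { "_id": 1 }, { "_id": 2 } ], { "ordered": false })
```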
### Querying

```sh
db.<collection>.findOne()  # find only one document
db.<collection>.find(filter)  # show selected documents
db.<collection>.find().pretty()  # show documents formatted
db.<collection>.find().limit(n)  # show n documents
db.<collection>.find().limit(n).skip(k)  # show n documents skipping k docs
db.<collection>.find().count()  # number of found docs
db.<collection>.find().sort({ "<key-1>": 1, ..., "<key-n>": -1 })  # show documents sorted by the specified keys in ascending (1) or descending (-1) order

# projection
db.<collection>.find(filter, { "<key>": 1 })  # show selected values from documents (1 or true => show, 0 or false => don't show; can't mix 0 and 1)
db.<collection>.find(filter, { _id: 0, "<key>": 1 })  # only _id can be set to 0 while other keys are at 1
db.<collection>.find(filter, { "<array-key>": { "$elemMatch": { "<item-key>": "<expression>" } } })  # project only elements matching the expression

# sub documents & arrays
db.<collection>.find({ "<key>.<sub-key>.<sub-key>": "<expression>" })
db.<collection>.find({ "<array-key>.<index>.<sub-key>": "<expression>" })

# GeoJSON - https://docs.mongodb.com/manual/reference/operator/query/near/index.html
db.<collection>.find(
    {
        <location field>: {
            $near: {
                $geometry: { type: "Point", coordinates: [ <longitude>, <latitude> ] },
                $maxDistance: <distance in meters>,
                $minDistance: <distance in meters>
            }
        }
    }
)

db.<collection>.find().hint( { "<key>": 1 } )  # specify the index
db.<collection>.find().hint( "index-name" )  # specify the index using the index name
db.<collection>.find().hint( { $natural: 1 } )  # force the query to perform a forwards collection scan
db.<collection>.find().hint( { $natural: -1 } )  # force the query to perform a reverse collection scan
```

> **Note**: `{ <key>: <value> }` in the case of an array field will match if the array _contains_ the value
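A sketch of this containment behaviour, with invented collections:

```sh
# matches a document storing { "tags": ["cooking", "travel"] },
# because the array CONTAINS the value
db.posts.find({ "tags": "cooking" })

# $elemMatch instead requires a SINGLE element to satisfy all conditions at once
db.scores.find({ "results": { "$elemMatch": { "$gte": 80, "$lt": 85 } } })
```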
### Updating

[Update Operators](https://docs.mongodb.com/manual/reference/operator/update/ "Update Operators Documentation")

```sh
db.<collection>.replaceOne(filter, update, options)
db.<collection>.updateOne(filter, update, {upsert: true})  # modify document if existing, insert otherwise

db.<collection>.updateOne(filter, { "$push": { ... }, "$set": { ... }, "$inc": { ... }, ... })
```
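A concrete sketch, assuming a hypothetical `inventory` collection with `sku`, `price` and `stock` fields:

```sh
# combine several update operators in one call
db.inventory.updateOne(
    { "sku": "ABC123" },
    { "$set": { "price": 9.99 }, "$inc": { "stock": -1 } }
)

# upsert: inserts { "sku": "XYZ999", "price": 4.99 } if no document matches the filter
db.inventory.updateOne(
    { "sku": "XYZ999" },
    { "$set": { "price": 4.99 } },
    { "upsert": true }
)
```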
### Deletion

```sh
db.<collection>.deleteOne(filter, options)
db.<collection>.deleteMany(filter, options)

db.<collection>.drop()  # delete whole collection
db.dropDatabase()  # delete entire database
```
---

## MongoDB Database Tools

### [Mongoimport](https://docs.mongodb.com/database-tools/mongoimport/)

Utility to import all docs into a specified collection.
If the collection already exists, `--drop` deletes it before re-uploading it.

**WARNING**: CSV separators must be commas (`,`)

```sh
mongoimport <options> <connection-string> <file>

--uri=<connectionString>
--host=<hostname><:port>, -h=<hostname><:port>
--username=<username>, -u=<username>
--password=<password>, -p=<password>
--collection=<collection>, -c=<collection>  # specifies the collection to import
--ssl  # enables connection to a mongod or mongos that has TLS/SSL support enabled
--type <json|csv|tsv>  # specifies the file type to import (DEFAULT: json)
--drop  # drops the collection before importing the data from the input
--headerline  # if file is CSV and first line is header
--jsonArray  # accepts the import of data expressed with multiple MongoDB documents within a single json array (MAX 16 MB)
```

### [Mongoexport](https://docs.mongodb.com/database-tools/mongoexport/)

Utility to export documents into a specified file.

```sh
mongoexport --collection=<collection> <options> <connection-string>

--uri=<connectionString>
--host=<hostname><:port>, -h=<hostname><:port>
--username=<username>, -u=<username>
--password=<password>, -p=<password>
--db=<database>, -d=<database>
--collection=<collection>, -c=<collection>
--type=<json|csv>
--out=<file>, -o=<file>  # specifies a file to write the export to (DEFAULT: stdout)
--jsonArray  # write the entire contents of the export as a single json array
--pretty  # outputs documents in a pretty-printed JSON format
--skip=<number>
--limit=<number>  # specifies a maximum number of documents to include in the export
--sort=<JSON>  # specifies an ordering for exported results
```

### [Mongodump][mongodump_docs] & [Mongorestore][mongorestore_docs]

`mongodump` exports the content of a running server into `.bson` files.
`mongorestore` restores backups generated with `mongodump` to a running server.

[mongodump_docs]: https://docs.mongodb.com/database-tools/mongodump/
[mongorestore_docs]: https://docs.mongodb.com/database-tools/mongorestore/
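A sketch of a round trip between the two tools (host, database and file names are invented):

```sh
# export a collection as a single JSON array
mongoexport --host=localhost:27017 --db=shop --collection=products --jsonArray --out=products.json

# re-import it into another database, dropping any existing collection first
mongoimport --host=localhost:27017 --db=shop_backup --collection=products --jsonArray --drop products.json
```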
---

## [Indexes](https://docs.mongodb.com/manual/indexes/ "Index Documentation")

Indexes support the efficient execution of queries in MongoDB.

Without indexes, MongoDB must perform a _collection scan_ (_COLLSCAN_): scan every document in a collection to select those documents that match the query statement.
If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect (_IXSCAN_).

Indexes are special data structures that store a small portion of the collection's data set in an easy-to-traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field. The ordering of the index entries supports efficient equality matches and range-based query operations. In addition, MongoDB can return sorted results by using the ordering in the index.

Indexes _slow down write operations_ since the index must be updated at every write.

![IXSCAN](../img/mongodb_ixscan.png ".find() using an index")

### [Index Types](https://docs.mongodb.com/manual/indexes/#index-types)

- **Normal**: fields sorted by name
- **Compound**: multiple fields sorted by name
- **Multikey**: values of sorted arrays
- **Text**: ordered text fragments
- **Geospatial**: ordered geodata

**Sparse** indexes only contain entries for documents that have the indexed field, even if the index field contains a null value. The index skips over any document that is missing the indexed field.

### Diagnosis and query planning

```sh
db.<collection>.find({...}).explain()  # explain won't accept other functions
db.explain().<collection>.find({...})  # can accept other functions
db.explain("executionStats").<collection>.find({...})  # more info
```

### Index Creation

```sh
db.<collection>.createIndex( <key and index type specification>, <options> )

db.<collection>.createIndex( { "<key>": <type>, "<key>": <type>, ... } )  # normal, compound or multikey (field is array) index
db.<collection>.createIndex( { "<key>": "text" } )  # text index
db.<collection>.createIndex( { "<key>": "2dsphere" } )  # geospatial 2dsphere index

# sparse index
db.<collection>.createIndex(
    { "<key>": <type>, "<key>": <type>, ... },
    { sparse: true }  # sparse option
)

# custom name
db.<collection>.createIndex(
    { <key and index type specification> },
    { name: "index-name" }  # name option
)
```

### [Index Management](https://docs.mongodb.com/manual/tutorial/manage-indexes/)

```sh
# view all db indexes
db.getCollectionNames().forEach(function(collection) {
    indexes = db[collection].getIndexes();
    print("Indexes for " + collection + ":");
    printjson(indexes);
});

db.<collection>.getIndexes()  # view collection's indexes

db.<collection>.dropIndexes()  # drop all indexes
db.<collection>.dropIndex("index-name")  # drop a specific index by name
db.<collection>.dropIndex({ "<key>": 1 })  # drop a specific index by key specification
```
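Putting creation and diagnosis together (collection and field names are invented):

```sh
# compound index on a hypothetical users collection, with a custom name
db.users.createIndex({ "surname": 1, "age": -1 }, { "name": "surname_age_idx" })

# the winning plan should now report an IXSCAN instead of a COLLSCAN
db.explain("executionStats").users.find({ "surname": "Rossi", "age": { "$gte": 30 } })
```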
---

## Roles and permissions

**Authentication**: identifies valid users
**Authorization**: identifies what a user can do

- **userAdminAnyDatabase**: can admin every db in the instance (role must be created on the admin db)
- **userAdmin**: can admin the specific db in which it is created
- **readWrite**: can read and write in the specific db in which it is created
- **read**: can read the specific db in which it is created

```sh
# create users in the current MongoDB instance (one user per call)
db.createUser(
    {
        user: "dbAdmin",
        pwd: "password",
        roles: [
            { role: "userAdminAnyDatabase", db: "admin" }
        ]
    }
)

db.createUser(
    {
        user: "username",
        pwd: "password",
        roles: [
            { role: "role", db: "database" }
        ]
    }
)
```
---

## Cluster Administration

### `mongod`

`mongod` is the main daemon process for MongoDB. It's the core process of the database,
handling connections, requests and persisting the data.

`mongod` default configuration:

- port: `27017`
- dbpath: `/data/db`
- bind_ip: `localhost`
- auth: disabled

[`mongod` config file][mongod_config_file]
[`mongod` command line options][mongod_cli_options]

[mongod_config_file]: https://www.mongodb.com/docs/manual/reference/configuration-options "`mongod` config file docs"
[mongod_cli_options]: https://www.mongodb.com/docs/manual/reference/program/mongod/#options "`mongod` command line options docs"

### Basic Shell Helpers

```sh
db.<method>()               # database interaction
db.<collection>.<method>()  # collection interaction
rs.<method>()               # replica set deployment and management
sh.<method>()               # sharded cluster deployment and management

# user management
db.createUser()
db.dropUser()

# collection management
db.renameCollection()
db.<collection>.createIndex()
db.<collection>.drop()

# database management
db.dropDatabase()
db.createCollection()

# database status
db.serverStatus()

# database command (underlying the shell helpers and drivers)
db.runCommand({ "<COMMAND>" })

# help
db.commandHelp("<command>")
```
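The helpers are thin wrappers over database commands; a sketch of the equivalence:

```sh
# the helper and the underlying command return the same information
db.serverStatus()
db.runCommand({ "serverStatus": 1 })

# help text for a specific command
db.commandHelp("serverStatus")
```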
### Logging

The **process log** displays activity on the MongoDB instance and collects the activity of its various components.

Log verbosity levels:

- `-1`: inherit from parent
- `0`: default verbosity (information)
- `1` to `5`: increases the verbosity up to debug messages

```sh
db.getLogComponents()  # get components and their verbosity
db.adminCommand({ "getLog": "<scope>" })  # retrieve logs (getLog must be run on the admin db -> adminCommand)
db.setLogLevel(<level>, "<component>")  # set log level (the output shows the OLD verbosity levels)

tail -f /path/to/mongod.log  # read the end of the log file
```

> **Note**: Log message structure: `<timestamp> <severity-level> <component> <connection> <event> ...`
### Database Profiling

Profiling levels:

- `0`: no profiling
- `1`: data on operations slower than `slowms` (default 100 ms)
- `2`: data on all operations

Events captured by the profiler:

- CRUD operations
- administrative operations
- configuration operations

> **Note**: Logs are saved in the `system.profile` _capped_ collection.

```sh
db.setProfilingLevel(n)  # set profiler level
db.setProfilingLevel(1, { slowms: <ms> })
db.getProfilingStatus()  # check profiler status

db.system.profile.find().limit(n).sort( {} ).pretty()  # see logs
db.system.profile.find().limit(n).sort( { ts: -1 } ).pretty()  # sort by decreasing timestamp
```
### [Replica set](https://docs.mongodb.com/manual/replication/)

A **replica set** in MongoDB is a group of `mongod` processes that maintain the _same dataset_. Replica sets provide redundancy and high availability, and are the basis for all production deployments.

### Sharding

**Sharding** is a MongoDB concept through which big datasets are subdivided into smaller sets and distributed across multiple instances of MongoDB.
It's a technique used to improve the performance of large queries over large quantities of data that require a lot of resources from the server.

A collection containing several documents is split into smaller collections (_shards_).
Shards are implemented via clusters, which are simply groups of MongoDB instances.

Shard components are:

- shards (min 2): instances of MongoDB that contain a subset of the data
- a config server: an instance of MongoDB which contains metadata on the cluster, that is, the set of instances that hold the shard data
- a router (or `mongos`): an instance of MongoDB used to redirect instructions from the client to the correct server

![Sharded Cluster](../img/mongodb_shared-cluster.png "Components of a sharded cluster")
---
## [Aggregation Framework](https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/)
Sequence of operations applied to a collection as a _pipeline_ to get a result: `db.collection.aggregate(pipeline, options)`.
Each step of the pipeline acts on its inputs and not on the original data in the collection.
### Variables
Variable syntax in aggregations:
- `$key`: field path
- `$$UPPERCASE`: system variable (e.g.: `$$CURRENT`)
- `$$foo`: user defined variable
### Aggregation Syntax
```sh
db.<collection>.aggregate([
{ "$project": { "_id": 0, "<key>": 1, ...} },
{ "$match": { "<query>" } },
{ "$group": {
"_id": "<expression>", # Group By Expression (Required)
"<key-1>": { "<accumulator-1>": "<expression-1>" },
...
}
},
{
"$lookup": {
"from": "<collection to join>",
"localField": "<field from the input documents>",
"foreignField": "<field from the documents of the 'from' collection>",
"as": "<output array field>"
}
},
{ "$sort": { "<key-1>": "<sort order>", "<key-2>": "<sort order>", ... } },
{ "$count": "<count-key>" },
    { "$skip": "<positive 64-bit integer>" },
    { "$limit": "<positive 64-bit integer>" },

    { ... }
])
```
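A concrete sketch of a pipeline, assuming a hypothetical `orders` collection with `status`, `customerId` and `amount` fields:

```sh
# total spent per customer among completed orders, top 3 spenders first
db.orders.aggregate([
    { "$match": { "status": "completed" } },
    { "$group": { "_id": "$customerId", "total": { "$sum": "$amount" } } },
    { "$sort": { "total": -1 } },
    { "$limit": 3 }
])
```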