MongoDB

From ArchWiki

MongoDB (from humongous) is a source-available document-oriented database system developed and supported by MongoDB Inc. (formerly 10gen). It is part of the NoSQL family of database systems. Instead of storing data in tables as is done in a "classical" relational database, MongoDB stores structured data as JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster.

Installation

MongoDB has been removed from the official repositories due to its re-licensing issues [1].

Install one of the following for the latest version available:

Alternatively, there are older versions of MongoDB available:

Tools

Other MongoDB tools can be found packaged as well:

  • MongoDB Shell — The new mongosh tool, which replaces the legacy mongo shell[2]. It is compatible with MongoDB 4.0 or newer.
https://www.mongodb.com/docs/mongodb-shell/ || mongosh-binAUR
  • MongoDB Compass — An interactive GUI tool for querying, optimizing, and analyzing MongoDB data
https://www.mongodb.com/docs/compass/ || mongodb-compassAUR, mongodb-compass-readonlyAUR, mongodb-compass-isolatedAUR
  • MongoDB Database Tools — Provides import, export, and diagnostic capabilities
https://www.mongodb.com/docs/database-tools/ || mongodb-toolsAUR
  • Mingo — A proprietary, EULA licensed, MongoDB GUI built on Electron, designed to aid MongoDB developers with managing their databases.
https://mingo.io/ || mingoAUR

Usage

Start/Enable the mongodb.service daemon.

Note: During the first startup of the mongodb service, it will pre-allocate space, by creating large files (for its journal and other data). This step may take a while, during which the MongoDB shell is unavailable.

To access the MongoDB shell [3]:

$ mongosh

Or, if authentication is configured:

$ mongosh -u userName

Warning: The legacy mongo shell was deprecated in MongoDB 5.0 and replaced with mongosh[4]. Although it is still available in some MongoDB packages, it is highly recommended that you switch to mongosh starting with version 5.0.

Configuration

File Format

MongoDB uses a YAML-based configuration file format. See https://docs.mongodb.com/manual/reference/configuration-options/ for available configuration options.

/etc/mongodb.conf
systemLog:
   destination: file
   path: "/var/log/mongodb/mongod.log"
   logAppend: true
storage:
   journal:
      enabled: true
processManagement:
   fork: true
net:
   bindIp: 127.0.0.1
   port: 27017
setParameter:
   enableLocalhostAuthBypass: false
..

Requiring Authentication

Warning: By default, MongoDB does not require any authentication. It only listens on the localhost interface by default, which prevents outside access, but any local user can still connect without authenticating, which may expose the database(s). It is recommended to enable access control to prevent unwanted access. If you set MongoDB to listen on 0.0.0.0, you MUST enable access control, or your data will be stolen and held for ransom.

To create a MongoDB user account with administrator access [5]:

$ mongosh
use admin
db.createUser(
  {
    user: "myUserAdmin",
    pwd: "abc123",
    roles: [ { role: "userAdminAnyDatabase", db: "admin" }, "readWriteAnyDatabase" ]
  }
)

Append the following to your /etc/mongodb.conf.

/etc/mongodb.conf
security:
  authorization: "enabled"

Restart mongodb.service.

NUMA

Running MongoDB on a system with Non-Uniform Memory Access (NUMA) can significantly degrade performance. [6]

To see if your system uses NUMA:

# dmesg | grep -i numa

Also, /var/log/mongodb/mongod.log will show warnings if NUMA is in use and MongoDB is not started through numactl. (The mongo shell will also show this, but only if you do not have authentication enabled.)
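
The kernel log may have rotated by the time you check it, so dmesg can come up empty. As an alternative sketch, the NUMA nodes the kernel exposes can be counted under /sys/devices/system/node; more than one node means NUMA is in use. The function name count_numa_nodes is only illustrative.

```shell
# Count NUMA nodes by listing nodeN directories under sysfs.
# The optional first argument overrides the sysfs path (useful for testing).
count_numa_nodes() {
    node_root="${1:-/sys/devices/system/node}"
    count=0
    for dir in "$node_root"/node[0-9]*; do
        # An unmatched glob stays literal, so only count real directories.
        [ -d "$dir" ] && count=$((count + 1))
    done
    echo "$count"
}
```

Calling count_numa_nodes without arguments inspects the running system.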

If your system uses NUMA, to improve performance, you should make MongoDB start through numactl.

Edit mongodb.service according to the package you installed.

If using mongodbAUR, change it from:

ExecStart=/usr/bin/mongod $OPTIONS

To:

ExecStart=/usr/bin/numactl --interleave=all /usr/bin/mongod $OPTIONS

If using mongodb-binAUR, change it from:

ExecStart=/usr/bin/mongod --quiet --config /etc/mongodb.conf

To:

ExecStart=/usr/bin/numactl --interleave=all /usr/bin/mongod --quiet --config /etc/mongodb.conf

Zone reclaim also needs to be disabled, but on Arch, /proc/sys/vm/zone_reclaim_mode defaults to 0, which already disables it.

Reenable and restart mongodb.service as needed.
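
Instead of editing the packaged unit file, the ExecStart change above can be kept in a drop-in file, which survives package upgrades. The following is a minimal sketch: the function name write_numactl_dropin and the file name numactl.conf are arbitrary choices, and the original command line must be copied from the unit your package ships.

```shell
# Write a drop-in that wraps an existing ExecStart command in numactl.
# $1: drop-in directory, e.g. /etc/systemd/system/mongodb.service.d
# $2: the original ExecStart command line from your package's unit
write_numactl_dropin() {
    dropin_dir="$1"
    orig_cmd="$2"
    mkdir -p "$dropin_dir" || return 1
    # The empty ExecStart= clears the packaged command before the new one is set.
    printf '[Service]\nExecStart=\nExecStart=/usr/bin/numactl --interleave=all %s\n' \
        "$orig_cmd" > "$dropin_dir/numactl.conf"
}
```

For mongodb-binAUR, this would be called as root with /etc/systemd/system/mongodb.service.d as the first argument and '/usr/bin/mongod --quiet --config /etc/mongodb.conf' as the second, followed by systemctl daemon-reload.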

Clean Start and Stop

By default, systemd kills any service that has not finished starting or stopping within 90 seconds of being asked to do so.

mongodbAUR makes systemd wait as long as it takes for MongoDB to start, but mongodb-binAUR does not. Both packages allow systemd to kill MongoDB after it is asked to stop, if it has not finished within 90 seconds.

Large MongoDB databases can take a considerable amount of time to cleanly shut down, especially if swap is being used. (An active 450 GB database on a top-of-the-line NVMe drive with 64 GB RAM and 16 GB swap can take an hour to shut down.)

By default, MongoDB uses journaling. [7] With journaling, an unclean shutdown should not pose a risk of data loss. However, if not shut down cleanly, large MongoDB databases can take a considerable amount of time to start back up. Requiring a clean shutdown is therefore a trade-off between a slower shutdown and a slower startup. [8]

Warning: If you disable journaling, failing to require a clean shutdown severely risks data loss, so you really need to require a clean shutdown. [9]

To prevent systemd from killing MongoDB after 90 seconds, edit mongodb.service.

To allow MongoDB to shut down cleanly, append the following to the [Service] section. On large databases, this may substantially increase your system shutdown time, but it speeds up the next MongoDB start:

TimeoutStopSec=infinity

If MongoDB needs a long time to start back up, it can be very problematic for systemd to keep killing and restarting it every 90 seconds [10], so mongodbAUR prevents this. If using mongodb-binAUR, to make systemd wait as long as it takes for MongoDB to start, append to the [Service] section:

TimeoutStartSec=infinity
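
Both timeout settings above can live in a single drop-in file rather than in the packaged unit. A minimal sketch, assuming a drop-in directory such as /etc/systemd/system/mongodb.service.d; the function name write_timeout_dropin and the file name timeouts.conf are arbitrary.

```shell
# Write a drop-in that removes the 90 second start and stop timeouts.
# $1: drop-in directory, e.g. /etc/systemd/system/mongodb.service.d
write_timeout_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/timeouts.conf" <<'EOF'
[Service]
# Wait for a clean shutdown instead of killing mongod after 90 seconds.
TimeoutStopSec=infinity
# Wait as long as startup takes (only needed where the package does not set it).
TimeoutStartSec=infinity
EOF
}
```

Run systemctl daemon-reload afterwards so that systemd picks up the new file.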

Troubleshooting

MongoDB will not start

If MongoDB will not start and you just upgraded to mongodbAUR 4.0.6-2+, you probably have a custom /etc/mongodb.conf. When MongoDB was in the official repositories, it used an Arch-specific configuration file and a systemd service type of simple. The package now supplies upstream's systemd service and configuration files, which instead use a service type of forking. Pacman automatically upgrades the systemd service file, but it only upgrades /etc/mongodb.conf if you never modified it. If you did modify it, systemd expects mongod to fork, while the old configuration file tells it not to. To fix this, do one of the following:

  • Switch to the new configuration file installed at /etc/mongodb.conf.pacnew and carry over any changes from the old file that you still need. Keep in mind that the new file is in YAML format, while the old one is probably in the MongoDB 2.4 format.
  • Modify your existing configuration file to enable forking. (To continue using the old 2.4 file format instead of YAML, adding fork: true should be what is needed.)

Check that mongodb.service is configured to use the correct database location.

Add --dbpath /var/lib/mongodb to the ExecStart line:

ExecStart=/usr/bin/numactl --interleave=all /usr/bin/mongod --quiet --config /etc/mongodb.conf --dbpath /var/lib/mongodb

Check that there is at least 3 GB of space available for its journal files; otherwise, MongoDB can fail to start (without issuing a message to the user):

$ df -h /var/lib/mongodb/
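
The free-space check above can also be scripted. This sketch compares the available space reported by df against a threshold in KiB; the function name has_free_kib is illustrative, and 3145728 KiB is used for the 3 GB mentioned above.

```shell
# Succeed if the file system holding $1 has at least $2 KiB available.
has_free_kib() {
    path="$1"
    min_kib="$2"
    # -P prints one POSIX-format line per file system; column 4 is available space.
    avail_kib=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kib" ] && [ "$avail_kib" -ge "$min_kib" ]
}
```

For example, has_free_kib /var/lib/mongodb 3145728 || echo "less than 3 GiB free" prints a message only when space is short.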

Check if the mongod.lock lock file is empty or not:

# ls -lisa /var/lib/mongodb

If it is not empty, stop mongodb.service. Run a repair on the database, specifying the dbpath (/var/lib/mongodb/ is the default --dbpath in Arch Linux):

# mongod --dbpath /var/lib/mongodb/ --repair

Upon completion, the dbpath should contain the repaired data files and an empty mongod.lock file.
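
The lock file check can be scripted instead of reading the ls output by eye. This is a minimal sketch, assuming that a non-empty mongod.lock is the state that calls for a repair; the function name lock_file_nonempty is illustrative.

```shell
# Succeed if mongod.lock in the dbpath $1 exists and is not empty.
lock_file_nonempty() {
    dbpath="${1:-/var/lib/mongodb}"
    # -s is true only for an existing file with a size greater than zero.
    [ -s "$dbpath/mongod.lock" ]
}
```

After a successful repair, lock_file_nonempty should fail, since the lock file is expected to be empty again.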

Warning: In dire situations, you can remove the file, start the database using the possibly corrupt files, and attempt to recover data from the database. However, it is impossible to predict the state of the database in these situations. See upstream document for detail.

After running the repair as root, the files will be owned by the root user, whereas Arch Linux runs MongoDB under a different user. Use chown to change the ownership of the files back to the correct user:

# chown -R mongodb: /var/{log,lib}/mongodb/

Some computers just cannot run MongoDB

Some computers simply will not run MongoDB because their CPU lacks an instruction set it needs. For instance, MongoDB could be installed on a GPD MicroPC, which has an Intel "Gemini Lake Refresh"/Goldmont Plus microarchitecture, but running the MongoDB Shell returned the following:

$ mongosh 'mongodb://localhost:27017'
Current Mongosh Log ID: 642b48661e2fc4dd5bda05d0
Connecting to:          mongodb://localhost:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+1.8.0
MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017

Furthermore, coredumpctl info reported a signal 4 (ILL), meaning that the program attempted to execute an illegal instruction. In other words, the CPU did not have the instruction set needed to run the server, at least not locally.

The same machine was able to connect to MongoDB Atlas, where the server is hosted remotely on a machine that can run MongoDB, so no local mongodb.service is required.
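
Upstream's platform notes state that MongoDB 5.0 and later require the AVX instruction set on x86_64. Whether AVX is the instruction missing on the machine described above is an assumption, but the CPU flags are quick to check. A minimal sketch with an illustrative function name:

```shell
# Succeed if CPU flag $1 is listed in the cpuinfo file $2 (default /proc/cpuinfo).
has_cpu_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # Match the flag as a whole word on a flags line, so "avx" does not match "avx2".
    grep -m1 '^flags' "$cpuinfo" | grep -qw -- "$flag"
}
```

For example, has_cpu_flag avx || echo "no AVX support" prints a message on CPUs without AVX.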

Warning about Transparent Huge Pages (THP)

To disable this feature permanently, use a tmpfiles.d configuration file:

/etc/tmpfiles.d/mongodb.conf
w /sys/kernel/mm/transparent_hugepage/enabled - - - - never
w /sys/kernel/mm/transparent_hugepage/defrag - - - - never

To disable THP at runtime, write to the sysfs files directly:

# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# echo never > /sys/kernel/mm/transparent_hugepage/defrag
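
To confirm which setting is active, read the same sysfs file back; the kernel marks the current value with square brackets, for example "always madvise [never]". A minimal sketch that extracts the bracketed word (the function name thp_active_setting is illustrative):

```shell
# Print the active THP setting, i.e. the bracketed word in the sysfs file.
# The optional first argument overrides the file path (useful for testing).
thp_active_setting() {
    thp_file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    # Keep only the word between the square brackets.
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$thp_file"
}
```

Pass /sys/kernel/mm/transparent_hugepage/defrag as the argument to check the defrag setting in the same way.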

Warning about Soft rlimits too low

If you are using the systemd service, edit the unit file:

[Service]
# Other directives omitted
# (file size)
LimitFSIZE=infinity
# (cpu time)
LimitCPU=infinity
# (virtual memory size)
LimitAS=infinity
# (locked-in-memory size)
LimitMEMLOCK=infinity
# (open files)
LimitNOFILE=64000
# (processes/threads)
LimitNPROC=64000

See following link for further details: Further reference
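
To verify that the limits reached the running process, the kernel's view can be read from /proc/PID/limits. This sketch prints the soft limit for open files; the function name soft_nofile_limit is illustrative.

```shell
# Print the soft "Max open files" limit from a /proc/PID/limits style file.
# The optional first argument selects the file (default: the current shell).
soft_nofile_limit() {
    limits_file="${1:-/proc/self/limits}"
    # Fields: three words of the limit name, then soft limit, hard limit, units.
    awk '/^Max open files/ { print $4 }' "$limits_file"
}
```

For the running daemon, call it as soft_nofile_limit "/proc/$(pidof mongod)/limits"; it should report 64000 once the LimitNOFILE setting above has taken effect.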