Set Up a MongoDB Sharded Replica Cluster on AWS

 

 


Prerequisite
EC2 Instance

Replicas

Resources

Shard Cluster
1 shard = 1 replica set with 3 instances

Amazon EC2

Description: MongoDB Shard Replicas r6g.medium 1VCPU/ 8GB RAM

Region: Asia Pacific (Singapore)

Operating system (Linux), Quantity (15), Pricing strategy (On-Demand Instances), Storage amount (344 GB) [System 10GB] [Data 334GB], Instance type (r6g.medium), General Purpose SSD (gp3) - IOPS (3000), General Purpose SSD (gp3) - Throughput (125 MBps)
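
As a rough sanity check of the figures above, the sketch below derives the shard count and the storage totals. It assumes that the quantity of 15 instances means five replica sets of three instances each, which the estimate implies but does not state. The function names are illustrative only.

```shell
#!/bin/bash
# Capacity arithmetic for the shard tier, using the figures from the estimate.
INSTANCES=15          # EC2 quantity from the estimate
MEMBERS_PER_SHARD=3   # 1 shard = 1 replica set with 3 instances
DATA_GB=334           # data volume per instance
SYSTEM_GB=10          # system volume per instance

# Number of shards (assumption: every instance belongs to a 3-member shard).
shard_count() { echo $(( INSTANCES / MEMBERS_PER_SHARD )); }

# Storage attached to one instance (system + data).
per_instance_gb() { echo $(( SYSTEM_GB + DATA_GB )); }

# Usable data capacity: each shard holds one logical copy of its data,
# the other members of the replica set store the same data again.
usable_data_gb() { echo $(( $(shard_count) * DATA_GB )); }

# Raw provisioned storage across all instances.
provisioned_gb() { echo $(( INSTANCES * $(per_instance_gb) )); }

echo "shards=$(shard_count) per_instance=$(per_instance_gb)GB"
echo "usable=$(usable_data_gb)GB provisioned=$(provisioned_gb)GB"
```

Under these assumptions the cluster has 5 shards, 344 GB per instance, about 1670 GB of usable data capacity, and 5160 GB of provisioned storage.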

Config Server
1 replica set with 3 instances

Amazon EC2

Description: MongoDB ConfigServer r6g.medium 1VCPU/ 8GB RAM

Region: Asia Pacific (Singapore)

Operating system (Linux), Quantity (3), Pricing strategy (On-Demand Instances), Storage amount (20 GB) [System 10GB] [Data 10GB], Instance type (r6g.medium), General Purpose SSD (gp3) - IOPS (3000), General Purpose SSD (gp3) - Throughput (125 MBps)

Mongos Router
1 instance

Amazon EC2

Description: MongoDB Query Router c5a.large 2VCPU / 4 GB RAM

Region: Asia Pacific (Singapore)

Operating system (Linux), Quantity (1), Pricing strategy (On-Demand Instances), Storage amount (20 GB), Instance type (c5a.large), General Purpose SSD (gp3) - IOPS (3000), General Purpose SSD (gp3) - Throughput (125 MBps)

Subnet: private
Security group (inbound rules):

Protocol  Port   Source           Description
TCP       22     0.0.0.0/0        SSH
TCP       9100   192.168.0.0/16   prometheus - node exporter
TCP       27018  192.168.0.0/16   mongod - shard replicas
TCP       9216   192.168.0.0/16   prometheus - mongodb exporter
TCP       27019  192.168.0.0/16   mongod - config server
TCP       27017  192.168.0.0/16   mongos - query router
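
If you prefer to script the inbound rules instead of using the console, something like the following could work. It is a sketch that only prints one AWS CLI command per rule and does not execute anything. The group ID sg-EXAMPLE is a hypothetical placeholder, and the rule list is copied from the security group settings above.

```shell
#!/bin/bash
# Print (not run) one AWS CLI command per inbound rule.
# SG_ID is a hypothetical placeholder; replace it with the real group ID.
SG_ID="${SG_ID:-sg-EXAMPLE}"

# port|cidr|description, copied from the security group settings
RULES='22|0.0.0.0/0|SSH
9100|192.168.0.0/16|prometheus - node exporter
27018|192.168.0.0/16|mongod - shard replicas
9216|192.168.0.0/16|prometheus - mongodb exporter
27019|192.168.0.0/16|mongod - config server
27017|192.168.0.0/16|mongos - query router'

print_ingress_commands() {
  printf '%s\n' "$RULES" | while IFS='|' read -r port cidr desc; do
    # The description is emitted as a trailing shell comment for readability.
    echo "aws ec2 authorize-security-group-ingress --group-id $SG_ID --protocol tcp --port $port --cidr $cidr  # $desc"
  done
}

print_ingress_commands
```

Review the printed commands first, then paste them into a shell that has AWS credentials configured.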

Data Volumes

Create an LVM volume group and logical volume on the data disk so that the storage size can be increased later.


01.setup-installation-and-volumes.sh

#!/bin/bash
sudo lsblk

# Partition the data disk and check the alignment
sudo parted -s -a optimal -- /dev/nvme1n1 mklabel gpt
sudo parted -s -a optimal -- /dev/nvme1n1 mkpart primary 0% 100%
sudo parted -s -- /dev/nvme1n1 align-check optimal 1

# Create the physical volume, volume group and logical volume
sudo pvcreate /dev/nvme1n1p1
sudo vgcreate vg01 /dev/nvme1n1p1
sudo lvcreate -n data -l 100%FREE vg01

# Format the logical volume and mount it on /data
sudo mkfs.ext4 /dev/mapper/vg01-data
echo "/dev/mapper/vg01-data /data ext4 defaults 0 0" | sudo tee -a /etc/fstab
sudo mkdir /data
sudo mount -a

################################
# for mongos router only run
# the installation part
################################
echo "Installing mongod services"
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.4.list
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 656408E390CFB1F5
sudo apt update
sudo apt install -y mongodb-org net-tools

###############################
# shards replicas dir = mongodb
# config server dir = configdb
###############################
sudo mkdir /data/mongodb # or configdb
sudo chown -R mongodb:mongodb /data/mongodb # or configdb
sudo chmod -R 775 /data/mongodb # or configdb

Run this setup on the config server and shard replica nodes. On the mongos router, run only the installation part.

chmod +x 01.setup-installation-and-volumes.sh
./01.setup-installation-and-volumes.sh
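
Because /data sits on LVM, growing it later is mostly a matter of enlarging the EBS volume in AWS and then extending each layer on the instance. The sketch below shows one possible sequence. It assumes the same device and volume names as the setup script (/dev/nvme1n1, vg01, data) and that growpart is available (on Ubuntu it ships in cloud-guest-utils). The run and extend_data_volume helpers are illustrative, and by default the script only prints the commands.

```shell
#!/bin/bash
# Grow /data after the EBS volume has been enlarged in AWS.
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

extend_data_volume() {
  # 1. Grow partition 1 so it fills the enlarged disk.
  run sudo growpart /dev/nvme1n1 1
  # 2. Let LVM pick up the larger physical volume.
  run sudo pvresize /dev/nvme1n1p1
  # 3. Extend the logical volume and resize the ext4 filesystem (-r).
  run sudo lvextend -r -l +100%FREE /dev/vg01/data
}

extend_data_volume
```

Set DRY_RUN=0 to execute the commands for real once the printed sequence looks right.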


Configuration

ConfigDB Replicaset Cluster

Config server replica set with 2 nodes, port 27019.
configNode01: 192.168.119.223
configNode02: 192.168.105.35

sudo vim /etc/mongod.conf

storage:
  dbPath: /data/configdb
  journal:
    enabled: true
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongodConfig.log
net:
  port: 27019
  bindIp: 127.0.0.1,192.168.119.223
sharding:
  clusterRole: configsvr
replication:
  replSetName: ConfigReplSet

Apply the same configuration on configNode02, replacing only the bindIp address.

Make sure the sharding role and the replica set name are the ones for the config server:
clusterRole: configsvr
and
replSetName: ConfigReplSet
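
Since only bindIp differs between the config nodes, the file can also be generated from a small template. This is a sketch, not part of the original procedure: the render_configsvr_conf function is a hypothetical helper, and the values inside it are the ones from the configuration in this section.

```shell
#!/bin/bash
# Print a config-server mongod.conf for one node; only bindIp varies per node.
render_configsvr_conf() {
  node_ip="$1"
  cat <<EOF
storage:
  dbPath: /data/configdb
  journal:
    enabled: true
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongodConfig.log
net:
  port: 27019
  bindIp: 127.0.0.1,${node_ip}
sharding:
  clusterRole: configsvr
replication:
  replSetName: ConfigReplSet
EOF
}

# Preview the file for configNode02.
render_configsvr_conf 192.168.105.35
```

To write the file on a node, pipe the output through sudo tee, for example: render_configsvr_conf 192.168.105.35 | sudo tee /etc/mongod.conf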

Give the mongodb user and group ownership of the data, socket and log paths. Do this on each node.

sudo chown -R mongodb:mongodb /data
sudo chown -R mongodb:mongodb /var/lib/mongodb
sudo chown mongodb:mongodb /tmp/mongodb*
sudo chown -R mongodb:mongodb /var/log/mongodb

Start mongod on each config node and check that it is running:
sudo service mongod start
sudo service mongod status


Initiate the config server replica set on configNode01 and add configNode02 as a member:

$ mongo 127.0.0.1:27019
> rs.initiate()
> rs.add("192.168.105.35:27019")   // IP or hostname of configNode02, with the config server port
> rs.status()

If you register the secondary replica node by hostname, make sure the name is resolvable, either through your DNS server or through the local /etc/hosts file.
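
A quick way to confirm that a name is present in a hosts file before calling rs.add() might look like this. It is a minimal sketch: has_hosts_entry is a hypothetical helper, it only inspects a hosts-format file (not DNS), and the example runs against a scratch file so nothing on the system is modified.

```shell
#!/bin/bash
# Return success if NAME appears as a hostname in a hosts-format file.
# The file defaults to /etc/hosts; another path can be passed for testing.
has_hosts_entry() {
  name="$1"
  file="${2:-/etc/hosts}"
  # Skip comment lines, then compare every field after the IP with the name.
  awk -v n="$name" '
    /^[[:space:]]*#/ { next }
    { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
    END { exit(found ? 0 : 1) }
  ' "$file"
}

# Example with a scratch file; configNode02 is the node name used in this guide.
tmp_hosts="$(mktemp)"
printf '127.0.0.1 localhost\n192.168.105.35 configNode02\n' > "$tmp_hosts"
if has_hosts_entry configNode02 "$tmp_hosts"; then
  echo "configNode02 is listed"
fi
rm -f "$tmp_hosts"
```

On a real node, call has_hosts_entry configNode02 without a second argument to check /etc/hosts itself.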


Mongos Router

Router with 1 node. port 27017
routerNode01: 192.168.101.167

sudo vim /etc/mongod.conf

systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
net:
  port: 27017
  bindIp: 127.0.0.1,192.168.101.167
sharding:
  configDB: ConfigReplSet/192.168.119.223:27019,192.168.105.35:27019

Make sure sharding.configDB points to the config server replica set ConfigReplSet and lists its members with port 27019.
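
The configDB value follows the pattern replSetName/host:port,host:port. A small helper can assemble it from the node list and avoid typos in the port. This is only a sketch, and replset_seed_list is a hypothetical function name.

```shell
#!/bin/bash
# Build "<replSetName>/<host>:<port>,<host>:<port>,..." from a name, a port and hosts.
replset_seed_list() {
  name="$1"; port="$2"; shift 2
  members=""
  for host in "$@"; do
    # Prepend a comma only when members already holds an entry.
    members="${members:+$members,}${host}:${port}"
  done
  echo "${name}/${members}"
}

replset_seed_list ConfigReplSet 27019 192.168.119.223 192.168.105.35
# prints ConfigReplSet/192.168.119.223:27019,192.168.105.35:27019
```

The same helper produces the string for sh.addShard() when called with the shard replica set name, port 27018 and the shard node addresses.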

 

Create systemd service

Create a systemd unit file that runs mongos as a service:

sudo vim /etc/systemd/system/mongos.service

[Unit]
Description=MongoDB Database Server
Documentation=https://docs.mongodb.org/manual
After=network-online.target
Wants=network-online.target

[Service]
User=mongodb
Group=mongodb
EnvironmentFile=-/etc/default/mongod
ExecStart=/usr/bin/mongos --config /etc/mongod.conf
PIDFile=/var/run/mongodb/mongod.pid
# file size
LimitFSIZE=infinity
# cpu time
LimitCPU=infinity
# virtual memory size
LimitAS=infinity
# open files
LimitNOFILE=64000
# processes/threads
LimitNPROC=64000
# locked memory
LimitMEMLOCK=infinity
# total threads (user+kernel)
TasksMax=infinity
TasksAccounting=false

# Recommended limits for mongod as specified in
# https://docs.mongodb.com/manual/reference/ulimit/#recommended-ulimit-settings

[Install]
WantedBy=multi-user.target

Note that ExecStart runs the mongos binary instead of mongod: ExecStart=/usr/bin/mongos --config /etc/mongod.conf

Enable and start the service:

$ sudo systemctl enable mongos
$ sudo service mongos start
$ sudo service mongos status
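
Beyond the service status, it can be useful to confirm that the router is actually listening on port 27017. The sketch below parses the output of ss -ltn or netstat -ltn from standard input. The is_listening helper is hypothetical, and the sample text in the example is made-up output used only to demonstrate the parsing.

```shell
#!/bin/bash
# Read `ss -ltn` or `netstat -ltn` output on stdin and succeed if some
# socket is listening on the given TCP port.
is_listening() {
  awk -v p="$1" '
    # ss prints LISTEN in the first column, netstat in the last one;
    # both print the local address in the fourth column.
    $1 == "LISTEN" || $NF == "LISTEN" {
      n = split($4, parts, ":")
      if (parts[n] == p) found = 1
    }
    END { exit(found ? 0 : 1) }
  '
}

# Demonstration with made-up sample output in ss format.
sample='State   Recv-Q  Send-Q  Local Address:Port      Peer Address:Port
LISTEN  0       4096    192.168.101.167:27017   0.0.0.0:*'
if printf '%s\n' "$sample" | is_listening 27017; then
  echo "mongos port is open"
fi
```

On the router node, the real check would be: sudo ss -ltn | is_listening 27017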

 

Shard Replicaset Cluster

Shard replica set with 2 nodes, port 27018.
shardNode01: 192.168.117.119
shardNode02: 192.168.99.245

sudo vim /etc/mongod.conf

storage:
  dbPath: /data/mongodb
  journal:
    enabled: true
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
net:
  port: 27018
  bindIp: 127.0.0.1,192.168.117.119
processManagement:
  timeZoneInfo: /usr/share/zoneinfo
replication:
  replSetName: "ShardReplSet"
sharding:
  clusterRole: shardsvr

Apply the same configuration on shardNode02, replacing only the bindIp address.

Make sure the sharding role and the replica set name are the ones for the shard:
clusterRole: shardsvr and replSetName: ShardReplSet

Give the mongodb user and group ownership of the data, socket and log paths. Do this on each node.

sudo chown -R mongodb:mongodb /data
sudo chown -R mongodb:mongodb /var/lib/mongodb
sudo chown mongodb:mongodb /tmp/mongodb*
sudo chown -R mongodb:mongodb /var/log/mongodb


Start mongod on each shard node and check that it is running:

sudo service mongod start
sudo service mongod status


Initiate the shard replica set on shardNode01 and add shardNode02 as a member:

$ mongo 127.0.0.1:27018
> rs.initiate()
> rs.add("192.168.99.245:27018")   // IP or hostname of shardNode02, with the shard port
> rs.status()

If you register the secondary replica node by hostname, make sure the name is resolvable, either through your DNS server or through the local /etc/hosts file.

 

Register shard replica

Connect to the mongos router (routerNode01) with the mongo client and register the shard replica set:

$ mongo 192.168.101.167:27017
mongos> sh.addShard("ShardReplSet/192.168.117.119:27018,192.168.99.245:27018")
mongos> sh.status()

The shard entry points to the shard replica set ShardReplSet.

 

At this point you can enable sharding for a database and shard a collection on a hashed key:

mongos> use profile
mongos> sh.enableSharding("profile")
mongos> db.createCollection("profiles")
mongos> db.profiles.createIndex({ id: "hashed" })
mongos> sh.shardCollection("profile.profiles", { id: "hashed" })
mongos> db.profiles.getShardDistribution()
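
To repeat the same steps for other collections without typing them interactively, the commands can be generated and piped into the mongo shell. This is a sketch under the assumption that the mongo 4.4 shell is used as elsewhere in this guide. The shard_collection_js function is a hypothetical helper that only prints the script.

```shell
#!/bin/bash
# Print the mongo shell commands that shard one collection on a hashed key.
shard_collection_js() {
  db="$1"; coll="$2"; key="$3"
  cat <<EOF
sh.enableSharding("${db}");
db.getSiblingDB("${db}").createCollection("${coll}");
db.getSiblingDB("${db}").getCollection("${coll}").createIndex({ "${key}": "hashed" });
sh.shardCollection("${db}.${coll}", { "${key}": "hashed" });
db.getSiblingDB("${db}").getCollection("${coll}").getShardDistribution();
EOF
}

# Preview the script for the profile.profiles collection used above.
shard_collection_js profile profiles id
```

To apply it, pipe the output to the router, for example: shard_collection_js profile profiles id | mongo 192.168.101.167:27017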


 

 
