Mani DBA - Learning Mongo DB: January 2017

Wednesday, January 25, 2017

The profiler

The profiler============

db.commandHelp("profile")

Levels
0= OFF
1=selective (slow)
2=ON

Setting profile:

Setting profile for Level 2 =>db.setProfilingLevel(2)

Setting profile for logging transaction >3Msecs => db.setProfilingLevel(1,3)

Checking profile:
db.getProfilingStatus()

show collections => "system.profile" will be in the collection list

List entries in the profile log=>db.system.profile.find().pretty()

Count number of entries in the profile => db.system.profile.find().count()

display last entry from profile => db.system.profile.find({},sort({$natural:-1}.limit(1).pretty()

display last entry from profile with type of operation query or update or??
=> db.system.profile.find({},{op:1}.sort({$natural:-1}.limit(10).pretty()

CHECKING Profile size=>
db.system.namesapces.find() - default 1MB in RAM, it's a circular queue

db.system.profile.stats()

Optimizer plan EXPLAIN

Explain:

***********query planner******************

db.example.explain().find({a:17}).sort({b:-1})

- Stage tells "collscan" or "Idxscan"

- Winning plan tells which plan is choosen by this execution

db.example.explain().remove({a:17,b:12})

- Taking explain plan for removing indexes

*************executionStats*****************

Examing explain plan

exp=db.example.explain("executionStats")
Explainable(test.example)

exp.find({a:17,b:12})

This throw output with query planner section and exectuion stats which gives details how
itration time, number search, records returned, time to execute etc

*************allplansexecution*****************

expall=db.example.explain("allPlansExecution")

exp.find({a:17,b:12})

Examples:

db.sensor_readings.createIndex({active:1,tstamp:1})

db.products.createIndex({for:1})

db.products.find({for:"ac3"})

db.products.explain().find({for:"ac3"})

exp=db.products.explain("executionStats")

exp.find({for:"ac3"})

Tuesday, January 24, 2017

indexes

Indexes:

createIndex -> db.collection.createIndex({a:1})

getIndex-> db.collection.getIndex(

dropIndex

1)default index

2)Unique:true

db.collection.ensureIndex(key#,<options>)}

eg:

db.foo.ensureIndex({empid:1},{unique:true})

3) sparse:true

db.foo.ensureIndex({empid:1},{Special:true})

4)TTL - Time to live
db.eventlog.createIndex("lastModifiedDate":1},{expireAfterSeconds:3600})

5) Geo Spatial and 2d Indexes

db.places.ensureIndex({location:"2dsphere"})

- 2dsphere to mention the "Latitude & Longitude"

db.places.ensureIndex({location:"2d"})

- 2d refers x & y axis

Spatial query

db.places.find({loc:
{$near:{$geometry:
{type:"point", coordinates:[2,2,01]},
spherical:true
}
}
}
)

--$near -> spatial operator
--$geometry:{type:"point", coordinates:[2,2,01]} -> parameters

6) Text indexes

db.sentences.ensureIndex({words:"text"}) - CREATING TEXT INDEX ON COLUMN WORDS

db.sentences.find({$text:{$search:"cat"}})

db.movies.find( { $text : {$search : "Big Lebowski" } } ) - it gives output of strings maches with "Big" or Lebowski" or both.

db.sentences.find({$text:{search:"Trees cat"}),{score:{$meta:"textScore"},_id:0}) - It gives search match ranking, _id:0 is for display subpression

db.sentences.ensureIndex({words:"text"},{default_language:"english"}) - text index language selection

******************NOTE**************************

Index can be created at the background so that read and write on the collection will not be impacted

- db.collection.createIndex({a:1}) => db.collection.createIndex({a:1},{background:true})

-However, background is applicable only for "primary" and secondary still uses foreground option

-Background uses RANDOM I/O, it takes extra time

-It will not return to prompt until background process is completed

Performance - Storage

Performance

Two choices in choosing storage

* MMAPv1 (default storage)

===========================

Classic storage maps datafiles into virutal memory

eg: mongod --storageEngine mmapv1

or

db.serverStatus()

Locking :

locking 2.2 - 2.6
-> Data (before mongo ver 3.0, db level locking and after 3.0 version it supports collection level locking),
-> Meta data
-> Indexes

Mongo supports - Single Write and multi read operation

Journal - Write ahead log (Redo - archive log in Oracle)
- FSSYNC syncs journal to disk

Data on disk is in consistent state by "write ahead" log

Extend allocation method -> Power of 2 allocation

Start with 32bytes -> 32, 64, 128,256,512.....2MB

Documents will not have to move as soon as they grow in size. - This is because you have some space to grow before you reach your record size.
Record spaces are likely to get re-used. - Without standardized sizes, the likelihood that your new document will want the same size of record space as an old document might be slim, especially if your documents have different sizes (which they will if they're all growing). With standard sizes, all documents are likely to find record spaces that fit them.
Documents that grow at a constant rate will move less often as time goes on. - This makes sense, if you think about it. If your documents start out at an average size of 20 bytes, and grow at a rate of 1 byte per day, your first move will happen in 12 days (since the smallest record size is 32 bytes). Your second move will take 32 days, and your third will take 64. Each successive move will take longer and longer.

*Wired Tiger
============
Opensource storage
First pluggable storage engine

Note : It may not be obvious when you first think about it, but indexes are controlled by the storage engine.
For instance, MongoDB uses Btrees. With MongoDB 3.0, WiredTiger will be using B+ trees, with other formats expected to come in later releases.

Features :

1)Document level locking
2)Compression
3)Better performance
4)Fixes shortfalls in MMAPv1 storage engine

mongod --storageEngine wiretiger

it uses 2 caches
a) WT Cache (Half of RAM - Default)
b) FS Cacge

Every 60Secs - Check point flues WT cache -> FS cache -> Disk

Every check point - take data snapshot and at any point in time 2 previous snapshots will be available for data recovery purpose, hence "Journal or write ahead log" is not used.

Document level locking and writes handled with mutiple threads (based on number of CPU cores)

Compression 2 types
A) SNAPPY (Default) - Fast
B) zlib - More compression but perf will be compromised

Monday, January 23, 2017

Mongodb - Admin utilities

(1)Export/Import:
===========
mongoexport -d mydb -c testdata > testdata.json

mongoimport -d mydb -c testdata < testdata.json

(2) mongodump
==========
//mongodump creates "bson" binary formatted dump for all the DBs and it's collections under dump direcotry

(a) command => mongodump

(b) How to convert bsondump to readable json format

bsondump dump/mydb/testdata.bson > testdata.json

(3) Mongostat
===========
// It provides statistics about the runing mongod instances

./mongostat

(4) Mongoperf
===========
Two Modes:
MMF : False - I/O test for NON SAN disks (Default)
----------------------------------------------------------

echo "{nThreads:10, fileSizeMB:100, r:true)" |mongoperf

- Test run with 10 threads, file size 100

MMF : True - I/O Bench marking purpose (MMF -> Memory Mapped File - Caching enabled)
---------------------------------------------------------

RAID/SAN mode testing

echo "{recSizeKB:8,nThreads:4, fileSizeMB:100, r:true,mmf:true)" |mongoperf
echo "{nThreads:10, fileSizeMB:100, r:true)" |mongoperf

(5) mongofiles
===========

This utility is used for moving files across local to and from mongo GRIDFS storage

Eg:
I have a file called test.lp in my local file system @ /tmp/test.lp and I want o copy this to mongo gridfs then use the below syntax

mongofiles --host localhost:27017 -d mydb put /tmp/test.lp

mongofiles --host localhost:27017 -d mydb list

mongofiles --host localhost:27017 delete /tmp/test.lp

(6) mongos
========
using mongos stands for "mongo db shrad"
It is a routing service for mongo db shrad configuration to locate your data in the shrad configuration
uning shrad config we can partition our data across mongodb shrad nodes and mongos will help to identify the data address when you query.

Saturday, January 21, 2017

Admin commands quick over view

admin DB

List databases from admin db

db.runCommand({listDatabases:1})

Remove Vs Drop

Remove - Deleting the records for the collection and making the collection become empty

Drop - Removing the collection itself from the database

currentOp and killOp for session monitoring and killing long running sessions

db.currentOp() -- This is for listing current operation processes/sessions information

db.killOp() -- Using operation id we can kill long running processess

db.currentOp()
--find the opid for the long running session

db.killOp(opid)

db.currentOp()

db.currentOp().inprog.length

Script for identifying long running ops
===========================
db.currentOp().inprog.forEach(
function(op)
{
if(op.secs_running>10)
{
print("slow op in progress? secs:" + op.secs_running+"opid:"+
op.opid);
}
}
)

rs.status() - replica status

db.killOp(<opid>)

Misc commands
============
netstat -n|grep <port>
lsof|grep pid

Creating open transaction:
===================
while (1){db.example.update({},{$inc:{x:1}},false,true);}

Script writting/execution of Mongo commands:

Script writting/execution of Mongo commands:

When we execute an operation through the shell or another interface,
we can use db.runCommand()

db.runCommand({command:value,....p1:v1,p2:v2....})

eg:

db.runCommand({getLastError:1}) ---- Last execution error msg

db.runCommand({isMaster:1}) ----Primary or not

db.runCommand({serverStatus:1}) -- Health of the server

Bulk upload - Ordered & Unordered

Bulk upload
===============

1. Unordered - Faster execution but data will not be executed in the same order as mentioned
2. Ordered - Slower but execution will be in the same order how it's mentioned for execution

Examples

(1) Unordered

var bulk=db.items.initializeUnorderedBulkOp()

bulk.insert({item:"abc12c", defaultqty:100,status:"A",points=100});
db.items.find()

bulk.insert({item:"ijk123",defaultqty:200, status:"A", points:200});
bulk.insert({item:"mop123",defaultqty:0, status:"P", points:0});

bulk.execute()

db.items.find()

(2) Ordered

var Unbulk=db.items.initializeOrderedBulkOp()

Unbulk.insert({item:"abc13c", defaultqty:100,status:"A",points:200});
Unbulk.insert({item:"abc14c", defaultqty:100,status:"A",points:300});

Unbulk.find({item:"abc12c"}).remove()
Unbulk.find({item:"mop123"}).remove()

Unbulk.find({item:"ijk123"}).update({$inc:{points:1}})

Unbulk.execute()

db.items.find()

upsert - update if that value is not present in the collection

sample:

db.pageviews.update({_id:"/sports/foodball"},{$inc:{views:1}},true)

Eg:

Suppose we have documents in the users collection of the form:

{
_id : "Jane",
likes : [ "tennis", "golf" ]
}

How would we, in the mongo shell, add that this user likes "football"? We want to record this even if the user does not yet have a document. We also want to avoid having duplicate items in the "likes" field.

db.users.update({_id: "Jane"}, {$addToSet: {likes: "football"}}, {upsert: true})

multi updates

multi updates
============

db.<coll>.update(<where>,<obj>,<upsert T/F>,<multi T/F>

db.orders.update({order_id:1234},
{$inc:{priority:1}},
false,
true
}

User update may have intended to update multiple documents but it will
exist prematurely due to "multi set to FALSE"

remove/delete

=============

db.collectionname.remove({expr})

db.test.find({_id:100})

db.test.find({_id:100}).count()

db.test.remove({_id:100})

remove all=>db.test.remove({})

db.test.remove({x:/ello/})

Example:

Suppose we have documents in the users collection of the form:

{
 _id : ObjectId("50897dbb9b96971d287202a9"),
 name : "Jane",
 likes : [ "tennis", "golf" ],
 registered : false,
 addr : {
   city : "Lyon",
   country : "France"
 }
}

How would we, in the mongo shell, delete all documents in the collection where city is "Lyon" and registered is false?

Friday, January 20, 2017

Update - partial update and update operators

Partial update operators:

$set - set new value
$push - add new to array
$addToSet - add to an array if it's not present
$pop - remove from array

$unset - remove value

Thursday, January 19, 2017

_id can not be updated

Wednesday, January 18, 2017

Import

mongoimport --drop -d pcat -c products products.json

Cursor

{
var cursor = db.test.find().limit(100);
while(cursor.hasNext())
{
print("x:" + cursor.next().x);
}
}

Help usage

Help usage

db.products.find().help()

A MongoDB sharded cluster

A MongoDB sharded cluster consists of the following components:

shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
config servers: Config servers store metadata and configuration settings for the cluster. As of MongoDB 3.2, config servers can be deployed as a replica set.

Diagram of a sample sharded cluster for production purposes. Contains exactly 3 config servers, 2 or more ``mongos`` query routers, and at least 2 shards. The shards are replica sets.

Diagram of a sample sharded cluster for production purposes. Contains exactly 3 config servers, 2 or more ``mongos`` query routers, and at least 2 shards. The shards are replica sets.

Development Configuration

For testing and development, you can deploy a sharded cluster with a minimum number of components. These non-production clusters have the following components:

A replica set config server with one member.
At least one shard as a single-member replica set.
One mongos instance.

Diagram of a sample sharded cluster for testing/development purposes only. Contains only 1 config server, 1 ``mongos`` router, and at least 1 shard. The shard can be either a replica set or a standalone ``mongod`` instance.

Creating NEW collection TEST and inserting 20k records using FOR LOOP & LIMIT, SKIP & Sort combination OUTPUT

Creating NEW collection TEST and inserting 20k records using FOR LOOP

for(var i=0; i<20000; i++) {db.test.insert({x:i, y:"hi"});}

show collections

test

db.test.count()
20000

SKIP/LIMIT/SORT combination

db.test.find().limit(7)
db.test.find().skip(20).limit(5)
db.test.find().sort({x:-1}.skip(20).limit(5)

var query=db.test.find().sort({x:-1}.skip(20).limit(5)
query

Eg:

Write a query that retrieves documents of type "exam", sorted by score in descending order, skipping the first 50 and showing only the next 20.
db.scores.find({type:"exam"}).sort({score:-1}).skip(50).limit(20)

Select prices in ascending/Decending order where price exist

Select prices in ascending order where price exist

Asending:

db.products.find({price.{$exists.true}},(name:1,price:1}.sort({price:1})

Decending:

db.products.find({price.{$exists.true}},(name:1,price:1}.sort({price:-1})

Order by 2 fields:

db.customer.find().sort({lastname:1, first:1}) order by lastname, firstname

Order by 2 fields:

db.books.find().sort({author:1, date_posted:-1})