MongoDB is relational, you just need to rewire your brain

Arnav Zek
6 min readAug 1, 2020

If you are new to MongoDB you might have to faced difficulties with relations. Which is due to the reason, MongoDB is documented as non-relational database which is misleading, We will explore why in the article, by directly comparing SQL relational query with MongoDB substitute query

Disclaimer: MySQL is a superset of SQL so below discussion will apply to all super superset of SQL

LEFT JOIN / LEFT OUTER JOIN

First, let’s understand Left Join in SQL

The LEFT JOIN keyword in SQL, returns all records from the left table, and the matched records from the right table, it returns all records from the left table, even if there are no matches in the right table

What is equalization?

Equalization is used to know if two records have a common value

SELECT orders.id, users.first_name, users.surname, products.name as 'product name'FROM ordersINNER JOIN users on orders.user_id = users.id//orders.user_id = users.id is equalization

In SQL, the values of both rows are merged if two rows have the same equalization in MongoDB you provide an object key that will contain the other row

How to do it in MongoDB?

MongoDB has a well-documented alternative for Left Join, $lookup

$lookup is part of the Aggregate method

To understand lookup we need to first understand how aggregate works

Consider these documents part of purchases collection

db.purchases.distinct(‘products’)

Will give all the distinct values of products [‘toothbrush’,’pizza’,’ milk’,’guitar’]

db.purchases,aggregate({$match:{customer:{$in:['mike','tom']}},$group:{name:'$customer',expenditure:{$add:'$total'}},$sort:{ expenditure:- 1}})

$match

It is used to narrow down the data set

customer:{$in:['mike','tom']} means 

It means, only does further aggregation with documents where the customer is mike or tom

$group

It combines multiple documents into one document for reading purpose, keys inside $group define the new structure of the returned document

The above query add all the total values and output it expenditure key in the returned document

$sort

will sort the returned documents in descending order. -1 means descending and 1 means ascending (the default)

Understanding $lookup

db.orders.aggregate([{    $match:{//narrow the dataset of orders},$lookup:
{
from: "inventory",
localField: "field in orders for equilization",
foreignField: "field in inventory for equilization",
as: "inventory_docs"
}
}
])

as: The field name, that will contain the data of inventory corresponding to the order

The output will be something like this

[{
price:'bla bla bla'
customer:'bla bla bla'
date:'bla bla bla'
inventory_docs:{

inventory document corresponding to the order localfield
which matched with foreignField (given field of inventory)
}},
{
price:'bla bla bla'
customer:'bla bla bla'
date:'bla bla bla'
inventory_docs:{

inventory document corresponding to the order localfield
which matched with foreignField (given field of inventory)
}}
]

Official Documentation on Lookup

RIGHT JOIN

Right Join is just the opposite way of writing Left join, where Left table and Right table exchange places

So, obviously Right Join is possible in MongoDB, just have to exchange values between local and external values inside $lookup

INNER JOIN

It returns only common records

How it can be implemented on MongoDB

There is no direct analogy in MongoDB

You may be thinking `what if I just query all the documents from both collections with no limit and see which of them match?`

It will be expensive and slow when the number of Documents grows

Now you might be thinking

How the fuck SQL does it in milliseconds? Well, indexing. but as records will grow, it can take more than 5 seconds

There is hope

To find documents with a common field value in both collections we can use $in

$in:[field values of right table]

$in matches the field with all the array values given to it

So if we have a posts collection and pages liked collection

//posts collection[
{pageTitle:'lorem', postContent:'lorem came before ipsum'},
{pageTitle:'programming', postContent:'react is old'},
{pageTitle:'fashion', postContent:'too complicated'},
]//pagesLiked Collection
[
{pageTitle:lorem, user:'bananaPants',liked:true},
{pageTitle:programming, user:'bananaPants',liked:true},{pageTitle:lorem, user:'bananaShirt',liked:true},
{pageTitle:lorem, user:'bananaJeans',liked:true}
]

To find all the posts from pages you have liked, In SQL we would have done something like this

SELECT posts.content FROM posts INNER JOIN pagesLiked ON posts.pageTitle= pagesLiked.pageTitle WHERE pagesLiked.user'%{$loggedInUser}%' ";

With MongoDB, we will have to make two queries like below and it will be faster than SQL

let pagesLiked = await db.pagesLikes.find({user:loggedInUser})
pagesLiked = pagesLiked.map(item=>{ return item.pageTitle })
let posts = await db.posts.find({pageTitle: {$in:pagesLiked}})

You may ask why doesn’t mongo DB has an abstracted method to encapsulate all of this?

It is a good question, but it has drawbacks, MongoDB believes in the principle of writing code vertically than horizontally which is easier to manage while SQL believes in writing code horizontally which is inconvenient to debug and understand, I don’t like scrolling horizontally

MongoDB is not less powerful than SQL, you need to change your thinking which takes time

I prefer MongoDB for 3 reasons

  • You can scale it horizontally, It means you can add more servers instead of making one server more powerful.

This approach successfully prevents downtime. It is easy to scale, So it can manage huge data and that is in the name. “Mongo” being short for the word humongous which means huge

  • You can save JSON objects and query them
Lets say you have a users collection that has the following document{
userData:{
username:'bananaPants',
password:'hashedData',
fullName:'sir banana pants'
},
signupData:'some data',
banned:false
}

I can search the above document by

db.users.find({userData.username:'bananaPants'})

You can’t do this in SQL

  • You can change schema whenever you want. so it is easier to add new features

Now you know why developers love no-SQL database.

The Dont’s

You will come across the idea of one collection, The premise of One collection is storing data as BSON in one Collection which was supposed to be stored in a different collection. This way you don’t have to care about relations because there is only one collection

Even Firebase (A BAAS service of Google) encourages this.

But Keep in mind it is not always great, you can quickly exceed the max document size and you may be able to extract what you want

As don’t do this

//image this document is part of a collection named mostStuff{
userData:{username:'banana',pass:'hashed'}
goodGuy:False,
articles:{
aTitle:{content:'sun is yellow'},
anotherTitle: {content:'I like emoji'}
}}

As you won’t be able to extract specific article, rather create a different collection for articles, then u can extract article individually as well as with users with $lookup just save username inside for articles

It is encouraged in Firebase because it doesn’t have any serverside join like $lookup you are supposed to do that on the client-side, you first load a document (let’s say docA) to load another document which is referenced by docA we need to load it the same way we loaded docA(well, firebase also doesn’t has the feature of making a field unique for that you will have to use its serverless function)

Leave a clap, It makes me feel less miserable writing, I live in India I don’t make a penny from these articles.

Best Of Luck, I appreciate you are still reading in 2020, You are gonna make it If you can survive this boring article you can battle anything. I believe in you.

Good Bye, for now, see you soon, also here are a few resources that helped me in the research

Resources that helped me

Resources: https://websitesetup.org/sql-cheat-sheet

https://www.mongodb.com/blog/post/joins-and-other-aggregation-enhancements-coming-in-mongodb-3-2-part-1-of-3-introduction

--

--