Gotcha! Another clickbait article, damn it. Hold it right there. Before you put any more time into reading this article, let’s first make sure we are on the same page. Read the following questions…
Hint: MongoDB’s Oplog cursor plays a key role in capturing real-time changes made to the database.
If these questions make your brain go…
Then I’m here with this article about what a cursor is in MongoDB. So grab a coffee, put Demon Slayer on the TV, and see how cute Nezuko is…
Cough cough… back to the topic.
Note: Pay close attention here as I explain the MongoDB cursor.
The MongoDB documentation says:
A cursor is a pointer to the result set of a query. Clients can iterate through a cursor to retrieve results. By default, cursors time out after 10 minutes of inactivity.
There you go, folks. It’s a pointer to the result set of a query, and that answers your questions. Let’s move on…
Okay! Okay! HOLD UP!! I apologize!
Let me explain the cursor and its use in MongoDB more clearly, because I get it now: you are hungry for some real answers.
In MongoDB, when you perform a query, the results are returned as a cursor. A cursor is a pointer that points to the result set of the query on the server-side. It doesn’t contain the actual data but acts as a reference to the documents that match your query.
The cursor object provides methods to iterate over the documents, fetch batches of documents, or just fetch the next document. When you start iterating over the cursor, MongoDB fetches batches of documents from the server as needed, allowing it to efficiently handle large result sets without loading all the data into memory at once.
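To make that concrete, here is a minimal sketch using the official Node.js MongoDB driver. The connection string, database name, collection name, and filter are placeholders I’m assuming for illustration, not something from this article.

```javascript
// Minimal sketch (Node.js driver). Connection string, database, collection
// name, and filter are assumptions for illustration.
const { MongoClient } = require('mongodb');

async function main() {
  const client = new MongoClient('mongodb://localhost:27017');
  await client.connect();
  const collection = client.db('test').collection('myCollection');

  // find() does not fetch any documents yet; it returns a cursor object.
  const cursor = collection.find({ index: { $gt: 50 } });

  // Iterating the cursor pulls documents from the server in batches.
  for await (const doc of cursor) {
    console.log(doc.index);
  }

  await client.close();
}

main().catch(console.error);
```

Nothing is transferred until you start iterating; the driver then fetches each subsequent batch from the server behind the scenes.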
I hope that gave you a little context about the MongoDB cursor. But I know your brain goes again…
Then I can save you with a step-by-step explanation of how a cursor works:

1. When you call collection.find() through a MongoDB driver, MongoDB returns a cursor object representing the result set that matches the query.
2. You use cursor.forEach(), or other iteration methods like cursor.next() and cursor.toArray(), to process the documents one by one or in batches.
3. When you call cursor.close(), or the cursor is exhausted, it is automatically closed and any resources associated with it are released.

And yes, when you execute a find query using the official MongoDB driver, you can have the cursor converted to an array by calling .find().toArray(). The cursor.toArray() method returns an array containing all the documents produced by the cursor; internally it iterates over all the cursor data using cursor.hasNext().
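Here’s a small mongo-shell sketch of those methods in action. It assumes the sample myCollection data we will create a bit further down, so the collection name and the index field are forward references at this point.

```javascript
// Sketch of the cursor methods above, run in the mongo shell against the
// sample "myCollection" data created later in this article.
var cursor = db.myCollection.find({ index: { $lte: 5 } });

// Fetch documents one at a time.
while (cursor.hasNext()) {
  printjson(cursor.next());
}

// Pull the whole result into an array in one go (fine for small result sets).
var allDocs = db.myCollection.find({ index: { $lte: 5 } }).toArray();

// Or let forEach() drive the iteration for you.
db.myCollection.find({ index: { $lte: 5 } }).forEach(function (doc) {
  print(doc.index);
});

// Cursors close automatically when exhausted, or you can close them yourself.
cursor.close();
```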
Yes, you need to wait for the server to identify the first batch of documents. If you run a query that requires sorting the entire collection, the whole collection must be iterated before you get the first document.
If you run a query that doesn’t require sorting, it may return the first batch before all the matching documents have been visited by the server.
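As a hedged illustration (it assumes the sample myCollection data created just below, plus an index that this article doesn’t otherwise create), you can use explain() to see whether a sorted query needed an in-memory sort:

```javascript
// Without a supporting index, this sort needs an in-memory SORT stage,
// so the matching documents must all be examined before the first one returns.
db.myCollection.find().sort({ index: 1 }).explain("executionStats");

// With an index on "index" (an assumption, not created elsewhere in this
// article), the sort can walk the index and stream the first batch quickly.
db.myCollection.createIndex({ index: 1 });
db.myCollection.find().sort({ index: 1 }).limit(5);
```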
Let’s create sample data. The dataset contains 100 documents, each with an “index” field representing a number from 1 to 100.
var docs = [];
for (let i = 1; i <= 100; i++) {
docs.push({ index: NumberInt(i) })
}
db.myCollection.insertMany(docs)
/**
{
"_id" : ObjectId("5ad24fe286ac9fc7b5c4bbd8"),
"index": 1
},
...
...
**/
When you call the find() method, the MongoDB server just returns a cursor to the client (no documents are transferred at this moment).
- The mongo shell iterates 20 documents at a time; type "it" to print the next 20.
- The initial batch size is 101 documents or about 1 MB of data, whichever comes first.
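If you want to watch the batching behaviour yourself, here is a small shell sketch. batchSize() and objsLeftInBatch() are standard shell cursor helpers; the specific numbers are just for illustration.

```javascript
// Ask the server for batches of 10 documents instead of the defaults.
var cursor = db.myCollection.find().batchSize(10);

// Pulling the first document forces the first batch to be fetched.
cursor.next();

// Shows how many documents of the current batch are still buffered client-side.
print(cursor.objsLeftInBatch()); // 9, given the batch size of 10 above
```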
Large Result Sets: When the query result contains a huge number of documents, loading them all into memory at once might lead to performance issues or even out-of-memory errors. Using a cursor, you can process documents one at a time or in smaller batches, keeping memory consumption under control.
Streaming Data: If you are continuously receiving data from the database (e.g., real-time data streams), using a tailable cursor (optionally with awaitData) allows you to listen for new documents as they are inserted, providing a real-time streaming experience.
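As a sketch of that streaming pattern (the "events" capped collection and the connection details are assumptions, not part of this article’s sample data), a tailable, awaitData cursor in the Node.js driver looks roughly like this:

```javascript
// Hedged sketch of a tailable cursor. Tailable cursors only work on capped
// collections; the collection name and sizes here are assumptions.
const { MongoClient } = require('mongodb');

async function tailEvents() {
  const client = new MongoClient('mongodb://localhost:27017');
  await client.connect();
  const db = client.db('test');

  // Create the capped collection once, if it doesn't exist yet.
  // await db.createCollection('events', { capped: true, size: 1048576 });
  const events = db.collection('events');

  const cursor = events.find({}, { tailable: true, awaitData: true });

  // The cursor stays open and yields new documents as they are inserted.
  for await (const doc of cursor) {
    console.log('new event:', doc);
  }
}

tailEvents().catch(console.error);
```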
So to summarise, for most common use cases with relatively small result sets, using an array is perfectly fine. However, if you anticipate dealing with large datasets or have specific use cases like real-time streaming or complex aggregations, using a cursor can be beneficial for more efficient data processing.