[SERVER-3531] map reduce doesn't seem to yield unless it finds matches
Created: 04/Aug/11  Updated: 12/Jul/16  Resolved: 21/Jun/13

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: None
Fix Version/s: 2.4.5, 2.5.1

Type: Bug
Priority: Major - P3
Reporter: Aaron Staple
Assignee: Daniel Pasette (Inactive)
Resolution: Done
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-9905 mapreduce can lock the server (both r... Closed
Related
related to SERVER-9983 Authenticating as internal user shoul... Closed
related to SERVER-8579 Consolidate Mongod Lock/Resource Sche... Closed
is related to SERVER-6818 audit map / reduce yield recovery cases Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

mongod operations that support yielding must periodically yield the db lock voluntarily. The map/reduce implementation does not do so while its cursor iterates over non-matching documents: each such document is skipped with an immediate continue, as the excerpt shows.

From mr.cpp:

                        while ( cursor->ok() ) {
                            if ( ! cursor->currentMatches() ) {
                                cursor->advance();
                                continue;
                            }
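
To make the problem concrete, here is a minimal sketch of a scan loop that counts every visited document toward its yield budget, whether or not the document matches. It is a plain JavaScript model and not the code from the actual fix; scanWithYield, matches, yieldEvery and onYield are all hypothetical names introduced for illustration, and onYield merely stands in for releasing and reacquiring the lock.

```javascript
// Hypothetical model of a scan loop; none of these names come from mr.cpp.
function scanWithYield(docs, matches, yieldEvery, onYield) {
    var matched = [];
    var sinceYield = 0;
    for (var i = 0; i < docs.length; i++) {
        // Count every visited document, matching or not, toward the yield budget.
        sinceYield += 1;
        if (sinceYield >= yieldEvery) {
            onYield();      // stand-in for releasing and reacquiring the lock
            sinceYield = 0;
        }
        if (!matches(docs[i])) {
            continue;       // skipped documents were already counted above
        }
        matched.push(docs[i]);
    }
    return matched;
}
```

With ten documents, none of which match, and yieldEvery set to 3, onYield runs three times even though nothing is emitted. Placing the yield check after the continue instead would give zero yields for the same input, which is the behavior this ticket describes.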



 Comments   
Comment by auto [ 20/Jun/13 ]

Author: Dan Pasette (monkey101) <dan@10gen.com>

Message: SERVER-3531 - yield in m/r when docs do not match
Branch: v2.4
https://github.com/mongodb/mongo/commit/d2b8eab1cc6f05d6d75df2a7fbd5c81917b588d7

Comment by Andy Schwerin [ 20/Jun/13 ]

A patch for SERVER-9983 keeps replica set heartbeats alive in the presence of operations that do not regularly yield their locks.

Comment by Antoine Girbal [ 19/Jun/13 ]

See below for an interesting currentOp: a long MR job that doesn't yield (opid 1726) blocks a normal write (opid 1939), which in turn blocks both the getMore operations on oplog.rs and the authentication requests from heartbeats.

The fix to ensure yielding should prevent long blocking, but the locking doesn't seem right.
Why is MR taking a global read lock ("^" : "r")?

rs0:PRIMARY> db.currentOp()
{
        "inprog" : [
                {
                        "opid" : 1942,
                        "active" : true,
                        "secs_running" : 15,
                        "op" : "query",
                        "ns" : "",
                        "query" : {
                                "authenticate" : 1,
                                "nonce" : "a985c344174b519b",
                                "user" : "__system",
                                "key" : "34cdda5b4eb2614cfb649072bc3e9f74"
                        },
                        "client" : "10.149.7.57:42550",
                        "desc" : "conn45",
                        "threadId" : "0x7f55f25b0700",
                        "connectionId" : 45,
                        "locks" : {
                                "^" : "r"
                        },
                        "waitingForLock" : true,
                        "numYields" : 0,
                        "lockStats" : {
                                "timeLockedMicros" : {
 
                                },
                                "timeAcquiringMicros" : {
 
                                }
                        }
                },
                {
                        "opid" : 1934,
                        "active" : true,
                        "secs_running" : 17,
                        "op" : "getmore",
                        "ns" : "local.oplog.rs",
                        "query" : {
                                "ts" : {
                                        "$gte" : {
                                                "t" : 1371622295,
                                                "i" : 10
                                        }
                                }
                        },
                        "client" : "10.149.7.57:42527",
                        "desc" : "conn22",
                        "threadId" : "0x7f58fe1fa700",
                        "connectionId" : 22,
                        "locks" : {
                                "^" : "r"
                        },
                        "waitingForLock" : true,
                        "numYields" : 0,
                        "lockStats" : {
                                "timeLockedMicros" : {
                                        "r" : NumberLong(91),
                                        "w" : NumberLong(0)
                                },
                                "timeAcquiringMicros" : {
                                        "r" : NumberLong(9),
                                        "w" : NumberLong(0)
                                }
                        }
                },
                {
                        "opid" : 1936,
                        "active" : true,
                        "secs_running" : 17,
                        "op" : "getmore",
                        "ns" : "local.oplog.rs",
                        "query" : {
                                "ts" : {
                                        "$gte" : {
                                                "t" : 1371227968,
                                                "i" : 9
                                        }
                                }
                        },
                        "client" : "10.149.7.57:42510",
                        "desc" : "conn4",
                        "threadId" : "0x7f58fdcf3700",
                        "connectionId" : 4,
                        "locks" : {
                                "^" : "r"
                        },
                        "waitingForLock" : true,
                        "numYields" : 0,
                        "lockStats" : {
                                "timeLockedMicros" : {
                                        "r" : NumberLong(35),
                                        "w" : NumberLong(0)
                                },
                                "timeAcquiringMicros" : {
                                        "r" : NumberLong(6),
                                        "w" : NumberLong(0)
                                }
                        }
                },
                {
                        "opid" : 1726,
                        "active" : true,
                        "secs_running" : 51,
                        "op" : "query",
                        "ns" : "CCiRecon_I007.iReconTxCollection",
                        "query" : {
                                "$msg" : "query not recording (too large)"
                        },
                        "client" : "127.0.0.1:43853",
                        "desc" : "conn7",
                        "threadId" : "0x7f58fdbf2700",
                        "connectionId" : 7,
                        "locks" : {
                                "^" : "r",
                                "^CCiRecon_I007" : "R"
                        },
                        "waitingForLock" : false,
                        "msg" : "m/r: (1/3) emit phase M/R: (1/3) Emit Progress: 77293/77293 100%",
                        "progress" : {
                                "done" : 77293,
                                "total" : 77293
                        },
                        "numYields" : 824,
                        "lockStats" : {
                                "timeLockedMicros" : {
                                        "r" : NumberLong(70118532),
                                        "w" : NumberLong(2995)
                                },
                                "timeAcquiringMicros" : {
                                        "r" : NumberLong(34996341),
                                        "w" : NumberLong(35)
                                }
                        }
                },
                {
                        "opid" : 1939,
                        "active" : false,
                        "op" : "insert",
                        "ns" : "",
                        "insert" : {
 
                        },
                        "client" : "10.149.7.57:42536",
                        "desc" : "conn31",
                        "threadId" : "0x7f55f881d700",
                        "connectionId" : 31,
                        "locks" : {
                                "^CCiRecon_I007" : "W"
                        },
                        "waitingForLock" : true,
                        "numYields" : 0,
                        "lockStats" : {
                                "timeLockedMicros" : {
 
                                },
                                "timeAcquiringMicros" : {
 
                                }
                        }
                },
                {
                        "opid" : 1945,
                        "active" : true,
                        "secs_running" : 3,
                        "op" : "query",
                        "ns" : "",
                        "query" : {
                                "authenticate" : 1,
                                "nonce" : "89051e317293e45",
                                "user" : "__system",
                                "key" : "ba0fe3a433b8204bb1f3a8ae97c4e910"
                        },
                        "client" : "10.149.7.57:42551",
                        "desc" : "conn46",
                        "threadId" : "0x7f58fe2fb700",
                        "connectionId" : 46,
                        "locks" : {
                                "^" : "r"
                        },
                        "waitingForLock" : true,
                        "numYields" : 0,
                        "lockStats" : {
                                "timeLockedMicros" : {
 
                                },
                                "timeAcquiringMicros" : {
 
                                }
                        }
                }
        ]
}
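
An output like the one above is easier to read when the operations are split into those waiting for a lock and those holding one. The following is a rough sketch of such a helper, assuming only the fields visible in the output (opid, op, locks, waitingForLock, numYields); summarizeLocks is a hypothetical name and not part of the mongo shell.

```javascript
// Hypothetical helper, not a built-in shell function.
// Takes a document shaped like the result of db.currentOp().
function summarizeLocks(currentOp) {
    var waiting = [];
    var holding = [];
    for (var i = 0; i < currentOp.inprog.length; i++) {
        var op = currentOp.inprog[i];
        var entry = {
            opid: op.opid,
            op: op.op,
            locks: op.locks,
            numYields: op.numYields
        };
        if (op.waitingForLock) {
            // Blocked: the "locks" field lists what the operation is asking for.
            waiting.push(entry);
        } else if (op.locks && Object.keys(op.locks).length > 0) {
            // Not waiting and has lock entries: treated here as a lock holder.
            holding.push(entry);
        }
    }
    return { waiting: waiting, holding: holding };
}
```

In the shell this could be called as summarizeLocks(db.currentOp()). Applied to the output above, it would place opid 1726 in holding and the other five operations in waiting.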

Comment by auto [ 13/Jun/13 ]

Author: Dan Pasette (monkey101) <dan@10gen.com>

Message: SERVER-3531 - yield in m/r when docs do not match
Branch: master
https://github.com/mongodb/mongo/commit/71544f212f797826cb46d2702bb802942e3de4b9
