[SERVER-9982] update_setOnInsert.js not stable in parallel tests
Created: 20/Jun/13  Updated: 11/Jul/16  Resolved: 18/Mar/14

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure, Write Ops
Affects Version/s: 2.4.4, 2.5.0
Fix Version/s: 2.6.0-rc2

Type: Bug
Priority: Major - P3
Reporter: Eric Milkie
Assignee: Amalia Hawkins
Resolution: Done
Votes: 0
Labels: 26qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

http://buildlogs.mongodb.org/V2.4%20Solaris-SmartOS%2064-bit/builds/116/test/parallel/basicPlus.js

Participants:

 Description   

update_setOnInsert.js scans the system.profile collection for a write that it performed. During the 'parallel' test suite, however, other tests running at the same time can flood system.profile and cause the op of interest to roll off the end of the capped collection.
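
The roll-off can be illustrated with a toy model. This is only a sketch, not MongoDB code: `CappedLog` is a hypothetical in-memory stand-in that caps by document count (a real capped collection such as system.profile is bounded by size in bytes), and the names of the other tests are invented for illustration.

```javascript
// Hypothetical stand-in for a capped collection: once full, the oldest entry is dropped.
class CappedLog {
    constructor(maxDocs) {
        this.maxDocs = maxDocs;
        this.docs = [];
    }

    insert(doc) {
        this.docs.push(doc);
        if (this.docs.length > this.maxDocs) {
            this.docs.shift(); // the oldest entry rolls off the end
        }
    }

    count(predicate) {
        return this.docs.filter(predicate).length;
    }
}

const profile = new CappedLog(3);
const isOurs = (doc) => doc.test === "update_setOnInsert";

// The test's own write is recorded first...
profile.insert({ test: "update_setOnInsert", op: "update" });
const foundBefore = profile.count(isOurs);

// ...then concurrently running tests (invented names) fill the log.
for (const other of ["testA", "testB", "testC"]) {
    profile.insert({ test: other, op: "query" });
}
const foundAfter = profile.count(isOurs);
```

In this model the entry is visible right after the write (`foundBefore`) but is gone by the time the test would query for it (`foundAfter`), which matches the failure mode hypothesized here.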



 Comments   
Comment by Daniel Pasette (Inactive) [ 16/Apr/14 ]

This was moved to jstests/libs/parallelTester.js in commit:
174f9bea00a46cb4e301dcc0f4dcbaddd2ec21cb

Comment by Amalia Hawkins [ 18/Mar/14 ]

Commit here: https://github.com/mongodb/mongo/commit/b2944e7c7ed6fc5195e99d7d2ec50f887ac1a9b3

Comment by Daniel Pasette (Inactive) [ 20/Nov/13 ]

Will reopen if it's seen again.

Comment by Eric Milkie [ 03/Oct/13 ]

Profile level should be restored at the end of a test.
The link has no content because we purge old buildlogger data after a while, unfortunately.
This test failed in the parallel suite with a message saying that it couldn't find the entry it was looking for in the system.profile collection. Around the time of the failure, other tests were running and presumably logging to system.profile, so I hypothesized that they pushed the entry out of the capped collection.
A code fix for this issue would be hard to definitively prove with a test, due to the nondeterminism of the parallel test suite.
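
Restoring the profile level at the end of a test could look roughly like the sketch below. It is not the committed fix: `withProfilingLevel` is a hypothetical helper, and `db` is passed in as a parameter so the snippet stands alone. It assumes the shell's `db.getProfilingStatus()` (which reports the current level in its `was` field) and `db.setProfilingLevel()`.

```javascript
// Hypothetical helper: run `body` with profiling set to `level`,
// then restore whatever level was in effect before, even if `body` throws.
function withProfilingLevel(db, level, body) {
    const previous = db.getProfilingStatus().was;
    db.setProfilingLevel(level);
    try {
        return body();
    } finally {
        db.setProfilingLevel(previous);
    }
}
```

A test would wrap its profiled section in the helper, for example `withProfilingLevel(db, 2, function() { /* update, then query system.profile */ });`. This addresses only the leaked profiling level; it does not stop other tests from flooding system.profile while the body runs.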

Comment by Andrew Morrow (Inactive) [ 03/Oct/13 ]

@milkie: The link under 'Steps To Reproduce' doesn't seem to have any content, but I'd like to better understand how to debug failures in the parallel suite. Can you point to another failure and add some information about how you were able to sort out that this test was responsible?

I suspect that the right thing to do here is to blacklist this test from the parallel suite. If 'fastmod' is not relayed to the invoker somehow, and if the only way to check it is via the profile collection, then there really isn't any way to make this test work except in isolation. I don't see that it does anything particularly complex that makes it interesting to run in parallel with other tests.

Also, I think the test is flawed in that it fails to restore the profiling level, if we care about that. There seem to be other tests that suffer from this as well. Please let me know if that is something we should fix. Is it harmful to subsequent testing to leave the profiling level set to 2?
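
The blacklisting idea suggested above can be sketched as a simple filter. This is an assumption about the shape of such a filter, not the actual code in the test runner: `skipInParallel` and `selectParallelTests` are hypothetical names, and every file path other than update_setOnInsert.js is invented.

```javascript
// Hypothetical skip list: tests that depend on shared state such as system.profile.
const skipInParallel = new Set(["update_setOnInsert.js"]);

// Return only the test files that are safe to schedule in the parallel suite.
function selectParallelTests(allTests, skip) {
    return allTests.filter((path) => {
        const name = path.split("/").pop(); // compare by file name only
        return !skip.has(name);
    });
}

const selected = selectParallelTests(
    ["jstests/basic1.js", "jstests/update_setOnInsert.js"],
    skipInParallel
);
```

Skipped tests would still run in the serial suite, where nothing else writes to system.profile at the same time.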

Generated at Thu Feb 08 03:21:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.