[SERVER-7849] manyclients.js failing since switch to V8
Created: 05/Dec/12  Updated: 11/Jul/16  Resolved: 09/Jan/13

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 2.3.2

Type: Bug
Priority: Critical - P2
Reporter: Ian Whalen (Inactive)
Assignee: Eliot Horowitz (Inactive)
Resolution: Done
Votes: 0
Labels: buildbot
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File manyclients.js    
Issue Links: related to SERVER-7773 "parallel test suite fails on Windows" (Closed)
Operating System: ALL
Participants:

 Description   

http://buildbot.mongodb.org/builders/Nightly%20OS%20X%2010.5%2064-bit/builds/1107
http://buildbot.mongodb.org/builders/Nightly%20Solaris-SmartOS%2064-bit/builds/163



 Comments   
Comment by auto [ 08/Jan/13 ]

Author: Ben Becker <ben.becker@10gen.com>
Date: 2013-01-08T21:31:10Z

Message: SERVER-7849: typo
Branch: master
https://github.com/mongodb/mongo/commit/e03279bb3d581b87b01486fa496267b57bf6b304

Comment by auto [ 08/Jan/13 ]

Author: Eliot Horowitz <eliot@10gen.com>
Date: 2013-01-08T20:56:49Z

Message: SERVER-7849: fix Random.genExp when random returns 0
Branch: master
https://github.com/mongodb/mongo/commit/95a3407fa9a673ac21e966e6f5e751ab35b63311
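To illustrate why a zero draw matters for the genExp fix above: a minimal sketch, assuming the generator uses inverse-transform sampling (-ln(u) * mean). The function below is a hypothetical stand-in, not the code from the commit; the `rand` parameter and the redraw strategy are assumptions made for illustration.

```javascript
// Hypothetical stand-in for an exponential random generator.
// Assumption: the value is computed as -ln(u) * mean for a uniform draw u.
// Math.log(0) is -Infinity, so a draw of exactly 0 would produce an
// infinite result; this sketch redraws until u is non-zero.
function genExp(mean, rand = Math.random) {
    let u = rand();
    while (u === 0) {
        u = rand(); // discard the zero draw and try again
    }
    return -Math.log(u) * mean;
}
```

Redrawing is only one possible guard; clamping the draw to a small positive value would also keep the result finite.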

Comment by Tad Marshall [ 08/Jan/13 ]

There are some print() statements in the test, and I got output from them in only one out of four trials.

I agree that the test is aggressive and I also concluded that testing whether a large number of connections could be completed with none taking over five seconds was not useful.

Comment by auto [ 08/Jan/13 ]

Author: Eliot Horowitz <eliot@10gen.com>
Date: 2013-01-08T05:27:56Z

Message: SERVER-7849: lower number of clients for debug builds and non-linux machines
Branch: master
https://github.com/mongodb/mongo/commit/0547ed8a19c701822bc562cfcecea0688573eccc

Comment by Eliot Horowitz (Inactive) [ 08/Jan/13 ]

Tad, where do you see the logging issue?
I think this test is too aggressive for a lot of machines.
It tries to spawn 400 connections instantly and fails if that takes more than 5 seconds.
I don't think that's reasonable.
Remember, these tests never ran with SpiderMonkey, so I don't think this is a regression at all.
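
The idea of lowering the client count for debug builds and non-Linux machines can be sketched roughly as follows. This is an illustration only, not the code from the commit: the helper name, its parameters, and the reduced count of 50 are assumptions (400 is the figure mentioned in this comment).

```javascript
// Hypothetical helper: decide how many client connections the test spawns.
const FULL_CLIENTS = 400;    // figure mentioned in the discussion
const REDUCED_CLIENTS = 50;  // assumed value, chosen only for illustration

function chooseClientCount(isDebugBuild, platform) {
    // Debug builds and non-Linux machines get the smaller load.
    if (isDebugBuild || platform !== "linux") {
        return REDUCED_CLIENTS;
    }
    return FULL_CLIENTS;
}
```

A test would call something like chooseClientCount(false, "linux") once at startup and use the result as its thread count.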

Comment by Tad Marshall [ 07/Jan/13 ]

I tested on Windows 64-bit debug. The test fails with 200 threads on my machine, but passes with 50, 100 or 150. There are other problems: we are losing log messages (at least any from "print()") and so the logged output is not reliable. Changing a timeout value seemed to allow the creation of more connections than the test is supposed to create (I got 401 simultaneous connections when set to 200). In one trial, the test just "stopped running", with 53 threads and an equal number of open TCP connections, but all were waiting for a response to a "find". We should get the logging solid and see what information we are missing.

Comment by Ben Becker [ 01/Jan/13 ]

OS X nightly needs more file descriptors (though segv is odd):

http://buildlogs.mongodb.org/Nightly%20OS%20X%2010.5%2064-bit%20DEBUG/builds/18/test/recent%20failures/manyclients.js

Linux 32-bit failure is definitely a bug:

http://buildlogs.mongodb.org/Nightly%20Linux%20RHEL%2032-bit/builds/332/test/parallel/manyclients.js

Comment by Eliot Horowitz (Inactive) [ 01/Jan/13 ]

Still happening.

Comment by auto [ 26/Dec/12 ]

Author: Ben Becker <ben.becker@10gen.com>
Date: 2012-12-26T20:57:05Z

Message: SERVER-7849: enter v8 from a new js thread, and always mark _killPending
Branch: master
https://github.com/mongodb/mongo/commit/1e48eeab982d385e75ad9a25b5eecd2c93cc1659

Comment by Ben Becker [ 24/Dec/12 ]

This appears to have exposed an issue with our use of the v8 API in JSThread::operator()(). TLS could be invalid at this point, so I'm not exactly sure how this was expected to work given that we use the same isolate. This may also be why some of the older builders are more likely to encounter this bug.

Comment by Ben Becker [ 21/Dec/12 ]

The SmartOS failure looks like a ulimit -n issue.

Comment by Ian Whalen (Inactive) [ 21/Dec/12 ]

Same test is failing with different error messages on the Nightly Solaris-SmartOS 64-bit builder:

http://buildlogs.mongodb.org/Nightly%20Solaris-SmartOS%2064-bit/builds/179/test/recent%20failures/manyclients.js

Comment by Randall Hunt [ 14/Dec/12 ]

Changed ulimit on all machines.

Comment by Eric Milkie [ 12/Dec/12 ]

There are many limits that ulimit can adjust. I'm not sure that number is the one that we're hitting now, but I haven't really looked into this further yet.

Comment by Ian Whalen (Inactive) [ 12/Dec/12 ]

Randall has said the ulimit has been raised to 2048 across all OS X machines. Does it need to go higher?

Comment by Eric Milkie [ 12/Dec/12 ]

Might just be ulimit problems at this point.

Comment by Ian Whalen (Inactive) [ 12/Dec/12 ]

Reopening because OS X DEBUG is still failing at:

http://buildbot.mongodb.org/builders/OS%20X%2010.5%2064-bit%20DEBUG/builds/1721

Comment by auto [ 10/Dec/12 ]

Author: Ben Becker <ben.becker@10gen.com>
Date: 2012-12-10T20:27:22Z

Message: SERVER-7849: fix JSThread read from offset into string literal
Branch: master
https://github.com/mongodb/mongo/commit/df1fdd985b7ee3789cda646a3c2f7d42b011f65d
