[SERVER-19948] The balancingWindow does not work Created: 14/Aug/15 Updated: 10/Nov/15 Resolved: 10/Nov/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Edision chow | Assignee: | Sam Kleinman (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Operating System: | ALL |
| Steps To Reproduce: | 1. Set the activeWindow: }}, {upsert:true}) 2. Check. 3. Start inserting documents outside the activeWindow time: )} (the shard key is num). 4. Check the chunks and the log. 5. Observe that the migration was still carried out. |
| Participants: | |
| Description |
|
Hi all,
I found that migration and balancing still run outside the activeWindow, so I checked the balancer source code. The code in "mongo/src/mongo/s/grid.cpp" looks like this:
It looks as if the function always returns "true", whether or not the current time is outside the activeWindow. |
| Comments |
| Comment by Randolph Tan [ 10/Nov/15 ] |
|
Thanks for the update. Glad that this has been resolved on your end. |
| Comment by Stefan Wojcik [ 10/Nov/15 ] |
|
Hi Randolph, I realized that one of my mongos instances wasn't restarted after the v3.0.6 -> v3.0.7 upgrade, so it still wasn't respecting the balancing window. It's working fine after a restart. Thank you for your help, and sorry for the false report: this issue is indeed fixed in v3.0.7. |
| Comment by Randolph Tan [ 09/Nov/15 ] |
|
You can try it on one of the mongos, and if you're lucky (chances get lower with more mongos), it will get to do a single balancing round. If not, you might need to do it on more mongos to increase the chances of one of them performing a balancing round. |
| Comment by Stefan Wojcik [ 09/Nov/15 ] |
|
Hi Randolph, can I do it on just a single mongos or do I have to do it for all our machines that run mongos? |
| Comment by Randolph Tan [ 06/Nov/15 ] |
|
Is it possible to crank up the mongos log level to 1 temporarily (you can do this with setParameter)? Once you turn it up, you should be able to see logs like this:
if it is outside the window, or this otherwise:
|
| Comment by Stefan Wojcik [ 06/Nov/15 ] |
|
Yes, all the machines are running NTP and are set to the UTC timezone. |
| Comment by Randolph Tan [ 06/Nov/15 ] |
|
stefan@close.io You should not need to upgrade the mongod. Do the mongos have the same timezone settings as your shards? |
| Comment by Stefan Wojcik [ 27/Oct/15 ] |
|
Following up on this issue. |
| Comment by Stefan Wojcik [ 20/Oct/15 ] |
|
Thank you Randolph! I upgraded my mongos and config servers to v3.0.7, but I still see the issue:
Do I need to upgrade the data mongods too before I see this issue fixed? |
| Comment by Randolph Tan [ 13/Oct/15 ] |
|
stefan@close.io I believe you are experiencing |
| Comment by Stefan Wojcik [ 13/Oct/15 ] |
|
Hi there, I believe this is still an issue.
1) Based on the attached screenshots (all_mongos_servers.png and all_mongos_servers_time.png), you can see that the time is in sync for all the machines running mongos.
2) The balancing window is set to 4am - 11am UTC:
3) And yet, looking at the profile collection on one of our primaries, we can see that some chunks were moved between shards outside of the balancing window (pay attention to the "ts" value):
There are many more examples where a "moveChunk" happened outside of the balancing window's hours. Note that our mongos instances and config servers run MongoDB v3.0.6, but our data servers run mongod v2.6.11 (mongod v3.0.6 performed very poorly for us and we had to downgrade; we have a ticket open about that at jira.mongodb.org/browse/SUPPORT-1448). Could that be an issue here? Let me know if you need anything else from me to debug this issue. |
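One way to audit a list of "ts" values against the 4am - 11am UTC window is sketched below. This is only an illustration: it assumes the timestamps have already been converted to Unix seconds, and the helper names `utcMinuteOfDay` and `countOutsideWindow` are hypothetical, not part of MongoDB.

```cpp
#include <cassert>
#include <vector>

// Minute of the day in UTC for a Unix timestamp given in seconds.
int utcMinuteOfDay(long long epochSeconds) {
    assert(epochSeconds >= 0);
    return static_cast<int>((epochSeconds % 86400) / 60);
}

// Counts the timestamps that fall outside the non-wrapping window
// [startMinute, stopMinute), both given as minutes since midnight UTC.
// Hypothetical helper for illustration only.
int countOutsideWindow(const std::vector<long long>& timestamps,
                       int startMinute, int stopMinute) {
    assert(startMinute <= stopMinute);
    int outside = 0;
    for (long long ts : timestamps) {
        const int minute = utcMinuteOfDay(ts);
        if (minute < startMinute || minute >= stopMinute) {
            ++outside;
        }
    }
    return outside;
}
```

Calling `countOutsideWindow(timestamps, 4 * 60, 11 * 60)` would then give the number of operations that ran outside the configured hours. Working in UTC keeps the check independent of the time zone of the machine running it.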
| Comment by Ramon Fernandez Marina [ 26/Sep/15 ] |
|
eshujiushiwo, we haven't heard back from you for a while so we're going to close this ticket. If this is still an issue for you please provide the additional information requested above. Regards, |
| Comment by Spencer Brody (Inactive) [ 14/Aug/15 ] |
|
What version are you running? Prior to 3.0.1 there was also |
| Comment by Randolph Tan [ 14/Aug/15 ] |
|
Hi, How many mongos processes do you have? Are they all in the same time zone? The active window setting is currently ambiguous: each mongos compares the active window against its own local wall clock, which can be an issue if you have multiple mongos in different time zones. Also see Thanks! |
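The ambiguity described above can be illustrated with a small sketch. It is not MongoDB code: the helpers `toLocalMinuteOfDay` and `inWindow` are hypothetical, time zones are modelled as fixed UTC offsets in minutes, and the wrap-past-midnight handling is an assumption made for the example.

```cpp
#include <cassert>

constexpr int kMinutesPerDay = 24 * 60;

// Hypothetical helper: the local time of day (minutes since midnight) seen
// by a process whose wall clock is utcOffsetMinutes away from UTC.
int toLocalMinuteOfDay(int utcMinuteOfDay, int utcOffsetMinutes) {
    assert(utcMinuteOfDay >= 0 && utcMinuteOfDay < kMinutesPerDay);
    const int local = (utcMinuteOfDay + utcOffsetMinutes) % kMinutesPerDay;
    return local < 0 ? local + kMinutesPerDay : local;
}

// Hypothetical helper: true if t lies in [start, stop). A window whose start
// is later than its stop is assumed to wrap past midnight.
bool inWindow(int t, int start, int stop) {
    if (start <= stop) {
        return t >= start && t < stop;
    }
    return t >= start || t < stop;
}
```

For example, at 12:00 UTC with a 04:00-11:00 window, a process at offset 0 sees 12:00 and is outside the window, while a process at offset -300 (UTC-5) sees 07:00 and considers itself inside it. Both read the same setting, yet they disagree about whether balancing is allowed.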
| Comment by Edision chow [ 14/Aug/15 ] |
|
Well,
"bool Grid::shouldBalance(const SettingsType& balancerSettings) const {
    if (balancerSettings.isBalancerActiveWindowSet()) {
        boost::posix_time::ptime now = boost::posix_time::second_clock::local_time();
        return balancerSettings.inBalancingWindow(now);
    }
    return true;
}"
is ok. |
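To see why the quoted function does not always return true, here is a minimal standalone sketch of the same control flow. The type `BalancerSettingsSketch` and the function `shouldBalanceSketch` are hypothetical stand-ins, not the real `SettingsType` or `Grid` classes, and the window check is simplified to a non-wrapping range of minutes since midnight.

```cpp
#include <cassert>

// Hypothetical stand-in for the balancer settings.
struct BalancerSettingsSketch {
    bool windowSet = false;  // plays the role of isBalancerActiveWindowSet()
    int startMinute = 0;     // inclusive, minutes since midnight
    int stopMinute = 0;      // exclusive, minutes since midnight

    // Simplified: windows that wrap past midnight are not handled here.
    bool inBalancingWindow(int nowMinute) const {
        return nowMinute >= startMinute && nowMinute < stopMinute;
    }
};

// Same shape as the quoted code: when a window is set, the window check
// decides the result; `true` is only the fallback when no window is set.
bool shouldBalanceSketch(const BalancerSettingsSketch& settings, int nowMinute) {
    assert(nowMinute >= 0 && nowMinute < 24 * 60);
    if (settings.windowSet) {
        return settings.inBalancingWindow(nowMinute);
    }
    return true;
}
```

With a window of 04:00-11:00 set, `shouldBalanceSketch` returns false at 12:00 and true at 05:00. The unconditional `return true` is reached only when no window is configured.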