  1. Core Server
  2. SERVER-39006

Read query processing is frequently queued in MongoDB

    Details

    • Operating System:
      ALL

      Description

      We are using MongoDB 3.4.14.
      We run our service on a sharded cluster, and the number of connections often climbs because query processing is delayed.
      We have 15 shards and use only the primary servers. (No queries are sent to the secondaries.)
      The issue starts at some point on a single shard server. (It is not a specific server; it is a different one every time.)
      On that shard server, query processing is delayed and the connection limit is eventually reached.
      This in turn delays query processing and exhausts connections on the other shard servers. (Because of the aggregate mergeCursor, the problem spreads to the other shard servers as well.)

      When we checked, a sudden influx of any particular type of query was not the cause of the problem.
      All the queries that arrived while MongoDB was in trouble were our usual queries.
      Those usual queries slowed down, and then the problem occurred.

       

      >> mongostat (we are using find/insert/update/aggregate queries.)

      insert query update delete getmore command dirty  used flushes vsize   res  qrw   arw net_in net_out conn       set repl                time
         515  1534    961     *0    2282  2779|0  2.4% 80.0%       0  135G 56.2G  0|0  25|2  5.81m   16.4m 4978 shard04  PRI Dec 16 22:05:26.610
         525  1459    755     *0    2130  2532|0  2.3% 80.0%       0  135G 56.2G  0|0  17|2  5.05m   14.1m 4985 shard04  PRI Dec 16 22:05:27.676
         455  1873    550     *0    2478  2965|0  2.2% 80.0%       0  135G 56.2G  0|1  26|1  5.83m   16.2m 4996 shard04  PRI Dec 16 22:05:28.604
         429  1682    936     *0    2320  2619|0  2.0% 80.0%       0  135G 56.2G  0|0  36|3  4.97m   15.3m 5010 shard04  PRI Dec 16 22:05:29.614
         489  1625    545     *0    2328  2723|0  2.0% 80.0%       0  135G 56.2G  0|2  22|0  5.02m   14.7m 5023 shard04  PRI Dec 16 22:05:30.602
         270  1487    826     *0    2333  2770|0  2.0% 80.0%       0  135G 56.2G  9|1  19|1  4.39m   12.8m 5044 shard04  PRI Dec 16 22:05:31.609
         167   661    269     *0     939  1140|0  2.0% 80.0%       0  135G 56.2G  0|0 10|17  1.99m   6.03m 5062 shard04  PRI Dec 16 22:05:34.043
       16223 59846  28133     *0   79397 95484|0  2.1% 80.0%       0  135G 56.2G  0|0  19|7   174m    488m 5078 shard04  PRI Dec 16 22:05:34.087
         721  1301    846     *0    2318  2562|0  2.1% 80.0%       0  135G 56.2G  0|0  15|2  4.41m   13.0m 5081 shard04  PRI Dec 16 22:05:34.590
         540  1758    558     *0    2255  2587|0  2.2% 80.0%       0  135G 56.2G  0|3  17|0  4.80m   15.2m 5098 shard04  PRI Dec 16 22:05:35.631
      insert query update delete getmore command dirty  used flushes vsize   res  qrw   arw net_in net_out conn       set repl                time
         378  1613    726     *0    2304  2757|0  2.3% 80.0%       0  135G 56.2G  0|0  28|2  4.64m   13.6m 5110 shard04  PRI Dec 16 22:05:36.635
         247  1903    608     *0    2408  2842|0  2.3% 80.0%       0  135G 56.2G  0|0  21|2  5.51m   18.8m 5124 shard04  PRI Dec 16 22:05:37.592
         404  1628    731     *0    2321  2834|0  2.3% 80.0%       0  135G 56.2G  0|0  18|0  4.74m   13.8m 5143 shard04  PRI Dec 16 22:05:38.596
         428  1494    987     *0    2292  2748|0  2.4% 80.0%       0  135G 56.2G  0|0  18|2  4.50m   14.0m 5157 shard04  PRI Dec 16 22:05:39.605
         110   577    180     *0     673   764|0  2.4% 80.0%       0  135G 56.2G  0|2  33|0  1.41m   4.60m 5173 shard04  PRI Dec 16 22:05:43.029
        6706 34615  12151     *0   49254 60381|0  2.5% 80.0%       0  135G 56.2G  0|1  30|3  93.2m    274m 5235 shard04  PRI Dec 16 22:05:43.148
         423  1977   1213     *0    2525  3191|0  2.6% 80.0%       0  135G 56.2G  0|0  17|1  5.15m   15.8m 5245 shard04  PRI Dec 16 22:05:43.606
         387  1677    557     *0    2357  2990|0  2.6% 80.0%       0  135G 56.2G  2|2  14|3  4.49m   12.6m 5284 shard04  PRI Dec 16 22:05:44.609
         187  1500    672     *0    2403  2843|0  2.7% 80.0%       0  135G 56.2G  4|0   8|1  4.39m   13.0m 5305 shard04  PRI Dec 16 22:05:45.580
         408  1480    490     *0    2250  2893|0  2.7% 80.0%       0  135G 56.2G  0|0  16|1  4.30m   12.1m 5350 shard04  PRI Dec 16 22:05:46.602
      insert query update delete getmore command dirty  used flushes vsize   res  qrw   arw net_in net_out conn       set repl                time
         231  1603    437     *0    2389  3145|0  2.7% 80.0%       0  135G 56.2G  0|2  12|1  4.66m   13.7m 5394 shard04  PRI Dec 16 22:05:47.599
         196  1482    395     *0    2178  3055|0  2.7% 80.0%       0  135G 56.2G  6|4  31|1  4.32m   12.9m 5457 shard04  PRI Dec 16 22:05:48.659
         232  1275    797     *0    2102  3843|0  2.8% 80.0%       0  137G 56.2G  0|0  63|6  4.23m   10.3m 5607 shard04  PRI Dec 16 22:05:49.689
         247  1514    361     *0    2403  5241|0  2.8% 80.0%       0  139G 56.2G  0|0  12|1  5.07m   14.4m 5786 shard04  PRI Dec 16 22:05:50.574
          66  1459    256     *0    2065  4455|0  2.8% 80.0%       0  140G 56.2G  5|0  17|0  4.71m   11.9m 5978 shard04  PRI Dec 16 22:05:51.608
          80  1057    188     *0    1689  4031|0  2.8% 80.0%       0  142G 56.2G  0|0  83|6  3.90m   8.80m 6270 shard04  PRI Dec 16 22:05:52.933
          69   957    451     *0    1415  4790|0  2.8% 80.0%       0  146G 56.2G 62|0 128|6  4.00m   7.64m 6717 shard04  PRI Dec 16 22:05:54.196
          12  1334     83     *0    1317 10142|0  2.8% 80.0%       0  151G 56.3G 297|664 128|128  5.97m   7.87m 7800 shard04  PRI Dec 16 22:05:55.152
           8  1521     17     *0     929 11458|0  2.9% 80.0%       0  154G 56.1G 2097|343   128|2  5.87m   9.42m 9083 shard04  PRI Dec 16 22:05:55.942
          23  1506    365     *0    1296  7617|0  2.9% 80.1%       0  155G 56.1G    669|0 128|128  5.11m   13.7m 9899 shard04  PRI Dec 16 22:05:56.992
      insert query update delete getmore command dirty  used flushes vsize   res      qrw     arw net_in net_out  conn       set repl                time
          11   837     32     *0    1435  8388|0  2.9% 80.1%       0  156G 56.2G     16|1    46|1  4.00m   9.81m 10847 shard04  PRI Dec 16 22:05:57.896
          *0    49      2     *0      73   735|0  2.9% 80.1%       0  157G 56.2G      0|5    97|0   336k    555k 11799 shard04  PRI Dec 16 22:06:10.530
          70  4143    319     *0    4587 39201|0  2.9% 80.0%       0  171G 56.6G 11777|1965 128|128  26.7m   36.6m 19000 shard04  PRI Dec 16 22:06:12.776
         147  1745     67     *0    1232 22395|0  2.9% 80.0%       0  174G 57.0G  6428|6774 128|128  18.3m   22.9m 18957 shard04  PRI Dec 16 22:06:13.387
          11   367      8     *0     322  4526|0  2.9% 80.0%       0  176G 57.2G  4084|9374 128|128  3.83m   3.41m 19000 shard04  PRI Dec 16 22:06:14.100
           4   621     13     *0     702  7959|0  3.0% 80.0%       0  177G 57.5G    4|12828  73|128  7.02m   7.93m 19000 shard04  PRI Dec 16 22:06:14.712
           2   512      4     *0    1128  6520|0  3.0% 80.0%       0  177G 57.6G   40|12068 128|128  5.59m   8.50m 19000 shard04  PRI Dec 16 22:06:15.851
           1   634      4     *0    1346  7566|0  3.0% 80.0%       0  177G 57.7G    0|11841 128|128  6.77m   8.73m 18995 shard04  PRI Dec 16 22:06:16.816
          *0   720      3     *0    1593  7595|0  3.0% 80.0%       0  177G 57.8G   20|11773 101|128  7.11m   8.81m 19000 shard04  PRI Dec 16 22:06:17.758
          *0   634     *0     *0    1320  5941|0  3.0% 80.0%       0  177G 57.9G   10|11809  90|128  5.72m   7.05m 19000 shard04  PRI Dec 16 22:06:18.878
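
      To pinpoint where the stall begins in output like the above, the rows can be scanned for the first sample in which the active readers or writers (arw) reach the concurrency limit. The following is only a minimal sketch: the helper names parse_mongostat and first_saturation are hypothetical, the column layout is assumed to match the header shown above, and the limit of 128 is an assumption based on what I understand to be the default number of WiredTiger read and write tickets in MongoDB 3.4.

```python
# Assumption: 128 is the default WiredTiger read/write ticket count in
# MongoDB 3.4; change this if the deployment overrides the setting.
TICKET_LIMIT = 128

# Column names as they appear in the mongostat header of this report.
HEADER = ("insert query update delete getmore command dirty used flushes "
          "vsize res qrw arw net_in net_out conn set repl time").split()


def parse_mongostat(text):
    """Yield one dict per data row of mongostat output, skipping header rows."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < len(HEADER) or fields[0] == "insert":
            continue  # blank line, truncated line, or repeated header
        # The time column contains spaces ("Dec 16 22:05:26.610"), so rejoin it.
        head = fields[:len(HEADER) - 1]
        time = " ".join(fields[len(HEADER) - 1:])
        row = dict(zip(HEADER, head + [time]))
        # qrw and arw are "readers|writers" pairs.
        for key in ("qrw", "arw"):
            readers, writers = row[key].split("|")
            row[key] = (int(readers), int(writers))
        row["conn"] = int(row["conn"])
        yield row


def first_saturation(rows):
    """Return the first row whose active readers or writers hit the limit."""
    for row in rows:
        if max(row["arw"]) >= TICKET_LIMIT:
            return row
    return None


# Three rows copied from the mongostat output in this report.
SAMPLE = """\
insert query update delete getmore command dirty  used flushes vsize   res  qrw   arw net_in net_out conn       set repl                time
   80  1057    188     *0    1689  4031|0  2.8% 80.0%       0  142G 56.2G  0|0  83|6  3.90m   8.80m 6270 shard04  PRI Dec 16 22:05:52.933
   69   957    451     *0    1415  4790|0  2.8% 80.0%       0  146G 56.2G 62|0 128|6  4.00m   7.64m 6717 shard04  PRI Dec 16 22:05:54.196
   12  1334     83     *0    1317 10142|0  2.8% 80.0%       0  151G 56.3G 297|664 128|128  5.97m   7.87m 7800 shard04  PRI Dec 16 22:05:55.152
"""

if __name__ == "__main__":
    hit = first_saturation(parse_mongostat(SAMPLE))
    if hit is not None:
        print(hit["time"], "qrw:", hit["qrw"], "arw:", hit["arw"], "conn:", hit["conn"])
```

      Run against the full output, a scan like this would show whether the queue (qrw) starts to build in the same sample in which arw first reaches the limit, which may help narrow down the moment the stall begins.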

       

      The server status output is attached.
      Please let me know if you need anything else.

       

        Attachments

        1. diagnostic.data.tar.gz
          141.61 MB
        2. mongod.conf
          1 kB
        3. serverStat.tar.gz
          1.81 MB
