[SERVER-12859] Text indexing is slow on VM setup Created: 24/Feb/14  Updated: 06/Jun/14  Resolved: 20/May/14

Status: Closed
Project: Core Server
Component/s: Text Search
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Radu Dan Assignee: Asya Kamsky
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File import.py     File mongodb.conf    
Participants:

 Description   

Hey,

First of all, let me start by stating that up until this point I have been a bit of a mongodb fanboy, so I wholeheartedly tried to test this in a way that favors mongo. I also want it on the record that I completely understand that the current text search feature is marked as experimental and should only be used for testing purposes (which is the case for me so far). However, all evidence points to you planning to make this feature non-experimental in 2.6 with the $text query syntax, and I believe that this is a big mistake given the current state of affairs.

I have a project under development that requires full-text search capabilities and would otherwise map extremely well (schema-less collections: check, map/reduce: check) onto the feature set that mongodb currently provides. Because I would ideally like a single-point-of-truth design, I wanted to first test the performance of the full-text search and indexing capabilities. To this end, I downloaded the English Wikipedia data source, wrote an XML SAX parser for it, and started importing non-redirect pages 1000 at a time (schema:

{"title": "string", "text": "wikitext", "category": "integer"}

) into a collection that was configured with ensureIndex(

{"title": "text", "text": "text", "category": 1}

) and no other indices defined. I also timed each mongo API call and calculated the average number of rows (documents) inserted per second. Here are the results for the first 9 1000-row batches:

[root@dev1 var]# python import.py
Total: 1000. Speed: 56.00r/s
Total: 2000. Speed: 32.50r/s
Total: 3000. Speed: 13.74r/s
Total: 4000. Speed: 7.76r/s
Total: 5000. Speed: 4.69r/s
Total: 6000. Speed: 4.52r/s
Total: 7000. Speed: 3.37r/s
Total: 8000. Speed: 2.51r/s
Total: 9000. Speed: 2.73r/s
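
For reference, the setup above corresponds roughly to the following pymongo calls (a minimal sketch; the attached import.py is the authoritative version):

import time
from pymongo import MongoClient

db = MongoClient()["text"]

# Compound text index over title + text plus the plain integer category field,
# mirroring the ensureIndex() call above; no other secondary indexes are created.
db.wiki.ensure_index([("title", "text"), ("text", "text"), ("category", 1)])

def insert_batch(rows):
    # Time only the bulk insert of one 1000-document batch and return docs/second.
    start = time.time()
    db.wiki.insert(rows)
    return len(rows) / (time.time() - start)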

At this point, I killed the script because I lost my patience. My first impression was that something else must be going on. I had been using the unstable build from the 10gen RPM repository, so I immediately switched to the stable version. In addition, I moved the virtual disk image file (I did all of this in a virtual machine) from a raid1 setup to a "dumb" disk, preallocated it (it was previously a "smart" virtual drive that only expanded as needed by the guest OS), and defragmented the host drive. While this did give me roughly a 50% boost, I still saw the same steady drop in throughput as the dataset size grew.

So I installed elasticsearch with a default config. I set up the index to use snowball analysis because, based on your supported language list, I figured that's what you're using. Quite honestly, I was expecting it to outperform mongo, since elasticsearch and lucene were designed for this exact purpose. However, the results are not even close:

Total: 1000. Speed: 405.83r/s
Total: 2000. Speed: 456.65r/s
Total: 3000. Speed: 411.27r/s
Total: 4000. Speed: 462.72r/s
Total: 5000. Speed: 407.93r/s
Total: 6000. Speed: 348.01r/s
Total: 7000. Speed: 286.52r/s
Total: 8000. Speed: 385.70r/s
Total: 9000. Speed: 419.75r/s
Total: 10000. Speed: 307.36r/s
Total: 11000. Speed: 266.52r/s
Total: 12000. Speed: 315.56r/s
Total: 13000. Speed: 466.00r/s
Total: 14000. Speed: 409.62r/s
Total: 15000. Speed: 359.44r/s
Total: 16000. Speed: 352.84r/s
Total: 17000. Speed: 403.31r/s
Total: 18000. Speed: 834.88r/s
Total: 19000. Speed: 1481.40r/s
Total: 20000. Speed: 418.26r/s
Total: 21000. Speed: 427.12r/s
Total: 22000. Speed: 1145.24r/s
Total: 23000. Speed: 370.15r/s
Total: 24000. Speed: 593.52r/s
Total: 25000. Speed: 497.98r/s
Total: 26000. Speed: 400.37r/s
Total: 27000. Speed: 474.05r/s
Total: 28000. Speed: 541.41r/s
Total: 29000. Speed: 611.02r/s
Total: 30000. Speed: 443.91r/s
Total: 31000. Speed: 467.19r/s
Total: 32000. Speed: 558.63r/s
Total: 33000. Speed: 549.81r/s
Total: 34000. Speed: 480.87r/s
Total: 35000. Speed: 612.75r/s
Total: 36000. Speed: 580.19r/s
Total: 37000. Speed: 482.62r/s
Total: 38000. Speed: 697.63r/s
Total: 39000. Speed: 681.52r/s
Total: 40000. Speed: 636.69r/s
...
Total: 1140000. Speed: 1288.25r/s
Total: 1141000. Speed: 1454.45r/s
Total: 1142000. Speed: 2061.57r/s
Total: 1143000. Speed: 1145.54r/s
Total: 1144000. Speed: 1775.82r/s
Total: 1145000. Speed: 1629.78r/s
Total: 1146000. Speed: 2178.34r/s
Total: 1147000. Speed: 2205.68r/s
Total: 1148000. Speed: 1658.26r/s
Total: 1149000. Speed: 2199.42r/s
Total: 1150000. Speed: 1018.77r/s
Total: 1151000. Speed: 1877.80r/s
Total: 1152000. Speed: 2337.72r/s
Total: 1153000. Speed: 1719.36r/s
Total: 1154000. Speed: 1906.16r/s
Total: 1155000. Speed: 970.74r/s
Total: 1156000. Speed: 1828.84r/s
Total: 1157000. Speed: 1958.14r/s
Total: 1158000. Speed: 1809.90r/s
Total: 1159000. Speed: 2240.17r/s
Total: 1160000. Speed: 1851.73r/s
Total: 1161000. Speed: 1731.13r/s
Total: 1162000. Speed: 1454.58r/s
Total: 1163000. Speed: 1933.39r/s
Total: 1164000. Speed: 2136.85r/s
Total: 1165000. Speed: 2001.85r/s
Total: 1166000. Speed: 1818.62r/s
Total: 1167000. Speed: 1341.41r/s
Total: 1168000. Speed: 2339.45r/s
Total: 1169000. Speed: 1619.38r/s
Total: 1170000. Speed: 2391.82r/s
Total: 1171000. Speed: 2239.17r/s
Total: 1172000. Speed: 2244.49r/s
Total: 1173000. Speed: 1848.49r/s
Total: 1174000. Speed: 2324.02r/s
Total: 1175000. Speed: 2042.02r/s
Total: 1176000. Speed: 1800.07r/s
Total: 1177000. Speed: 2091.75r/s
Total: 1178000. Speed: 1648.40r/s
Total: 1179000. Speed: 1245.07r/s
Total: 1180000. Speed: 1931.28r/s
Total: 1181000. Speed: 1752.95r/s
Total: 1182000. Speed: 1945.75r/s
Total: 1183000. Speed: 1856.97r/s
Total: 1184000. Speed: 1667.88r/s
Total: 1185000. Speed: 1361.52r/s
Total: 1186000. Speed: 2033.49r/s
Total: 1187000. Speed: 1682.91r/s
Total: 1188000. Speed: 1210.39r/s
Total: 1189000. Speed: 1491.82r/s
...
Total: 1436000. Speed: 737.77r/s
Total: 1437000. Speed: 1304.33r/s
Total: 1438000. Speed: 894.11r/s
Total: 1439000. Speed: 1344.08r/s
Total: 1440000. Speed: 480.54r/s
Total: 1441000. Speed: 1228.01r/s
Total: 1442000. Speed: 929.55r/s
Total: 1443000. Speed: 1165.60r/s
Total: 1444000. Speed: 1004.54r/s
Total: 1445000. Speed: 770.31r/s
Total: 1446000. Speed: 862.52r/s
Total: 1447000. Speed: 658.08r/s
Total: 1448000. Speed: 1465.76r/s
Total: 1449000. Speed: 1066.47r/s
Total: 1450000. Speed: 1347.94r/s
Total: 1451000. Speed: 956.64r/s
Total: 1452000. Speed: 1082.98r/s
Total: 1453000. Speed: 1002.42r/s
Total: 1454000. Speed: 1395.81r/s
Total: 1455000. Speed: 995.55r/s
Total: 1456000. Speed: 1153.25r/s
Total: 1457000. Speed: 1639.04r/s
Total: 1458000. Speed: 1049.73r/s
Total: 1459000. Speed: 563.95r/s
Total: 1460000. Speed: 1486.23r/s
Total: 1461000. Speed: 1098.64r/s
Total: 1462000. Speed: 1255.93r/s
Total: 1463000. Speed: 1357.39r/s
Total: 1464000. Speed: 929.26r/s
Total: 1465000. Speed: 861.80r/s

As you can see, while there are fluctuations in the throughput (which can be blamed on the dataset: some article batches are obviously larger than others), there is no steady decline like the one I saw with mongo.

As such, it is my opinion that you shouldn't promote the full-text feature out of experimental status until you solve the throughput issue. Anything else would simply be lying to your customers, who expect huMONGOus dataset support (hell, 10k documents could even be considered small by current standards).

If requested, I can provide the VM, dataset and import script I used for testing.

System specs:

Intel quad core i7 2600K @ 3.4GHz (two cores assigned to the VM)
16GB DDR3-SDRAM @ 1.3GHz (2GB assigned to the VM)
1TB WDC @ 7200RPM w/ 64MB buffer (128GB assigned to the VM; host OS runs on a different drive)
Virtualization used: Oracle VirtualBox 4.3.6 running CentOS 6.5



 Comments   
Comment by Asya Kamsky [ 20/May/14 ]

Since we haven't been able to reproduce, I'm closing.

Please feel free to reopen if additional diagnostic information becomes available.

Comment by Radu Dan [ 28/Feb/14 ]

I was curious whether you had the same difference in throughput as I did. Basically, I wanted to determine if there was something wrong with my disk setup.

Here's the throughput for an unindexed import:

[root@dev1 var]# python import.py xml mongo
1000 @ 2917.96d/s
2000 @ 4144.54d/s
3000 @ 4067.50d/s
4000 @ 4730.17d/s
5000 @ 4115.79d/s
6000 @ 4560.88d/s
7000 @ 2134.19d/s
8000 @ 3602.40d/s
9000 @ 4619.11d/s
10000 @ 3783.66d/s

So an easy way to dramatically improve throughput in this case would be to do what elasticsearch does, i.e. delay text index maintenance and perform a bulk index update every couple of seconds. This would of course alter the consistency mechanics provided by MongoDB (a search right after an insert may not yield correct results, even if the search is performed on a replica set primary), but for the purpose of text search one could argue that consistency is not all that important, since it's almost always user-driven and probabilistic results are to be expected. This should of course be configurable by the user in case a particular application does require higher consistency guarantees.
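
A client-side approximation of that idea, as a rough sketch (assumptions: pymongo, that the application can tolerate rebuilding the text index after a bulk load, and that the index name is the one MongoDB generated for the spec above):

from pymongo import MongoClient

db = MongoClient()["text"]

def bulk_load_then_index(batches):
    # Hypothetical workaround: drop the text index so inserts only maintain _id,
    # bulk-load everything, then build the text index once at the end.
    try:
        db.wiki.drop_index("title_text_text_text_category_1")
    except Exception:
        pass  # the index may not exist yet
    for rows in batches:
        db.wiki.insert(rows)
    db.wiki.ensure_index([("title", "text"), ("text", "text"), ("category", 1)])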

Comment by Asya Kamsky [ 27/Feb/14 ]

I'm basically running the fio config that you provided (on my mac I had to change the first pass to posixaio).

Since my built-in SSD isn't really comparable, I am rerunning this test with the /data/db directory on an external USB 3.0 WD MyPassport, which according to the specs I can find has a 5400 RPM drive inside.

Full fio output:

$ cat fioPassport.out
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
bgwriter: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=posixaio, iodepth=32
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
queryA: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap, iodepth=1
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
queryB: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap, iodepth=1
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
bgupdater: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=posixaio, iodepth=16
fio-2.1
Starting 4 threads
bgwriter: Laying out IO file(s) (1 file(s) / 256MB)
queryA: Laying out IO file(s) (1 file(s) / 256MB)
queryB: Laying out IO file(s) (1 file(s) / 256MB)
bgupdater: Laying out IO file(s) (1 file(s) / 32MB)
 
bgwriter: (groupid=0, jobs=1): err= 0: pid=4355: Thu Feb 27 15:02:28 2014
  write: io=262144KB, bw=765231B/s, iops=186, runt=350790msec
    slat (usec): min=0, max=12746, avg= 4.13, stdev=56.18
    clat (usec): min=19, max=2491.1K, avg=80399.99, stdev=74945.41
     lat (usec): min=20, max=2491.1K, avg=80404.11, stdev=74945.16
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[   11], 10.00th=[   18], 20.00th=[   41],
     | 30.00th=[   54], 40.00th=[   64], 50.00th=[   74], 60.00th=[   86],
     | 70.00th=[   96], 80.00th=[  110], 90.00th=[  139], 95.00th=[  169],
     | 99.00th=[  231], 99.50th=[  265], 99.90th=[ 1172], 99.95th=[ 1483],
     | 99.99th=[ 2474]
    bw (KB  /s): min=   39, max= 1595, per=100.00%, avg=763.22, stdev=200.56
    lat (usec) : 20=0.01%, 50=0.02%, 100=0.02%, 250=0.04%, 500=0.05%
    lat (usec) : 750=0.03%, 1000=0.05%
    lat (msec) : 2=0.22%, 4=1.01%, 10=2.96%, 20=7.39%, 50=14.90%
    lat (msec) : 100=46.94%, 250=25.73%, 500=0.48%, 750=0.02%, 1000=0.04%
    lat (msec) : 2000=0.09%, >=2000=0.02%
  cpu          : usr=4.77%, sys=2.54%, ctx=848457, majf=8740, minf=189
  IO depths    : 1=0.1%, 2=0.2%, 4=0.5%, 8=89.5%, 16=9.8%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.8%, 8=0.1%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=65536/d=0, short=r=0/w=0/d=0
queryA: (groupid=0, jobs=1): err= 0: pid=4611: Thu Feb 27 15:02:28 2014
  read : io=262144KB, bw=209900B/s, iops=51, runt=1278873msec
    clat (usec): min=6, max=493720, avg=19500.46, stdev=20769.49
     lat (usec): min=6, max=493720, avg=19500.95, stdev=20769.55
    clat percentiles (usec):
     |  1.00th=[  438],  5.00th=[ 6240], 10.00th=[ 8032], 20.00th=[10432],
     | 30.00th=[12224], 40.00th=[13760], 50.00th=[15040], 60.00th=[16512],
     | 70.00th=[18304], 80.00th=[20352], 90.00th=[23936], 95.00th=[60672],
     | 99.00th=[120320], 99.50th=[138240], 99.90th=[179200], 99.95th=[193536],
     | 99.99th=[268288]
    bw (KB  /s): min=   13, max= 1480, per=49.25%, avg=207.82, stdev=109.63
    lat (usec) : 10=0.01%, 250=0.08%, 500=1.13%, 750=0.13%, 1000=0.01%
    lat (msec) : 2=0.03%, 4=0.68%, 10=15.76%, 20=60.83%, 50=15.43%
    lat (msec) : 100=3.97%, 250=1.94%, 500=0.01%
  cpu          : usr=1.58%, sys=1.41%, ctx=1102045, majf=130905, minf=294
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=65536/w=0/d=0, short=r=0/w=0/d=0
queryB: (groupid=0, jobs=1): err= 0: pid=4867: Thu Feb 27 15:02:28 2014
  read : io=262144KB, bw=209918B/s, iops=51, runt=1278758msec
    clat (usec): min=3, max=536590, avg=19496.26, stdev=20766.77
     lat (usec): min=3, max=536590, avg=19496.80, stdev=20766.77
    clat percentiles (usec):
     |  1.00th=[  434],  5.00th=[ 6240], 10.00th=[ 8032], 20.00th=[10432],
     | 30.00th=[12224], 40.00th=[13760], 50.00th=[15040], 60.00th=[16512],
     | 70.00th=[18304], 80.00th=[20352], 90.00th=[23936], 95.00th=[60672],
     | 99.00th=[119296], 99.50th=[138240], 99.90th=[181248], 99.95th=[199680],
     | 99.99th=[254976]
    bw (KB  /s): min=    7, max= 1095, per=49.23%, avg=207.74, stdev=106.19
    lat (usec) : 4=0.03%, 10=0.19%, 20=0.03%, 50=0.01%, 250=0.01%
    lat (usec) : 500=1.01%, 750=0.12%, 1000=0.01%
    lat (msec) : 2=0.02%, 4=0.62%, 10=15.79%, 20=60.80%, 50=15.41%
    lat (msec) : 100=3.94%, 250=2.00%, 500=0.01%, 750=0.01%
  cpu          : usr=1.58%, sys=1.41%, ctx=1101882, majf=130747, minf=269
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=65536/w=0/d=0, short=r=0/w=0/d=0
bgupdater: (groupid=0, jobs=1): err= 0: pid=5123: Thu Feb 27 15:02:28 2014
  read : io=16416KB, bw=43003B/s, iops=10, runt=390895msec
    slat (usec): min=0, max=3371, avg= 8.74, stdev=76.14
    clat (usec): min=17, max=2420.6K, avg=153365.16, stdev=77842.25
     lat (usec): min=25, max=2420.6K, avg=153373.90, stdev=77842.25
    clat percentiles (msec):
     |  1.00th=[   12],  5.00th=[   60], 10.00th=[   81], 20.00th=[  102],
     | 30.00th=[  120], 40.00th=[  135], 50.00th=[  147], 60.00th=[  163],
     | 70.00th=[  182], 80.00th=[  202], 90.00th=[  229], 95.00th=[  253],
     | 99.00th=[  314], 99.50th=[  347], 99.90th=[  619], 99.95th=[ 1319],
     | 99.99th=[ 2409]
    bw (KB  /s): min=    1, max=  410, per=10.90%, avg=46.01, stdev=76.15
  write: io=16352KB, bw=42836B/s, iops=10, runt=390895msec
    slat (usec): min=0, max=2000, avg= 8.24, stdev=59.39
    clat (usec): min=27, max=2346.2K, avg=85088.57, stdev=85082.76
     lat (usec): min=35, max=2346.2K, avg=85096.81, stdev=85082.50
    clat percentiles (usec):
     |  1.00th=[ 1864],  5.00th=[11200], 10.00th=[22656], 20.00th=[42752],
     | 30.00th=[54016], 40.00th=[63744], 50.00th=[76288], 60.00th=[88576],
     | 70.00th=[100864], 80.00th=[117248], 90.00th=[144384], 95.00th=[173056],
     | 99.00th=[244736], 99.50th=[305152], 99.90th=[1400832], 99.95th=[1728512],
     | 99.99th=[2342912]
    bw (KB  /s): min=    2, max=  497, per=6.87%, avg=48.88, stdev=84.21
    lat (usec) : 20=0.01%, 50=0.04%, 100=0.13%, 250=0.16%, 500=0.02%
    lat (msec) : 2=0.29%, 4=0.71%, 10=1.17%, 20=2.75%, 50=9.64%
    lat (msec) : 100=28.81%, 250=53.08%, 500=2.97%, 750=0.04%, 1000=0.05%
    lat (msec) : 2000=0.11%, >=2000=0.02%
  cpu          : usr=4.33%, sys=2.35%, ctx=864149, majf=9974, minf=260
  IO depths    : 1=34.7%, 2=2.5%, 4=0.6%, 8=29.0%, 16=33.1%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=4104/w=4088/d=0, short=r=0/w=0/d=0
 
Run status group 0 (all jobs):
   READ: io=540704KB, aggrb=422KB/s, minb=41KB/s, maxb=204KB/s, mint=390895msec, maxt=1278873msec
  WRITE: io=278496KB, aggrb=712KB/s, minb=41KB/s, maxb=747KB/s, mint=350790msec, maxt=390895msec

Ingestion of the same wiki xml with text index:

$ python import.py xml mongo
1000 @ 285.68d/s
2000 @ 187.24d/s
3000 @ 163.46d/s
4000 @ 110.76d/s
5000 @ 65.10d/s
6000 @ 112.75d/s
7000 @ 92.36d/s
8000 @ 110.23d/s
9000 @ 78.14d/s
10000 @ 93.86d/s
11000 @ 85.14d/s
12000 @ 84.33d/s
13000 @ 74.15d/s
14000 @ 65.89d/s
15000 @ 91.58d/s
16000 @ 75.86d/s
17000 @ 36.38d/s
18000 @ 32.67d/s

Again, at this point my machine is memory-limited (the indexes grow to be bigger than RAM) and it's all thrashing.
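
One quick way to watch for that (a sketch, assuming pymongo is available): compare totalIndexSize from collStats with the RAM available to mongod.

from pymongo import MongoClient

db = MongoClient()["text"]
# Same numbers as db.wiki.stats(1024*1024) in the shell: sizes reported in MB.
stats = db.command("collstats", "wiki", scale=1024 * 1024)
print "total index size: %d MB" % stats["totalIndexSize"]
# Once this exceeds the memory available to mongod, every index update can
# start faulting pages in and out, which matches the thrashing described above.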

I'm not set up to benchmark ES - and I'm not sure how it would help since we want to understand why your ingest is so slow with MongoDB (and text indexes).

By the way, what's your ingest rate of the same documents without text indexes (just for comparison)?

Comment by Radu Dan [ 27/Feb/14 ]

I can confirm 600 IOPS on AWS. However, the numbers are comparable to what I'm getting from a shared SSD box, so my guess is that they're using high-performance SAS drives, have probably tuned their fsync calls to be no-ops, and instead rely on battery power for consistency.

Even on a 15k drive, the theoretical max random-access IOPS is 2 * 15000 / 60 = 500 (assuming no actuator movement and that on average it takes half a rotation to reach the requested position). Since the fio test runs 4 simultaneous jobs (one random write, two random reads and one random read+write) and the last job has the lowest throughput of them all, you are getting in excess of 2.7k IOPS (4 x ~686) from that drive (maybe higher, since you only gave the results of the read+write job), which is physically impossible for a single rotating drive (however, a good cache could provide much better read performance, and disabling buffer flushing would provide faster writes). The same math roughly checks out on my consumer-grade 7200 RPM drives: max IOPS is 2 * 7200 / 60 = 240; measured is 49 + 61 + 61 + 21 = 192.
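
The back-of-the-envelope bound used above, as a small helper (assumption: an average rotational latency of half a revolution and no seek time):

def max_rotational_iops(rpm):
    # Upper bound on random IOPS for a single spinning disk, assuming each
    # operation costs on average half a rotation and no actuator movement.
    return 2 * rpm / 60.0

print max_rotational_iops(15000)  # 500.0 for a 15k SAS drive
print max_rotational_iops(7200)   # 240.0 for a consumer 7200 RPM drive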

What are your throughput results for elasticsearch on the same machine(s)?

Comment by Asya Kamsky [ 27/Feb/14 ]

I'm afraid not. I did first run a different test on an SSD drive (my laptop) and I got:

  write: io=262144KB, bw=64346KB/s, iops=16086, runt=  4074msec
  read : io=262144KB, bw=31737KB/s, iops=7934, runt=  8260msec
  read : io=262144KB, bw=381578KB/s, iops=95394, runt=   687msec
  write: io=16352KB, bw=4638.9KB/s, iops=1159, runt=  3525msec

I also got a correspondingly much faster insert rate on the text index test - in fact, I went looking for an AWS instance with slow storage because I figured the SSD wasn't in any way comparable to your setup.

Comment by Radu Dan [ 27/Feb/14 ]

Quick follow up:

I just realized that there's no way to get 600 IOPS out of a single mechanical drive (even high-perf SAS ones peak at around 200). I ran the same test on an SSD VM hosted by DigitalOcean and got numbers similar to yours on the last thread (which always seems to be the slowest):

  write: io=262144KB, bw=13264KB/s, iops=3316, runt= 19763msec
  read : io=262144KB, bw=11134KB/s, iops=2783, runt= 23545msec
  read : io=262144KB, bw=10927KB/s, iops=2731, runt= 23990msec
  write: io=16352KB, bw=2167.6KB/s, *iops=541*, runt=  7544msec

Here are the numbers on a 5+ year old laptop (intel dual core @ 2.5GHz, 2GB of DDR2 with a 7200rpm HDD running centos 6.5 natively):

  write: io=262144KB, bw=203343B/s, iops=49, runt=1320109msec
  read : io=262144KB, bw=250412B/s, iops=61, runt=1071972msec
  read : io=262144KB, bw=250234B/s, iops=61, runt=1072734msec
  write: io=16352KB, bw=86981B/s, *iops=21*, runt=192506msec

In addition, this page http://aws.amazon.com/ebs/details/#piops seems to indicate that there are two different kinds of EBS volumes: standard ones and SSD-backed drives. Current testing seems to indicate that you used an SSD for this test, which would explain why you got such high numbers compared to my tests.

Comment by Radu Dan [ 26/Feb/14 ]

I used 1.0

Comment by Asya Kamsky [ 25/Feb/14 ]

The linked page has multiple different versions - which one did you use?

Comment by Radu Dan [ 24/Feb/14 ]

Yes, testing was done on the same machine. I just stopped mongodb, started elasticsearch instead and ran the same test.

ES data files are in /var/lib/elasticsearch, and this is my disk config:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        20G  2.6G   17G  14% /
tmpfs           939M     0  939M   0% /dev/shm
/dev/sda2       107G   53G   49G  52% /var

/dev/sda resides on a 128GB file stored on a rather full but defragmented 1TB NTFS-formatted drive on the host.

Out of curiosity, what numbers are you getting for an elasticsearch default config (http://www.elasticsearch.org/blog/apt-and-yum-repositories/), using this index config:

[root@dev1 var]# curl -XPOST localhost:9200/text -d '{
>     "mappings": {
>         "wiki": {
>             "text": {
>                 "type": "string",
>                 "analyzer": "snowball"
>             },
>             "title": {
>                 "type": "string",
>                  "analyzer": "snowball"
>             },
>             "category": {
>                 "type": "integer"
>             }
>         }
>     }
> }'

Comment by Asya Kamsky [ 24/Feb/14 ]

I went back to 2.4.x, got a slow disk (an EBS volume on Amazon), and reran the test:

1000 @ 265.39d/s
2000 @ 166.77d/s
3000 @ 101.85d/s
4000 @ 105.35d/s
5000 @ 94.52d/s
6000 @ 58.37d/s
7000 @ 35.02d/s
8000 @ 34.92d/s
9000 @ 36.41d/s
10000 @ 63.57d/s
11000 @ 53.98d/s
12000 @ 66.50d/s
13000 @ 29.04d/s
14000 @ 37.35d/s
15000 @ 46.16d/s
16000 @ 30.04d/s
17000 @ 50.20d/s

The flushing to disk was slow - I have lots of messages like this in my log:

Mon Feb 24 21:22:24.667 [DataFileSync] flushing mmaps took 16629ms  for 10 files
Mon Feb 24 21:23:22.943 [DataFileSync] flushing mmaps took 14905ms  for 10 files
Mon Feb 24 21:24:29.107 [DataFileSync] flushing mmaps took 21069ms  for 10 files

And disk IO utilization was pretty much at 100% for the majority of the test:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
xvdi              0.00     0.00    0.00  212.00     0.00 18435.00    86.96    24.68  146.94   1.64  34.80
xvdi              0.00     0.00    0.00  346.00     0.00 28856.00    83.40    96.89  232.25   1.50  52.00
xvdi              0.00     0.00    0.00  872.00     0.00 40918.00    46.92   142.62  170.61   1.15 100.00
xvdi              0.00     0.00    0.00 1004.00     0.00 39136.00    38.98   183.05  181.13   1.00 100.00
xvdi              0.00     0.00    0.00  948.00     0.00 48366.00    51.02   182.71  156.04   1.05 100.00
xvdi              0.00     0.00    0.00  904.00     0.00 43280.00    47.88   272.18  301.71   1.11 100.00
xvdi              0.00     1.00    0.00  945.00     0.00 41247.00    43.65   271.79  306.11   1.06 100.00
xvdi              0.00     0.00    0.00 1204.00     0.00 30224.00    25.10   176.68  150.36   0.83 100.00
xvdi              0.00     0.00    0.00 1050.00     0.00 46048.00    43.86   275.05  257.28   0.95 100.00
xvdi              0.00     0.00    0.00 1277.00     0.00 24224.00    18.97   271.23  212.19   0.78 100.00
xvdi              0.00     0.00    0.00 1240.00     0.00 25656.00    20.69   271.36  214.48   0.81 100.00
xvdi              0.00     0.00    0.00 1276.00     0.00 28392.00    22.25   179.76  161.68   0.78 100.00
xvdi              0.00     0.00    0.00  921.00     0.00 47080.00    51.12   239.32  219.06   1.09 100.00
xvdi              0.00     0.00    0.00  928.00     0.00 53400.00    57.54   202.37  250.28   1.08 100.00
xvdi              0.00     0.00    0.00  325.00     0.00 28107.00    86.48    52.54  209.90   1.48  48.00
xvdi              0.00     0.00    0.00  171.00     0.00 14822.00    86.68    19.67  115.04   1.61  27.60
xvdi              0.00     0.00    0.00  174.00     0.00 15206.00    87.39    18.10  104.02   1.22  21.20
xvdi              0.00     0.00    0.00  169.00     0.00 14774.00    87.42    20.46  121.09   1.49  25.20
xvdi              0.00     0.00    0.00  169.00     0.00 14598.00    86.38    20.64  122.13   1.42  24.00

Now, this disk's FIO numbers are still well over an order of magnitude higher than yours. From the last part of your output:

// yours
  write: io=16352KB, bw=61638B/s, iops=15, runt=271657msec
    slat (usec): min=0, max=1079.8K, avg=457.48, stdev=17202.62
    clat (usec): min=0, max=3757.1K, avg=11023.49, stdev=121262.21
     lat (usec): min=60, max=3757.1K, avg=11504.25, stdev=122582.06
    clat percentiles (usec):
     |  1.00th=[    8],  5.00th=[   64], 10.00th=[  135], 20.00th=[  207],
     | 30.00th=[  225], 40.00th=[  231], 50.00th=[  237], 60.00th=[  247],
     | 70.00th=[  270], 80.00th=[  290], 90.00th=[  346], 95.00th=[  516],
     | 99.00th=[117248], 99.50th=[987136], 99.90th=[1597440], 99.95th=[1892352],
     | 99.99th=[3751936]
    bw (KB  /s): min=    1, max=  450, per=11.12%, avg=113.98, stdev=96.45
    lat (usec) : 2=0.02%, 10=1.76%, 20=0.52%, 50=0.16%, 100=0.33%
    lat (usec) : 250=28.33%, 500=16.31%, 750=0.55%, 1000=0.26%
    lat (msec) : 2=0.49%, 4=1.09%, 10=5.42%, 20=19.81%, 50=17.83%
    lat (msec) : 100=4.94%, 250=0.63%, 500=0.12%, 750=0.10%, 1000=0.16%
    lat (msec) : 2000=0.95%, >=2000=0.21%
  WRITE: io=278496KB, aggrb=1025KB/s, minb=60KB/s, maxb=1301KB/s, mint=201475msec, maxt=271657msec
// 
// EBS
  write: io=16352KB, bw=2746.9KB/s, iops=686, runt=  5953msec
    slat (usec): min=0, max=27, avg= 1.77, stdev= 0.69
    clat (msec): min=5, max=55, avg=12.06, stdev= 5.17
     lat (msec): min=5, max=55, avg=12.06, stdev= 5.17
    clat percentiles (usec):
     |  1.00th=[ 7840],  5.00th=[ 8768], 10.00th=[ 9280], 20.00th=[10048],
     | 30.00th=[10432], 40.00th=[10816], 50.00th=[11200], 60.00th=[11712],
     | 70.00th=[12096], 80.00th=[12736], 90.00th=[13760], 95.00th=[15680],
     | 99.00th=[50944], 99.50th=[52480], 99.90th=[54016], 99.95th=[54528],
     | 99.99th=[55552]
    bw (KB  /s): min= 2288, max= 3000, per=67.47%, avg=2741.82, stdev=203.68
    lat (msec) : 4=0.02%, 10=26.86%, 20=70.91%, 50=1.33%, 100=0.88%
  WRITE: io=278496KB, aggrb=4064KB/s, minb=2746KB/s, maxb=3825KB/s, mint=5953msec, maxt=68521msec

Your mongod data files are in dbpath=/var/lib/mongodb - is this the same location/drive that your ES test was using?

Asya

Comment by Asya Kamsky [ 24/Feb/14 ]

I ran this with same text index:

1000 @ 375.35d/s
2000 @ 258.09d/s
3000 @ 217.65d/s
4000 @ 174.47d/s
5000 @ 161.53d/s
6000 @ 134.32d/s
7000 @ 101.67d/s
8000 @ 132.99d/s
9000 @ 116.45d/s
10000 @ 105.50d/s
11000 @ 110.00d/s
12000 @ 110.08d/s
13000 @ 101.94d/s
14000 @ 92.81d/s
15000 @ 109.14d/s
16000 @ 108.46d/s
17000 @ 120.69d/s
18000 @ 101.88d/s
19000 @ 106.84d/s
20000 @ 99.72d/s
21000 @ 106.74d/s
22000 @ 90.70d/s
23000 @ 98.92d/s
24000 @ 96.32d/s
25000 @ 90.84d/s
26000 @ 98.58d/s
27000 @ 98.20d/s
28000 @ 104.97d/s
29000 @ 93.44d/s
30000 @ 100.07d/s
31000 @ 91.81d/s
32000 @ 103.49d/s
33000 @ 96.61d/s
34000 @ 104.29d/s
35000 @ 72.03d/s
36000 @ 100.91d/s
37000 @ 87.88d/s
38000 @ 93.46d/s
39000 @ 93.50d/s
40000 @ 78.44d/s
41000 @ 85.89d/s
42000 @ 98.92d/s
43000 @ 92.84d/s
44000 @ 94.71d/s
45000 @ 60.56d/s
46000 @ 64.53d/s
47000 @ 69.74d/s
48000 @ 66.10d/s
49000 @ 63.03d/s
50000 @ 57.55d/s

I interrupted it at this point....

The reason mine got very slow at this point was that the index became larger than the RAM the process was getting:

text@local(2.6.0-rc0) > db.wiki.stats(1024*1024)
{
	"ns" : "text.wiki",
	"count" : 51000,
	"size" : 385,
	"avgObjSize" : 7922,
	"storageSize" : 447,
	"numExtents" : 15,
	"nindexes" : 2,
	"lastExtentSize" : 119,
	"paddingFactor" : 1,
	"systemFlags" : 1,
	"userFlags" : 1,
	"totalIndexSize" : 733,
	"indexSizes" : {
		"_id_" : 1,
		"title_text_text_text_category_1" : 731
	},
	"ok" : 1
}

Mongostat showing resident memory used:

insert  query update delete getmore command flushes mapped  vsize    res faults  locked db idx miss %     qr|qw   ar|aw  netIn netOut  conn       time
  1000     *0     *0     *0       0     5|0       1  59.6g   122g   803m    743 text:95.6%          0       0|0     0|0     5m    62k     3   11:57:17
    *0     *0     *0     *0       0     1|0       0  59.6g   122g   850m    574 text:67.4%          0       1|0     0|1    62b    30k     3   11:57:33
  1000     *0     *0     *0       0     2|0       0  59.6g   122g   757m  13760 text:94.9%          0       0|0     0|0     5m    30k     3   11:57:34

Even so, at the slowest ingest rate for me, it was faster than yours was for the first batch...

Your disk is definitely a lot slower - was your ES test using the same device as where mongod files are stored?

Comment by Radu Dan [ 24/Feb/14 ]

Welp, forking didn't fix it at all. I tried with 8 consumers and the results look like this:

(7): 100 @ 119.42d/s
(4): 100 @ 58.02d/s
(5): 100 @ 32.06d/s
(2): 100 @ 22.41d/s
(3): 100 @ 16.62d/s
(0): 100 @ 13.26d/s
(6): 100 @ 10.65d/s
(1): 100 @ 8.96d/s
(7): 200 @ 8.28d/s
(4): 200 @ 7.84d/s
(5): 200 @ 7.88d/s
(2): 200 @ 7.76d/s
(3): 200 @ 7.64d/s
(0): 200 @ 7.65d/s
(6): 200 @ 7.49d/s
(1): 200 @ 7.18d/s
(7): 300 @ 6.01d/s
(5): 300 @ 6.12d/s
(2): 300 @ 5.88d/s
(3): 300 @ 5.20d/s
(0): 300 @ 4.91d/s
(6): 300 @ 4.65d/s
(1): 300 @ 4.20d/s
(4): 300 @ 2.59d/s
(7): 400 @ 3.47d/s
(5): 400 @ 3.09d/s
(2): 400 @ 2.71d/s
(3): 400 @ 2.61d/s
(0): 400 @ 2.54d/s
(6): 400 @ 2.57d/s
(1): 400 @ 2.22d/s
(4): 400 @ 2.09d/s
(7): 500 @ 2.08d/s
(5): 500 @ 2.19d/s
(2): 500 @ 2.35d/s
(3): 500 @ 2.18d/s
(0): 500 @ 2.15d/s
(6): 500 @ 2.12d/s
(1): 500 @ 2.25d/s
(7): 600 @ 2.71d/s
(5): 600 @ 2.37d/s
(2): 600 @ 2.05d/s
(3): 600 @ 2.25d/s
(0): 600 @ 2.27d/s
(6): 600 @ 2.09d/s
(4): 500 @ 1.20d/s
(1): 600 @ 1.99d/s
(7): 700 @ 2.00d/s
(5): 700 @ 2.06d/s
(2): 700 @ 2.21d/s
(3): 700 @ 2.18d/s
(0): 700 @ 1.89d/s
(6): 700 @ 1.89d/s
(4): 600 @ 1.81d/s
(1): 700 @ 1.70d/s
(5): 800 @ 1.83d/s

For this test, I had to reduce the batch size to 100 because the consumers were throwing MemoryErrors (probably to be expected, considering that redis was storing some 10k documents, each of the 8 consumers was attempting to hold 1k, and on top of that there's mongo & the OS). However, the same steady drop in throughput is still there.

I attempted to run mongostat during this process, but no data was ever sent to the tty. Gonna upload my mongo config and import script (it requires an extracted enwiki-pages-articles.xml). Since the virtual machine is some 128GB, uploading it is currently unfeasible. And while I do have a routable IP, it's constantly changing, so providing reliable ssh access to the VM would not be easy. I am, however, willing to try just about anything to help get to the bottom of this, so if you still can't reproduce with the provided data, we'll figure it out somehow.

Comment by Asya Kamsky [ 24/Feb/14 ]

Could you attach the complete log to the ticket in case there is something further down that might hold a clue to this?
If you can attach the Python script as well, that would be helpful as I couldn't reproduce such low numbers in my quick tests either.

Comment by Radu Dan [ 24/Feb/14 ]

My first guess was the drive config. However, how do you explain ES's performance in the same testing environment?

Anyway, here's a FIO config and output:

; Four threads, two query, two writers.
 
[global]
rw=randread
size=256m
directory=/tmp/fio-testing/data
ioengine=libaio
iodepth=4
invalidate=1
direct=1
 
[bgwriter]
rw=randwrite
iodepth=32
 
[queryA]
iodepth=1
ioengine=mmap
direct=0
thinktime=3
 
[queryB]
iodepth=1
ioengine=mmap
direct=0
thinktime=5
 
[bgupdater]
rw=randrw
iodepth=16
thinktime=40
size=32m

[root@dev1 fio]# fio four-threads-randio.fio
bgwriter: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
queryA: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap, iodepth=1
queryB: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap, iodepth=1
bgupdater: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=16
fio-2.1.4
Starting 4 processes
bgwriter: Laying out IO file(s) (1 file(s) / 256MB)
queryA: Laying out IO file(s) (1 file(s) / 256MB)
queryB: Laying out IO file(s) (1 file(s) / 256MB)
bgupdater: Laying out IO file(s) (1 file(s) / 32MB)
Jobs: 2 (f=2): [_rr_] [99.1% done] [3248KB/0KB/0KB /s] [812/0/0 iops] [eta 00m:09s]
bgwriter: (groupid=0, jobs=1): err= 0: pid=4587: Mon Feb 24 11:04:12 2014
  write: io=262144KB, bw=1301.2KB/s, iops=325, runt=201475msec
    slat (usec): min=0, max=604332, avg=52.93, stdev=3186.51
    clat (usec): min=154, max=8346.9K, avg=98286.18, stdev=490616.48
     lat (usec): min=186, max=8346.9K, avg=98350.92, stdev=490659.54
    clat percentiles (usec):
     |  1.00th=[  924],  5.00th=[ 1432], 10.00th=[ 1624], 20.00th=[ 1864],
     | 30.00th=[ 2224], 40.00th=[ 2864], 50.00th=[ 3632], 60.00th=[ 3984],
     | 70.00th=[ 4192], 80.00th=[ 4512], 90.00th=[ 6752], 95.00th=[14144],
     | 99.00th=[2572288], 99.50th=[3620864], 99.90th=[4685824], 99.95th=[4751360],
     | 99.99th=[8355840]
    bw (KB  /s): min=    3, max= 4763, per=100.00%, avg=1328.79, stdev=992.95
    lat (usec) : 250=0.01%, 500=0.09%, 750=0.28%, 1000=1.11%
    lat (msec) : 2=22.86%, 4=36.53%, 10=32.90%, 20=1.30%, 50=0.04%
    lat (msec) : 250=0.06%, 500=0.01%, 750=0.09%, 1000=0.65%, 2000=2.46%
    lat (msec) : >=2000=1.63%
  cpu          : usr=0.05%, sys=2.72%, ctx=34518, majf=0, minf=27
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=65536/d=0, short=r=0/w=0/d=0
queryA: (groupid=0, jobs=1): err= 0: pid=4588: Mon Feb 24 11:04:12 2014
  read : io=262144KB, bw=270867B/s, iops=66, runt=991021msec
    clat (usec): min=112, max=3725.7K, avg=15074.10, stdev=69829.13
     lat (usec): min=116, max=3725.7K, avg=15081.80, stdev=69829.18
    clat percentiles (usec):
     |  1.00th=[  434],  5.00th=[ 3056], 10.00th=[ 5792], 20.00th=[ 8096],
     | 30.00th=[ 9664], 40.00th=[10944], 50.00th=[12096], 60.00th=[13248],
     | 70.00th=[14528], 80.00th=[16192], 90.00th=[18304], 95.00th=[20864],
     | 99.00th=[42752], 99.50th=[70144], 99.90th=[1335296], 99.95th=[1728512],
     | 99.99th=[3227648]
    bw (KB  /s): min=    1, max= 1831, per=54.96%, avg=299.56, stdev=135.20
    lat (usec) : 250=0.07%, 500=1.48%, 750=2.00%, 1000=0.59%
    lat (msec) : 2=0.29%, 4=1.65%, 10=26.61%, 20=61.24%, 50=5.21%
    lat (msec) : 100=0.55%, 250=0.14%, 500=0.03%, 750=0.01%, 1000=0.02%
    lat (msec) : 2000=0.09%, >=2000=0.03%
  cpu          : usr=0.01%, sys=1.46%, ctx=66903, majf=65536, minf=32
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=65536/w=0/d=0, short=r=0/w=0/d=0
queryB: (groupid=0, jobs=1): err= 0: pid=4589: Mon Feb 24 11:04:12 2014
  read : io=262144KB, bw=270838B/s, iops=66, runt=991128msec
    clat (usec): min=81, max=3718.6K, avg=15076.08, stdev=69697.60
     lat (usec): min=84, max=3718.6K, avg=15083.92, stdev=69697.65
    clat percentiles (usec):
     |  1.00th=[  398],  5.00th=[ 2672], 10.00th=[ 5792], 20.00th=[ 8096],
     | 30.00th=[ 9664], 40.00th=[10944], 50.00th=[12096], 60.00th=[13248],
     | 70.00th=[14528], 80.00th=[16192], 90.00th=[18304], 95.00th=[20864],
     | 99.00th=[42752], 99.50th=[69120], 99.90th=[1351680], 99.95th=[1728512],
     | 99.99th=[3227648]
    bw (KB  /s): min=    1, max= 2111, per=54.92%, avg=299.29, stdev=136.53
    lat (usec) : 100=0.01%, 250=0.07%, 500=1.70%, 750=1.99%, 1000=0.59%
    lat (msec) : 2=0.29%, 4=1.62%, 10=26.27%, 20=61.29%, 50=5.30%
    lat (msec) : 100=0.56%, 250=0.15%, 500=0.02%, 750=0.01%, 1000=0.02%
    lat (msec) : 2000=0.09%, >=2000=0.03%
  cpu          : usr=0.01%, sys=1.46%, ctx=67098, majf=65536, minf=32
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=65536/w=0/d=0, short=r=0/w=0/d=0
bgupdater: (groupid=0, jobs=1): err= 0: pid=4590: Mon Feb 24 11:04:12 2014
  read : io=16416KB, bw=61879B/s, iops=15, runt=271657msec
    slat (usec): min=0, max=406392, avg=265.37, stdev=7178.64
    clat (usec): min=8, max=3573.2K, avg=59063.69, stdev=255470.76
     lat (usec): min=467, max=3573.2K, avg=59342.49, stdev=255706.46
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    6], 10.00th=[    9], 20.00th=[   13],
     | 30.00th=[   16], 40.00th=[   18], 50.00th=[   20], 60.00th=[   22],
     | 70.00th=[   25], 80.00th=[   29], 90.00th=[   56], 95.00th=[   77],
     | 99.00th=[ 1631], 99.50th=[ 1762], 99.90th=[ 3195], 99.95th=[ 3392],
     | 99.99th=[ 3589]
    bw (KB  /s): min=    1, max=  353, per=19.80%, avg=107.93, stdev=86.20
  write: io=16352KB, bw=61638B/s, iops=15, runt=271657msec
    slat (usec): min=0, max=1079.8K, avg=457.48, stdev=17202.62
    clat (usec): min=0, max=3757.1K, avg=11023.49, stdev=121262.21
     lat (usec): min=60, max=3757.1K, avg=11504.25, stdev=122582.06
    clat percentiles (usec):
     |  1.00th=[    8],  5.00th=[   64], 10.00th=[  135], 20.00th=[  207],
     | 30.00th=[  225], 40.00th=[  231], 50.00th=[  237], 60.00th=[  247],
     | 70.00th=[  270], 80.00th=[  290], 90.00th=[  346], 95.00th=[  516],
     | 99.00th=[117248], 99.50th=[987136], 99.90th=[1597440], 99.95th=[1892352],
     | 99.99th=[3751936]
    bw (KB  /s): min=    1, max=  450, per=11.12%, avg=113.98, stdev=96.45
    lat (usec) : 2=0.02%, 10=1.76%, 20=0.52%, 50=0.16%, 100=0.33%
    lat (usec) : 250=28.33%, 500=16.31%, 750=0.55%, 1000=0.26%
    lat (msec) : 2=0.49%, 4=1.09%, 10=5.42%, 20=19.81%, 50=17.83%
    lat (msec) : 100=4.94%, 250=0.63%, 500=0.12%, 750=0.10%, 1000=0.16%
    lat (msec) : 2000=0.95%, >=2000=0.21%
  cpu          : usr=0.01%, sys=0.89%, ctx=7959, majf=0, minf=27
  IO depths    : 1=99.8%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=4104/w=4088/d=0, short=r=0/w=0/d=0
 
Run status group 0 (all jobs):
   READ: io=540704KB, aggrb=545KB/s, minb=60KB/s, maxb=264KB/s, mint=271657msec, maxt=991128msec
  WRITE: io=278496KB, aggrb=1025KB/s, minb=60KB/s, maxb=1301KB/s, mint=201475msec, maxt=271657msec
 
Disk stats (read/write):
  sda: ios=135457/70187, merge=10/5468, ticks=2217146/6510616, in_queue=8725095, util=100.00%
 

As for the --noprealloc option, I didn't explicitly set it, and the following log seems to indicate that it isn't in effect:

Mon Feb 24 09:44:55.396 [initandlisten] MongoDB starting : pid=4178 port=27017 dbpath=/var/lib/mongodb 64-bit host=dev1
Mon Feb 24 09:44:55.396 [initandlisten] db version v2.4.6
Mon Feb 24 09:44:55.396 [initandlisten] git version: nogitversion
Mon Feb 24 09:44:55.396 [initandlisten] build info: Linux buildvm-09.phx2.fedoraproject.org 3.10.9-200.fc19.x86_64 #1 SMP Wed Aug 21 19:27:58 UTC 2013 x86_64 BOOST_LIB_VERSION=1_41
Mon Feb 24 09:44:55.396 [initandlisten] allocator: tcmalloc
Mon Feb 24 09:44:55.396 [initandlisten] options: { bind_ip: "127.0.0.1", command: [ "run" ], config: "/etc/mongodb.conf", dbpath: "/var/lib/mongodb", fork: "true", journal: "true", logpath: "/var/log/mongodb/mongodb.log", nohttpinterface: "true", pidfilepath: "/var/run/mongodb/mongodb.pid", port: 27017, quiet: true, setParameter: [ "textSearchEnabled=true" ] }
Mon Feb 24 09:44:55.434 [initandlisten] journal dir=/var/lib/mongodb/journal
Mon Feb 24 09:44:55.434 [initandlisten] recover : no journal files present, no recovery needed
Mon Feb 24 09:44:55.457 [initandlisten] waiting for connections on port 27017
Mon Feb 24 09:45:17.346 [FileAllocator] allocating new datafile /var/lib/mongodb/text.ns, filling with zeroes...
Mon Feb 24 09:45:17.346 [FileAllocator] creating directory /var/lib/mongodb/_tmp
Mon Feb 24 09:45:17.361 [FileAllocator] done allocating datafile /var/lib/mongodb/text.ns, size: 16MB,  took 0.001 secs
Mon Feb 24 09:45:17.361 [FileAllocator] allocating new datafile /var/lib/mongodb/text.0, filling with zeroes...
Mon Feb 24 09:45:17.364 [FileAllocator] done allocating datafile /var/lib/mongodb/text.0, size: 64MB,  took 0.002 secs
Mon Feb 24 09:45:17.364 [FileAllocator] allocating new datafile /var/lib/mongodb/text.1, filling with zeroes...
Mon Feb 24 09:45:17.365 [conn1] build index text.wiki { _id: 1 }
Mon Feb 24 09:45:17.366 [conn1] build index done.  scanned 0 total records. 0 secs
Mon Feb 24 09:45:17.366 [conn1] info: creating collection text.wiki on add index
Mon Feb 24 09:45:17.366 [conn1] build index text.wiki { _fts: "text", _ftsx: 1, category: 1.0 }
Mon Feb 24 09:45:17.368 [conn1] build index done.  scanned 0 total records. 0.002 secs
Mon Feb 24 09:45:17.388 [FileAllocator] done allocating datafile /var/lib/mongodb/text.1, size: 128MB,  took 0.019 secs
Mon Feb 24 09:46:02.331 [FileAllocator] allocating new datafile /var/lib/mongodb/text.2, filling with zeroes...
Mon Feb 24 09:46:02.334 [FileAllocator] done allocating datafile /var/lib/mongodb/text.2, size: 256MB,  took 0.002 secs
Mon Feb 24 09:46:13.430 [conn2] insert text.wiki ninserted:1000 keyUpdates:0 locks(micros) w:14942328 14944ms
Mon Feb 24 09:46:23.025 [FileAllocator] allocating new datafile /var/lib/mongodb/text.3, filling with zeroes...
Mon Feb 24 09:46:23.722 [FileAllocator] done allocating datafile /var/lib/mongodb/text.3, size: 512MB,  took 0.696 secs
Mon Feb 24 09:46:41.596 [conn2] insert text.wiki ninserted:1000 keyUpdates:0 locks(micros) w:26483969 26486ms

Comment by Asya Kamsky [ 24/Feb/14 ]

There might be a disk problem... I wonder if mongod might, by any chance, be running with the --noprealloc option?

Regardless, every time a file is allocated the system seems to come to a halt:

160m   712m   159m      2  text:0.0%      0|0     0|0   120b     3k     2   07:48:15
160m   712m   159m      0  text:0.0%      0|0     0|0    62b     2k     2   07:48:16    // 15 seconds gap
288m   968m   280m     13  text:0.0%      0|0     0|1    62b     2k     2   07:48:31
288m   968m   273m   1234 text:92.6%      0|0     0|0    31m     3k     2   07:48:32
288m   968m   268m      0     .:0.1%      0|0     0|0    62b     2k     2   07:48:33     // 28 second gap
544m  1.45g   406m     55 text:52.6%      1|0     0|1    62b     2k     2   07:49:01
544m  1.45g   382m   1268 text:97.3%      0|0     0|0    28m     3k     2   07:49:02
...
544m  1.45g   472m      0  test:0.0%      0|0     0|0    62b     2k     2   07:49:58
544m  1.45g   472m      0  text:0.0%      0|0     0|0    62b     2k     2   07:49:59     // 91 second gap
1.03g  2.45g   651m     20  text:0.0%     1|0     0|1    62b     2k     2   07:51:30

I wasn't able to reproduce your numbers, so I'm wondering if it's a hardware/disk config issue.

Comment by Radu Dan [ 24/Feb/14 ]

Gonna push all inputs into a redis list, then set up 8 consumers in separate python processes to avoid the GIL. Will follow up in an hour or so with the results.
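
A sketch of that consumer setup (assumptions: a redis list named "wiki_docs" holding JSON-encoded documents, the redis-py client, and per-batch timing output in the same format as the follow-up results in this ticket):

import json
import time
from multiprocessing import Process

import redis
from pymongo import MongoClient

def consumer(worker_id, batch_size=1000):
    # Pop documents off the shared redis list and bulk-insert them into mongo,
    # printing per-batch throughput.
    r = redis.StrictRedis()
    wiki = MongoClient()["text"]["wiki"]
    total, batch = 0, []
    while True:
        item = r.lpop("wiki_docs")  # hypothetical queue key
        if item is None:
            break
        batch.append(json.loads(item))
        if len(batch) >= batch_size:
            start = time.time()
            wiki.insert(batch)
            total += len(batch)
            print "(%d): %d @ %.2fd/s" % (worker_id, total, len(batch) / (time.time() - start))
            batch = []
    if batch:
        wiki.insert(batch)

if __name__ == "__main__":
    procs = [Process(target=consumer, args=(i,)) for i in range(8)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()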

Comment by Eliot Horowitz (Inactive) [ 24/Feb/14 ]

For now you can just continue with 2.4.6.

Do you think splitting the input up would be easy?

Comment by Radu Dan [ 24/Feb/14 ]

python and mongo are on the same machine; however, python is idling while mongo is running and vice versa. I could run it from a different VM if you want, but I highly doubt the result would be different.

I'm also having trouble running rc0 with either the 2.4.6 or the 2.5.2 config file (I've deleted /var/log/mongo* and /var/lib/mongo*):

[root@dev1 bin]# pwd
/root/mongodb-linux-x86_64-2.6.0-rc0/bin
 
[root@dev1 bin]# ps aux | grep mongo
root      3773  0.0  0.0 103244   844 pts/0    S+   08:39   0:00 grep mongo
 
[root@dev1 bin]# ./mongod
./mongod --help for help and startup options
2014-02-24T08:39:17.276+0200 [initandlisten] MongoDB starting : pid=3774 port=27017 dbpath=/data/db 64-bit host=dev1
2014-02-24T08:39:17.276+0200 [initandlisten] db version v2.6.0-rc0
2014-02-24T08:39:17.276+0200 [initandlisten] git version: 3779bf419438d2625277cf3c1c7216c497f683a8
2014-02-24T08:39:17.276+0200 [initandlisten] build info: Linux build9.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2014-02-24T08:39:17.276+0200 [initandlisten] allocator: tcmalloc
2014-02-24T08:39:17.276+0200 [initandlisten] options: {}
2014-02-24T08:39:17.276+0200 [initandlisten] exception in initAndListen: 10296
*********************************************************************
 ERROR: dbpath (/data/db) does not exist.
 Create this directory or give existing directory in --dbpath.
 See http://dochub.mongodb.org/core/startingandstoppingmongo
*********************************************************************
, terminating
dbexit: 2014-02-24T08:39:17.276+0200 [initandlisten] shutdown: going to close listening sockets...
2014-02-24T08:39:17.276+0200 [initandlisten] shutdown: going to flush diaglog...
2014-02-24T08:39:17.276+0200 [initandlisten] shutdown: going to close sockets...
2014-02-24T08:39:17.277+0200 [initandlisten] shutdown: waiting for fs preallocator...
2014-02-24T08:39:17.277+0200 [initandlisten] shutdown: lock for final commit...
2014-02-24T08:39:17.277+0200 [initandlisten] shutdown: final commit...
2014-02-24T08:39:17.277+0200 [initandlisten] shutdown: closing all files...
2014-02-24T08:39:17.277+0200 [initandlisten] closeAllFiles() finished
dbexit: really exiting now
 
[root@dev1 bin]# ./mongod --config /etc/mongodb.conf
2014-02-24T08:39:22.720+0200 Attempted to set textSearchEnabled server parameter.
2014-02-24T08:39:22.720+0200 Text search is enabled by default and cannot be disabled.
2014-02-24T08:39:22.720+0200 The following are now deprecated and will be removed in a future release:
2014-02-24T08:39:22.720+0200 - the "textSearchEnabled" server parameter (setting it has no effect)
2014-02-24T08:39:22.720+0200 - the "text" command (has been replaced by the $text query operator)
about to fork child process, waiting until server is ready for connections.
forked process: 3780
ERROR: child process failed, exited with error number 1
 
[root@dev1 bin]# ./mongod --config /etc/mongod.conf
about to fork child process, waiting until server is ready for connections.
forked process: 3786
ERROR: child process failed, exited with error number 1
[root@dev1 bin]#

In addition, there are no files or folders created in /var/log

Comment by Eliot Horowitz (Inactive) [ 24/Feb/14 ]

As an easy test, you may want to try parallelizing.
If you can split the input up and run one python process per file, that would be an easy way to diagnose where the bottleneck is.

Comment by Eliot Horowitz (Inactive) [ 24/Feb/14 ]

can you do:

import pymongo
pymongo.has_c()

You can get 2.6.0-rc0 from http://www.mongodb.org/downloads and just run the binary from the tarball.

Are python and mongo on the same machine or different?

Can you send top output?

Comment by Radu Dan [ 24/Feb/14 ]

I did run top on my first try and mongod was indeed largely idle and the python process was sleeping. Here's the relevant code:

import time

# Excerpt from the importer class; only the insert() of the current batch is timed.
start = time.time()
self.db.wiki.insert(self.rows)  # self.rows holds the current batch of documents
try:
    print "Total: %d. Speed: %2.2fr/s" % (self.total, len(self.rows) / (time.time() - start))
except ZeroDivisionError:
    print "Total: %d. Speed: unknown" % self.total

self.db is a pymongo database bound to "text"

>>> import pymongo
>>> pymongo
<module 'pymongo' from '/usr/lib64/python2.6/site-packages/pymongo/__init__.pyc'>
>>> pymongo.version
'2.6.3'

And yes, I have the C module installed:

>>> import pymongo
>>> pymongo.has_c()
True

Is there an easy way to install 2.6.0 without compiling from source?

Comment by Eliot Horowitz (Inactive) [ 24/Feb/14 ]

Looking at the mongostat above, the database is largely idle.

Can you run top on the boxes where mongod and python are?

Are you sure you installed the python driver with the c extension?

Also, it would be better if you used 2.6.0-rc0 instead of 2.5.2.

Comment by Radu Dan [ 24/Feb/14 ]

Kudos for fast reply

The previously provided results are for 2.5.2 (the current unstable build in the 10gen repo). I got better results with 2.4.6 from the EPEL repo (the first 1000 documents inserted at 76 docs/s), however those were after doing the disk defrag.

Here's the collection setup:

> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "text.wiki", "name" : "_id_" }
{ "v" : 1, "key" : { "_fts" : "text", "_ftsx" : 1, "category" : 1 }, "ns" : "text.wiki", "name" : "title_text_text_text_category_1", "weights" : { "text" : 1, "title" : 1 }, "default_language" : "english", "language_override" : "language", "textIndexVersion" : 1 }

mongostat for 2.4.6:

insert  query update delete getmore command flushes mapped  vsize    res faults  locked db idx miss %     qr|qw   ar|aw  netIn netOut  conn       time
    *0     *0     *0     *0       0     1|0       0   160m   712m   159m      0  text:0.0%          0       0|0     0|0    62b     2k     1   07:48:12
    *0     *0     *0     *0       0     1|0       0   160m   712m   159m      0  text:0.0%          0       0|0     0|0    62b     2k     1   07:48:13
    *0     *0     *0     *0       0     1|0       0   160m   712m   159m      0     .:0.0%          0       0|0     0|0    62b     2k     1   07:48:14
    *0     *0     *0     *0       0     2|0       0   160m   712m   159m      2  text:0.0%          0       0|0     0|0   120b     3k     2   07:48:15
    *0     *0     *0     *0       0     1|0       0   160m   712m   159m      0  text:0.0%          0       0|0     0|0    62b     2k     2   07:48:16
    *0     *0     *0     *0       0     1|0       0   288m   968m   280m     13  text:0.0%          0       0|0     0|1    62b     2k     2   07:48:31
  1000      3     *0     *0       0     2|0       0   288m   968m   273m   1234 text:92.6%          0       0|0     0|0    31m     3k     2   07:48:32
    *0     *0     *0     *0       0     1|0       0   288m   968m   268m      0     .:0.1%          0       0|0     0|0    62b     2k     2   07:48:33
    *0     *0     *0     *0       0     1|0       0   544m  1.45g   406m     55 text:52.6%          0       1|0     0|1    62b     2k     2   07:49:01
  1000     *0     *0     *0       0     2|0       1   544m  1.45g   382m   1268 text:97.3%          0       0|0     0|0    28m     3k     2   07:49:02
insert  query update delete getmore command flushes mapped  vsize    res faults  locked db idx miss %     qr|qw   ar|aw  netIn netOut  conn       time
    *0     *0     *0     *0       0     1|0       0   544m  1.45g   364m      0     .:0.2%          0       0|0     0|0    62b     2k     2   07:49:03
    *0     *0     *0     *0       0     1|0       0   544m  1.45g   364m      0  text:0.0%          0       0|0     0|0    62b     2k     2   07:49:04
    *0     *0     *0     *0       0     1|0       0   544m  1.45g   509m     25 text:70.8%          0       1|0     0|1    62b     2k     2   07:49:56
  1000      3     *0     *0       0     2|0       1   544m  1.45g   472m   1467 text:98.1%          0       0|0     0|0    31m     3k     2   07:49:57
    *0     *0     *0     *0       0     1|0       0   544m  1.45g   472m      0  test:0.0%          0       0|0     0|0    62b     2k     2   07:49:58
    *0     *0     *0     *0       0     1|0       0   544m  1.45g   472m      0  text:0.0%          0       0|0     0|0    62b     2k     2   07:49:59
    *0     *0     *0     *0       0     1|0       0  1.03g  2.45g   651m     20  text:0.0%          0       1|0     0|1    62b     2k     2   07:51:30
  1000      3     *0     *0       0     2|0       1  1.03g  2.45g   603m   1390 text:99.1%          0       0|0     0|0    29m     3k     2   07:51:31
    *0     *0     *0     *0       0     1|0       0  1.03g  2.45g   573m      0     .:0.2%          0       0|0     0|0    62b     2k     2   07:51:32
    *0     *0     *0     *0       0     1|0       0  1.03g  2.45g   573m      0  text:0.0%          0       0|0     0|0    62b     2k     1   07:51:33
insert  query update delete getmore command flushes mapped  vsize    res faults  locked db idx miss %     qr|qw   ar|aw  netIn netOut  conn       time
    *0     *0     *0     *0       0     1|0       0  1.03g  2.45g   573m      0  text:0.0%          0       0|0     0|0    62b     2k     1   07:51:34
    *0     *0     *0     *0       0     1|0       0  1.03g  2.45g   573m      0  text:0.0%          0       0|0     0|0    62b     2k     1   07:51:35
    *0     *0     *0     *0       0     1|0       0  1.03g  2.45g   573m      0  text:0.0%          0       0|0     0|0    62b     2k     1   07:51:36
    *0     *0     *0     *0       0     1|0       1  1.03g  2.45g   573m      0  text:0.0%          0       0|0     0|0    62b     2k     1   07:51:37

Test results:

Total: 1000. Speed: 70.88r/s
Total: 2000. Speed: 35.89r/s
Total: 3000. Speed: 19.19r/s
Total: 4000. Speed: 11.05r/s

Gonna follow up with results for 2.5.2

Comment by Eliot Horowitz (Inactive) [ 24/Feb/14 ]

Interestingly, we use the Wikipedia data set as well, with very different results.
Can you send mongostat output while inserts are happening?
Which version was this with?
