The issue is that the first $sort is on zipcount, and many cities have the same zipcount, so the relative order of those cities is unspecified. That matters because the next operation depends on the order (due to the use of $last).
As a demonstration, here is your data after the first $group, sorted on (state, zipcount), under 2.4 on my machine:
> db.test9371.aggregate( { $group: { _id: { state: "$state", city: "$city" }, zipcount : {$sum : 1}, pop: { $sum: "$pop" } } }, { $sort: { '_id.state':1, zipcount: 1 } }).result.forEach(printjsononeline)
{ "_id" : { "state" : "AL", "city" : "ADGER" }, "zipcount" : 1, "pop" : 3205 }
{ "_id" : { "state" : "AL", "city" : "ACMAR" }, "zipcount" : 1, "pop" : 6055 }
{ "_id" : { "state" : "AL", "city" : "ADAMSVILLE" }, "zipcount" : 1, "pop" : 10616 }
{ "_id" : { "state" : "CA", "city" : "LOS ANGELES" }, "zipcount" : 3, "pop" : 146408 }
{ "_id" : { "state" : "FL", "city" : "BRANFORD" }, "zipcount" : 1, "pop" : 2439 }
{ "_id" : { "state" : "FL", "city" : "BRYCEVILLE" }, "zipcount" : 1, "pop" : 1875 }
{ "_id" : { "state" : "FL", "city" : "CALLAHAN" }, "zipcount" : 1, "pop" : 9111 }
{ "_id" : { "state" : "IL", "city" : "ANTIOCH" }, "zipcount" : 1, "pop" : 18058 }
{ "_id" : { "state" : "IL", "city" : "ARLINGTON HEIGHT" }, "zipcount" : 2, "pop" : 79689 }
{ "_id" : { "state" : "ND", "city" : "AMENIA" }, "zipcount" : 1, "pop" : 321 }
{ "_id" : { "state" : "ND", "city" : "ABSARAKA" }, "zipcount" : 1, "pop" : 124 }
{ "_id" : { "state" : "ND", "city" : "ALICE" }, "zipcount" : 1, "pop" : 255 }
{ "_id" : { "state" : "TX", "city" : "ALLEN" }, "zipcount" : 1, "pop" : 24151 }
{ "_id" : { "state" : "TX", "city" : "CARROLLTON" }, "zipcount" : 2, "pop" : 92495 }
{ "_id" : { "state" : "UT", "city" : "AMERICAN FORK" }, "zipcount" : 1, "pop" : 21864 }
{ "_id" : { "state" : "UT", "city" : "ALTAMONT" }, "zipcount" : 1, "pop" : 146 }
{ "_id" : { "state" : "UT", "city" : "ALTONAH" }, "zipcount" : 1, "pop" : 10 }
and 2.5:
> db.test9371.aggregate( { $group: { _id: { state: "$state", city: "$city" }, zipcount : {$sum : 1}, pop: { $sum: "$pop" } } }, { $sort: { '_id.state':1, zipcount: 1 } })
{ "_id" : { "state" : "AL", "city" : "ADGER" }, "zipcount" : 1, "pop" : 3205 }
{ "_id" : { "state" : "AL", "city" : "ADAMSVILLE" }, "zipcount" : 1, "pop" : 10616 }
{ "_id" : { "state" : "AL", "city" : "ACMAR" }, "zipcount" : 1, "pop" : 6055 }
{ "_id" : { "state" : "CA", "city" : "LOS ANGELES" }, "zipcount" : 3, "pop" : 146408 }
{ "_id" : { "state" : "FL", "city" : "BRANFORD" }, "zipcount" : 1, "pop" : 2439 }
{ "_id" : { "state" : "FL", "city" : "CALLAHAN" }, "zipcount" : 1, "pop" : 9111 }
{ "_id" : { "state" : "FL", "city" : "BRYCEVILLE" }, "zipcount" : 1, "pop" : 1875 }
{ "_id" : { "state" : "IL", "city" : "ANTIOCH" }, "zipcount" : 1, "pop" : 18058 }
{ "_id" : { "state" : "IL", "city" : "ARLINGTON HEIGHT" }, "zipcount" : 2, "pop" : 79689 }
{ "_id" : { "state" : "ND", "city" : "AMENIA" }, "zipcount" : 1, "pop" : 321 }
{ "_id" : { "state" : "ND", "city" : "ABSARAKA" }, "zipcount" : 1, "pop" : 124 }
{ "_id" : { "state" : "ND", "city" : "ALICE" }, "zipcount" : 1, "pop" : 255 }
{ "_id" : { "state" : "TX", "city" : "ALLEN" }, "zipcount" : 1, "pop" : 24151 }
{ "_id" : { "state" : "TX", "city" : "CARROLLTON" }, "zipcount" : 2, "pop" : 92495 }
{ "_id" : { "state" : "UT", "city" : "ALTONAH" }, "zipcount" : 1, "pop" : 10 }
{ "_id" : { "state" : "UT", "city" : "AMERICAN FORK" }, "zipcount" : 1, "pop" : 21864 }
{ "_id" : { "state" : "UT", "city" : "ALTAMONT" }, "zipcount" : 1, "pop" : 146 }
Both of these are equally correct, since both are ordered as the $sort specifies, but they differ in how cities from the same state with the same zipcount are ordered. If you check your results, you will see that each version chose one of the valid cities as the one with the most zipcodes; they just happened to choose different cities.
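To make that concrete, here is a rough sketch in plain JavaScript (no server needed) using only the Alabama rows from the two outputs. The helper names isSortedByStateZipcount and lastCityPerState are made up for illustration, and lastCityPerState only approximates what a later $group with $last would do; your actual second stage may differ.

```javascript
// Alabama rows copied from the outputs above: first in the 2.4 order, then in the 2.5 order.
const order24 = [
  { _id: { state: "AL", city: "ADGER" }, zipcount: 1, pop: 3205 },
  { _id: { state: "AL", city: "ACMAR" }, zipcount: 1, pop: 6055 },
  { _id: { state: "AL", city: "ADAMSVILLE" }, zipcount: 1, pop: 10616 },
];
const order25 = [
  { _id: { state: "AL", city: "ADGER" }, zipcount: 1, pop: 3205 },
  { _id: { state: "AL", city: "ADAMSVILLE" }, zipcount: 1, pop: 10616 },
  { _id: { state: "AL", city: "ACMAR" }, zipcount: 1, pop: 6055 },
];

// Hypothetical helper: true if the rows are non-decreasing on (state, zipcount).
function isSortedByStateZipcount(rows) {
  for (let i = 1; i < rows.length; i++) {
    const a = rows[i - 1];
    const b = rows[i];
    if (a._id.state > b._id.state) return false;
    if (a._id.state === b._id.state && a.zipcount > b.zipcount) return false;
  }
  return true;
}

// Hypothetical helper: keeps the last city seen for each state, roughly what
// grouping by state with $last on the city would return.
function lastCityPerState(rows) {
  const result = {};
  for (const row of rows) {
    result[row._id.state] = row._id.city; // later rows overwrite earlier ones
  }
  return result;
}

console.log(isSortedByStateZipcount(order24), lastCityPerState(order24));
console.log(isSortedByStateZipcount(order25), lastCityPerState(order25));
```

Both orderings pass the sortedness check, yet the "last" Alabama city is ADAMSVILLE in one and ACMAR in the other.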
If you want reproducible results, I'd suggest making your sort more selective so that each document has a unique sort key. You can do that by adding _id to the end of the $sort key, so that ties always go to the city that comes first (or last) alphabetically.
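As a sketch of what that could look like: sortStage below is the $sort from the demonstration with _id appended. The rest is a plain-JavaScript stand-in (compareRows and sortedCities are hypothetical names, not server code) that mimics the tie-break on the Alabama rows, on the assumption that within one state comparing _id comes down to comparing city names.

```javascript
// Alabama rows from the outputs above, in the 2.4 order and the 2.5 order.
const order24 = [
  { _id: { state: "AL", city: "ADGER" }, zipcount: 1, pop: 3205 },
  { _id: { state: "AL", city: "ACMAR" }, zipcount: 1, pop: 6055 },
  { _id: { state: "AL", city: "ADAMSVILLE" }, zipcount: 1, pop: 10616 },
];
const order25 = [
  { _id: { state: "AL", city: "ADGER" }, zipcount: 1, pop: 3205 },
  { _id: { state: "AL", city: "ADAMSVILLE" }, zipcount: 1, pop: 10616 },
  { _id: { state: "AL", city: "ACMAR" }, zipcount: 1, pop: 6055 },
];

// The $sort stage with _id appended as the final tie-breaker.
const sortStage = { $sort: { "_id.state": 1, zipcount: 1, _id: 1 } };

// Three-way comparison of two primitive values.
const cmp = (x, y) => (x < y ? -1 : x > y ? 1 : 0);

// Assumption: local approximation of sortStage. State first, then zipcount,
// then city name as the tie-breaker.
function compareRows(a, b) {
  return (
    cmp(a._id.state, b._id.state) ||
    cmp(a.zipcount, b.zipcount) ||
    cmp(a._id.city, b._id.city)
  );
}

// Sorts a copy of the rows and returns just the city names.
const sortedCities = (rows) => [...rows].sort(compareRows).map((r) => r._id.city);

console.log(sortedCities(order24));
console.log(sortedCities(order25));
```

With the tie-breaker in place, both starting orders end up in the same sequence, so whatever $last picks no longer depends on the server version. In the shell you would pass sortStage to db.test9371.aggregate() in place of the original $sort; use _id: -1 instead if you want the tie to go the other way.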