[SERVER-74408] Insert document exactly as specified on upserts Created: 27/Feb/23  Updated: 27/Oct/23  Resolved: 09/Jun/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Lenny Elzey Assignee: Bernard Gorman
Resolution: Works as Designed Votes: 1
Labels: external-user, query-director-triage
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-76705 Fields are reordered alphabetically b... Closed
Assigned Teams:
Query Execution
Sprint: QE 2023-06-12
Participants:

 Description   

How are you using Mongo? What version of the server and driver are you using?

MongoDB 6.0

MongoDB NodeJS Driver 5.1

What is the feature/improvement you would like?

When documents are upserted, insert them exactly as specified in other words maintain the position of each field.

What use case would this feature/improvement enable?

More pleasant documents that fit the form I want it to as well as requiring less function calls to the server to replicate the desired behavior

 

Code example:

const task = {
  kind: 'repeating',
  name: options.name,
  repeatsEvery: options.repeatsEvery,
  error: null,
  startsAt: nextRunAt,
  lastFinishedAt: null,
  lastFailedAt: null,
  lockedAt: null,
  createdAt: now,
  updatedAt: now
};
 
/*
  const exists = await _collection.findOne({ name: options.name });
  if (exists) {
    // We don't use upsert because we want to maintain the positions of each field
    await _collection.updateOne({ _id: exists._id }, { $set: task });
  } else {
    await _collection.insertOne(task); 
  }
*/
 
await collection.updateOne({ name: options.name }, { $set: task }, { upsert: true });
 
// Produces:
const generated = {
  _id: {
    $oid: '63fc0358c071ce978101c877'
  },
  name: 'add_two_single_tasks',
  createdAt: {
    $date: {
      $numberLong: '1677460312226'
    }
  },
  error: null,
  kind: 'repeating',
  lastFailedAt: null,
  lastFinishedAt: {
    $date: {
      $numberLong: '1677460327252'
    }
  },
  lockedAt: null,
  repeatsEvery: 5000,
  startsAt: null,
  updatedAt: {
    $date: {
      $numberLong: '1677460327252'
    }
  }
};
 
// Instead of:
 
const desired = {
  _id: {
    $oid: '63fc0578591cc70ce3f9665e'
  },
  kind: 'repeating',
  name: 'add_two_single_tasks',
  repeatsEvery: 5000,
  error: null,
  startsAt: {
    $date: {
      $numberLong: '1677460866287'
    }
  },
  lastFinishedAt: {
    $date: {
      $numberLong: '1677460861287'
    }
  },
  lastFailedAt: null,
  lockedAt: null,
  createdAt: {
    $date: {
      $numberLong: '1677460856262'
    }
  },
  updatedAt: {
    $date: {
      $numberLong: '1677460861287'
    }
  }
};



 Comments   
Comment by Bernard Gorman [ 08/Jun/23 ]

Hi developer.lenneth@gmail.com,

Thanks for raising this issue, I'll try to explain what's happening here. The important point to recognise is that, despite the fact that your code snippet is written in a way which visually implies that task is the new document to be inserted, that is not how the system sees it.

const task = {
  kind: 'repeating',
  name: options.name,
  repeatsEvery: options.repeatsEvery,
  error: null,
  startsAt: nextRunAt,
  lastFinishedAt: null,
  lastFailedAt: null,
  lockedAt: null,
  createdAt: now,
  updatedAt: now
};
 
await collection.updateOne({ name: options.name }, { $set: task }, { upsert: true });

From the update system's perspective, {$set:task} is not a complete new document to be inserted - it is a set of individual modifications to be applied to a matching document, just like a regular update. There could just as easily have been an $inc operator alongside the $set, or a $mul, or any of our other update operators. If no document matches the filter, then the update system generates a new document for upsert by applying all the individual modifications to an empty document (in practice, a document that is seeded with _id and the values from the query part of the update). And per our documentation, the update system always applies modifications in lexicographic or numeric field order. This is why the final document in your example has _id and name first, followed by the other fields from task in alphabetical order.

To illustrate why {$set:task} is not equivalent to a literal document for upsert, consider what would happen if task had a dotted field like {"a.b": 1}. If we assume that task was the document to be inserted, we would expect this field to appear exactly as-is in the upserted document. But the upsert would actually produce a document containing {a: {b: 1}}, because $set is processing the fields of task as individual update modifications, and in the update language a dotted path indicates a subdocument separator.

As an alternative, assuming that it suits your use-case, you could use replaceOne instead of updateOne. The difference here is that replaceOne does expect its argument to be a complete new document, and not a set of separate modification operators:

replset [direct: primary] test> db.testing.replaceOne({ name: 'add_two_single_tasks' }, task, { upsert: true });
{
  acknowledged: true,
  insertedId: ObjectId("64819b256dda04d846a1bafa"),
  matchedCount: 0,
  modifiedCount: 0,
  upsertedCount: 1
}
replset [direct: primary] test> db.testing.find()
[
  {
    _id: ObjectId("64819b256dda04d846a1bafa"),
    kind: 'repeating',
    name: 'add_two_single_tasks',
    repeatsEvery: 5000,
    error: null,
    startsAt: ISODate("2023-06-08T09:10:03.845Z"),
    lastFinishedAt: null,
    lastFailedAt: null,
    lockedAt: null,
    createdAt: ISODate("2023-06-08T09:03:52.656Z"),
    updatedAt: ISODate("2023-06-08T09:03:52.656Z")
  }
]

Finally, you could consider using an aggregation update instead of a regular update. The syntax is very similar, but the field-order outcome is different because aggregation updates use projection instead of the update subsystem - the $set stage is actually an alias for $addFields.

// Note that this is the same update as your example, except with the $set wrapped in an array.
replset [direct: primary] test> db.testing.updateOne({ name: 'add_two_single_tasks' }, [{ $set: task }], { upsert: true });
{
  acknowledged: true,
  insertedId: ObjectId("64819bdd6dda04d846a1bb3b"),
  matchedCount: 0,
  modifiedCount: 0,
  upsertedCount: 1
}
replset [direct: primary] test> db.testing.find()
[
  {
    _id: ObjectId("64819bdd6dda04d846a1bb3b"),
    name: 'add_two_single_tasks',
    kind: 'repeating',
    repeatsEvery: 5000,
    error: null,
    startsAt: ISODate("2023-06-08T09:10:03.845Z"),
    lastFinishedAt: null,
    lastFailedAt: null,
    lockedAt: null,
    createdAt: ISODate("2023-06-08T09:03:52.656Z"),
    updatedAt: ISODate("2023-06-08T09:03:52.656Z")
  }
]

Since applying update modifications in lexicographic order is expected behaviour, and workarounds that allow you to retain your preferred field ordering exist, I'll close this ticket as Works As Designed. Please do note, however, that in general we recommend that applications should not rely on specific field ordering, except in cases where it is necessary to compare two complete documents or subdocuments against each other.

Hope this helps!

Best regards,
Bernard

Comment by Edwin Zhou [ 02/May/23 ]

Hi developer.lenneth@gmail.com,

Thank you for your patience while we investigate this issue. It does appear that an upsert using $set will reorder keys.

I will pass this on to the appropriate team to investigate further investigate this behavior.

Kind regards,
Edwin Zhou

Comment by Lenny Elzey [ 27/Feb/23 ]

Apologize for the poorly formatted JSON, didn't realize copying straight from Compass would do that.

Comment by PM Bot [ 27/Feb/23 ]

Hi developer.lenneth@gmail.com, thank you for submitting this ticket! The team is going to investigate and reply back with more info after the investigation is completed.

Generated at Thu Feb 08 06:27:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.