[SERVER-42844] Refactor OpObserverImpl Created: 16/Aug/19  Updated: 06/Dec/22  Resolved: 24/Mar/22

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: A. Jesse Jiryu Davis Assignee: Backlog - Replication Team
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
Assigned Teams:
Replication
Participants:

 Description   

OpObserverImpl has a few confusing aspects that make its code difficult to understand.

1. Its responsibilities are not well defined. Most of its code is concerned with creating oplog entries on a replica set primary for all write operations, but it also handles FCV checks, transaction bookkeeping, direct writes to the config.transactions table or system.js tables, and perhaps other responsibilities. *Proposal*: split this class into several OpObservers, so each has one well-defined responsibility.

2. OpObserverImpl is instantiated only by embedded MongoDB. A regular mongod instantiates OpObserverShardingImpl. Therefore the class names seem unrelated to their purposes. *Proposal*: Perhaps there should be three classes: OpObserverBase, OpObserverMongoD, and OpObserverEmbedded.

3. OpObserverImpl is tightly coupled with its subclass OpObserverShardingImpl. We'd expect OpObserverImpl's methods to know nothing about sharding. OpObserverShardingImpl's methods would call the base class methods, with sharding-specific code executed before and after the base class calls. Instead, OpObserverImpl has empty shard-specific private virtual methods. Its public methods contain some shard-specific logic, plus calls to these empty private methods. OpObserverShardingImpl implements the remainder of the shard-specific logic in its overrides of these private methods. This design is confusing, and it usually means that a change in one class requires a change in the other. This design also forces embedded MongoDB to waste time executing sharding logic. *Proposal*: move all sharding logic from OpObserverImpl into OpObserverShardingImpl.

4. Even if OpObserverImpl were responsible only for creating oplog entries, it would still be a "God object" that knows about every subsystem that can execute writes on a primary. *Proposal*: Create a registry of oplog entry generators. Each subsystem that executes write operations on the primary (CRUD, DDL, transactions, etc.) can also register an oplog entry generator. OpObserverImpl is responsible for executing the generators in the registry, but it need not know the details of every write operation. This registry may be the same as the registry created for PM-1469.



 Comments   
Comment by Elizabeth Roytburd [ 24/Mar/22 ]

Replication will be addressing this via PM-2822 so we are closing this ticket out. 

Comment by Geert Bosch [ 10/Dec/19 ]

I already added an OpObserverRegistry that people are supposed to use to register subsystem specific observers, in preference of cramming everything in OpObserverImpl. This last class should probably be renamed.

Generated at Thu Feb 08 05:01:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.