This is on 6696c3edd28, with ninja 1.8.2 on Ubuntu 18.04, and a scons cmdline of:
python3 buildscripts/scons.py CC=gcc CXX=g++ --variables-files=etc/scons/mongodbtoolchain_stable_gcc.vars 'CCFLAGS=-Wa,--compress-debug-sections -gsplit-dwarf -fdiagnostics-color' LINKFLAGS=-fuse-ld=gold MONGO_VERSION=4.5.0 MONGO_GIT_HASH=unknown --ssl --detect-odr-violations --link-model=dynamic --dbg=on --install-mode=hygienic ICECC=icecc CCACHE=ccache --ninja=next generate-ninja
and a full working build of install-core.
I made a change to VectorClock in db/vector_clock.h to add an unused protected pure virtual function, and then implementations in db/vector_clock_mongod.cpp, db/vector_clock_trivial.cpp, and s/commands/vector_clock_mongos.cpp:
diff --git a/src/mongo/db/vector_clock.h b/src/mongo/db/vector_clock.h index b7381652e9c..e4887dcc3c1 100644 --- a/src/mongo/db/vector_clock.h +++ b/src/mongo/db/vector_clock.h @@ -122,6 +122,7 @@ protected: virtual void _gossipOutExternal(BSONObjBuilder* out) const = 0; virtual LogicalTimeArray _gossipInInternal(const BSONObj& in) = 0; virtual LogicalTimeArray _gossipInExternal(const BSONObj& in) = 0; + virtual bool _somethingUnused() const = 0; void _gossipOutComponent(BSONObjBuilder* out, VectorTime time, Component component) const; void _gossipInComponent(const BSONObj& in, LogicalTimeArray* newTime, Component component); diff --git a/src/mongo/db/vector_clock_mongod.cpp b/src/mongo/db/vector_clock_mongod.cpp index 69d2eb2e7c1..534cefbe624 100644 --- a/src/mongo/db/vector_clock_mongod.cpp +++ b/src/mongo/db/vector_clock_mongod.cpp @@ -57,6 +57,9 @@ protected: void _gossipOutExternal(BSONObjBuilder* out) const override; LogicalTimeArray _gossipInInternal(const BSONObj& in) override; LogicalTimeArray _gossipInExternal(const BSONObj& in) override; + bool _somethingUnused() const override { + return true; + } private: enum class ReplState { diff --git a/src/mongo/db/vector_clock_trivial.cpp b/src/mongo/db/vector_clock_trivial.cpp index 88f00a512ea..93a99a41f96 100644 --- a/src/mongo/db/vector_clock_trivial.cpp +++ b/src/mongo/db/vector_clock_trivial.cpp @@ -53,6 +53,9 @@ protected: void _gossipOutExternal(BSONObjBuilder* out) const override; LogicalTimeArray _gossipInInternal(const BSONObj& in) override; LogicalTimeArray _gossipInExternal(const BSONObj& in) override; + bool _somethingUnused() const override { + return true; + } }; const auto vectorClockTrivialDecoration = ServiceContext::declareDecoration<VectorClockTrivial>(); diff --git a/src/mongo/s/commands/vector_clock_mongos.cpp b/src/mongo/s/commands/vector_clock_mongos.cpp index a8d0b0dbc81..2dcb1b59701 100644 --- a/src/mongo/s/commands/vector_clock_mongos.cpp +++ b/src/mongo/s/commands/vector_clock_mongos.cpp @@ -50,6 +50,9 @@ protected: void _gossipOutExternal(BSONObjBuilder* out) const override; LogicalTimeArray _gossipInInternal(const BSONObj& in) override; LogicalTimeArray _gossipInExternal(const BSONObj& in) override; + bool _somethingUnused() const override { + return true; + } }; const auto vectorClockMongoSDecoration = ServiceContext::declareDecoration<VectorClockMongoS>();
After making this change, running ninja install-core with ninja next rebuilds only 40 targets, whereas stable rebuilds 567. The binaries produced by ninja stable run fine, but those built with ninja next exhibit segfaults and/or other random problems. As a canary test I used:
./buildscripts/resmoke.py run --suite=no_passthrough jstests/noPassthrough/client_disconnect_during_sign_logical_time.js
which reliably crashes (for me) with a MONGO_UNREACHABLE in the JournalFlusher on the first config server. Again, this crash does not happen if the procedure is repeated with ninja stable instead of next.
If the change applied is even further reduced to just:
diff --git a/src/mongo/db/vector_clock.h b/src/mongo/db/vector_clock.h index b7381652e9c..51a7b075841 100644 --- a/src/mongo/db/vector_clock.h +++ b/src/mongo/db/vector_clock.h @@ -122,6 +122,7 @@ protected: virtual void _gossipOutExternal(BSONObjBuilder* out) const = 0; virtual LogicalTimeArray _gossipInInternal(const BSONObj& in) = 0; virtual LogicalTimeArray _gossipInExternal(const BSONObj& in) = 0; + bool _somethingUnused() const {} void _gossipOutComponent(BSONObjBuilder* out, VectorTime time, Component component) const; void _gossipInComponent(const BSONObj& in, LogicalTimeArray* newTime, Component component);
then running ninja reports:
ninja: no work to do.
which is clearly incorrect, since the cpp files that #include "mongo/db/vector_clock.h" need to be rebuilt, and everything that depends on them relinked and reinstalled.
Looking at the targets built in each case (for the first patch) confirms this hypothesis — ninja next is correctly rebuilding the changed cpp files, but not the cpp files (and associated libs) that include the changed header file, eg. rpc/metadata.cpp and the rpc/metadata lib.
Since the in-memory layout of the class has changed, the not-rebuilt object files are using the (now incorrect) original layout, which leads to incorrect memory usages, and so explains the segfaults and/or other weird problems.
cc acm
- is depended on by
-
SERVER-47776 Promote ninja-next to stable
- Closed