[JAVA-227] NPE when replica set is down Created: 06/Dec/10  Updated: 17/Mar/11  Resolved: 17/Feb/11

Status: Closed
Project: Java Driver
Component/s: Cluster Management
Affects Version/s: 2.3
Fix Version/s: 2.5

Type: Bug Priority: Major - P3
Reporter: Jeff Yemin (Inactive) Assignee: Antoine Girbal
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

1.62 linux 64 bit server


Attachments: Java Archive File mongo.jar    

 Description   

When the whole replica set is down, the Java driver logs this message every 5 seconds.

2010-12-06 14:35:10,294 ERROR [STDERR] (ReplicaSetStatus:Updater) Dec 6, 2010 2:35:10 PM com.mongodb.ReplicaSetStatus$Node update
SEVERE: can't update node: shared-mongo-004.1515.mtvi.com:27017
java.lang.NullPointerException
at com.mongodb.OutMessage.reset(OutMessage.java:73)
at com.mongodb.OutMessage.<init>(OutMessage.java:51)
at com.mongodb.OutMessage.query(OutMessage.java:38)
at com.mongodb.DBPort.findOne(DBPort.java:142)
at com.mongodb.DBPort.runCommand(DBPort.java:159)
at com.mongodb.ReplicaSetStatus$Node.update(ReplicaSetStatus.java:119)
at com.mongodb.ReplicaSetStatus.updateAll(ReplicaSetStatus.java:277)
at com.mongodb.ReplicaSetStatus$Updater.run(ReplicaSetStatus.java:238)

Seems like an NPE is not what should happen. Strange thing is that OutMessage.java:73 is

        _id = ID.getAndIncrement();

and I'm not sure how that line could generate an NPE.
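For illustration only, and not as a diagnosis of this ticket: one ordinary way for a static field to be null on a line like this is static initialization order. Static initializers run in textual order, so code that runs during class initialization can see a field before its initializer has executed. A minimal sketch, in which InitOrderDemo and the EARLY field are hypothetical and not taken from the driver source:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class, not the driver's OutMessage.
class InitOrderDemo {
    // Initialized first: the constructor runs before ID has been assigned.
    static final InitOrderDemo EARLY = new InitOrderDemo();

    // Initialized second: too late for the EARLY instance above.
    static AtomicInteger ID = new AtomicInteger(1);

    final boolean sawNullId;
    final int id;

    InitOrderDemo() {
        AtomicInteger counter = ID;   // still null while EARLY is being built
        sawNullId = (counter == null);
        // Calling ID.getAndIncrement() directly here would throw the NPE.
        id = sawNullId ? -1 : counter.getAndIncrement();
    }

    public static void main(String[] args) {
        System.out.println("EARLY saw null ID: " + EARLY.sawNullId);
        System.out.println("later instance saw null ID: " + new InitOrderDemo().sawNullId);
    }
}
```

Whether anything like this applies to OutMessage under JBoss is not established in this ticket.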



 Comments   
Comment by Antoine Girbal [ 17/Feb/11 ]

Will reopen if the issue is brought up again.

Comment by Jeff Yemin (Inactive) [ 17/Feb/11 ]

I haven't seen it again, but I haven't brought down the replica set entirely. This is definitely not high priority for us though.

Comment by Antoine Girbal [ 17/Feb/11 ]

Let me know if you are still experiencing the problem with the newest driver.
If not, I'll close the ticket. Thanks.

Comment by Antoine Girbal [ 13/Dec/10 ]

I attached a jar file that has a new init block for the ID field.

    private final static AtomicInteger ID;
    static {
        ID = new AtomicInteger(1);
    }

Let me know if it makes any difference.
Note that this .jar is built from trunk and should not be used in production.

Comment by Jeff Yemin (Inactive) [ 10/Dec/10 ]

I tried with a simple test program and couldn't reproduce it, which is not surprising.

Comment by Swen Thümmler [ 10/Dec/10 ]

I'm seeing the very same symptom with glassfish 2.1.1

Comment by Antoine Girbal [ 07/Dec/10 ]

OK, maybe JBoss reorders class loading and does not initialize the ID first.
No matter how we fix this, it may be a problem in other parts of the code too.
You should probably open a ticket with JBoss to report the bug.

If you are able to compile the driver, could you try:

  • make the ID static final like
    final static AtomicInteger ID = new AtomicInteger(1);
  • move the query() method to end of class
  • if the above doesn't work, then replace the line with a static block:
    static AtomicInteger ID;
    static {
    ID = new AtomicInteger(1);
    }

If that doesn't work, we can always add a catch for NPE errors and then initialize ID, but it's sad.
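A rough sketch of what that last-resort fallback could look like, written as a null check rather than a catch block. GuardedIds and next() are hypothetical names for illustration and are not part of the driver:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the driver: a null-guarded ID counter.
class GuardedIds {
    // volatile so the double-checked guard below is safe across threads.
    private static volatile AtomicInteger ID = new AtomicInteger(1);

    static int next() {
        AtomicInteger counter = ID;
        if (counter == null) {
            // Only reached if the static initializer above has not run yet.
            synchronized (GuardedIds.class) {
                if (ID == null) {
                    ID = new AtomicInteger(1);
                }
                counter = ID;
            }
        }
        return counter.getAndIncrement();
    }

    public static void main(String[] args) {
        System.out.println(next());
        System.out.println(next());
    }
}
```

One caveat with this kind of guard: if it fires while the class is still initializing, the field's own initializer runs afterwards and replaces the counter, so IDs could restart from 1.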

Comment by Jeff Yemin (Inactive) [ 06/Dec/10 ]

So far only seen it after restarting our app with replica set already down (all nodes in replica set are unreachable).

Comment by Antoine Girbal [ 06/Dec/10 ]

That's very odd behavior: this ID is a static variable that should be initialized when the class is loaded, before any method from it is used.
Most likely this is due to the JBoss class loader.
Maybe there is a workaround.
How exactly do you get it to happen?
Is it after a restart of your app with the replica set down?
Or does it also happen if you run queries in your app, then take all replicas down?

Comment by Jeff Yemin (Inactive) [ 06/Dec/10 ]

Also, don't see the NPE when the replica set is up.

Comment by Jeff Yemin (Inactive) [ 06/Dec/10 ]

Also note that we are running this in JBoss 4.2.3, which has rather funky class-loading.

Comment by Jeff Yemin (Inactive) [ 06/Dec/10 ]

Confirmed in debugger that OutMessage.ID is null.

Generated at Thu Feb 08 08:51:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.