Wrong world state #fabric #fabric-questions


Joao Antunes
 

Hi,

Recently I saw strange behaviour in my network: the ledger height was not consistent.

I solved that issue but, after a bit more investigation, I found that some values in the world state were not correct. I ran "peer node reset" on all my peers and the values were corrected. So am I right to assume that the ledger is correct but the world state is not? How can this happen?

My setup is 4 peers, 2 orderers, 2 CAs and 2 orgs (2 peers, 1 orderer and 1 CA in each). I'm also using a Kafka setup. To communicate with Fabric, I'm using the Java SDK.


David Enyeart
 

We haven't seen such a problem in Fabric system tests (e.g. there are tests that repeatedly crash peers and ensure that a peer restart recovers in-flight data with correct data integrity). The occurrences we've seen have been due to things like mounting incorrect database volumes or having data left over from a prior trial. It is not impossible that there is a bug in the state database code, but nobody has opened such an issue. So if you can reproduce such a problem, please open a bug in Jira with reproduction steps.

If you do get into a bad state, the 'bad' peer won't be able to impact the network. Transaction endorsements from a 'bad' peer won't match endorsements from 'good' peers for the afflicted keys, so your client application should detect the problem. If such a transaction were submitted to the channel anyway, it would be invalidated by all peers due to the mismatched endorsements.
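
The client-side detection described above can be sketched roughly as follows. This is a minimal illustration, not the Java SDK's API: the class `EndorsementCheck`, its methods, and the peer names are hypothetical, and it assumes the application has already collected the raw proposal response payload bytes from each endorsing peer. It groups peers by a SHA-256 digest of their payload and reports whether they all agree.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Fabric Java SDK.
class EndorsementCheck {

    // Hex-encoded SHA-256 digest, used as a compact key for grouping identical payloads.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Maps each distinct payload digest to the peers that returned that payload.
    static Map<String, List<String>> groupByPayload(Map<String, byte[]> payloadByPeer) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> e : payloadByPeer.entrySet()) {
            groups.computeIfAbsent(sha256Hex(e.getValue()), k -> new ArrayList<>())
                  .add(e.getKey());
        }
        return groups;
    }

    // True when every peer returned byte-identical payloads (or there are none).
    static boolean consistent(Map<String, byte[]> payloadByPeer) {
        return groupByPayload(payloadByPeer).size() <= 1;
    }

    public static void main(String[] args) {
        // Assumed sample data: three peers agree, one returns a different result.
        Map<String, byte[]> payloads = new LinkedHashMap<>();
        payloads.put("peer0.org1", "owner=alice".getBytes(StandardCharsets.UTF_8));
        payloads.put("peer1.org1", "owner=alice".getBytes(StandardCharsets.UTF_8));
        payloads.put("peer0.org2", "owner=alice".getBytes(StandardCharsets.UTF_8));
        payloads.put("peer1.org2", "owner=bob".getBytes(StandardCharsets.UTF_8));

        System.out.println("consistent: " + consistent(payloads));
        groupByPayload(payloads).forEach((digest, peers) ->
                System.out.println(digest.substring(0, 8) + " <- " + peers));
    }
}
```

A check like this only shows that peers disagree. It does not say which group holds the correct state, so the larger group should not be assumed to be the 'good' one.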


Dave Enyeart

"Joao Antunes" ---11/08/2019 04:43:06 PM---Hi, Recently I got a strange behaviour in my network regarding ledger height, that was not consisten

From: "Joao Antunes" <joao.antunes@...>
To: fabric@...
Date: 11/08/2019 04:43 PM
Subject: [EXTERNAL] [Hyperledger Fabric] Wrong world state #fabric #fabric-questions
Sent by: fabric@...










Joao Antunes
 

Thank you for reaching out again Dave,

In my case, for some unknown reason, I had 1 good world state and 3 bad ones. After updating the Fabric image on the peers and resetting all of them, the world state became coherent with the ledger. So from my analysis, the world state was wrong but the ledger was correct.
I had 3 wrong peers and 1 correct one, but even then I was not able to change anything because of the read sets of my queries: one of the read sets was different, so no actions were possible.
I don't know how this happened. I know that these entities originated from a single huge transaction (10K new entities created), but the ownership of some of them (3 or 4) was not coherent across the peers. Even after I deleted the CouchDB database from the peer, it recovered, but still with the wrong data. Only resetting the peer solved the issue.

I'll investigate more and, if possible, provide more data to this thread and, if warranted, open a Jira issue.
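
For anyone trying to pin down which keys are incoherent, the comparison described above can be sketched as a simple diff. This is a minimal sketch under assumptions: the class `StateDiff` and the sample keys are hypothetical, and it presumes the same keys have already been queried on each peer separately and collected as plain key/value maps. It returns the keys whose values are not identical on every peer, treating a key that is missing on one peer as a difference.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for comparing per-peer query results; not a Fabric API.
class StateDiff {

    // Returns, in sorted order, every key whose value differs between at least two peers.
    static Set<String> divergentKeys(Map<String, Map<String, String>> stateByPeer) {
        // Collect the union of keys seen on any peer.
        Set<String> allKeys = new TreeSet<>();
        for (Map<String, String> state : stateByPeer.values()) {
            allKeys.addAll(state.keySet());
        }

        Set<String> divergent = new TreeSet<>();
        for (String key : allKeys) {
            Set<String> distinctValues = new HashSet<>();
            for (Map<String, String> state : stateByPeer.values()) {
                // get() yields null when the key is absent, which counts as its own value.
                distinctValues.add(state.get(key));
            }
            if (distinctValues.size() > 1) {
                divergent.add(key);
            }
        }
        return divergent;
    }

    public static void main(String[] args) {
        // Assumed sample data: the owner of "entity2" differs on one peer.
        Map<String, Map<String, String>> stateByPeer = new LinkedHashMap<>();
        stateByPeer.put("peer0.org1", Map.of("entity1", "alice", "entity2", "bob"));
        stateByPeer.put("peer1.org1", Map.of("entity1", "alice", "entity2", "carol"));

        System.out.println("divergent keys: " + divergentKeys(stateByPeer));
    }
}
```

The list of divergent keys, together with the block and transaction that last wrote them, is the kind of detail that would make a reproduction report easier to act on.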


Tong Li
 

Joao,
This is a really interesting problem. Could you share what you did to the network, and the timing, so that others who are interested in the problem can try similar steps and watch the behaviour?

Thanks.

Tong Li
IBM Open Technology

"Joao Antunes" ---11/09/2019 11:37:48 AM---Thank you for reaching out again Dave, In my case, for some unknown reason, I had 1 good world state

From: "Joao Antunes" <joao.antunes@...>
To: fabric@...
Date: 11/09/2019 11:37 AM
Subject: [EXTERNAL] Re: [Hyperledger Fabric] Wrong world state #fabric #fabric-questions
Sent by: fabric@...










Joao Antunes
 

Hi Tong,

I'm currently investigating the issue and trying to replicate it.

I'll give an update when I reach a conclusion.