Re: How to verify RAFT health in HLF version 1.4.4 #raft

Jason Yellick <jyellick@...>

For v1.4.x, there is no metric that reports the set of active nodes.  It is usually adequate to confirm that all OSNs are alive and have been online for a reasonable period of time, which implies that the Raft cluster has all nodes active.

There are ways to detect nodes that are out of quorum, but they are a little tedious.  When a node is not part of the Raft quorum, it makes services for that channel unavailable.  You can therefore attempt to pull a block via the Deliver interface from each OSN on a channel; those that respond with SERVICE_UNAVAILABLE are not currently active in the Raft cluster.
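
The Deliver probe described above could be scripted roughly as follows. This is only a sketch under several assumptions: it shells out to the `peer channel fetch newest` CLI rather than calling the Deliver gRPC API directly, it assumes the CLI environment (MSP, TLS) is already configured, and it assumes the status name SERVICE_UNAVAILABLE appears in the command output when an OSN refuses the request. The function names, the labels "active", "unavailable" and "error", and the orderer addresses are all hypothetical.

```python
import os
import subprocess
import tempfile


def fetch_newest_block(orderer, channel, cafile, timeout=30):
    """Run `peer channel fetch newest` against one OSN.

    Returns (returncode, combined stdout and stderr).
    Assumes the peer CLI is on PATH and its MSP environment is set.
    """
    with tempfile.TemporaryDirectory() as tmp:
        out_file = os.path.join(tmp, "newest.block")
        cmd = [
            "peer", "channel", "fetch", "newest", out_file,
            "-o", orderer, "-c", channel,
            "--tls", "--cafile", cafile,
        ]
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return proc.returncode, proc.stdout + proc.stderr


def classify_fetch_result(returncode, output):
    """Map one fetch attempt to a coarse label (labels are illustrative)."""
    if "SERVICE_UNAVAILABLE" in output:
        return "unavailable"  # OSN is not active in the Raft cluster for this channel
    if returncode == 0:
        return "active"       # a block was delivered
    return "error"            # some other failure (TLS, permissions, network, ...)


def check_channel(orderers, channel, cafile, fetch=fetch_newest_block):
    """Probe every OSN for one channel and return {orderer: label}."""
    results = {}
    for orderer in orderers:
        returncode, output = fetch(orderer, channel, cafile)
        results[orderer] = classify_fetch_result(returncode, output)
    return results
```

Calling `check_channel(["orderer0.example.com:7050", "orderer1.example.com:7050"], "mychannel", "/path/to/ca.pem")` would probe each listed OSN in turn. The check has to be repeated per channel, since Raft membership is per channel.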

As was pointed out in your original e-mail, the "consensus_etcdraft_active_nodes" metric was added in v2.0 to address this operational aspect.  It was not back-ported to v1.4.x because the code change required new messages on the wire, and was considered too invasive for an LTS release.
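
For anyone on v2.0 or later, reading that metric might look like the sketch below. It assumes the orderer's operations service is enabled with the Prometheus metrics provider, that the metric is exposed in the Prometheus text format, and that it carries a `channel` label; the URL and the function names are hypothetical.

```python
import re
import urllib.request

# Matches sample lines such as:
#   consensus_etcdraft_active_nodes{channel="mychannel"} 3
_SAMPLE = re.compile(
    r'^consensus_etcdraft_active_nodes(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_active_nodes(metrics_text):
    """Return {channel: active node count} from Prometheus text output."""
    result = {}
    for line in metrics_text.splitlines():
        match = _SAMPLE.match(line)
        if not match:
            continue  # skips "# HELP"/"# TYPE" comments and other metrics
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        result[labels.get("channel", "")] = float(match.group("value"))
    return result


def fetch_metrics(url, timeout=10):
    """Download the metrics page; the URL is an assumption about your deployment."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Something like `parse_active_nodes(fetch_metrics("http://orderer0.example.com:8443/metrics"))` could then be compared against the expected number of consenters for each channel.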


----- Original message -----
From: "Senthil Nathan" <cendhu@...>
Sent by: fabric@...
To: shrugupt@...
Cc: fabric <fabric@...>
Subject: [EXTERNAL] Re: [Hyperledger Fabric] How to verify RAFT health in HLF version 1.4.4 #raft
Date: Fri, Jun 19, 2020 7:12 AM
On Fri, Jun 19, 2020 at 9:38 AM shrugupt via <> wrote:

Would really appreciate if anybody has any thought/suggestion regarding my query.





