Missed data/transactions while stress testing #hyperledger-fabric #fabric-questions #fabric-orderer #fabric #network


jefferson.rs@...
 

Hi.
 
My team is running a couple of stress tests on HLF 2.3.0 in a network with 3 orderers and 2 peers (each peer belongs to a different org). We send arrays of JSON data through an API developed with Node SDK 2.2; each transaction is approximately 50 KB.
 
While running a load of several million transactions, we observed that a couple of documents were missing from the ledger. At the same time, the API didn't receive any error for these missing transactions.
 
During the test, we noticed some WARN messages in the orderer logs that might be a clue, but they are not returned to the API as errors, so we are not sure whether they are related:
 
2021-02-01 12:10:06.708 UTC [orderer.common.broadcast] ProcessMessage -> WARN 88af9a2 [channel: ch1] Rejecting broadcast of normal message from 57.145.150.41:54444 with SERVICE_UNAVAILABLE: rejected by Order: aborted
 
2021-01-26 14:16:16.530 UTC [orderer.common.cluster.step] sendMessage -> WARN 33245e Stream 7 to orderer1(orderer1:443) was forcibly terminated because timeout (7s) expired
2021-01-26 14:18:22.123 UTC [orderer.consensus.etcdraft] run -> WARN 19e666 WAL sync took 29.466370256 seconds and the network is configured to start elections after 5 seconds. Your disk is too slow and may cause loss of quorum and trigger leadership election. channel=ch1 node=2
 
So we have a couple of questions on which we would like feedback from the community:
- Are there any internal errors (mostly on the orderer side) that might not be returned to the API and could cause missing transactions?
- If so, what would be the best way to ensure that this data gets recorded in the ledger? Shouldn't the orderer be "smart enough" to notice that an error occurred and replay the transaction itself after some time, given that the transaction was already proposed and approved by a peer? Could a listener catch failures like this so that we can do a replay from the API? If so, could someone provide an example, please?
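
To make the last question concrete, below is a minimal sketch of the kind of client-side bookkeeping a replay would need. It is deliberately independent of the SDK: `PendingTxTracker` and its method names are hypothetical, the timeout value is arbitrary, and it assumes the API knows each transaction ID at submit time and is notified of commits by some listener. Replaying is also only safe if the chaincode tolerates the same data being submitted twice.

```typescript
// Hypothetical helper; none of these names come from the Fabric SDK.
interface PendingTx<P> {
  payload: P;          // what to resend if the transaction never commits
  submittedAt: number; // epoch milliseconds
}

class PendingTxTracker<P> {
  private pending = new Map<string, PendingTx<P>>();
  private timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Call right after the transaction has been sent for ordering.
  markSubmitted(txId: string, payload: P, now: number = Date.now()): void {
    this.pending.set(txId, { payload, submittedAt: now });
  }

  // Call from whatever commit or block listener the application uses.
  // Returns true if the transaction was still being tracked.
  markCommitted(txId: string): boolean {
    return this.pending.delete(txId);
  }

  // Remove and return every entry that has waited longer than timeoutMs
  // without a commit notification; these are the replay candidates.
  takeOverdue(now: number = Date.now()): Array<{ txId: string; payload: P }> {
    const overdue: Array<{ txId: string; payload: P }> = [];
    this.pending.forEach((tx, txId) => {
      if (now - tx.submittedAt > this.timeoutMs) {
        overdue.push({ txId, payload: tx.payload });
        this.pending.delete(txId);
      }
    });
    return overdue;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

In the API, `markSubmitted` would be called after each submit, `markCommitted` from the listener, and a timer would periodically resend whatever `takeOverdue` returns. What we do not know is which listener should drive `markCommitted`.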
 
Thanks in advance.

Jeff.
