
#fabric-kubernetes #fabric-questions #fabric-sdk-java Resilient fabric cluster


jk@...
 

I have a simple fabric network with an orderer in solo mode and a single peer connected. The network is used by two spring-boot apps, each of them using a separate channel. Both the apps and the fabric nodes run in AWS and are orchestrated with Kubernetes. I'd like to prepare my network for a production environment and ensure its resilience. I have a couple of questions on this topic:

  1. What is the best setup for a resilient network? How many orderers and peers should I have? Are 2 orderer pods and 3 peer pods enough? Would I have to change the setup if I were running the network on bare metal?
  2. How is Kafka utilized by orderers?
  3. How do I revive nodes that were off?

    • Do I need to rerun peer channel create -o REVIVED_ORDERER_HOST:PORT -c $CHANNEL_NAME -f ./channel-artifacts/channel.tx when a dead orderer node comes back? In my case, would I have to run it twice, once for each of my two channels? What would happen if both orderers disappear at the same time?
    • For a peer, I assume I need to rerun peer channel join -b $CHANNEL_NAME.block if the peer was offline for some time?
  4. Is there a way to automate these things, e.g. for a peer to automatically rejoin a channel after a restart?

Thank you in advance!

PS. Right now I'm not using a CLI container at all; my spring apps create the channel (or fetch the existing one) and then install chaincode onto it if it's missing. The flow works correctly if the fabric nodes are started first and then the spring apps. Also, if I reboot one of the spring apps, it is still able to rejoin the network. Unfortunately, a problem arises when I reboot the peer pod, because the peer no longer has its channel information:

peer | 2018-12-06 11:10:44.853 UTC [protoutils] ValidateProposalMessage -> WARN 037 channel [audit]: MSP error: channel doesn't exist
peer | 2018-12-06 11:10:47.524 UTC [common/deliver] Handle -> WARN 038 Error reading from 172.18.0.1:49218: rpc error: code = Canceled desc = context canceled

but the orderer has registered the earlier creation of the channel, so when my spring app tries to rejoin the network the orderer logs the following:

2018-11-22 19:02:48.158 UTC [orderer/common/broadcast] Handle -> WARN 9a0 [channel: audit] Rejecting broadcast of config message from 100.96.35.159:55544 because of error: error authorizing update: error validating ReadSet: readset expected key [Group] /Channel/Application at version 0, but got version 1

Is there a facility within the Java SDK to automatically rejoin a channel / install chaincode? Should I even create a channel as part of my Java code, or is using a CLI container the preferable way of managing a fabric network?


l.cintron@...
 

Jk,

 

My attempt at your questions is below. Anyone, please correct me if I'm wrong.

 

1. What is the best setup for a resilient network? How many orderers and peers should I have? Are 2 orderer pods and 3 peer pods enough? Would I have to change the setup if I were running the network on bare metal?

Perhaps other folks can expand on this, but here are my two cents. It depends on your end goal and how you plan to organize your network; I personally believe it depends on how many orgs are participating and on the network activity (the number of transactions being submitted). Ideally, each org should run its own orderer (this makes more sense in a BFT consensus network, not so much with Kafka). I would say more than one orderer is definitely necessary in a Kafka consensus configuration; otherwise you don't need the zookeeper/kafka nodes at all, which is essentially what solo consensus gives you (without the Kafka overhead, and with no crash tolerance). I highly recommend reading "Performance Benchmarking and Optimizing Hyperledger Fabric" (https://arxiv.org/pdf/1805.11390.pdf); it describes how configuration and network-design factors affect the performance of your network.

2. How is Kafka utilized by orderers?

The zookeeper/kafka cluster provides the consensus: the orderers publish transactions to a Kafka topic and consume them back in a single, agreed-upon order, so every orderer receives the same transactions in the same order and cuts identical blocks. Keep in mind this approach is not BFT, only crash-fault tolerant.
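To make that ordering guarantee concrete, here is a purely illustrative Java snippet (this is not Fabric's internal code). Fabric's Kafka orderer uses a single-partition topic per channel, and a single partition is a totally ordered log; the broker address and the topic name below are placeholders:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class OrderedLogReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092"); // placeholder broker address
        props.put("group.id", "example");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // One single-partition topic per channel: a single partition
            // is a totally ordered log.
            consumer.assign(Collections.singletonList(new TopicPartition("audit", 0)));
            consumer.seekToBeginning(consumer.assignment());
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Every consumer of this partition sees the same records at
                    // the same offsets, which is why each orderer can cut
                    // identical blocks.
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
            }
        }
    }
}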

3. How do I revive nodes that were off?

If you are using the ./startFabric script from the examples, replace down with stop when calling docker-compose. docker-compose down removes the containers and networks (and, with the -v flag, the volumes as well), so you lose your state (https://docs.docker.com/compose/reference/down/). docker-compose stop only stops the containers, so you can 'revive' them the same way you started them the first time.

• Do I need to rerun peer channel create -o REVIVED_ORDERER_HOST:PORT -c $CHANNEL_NAME -f ./channel-artifacts/channel.tx when a dead orderer node comes back? In my case, would I have to run it twice, once for each of my two channels? What would happen if both orderers disappear at the same time?

Not when a dead orderer node comes back; peer channel create submits a channel-creation transaction, and you only run it once per channel. If you have 2 orderers and they both go down, you won't be able to submit any transactions, and fabric's blockchain/world state will not change. You will have to write logic to resubmit the transactions that weren't committed while your ordering service was down.
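For that resubmission logic, here is a rough sketch with fabric-sdk-java, assuming you already have an initialized Channel and a prepared TransactionProposalRequest; the timeout and backoff values are arbitrary:

import java.util.Collection;
import java.util.concurrent.TimeUnit;
import org.hyperledger.fabric.sdk.BlockEvent;
import org.hyperledger.fabric.sdk.Channel;
import org.hyperledger.fabric.sdk.ProposalResponse;
import org.hyperledger.fabric.sdk.TransactionProposalRequest;

public class ResubmitExample {
    static BlockEvent.TransactionEvent submitWithRetry(Channel channel,
            TransactionProposalRequest request, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Collect endorsements from the channel's peers (a real
                // implementation would check each ProposalResponse status),
                // then hand the endorsed transaction to the ordering service.
                Collection<ProposalResponse> responses =
                        channel.sendTransactionProposal(request);
                return channel.sendTransaction(responses).get(30, TimeUnit.SECONDS);
            } catch (Exception e) {
                last = e;                      // e.g. no orderer reachable, timeout
                Thread.sleep(attempt * 1000L); // simple linear backoff before retrying
            }
        }
        throw last;
    }
}

In a real app you would also persist the pending requests somewhere durable, so they survive an application restart as well as orderer downtime.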

 

• For a peer, I assume I need to rerun peer channel join -b $CHANNEL_NAME.block if the peer was offline for some time?

You will have to fetch the channel's genesis block first (peer channel fetch 0) and then join with it.
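Since you asked about the Java SDK: with fabric-sdk-java those two CLI steps collapse into one call, because Channel.joinPeer fetches the channel's genesis block from the orderer and then sends the join proposal to the peer. A minimal sketch (host names and URLs are placeholders, and the admin User is assumed to already exist):

import org.hyperledger.fabric.sdk.Channel;
import org.hyperledger.fabric.sdk.HFClient;
import org.hyperledger.fabric.sdk.Orderer;
import org.hyperledger.fabric.sdk.Peer;
import org.hyperledger.fabric.sdk.User;

public class JoinExample {
    static Channel joinChannel(HFClient client, User admin, String channelName)
            throws Exception {
        client.setUserContext(admin); // an org admin identity you already have
        Orderer orderer = client.newOrderer("orderer", "grpc://orderer:7050");
        Peer peer = client.newPeer("peer0", "grpc://peer0:7051");

        Channel channel = client.newChannel(channelName);
        channel.addOrderer(orderer); // the genesis block is fetched from this orderer
        channel.joinPeer(peer);      // equivalent to peer channel fetch + join
        return channel.initialize();
    }
}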

4. Is there a way to automate these things, e.g. for a peer to automatically rejoin a channel after a restart?

You can always call your start script on startup (e.g., add it to your /etc/rc.local). As with the previous answers, unless you lost your containers, you can just call docker-compose up to run them again.
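On the application side, one way to automate the rejoin from your spring apps is to check at startup which channels the peer reports and rejoin any that are missing. A rough sketch with fabric-sdk-java and Spring Boot, assuming a Spring-managed HFClient, Peer, and admin User and reusing the JoinExample helper above; only "audit" appears in your logs, the second channel name is a placeholder:

import java.util.Set;
import org.hyperledger.fabric.sdk.HFClient;
import org.hyperledger.fabric.sdk.Peer;
import org.hyperledger.fabric.sdk.User;
import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.stereotype.Component;

@Component
public class ChannelRejoiner implements ApplicationRunner {
    private final HFClient client;
    private final Peer peer;
    private final User admin;

    public ChannelRejoiner(HFClient client, Peer peer, User admin) {
        this.client = client;
        this.peer = peer;
        this.admin = admin;
    }

    @Override
    public void run(ApplicationArguments args) throws Exception {
        client.setUserContext(admin);
        // queryChannels asks the peer which channels it has joined; a peer pod
        // that came back without its volume will not report the old channels.
        Set<String> joined = client.queryChannels(peer);
        for (String name : new String[] {"audit", "second-channel"}) { // placeholder names
            if (!joined.contains(name)) {
                JoinExample.joinChannel(client, admin, name); // rejoin as sketched above
            }
        }
    }
}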

 

Hope this helps!

-Luis

 

