Re: Orderer ledger pruning

Vitalii Demianets

Thank you all for the discussion and your input!
I think we now have a high-level understanding of what has to be done, and it looks simple enough.
We will now start drafting the RFC. It will take some time, as we want to familiarize ourselves with the code internals and the architecture from the Fabric developer's point of view, but I hope it will take no more than 3-4 weeks.

// Vitalii Demianets @ norbloc AB

On Tue, Feb 23, 2021 at 3:56 PM Manish Sethi <manish.sethi@...> wrote:
With regard to changes to the blockstore, Jason is right that they would be trivial. For the peer case, this function ( does the job. For the orderer, I would expect an even simpler function (say, "BootstrapFromSnapshotInfo"), as nothing much needs to be supplied other than the information about the initial state.


On Tue, Feb 23, 2021 at 6:27 AM Vitalii Demianets <vitalii@...> wrote:
Hi Jason!

> At a very high level I would suggest:
> 1) Modifying the channel participation API to allow specifying a minimum block sequence number to replicate to when joining.
> 2) Add a new pruning function to the channel participation API, which allows reconfiguring that minimum block sequence to a higher sequence number. (Implicitly existing channels would have this sequence set to 0).
> 3) Ignoring the system channel case, instead making the channel participation API a prerequisite.
> 4) Introducing a new return status for the Deliver API indicating to peers and clients that the sequence is too old, to help users understand why a block request is being rejected. You'll actually notice that the API already has the notion of 'OLDEST' as a seek position, as the earliest incarnations of the orderer ledger were RAM based and self-pruning.
> Perhaps Manish can weigh in here, but I believe the block storage code was already modified in the peer to handle starting from a later block.  The orderer uses a thin wrapping layer around the peer blockstore, so if not already exposed, exposing it should be relatively simple.

In the above description you do not mention snapshots at all. Do I understand correctly that we do not need any of the peer snapshot functionality (private data collections, stateDB, transaction ID list) to start an orderer from an arbitrary block? From your description it follows that we only need to supply the latest config block and the desired start block number, and the orderer will happily start from that information (given that it is able to fetch the start block and the blocks on top of it)?
In other words, for orderers the "snapshot" is only the latest config block + start block number?

> It might also be helpful to extend the block height discovery in the orderer block replication to discover the lowest block height as well, to detect when an orderer cannot catch up, though I'm not sure I would consider this a blocking item for an initial implementation.

Yes, in our use case the orderer always starts from a block that can be discovered, so it makes sense not to add this complexity from the start.
After all, if an orderer is not able to find a block, it will be more or less obvious from the logs.

> I would argue that the most challenging aspect of implementing pruning at the orderer is an operational one.  The orderers are necessarily not involved in the peer snapshot operations, but, if the orderers prune to a block after the peer's most recent snapshot, onboarding will be broken until peers take a new snapshot.  In the worst case, if the orderer prunes before the blocks are disseminated to the peers, then data loss could occur.

We are going to keep several peers as "archiving" nodes, having all the history of blocks from block 0.
Will this solve the issue?
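
For illustration only, the operational constraint could be expressed as a check run before pruning. The Go sketch below rests on my own assumptions: a snapshot taken at height H is taken to cover blocks 0..H-1, so a peer joining from it needs block H onwards from the orderer, and a block is considered safe to prune only once every archiving peer has committed it. The function name and both parameters are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// SafePruneHeight returns the highest value the orderer's minimum retained
// block could be raised to without breaking the two conditions quoted above:
//   - it must not exceed snapshotHeight, the height of the snapshot used for
//     onboarding new peers (assumption: the snapshot covers blocks below it);
//   - it must not exceed the ledger height of any archiving peer (a
//     conservative assumption: every archive already holds the pruned blocks).
func SafePruneHeight(snapshotHeight uint64, archiveHeights []uint64) (uint64, error) {
	if len(archiveHeights) == 0 {
		return 0, errors.New("no archiving peers reported a height")
	}
	safe := snapshotHeight
	for _, h := range archiveHeights {
		if h < safe {
			safe = h
		}
	}
	return safe, nil
}

func main() {
	// One archiving peer lags behind the snapshot, so it bounds the result.
	safe, err := SafePruneHeight(1000, []uint64{1200, 950, 1100})
	if err != nil {
		fmt.Println("cannot prune:", err)
		return
	}
	fmt.Println("orderer may prune blocks below", safe)
}
```

Whether the archiving peers alone are sufficient is exactly the open question; the sketch only shows how the two constraints would combine if both are kept.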

// Vitalii Demianets @ norbloc AB
