Private data : issues and problems #fabric #fabric-questions #fabric-dstorage

Ivan Ch <acizlan@...>

Hi Vipin,

I was on vacation for a few weeks and I am now starting a new topic regarding the private data design since we are no longer talking about the original security issue (unsalted hash).

Hello Ivan,
I have been following this thread for a while.
Thanks for raising some of these issues.
While it is important to question and to challenge the assumptions underlying Hyperledger Fabric, the best way to get attention, answers and influence the design may not be by using language like "Major Security hole...". This raises hackles and creates an atmosphere of defensiveness.
First, the issue you originally raised (the unsalted hash) may be just a documentation matter, according to all who debated it, so let us drop that from the list.
So that leaves:
1) Hashes on chain cannot be validated by any third party, so they can be used by adversaries to trick honest participants (open).
The design of private data collections sets up, in effect, "a covert channel" between the parties who exchange that information. I use the term "covert channel" guardedly, before the cryptographers and crypto engineers among us object strenuously to it. All those who need to know have access to methods to check the hash. Please re-examine this and re-read the private channel documentation. In terms of the veracity of the data (or the claim), this is a problem that has to be solved anyway, in any blockchain, through attestation by the party who put the data on the chain (in other words, the issuer of the claim). There are many ways to share these "covert" claims: edge architectures with certain proofs on the chain and so forth, a la Aries supported by Indy.
On-chain hashes just don't solve any problem. ZKP would be the solution to the problem; hashes are not. Sure, some people would argue that ZKP is slow and premature, but I have to disagree, since protocols such as Bulletproofs and many other customized ZKP protocols are fairly efficient. I understand that plenty of people like to use on-chain hashes because they are easy and comfortable. However, if we want to move ahead, we have to look for the best technology, not what makes people comfortable at the moment.
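To make the two positions above concrete, here is a minimal sketch, assuming the on-chain commitment is a plain SHA-256 digest of the private value. The function names and the "price=..." payload are hypothetical illustrations and not Fabric APIs. It shows that a collection member holding the private value can check it against the published hash, while an outsider can recover a low-entropy value from an unsalted hash simply by guessing; a random salt shared only among members defeats that guessing. Nothing in it lets a third party confirm that the committed value is truthful, which is the attestation question raised above.

```python
import hashlib
import os


def on_chain_hash(value: bytes, salt: bytes = b"") -> str:
    """Digest that would be published on chain (assumed: SHA-256 of salt + value)."""
    return hashlib.sha256(salt + value).hexdigest()


def member_verifies(private_value: bytes, published: str, salt: bytes = b"") -> bool:
    """A collection member recomputes the digest from the private value it holds."""
    return on_chain_hash(private_value, salt) == published


def outsider_guesses(published: str, candidates):
    """An outsider without the data tries every plausible value (no salt known)."""
    for candidate in candidates:
        if on_chain_hash(candidate) == published:
            return candidate
    return None


# Hypothetical low-entropy private value: a price between 0 and 999.
price = b"price=100"
candidates = [f"price={n}".encode() for n in range(1000)]

# Unsalted: members can verify, but so can anyone willing to enumerate guesses.
published = on_chain_hash(price)
recovered = outsider_guesses(published, candidates)

# Salted: the same enumeration fails unless the outsider also knows the salt.
salt = os.urandom(16)
published_salted = on_chain_hash(price, salt)
recovered_salted = outsider_guesses(published_salted, candidates)

print(member_verifies(price, published))
print(recovered)
print(recovered_salted)
```

The sketch only covers the hash check itself; a zero-knowledge proof would instead let a non-member verify a statement about the value without seeing it.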

2) Private data uses gossip to transfer data, which would require every participant to be connected to every other participant in a channel. If there are 20 participants in a channel, each participant must open its firewall to the other 19 participants of that channel (open).
This may not be as it seems, as gossip protocols can transmit information using connections to a limited number of "near peers". Overlay this with the three types of nodes (i.e. endorsing peers, validating peers and orderers), with anchor peers being special peers that can serve as the "gateways" for endorsing and validating peers. As for the orderers, I am not aware of the exact network that they participate in (i.e. is it gossip driven?). Also, this interaction can be over TLS, which is a widely used method today to protect communications over the open internet. I believe Fabric has this feature.
The issue is not whether you can use a secure protocol such as TLS to transmit data securely; the problem is that you have to make pre-arrangements with all peers (opening firewalls to each other), which is not possible in practice unless all nodes operate on the same cloud.
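The firewall arithmetic in point 2 can be sketched as follows. This is a rough illustration under stated assumptions: one firewall opening per pair of organizations, and, for comparison, a hypothetical single shared endpoint (called "orderer" here) that every organization connects to. The organization names are made up, and real deployments may need more than one rule per link.

```python
from itertools import combinations


def full_mesh_links(orgs):
    """Every pair of organizations needs a pre-arranged connection."""
    return list(combinations(orgs, 2))


def hub_links(orgs, hub="orderer"):
    """Every organization connects only to one shared endpoint (assumed name)."""
    return [(org, hub) for org in orgs]


# Hypothetical channel with 20 participating organizations.
orgs = [f"org{i}" for i in range(1, 21)]

mesh = full_mesh_links(orgs)
hub = hub_links(orgs)

# Openings each organization must negotiate in the full mesh: n - 1.
per_org_mesh = len(orgs) - 1

print(per_org_mesh)  # 19
print(len(mesh))     # 190
print(len(hub))      # 20
```

With 20 organizations the pairwise arrangements number n(n-1)/2 = 190, against 20 for a single shared endpoint, which is why the coordination effort grows quickly as a channel gains members.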

You have a point about firewalls; the disposition of the components in a regulated enterprise may need some design modifications to accommodate firewalls. Since firewalls, whether on-prem or in the cloud, are not monolithic (they include multiple layers such as the DMZ), enterprises currently use reverse proxies (for incoming messages) and SOCKS-compliant protocols for outgoing ones. In Corda Enterprise, there is a component called the "Float" which functions as a reverse proxy. I was involved in conversations around the design of this component when I was working in a regulated financial institution. I do not know the status of the "Float", since it is available only in Corda Enterprise. There are also multiple architectural patterns written up on the provisioning of the components inside firewalls. We need that thought process in Fabric if it does not exist.
This problem actually gets bigger when we have to get all participants to do the same. Each enterprise seems to have its own idiosyncratic settings behind the firewall. This is still doable, but a big hassle.

Another feature demanded by IT architecture and security teams in enterprises is the componentization of nodes. By that I mean breaking up (say) any endorsing or validating peer into data access and smart contract execution layers, with the possibility of scaling and housing them in various parts of the enterprise stack.
All this points to having community involvement in architecture best practices for projects, and to presence and participation in such exercises, so that the Fabric team can take advantage of expertise such as yours that exists in the open source community.
We must collaborate, otherwise why be in an open source consortium?
I've been trying hard to convince my client to avoid using the private data feature :) We are able to configure the orderers as a shared cluster so that every org can just connect its peer nodes to the orderer service running on a cloud, bypassing the firewall issue (each org only needs to open its firewall to the central orderer service). Things got a lot more complicated when the private data feature kicked in. People somehow just assume that a feature is right just because it's in the Fabric documentation.
