Re: Limiting dissemination of hashes of private collection keys and values?


Victor Dods
 

Thanks for the reply, Yacov!

Appending a salt to the plaintext of each key* means that I need to retain and supply that salt in order to use GetPrivateData, which in turn means that I need to know which keys I'll be accessing before the transaction proposal, so that I can send the appropriate salt(s) in the transient data field.  This (1) imposes a significant architectural burden and (2) breaks encapsulation of the chaincode call (the caller can't, and shouldn't be expected to, know in advance all the keys that the chaincode will use).

Perhaps it would be useful to elaborate on a more essential version of my suggestion.  Forget the part about the HMAC.  Add a modified version of PutPrivateData to ChaincodeStubInterface:
- PutPrivateDataWithNonces(collection string, key string, keyNonce []byte, value []byte, valueNonce []byte) error
When this is called, it stores the quadruple (key, keyNonce, value, valueNonce) in the Private State and stores (hash(keyNonce+key), hash(valueNonce+value)) in the Channel State.  Here, the chaincode author is responsible for generating those nonces.  The hash scheme is then exactly as you say, except that the nonces no longer interfere with the keys and values, so I can use GetPrivateData and related functions just as I would if I didn't have to worry about nonces (i.e. GetPrivateData etc. remain backward-compatible).  The concerns are now separate.
- PutPrivateData doesn't change, and internally it just uses empty byte arrays for keyNonce and valueNonce, so the behavior of the hash scheme is still compatible with the existing Fabric protocol.

The difference between this and my original suggestion is that in the original, Fabric would be:
- generating those nonces for you (so no repetitive boilerplate for chaincode authors)
- automatically (so all chaincode authors benefit without additional work on their part)
- using a well-defined, tested scheme (so each chaincode author isn't burdened with having to implement it correctly themselves).

*You might suggest that you could just use a single salt for all keys, but (1) that still imposes a burden to manage that salt outside of the chaincode and (2) still allows an attacker to see that two keys are equal, two values are equal, or a key is equal to a value.
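
A small Go sketch of point (2), assuming SHA-256 and a hypothetical saltedHash helper (neither comes from Fabric itself): with one shared salt, equal plaintexts still produce equal hashes, while per-entry nonces hide the equality.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// saltedHash is a hypothetical helper returning SHA-256(salt || data).
func saltedHash(salt, data []byte) [32]byte {
	buf := append(append([]byte{}, salt...), data...)
	return sha256.Sum256(buf)
}

func main() {
	secret := []byte("same-plaintext")

	// One salt shared by every entry: an observer of the hashes can tell
	// that the two entries hold the same plaintext.
	shared := []byte("one-salt-for-everything")
	a := saltedHash(shared, secret)
	b := saltedHash(shared, secret)
	fmt.Println("shared salt, hashes equal:", a == b)

	// A distinct nonce per entry: the same plaintext hashes differently.
	c := saltedHash([]byte("nonce-1"), secret)
	d := saltedHash([]byte("nonce-2"), secret)
	fmt.Println("per-entry nonces, hashes equal:", c == d)
}
```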

Victor Dods
Chief Software Architect
LedgerDomain


On Fri, Mar 27, 2020 at 1:43 PM Yacov Manevich <YACOVM@...> wrote:
why do you need a MAC? What's wrong with just appending a salt to the plaintext? You're not protecting against any length extension attacks or doing any kind of authentication.



From:        "Victor Dods" <victor.dods@...>
To:        fabric@...
Date:        03/27/2020 05:28 AM
Subject:        [EXTERNAL] [Hyperledger Fabric] Limiting dissemination of hashes of private collection keys and values?
Sent by:        fabric@...




Hi all, I'm wondering if there's a mechanism for specifying that the hashes of private collection keys and values should not be disseminated beyond the peers of that private collection.  Basically my use case is such that I know that a particular private collection's data will never be used outside of the peers of that collection, and the issue regarding having to put nonces into private values and KEYS in particular is a total pain.  Because I don't need this extra-collection verification, I don't want the overhead that comes along with that.

Or failing that, what is a reasonable scheme for putting nonces into the keys of a private collection?  It can't be truly random, otherwise there's no way to recover the nonced key (e.g. "abc@...%00<random-nonce>") from the "bare" key (e.g. "abc@...").  One could make it deterministic by making the nonce the hash of the bare key, but then if an attacker knows which hash function you're using, they could brute force the key in smaller spaces (phone numbers, email addresses, etc).  You could use an HMAC with a specified key, but then you (1) have to manage that HMAC key, (2) have to tie the lifetime of that HMAC key to the lifetime of the key/value pair, and (3) have to use the same HMAC key for all key/value pairs OR have some scheme for using multiple HMAC keys and mapping them to the key/value pairs.  Basically the problem of having to manually embed the nonces in the keys and values really gets in the way.
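
For reference, the HMAC-based option could look roughly like the Go sketch below. The function name noncedKey, the NUL separator, the hex encoding, and the choice of HMAC-SHA256 are all assumptions made for illustration; how the secret is stored and rotated is exactly the management problem described above and is not addressed here.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// noncedKey is a hypothetical helper. It derives a deterministic nonce as
// HMAC-SHA256(secret, bareKey) and appends it to the bare key after a NUL
// separator, so the nonced key can always be recomputed from the bare key
// by anyone who holds the secret.
func noncedKey(secret []byte, bareKey string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(bareKey))
	nonce := hex.EncodeToString(mac.Sum(nil))
	return bareKey + "\x00" + nonce
}

func main() {
	// Illustrative secret only; a real one would need secure management.
	secret := []byte("hypothetical-hmac-secret")
	fmt.Printf("%q\n", noncedKey(secret, "some-bare-key"))
}
```

Without the secret, an attacker cannot recompute the nonce, so brute forcing a small key space against the published hash no longer works the way it does with a plain hash of the bare key.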

One thing that comes to mind is using truly random nonces for the keys, and then doing a GetPrivateDataByPartialCompositeKey using the bare key to deal with not knowing the nonce, but then you're in the situation where you've done a range-based query, and are then subject to the limitations detailed here: https://hyperledger-fabric.readthedocs.io/en/release-1.4/private-data-arch.html#querying-private-data which is a total pain.

Anyway, I'd generally like to hear about how people have been dealing with this issue.

As a side comment for the Fabric maintainers -- I've brought this issue up in the past, and so have others -- why not have Fabric do this sort of nonce bookkeeping automatically?  If everyone is responsible for manually implementing it themselves, how many people are (1) going to do it at all and (2) going to do it correctly?

An idea for how this could work:
- If a transaction is going to write any private data, require that an RNG seed be set using a new SeedRNG method in the chaincode stub (otherwise the transaction fails with an error).  This RNG seed would be passed into the endorsing peer via the transient data field.  Potentially the name of the HMAC scheme should be a channel configuration value.
- When a private data write is done, generate a pair of HMAC keys from the seeded RNG, one for the key and one for the value, and keep them in parallel with the key and value.  Thus the diagram in this section https://hyperledger-fabric.readthedocs.io/en/release-1.4/private-data/private-data.html#what-is-a-private-data-collection would have a quadruple (k1, k1HMACkey, secret-value, secret-value-HMACkey) for Private State.
- Then instead of hash(k1) and hash(secret-value) being stored in the Channel State, it's HMAC(k1, k1HMACkey) and HMAC(secret-value, secret-value-HMACkey), which can be verified just as easily as before.
- Finally, all the stub's Get/Put/Del methods for private data operate on the "bare" key/value pairs as expected and ideally desired, and there's no interference from the details of the HMAC scheme, and all is well in the world again.

I'd be curious to hear thoughts on this from Fabric maintainers and other interested people.

PS: Thanks to Mike Lodder for educating me on HMACs and other security mechanisms.


