Dave, Yacov, and Alex
It seems that the general response to this scenario is “this is an application design problem and should be solved by chaincode.”
Here is an example: my national ID is "1234567", but I am a bad guy and want others to believe that my national ID number is "7654321". So I put the false hash(salt, "7654321") on chain and then send the pre-image (salt, "7654321") to whoever I want to convince. Since nobody can verify hash(salt, "7654321") at the time it is put on chain without prior knowledge of the data, an adversary can use the "claims about private data" functionality to trick people into believing forged data.
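To make the scenario concrete, here is a minimal sketch in Python. It assumes hash(salt, value) means SHA-256 over the salt concatenated with the value; the helper names commit and verify are mine and are not part of any Fabric API.

```python
import hashlib
import os


def commit(salt: bytes, value: str) -> str:
    """Stand-in for hash(salt, value): SHA-256 over salt || value (an assumption)."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def verify(on_chain_hash: str, salt: bytes, value: str) -> bool:
    """What a recipient can check: the pre-image matches the on-chain hash."""
    return commit(salt, value) == on_chain_hash


real_id = "1234567"
forged_id = "7654321"
salt = os.urandom(16)

# The adversary puts the hash of the forged value on chain.
on_chain_hash = commit(salt, forged_id)

# The recipient receives (salt, forged_id) off chain and checks it.
forged_accepted = verify(on_chain_hash, salt, forged_id)
real_accepted = verify(on_chain_hash, salt, real_id)

print("forged ID accepted:", forged_accepted)
print("real ID accepted:", real_accepted)
```

The check only proves that the pre-image matches what was committed, not that what was committed is true.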
But my argument here is that chaincode design can’t solve this problem, and I can assure you that a large number of DLT deployments are at risk because of this.
As I stated earlier, unlike digital signatures or ZKP algorithms, hashes cannot be verified by third parties. There is almost no way to guard against adversaries putting fake data on chain and then tricking others into believing the fake data is real.
Since chaincode can’t decode hashes, the only thing it can do is limit the number of updates. In most financial use cases (e.g. trade transactions) this is irrelevant, since the pre-image data are not constant in the first place. Even for constant data such as the “national ID” in the aforementioned scenario, chaincode will most likely still allow at least a few updates to cover typos.
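A toy model of that update limit, again in Python. This is only a sketch: HashRegistry, put_hash, and MAX_UPDATES are hypothetical names for an in-memory stand-in, not Fabric chaincode or its API.

```python
import hashlib

MAX_UPDATES = 3  # hypothetical allowance, e.g. to cover typos


class HashRegistry:
    """In-memory stand-in for chaincode that stores opaque hashes per key."""

    def __init__(self, max_updates: int = MAX_UPDATES):
        self.max_updates = max_updates
        self.hashes = {}  # key -> latest hash
        self.writes = {}  # key -> number of writes so far

    def put_hash(self, key: str, digest: str) -> bool:
        """Accept the write unless the key has used up its allowed writes."""
        count = self.writes.get(key, 0)
        if count >= self.max_updates:
            return False
        # The registry sees only the digest; it cannot tell real from forged.
        self.hashes[key] = digest
        self.writes[key] = count + 1
        return True


registry = HashRegistry()
forged = hashlib.sha256(b"salt" + b"7654321").hexdigest()

# The very first write is already the forged hash, and it is accepted.
first_write_ok = registry.put_hash("national_id:me", forged)
print("forged hash accepted on first write:", first_write_ok)
```

The limit caps how often a value can change, but it never inspects what the value is, so it does nothing against a forged first write.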
Leaving it to applications is easier said than done: there are so few ways to get it right, and this functionality simply opens the door for attackers while offering almost nothing in return.
This bug is neither an application design issue nor a Fabric implementation issue, but a methodology problem that the private data feature promotes. My humble recommendation is to deprecate this functionality, or at least to put up warning signs for people who still plan to use it.