Re: CouchDB Double-Spending Problem


jeroiraz
 

Following the discussion

I think the statement that using CouchDB will make MVCC fail is not correct, because that validation is performed in the very same way regardless of the underlying state database (LevelDB or CouchDB). Whenever a chaincode uses the key-value API, the read-write set is prepared in memory at the endorsing peer by the transaction simulator. For rich queries, the digest of the result is included in the read part of the set (that last solution was not present in earlier versions of Fabric).
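To make that concrete, here is a minimal chaincode sketch showing which calls end up in the read-write set during simulation (Go, against the Fabric 1.x shim; import paths vary between releases, and the marble key and value are made up for illustration):

package main

import (
	"github.com/hyperledger/fabric/core/chaincode/shim"
	pb "github.com/hyperledger/fabric/protos/peer"
)

type Marbles struct{}

func (m *Marbles) Init(stub shim.ChaincodeStubInterface) pb.Response {
	return shim.Success(nil)
}

func (m *Marbles) Invoke(stub shim.ChaincodeStubInterface) pb.Response {
	// During simulation the transaction simulator records
	// (key, committed version) in the read set for this GetState.
	if _, err := stub.GetState("marble1"); err != nil {
		return shim.Error(err.Error())
	}

	// PutState is only buffered in memory: (key, new value) is added
	// to the write set. The state database is untouched until the
	// block containing this transaction is validated and committed.
	if err := stub.PutState("marble1", []byte(`{"owner":"alice"}`)); err != nil {
		return shim.Error(err.Error())
	}
	return shim.Success(nil)
}

func main() {
	if err := shim.Start(new(Marbles)); err != nil {
		panic(err)
	}
}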

When we added SQL capabilities at the chaincode level, we had to create a new transaction simulator, a different read-write set, and a new validation procedure, because we didn't employ MVCC. But the validation performed when using LevelDB or CouchDB is the same.

However, I think Ivan had another motivation: to reduce the number of transactions that get invalidated due to MVCC, and his approach is to use different keys as much as possible. While this is a reasonable workaround given MVCC, I'm not able to see why using the same keys with CouchDB would behave any differently than with LevelDB.

Jerónimo



On Sat, Apr 21, 2018 at 1:44 PM, Kim Letkeman <kletkema@...> wrote:

Boiling the issue down to the MVCC check on key versions, which is how a conflict is detected: transactions that ran in parallel and read the same key versions invalidate all parallel transactions after the first one is committed (which may or may not be the first one that was endorsed).
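For reference, that check amounts to something like the following. This is a conceptual Go sketch, not Fabric's actual validator code, and the type names are mine:

// version identifies the transaction that last wrote a key.
type version struct{ blockNum, txNum uint64 }

// mvccValid: a transaction is valid only if every key version it read
// during simulation still matches the committed version at commit time.
func mvccValid(readSet map[string]version, committed map[string]version) bool {
	for key, observed := range readSet {
		if committed[key] != observed {
			// Some transaction earlier in this block (or an earlier
			// block) already wrote this key: invalidate.
			return false
		}
	}
	return true
}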

So my question is: are you saying that the MVCC check can fail when using CouchDB?

The reason I ask is that it is not clear that there was a key conflict in your scenario. If the new marble has a unique key that was never read when the first transaction occurred, there is no reason to invalidate any of the transactions you mentioned.

Kim


Kim Letkeman
Senior Technical Staff Member, IBM Watson IoT

IoT Blockchain


Phone: +1 (613) 762-0352
E-mail: kletkema@...


"Ivan Vankov" ---04/21/2018 04:24:29 AM---Author of the article here. The problem with COuchDB is that phantom reads can occurs. Let me give

From: "Ivan Vankov" <gatakka@...>
To: fabric@...
Date: 04/21/2018 04:24 AM
Subject: Re: [Hyperledger Fabric] CouchDB Double-Spending Problem
Sent by: fabric@...



Author of the article here.

The problem with CouchDB is that phantom reads can occur. Let me give you an example: imagine that Bob has 10 marbles and decides to transfer all of them to Alice. The chaincode reads all marbles owned by Bob (during the simulation) and prepares the read-write set, then this set is sent to the orderer for commitment. When the peers receive the block they will apply this RW set. It seems OK, but there is time between simulation and commitment; what happens if some other transaction adds a new marble to Bob in that window? This new marble will not be transferred to Alice, and no error will be raised. This is the behavior when using CouchDB. If you are using LevelDB, then before commit an additional check is made: this change in the data is detected, the transaction is invalidated, and the ledger is not updated. There are many other "edge" cases when using CouchDB; this is a simple example.
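In chaincode terms, the vulnerable pattern looks roughly like this (a hypothetical transferAllMarbles sketch; the selector, key layout, and imports are illustrative, Fabric 1.x shim assumed):

import (
	"fmt"

	"github.com/hyperledger/fabric/core/chaincode/shim"
	pb "github.com/hyperledger/fabric/protos/peer"
)

func transferAllMarbles(stub shim.ChaincodeStubInterface, from, to string) pb.Response {
	// Rich query, only available on CouchDB. It runs once, at
	// simulation time, so its result set is frozen in the proposal.
	query := fmt.Sprintf(`{"selector":{"owner":"%s"}}`, from)
	it, err := stub.GetQueryResult(query)
	if err != nil {
		return shim.Error(err.Error())
	}
	defer it.Close()

	for it.HasNext() {
		kv, err := it.Next()
		if err != nil {
			return shim.Error(err.Error())
		}
		// Reassign every marble the query saw. A marble added to
		// `from` after simulation is a phantom: it is not in this
		// result set, so it is silently left behind.
		newVal := []byte(fmt.Sprintf(`{"owner":"%s"}`, to))
		if err := stub.PutState(kv.Key, newVal); err != nil {
			return shim.Error(err.Error())
		}
	}
	return shim.Success(nil)
}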

So how do you solve this? In some cases this may be the desired behavior, or these phantom reads may not cause any data degradation; it all depends on the particular flow and data. But if you must protect yourself from this, then you have a couple of options:

1. Use LevelDB. You will lose rich queries, but if you start using composite keys as indexes this can solve most of the limitations (see the sketch after this list).

2. The application layer must guarantee the stability of the set between simulation and commitment time. There are many ways to do this, and none of them is perfect. In general, you create a queue in the app layer and schedule transactions so that while Bob transfers marbles to Alice, no other transaction adds new marbles to Bob.
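A sketch for option 1, emulating an owner index with composite keys (the "owner~marble" object type and helper name are my own; Fabric 1.x shim assumed). Unlike a rich query, the range read behind GetStateByPartialCompositeKey is captured in the read set and re-checked at validation time:

import "github.com/hyperledger/fabric/core/chaincode/shim"

// On marble creation you would also write an index entry:
//   idxKey, _ := stub.CreateCompositeKey("owner~marble", []string{owner, id})
//   stub.PutState(idxKey, []byte{0x00}) // non-empty placeholder value

func marblesByOwner(stub shim.ChaincodeStubInterface, owner string) ([]string, error) {
	// Range scan over all composite keys starting with this owner.
	it, err := stub.GetStateByPartialCompositeKey("owner~marble", []string{owner})
	if err != nil {
		return nil, err
	}
	defer it.Close()

	var ids []string
	for it.HasNext() {
		kv, err := it.Next()
		if err != nil {
			return nil, err
		}
		// Attributes are [owner, marbleID]; the ID is the second one.
		_, attrs, err := stub.SplitCompositeKey(kv.Key)
		if err != nil {
			return nil, err
		}
		ids = append(ids, attrs[1])
	}
	return ids, nil
}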

From my experience I found that these problems can be solved when people stop modeling the data like in a relational database. Denormalizing the data in Fabric can help you reorganize it in such a way that no collisions can happen. I know that data normalization is "embedded" in our minds, but data normalization is not effective here.
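As a small illustration of that denormalization (the key layout here is my own invention): give every marble its own key instead of keeping one list per owner, so concurrent writers touch disjoint keys.

import (
	"fmt"

	"github.com/hyperledger/fabric/core/chaincode/shim"
)

// Relational instinct: keep one "marbles_bob" document holding a list,
// which every add/transfer must read and rewrite -- guaranteeing MVCC
// conflicts on that single key.
//
// Denormalized: one key per marble. Adding marble42 to Bob writes only
// "marble:42" and cannot conflict with a transfer touching "marble:7".
func addMarble(stub shim.ChaincodeStubInterface, id, owner string) error {
	val := []byte(fmt.Sprintf(`{"owner":"%s"}`, owner))
	return stub.PutState("marble:"+id, val)
}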

This is a new technology, and all of us are learning and testing now, but good practices are starting to appear. Do not be afraid to experiment!




