Weird "panic: runtime error" after multiple transactions


Tomás Peixinho
 

Good afternoon fellow colleagues,

I'm developing an application using Hyperledger Fabric and I've found a weird error that I'm not sure how to fix.

I'm running everything on just one machine, with two orgs, one peer on each, and one orderer. My application receives a file with multiple lines (right now, around 5000), which correspond to one transaction each (technically, two transactions each, because I have to do a query by partial key, and then a separate transaction for the actual update, as per recommendation of Dave Enyart, from a previous question of mine). So, 5000 lines, around 10000 transactions. They are being handled concurrently using threads. And the peers are being alternated, so that one doesn't happen to take all the load, while the other does nothing.

My problem now is that it usually ends with one of the peers randomly crashing with an Exit Code 2, and the logs from that peer saying "panic: runtime error: invalid memory address or nil pointer dereference". I say randomly because it can happen after a 1000 lines (2000 transactions) or 4000 lines (8000 transactions), which is what I find weird!

I've read that it could be because of problems with certificates or with misconfigured paths for them, but I feel that if that were the case, it would happen much sooner, or actually right at the start.

My question now is (and I know that I haven't provided much information or any logs, and that it's difficult to find the problem like this) if there is any obvious situation in which this might happen? Or if there are a few cases that might lead to this error? Something that might point me in the right direction of what I need to be looking at to see if I have any configuration problems. Or could it simply be a problem of my physical machine? Could it be just a memory problem, which crashes the peer if the traffic gets too heavy?

These errors are really not obvious sometimes, and I was just wondering if someone might have some insight as to what the problem might be, based on previous experiences or something.

Thanks in advance

Tomás


David Enyeart
 

You would need to provide the full stack trace from the panic message in order to troubleshoot.


Dave Enyeart

"Tomás Peixinho" ---02/10/2021 01:34:13 PM---Good afternoon fellow colleagues, I'm developing an application using Hyperledger Fabric and I've fo

From: "Tomás Peixinho" <tom.peixinho@...>
To: "fabric@..." <fabric@...>
Date: 02/10/2021 01:34 PM
Subject: [EXTERNAL] [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
Sent by: fabric@...





Good afternoon fellow colleagues, I'm developing an application...
This Message Is From an External Sender
This message came from outside your organization.
Good afternoon fellow colleagues,

I'm developing an application using Hyperledger Fabric and I've found a weird error that I'm not sure how to fix.

I'm running everything on just one machine, with two orgs, one peer on each, and one orderer. My application receives a file with multiple lines (right now, around 5000), which correspond to one transaction each (technically, two transactions each, because I have to do a query by partial key, and then a separate transaction for the actual update, as per recommendation of Dave Enyart, from a previous question of mine). So, 5000 lines, around 10000 transactions. They are being handled concurrently using threads. And the peers are being alternated, so that one doesn't happen to take all the load, while the other does nothing.

My problem now is that it usually ends with one of the peers randomly crashing with an Exit Code 2, and the logs from that peer saying "panic: runtime error: invalid memory address or nil pointer dereference". I say randomly because it can happen after a 1000 lines (2000 transactions) or 4000 lines (8000 transactions), which is what I find weird!

I've read that it could be because of problems with certificates or with misconfigured paths for them, but I feel that if that were the case, it would happen much sooner, or actually right at the start.

My question now is (and I know that I haven't provided much information or any logs, and that it's difficult to find the problem like this) if there is any obvious situation in which this might happen? Or if there are a few cases that might lead to this error? Something that might point me in the right direction of what I need to be looking at to see if I have any configuration problems. Or could it simply be a problem of my physical machine? Could it be just a memory problem, which crashes the peer if the traffic gets too heavy?

These errors are really not obvious sometimes, and I was just wondering if someone might have some insight as to what the problem might be, based on previous experiences or something.

Thanks in advance

Tomás




Tomás Peixinho
 

I was reading about similar issues in fabric version 1.4.2 (the images I'm using on my containers are 1.4.1, so I'm not sure if it is related at all or not), and how it was related to a certain timeout when one of the containers stops (from an issue on Jira from August 2019, https://jira.hyperledger.org/browse/FAB-16292). Don't know if that's what is happening. Last execution, it stopped after approximately 15 minutes, but I've seen it crash sooner and later.


The full error message is as follows:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1c pc=0xb4e221]

goroutine 1265 [running]:
github.com/hyperledger/fabric/vendor/github.com/syndtr/goleveldb/leveldb.(*dbIter).Release(0xc0021fc510)
/opt/gopath/src/github.com/hyperledger/fabric/vendor/github.com/syndtr/goleveldb/leveldb/db_iter.go:343 +0xe1
github.com/hyperledger/fabric/core/ledger/kvledger/txmgmt/statedb/stateleveldb.(*kvScanner).Close(0xc003ce1fb0)
/opt/gopath/src/github.com/hyperledger/fabric/core/ledger/kvledger/txmgmt/statedb/stateleveldb/stateleveldb.go:265 +0x34
github.com/hyperledger/fabric/core/ledger/kvledger/txmgmt/txmgr/lockbasedtxmgr.(*resultsItr).Close(0xc007fefe00)
/opt/gopath/src/github.com/hyperledger/fabric/core/ledger/kvledger/txmgmt/txmgr/lockbasedtxmgr/helper.go:422 +0x34
github.com/hyperledger/fabric/core/chaincode.(*TransactionContext).CloseQueryIterators(0xc0050ee090)
/opt/gopath/src/github.com/hyperledger/fabric/core/chaincode/transaction_context.go:113 +0xd7
github.com/hyperledger/fabric/core/chaincode.(*Handler).Notify(0xc000494120, 0xc005926400)
/opt/gopath/src/github.com/hyperledger/fabric/core/chaincode/handler.go:538 +0x1db
github.com/hyperledger/fabric/core/chaincode.(*Handler).handleMessageReadyState(0xc000494120, 0xc005926400, 0x42, 0xc003ac1950)
/opt/gopath/src/github.com/hyperledger/fabric/core/chaincode/handler.go:204 +0x233
github.com/hyperledger/fabric/core/chaincode.(*Handler).handleMessage(0xc000494120, 0xc005926400, 0x3, 0x0)
/opt/gopath/src/github.com/hyperledger/fabric/core/chaincode/handler.go:185 +0x187
github.com/hyperledger/fabric/core/chaincode.(*Handler).ProcessStream(0xc000494120, 0x7f6f0247ba30, 0xc0031cc170, 0x0, 0x0)
/opt/gopath/src/github.com/hyperledger/fabric/core/chaincode/handler.go:435 +0x40e
github.com/hyperledger/fabric/core/chaincode.(*ChaincodeSupport).HandleChaincodeStream(0xc00054c360, 0x7f6f0247ba30, 0xc0031cc170, 0x7f6f0247ba30, 0xc0031cc170)
/opt/gopath/src/github.com/hyperledger/fabric/core/chaincode/chaincode_support.go:191 +0x362
github.com/hyperledger/fabric/core/chaincode.(*ChaincodeSupport).Register(0xc00054c360, 0x13bd860, 0xc0031cc170, 0x139c300, 0xc00054c360)
/opt/gopath/src/github.com/hyperledger/fabric/core/chaincode/chaincode_support.go:196 +0x63
github.com/hyperledger/fabric/protos/peer._ChaincodeSupport_Register_Handler(0x118c540, 0xc00054c360, 0x13bbe80, 0xc0005769a0, 0x1fc5e60, 0xc0032ef230)
/opt/gopath/src/github.com/hyperledger/fabric/protos/peer/chaincode_shim.pb.go:1066 +0xad
github.com/hyperledger/fabric/vendor/google.golang.org/grpc.(*Server).processStreamingRPC(0xc00031c000, 0x13bf540, 0xc002258a80, 0xc0003f3000, 0xc0003487b0, 0x1ea8a40, 0xc0032ef200, 0x0, 0x0)
/opt/gopath/src/github.com/hyperledger/fabric/vendor/google.golang.org/grpc/server.go:1124 +0x8bd
github.com/hyperledger/fabric/vendor/google.golang.org/grpc.(*Server).handleStream(0xc00031c000, 0x13bf540, 0xc002258a80, 0xc0003f3000, 0xc0032ef200)
/opt/gopath/src/github.com/hyperledger/fabric/vendor/google.golang.org/grpc/server.go:1212 +0x12aa
github.com/hyperledger/fabric/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0039e3ba0, 0xc00031c000, 0x13bf540, 0xc002258a80, 0xc0003f3000)
/opt/gopath/src/github.com/hyperledger/fabric/vendor/google.golang.org/grpc/server.go:686 +0x9f
created by github.com/hyperledger/fabric/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
/opt/gopath/src/github.com/hyperledger/fabric/vendor/google.golang.org/grpc/server.go:684 +0xa1



De: David Enyeart <enyeart@...>
Enviado: quarta-feira, 10 de fevereiro de 2021 19:31
Para: Tomás Peixinho <tom.peixinho@...>
Cc: fabric@... <fabric@...>
Assunto: Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
 

You would need to provide the full stack trace from the panic message in order to troubleshoot.


Dave Enyeart

"Tomás Peixinho" ---02/10/2021 01:34:13 PM---Good afternoon fellow colleagues, I'm developing an application using Hyperledger Fabric and I've fo

From: "Tomás Peixinho" <tom.peixinho@...>
To: "fabric@..." <fabric@...>
Date: 02/10/2021 01:34 PM
Subject: [EXTERNAL] [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
Sent by: fabric@...





Good afternoon fellow colleagues, I'm developing an application...
This Message Is From an External Sender
This message came from outside your organization.
Good afternoon fellow colleagues,

I'm developing an application using Hyperledger Fabric and I've found a weird error that I'm not sure how to fix.

I'm running everything on just one machine, with two orgs, one peer on each, and one orderer. My application receives a file with multiple lines (right now, around 5000), which correspond to one transaction each (technically, two transactions each, because I have to do a query by partial key, and then a separate transaction for the actual update, as per recommendation of Dave Enyart, from a previous question of mine). So, 5000 lines, around 10000 transactions. They are being handled concurrently using threads. And the peers are being alternated, so that one doesn't happen to take all the load, while the other does nothing.

My problem now is that it usually ends with one of the peers randomly crashing with an Exit Code 2, and the logs from that peer saying "panic: runtime error: invalid memory address or nil pointer dereference". I say randomly because it can happen after a 1000 lines (2000 transactions) or 4000 lines (8000 transactions), which is what I find weird!

I've read that it could be because of problems with certificates or with misconfigured paths for them, but I feel that if that were the case, it would happen much sooner, or actually right at the start.

My question now is (and I know that I haven't provided much information or any logs, and that it's difficult to find the problem like this) if there is any obvious situation in which this might happen? Or if there are a few cases that might lead to this error? Something that might point me in the right direction of what I need to be looking at to see if I have any configuration problems. Or could it simply be a problem of my physical machine? Could it be just a memory problem, which crashes the peer if the traffic gets too heavy?

These errors are really not obvious sometimes, and I was just wondering if someone might have some insight as to what the problem might be, based on previous experiences or something.

Thanks in advance

Tomás




Manish
 

Hi Tomás,

I had seen something similar in a test flake in the past and the issue was a race condition between the closing of the levelDB and acquiring an iterator. I am not sure if that's app;icable here still, to clear any doubts, can you check your leveldb logs (located at {peer.fileSystemPath in core.yaml}/ledgersData/stateLeveldb/LOG near the time of this panic. Do you see something like this - "db@close"? If not, sharing the log may help further.

Thanks,
Manish

On Wed, Feb 10, 2021 at 2:57 PM Tomás Peixinho <tom.peixinho@...> wrote:
I was reading about similar issues in fabric version 1.4.2 (the images I'm using on my containers are 1.4.1, so I'm not sure if it is related at all or not), and how it was related to a certain timeout when one of the containers stops (from an issue on Jira from August 2019, https://jira.hyperledger.org/browse/FAB-16292). Don't know if that's what is happening. Last execution, it stopped after approximately 15 minutes, but I've seen it crash sooner and later.


The full error message is as follows:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1c pc=0xb4e221]

goroutine 1265 [running]:
github.com/hyperledger/fabric/protos/peer._ChaincodeSupport_Register_Handler(0x118c540, 0xc00054c360, 0x13bbe80, 0xc0005769a0, 0x1fc5e60, 0xc0032ef230)
github.com/hyperledger/fabric/vendor/google.golang.org/grpc.(*Server).processStreamingRPC(0xc00031c000, 0x13bf540, 0xc002258a80, 0xc0003f3000, 0xc0003487b0, 0x1ea8a40, 0xc0032ef200, 0x0, 0x0)
/opt/gopath/src/github.com/hyperledger/fabric/vendor/google.golang.org/grpc/server.go:684 +0xa1



De: David Enyeart <enyeart@...>
Enviado: quarta-feira, 10 de fevereiro de 2021 19:31
Para: Tomás Peixinho <tom.peixinho@...>
Cc: fabric@... <fabric@...>
Assunto: Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
 

You would need to provide the full stack trace from the panic message in order to troubleshoot.


Dave Enyeart

"Tomás Peixinho" ---02/10/2021 01:34:13 PM---Good afternoon fellow colleagues, I'm developing an application using Hyperledger Fabric and I've fo

From: "Tomás Peixinho" <tom.peixinho@...>
To: "fabric@..." <fabric@...>
Date: 02/10/2021 01:34 PM
Subject: [EXTERNAL] [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
Sent by: fabric@...





Good afternoon fellow colleagues, I'm developing an application...
This Message Is From an External Sender
This message came from outside your organization.
Good afternoon fellow colleagues,

I'm developing an application using Hyperledger Fabric and I've found a weird error that I'm not sure how to fix.

I'm running everything on just one machine, with two orgs, one peer on each, and one orderer. My application receives a file with multiple lines (right now, around 5000), which correspond to one transaction each (technically, two transactions each, because I have to do a query by partial key, and then a separate transaction for the actual update, as per recommendation of Dave Enyart, from a previous question of mine). So, 5000 lines, around 10000 transactions. They are being handled concurrently using threads. And the peers are being alternated, so that one doesn't happen to take all the load, while the other does nothing.

My problem now is that it usually ends with one of the peers randomly crashing with an Exit Code 2, and the logs from that peer saying "panic: runtime error: invalid memory address or nil pointer dereference". I say randomly because it can happen after a 1000 lines (2000 transactions) or 4000 lines (8000 transactions), which is what I find weird!

I've read that it could be because of problems with certificates or with misconfigured paths for them, but I feel that if that were the case, it would happen much sooner, or actually right at the start.

My question now is (and I know that I haven't provided much information or any logs, and that it's difficult to find the problem like this) if there is any obvious situation in which this might happen? Or if there are a few cases that might lead to this error? Something that might point me in the right direction of what I need to be looking at to see if I have any configuration problems. Or could it simply be a problem of my physical machine? Could it be just a memory problem, which crashes the peer if the traffic gets too heavy?

These errors are really not obvious sometimes, and I was just wondering if someone might have some insight as to what the problem might be, based on previous experiences or something.

Thanks in advance

Tomás




Tomás Peixinho
 

I can't seem to find that file. I can't check from inside the peer, because the peer exited. If I try to run "docker-compose run" with the peer id, it says "service not found", which seems expected, and I can't exec to it because it is down. And from the outside, I can't seem to find any logs in "/var/lib/docker/volumes/network_peer1.org1.example.com/_data", which I think is the right path. That last folder should have a ledgersData subfolder in it but it's completely empty. Am I looking in the wrong place?

As for the logs, the one from the peer, apart from the final stack trace which I have already shared, don't really show me anything, it's just blocks being received and transactions being commited. From the orderer, there's nothing out of the ordinary, it's just grpc calls from what I assume are the blocks being sent to the peers.

Also, I can't seem to find the core.yaml file. Any idea where it might be?


De: Manish Sethi <manish.sethi@...>
Enviado: quarta-feira, 10 de fevereiro de 2021 21:21
Para: Tomás Peixinho <tom.peixinho@...>
Cc: David Enyeart <enyeart@...>; fabric@... <fabric@...>
Assunto: Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
 
Hi Tomás,

I had seen something similar in a test flake in the past and the issue was a race condition between the closing of the levelDB and acquiring an iterator. I am not sure if that's app;icable here still, to clear any doubts, can you check your leveldb logs (located at {peer.fileSystemPath in core.yaml}/ledgersData/stateLeveldb/LOG near the time of this panic. Do you see something like this - "db@close"? If not, sharing the log may help further.

Thanks,
Manish

On Wed, Feb 10, 2021 at 2:57 PM Tomás Peixinho <tom.peixinho@...> wrote:

I was reading about similar issues in fabric version 1.4.2 (the images I'm using on my containers are 1.4.1, so I'm not sure if it is related at all or not), and how it was related to a certain timeout when one of the containers stops (from an issue on Jira from August 2019, https://jira.hyperledger.org/browse/FAB-16292). Don't know if that's what is happening. Last execution, it stopped after approximately 15 minutes, but I've seen it crash sooner and later.


The full error message is as follows:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1c pc=0xb4e221]

goroutine 1265 [running]:
github.com/hyperledger/fabric/protos/peer._ChaincodeSupport_Register_Handler(0x118c540, 0xc00054c360, 0x13bbe80, 0xc0005769a0, 0x1fc5e60, 0xc0032ef230)
github.com/hyperledger/fabric/vendor/google.golang.org/grpc.(*Server).processStreamingRPC(0xc00031c000, 0x13bf540, 0xc002258a80, 0xc0003f3000, 0xc0003487b0, 0x1ea8a40, 0xc0032ef200, 0x0, 0x0)
/opt/gopath/src/github.com/hyperledger/fabric/vendor/google.golang.org/grpc/server.go:684 +0xa1



De: David Enyeart <enyeart@...>
Enviado: quarta-feira, 10 de fevereiro de 2021 19:31
Para: Tomás Peixinho <tom.peixinho@...>
Cc: fabric@... <fabric@...>
Assunto: Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
 

You would need to provide the full stack trace from the panic message in order to troubleshoot.


Dave Enyeart

"Tomás Peixinho" ---02/10/2021 01:34:13 PM---Good afternoon fellow colleagues, I'm developing an application using Hyperledger Fabric and I've fo

From: "Tomás Peixinho" <tom.peixinho@...>
To: "fabric@..." <fabric@...>
Date: 02/10/2021 01:34 PM
Subject: [EXTERNAL] [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
Sent by: fabric@...





Good afternoon fellow colleagues, I'm developing an application...
This Message Is From an External Sender
This message came from outside your organization.
Good afternoon fellow colleagues,

I'm developing an application using Hyperledger Fabric and I've found a weird error that I'm not sure how to fix.

I'm running everything on just one machine, with two orgs, one peer on each, and one orderer. My application receives a file with multiple lines (right now, around 5000), which correspond to one transaction each (technically, two transactions each, because I have to do a query by partial key, and then a separate transaction for the actual update, as per recommendation of Dave Enyart, from a previous question of mine). So, 5000 lines, around 10000 transactions. They are being handled concurrently using threads. And the peers are being alternated, so that one doesn't happen to take all the load, while the other does nothing.

My problem now is that it usually ends with one of the peers randomly crashing with an Exit Code 2, and the logs from that peer saying "panic: runtime error: invalid memory address or nil pointer dereference". I say randomly because it can happen after a 1000 lines (2000 transactions) or 4000 lines (8000 transactions), which is what I find weird!

I've read that it could be because of problems with certificates or with misconfigured paths for them, but I feel that if that were the case, it would happen much sooner, or actually right at the start.

My question now is (and I know that I haven't provided much information or any logs, and that it's difficult to find the problem like this) if there is any obvious situation in which this might happen? Or if there are a few cases that might lead to this error? Something that might point me in the right direction of what I need to be looking at to see if I have any configuration problems. Or could it simply be a problem of my physical machine? Could it be just a memory problem, which crashes the peer if the traffic gets too heavy?

These errors are really not obvious sometimes, and I was just wondering if someone might have some insight as to what the problem might be, based on previous experiences or something.

Thanks in advance

Tomás




David Enyeart
 

For finding the peer files, I find it easier to change the peer's volume in docker-compose-test-net.yaml (if using test network) to a local directory like this (last entry):

volumes:
- /var/run/docker.sock:/host/var/run/docker.sock
- ../organizations/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/msp:/etc/hyperledger/fabric/msp
- ../organizations/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/tls:/etc/hyperledger/fabric/tls
- /tmp/fabric/data/peer0.org1.example.com:/var/hyperledger/production

Then even after peer crash you'll find the statedb files at:
/tmp/fabric/data/peer0.org1.example.com/ledgersData/stateLeveldb

core.yaml can be found within the peer container at /etc/hyperledger/fabric/core.yaml (it is not mapped to an external volume).

Since this thread is getting involved, I'd encourage you to open a Jira bug where you can post the files - https://jira.hyperledger.org/issues/?filter=13110


Dave Enyeart

"Tomás Peixinho" ---02/10/2021 05:24:29 PM---I can't seem to find that file. I can't check from inside the peer, because the peer exited. If I tr

From: "Tomás Peixinho" <tom.peixinho@...>
To: Manish Sethi <manish.sethi@...>
Cc: "fabric@..." <fabric@...>
Date: 02/10/2021 05:24 PM
Subject: [EXTERNAL] Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
Sent by: fabric@...





I can't seem to find that file. I can't check from inside the peer,...
This Message Is From an External Sender
This message came from outside your organization.
I can't seem to find that file. I can't check from inside the peer, because the peer exited. If I try to run "docker-compose run" with the peer id, it says "service not found", which seems expected, and I can't exec to it because it is down. And from the outside, I can't seem to find any logs in "/var/lib/docker/volumes/network_peer1.org1.example.com/_data", which I think is the right path. That last folder should have a ledgersData subfolder in it but it's completely empty. Am I looking in the wrong place?

As for the logs, the one from the peer, apart from the final stack trace which I have already shared, don't really show me anything, it's just blocks being received and transactions being commited. From the orderer, there's nothing out of the ordinary, it's just grpc calls from what I assume are the blocks being sent to the peers.

Also, I can't seem to find the core.yaml file. Any idea where it might be?



De: Manish Sethi <manish.sethi@...>
Enviado:
quarta-feira, 10 de fevereiro de 2021 21:21
Para:
Tomás Peixinho <tom.peixinho@...>
Cc:
David Enyeart <enyeart@...>; fabric@... <fabric@...>
Assunto:
Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions

Hi Tomás,

I had seen something similar in a test flake in the past and the issue was a race condition between the closing of the levelDB and acquiring an iterator. I am not sure if that's app;icable here still, to clear any doubts, can you check your leveldb logs (located at {peer.fileSystemPath in core.yaml}/ledgersData/stateLeveldb/LOG near the time of this panic. Do you see something like this - "db@close"? If not, sharing the log may help further.

Thanks,
Manish

On Wed, Feb 10, 2021 at 2:57 PM Tomás Peixinho <tom.peixinho@...> wrote:




Tomás Peixinho
 

After checking the logs (with some help from Dave Enyart), I don't see any "db@close". I see the "db@open" and the hours seem correct. 

18:49:21.592582 log@legend F·NumFile S·FileSize N·Entry C·BadEntry B·BadBlock Ke·KeyError D·DroppedEntry L·Level Q·SeqNum T·TimeElapsed
18:49:21.696795 db@open opening
18:49:21.698575 version@stat F·[] S·0B[] Sc·[]
18:49:21.702405 db@janitor F·2 G·0
18:49:21.702453 db@open done T·5.62227ms

I assume the "db@close" would be an entry if the peer had exited correctly, or am I thinking about it wrong? 


De: Manish Sethi <manish.sethi@...>
Enviado: quarta-feira, 10 de fevereiro de 2021 21:21
Para: Tomás Peixinho <tom.peixinho@...>
Cc: David Enyeart <enyeart@...>; fabric@... <fabric@...>
Assunto: Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
 
Hi Tomás,

I had seen something similar in a test flake in the past and the issue was a race condition between the closing of the levelDB and acquiring an iterator. I am not sure if that's app;icable here still, to clear any doubts, can you check your leveldb logs (located at {peer.fileSystemPath in core.yaml}/ledgersData/stateLeveldb/LOG near the time of this panic. Do you see something like this - "db@close"? If not, sharing the log may help further.

Thanks,
Manish

On Wed, Feb 10, 2021 at 2:57 PM Tomás Peixinho <tom.peixinho@...> wrote:

I was reading about similar issues in fabric version 1.4.2 (the images I'm using on my containers are 1.4.1, so I'm not sure if it is related at all or not), and how it was related to a certain timeout when one of the containers stops (from an issue on Jira from August 2019, https://jira.hyperledger.org/browse/FAB-16292). Don't know if that's what is happening. Last execution, it stopped after approximately 15 minutes, but I've seen it crash sooner and later.


The full error message is as follows:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1c pc=0xb4e221]

goroutine 1265 [running]:
github.com/hyperledger/fabric/protos/peer._ChaincodeSupport_Register_Handler(0x118c540, 0xc00054c360, 0x13bbe80, 0xc0005769a0, 0x1fc5e60, 0xc0032ef230)
github.com/hyperledger/fabric/vendor/google.golang.org/grpc.(*Server).processStreamingRPC(0xc00031c000, 0x13bf540, 0xc002258a80, 0xc0003f3000, 0xc0003487b0, 0x1ea8a40, 0xc0032ef200, 0x0, 0x0)
/opt/gopath/src/github.com/hyperledger/fabric/vendor/google.golang.org/grpc/server.go:684 +0xa1



De: David Enyeart <enyeart@...>
Enviado: quarta-feira, 10 de fevereiro de 2021 19:31
Para: Tomás Peixinho <tom.peixinho@...>
Cc: fabric@... <fabric@...>
Assunto: Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
 

You would need to provide the full stack trace from the panic message in order to troubleshoot.


Dave Enyeart

"Tomás Peixinho" ---02/10/2021 01:34:13 PM---Good afternoon fellow colleagues, I'm developing an application using Hyperledger Fabric and I've fo

From: "Tomás Peixinho" <tom.peixinho@...>
To: "fabric@..." <fabric@...>
Date: 02/10/2021 01:34 PM
Subject: [EXTERNAL] [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
Sent by: fabric@...





Good afternoon fellow colleagues, I'm developing an application...
This Message Is From an External Sender
This message came from outside your organization.
Good afternoon fellow colleagues,

I'm developing an application using Hyperledger Fabric and I've found a weird error that I'm not sure how to fix.

I'm running everything on just one machine, with two orgs, one peer on each, and one orderer. My application receives a file with multiple lines (right now, around 5000), which correspond to one transaction each (technically, two transactions each, because I have to do a query by partial key, and then a separate transaction for the actual update, as per recommendation of Dave Enyart, from a previous question of mine). So, 5000 lines, around 10000 transactions. They are being handled concurrently using threads. And the peers are being alternated, so that one doesn't happen to take all the load, while the other does nothing.

My problem now is that it usually ends with one of the peers randomly crashing with an Exit Code 2, and the logs from that peer saying "panic: runtime error: invalid memory address or nil pointer dereference". I say randomly because it can happen after a 1000 lines (2000 transactions) or 4000 lines (8000 transactions), which is what I find weird!

I've read that it could be because of problems with certificates or with misconfigured paths for them, but I feel that if that were the case, it would happen much sooner, or actually right at the start.

My question now is (and I know that I haven't provided much information or any logs, and that it's difficult to find the problem like this) if there is any obvious situation in which this might happen? Or if there are a few cases that might lead to this error? Something that might point me in the right direction of what I need to be looking at to see if I have any configuration problems. Or could it simply be a problem of my physical machine? Could it be just a memory problem, which crashes the peer if the traffic gets too heavy?

These errors are really not obvious sometimes, and I was just wondering if someone might have some insight as to what the problem might be, based on previous experiences or something.

Thanks in advance

Tomás




Manish
 

I would have expected more in the leveldb logs after opening, as you get this crash after a while. Anyways, looking at goleveldb code, the other possibility is that in the Fabric code, under some circumstances, we end up invoking this Release function in parallel goroutine.

I'll suggest opening a Jira for this and attach the Fabric logs and leveldb logs.

Thanks,
Manish

On Thu, Feb 11, 2021 at 5:09 PM Tomás Peixinho <tom.peixinho@...> wrote:
After checking the logs (with some help from Dave Enyart), I don't see any "db@close". I see the "db@open" and the hours seem correct. 

18:49:21.592582 log@legend F·NumFile S·FileSize N·Entry C·BadEntry B·BadBlock Ke·KeyError D·DroppedEntry L·Level Q·SeqNum T·TimeElapsed
18:49:21.696795 db@open opening
18:49:21.698575 version@stat F·[] S·0B[] Sc·[]
18:49:21.702405 db@janitor F·2 G·0
18:49:21.702453 db@open done T·5.62227ms

I assume the "db@close" would be an entry if the peer had exited correctly, or am I thinking about it wrong? 


De: Manish Sethi <manish.sethi@...>
Enviado: quarta-feira, 10 de fevereiro de 2021 21:21
Para: Tomás Peixinho <tom.peixinho@...>
Cc: David Enyeart <enyeart@...>; fabric@... <fabric@...>
Assunto: Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
 
Hi Tomás,

I had seen something similar in a test flake in the past and the issue was a race condition between the closing of the levelDB and acquiring an iterator. I am not sure if that's app;icable here still, to clear any doubts, can you check your leveldb logs (located at {peer.fileSystemPath in core.yaml}/ledgersData/stateLeveldb/LOG near the time of this panic. Do you see something like this - "db@close"? If not, sharing the log may help further.

Thanks,
Manish

On Wed, Feb 10, 2021 at 2:57 PM Tomás Peixinho <tom.peixinho@...> wrote:
I was reading about similar issues in fabric version 1.4.2 (the images I'm using on my containers are 1.4.1, so I'm not sure if it is related at all or not), and how it was related to a certain timeout when one of the containers stops (from an issue on Jira from August 2019, https://jira.hyperledger.org/browse/FAB-16292). Don't know if that's what is happening. Last execution, it stopped after approximately 15 minutes, but I've seen it crash sooner and later.


The full error message is as follows:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1c pc=0xb4e221]

goroutine 1265 [running]:
github.com/hyperledger/fabric/protos/peer._ChaincodeSupport_Register_Handler(0x118c540, 0xc00054c360, 0x13bbe80, 0xc0005769a0, 0x1fc5e60, 0xc0032ef230)
github.com/hyperledger/fabric/vendor/google.golang.org/grpc.(*Server).processStreamingRPC(0xc00031c000, 0x13bf540, 0xc002258a80, 0xc0003f3000, 0xc0003487b0, 0x1ea8a40, 0xc0032ef200, 0x0, 0x0)
/opt/gopath/src/github.com/hyperledger/fabric/vendor/google.golang.org/grpc/server.go:684 +0xa1



De: David Enyeart <enyeart@...>
Enviado: quarta-feira, 10 de fevereiro de 2021 19:31
Para: Tomás Peixinho <tom.peixinho@...>
Cc: fabric@... <fabric@...>
Assunto: Re: [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
 

You would need to provide the full stack trace from the panic message in order to troubleshoot.


Dave Enyeart

"Tomás Peixinho" ---02/10/2021 01:34:13 PM---Good afternoon fellow colleagues, I'm developing an application using Hyperledger Fabric and I've fo

From: "Tomás Peixinho" <tom.peixinho@...>
To: "fabric@..." <fabric@...>
Date: 02/10/2021 01:34 PM
Subject: [EXTERNAL] [Hyperledger Fabric] Weird "panic: runtime error" after multiple transactions
Sent by: fabric@...





Good afternoon fellow colleagues, I'm developing an application...
This Message Is From an External Sender
This message came from outside your organization.
Good afternoon fellow colleagues,

I'm developing an application using Hyperledger Fabric and I've found a weird error that I'm not sure how to fix.

I'm running everything on just one machine, with two orgs, one peer on each, and one orderer. My application receives a file with multiple lines (right now, around 5000), which correspond to one transaction each (technically, two transactions each, because I have to do a query by partial key, and then a separate transaction for the actual update, as per recommendation of Dave Enyart, from a previous question of mine). So, 5000 lines, around 10000 transactions. They are being handled concurrently using threads. And the peers are being alternated, so that one doesn't happen to take all the load, while the other does nothing.

My problem now is that it usually ends with one of the peers randomly crashing with an Exit Code 2, and the logs from that peer saying "panic: runtime error: invalid memory address or nil pointer dereference". I say randomly because it can happen after a 1000 lines (2000 transactions) or 4000 lines (8000 transactions), which is what I find weird!

I've read that it could be because of problems with certificates or with misconfigured paths for them, but I feel that if that were the case, it would happen much sooner, or actually right at the start.

My question now is (and I know that I haven't provided much information or any logs, and that it's difficult to find the problem like this) if there is any obvious situation in which this might happen? Or if there are a few cases that might lead to this error? Something that might point me in the right direction of what I need to be looking at to see if I have any configuration problems. Or could it simply be a problem of my physical machine? Could it be just a memory problem, which crashes the peer if the traffic gets too heavy?

These errors are really not obvious sometimes, and I was just wondering if someone might have some insight as to what the problem might be, based on previous experiences or something.

Thanks in advance

Tomás