Generate TLS certificates using CA and not cryptogen #fabric #fabric-ca #fabricca


Jean-Gaël Dominé <jgdomine@...>
 

Hi,

I'm looking for help understanding how to use the Fabric CA to generate the TLS certificates, private keys, etc. for my components (peers, orderer).
I'm currently moving from cryptogen, which as I understand it is meant for development, to the CA, which is better suited to production.

I currently have a CLI that uses fabric-ca-client to connect to the CA and to register and enroll the different components. I use the same approach to get the TLS artifacts.
My setup is based on this example: https://github.com/aidtechnology/hgf-k8s-workshop/tree/master/prod_example.

Everything works fine (channel creation, channel join, ..., chaincode instantiation) until I try to enable TLS.
Then I get the following error:
2019-05-24 08:40:29.232 UTC [grpc] newHTTP2Transport -> DEBU 17a grpc: Server.Serve failed to create ServerTransport: connection error: desc = "transport: http2Server.HandleStreams failed to receive the preface from client: EOF"
2019-05-24 08:42:28.023 UTC [grpc] newHTTP2Transport -> DEBU 17b grpc: Server.Serve failed to create ServerTransport: connection error: desc = "transport: http2Server.HandleStreams failed to receive the preface from client: read tcp 10.50.131.94:7050->10.50.131.97:36244: read: connection reset by peer"
The two IPs are those of the orderer and of the REST API (which is based on the balance-transfer sample).

I managed to figure out that the problem comes from the TLS artifacts I get from the CA.
As a test, I kept generating all the normal artifacts (msp) with the CA but used cryptogen for the TLS artifacts.

In that case everything works again, so I assume the issue lies in the way I call the CA for the TLS artifacts.
The only reference I found on how to do it is this one:
https://stackoverflow.com/questions/54058674/how-to-generate-tls-certificate-and-keys-with-fabric-ca

So I ended up doing this:
fabric-ca-client enroll --enrollment.profile tls -m orderer-miles-com -u https://ord:OrdPW@$CA_URL -M ./crypto-config/ordererOrganizations/miles-com/orderer-miles-com/tls --tls.certfiles /etc/hyperledger/tls/ca-miles-com-cert.pem

It gives me a tls folder containing files in 3 sub-folders: keystore, signcerts and tlscacerts.

It's worth noting that the fabric-ca-client command uses the default fabric-ca-client-config.yaml that it creates on the fly. That may be the problem; I don't know...

Does someone have any idea what I'm doing wrong?

Thank you in advance for your help


Nye Liu <nye@...>
 

Some of my notes are here:

https://jira.hyperledger.org/browse/FABC-60

When you say "enable TLS" do you mean mutual TLS?

On 5/24/2019 5:23 AM, Jean-Gaël Dominé wrote:

[...]


Jean-Gaël Dominé <jgdomine@...>
 

Hi,

Thank you for your answer.

By enabling TLS, I mean setting those parameters to true:

ORDERER_GENERAL_TLS_ENABLED
CORE_PEER_TLS_ENABLED
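
For reference, here is a minimal sketch of the peer-side settings that usually accompany that flag. It assumes the TLS material comes from a fabric-ca-client enrollment folder with the keystore, signcerts and tlscacerts sub-folders described earlier in the thread. The helper name print_peer_tls_env is hypothetical, and it simply picks the first file found in each sub-folder, which is an assumption that only holds when each folder contains a single file.

```shell
# Print peer TLS settings for an enrollment output folder (sketch only).
# Assumed layout: <tls_dir>/signcerts, <tls_dir>/keystore, <tls_dir>/tlscacerts.
print_peer_tls_env() {
  tls_dir=$1
  # File names differ between enrollments, so take the first file in each folder.
  cert=$(find "$tls_dir/signcerts" -type f 2>/dev/null | sort | head -n 1)
  key=$(find "$tls_dir/keystore" -type f 2>/dev/null | sort | head -n 1)
  root=$(find "$tls_dir/tlscacerts" -type f 2>/dev/null | sort | head -n 1)
  if [ -z "$cert" ] || [ -z "$key" ] || [ -z "$root" ]; then
    echo "missing TLS material under $tls_dir" >&2
    return 1
  fi
  echo "CORE_PEER_TLS_ENABLED=true"
  echo "CORE_PEER_TLS_CERT_FILE=$cert"
  echo "CORE_PEER_TLS_KEY_FILE=$key"
  echo "CORE_PEER_TLS_ROOTCERT_FILE=$root"
}
```

The printed lines can be reviewed and then copied into the peer's environment. The orderer has analogous ORDERER_GENERAL_TLS_* settings, which are not covered by this sketch.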

I will take a look at your notes to see whether they help me find a solution.


Jean-Gaël Dominé <jgdomine@...>
 

Hi,

So I studied the JIRA ticket you gave me (and the links I found in it) to try to understand how to make server TLS work on my network (not mutual TLS, since I don't want to enable client TLS before server TLS is working).

Have you managed to make it work in your case? If so, how did you do it?

Did you create a multi-root CA (using the --cacount or --cafiles options) because of the problem you talked about?
=> However, it seems the CA server can only sign using FABRIC_CA_SERVER_CA_KEYFILE since FABRIC_CA_SERVER_TLS_KEYFILE is only used in the TLS server endpoint.
I'm a bit lost as to how the CA works...

Thank you for your help 
 


Nye Liu <nye@...>
 

Yes, I started 3 separate CA server instances in the same container using two --cafiles arguments (the default CA plus one per --cafiles argument).

On 5/29/2019 5:31 AM, Jean-Gaël Dominé wrote:

[...]
 


Jean-Gaël Dominé <jgdomine@...>
 

Hi,

Thank you for your answer.

I've tried to set up the same configuration, but to no avail.
I have attached my CA configuration files so that you can check whether I did something obviously wrong.

I also have a question: how can I generate the following artifacts, which I have to give to the CA at startup?
tls:
  # Enable TLS (default: false)
  enabled: true
  # TLS for the server's listening port
  certfile: /etc/hyperledger/tls/ca-org1-miles-com-cert.pem
  keyfile: /etc/hyperledger/tls/org1-miles-com-ca-keystore

Previously I was using the artifacts provided by cryptogen, and now I do not know how to generate them.

As for the enrollment command, is this correct?
fabric-ca-client enroll --enrollment.profile tls -m orderer-miles-com -u https://ord:OrdPW@$CA_URL -M ./crypto-config/ordererOrganizations/miles-com/orderer-miles-com/tls --tls.certfiles /etc/hyperledger/tls/ca-miles-com-cert.pem

Thank you again for your help


Jean-Gaël Dominé <jgdomine@...>
 

After a lot of struggle, I managed to make progress without using a multi-root CA. My issue was that neither the Common Name nor the SAN of my certificates matched the name of the component they were associated with.

My workaround was to override the SAN using the --csr.hosts option of the fabric-ca-client command.
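
A minimal sketch of that workaround follows. It only prints the enrollment command (a dry run) so it can be reviewed before being executed. The helper name print_tls_enroll_cmd is hypothetical, and ca.example:7054 is a placeholder for $CA_URL; the flags are the ones already used in this thread.

```shell
# Print (dry run) a TLS enrollment command whose SAN matches the name that
# other nodes use to reach the component. Nothing is executed here.
print_tls_enroll_cmd() {
  host=$1     # network name of the component, e.g. orderer-miles-com
  url=$2      # enrollment URL including the identity's login and password
  outdir=$3   # folder that receives keystore, signcerts and tlscacerts
  cafile=$4   # certificate used to trust the CA's own TLS endpoint
  printf 'fabric-ca-client enroll --enrollment.profile tls -m %s --csr.hosts %s -u %s -M %s --tls.certfiles %s\n' \
    "$host" "$host" "$url" "$outdir" "$cafile"
}

# Example call; ca.example:7054 is a placeholder, not a real endpoint.
print_tls_enroll_cmd orderer-miles-com 'https://ord:OrdPW@ca.example:7054' \
  ./crypto-config/ordererOrganizations/miles-com/orderer-miles-com/tls \
  /etc/hyperledger/tls/ca-miles-com-cert.pem
```

Passing the same value to -m and --csr.hosts is a choice made for this sketch; --csr.hosts also accepts a comma-separated list if the component is reachable under several names.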

I still have an issue, though, that prevents the orderer and peers from communicating (I get many TLS handshake errors). It seems to me that the problem comes from the tlsca certificate I get back from the enrollment process.

For instance, when looking at a peer tlsca certificate obtained using cryptogen, here is what it contains:

[screenshot not preserved]

And when I look at the one obtained using the CA client, I see the root CA...



NB: by tlsca certificate, I mean the file located in the tlsca sub-folder of the tls folder



Does somebody have an idea why it does that and how to solve this?

Thank you


Nye Liu <nye@...>
 

Please don't put external links in your emails; many of us have them blocked.

Instead just copy/paste the actual text, which is also preferable to screen shots.

Thanks!

On 9/11/2019 4:53 AM, Jean-Gaël Dominé wrote:

[...]


Jean-Gaël Dominé <jgdomine@...>
 

Hi SteveLiuu,

I'll answer here as it could also help other people struggling with the same problem.

So I finally progressed on the orderer<->peer communication that was raising TLS handshake errors:

Orderer logs:

2019-09-26 11:38:26.715 UTC [core.comm] ServerHandshake -> ERRO 00f TLS handshake failed with error remote error: tls: bad certificate server=Orderer remoteaddress=10.50.129.2:46848
2019-09-26 11:38:26.715 UTC [grpc] handleRawConn -> DEBU 010 grpc: Server.Serve failed to complete security handshake from "10.50.129.2:46848": remote error: tls: bad certificate

Peer logs:

2019-09-26 11:47:26.124 UTC [grpc] createTransport -> DEBU f93 grpc: addrConn.createTransport failed to connect to {orderer-miles-com:7050 0  <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority". Reconnecting...

My problem was that I didn't know how the orderer and the peers verify each other's identity during the TLS handshake in Fabric, since they don't have each other's CA root certificates. The peers in particular only have their own certificates.
The answer came from this post (the only one that really helped me in the end :) ): https://stackoverflow.com/a/57553669/10378672

In fact, the root certificates are retrieved from the genesis block (I must have missed that explanation in the Fabric documentation). But with the CA, the enroll calls I was making put the generated TLS root certificates in a folder separate from the msp one, and the genesis block creation command (configtxgen) only uses the files inside the msp folder...
So I ended up manually copying the TLS root certificates into the different msp folders (orderer and peers).
Unsurprisingly, the genesis block now contained more certificates than before...
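
The manual copy described above can be sketched as a small helper. It assumes the TLS enrollment wrote its root certificates as .pem files under tls/tlscacerts, and that the destination is the tlscacerts sub-folder of the msp folder (the standard MSP location for TLS root certificates). The function name copy_tls_roots_into_msp is hypothetical.

```shell
# Copy TLS root certificates from an enrollment's tls/tlscacerts folder into
# the msp folder's tlscacerts sub-folder, so that tools reading only the msp
# folder can see them. Returns non-zero if nothing was copied.
copy_tls_roots_into_msp() {
  tls_dir=$1   # e.g. .../orderer-miles-com/tls
  msp_dir=$2   # e.g. .../orderer-miles-com/msp
  mkdir -p "$msp_dir/tlscacerts" || return 1
  found=0
  for cert in "$tls_dir"/tlscacerts/*.pem; do
    [ -f "$cert" ] || continue   # the glob matched nothing
    cp "$cert" "$msp_dir/tlscacerts/" || return 1
    found=1
  done
  [ "$found" -eq 1 ]
}
```

After the copy, the genesis block has to be regenerated with configtxgen so that it includes the additional certificates.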

It instantly solved the issue.

However, I have another issue now with the gossip communication. I have error messages after making the join call.

Here are the logs of peer1 of org1:
2019-09-27 13:53:33.542 UTC [gossip.comm] sendToEndpoint -> WARN 18a Failed obtaining connection for 10.50.133.30:7051, PKIid:8d8455016f555e26952048f7342aeb558d4fff9d7d57af4b3960e5eedcf63c10 reason: context deadline exceeded
2019-09-27 13:53:33.542 UTC [gossip.discovery] expireDeadMembers -> WARN 18b Entering [8d8455016f555e26952048f7342aeb558d4fff9d7d57af4b3960e5eedcf63c10]
2019-09-27 13:53:33.542 UTC [gossip.discovery] expireDeadMembers -> WARN 18c Closing connection to Endpoint: peer2-org1-miles-com:7051, InternalEndpoint: 10.50.133.30:7051, PKI-ID: 8d8455016f555e26952048f7342aeb558d4fff9d7d57af4b3960e5eedcf63c10, Metadata:
2019-09-27 13:53:33.542 UTC [gossip.discovery] expireDeadMembers -> WARN 18d Exiting
2019-09-27 13:53:33.791 UTC [core.comm] ServerHandshake -> ERRO 18e TLS handshake failed with error remote error: tls: bad certificate {"server": "PeerServer", "remote address": "10.50.133.30:58644"}
2019-09-27 13:53:34.261 UTC [grpc] createTransport -> DEBU 191 grpc: addrConn.createTransport failed to connect to {10.50.133.30:7051 0  <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: cannot validate certificate for 10.50.133.30 because it doesn't contain any IP SANs". Reconnecting...
10.50.133.30 is the IP of the peer2 of org1.

I also see this:
2019-09-30 12:46:47.717 UTC [deliveryClient] try -> WARN 0bb Got error: rpc error: code = Canceled desc = context canceled , at 1 attempt. Retrying in 1s
2019-09-30 12:46:47.717 UTC [blocksProvider] DeliverBlocks -> WARN 0bc [mychannel] Receive error: client is closing
2019-09-30 12:47:58.716 UTC [gossip.comm] func1 -> WARN 06d peer0-afkl-miles-com:7051, PKIid:7323baf5d65181d17116dddccbd7794093f43047195d5c73baef6da2bc3b2274 isn't responsive: EOF
2019-09-30 12:47:58.716 UTC [gossip.discovery] expireDeadMembers -> WARN 06e Entering [7323baf5d65181d17116dddccbd7794093f43047195d5c73baef6da2bc3b2274]
2019-09-30 12:47:58.716 UTC [gossip.discovery] expireDeadMembers -> WARN 06f Closing connection to Endpoint: peer0-afkl-miles-com:7051, InternalEndpoint: , PKI-ID: 7323baf5d65181d17116dddccbd7794093f43047195d5c73baef6da2bc3b2274, Metadata:
2019-09-30 12:47:58.716 UTC [gossip.discovery] expireDeadMembers -> WARN 070 Exiting
The network still seems to work fine, but this is not reassuring. I have no idea whether the warnings are linked to the use of CA-generated certificates, or why I now get TLS handshake errors between the peers.

In a nutshell, this topic is not an easy one...

Hope it'll help you solve your own problem

Regards,

JG


 


steveLiuu
 

Hi Jean,

Thanks! I'll try this solution.

BTW, how did you solve the error that you mentioned in your first post?
2019-05-24 08:40:29.232 UTC [grpc] newHTTP2Transport -> DEBU 17a grpc: Server.Serve failed to create ServerTransport: connection error: desc = "transport: http2Server.HandleStreams failed to receive the preface from client: EOF"
2019-05-24 08:42:28.023 UTC [grpc] newHTTP2Transport -> DEBU 17b grpc: Server.Serve failed to create ServerTransport: connection error: desc = "transport: http2Server.HandleStreams failed to receive the preface from client: read tcp 10.50.131.94:7050->10.50.131.97:36244: read: connection reset by peer"
I am facing the same issue but I can't figure out why.


Jean-Gaël Dominé <jgdomine@...>
 

I think it was explained in my post https://lists.hyperledger.org/g/fabric/message/6783.

The problem comes from the fact that the common name of the certificates issued by the CA is the login you give in the enroll command. For example, my peer is named peer0-afkl and my enroll command looks like this (helm syntax with variables):
fabric-ca-client enroll --enrollment.profile tls -m {{ .name }}-{{ $root.Values.network.ordererOrganization.domain }} --csr.hosts {{ .subjectAlternativeName }} \
-u $PROTOCOL://{{ .login }}:{{ .password }}@$CA_URL \
-M ./crypto-config/ordererOrganizations/{{ $root.Values.network.ordererOrganization.domain }}/{{ .name }}-{{ $root.Values.network.ordererOrganization.domain }}/tls $TLS_OPTION

Basically, my .login variable was set to peer0, and I realized that the common name was therefore set to peer0 as well. So when the TLS communication was set up, I got errors because the component peer0-afkl had a certificate that did not match its name. I then added the --csr.hosts option so that peer0-afkl was added as a SAN DNS name. This solved the issue.
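
One way to check whether a certificate actually lists a given name is to inspect its SAN extension. The sketch below assumes the usual text layout printed by `openssl x509 -noout -text` (a "X509v3 Subject Alternative Name" header followed by a comma-separated line of DNS: and IP Address: entries). The helper name san_lists_host and the certificate path in the usage comment are hypothetical.

```shell
# Succeeds if the certificate text on stdin lists the given host name or IP
# address in its Subject Alternative Name extension.
# Usage (hypothetical path):
#   openssl x509 -in tls/signcerts/cert.pem -noout -text | san_lists_host peer0-afkl
san_lists_host() {
  want=$1
  # Take the line right after the SAN header, split it on commas, trim
  # surrounding spaces, then look for an exact DNS or IP entry.
  awk '/X509v3 Subject Alternative Name/ { getline; print; exit }' \
    | tr ',' '\n' \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | grep -Fxq -e "DNS:$want" -e "IP Address:$want"
}
```

The match is exact, so a certificate that only lists peer0 will not pass a check for peer0-afkl. The same check works with an IP address, which is relevant when a node is dialed by IP rather than by name.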

You can also use the same value for the login and the peer's name, but I don't know which approach is cleaner.

Hope it helps
