Exporting chat data


Brian Behlendorf
 

Let's take another look at search & indexing after the move to Matrix.

My personal take is that archives of chat channels are, by nature, very low quality when it comes to serving as sources of answers to future similar questions, and are even worse for making decisions about fixes or new features. They are best as ways to help new users or new contributors up a learning curve, or to build social connections, or to brainstorm on something. But if a question is posted regularly to chat it might make sense to capture that either as a FAQ in the docs for a project or even just an email to the developer mailing lists, which tend to have better utility for this.

A lot of Q&A takes place on Stack Overflow and similar sites, which I consider unfortunate, as there's no way to integrate those into our search and that content always runs the risk of disappearing. But that's where some folks go to ask questions, and in terms of Google/etc. discoverability those sites are pretty good.

Brian

On 3/28/21 1:14 PM, Arun .S.M. wrote:
If we have an archive of all public channels, it would greatly help in getting answers to repeated questions. It would also help in building a storyline of the features that went in and any discussions that happened around them.

Rocket.Chat search hasn't been kind to me so far. Are there any plans for tooling around these archives, or for letting search engines index them?

Regards,
Arun

On Mon, Mar 29, 2021, 1:12 AM VIPIN BHARATHAN <vip@...> wrote:
Glad to see that privacy by design is baked in. There are administrative backdoors in many platforms...

Can't be evil is better than don't be evil.

Vipin


Vipin Bharathan
Digital Transformation Consultant
Financial Services (Blockchain, ML, Design Thinking)


From: Ry Jones <rjones@...>
Sent: Sunday, March 28, 2021 3:30 PM
To: Brian Behlendorf <bbehlendorf@...>
Cc: Community Architects <community-architects@...>; VIPIN BHARATHAN <vip@...>; hariti sanchaniya <haritiut@...>; tsc@... <tsc@...>
Subject: Re: [Hyperledger TSC] [Hyperledger Sawtooth] Exporting chat data
 
This would only include public channels, which are sort-of indexed by search engines. Getting a better real-time-ish copy of chat data out where search engines can find it has been on my list for a while.

For your information, I cannot see the content of private messages, or messages in private channels I am not in.

On Sun, Mar 28, 2021 at 09:39 Brian Behlendorf <bbehlendorf@...> wrote:
Over my dead body. :) I suspect Ry feels similarly.

Brian

On 3/28/21 7:57 AM, VIPIN BHARATHAN wrote:
Will this include private one-on-one chats? If so I would strongly discourage the sharing of such data.
Thanks,
Vipin


Vipin Bharathan
Digital Transformation Consultant
Financial Services (Blockchain, ML, Design Thinking)


From: tsc@... <tsc@...> on behalf of Ry Jones via lists.hyperledger.org <rjones=linuxfoundation.org@...>
Sent: Sunday, March 28, 2021 9:57 AM
To: hariti sanchaniya <haritiut@...>
Cc: Community Architects <community-architects@...>; tsc@... <tsc@...>
Subject: Re: [Hyperledger TSC] [Hyperledger Sawtooth] Exporting chat data
 
I did export all public data from Slack.

It is possible to export data from Rocket.Chat; let me see how hard it is.

On Sat, Mar 27, 2021 at 22:33 Arun .S.M. <arun.s.m.cse@...> wrote:
+ Community Architects
Not sure if Hyperledger welcomes that.

Regards,
Arun

On Sun, Mar 28, 2021, 5:32 AM hariti sanchaniya <haritiut@...> wrote:
Hi,

This is Hariti.

I am a Master's student at the University of Tartu, Estonia. I am writing a thesis on "communication overhead in opensource software projects". I selected this project and would like an export of the channels mentioned below to perform my quantitative analysis.
I tried exporting JSON files from the channel, but I guess only admin users can do that, so I am contacting you.

Chat for which data is required:
SAWTOOTH - 2018 to 2020

This information would be so helpful with my thesis work. Let me know if you need any more information regarding my thesis or anything. Thank you in advance.

Let me know if I have to contact some specific person for this.

Regards,
Hariti
--
Ry Jones
Community Architect, Hyperledger


-- 
Brian Behlendorf
General Manager for Blockchain, Healthcare and Identity
bbehlendorf@...
Twitter: @brianbehlendorf


VIPIN BHARATHAN
 

A truly great advance would be the integration of all our resources into a comprehensive search engine targeted at Hyperledger properties.
  1. Chat
  2. Mailing lists
  3. Wiki pages
  4. YouTube channels
  5. Jira
  6. Read the Docs and other documentation
  7. GitHub release notes, comments, issues
I had been looking into this sort of integration for a big enterprise (to ease support calls, reuse intelligence that had been gathered before, etc.). Querying unstructured data from a set of resources without burying the user in an avalanche of irrelevant results is a tough problem. Some amount of pre-processing is necessary. Manual intervention is high-cost and possibly friction-filled. Maybe there is a happy medium between programmatic leverage and manual intervention.
Integrating data silos across the continuum from structured to unstructured data is a tough subject.

Thanks,
Vipin


dlt.nyc
Vipin Bharathan
Digital Transformation Consultant
Financial Services (Blockchain, ML, Design Thinking)
vip@...


From: Brian Behlendorf <bbehlendorf@...>
Sent: Monday, March 29, 2021 10:52 AM
To: Arun .S.M. <arun.s.m.cse@...>; VIPIN BHARATHAN <vip@...>
Cc: Ry Jones <rjones@...>; Community Architects <community-architects@...>; hariti sanchaniya <haritiut@...>; Technical Steering Committee (TSC) <tsc@...>
Subject: Re: [Hyperledger TSC] Exporting chat data
 
Let's take another look at search & indexing after the move to Matrix.



Arun S M
 

Sure, sounds good. Thanks


On Mon, Mar 29, 2021, 8:22 PM Brian Behlendorf <bbehlendorf@...> wrote:
Let's take another look at search & indexing after the move to Matrix.
