Index: [thread] [date] [subject] [author]

  From: <principle>
  To  : Robert Escriva <rescrv@rave>
  Date: Thu, 17 Oct 2024 01:40:54 +0000

Re: RAG

On Wed, 16 Oct 2024 18:40:28 -0700, Robert Escriva <rescrv@rave> wrote:
> On Thu, Oct 17, 2024 at 01:32:23AM +0000, principle wrote:
> > Hi Robert,
> > 
> > Thanks for sharing your idea about using Chroma to key maildir by relative path. Before we dive into the potential benefits of this approach, I'd love to understand a bit more about how this would work in practice.
> > 
> > Could you elaborate on what you mean by "indexing what different people say"? How would this indexing be done within Chroma, and what kind of indexing (e.g., word-level, sentence-level) are we talking about? Additionally, how do you envision feeding the indexed output into an LLM for RAG-style generation?
> > 
> > Looking forward to hearing more about your idea.
> > 
> > Best,
> > Link
> 
> Link,
> 
> I'm glad you're interested in the idea!
> 
> Indexing what different people say would mean recording each email as a separate document.
> 
> The indexing would be via AI embeddings.
> 
> Happy Hacking,
> Robert
> 

Hi Robert,

Thanks for explaining how the indexing would work. I'd like to dive deeper into the implementation details of Chroma's AI embedding functionality.

Which embedding model does this use (e.g., BERT, RoBERTa, DistilBERT)? How are the embeddings generated, and what preprocessing does email text need first? Are there any limitations or biases in the embedding step that I should keep in mind when evaluating this approach?
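
To make the preprocessing question concrete, here is a rough sketch of what I imagine happening before anything is embedded. It is only my guess, not Chroma's behaviour: the function names email_to_document and maildir_documents are hypothetical, and dropping quoted lines and signatures is an assumption about what is worth indexing. It uses only the Python standard library and yields one document per message, keyed by the file's path relative to the maildir root.

```python
import email
from email import policy
from pathlib import Path


def email_to_document(raw: bytes) -> tuple[str, dict]:
    """Return (body_text, metadata) for one raw message (hypothetical helper)."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    part = msg.get_body(preferencelist=("plain",))
    text = part.get_content() if part is not None else ""
    kept = []
    for line in text.splitlines():
        # Assumption: stop at the conventional "-- " signature delimiter.
        if line.rstrip() == "--":
            break
        # Assumption: skip quoted lines so each document holds one author's words.
        if line.lstrip().startswith(">"):
            continue
        kept.append(line)
    body = "\n".join(kept).strip()
    meta = {
        "from": str(msg.get("From", "")),
        "subject": str(msg.get("Subject", "")),
    }
    return body, meta


def maildir_documents(root: Path):
    """Yield (relative_path, body_text, metadata) for messages in cur/ and new/."""
    for sub in ("cur", "new"):
        folder = root / sub
        if not folder.is_dir():
            continue
        for path in sorted(folder.iterdir()):
            if path.is_file():
                body, meta = email_to_document(path.read_bytes())
                yield path.relative_to(root).as_posix(), body, meta
```

If I understand the idea, the relative paths, bodies, and metadata could then be handed to a Chroma collection's add() call as ids, documents, and metadatas. One thing I'm unsure about: filenames under cur/ change when a message's flags change, so the relative path may not be a stable key unless the flag suffix is stripped first.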

Looking forward to hearing more about Chroma's implementation.

Best,
Link
