Ragas and Evaluation Modules
satya - 4/21/2024, 2:50:11 PM
What are good prompt examples to answer questions solely based on provided context in a RAG application?
satya - 4/21/2024, 2:59:14 PM
One RAG prompt to rule them all: article: Dhar Rawal, Medium
satya - 4/21/2024, 3:05:36 PM
This covers
- A series of instructions telling the LLM how to structure the answer using only the context
- Step-by-step instructions
- Instructions to adjust the response to arrive at a final response
- Structure of the prompt
- An example
- An intermediate response
- How to manually test how the prompting works
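A minimal sketch of what such a context-only prompt could look like. The wording, the fallback sentence, and the names RAG_PROMPT, FALLBACK, and build_prompt are my own illustration, not the prompt from the article.

```python
from string import Template

# Hypothetical template: the wording is illustrative, not the article's prompt.
RAG_PROMPT = Template("""\
You are answering a question using ONLY the context below.

Steps:
1. Read the context and pick out the passages relevant to the question.
2. Draft an intermediate answer that uses only those passages.
3. Revise the draft into a short final answer.
If the context does not contain the answer, reply exactly: $fallback

Context:
$context

Question: $question
""")

# Assumed fallback sentence; a fixed string makes refusals easy to spot in manual tests.
FALLBACK = "I don't know based on the provided context."


def build_prompt(question: str, chunks: list[str]) -> str:
    """Number the retrieved chunks and fill in the template."""
    context = "\n".join(f"[{i}] {c}" for i, c in enumerate(chunks, start=1))
    return RAG_PROMPT.substitute(context=context, question=question, fallback=FALLBACK)


if __name__ == "__main__":
    print(build_prompt("What is Ragas?", ["Ragas is a Python library for evaluating RAG pipelines."]))
```

To test manually, paste the printed prompt into a chat window, once with chunks that contain the answer and once with chunks that do not, and check that the second case returns the fallback sentence.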
satya - 4/21/2024, 3:12:43 PM
An evaluation approach video with RAGas
satya - 4/21/2024, 3:28:04 PM
A few terms
- Context recall: How well does the retrieved context cover what is needed to answer the question?
- Answer relevancy: How relevant is the LLM's answer to the question?
- Faithfulness: How faithful to the retrieved context is the answer?
satya - 4/21/2024, 3:42:48 PM
Additional notes
- Focus on retrieval
- Manually test how good the retrieval system is
- Find out which "parameters" affect retrieval
satya - 4/21/2024, 3:43:41 PM
and....
- Using GPT-3.5 to answer
- Using GPT-4 for ground truth
satya - 4/21/2024, 3:46:49 PM
Understand
- ResponseSchema
- StructuredOutputParser
- How to use them
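To get the idea straight, here is a stdlib-only sketch of what a structured output parser does: describe the expected fields, turn them into format instructions for the prompt, then parse the model's reply back into a dict. FieldSpec, format_instructions, and parse_reply are hypothetical stand-ins of my own, not LangChain's ResponseSchema or StructuredOutputParser; check the LangChain docs for the real classes.

```python
import json
import re
from dataclasses import dataclass

FENCE = "`" * 3  # a markdown code fence, built this way so the example nests cleanly in notes


@dataclass
class FieldSpec:
    """Hypothetical stand-in for one schema entry: an expected output field."""
    name: str
    description: str


def format_instructions(fields: list[FieldSpec]) -> str:
    """Text to append to the prompt so the model replies with a fenced JSON object."""
    lines = [f'  "{f.name}": string  // {f.description}' for f in fields]
    body = "{\n" + ",\n".join(lines) + "\n}"
    return f"Reply with a JSON object inside a {FENCE}json block, shaped like:\n{body}"


def parse_reply(reply: str, fields: list[FieldSpec]) -> dict:
    """Pull the JSON object out of the reply and check that every field is present."""
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, reply, re.DOTALL)
    payload = match.group(1) if match else reply  # fall back to the raw reply
    data = json.loads(payload)
    missing = [f.name for f in fields if f.name not in data]
    if missing:
        raise ValueError(f"reply is missing fields: {missing}")
    return data


if __name__ == "__main__":
    fields = [FieldSpec("answer", "answer to the question"),
              FieldSpec("source", "which chunk the answer came from")]
    print(format_instructions(fields))
    fake_reply = FENCE + 'json\n{"answer": "Paris", "source": "chunk 2"}\n' + FENCE
    print(parse_reply(fake_reply, fields))
```

The point of the pattern is that the same field list drives both the instructions sent to the LLM and the validation of what comes back.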
satya - 4/21/2024, 4:15:17 PM
In summary
- Not a very good video! :(
- Important stuff is covered too quickly
- It is not clear whether this is a QA and optimization-time utility or something you put on prod
- Probably not on prod
- To tune the retrieval params, a human first works through the data set, and that tuning then informs the development and production setup
- In other words, it could be a way to decide which chunking and retrieval strategy works best for a given data set.
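A toy sketch of that tuning loop, assuming a small hand-labeled set of (question, expected phrase) pairs. The word-overlap retriever and the hit-rate score are simplifications of my own so the example runs on the standard library; they are not what the video or Ragas uses.

```python
from itertools import product


def chunk_words(text: str, size: int) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def retrieve(question: str, chunks: list[str], top_k: int) -> list[str]:
    """Toy retriever: rank chunks by how many question words they share."""
    q = set(question.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
    return ranked[:top_k]


def hit_rate(doc: str, labeled: list[tuple[str, str]], size: int, top_k: int) -> float:
    """Fraction of questions whose expected phrase appears in a retrieved chunk."""
    chunks = chunk_words(doc, size)
    hits = sum(
        any(expected.lower() in c.lower() for c in retrieve(question, chunks, top_k))
        for question, expected in labeled
    )
    return hits / len(labeled)


def compare(doc, labeled, sizes, top_ks) -> dict[tuple[int, int], float]:
    """Score every (chunk size, top_k) combination on the labeled set."""
    return {(s, k): hit_rate(doc, labeled, s, k) for s, k in product(sizes, top_ks)}


if __name__ == "__main__":
    doc = "Ragas is a Python library. It is installed with pip. It uses an LLM to compute metrics."
    labeled = [("How is Ragas installed?", "pip"), ("What does Ragas use to compute metrics?", "LLM")]
    for params, score in compare(doc, labeled, sizes=[5, 10], top_ks=[1, 2]).items():
        print(params, score)
```

In a real setup the retriever would be the actual vector store and the score would come from an evaluation library, but the shape stays the same: a labeled set, a grid of parameters, one score per combination.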
satya - 4/21/2024, 4:45:10 PM
LangChain and Ragas sample code and examples article
satya - 4/21/2024, 8:40:51 PM
Here is how you customize LLMs for Ragas: yes it needs an LLM
satya - 4/21/2024, 8:42:12 PM
Getting started with ragas: Ragas docs
satya - 4/21/2024, 9:05:14 PM
Quick summary
- Ragas is just a Python library
- Installed via pip install
- Well integrated with LangChain
- It takes a question, an answer, the context, and optionally a "reference answer" called the ground truth
- It then scores them for answer relevancy, faithfulness to the context, context precision (how much junk was brought along with the good stuff), and context recall (whether the context has ALL the answers that are in the ground truth; that is, whether what is retrieved is a superset of the truth. It is OK to have junk)
- The article "Evaluating RAG Applications with RAGAs" by Leonie Monigatti is a really good introduction
- The second set of references is at docs.ragas.io, the official docs, especially the explanation of the metrics and how they are computed
- The doc site also has installation instructions and a getting-started guide
- Ragas does use an LLM to compute these metrics
- So the official docs have a section on how to customize the LLMs for it
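A small sketch of the input shape described above: one record per question, pivoted into columns. The column names (question, answer, contexts, ground_truth) are what I recall from the 0.1-era Ragas docs and may differ in newer releases; to_columns is a hypothetical helper of mine. The commented-out hand-off to Ragas at the bottom is an assumption too and is not executed here.

```python
# Column names assumed from the 0.1-era Ragas docs; verify against the current docs.
REQUIRED = {"question": str, "answer": str, "contexts": list}


def to_columns(records: list[dict]) -> dict[str, list]:
    """Validate per-question records and pivot them into a column-oriented dict."""
    columns: dict[str, list] = {"question": [], "answer": [], "contexts": [], "ground_truth": []}
    for i, rec in enumerate(records):
        for name, kind in REQUIRED.items():
            if not isinstance(rec.get(name), kind):
                raise ValueError(f"record {i}: '{name}' must be a {kind.__name__}")
        for name in REQUIRED:
            columns[name].append(rec[name])
        # The ground truth is optional; an empty string marks "no reference answer".
        columns["ground_truth"].append(rec.get("ground_truth", ""))
    return columns


if __name__ == "__main__":
    records = [{
        "question": "How is Ragas installed?",
        "answer": "With pip.",
        "contexts": ["Ragas is installed via pip install."],
        "ground_truth": "Ragas is installed with pip.",
    }]
    columns = to_columns(records)
    print(columns)

    # Assumed hand-off, needs ragas, datasets, and an LLM API key (not run here):
    #   from datasets import Dataset
    #   from ragas import evaluate
    #   from ragas.metrics import faithfulness, answer_relevancy, context_precision, context_recall
    #   result = evaluate(Dataset.from_dict(columns),
    #                     metrics=[faithfulness, answer_relevancy, context_precision, context_recall])
```

Note that contexts is a list per question, since the retriever usually returns several chunks.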
satya - 4/21/2024, 9:11:45 PM
It is not clear however
- How many round trips it does to the LLM
- If you vary the embeddings and chunking methodologies and measure the resulting metrics, is this only of value at "development" time?
- Can this be used in production, for example to analyze responses?
- It definitely does not have the wherewithal to decide on the relevancy or appropriateness of the question itself! That may need to be another module!