# Building a Simple Retrieval-Augmented Generation System

## Step 0: Install dependencies
We already do this for you. Here, we install
- [ChatNoir](https://chatnoir.web.webis.de/) for retrieval and
- the OpenAI Python library for generating responses.

In [None]:
!pip install -q git+https://github.com/chatnoir-eu/chatnoir-api.git git+https://github.com/chatnoir-eu/chatnoir-pyterrier.git openai

## Step 1: Add your API key here (please read carefully)

By now, you should be onboarded to [Webis](https://webis.de) and should have received a login. Visit our Open WebUI instance at [open-webui.web.webis.de](https://open-webui.web.webis.de/) and log in with your Webis GitLab account.

Go to `Settings > Account > API keys > API Key`, create a new API key if none exists yet, and copy it into the following string (it should start with `sk-`):

In [None]:
API_KEY = "sk-..."
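A quick sanity check can catch a forgotten or mistyped key before any request is sent. The sketch below is only an illustration: `looks_like_api_key` is a hypothetical helper, not part of any library, and it only checks the `sk-` prefix mentioned above. It cannot tell whether the server will actually accept the key.

```python
def looks_like_api_key(key: str) -> bool:
    """Return True if the key has the expected shape (hypothetical helper)."""
    key = key.strip()
    # The placeholder from the cell above must be replaced by a real key.
    if key == "sk-...":
        return False
    # A real key starts with "sk-" and has more characters after the prefix.
    return key.startswith("sk-") and len(key) > len("sk-")
```

For example, `looks_like_api_key(API_KEY)` stays `False` until the placeholder has been replaced.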

## Step 2: Implement RAG

Given a query, Retrieval-Augmented Generation (RAG) consists of two steps:
1. **Retrieving documents that are relevant to the query**<br/>This is the same process as using Google or any other web search engine to find relevant websites. Here, we use ChatNoir since it provides a simple API. You can learn more [here](https://github.com/chatnoir-eu/chatnoir-pyterrier). In our code, this means that we need to implement the method
```python
def _retrieve(self, query: str) -> list[str]:
```
to query ChatNoir for the contents of relevant documents and return them as a list.
2. **Generation**<br/>This is where the "magic" happens and what differentiates RAG from traditional web search: instead of dumping a list of links to the documents we found relevant, we give the query and the list of documents (called "the context") to a generative model and ask it to answer the query using the information provided in the context. The response is generated with the [OpenAI Python API](https://github.com/openai/openai-python?tab=readme-ov-file#usage). For your convenience, we already created a method `_fetch_response(prompt: str) -> str` that does the generation for you, but you need to come up with a way to combine the query and the documents into a single prompt that lets the LLM generate a good response.
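
To make the idea of "combining the query and documents into a single prompt" concrete, here is a minimal sketch of one possible template. The function name `build_prompt`, the numbering scheme, and the instruction wording are all assumptions chosen for illustration. They are not the reference solution, and other phrasings may work better with the model.

```python
def build_prompt(query: str, documents: list[str]) -> str:
    """Combine a query and its context documents into one prompt string."""
    # Number each document so the model can tell them apart.
    context = "\n\n".join(
        f"[{i}] {doc.strip()}" for i, doc in enumerate(documents, start=1)
    )
    # Instruction wording is an assumption; experiment with your own.
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )
```

A prompt built this way could then be passed to `self._fetch_response` inside `_generate`. Placing the context before the question and ending with `Answer:` is one common layout, not a requirement.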

In [None]:
from textwrap import dedent

from chatnoir_api import Index
from chatnoir_pyterrier import ChatNoirRetrieve, Feature
from openai import OpenAI

class RAG():

  def __init__(self, topk=5) -> None:
    self.llm = OpenAI(api_key=API_KEY, base_url="https://open-webui.web.webis.de/ollama/v1/")
    self.chatnoir = ChatNoirRetrieve(staging=True, num_results=topk, index=Index.MSMarcoV21, features=Feature.SNIPPET_TEXT)

  def _fetch_response(self, prompt: str) -> str:
    """Prompts mistral:7b and returns the generated response."""
    completion = self.llm.chat.completions.create(
        model="mistral:7b",
        messages=[
            {
              'role': 'user',
              'content': prompt,
            },
        ],
        temperature=0,
        max_completion_tokens=200,
        stream=True
    )
    # Streamed chunks can carry no choices or a None delta, so skip those.
    return "".join(
        chunk.choices[0].delta.content or ""
        for chunk in completion
        if chunk.choices
    )

  def _retrieve(self, query: str) -> list[str]:
    """Given a query, return a list of texts that may answer the query."""
    # TASK: Use ChatNoir to retrieve relevant documents and return only their
    # content ("snippet_text").
    # For more information on ChatNoir's API, have a look at
    #  https://github.com/chatnoir-eu/chatnoir-pyterrier
    raise NotImplementedError

  def _generate(self, query: str, documents: list[str]) -> str:
    """
    Takes a query and a list of documents that may answer the query and
    generates a text answering the query using the information provided.
    """
    # TASK: Come up with a prompt (a string) to generate an answer to the query
    # using the context documents.
    # You can generate text using self._fetch_response. To get started, you can
    # print what the model returns, for example:
    # print(self._fetch_response("Who is Barack Obama?"))
    raise NotImplementedError

  def __call__(self, query: str) -> str:
    return self._generate(query, self._retrieve(query))

rag = RAG(topk=5)

## Step 3: Success
Now that the RAG system is implemented, you can test it by querying it for any information you like!

In [None]:
rag("How tall is the Empire State Building?")