Day 1 · Block 3

E03 - Retrieval A/B with local corpus

  • Look up some services that offer RAG systems. How expensive are they?
  • Imagine you have a PDF. What steps are necessary to get it into a retriever?
  • Now imagine you have an image or even a video. Could you (perhaps in a "hacky" way) get that into a retriever?
  • Run `python exercises/03/retrieval_ab.py`.
  • Complete all `TODO-STUDENT` steps in the script (question change + k change + comparison note).
  • Record one failure of the baseline run that the retrieval run fixes.
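
For the PDF question above, one possible breakdown is: extract the text, split it into chunks, index the chunks, and return the top k chunks for a question. The sketch below is a minimal illustration under stated assumptions: it assumes the text has already been extracted from the PDF (extraction needs a third-party library and is not shown), and it swaps in a plain bag-of-words cosine score where a real system would likely use an embedding model. The names `chunk_text`, `retrieve`, and the default sizes are hypothetical and are not taken from `retrieval_ab.py`.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    # Stop once the remaining words are already covered by the previous window.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def tokenize(text):
    """Lowercase and keep alphanumeric runs only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k best-scoring chunks as (index, chunk, score) tuples."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c))), i) for i, c in enumerate(chunks)]
    # Highest score first; ties broken by original chunk order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, chunks[i], score) for score, i in scored[:k]]
```

Raising `k` returns more chunks, which may help recall but also puts more off-topic text into the prompt. That trade-off is what the k-change step in the script asks you to observe.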

Inputs

  • exercises/03/retrieval_ab.py
  • Local corpus in exercises/03/local_docs

Deliverable

A/B run trace + one retrieval-fixed failure + one source-backed comparison note.

Target

Look up some services that offer RAG systems. How expensive are they?

Checklist

  • Make one explicit design decision.
  • Include one verification check.
  • State one limitation or risk.

Common failure modes

  • Baseline and retrieval runs use different prompts/settings.
  • Improvement claimed without citing the retrieved source chunks as evidence, or without disclosing the limits of the claim.
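
Both failure modes can be guarded against in the run harness itself. The sketch below is one way to do that, not the structure of `retrieval_ab.py`: both arms share one settings object and one prompt template, and the trace records which chunk ids were supplied. `RunSettings`, `build_prompt`, `run_ab`, and the `generate(prompt, settings)` callable are all hypothetical names standing in for whatever model call the exercise script uses.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunSettings:
    """Settings shared by both arms; field names are illustrative."""
    system_prompt: str
    temperature: float
    max_tokens: int


def build_prompt(question, context_chunks=()):
    """Same template for both arms; only the context section differs."""
    parts = []
    if context_chunks:
        parts.append("Context:")
        parts.extend(f"[{cid}] {text}" for cid, text in context_chunks)
    parts.append(f"Question: {question}")
    return "\n".join(parts)


def run_ab(question, retrieved, settings, generate):
    """Run baseline and retrieval arms with identical settings.

    `retrieved` is a list of (chunk_id, text) pairs.
    `generate` is a hypothetical callable: generate(prompt, settings) -> str.
    """
    trace = {"question": question, "settings": asdict(settings), "arms": {}}
    for arm, chunks in (("baseline", ()), ("retrieval", tuple(retrieved))):
        prompt = build_prompt(question, chunks)
        trace["arms"][arm] = {
            "prompt": prompt,
            # Recorded so any claimed improvement can point at its evidence.
            "source_chunk_ids": [cid for cid, _ in chunks],
            "answer": generate(prompt, settings),
        }
    return trace


def trace_to_json(trace):
    """Serialize the trace so it can be attached to the deliverable."""
    return json.dumps(trace, indent=2)
```

Because the settings are frozen and passed once, the two arms cannot drift apart by accident; the only difference between the prompts is the context section.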

Extension task (optional)

Add metadata filtering and report the difference in precision with and without the filter.
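
As a starting point, the sketch below shows one way to filter candidates by metadata and to compute precision over the top k results. It assumes each chunk is a dict with an `id` and a `meta` dict, and that you have hand-labeled which chunk ids are relevant for the question; both are assumptions for illustration, not the format used in `exercises/03/local_docs`. The function names are hypothetical.

```python
def filter_by_metadata(chunks, required):
    """Keep chunks whose metadata matches every key/value pair in `required`."""
    return [
        c for c in chunks
        if all(c["meta"].get(key) == value for key, value in required.items())
    ]


def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved ids that are labeled relevant.

    Design choice (an assumption): if fewer than k results come back, the
    denominator is the number actually returned, not k.
    """
    top = retrieved_ids[:k]
    if not top:
        return 0.0
    return sum(1 for cid in top if cid in relevant_ids) / len(top)
```

To report the difference, compute `precision_at_k` once on the unfiltered ranking and once on the ranking after `filter_by_metadata`, using the same question, the same `k`, and the same relevance labels.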