An OpenAI-compatible API wrapper that sits in front of your LLMs and adds a semantic cache layer using Qdrant vector search. If a similar question was asked before, it returns the cached answer ...