Generative models such as ChatGPT have changed many product roadmaps. Interfaces and user experience can now be re-imagined and often drastically simplified to something resembling a Google search bar where the input is natural language. However, some models remain behind APIs without the ability to re-train on contextually appropriate data. Even when the model weights are publicly available, re-training or fine-tuning is often expensive, requires expertise, and is ill-suited to problem domains with constant updates. How, then, can such APIs be used when the data needed to generate an accurate output was not present in the training set because it is constantly changing?

Vector embeddings are a model's numerical representation of some piece of data, which is often unstructured. When combined with a vector database or search algorithm, embeddings can be used to retrieve information that provides context for a generative model. Such embeddings, linked to specific information, can be updated in real time, providing generative models with a continually up-to-date, external body of knowledge.
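As a rough illustration of this retrieval pattern, the sketch below stores text alongside a vector, finds the closest entry to a question, and places it in a prompt. Everything in it is an assumption made for illustration and is not the code from the talk: `embed` is a toy bag-of-words stand-in for a real embedding model, `TinyVectorStore` is a hypothetical in-memory substitute for a vector database, and `build_prompt` only assembles the text that would be sent to a generative model's API.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse bag-of-words vector.
    A real system would call an embedding model here (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.docs = {}  # doc_id -> (vector, original text)

    def upsert(self, doc_id, text):
        # Re-using an id overwrites the old entry, which is how the
        # external knowledge stays current without re-training a model.
        self.docs[doc_id] = (embed(text), text)

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.docs.values(),
                        key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(store, question, k=1):
    """Assemble retrieved context and the question into one prompt string."""
    context = "\n".join(store.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = TinyVectorStore()
store.upsert("price", "The price of the basic plan is 10 dollars per month.")
store.upsert("hours", "Support hours are 9am to 5pm on weekdays.")
# The fact changes later: update the entry in place.
store.upsert("price", "The price of the basic plan is 12 dollars per month.")

prompt = build_prompt(store, "What is the price of the basic plan?")
```

The resulting `prompt` carries the updated pricing sentence rather than the stale one, so a generative model answering from it would see current information even though its weights never changed.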

In this talk we will demonstrate the validity of this approach through examples. We will provide instructions, code and other assets that are open source and available on GitHub.


Talk given at the Large Language Models in Production conference, hosted by the MLOps Community, on vector databases and large language models.


You can download the slides for the talk here.