Crafting Conversations with AI: Unleashing Intelligence in Q&A Applications through RAG

karthik Ganti
7 min read · Feb 11, 2024


RAG LLM Demo

Introduction

In this tutorial, we’ll walk through building a sample application using LangChain.js, Ollama, and ChromaDB.

LangChain is more than just a framework: it helps you create applications that not only respond but also understand context. It lets your app connect to various sources of context, helping it provide better, more relevant answers.

The application we’ll build focuses on answering user questions with the help of LangChain.js, Ollama, and ChromaDB (our vector store).

One of the most powerful applications enabled by LLMs is sophisticated question-answering (Q&A) chatbots. These are applications that can answer questions about specific source information. These applications use a technique known as Retrieval Augmented Generation, or RAG.

What is RAG?

RAG is like giving your language model an extra boost of knowledge. While LLMs are smart, they’re limited to what they knew at training time. RAG allows us to update this knowledge by pulling in specific information when needed.

In this tutorial, we will connect the model to an external source so it can give accurate and useful responses. We will use ChromaDB as the vector store.
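At a high level, the flow looks like this. Below is a minimal sketch using hypothetical stand-ins for the retriever and the LLM; the rest of this tutorial implements these pieces with LangChain.js, Ollama, and ChromaDB.

// Minimal RAG flow sketch (hypothetical `retriever` and `llm` objects, not the real APIs yet)
async function answerWithRag(question, retriever, llm) {
  const docs = await retriever.getRelevantDocuments(question); // 1. Retrieve relevant chunks
  const context = docs.map((d) => d.pageContent).join("\n\n"); // 2. Augment: build a context block
  const prompt = `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
  return llm.invoke(prompt);                                   // 3. Generate the final answer
}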

Source Code

The entire source code for this demo can be found here:

https://github.com/hacktronaut/ollama-rag-demo/tree/main

Prerequisites

If you are new to Ollama, please check out these articles first. You need to set up the Ollama server locally and pull the tinydolphin model.

Before we dive into the implementation, make sure you have the following dependencies installed:

If you are new to ChromaDB, follow along; I will show you how to set it up on your local system. Alternatively, you can follow the official getting-started guide: https://docs.trychroma.com/getting-started?lang=js

Install Chroma using Python's pip:

pip install chromadb

Use the following command to bring up the Chroma server:

$ chroma run --path .


((((((((( (((((####
((((((((((((((((((((((#########
((((((((((((((((((((((((###########
((((((((((((((((((((((((((############
(((((((((((((((((((((((((((#############
(((((((((((((((((((((((((((#############
(((((((((((((((((((((((((##############
((((((((((((((((((((((((##############
(((((((((((((((((((((#############
((((((((((((((((##############
((((((((( #########



Running Chroma

Saving data to: .
Connect to chroma at: http://localhost:8000
Getting started guide: https://docs.trychroma.com/getting-started


INFO: [10-02-2024 21:14:07] Set chroma_server_nofile to 65535
INFO: [10-02-2024 21:14:07] Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
DEBUG: [10-02-2024 21:14:07] Starting component System
DEBUG: [10-02-2024 21:14:07] Starting component OpenTelemetryClient
DEBUG: [10-02-2024 21:14:07] Starting component SimpleAssignmentPolicy
DEBUG: [10-02-2024 21:14:07] Starting component SqliteDB
DEBUG: [10-02-2024 21:14:07] Starting component Posthog
DEBUG: [10-02-2024 21:14:07] Starting component LocalSegmentManager
DEBUG: [10-02-2024 21:14:07] Starting component SegmentAPI
INFO: [10-02-2024 21:14:08] Started server process [33130]
INFO: [10-02-2024 21:14:08] Waiting for application startup.
INFO: [10-02-2024 21:14:08] Application startup complete.
INFO: [10-02-2024 21:14:08] Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)

Chroma will listen at http://localhost:8000

Pull the tinydolphin model from the Ollama library. Our application will use this model to generate responses.

ollama pull tinydolphin

Now we have our Ollama server running at http://localhost:11434 and ChromaDB running at http://localhost:8000.
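Before writing any code, you can sanity-check both servers from the terminal. The commands below assume the default ports and the standard Ollama /api/tags and Chroma /api/v1/heartbeat endpoints; if your Chroma version exposes different routes, check the getting-started guide linked above.

$ curl http://localhost:11434/api/tags        # lists the models Ollama has pulled
$ curl http://localhost:8000/api/v1/heartbeat # Chroma heartbeat, returns a timestamp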

Setting up the Project

Let’s start by setting up our project and installing dependencies:

I am using Node version 18. You can stick to 18 or use any other version of your choice.

$ nvm use 18.16.0
$ mkdir ollama-rag-demo
$ cd ollama-rag-demo
$ npm init
$ npm install @langchain/community --save-dev
$ npm install langchain --save-dev
$ npm install chromadb --save-dev

NOTE: Make sure that you have set "type": "module" in your package.json, since we will be using import statements in our JS files.
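For reference, a minimal package.json might look roughly like this after npm init (the devDependencies added by the install commands above will also appear in it):

{
  "name": "ollama-rag-demo",
  "version": "1.0.0",
  "type": "module"
}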

Here is the context file we will be using → https://github.com/hacktronaut/ollama-rag-demo/blob/main/utils/data.txt

This is the knowledge base the LLM will use to answer user queries. Place it in your project directory, since the script below reads it from ./data.txt.

Utility Function and Vector Store Setup

Create a loadData.js file and paste the following code:

import { OllamaEmbeddings } from "@langchain/community/embeddings/ollama";
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { TextLoader } from "langchain/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";


// Get an instance of Ollama embeddings
// (replace baseUrl with the address of your Ollama server, e.g. http://localhost:11434)
const ollamaEmbeddings = new OllamaEmbeddings({
  baseUrl: "http://192.168.29.118:11434",
  model: "tinydolphin"
});


// Load data from the txt file
const loader = new TextLoader("./data.txt");
const docs = await loader.load();

// Create a text splitter
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  separators: ['\n\n', '\n', ' ', ''],
  chunkOverlap: 200
});

// Split the text and get a list of Documents as the result
const output = await splitter.splitDocuments(docs);

// Create embeddings and push them into the collection
const vectorStore = await Chroma.fromDocuments(output, ollamaEmbeddings, {
  collectionName: "myLangchainCollection",
  url: "http://localhost:8000", // Optional, will default to this value
});


// Search and see if we are able to get results from a similarity search

// Search for the most similar document
const vectorStoreResponse = await vectorStore.similaritySearch("What is langchain", 1);

console.log("Printing docs after similarity search --> ", vectorStoreResponse);

Run this script using the following command:

node loadData.js

You should see output similar to this:

Printing docs after similarity search -->  [
Document {
pageContent: 'Getting Started\n' +
'\n' +
"Here's a quick guide to installing LangChain, setting up your environment, and starting your first LangChain application. We recommend following our Quickstart guide and reviewing our Security best practices for safe development.\n" +
'\n' +
'Note: The documentation focuses on the JS/TS LangChain library. For Python LangChain library documentation, refer to the provided link.\n' +
'LangChain Expression Language (LCEL)\n' +
'\n' +
'LCEL is a declarative way to compose chains, designed to support putting prototypes into production with no code changes. Explore the overview, standard interface, key features, and example code in the LCEL section.\n' +
'Modules\n' +
'\n' +
'LangChain provides standard, extendable interfaces and integrations for the following modules:\n' +
'\n' +
' Model I/O: Interface with language models.\n' +
'\n' +
' Retrieval: Interface with application-specific data.\n' +
'\n' +
' Agents: Let models choose tools to use given high-level directives.\n' +
'\n' +
'Use Cases',
metadata: { source: './data.txt', loc: [Object] }
}
]

At this point, we have created our vector embeddings and stored them in ChromaDB.
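If you also want to see how close each match is, LangChain's vector stores expose a scored variant of the same search. This is an optional addition, not part of loadData.js in the repo:

// Optional: returns [Document, score] pairs instead of just Documents
const scoredResults = await vectorStore.similaritySearchWithScore("What is langchain", 2);
console.log("Printing docs with scores --> ", scoredResults);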

The next step is to ask the LLM a question, providing the search results from the vector store as context.

Answering the User Question

Create a file called index.js and paste the following code:

import { OllamaEmbeddings } from "@langchain/community/embeddings/ollama";
import { Ollama } from "@langchain/community/llms/ollama";
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { PromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";


// Replace baseUrl with the address of your Ollama server, e.g. http://localhost:11434
const ollamaEmbeddings = new OllamaEmbeddings({
  baseUrl: "http://192.168.29.118:11434",
  model: "tinydolphin"
});

const ollamaLlm = new Ollama({
  baseUrl: "http://192.168.29.118:11434",
  model: "tinydolphin"
});


// Utility function to combine documents into a single string
function combineDocuments(docs) {
  return docs.map((doc) => doc.pageContent).join('\n\n');
}


// Get an instance of the vector store.
// We will connect to the existing "myLangchainCollection" collection.
const vectorStore = await Chroma.fromExistingCollection(
  ollamaEmbeddings,
  { collectionName: "myLangchainCollection", url: "http://localhost:8000" },
);


// Get a retriever from the vector store
const chromaRetriever = vectorStore.asRetriever();

const userQuestion = "What are the three modules provided by langchain?";

// Create a prompt template that converts the user question into a standalone question
const simpleQuestionPrompt = PromptTemplate.fromTemplate(`
For following user question convert it into a standalone question
{userQuestion}`);

// Create a chain and get the relevant documents from the retriever by invoking the chain
const simpleQuestionChain = simpleQuestionPrompt.pipe(ollamaLlm).pipe(new StringOutputParser()).pipe(chromaRetriever);

const documents = await simpleQuestionChain.invoke({
  userQuestion: userQuestion
});

// Combine the retrieved documents into a single string
const combinedDocs = combineDocuments(documents);


// Create the answer prompt template
const questionTemplate = PromptTemplate.fromTemplate(`
You are a langchain instructor who is good at answering questions raised by new developers or users. Answer the below question using the context.
Strictly use the context and answer in crisp and point to point.
<context>
{context}
</context>

question: {userQuestion}
`);


// Chain the answer prompt with the LLM
const answerChain = questionTemplate.pipe(ollamaLlm);


const llmResponse = await answerChain.invoke({
  context: combinedDocs,
  userQuestion: userQuestion
});

console.log("Printing llm response --> ", llmResponse);



Let’s run the code and see what happens

node index.js

Note that the question is hardcoded in index.js.

Question: What are the three modules provided by langchain?

Here is the response we got from the LLM:

/**
Printing llm response --> The three modules provided by LangChain are Model I/O, Retrieval, and Agents. These modules allow developers to connect with various language models and interact with their data using high-level directives. They can be used to interface with applications that require the use of the Large Language Model (LLM).

Here is a brief explanation of each module:

1. Model I/O: This module provides an interface for connecting with various language models, allowing developers to interact with their data and use them as part of their chain.
2. Retrieval: This module allows developers to retrieve the output of a model or tool, allowing them to use the information in a given sequence.
3. Agents: This module provides a set of tools and APIs for running agents on top of the language models they interact with. Agents can be used to evaluate and analyze results, as well as perform tasks like automated feedback on user input or task completion.

*/
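If you would rather not hardcode the question, one simple tweak (not part of the repo) is to read it from the command line instead:

// In index.js, replace the hardcoded question with a command-line argument.
// Usage: node index.js "What are the three modules provided by langchain?"
const userQuestion = process.argv[2] ?? "What are the three modules provided by langchain?";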

Conclusion

Isn’t it amazing? That’s the magic of AI. People are building amazing apps using the RAG pattern.

But RAG is not the only thing in the world of AI. In the next tutorial, I will show how to create an agent. By combining agents, we can build a baby AGI. Yes, you heard that right.

In this tutorial, we’ve seen how to build a sample application using LangChain.js, Ollama, and ChromaDB. This application processes user questions, retrieves relevant information, and generates concise, context-aware responses. You can extend this example to build more sophisticated applications by exploring the capabilities of LangChain.js, Ollama, and ChromaDB further.

Happy coding!
