LlamaParse is a document parsing service that extracts text and structure from files such as PDFs. The parsed output can then be chunked, embedded, and stored in a vector database for use in RAG applications. It is a cloud service and requires an API key for access.

In this tutorial, we will build a simple Retrieval-Augmented Generation (RAG) example using LlamaParse and Upstash Vector. We will create a Node.js application that reads a PDF file, parses it with the LlamaParse API, and splits the text into smaller chunks. Each chunk will be converted into a vector embedding and stored in Upstash Vector. When the user asks a question, the application will query Upstash Vector for the chunks most similar to the question and pass them to the language model as context for the answer.
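To make the "most similar chunks" step concrete, here is a toy sketch of similarity search in plain JavaScript. It is not how Upstash Vector is implemented; the three-dimensional vectors, the chunk texts, and the helper names `cosineSimilarity` and `topK` are made up for illustration, and cosine similarity is only one of several possible metrics. In the real application the embeddings are produced by a model and the search runs inside Upstash Vector.

```javascript
// Toy illustration only: made-up vectors, not real embeddings.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose vectors are most similar to the query vector.
function topK(queryVector, chunks, k) {
  return chunks
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(queryVector, chunk.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical chunks with hand-written vectors.
const chunks = [
  { text: "Bandwidth limits", vector: [0.9, 0.1, 0.0] },
  { text: "Payment terms", vector: [0.1, 0.9, 0.1] },
  { text: "Termination", vector: [0.0, 0.2, 0.9] },
];

// A query vector that points in roughly the same direction as the first chunk.
const matches = topK([1, 0, 0], chunks, 2);
console.log(matches.map((m) => m.text));
```

The chunks returned by a search like this are what a RAG application hands to the language model alongside the user's question.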

We will use the RAGChat SDK to build the RAG application. RAGChat is a library that provides a simple interface for building RAG applications. It has built-in support for Upstash Vector and LlamaParse.

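Prerequisites

You need Node.js installed, a LlamaCloud API key for LlamaParse, an Upstash Vector index (its REST URL and token), and an OpenAI API key. These are the credentials used in the .env file in Step 4.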

Step 1: Create a new Node.js project

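mkdir rag-with-llamaparse
cd rag-with-llamaparse
npm init -y
npm pkg set type=module

The last command sets "type": "module" in package.json. The index.js file in Step 3 uses import statements and top-level await, which Node.js only accepts in ES modules.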

Step 2: Install the dependencies

npm install llamaindex @upstash/rag-chat dotenv

Step 3: Create a new file named index.js

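import { RAGChat, openai } from "@upstash/rag-chat";
import dotenv from "dotenv";

// Load the variables from .env into process.env.
dotenv.config();

// The Upstash Vector and OpenAI credentials are expected in the
// environment (see the .env file in Step 4).
export const ragChat = new RAGChat({
  model: openai("gpt-4-turbo"),
});

// Path to the PDF file to parse.
const fileSource = "./upstash-terms.pdf";

// Parse the PDF with LlamaParse and add its content to the namespace.
await ragChat.context.add({
  options: {
    namespace: "llama-parse-upstash",
  },
  fileSource,
  processor: {
    name: "llama-parse",
    options: { apiKey: process.env.LLAMA_CLOUD_API_KEY },
  },
});

// Ask a question against the same namespace the document was added to.
const result = await ragChat.chat("What is the excessive bandwidth policy of Upstash?", {
  streaming: false,
  namespace: "llama-parse-upstash",
});

console.log(result);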

Step 4: Create a new file named .env and add the following environment variables

LLAMA_CLOUD_API_KEY="your_llama_cloud_api_key_here"

UPSTASH_VECTOR_REST_URL="your_upstash_vector_rest_url_here"
UPSTASH_VECTOR_REST_TOKEN="your_upstash_vector_rest_token_here"

OPENAI_API_KEY="your_openai_api_key_here"
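
A missing or blank variable usually surfaces later as an authentication error from one of the services, which can be harder to diagnose. As a minimal sketch, a check like the one below could run right after `dotenv.config()` in index.js. The helper name `findMissingEnvVars` is hypothetical and not part of any SDK; the variable names are the ones listed above.

```javascript
// Variable names taken from the .env file in this step.
const REQUIRED_ENV_VARS = [
  "LLAMA_CLOUD_API_KEY",
  "UPSTASH_VECTOR_REST_URL",
  "UPSTASH_VECTOR_REST_TOKEN",
  "OPENAI_API_KEY",
];

// Return the names of required variables that are unset or blank in `env`.
// (Hypothetical helper, not part of any SDK.)
function findMissingEnvVars(env, names = REQUIRED_ENV_VARS) {
  return names.filter((name) => !env[name] || env[name].trim() === "");
}

// Warn early instead of failing later with a less obvious error.
const missing = findMissingEnvVars(process.env);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```

Whether to only warn or to stop the script at this point is a design choice; stopping early is usually safer when the script would otherwise make paid API calls.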

Step 5: Add a PDF file to the project to test

Place a PDF file in the project directory and set the fileSource variable in index.js to its path. You can also change the prompt to ask a question that is relevant to your PDF file.
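
If you expect to try several files, one option is to read the PDF path and the question from the command line instead of editing index.js each time. The sketch below assumes a hypothetical helper named `parseArgs`; its defaults are based on the values used in this tutorial.

```javascript
// Read the PDF path and the question from command-line arguments,
// falling back to defaults based on this tutorial.
// (Hypothetical helper, not part of any SDK.)
function parseArgs(argv) {
  // argv[0] is the node binary and argv[1] is the script path.
  const [file, ...questionWords] = argv.slice(2);
  return {
    fileSource: file ?? "./upstash-terms.pdf",
    question:
      questionWords.length > 0
        ? questionWords.join(" ")
        : "What is the excessive bandwidth policy of Upstash?",
  };
}

const { fileSource, question } = parseArgs(process.argv);
console.log(`Parsing ${fileSource} and asking: ${question}`);
```

With this in place, the script could be started as `node index.js ./my-file.pdf "What does section 3 cover?"`, where the file name and question are placeholders.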

Step 6: Run the application

node index.js

Step 7: Check the result

You should see the result printed in the console. If the answer is not what you expected, the newly added chunks may not have been indexed yet. Indexing takes time, so wait a short while and run the application again.
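
Instead of rerunning the script by hand, the wait-and-retry step can be sketched as a small helper. This is an illustration under assumptions, not part of the RAGChat SDK: `retryUntil` and `sleep` are hypothetical names, the attempt count and delay are arbitrary, and the acceptance check in the commented usage is a rough heuristic that assumes the response object has a string `output` field.

```javascript
// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Call `task` up to `attempts` times, waiting `delayMs` between tries,
// until `isAcceptable(result)` returns true. Returns the last result,
// whether or not it was accepted. (Hypothetical helper.)
async function retryUntil(task, isAcceptable, { attempts = 3, delayMs = 5000 } = {}) {
  let result;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    result = await task();
    if (isAcceptable(result)) return result;
    if (attempt < attempts) await sleep(delayMs);
  }
  return result;
}

// Hypothetical usage in index.js. The check below is only a heuristic and
// assumes the response exposes the answer as `output`:
// const result = await retryUntil(
//   () => ragChat.chat(question, { streaming: false, namespace: "llama-parse-upstash" }),
//   (res) => typeof res.output === "string" && !/don't know/i.test(res.output),
// );
```

Each retry sends another request to the model, so a small attempt count keeps the cost bounded.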

You can also check the vector database to confirm that the chunks were stored correctly. In the Upstash Vector Console, open the Data Browser tab and select the namespace you set in the index.js file.

Upstash Vector Console