Create Chatbot Using GPT-Index | OpenAI | Python

In this article, I’ll show you how to create a basic chatbot that answers questions using your own data. We will be using GPT-Index, OpenAI, and Python.

Let’s get started by installing the required Python modules.

Install modules/packages

We need to install two packages, gpt_index and langchain, which can be done with the commands below:

pip install gpt_index
pip install langchain
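
If you want to confirm that both installs worked before going further, a small check like the one below can help. This is only a sketch: the helper name missing_packages is my own and is not part of either library. It relies on the standard library's importlib.util.find_spec, which returns None when a top-level package cannot be found.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The two packages this article installs with pip.
    missing = missing_packages(["gpt_index", "langchain"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported as not installed, rerun the matching pip command in the same Python environment you use to run the script.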

Importing packages

Next, we need to import those packages so that we can use them:

from gpt_index import SimpleDirectoryReader, GPTListIndex, GPTSimpleVectorIndex, LLMPredictor, PromptHelper
from langchain import OpenAI
import sys
import os

Please note that we don’t need a GPU here, because no model runs locally. All the model calls go to the OpenAI servers.

Grab OpenAI Key

To get an OpenAI key, go to https://openai.com/, log in, and copy an API key from the API keys page of your account.


Once you have the key, store it in an environment variable (I’m using Windows). If you do not set it as an environment variable, you will have to pass the key explicitly in every call that needs it.

os.environ["OPENAI_API_KEY"] = "YOUR_KEY"
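
If you would rather set the variable outside the script, so the key is not written into the source file, you can read it back and fail early when it is missing. The sketch below assumes exactly that setup. The helper name get_openai_key is hypothetical, and treating the placeholder "YOUR_KEY" as missing is my own choice.

```python
import os


def get_openai_key(env=os.environ):
    """Return the OpenAI key from the environment, or raise a clear error."""
    key = env.get("OPENAI_API_KEY", "").strip()
    # Treat an empty value or the article's placeholder as "not set".
    if not key or key == "YOUR_KEY":
        raise RuntimeError(
            "OPENAI_API_KEY is not set; set it before building the index."
        )
    return key
```

Calling get_openai_key() once at the top of the script gives a readable error straight away, instead of a failure later when the first request is sent.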

Collect Data

Once the environment is set, the next thing we need is data. You can either download the data from a URL or use data that is already available as a flat text file.

Once the text file is downloaded, keep it in a directory. If you have multiple text files, you can keep all of them in the same directory.
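
Before building the index, it can be useful to check which text files are actually sitting in that directory. Here is a minimal sketch using only the standard library. The function name list_text_files is my own, and it looks only at the top level of the directory, which is an assumption about how you have laid out your files.

```python
from pathlib import Path


def list_text_files(directory):
    """Return the .txt files directly inside `directory`, sorted by name."""
    folder = Path(directory)
    if not folder.is_dir():
        raise NotADirectoryError(f"{directory!r} is not a directory")
    # Match the extension case-insensitively, so NOTES.TXT is included too.
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".txt"
    )
```

An empty list here usually means the path is wrong or the download went somewhere else.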

Now we have the data and knowledge. The next thing is to use this knowledge base.

Create Index

We need to create an index using all the text files. For this, we will create a function that takes the path of the directory where our text files are saved.

def create_index(path):
  max_input = 4096        # maximum input size for the model, in tokens
  tokens = 200            # maximum number of tokens in each response
  chunk_size = 600        # upper limit on the size of each text chunk
  max_chunk_overlap = 20  # overlap between consecutive chunks

  # define the prompt helper
  promptHelper = PromptHelper(max_input, tokens, max_chunk_overlap, chunk_size_limit=chunk_size)

  # define the LLM: many models could be used, but in this example let's go with an OpenAI model
  llmPredictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-ada-001", max_tokens=tokens))

  # load data: it will take all the .txt files, if there is more than one
  docs = SimpleDirectoryReader(path).load_data()

  # create the vector index and save it to disk
  vectorIndex = GPTSimpleVectorIndex(documents=docs, llm_predictor=llmPredictor, prompt_helper=promptHelper)
  vectorIndex.save_to_disk('vectorIndex.json')
  return vectorIndex
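
To get a feel for what chunk_size and max_chunk_overlap in create_index control, here is a rough illustration of splitting text into overlapping chunks. Please treat it as a sketch only. It is not the splitter gpt_index uses: the library works in tokens and is more involved, while this version counts whitespace-separated words. The function name split_into_chunks is hypothetical.

```python
def split_into_chunks(words, chunk_size, overlap):
    """Split a list of words into chunks of at most `chunk_size` words.

    Each chunk after the first starts with the last `overlap` words of
    the previous chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

With the values from create_index, split_into_chunks(text.split(), 600, 20) would turn a 1400-word text into three chunks of 600, 600, and 240 words, with 20 words shared between neighbouring chunks.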

Building the index is where the embeddings for your documents are created, and you only need to repeat this step when new data arrives.
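
One simple way to decide whether the index needs rebuilding is to compare file modification times. The sketch below assumes the index was saved to a file such as vectorIndex.json and that all data files sit directly in one directory. The function name needs_reindex is my own and is not part of gpt_index.

```python
import os


def needs_reindex(data_dir, index_path):
    """Return True if the index file is missing or older than any data file."""
    if not os.path.exists(index_path):
        return True
    index_mtime = os.path.getmtime(index_path)
    for name in os.listdir(data_dir):
        file_path = os.path.join(data_dir, name)
        # A data file modified after the index was written means new data.
        if os.path.isfile(file_path) and os.path.getmtime(file_path) > index_mtime:
            return True
    return False
```

You could then call create_index only when needs_reindex(path, 'vectorIndex.json') returns True, and otherwise load the saved index from disk. Note that this check does not notice deleted files.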

Create Answering System

Next, we need to build a system that can respond to the user. Let’s create a function for that.

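def answerMe(vectorIndex):
  # vectorIndex is the path to the saved index file, e.g. 'vectorIndex.json'
  vIndex = GPTSimpleVectorIndex.load_from_disk(vectorIndex)
  while True:
    # use a name other than input, so the built-in input() is not shadowed
    question = input('Please ask: ')
    response = vIndex.query(question, response_mode="compact")
    print(f"Response: {response} \n")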
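
The loop in answerMe runs until you interrupt the program. If you would like a clean way out, one option is a loop that stops on a keyword or an empty line. The sketch below is an assumption on my part, not something gpt_index provides: chat_loop, query_fn, and the stop words are all hypothetical names. The query function is passed in so the loop can be tried without calling OpenAI.

```python
def chat_loop(query_fn, read=input, write=print, stop_words=("quit", "exit")):
    """Ask questions until the user types a stop word or sends an empty line.

    Returns the number of questions answered.
    """
    answered = 0
    while True:
        question = read("Please ask: ").strip()
        if not question or question.lower() in stop_words:
            break
        # query_fn is whatever turns a question into an answer.
        write(f"Response: {query_fn(question)}\n")
        answered += 1
    return answered
```

With the index from this article, you could call it as chat_loop(lambda q: vIndex.query(q, response_mode="compact")) after loading vIndex from disk.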

Test The System

Finally, it’s time to test the system. Run your application, ask some questions, and get a response.


I hope you enjoyed creating your basic bot.

If you find anything unclear, watch my video demonstrating this entire flow here.