Build a Simple Chatbot in Python Using NLTK and Rule-Based Logic

Shivang
Aug 05

Introduction

Chatbots are everywhere, from customer support to personal assistants. Building a simple one helps you understand natural language processing (NLP) fundamentals such as tokenization, stemming/lemmatization, intent matching, and fallback strategies. In this project, you will create a Python chatbot that uses NLTK for preprocessing and combines rule-based phrase checks with similarity-based matching to answer user input.

Prerequisites

Python 3.7+ installed
Basic familiarity with Python (functions, dictionaries, I/O)
Internet access (for initial NLTK data download)
Install required libraries:
```
pip install nltk
```

Initial setup (run once to download necessary NLTK data):

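```
import nltk

nltk.download('punkt')
nltk.download('punkt_tab')  # tokenizer tables required by word_tokenize in NLTK 3.9 and later
nltk.download('wordnet')
nltk.download('omw-1.4')
nltk.download('stopwords')
```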

Project Features

Greeting detection
FAQ-style predefined responses
Similarity fallback using word overlap and WordNet synonyms
Small talk (for example, "how are you")
Exit command

Code: chatbot.py

```
import random
import nltk
from nltk.corpus import wordnet, stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

# Ensure NLTK data is present (uncomment on the first run)
# nltk.download('punkt')
# nltk.download('punkt_tab')  # required by word_tokenize in NLTK 3.9 and later
# nltk.download('wordnet')
# nltk.download('omw-1.4')
# nltk.download('stopwords')

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))

# Predefined patterns and responses
greeting_inputs = ["hi", "hello", "hey", "good morning", "good evening"]
greeting_responses = ["Hello!", "Hey there!", "Hi! How can I help you today?", "Greetings!"]

farewell_inputs = ["bye", "exit", "quit", "see you", "goodbye"]
farewell_responses = ["Goodbye!", "See you later!", "Have a great day!", "Bye!"]

faq = {
    "what is your name": "I am a simple Python chatbot.",
    "how are you": "I'm a program, so I am always functioning as expected!",
    "what can you do": "I can chat, answer basic questions, and try to understand you using simple NLP.",
    "who created you": "You did! Well, the tutorial did. Shivang is credited for this project.",
}

def normalize(text):
    """Lowercase the text and replace punctuation with single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())

def contains_phrase(text, phrase):
    """Whole-word phrase check, so "hi" does not match inside "this"."""
    return f" {phrase} " in f" {normalize(text)} "

def preprocess(text):
    """Lowercase, tokenize, drop non-alphabetic tokens and stop words, lemmatize."""
    tokens = word_tokenize(text.lower())
    filtered = []
    for token in tokens:
        if token.isalpha() and token not in stop_words:
            lemma = lemmatizer.lemmatize(token)
            filtered.append(lemma)
    return filtered

def word_overlap_score(user_tokens, key_tokens):
    """Number of distinct tokens shared by the two token lists."""
    return len(set(user_tokens) & set(key_tokens))

def synonym_match_score(user_tokens, key_tokens):
    """Count key tokens that equal a user token or appear among its WordNet synonyms."""
    score = 0
    for ut in user_tokens:
        synsets = wordnet.synsets(ut)
        synonyms = set()
        for syn in synsets:
            for lemma in syn.lemmas():
                synonyms.add(lemma.name())
        for kt in key_tokens:
            # Identical tokens count here too, so an exact match adds 1 + 0.5 in total.
            if kt == ut or kt in synonyms:
                score += 1
    return score

def get_best_faq_response(user_input):
    # Exact match first: questions such as "how are you" consist only of
    # stop words, so preprocess() would leave nothing to compare.
    exact = faq.get(normalize(user_input))
    if exact:
        return exact

    user_tokens = preprocess(user_input)
    best_score = 0
    best_response = None
    for question, answer in faq.items():
        key_tokens = preprocess(question)
        overlap = word_overlap_score(user_tokens, key_tokens)
        synonym_score = synonym_match_score(user_tokens, key_tokens)
        total = overlap + 0.5 * synonym_score  # weight synonyms a bit less
        if total > best_score:
            best_score = total
            best_response = answer
    if best_score >= 1:  # threshold
        return best_response
    return None

def respond(user_input):
    """Return (reply, should_exit) for one line of user input."""
    # Check for farewell
    if any(contains_phrase(user_input, phrase) for phrase in farewell_inputs):
        return random.choice(farewell_responses), True

    # Check for greeting
    if any(contains_phrase(user_input, phrase) for phrase in greeting_inputs):
        return random.choice(greeting_responses), False

    # FAQ or similarity match
    faq_resp = get_best_faq_response(user_input)
    if faq_resp:
        return faq_resp, False

    # Fallback: ask the user to rephrase
    return "Sorry, I didn't fully understand that. Can you rephrase?", False

def main():
    print("Welcome to the Python Chatbot. Type 'exit' to quit.")
    while True:
        try:
            user_input = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            # Leave cleanly on Ctrl+D / Ctrl+C instead of printing a traceback
            print("\nBot: Goodbye!")
            break
        if not user_input:
            print("Bot: Please say something.")
            continue
        reply, should_exit = respond(user_input)
        print(f"Bot: {reply}")
        if should_exit:
            break

if __name__ == "__main__":
    main()
```

Explanation

Text is lowercased and tokenized, then non-alphabetic tokens and stop words are dropped and the remaining tokens are lemmatized.
Basic intent matching: greetings and farewells are handled with direct phrase checks, farewells first.
FAQ-like questions are scored by word overlap plus half-weighted WordNet synonym matches; the best-scoring answer is returned if its score is at least 1.
Fallback: if nothing matches, the bot politely asks for clarification.
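
The scoring idea can be tried without downloading WordNet. The following is a minimal sketch rather than the tutorial's code: `TOY_SYNONYMS` is a small hand-written table standing in for WordNet lookups, and `score` is a hypothetical helper name. Unlike `synonym_match_score` above, it skips identical tokens so that an exact match is not counted twice.

```python
# Hypothetical stand-in for WordNet: each word maps to a few hand-picked synonyms.
TOY_SYNONYMS = {
    "price": {"cost", "fee"},
    "buy": {"purchase"},
}

def are_synonyms(a, b):
    """True if either word lists the other in the toy table."""
    return b in TOY_SYNONYMS.get(a, set()) or a in TOY_SYNONYMS.get(b, set())

def score(user_tokens, key_tokens):
    """Word overlap plus half-weighted synonym pairs (identical tokens excluded)."""
    user, key = set(user_tokens), set(key_tokens)
    overlap = len(user & key)
    synonym_pairs = sum(
        1 for u in user for k in key if u != k and are_synonyms(u, k)
    )
    return overlap + 0.5 * synonym_pairs

# "ticket" overlaps exactly; "cost" and "price" are synonyms in the toy table.
print(score(["ticket", "cost"], ["ticket", "price"]))  # 1.5
```

With a threshold of 1, a single synonym pair (0.5) is not enough on its own; at least one exact word match or two synonym pairs are needed before an answer is returned.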

Example Conversation

Welcome to the Python Chatbot. Type 'exit' to quit.
You: hello
Bot: Hey there!
You: what is your name
Bot: I am a simple Python chatbot.
You: who created you
Bot: You did! Well, the tutorial did. Shivang is credited for this project.
You: please tell me your name
Bot: I am a simple Python chatbot.
You: tell me a joke
Bot: Sorry, I didn't fully understand that. Can you rephrase?
You: bye
Bot: Have a great day!

Enhancements & Next Steps

Add context tracking so the bot remembers previous user intents.
Integrate sentiment analysis to adjust tone.
Replace the word-overlap matching with a vector similarity model (e.g., using sentence embeddings).
Hook into a GUI (Tkinter) or web interface (Flask).
Load the knowledge base from a JSON/YAML file for easier maintenance.
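
As one illustration of the last item, here is a minimal sketch of reading the question/answer pairs from a JSON file using only the standard library. The file name `faq.json` and the helper `load_faq` are assumptions made for this example, not part of the tutorial's code; the demo writes a sample file to a temporary directory so it runs on its own.

```python
import json
import tempfile
from pathlib import Path

def load_faq(path):
    """Load question -> answer pairs from a JSON object, lowercasing the questions."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        raise ValueError("FAQ file must contain a JSON object of question/answer pairs")
    return {str(q).lower().strip(): str(a) for q, a in data.items()}

# Demo: write a small sample file (hypothetical name), then load it back.
sample = {"What is your name": "I am a simple Python chatbot."}
with tempfile.TemporaryDirectory() as tmp:
    faq_path = Path(tmp) / "faq.json"
    faq_path.write_text(json.dumps(sample), encoding="utf-8")
    faq = load_faq(faq_path)

print(faq)
```

Lowercasing the keys on load keeps them comparable with user input, which the chatbot also lowercases before matching.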

Security & Best Practices

Sanitize user input (for example, strip control characters and cap its length) before passing it to external systems such as databases, shells, or APIs.
Avoid exposing internal logic or data when adapting for production.
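
A basic sanitizing step might look like the sketch below. The `sanitize` helper and the 500-character limit are assumptions chosen for illustration. It only removes invisible characters and bounds the input size, so it does not replace context-specific protection such as parameterized SQL queries or HTML escaping.

```python
import unicodedata

MAX_INPUT_LENGTH = 500  # assumed limit; tune for your application

def sanitize(text, max_length=MAX_INPUT_LENGTH):
    """Drop control/format characters, collapse whitespace, and cap the length."""
    # Turn whitespace characters (tabs, newlines) into plain spaces first.
    spaced = "".join(" " if ch.isspace() else ch for ch in text)
    # Unicode categories starting with "C" cover control, format, and unassigned code points.
    visible = "".join(
        ch for ch in spaced if not unicodedata.category(ch).startswith("C")
    )
    return " ".join(visible.split())[:max_length].rstrip()

print(sanitize("hello\x00 \t world\n"))  # hello world
```

In the chatbot, such a function could be applied to the raw `input()` string before it reaches `respond`.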

Conclusion

This chatbot project introduces core NLP preprocessing steps and demonstrates how simple logic combined with language resources like WordNet can yield a conversational agent. It's lightweight, extensible, and a great stepping stone toward more advanced AI assistants.