📃 Text Summarization with T5 and Hugging Face Transformers

Text summarization is one of the most practical applications of NLP. Whether you're dealing with long articles, reports, or customer feedback, being able to condense text automatically saves time and improves productivity.

In this tutorial, we'll show you how to use the T5 (Text-to-Text Transfer Transformer) model from Hugging Face to build a text summarizer in Python. You'll learn how to load a pre-trained T5 model, fine-tune it (optional), and wrap it in a simple web app using Gradio.

🚀 What You'll Build

  • A Python script that summarizes long text using a T5 model
  • A Gradio web interface for instant summarization
  • Optional: Fine-tune T5 on custom data for domain-specific summaries

🛠 Step 1. Install Required Libraries

pip install transformers datasets gradio sentencepiece torch

(T5's tokenizer depends on sentencepiece, and the model itself runs on PyTorch, so both are included here.)

📆 Step 2. Load the Pretrained T5 Model

We'll use t5-small for fast inference. You can upgrade to t5-base or t5-large for better results.

from transformers import T5Tokenizer, T5ForConditionalGeneration
model_name = "t5-small"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

🔀 Step 3. Define a Summarization Function

def summarize(text, max_length=150):
    # T5 is a text-to-text model, so the task is signaled with a "summarize: " prefix
    input_text = "summarize: " + text
    # Tokenize, truncating the input to T5's 512-token limit
    input_ids = tokenizer.encode(input_text, return_tensors="pt", max_length=512, truncation=True)
    # Beam search with a length penalty encourages concise but complete summaries
    summary_ids = model.generate(input_ids, max_length=max_length, min_length=30, length_penalty=2.0, num_beams=4, early_stopping=True)
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)

📄 Step 4. Try It on Sample Text

text = """
The Eiffel Tower is one of the most iconic landmarks in the world. Located in Paris, France, it was constructed for the 1889 World's Fair. Initially criticized by some of France's leading artists and intellectuals for its design, it has since become a global cultural icon and one of the most recognizable structures in the world.
"""

print(summarize(text))

🌐 Step 5. Create a Gradio Web App

import gradio as gr
demo = gr.Interface(fn=summarize, inputs="textbox", outputs="textbox", title="T5 Text Summarizer")
demo.launch()

This launches a local web app where users can paste text and get an instant summary.

✅ Optional: Fine-Tune T5 on a Custom Summarization Dataset

You can fine-tune the model using datasets like CNN/DailyMail, XSum, or your own content.

Use the datasets library to load a dataset:

from datasets import load_dataset
dataset = load_dataset("cnn_dailymail", "3.0.0")
print(dataset["train"][0])

Then format the input as "summarize: <text>" and use Hugging Face's Trainer class to fine-tune.
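As a sketch of that formatting step (the `preprocess` helper below is illustrative, not part of the tutorial; the column names "article" and "highlights" match CNN/DailyMail and would change for your own data), the per-batch function you would pass to `Dataset.map` might look like:

```python
# Sketch of the preprocessing step for fine-tuning T5 on summarization data.

def preprocess(batch, tokenizer, max_input_len=512, max_target_len=128):
    """Prepend the T5 task prefix, then tokenize inputs and target summaries."""
    inputs = ["summarize: " + doc for doc in batch["article"]]
    model_inputs = tokenizer(inputs, max_length=max_input_len, truncation=True)
    # text_target routes the summaries through the tokenizer's target-side settings
    labels = tokenizer(text_target=batch["highlights"], max_length=max_target_len, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

The mapped dataset can then be handed to Hugging Face's `Seq2SeqTrainer` together with a `DataCollatorForSeq2Seq`; the full training loop is beyond the scope of this tutorial.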

🔧 Conclusion

With just a few lines of code, you've built a real-time text summarizer using Hugging Face Transformers and T5. This setup can handle everything from news articles and blog posts to research summaries and customer reviews.
