I didn’t start using OpenSearch because I wanted a search engine.
I started using it because our engineering team was drowning in logs.
We were managing multiple microservices, each writing its own log file.
During incidents, everyone would open five terminals and manually grep through gigabytes of text. It worked… until it didn’t.
That is where OpenSearch entered the picture.
Why We Picked OpenSearch
The decision wasn’t driven by anything fancy. We needed:
A way to search logs quickly
A dashboard to visualize spikes and failures
A tool that didn’t require a license nightmare
Something easy to deploy on regular infrastructure
OpenSearch checked all those boxes, and more importantly, we could start small.
The Practical Example: API Error Monitoring System
Let me walk through the exact setup we built — simple but powerful.
Step 1: Sending Logs to OpenSearch
Each microservice (Node.js, Java, and Python) wrote logs in JSON format.
Instead of shipping raw text, we forwarded logs to OpenSearch using Filebeat.
A typical log entry looked like this:
{
  "timestamp": "2025-01-15T14:21:10Z",
  "service": "payment-api",
  "level": "error",
  "message": "Transaction failed due to invalid token",
  "userId": "U3021",
  "latencyMs": 842
}
The moment logs started flowing in, OpenSearch automatically created an index.
But this is where we made the first improvement: we defined our own index mapping.
Why? Because the default mapping treated latencyMs as text.
Try running aggregations on text — it won’t end well.
Our custom mapping:
{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "service": { "type": "keyword" },
      "level": { "type": "keyword" },
      "message": { "type": "text" },
      "userId": { "type": "keyword" },
      "latencyMs": { "type": "integer" }
    }
  }
}
Just this small step improved reliability dramatically.
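One way to make the mapping apply to every new index automatically is to register it as an index template. Here is a minimal sketch using the REST API from Python; the index pattern service-logs-* is an assumption about naming, and it targets a local dev cluster with the security plugin disabled (add auth and TLS for a real one).

import requests

OPENSEARCH_URL = "http://localhost:9200"  # dev cluster, security disabled

template = {
    "index_patterns": ["service-logs-*"],  # assumed index naming scheme
    "template": {
        "mappings": {
            "properties": {
                "timestamp": {"type": "date"},
                "service": {"type": "keyword"},
                "level": {"type": "keyword"},
                "message": {"type": "text"},
                "userId": {"type": "keyword"},
                "latencyMs": {"type": "integer"},
            }
        }
    },
}

# Register the template; every new matching index inherits the mapping
resp = requests.put(f"{OPENSEARCH_URL}/_index_template/service-logs", json=template)
resp.raise_for_status()
print(resp.json())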
Step 2: Querying Errors Using DSL
Once the logs landed in OpenSearch, we started exploring the Query DSL.
One of the first real queries we wrote was:
“Show me all payment-api errors in the last 15 minutes.”
{
  "query": {
    "bool": {
      "must": [
        { "term": { "service": "payment-api" }},
        { "term": { "level": "error" }}
      ],
      "filter": {
        "range": {
          "timestamp": { "gte": "now-15m" }
        }
      }
    }
  }
}
The speed surprised everyone.
What took minutes with manual grepping became a sub-second search.
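The same query is just as easy to run from a script against the _search endpoint. A minimal Python sketch, with the cluster address and index pattern as placeholders:

import requests

OPENSEARCH_URL = "http://localhost:9200"  # dev cluster, security disabled
INDEX = "service-logs-*"                  # assumed index pattern

query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"service": "payment-api"}},
                {"term": {"level": "error"}},
            ],
            "filter": {"range": {"timestamp": {"gte": "now-15m"}}},
        }
    }
}

resp = requests.post(f"{OPENSEARCH_URL}/{INDEX}/_search", json=query)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    doc = hit["_source"]
    print(doc["timestamp"], doc["service"], doc["message"])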
Step 3: Error Spike Alerts
Once we saw how quickly queries worked, we started thinking:
Can OpenSearch tell us when something unusual happens?
Yes — through alerting.
We set up a rule:
Every minute, count errors in the payment-api service
If error count rises above 20 per minute
Send a Slack alert to the DevOps team
Aggregation used for the rule:
{
  "size": 0,
  "query": {
    "term": { "service": "payment-api" }
  },
  "aggs": {
    "errors_per_minute": {
      "date_histogram": {
        "field": "timestamp",
        "fixed_interval": "1m"
      },
      "aggs": {
        "error_count": {
          "filter": { "term": { "level": "error" }}
        }
      }
    }
  }
}
This small automation saved us from multiple late-night outages.
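The alert itself was configured through OpenSearch’s alerting feature, but the rule is easy to sanity-check from a small script before wiring it into a monitor. The sketch below is not our monitor definition, just the same logic: count payment-api errors from the last minute and post to Slack if the count passes 20. The index pattern and webhook URL are placeholders.

import requests

OPENSEARCH_URL = "http://localhost:9200"  # dev cluster, security disabled
INDEX = "service-logs-*"                  # assumed index pattern
SLACK_WEBHOOK = "https://hooks.slack.com/services/T000/B000/XXXX"  # placeholder
THRESHOLD = 20

# Count payment-api errors in the last minute
body = {
    "size": 0,
    "track_total_hits": True,
    "query": {
        "bool": {
            "filter": [
                {"term": {"service": "payment-api"}},
                {"term": {"level": "error"}},
                {"range": {"timestamp": {"gte": "now-1m"}}},
            ]
        }
    },
}

resp = requests.post(f"{OPENSEARCH_URL}/{INDEX}/_search", json=body)
resp.raise_for_status()
error_count = resp.json()["hits"]["total"]["value"]

if error_count > THRESHOLD:
    requests.post(SLACK_WEBHOOK, json={
        "text": f"payment-api error spike: {error_count} errors in the last minute"
    })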
Step 4: Creating a Dashboard
Once the data came alive, dashboards became addictive.
Our main dashboard brought the same error and latency data together in one place.
The chart the business team loved most showed API latency trends.
It revealed something unexpected:
The payment API slowed down only during weekends — which aligned exactly with peak traffic.
We would never have spotted this pattern with plain log files.
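The chart itself lived in OpenSearch Dashboards, but the aggregation behind that kind of latency-trend panel looks roughly like this: hourly average of latencyMs for the payment API. The index pattern and cluster address are placeholders.

import requests

OPENSEARCH_URL = "http://localhost:9200"  # dev cluster, security disabled
INDEX = "service-logs-*"                  # assumed index pattern

body = {
    "size": 0,
    "query": {"term": {"service": "payment-api"}},
    "aggs": {
        "latency_over_time": {
            "date_histogram": {"field": "timestamp", "fixed_interval": "1h"},
            "aggs": {"avg_latency": {"avg": {"field": "latencyMs"}}},
        }
    },
}

resp = requests.post(f"{OPENSEARCH_URL}/{INDEX}/_search", json=body)
resp.raise_for_status()
for bucket in resp.json()["aggregations"]["latency_over_time"]["buckets"]:
    print(bucket["key_as_string"], bucket["avg_latency"]["value"])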
What We Learned the Hard Way
1. Bad mappings lead to painful debugging
Our first few days were spent wondering why aggregations didn’t work.
The root cause: OpenSearch treated numbers as strings.
2. Too many small indices slow down queries
We initially created a new index every hour.
It felt organized, but it wrecked performance.
Switching to one daily index fixed it.
3. Dashboards can become operational tools
Developers started watching the dashboard after every production deployment.
This helped catch issues faster than waiting for users to report them.
4. Snapshots are not optional
We once lost data due to a corrupted node and no snapshots.
After that, automated S3 snapshots became mandatory.
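For reference, the snapshot setup boils down to two REST calls: register an S3 repository, then take (or schedule) snapshots against it. A rough sketch with placeholder bucket, repository, and index names; it assumes the repository-s3 plugin is installed and S3 credentials are configured on every node.

import requests

OPENSEARCH_URL = "http://localhost:9200"  # dev cluster, security disabled

# 1. Register an S3 snapshot repository (bucket name is a placeholder)
repo = {"type": "s3", "settings": {"bucket": "my-opensearch-snapshots", "base_path": "logs"}}
requests.put(f"{OPENSEARCH_URL}/_snapshot/log-backups", json=repo).raise_for_status()

# 2. Take a snapshot of the log indices; in practice this call is scheduled
#    (cron or OpenSearch's snapshot management) rather than run by hand
snapshot = {"indices": "service-logs-*"}
requests.put(
    f"{OPENSEARCH_URL}/_snapshot/log-backups/snapshot-2025-01-15",
    json=snapshot,
).raise_for_status()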
Where OpenSearch Fits Best in Daily Engineering Work
Through this practical use-case, a few patterns emerged:
It excels in log analytics
It is perfect for building search features quickly
It works well for real-time trend visualization
Its alerting reduces manual monitoring drastically
But it’s not meant to replace transactional databases or heavy BI systems.
Its strength lies in fast search and analysis.
Final Thoughts
Our first OpenSearch setup was extremely simple, but it brought an immediate jump in visibility and debugging speed.
Over a few months, it went from a basic log viewer to a crucial monitoring system for all microservices.
For anyone starting out, my recommendation is to begin small, the way we did.
OpenSearch grows with you, and the more data you feed it, the more valuable it becomes.