Pentaho Data Integration And Business Analytics

Pentaho simplifies the preparation and blending of data and offers a broad spectrum of self-service business analytics.

Pentaho is a company offering Pentaho Business Analytics, a suite of open source business intelligence products. The products provide OLAP services, data integration, dashboarding, reporting, data mining and ETL capabilities. The company was founded in 2004 and is headquartered in Orlando, Florida. In 2015 it was acquired by Hitachi Data Systems.


Pentaho was born out of the need to achieve disruptive, positive change in the business analytics market, which is dominated by costly, heavyweight products built on outdated technology. Its modern, flexible subscription pricing and core-based model aim to offer the most value for the cost, with surveys reporting cost savings of up to 80 percent compared with traditional analytics vendors.

The platform simplifies the preparation and blending of data and provides a wide spectrum of self-service business analytics, including dashboards, data visualization, reports, discovery, and predictive analytics.


The Pentaho data integration platform offers the extract, transform, and load (ETL) functionality necessary for integrating a wide range of data sources, including enterprise apps, relational databases, files, and big data. The ETL architecture of the platform supports the creation and maintenance of target databases such as data marts, data warehouses, and data lakes. The product is the data integration portion of the Business Analytics platform, which also includes data preparation and governance capabilities. Data integration can be used alone or in conjunction with the other tools.
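
To make the extract, transform, and load stages concrete, here is a minimal sketch in plain Python. It is not Pentaho code and does not use any Pentaho API; the sample CSV data, the orders table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a target data mart.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for a file-based source.
SAMPLE_CSV = """order_id,region,amount
1,east,10.50
2,west,4.25
3,east,7.00
"""


def extract(text):
    """Extract: read rows from CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize region names and convert fields to typed values."""
    return [
        (int(r["order_id"]), r["region"].strip().upper(), float(r["amount"]))
        for r in rows
    ]


def load(conn, records):
    """Load: write the cleaned records into a target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn, text=SAMPLE_CSV):
    """Run the three stages, then query the target as an analytics tool might."""
    load(conn, transform(extract(text)))
    return conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl(sqlite3.connect(":memory:")))
```

A graphical ETL tool expresses the same three stages as connected steps rather than functions, but the flow of data from source to target is the same.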

Version 5.0 provides a unified, open platform for accessing, integrating, and blending any data, in any scenario, across the entire analytics spectrum. A new, modern interface streamlines the analytics experience for all users who turn data into a competitive edge.


The Pentaho suite has two offerings: an enterprise edition and a community edition. The enterprise edition has extra features that are not found in the community edition. It is available through an annual subscription that includes extra support services. The core offering of Pentaho is often enhanced by add-on products, typically in the form of plug-ins, from the organization itself as well as from a broader community of enthusiasts and users.


The new Pentaho release simplifies the user interface. Users can easily browse files, build new content, mark favorites, access documents, and more. The experience for administrators has also been redesigned: a seamlessly integrated administrator perspective makes administration faster and more efficient.


Small and medium businesses as well as large enterprises use the product as a cohesive and comprehensive data integration and business analytics platform. Aside from direct sales, Pentaho has an embedded OEM network that allows vendors to extend their products with analytics and data integration capabilities.

Aside from the commercial versions, Pentaho provides an open source version of the data integration product known as Kettle. Many organizations begin with the open source Kettle tool to explore its integration capabilities on limited integration workloads.


The latest version, Pentaho Data Integration 6.1, offers the following:

  1. Allows connectivity to a wide array of big data stores, relational databases, files, and enterprise apps as sources or targets in integration tasks.
  2. Offers a graphical ETL designer that lets data integration teams design, test, and deploy integration processes, workflows, notifications, and alerts.
  3. Provides an extensive library of prebuilt data integration transformations, which support complex process workflows.
  4. Lets users visualize data during data preparation and publish metadata models to the analytics tools.
  5. Offers repository-based development tools that manage the design, creation, testing, deployment, and operation of integration processes and provide support for metadata.


Deepest and broadest big data integration: Updates for popular big data stores, plus new capabilities for managing huge volumes of data, such as rollback, job restart, and load balancing.
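
As a rough illustration of what rollback and job restart mean for a bulk load, the sketch below uses plain Python with SQLite. It is not how Pentaho implements these features; the facts and checkpoint tables and the load_batches function are hypothetical names chosen for the example, and load balancing is not covered.

```python
import sqlite3


def load_batches(conn, batches):
    """Load batches one transaction at a time, skipping batches already loaded.

    Returns the number of the first batch that fails, or None if all succeed.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS facts (value INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoint (batch_no INTEGER PRIMARY KEY)"
    )
    conn.commit()

    # Restart support: find out which batches earlier runs already committed.
    done = {row[0] for row in conn.execute("SELECT batch_no FROM checkpoint")}

    for batch_no, values in enumerate(batches):
        if batch_no in done:
            continue  # committed in an earlier run, so do not load it again
        try:
            # The connection context manager commits on success and
            # rolls back the whole batch if an exception is raised.
            with conn:
                conn.executemany(
                    "INSERT INTO facts VALUES (?)", [(v,) for v in values]
                )
                conn.execute("INSERT INTO checkpoint VALUES (?)", (batch_no,))
        except sqlite3.IntegrityError:
            return batch_no  # partial rows from this batch were rolled back
    return None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # The second batch contains a bad value (None violates NOT NULL).
    print(load_batches(conn, [[1, 2], [3, None], [5]]))
    # After the source data is corrected, rerunning resumes at the failed batch.
    print(load_batches(conn, [[1, 2], [3, 4], [5]]))
```

Recording the checkpoint in the same transaction as the data is the key design choice here: a batch is either fully loaded and marked done, or not loaded at all, so a restarted job never duplicates rows.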

Simplified integration and extensibility for embedding: New REST services for third-party application developers and an extensive library of samples to eliminate start-up time.


As the first major business intelligence vendor to introduce big data capabilities, in May 2010, Pentaho led the charge in big data integration and analytics. This first-mover advantage allowed Pentaho to engage with big data clients early and address the first emerging use cases, while continuing to deliver technical innovations that keep users ahead of the big data curve.

The robust partner ecosystem of the company includes technology leaders like Cisco, Amazon Web Services, HP Vertica, Cloudera, DataStax, Dell, EMC Greenplum, and many more.

There is no doubt that Pentaho is among the leaders in big data. The platform saves time and money, which is a great advantage for business organizations big and small.