
What You Need to Know

The data integration tools market continues to gain momentum as organizations recognize the role of these technologies in support of high-profile initiatives such as master data management, business intelligence and delivery of service-oriented architectures. Vendor consolidation continues, as does convergence of single-purpose tools. While most vendors still approach this market with multiple products, metadata-driven architectures supporting a range of data delivery styles continue to emerge. Organizations seeking data integration tools must assess their current and future requirements and map them against product functionality, including support for a range of data integration patterns and latencies. In addition, buyers must recognize that, because this is an evolving market, disruptions due to merger and acquisition activity are likely as smaller vendors with valuable technology continue to be subsumed into larger entities to form more complete data integration tools portfolios.



Magic Quadrant

Figure 1. Magic Quadrant for Data Integration Tools, 2007
Source: Gartner

The discipline of data integration comprises the practices, architectural techniques and tools for achieving the consistent access and delivery of data across the spectrum of data subject areas and data structure types in the enterprise, in order to meet the data consumption requirements of all applications and business processes. As such, data integration capabilities are at the heart of the information-centric infrastructure and will power the frictionless sharing of data across all organizational and system boundaries. Contemporary pressures are leading to an increase in investment in data integration in all vertical industries and geographies. Business drivers, such as the imperative for speed to market and agility to change business processes and models, are forcing organizations to manage their data assets differently. Simplification of processes and the IT infrastructure are necessary to achieve transparency, and transparency requires a consistent and complete view of the data, which represents the performance and operation of the business. Data integration is a critical component of an overall enterprise information management strategy that can address these data-oriented issues.
From a technology point of view, data integration tools were traditionally delivered via a set of related markets, with vendors in each market offering a specific style of data integration tool. Most of the activity has been within the extraction, transformation and loading (ETL) tools market, with a growth in the use of ETL tools for data warehouse implementations and, more recently, other types of data integration problems. Markets for replication tools, data federation and other submarkets each contained vendors offering tools optimized for a particular style of data integration. A variety of other related markets, such as data quality tools, adapters and data modeling tools, also overlap with the data integration tools space. The result of all this historical fragmentation in the markets is the equally fragmented and complex way in which data integration is accomplished in large enterprises: different teams using different tools with little consistency, lots of overlap and redundancy, and no common management and leverage of metadata. Technology buyers have been forced to acquire a portfolio of tools from multiple vendors in order to amass the capabilities necessary to address the full range of their data integration requirements.
With the emergence of the data integration tools market, the separate and distinct submarkets continue to converge, both at a vendor and technology level. This is being driven by buyer demands (for example, organizations realizing they need to think about data integration holistically and have a common set of data integration capabilities they can use across the enterprise). It is also being driven by the actions of the vendors (for example, vendors in individual data integration submarkets organically expanding their capabilities into neighboring areas, and acquisition activity bringing vendors from multiple submarkets together). The result is a market for complete data integration tools that address a range of different data integration styles and are based on common design tooling, metadata and runtime architecture. This market has supplanted the former data integration tools submarkets, such as ETL, and becomes the competitive landscape in which Gartner evaluates vendors for placement within this Magic Quadrant. Gartner estimates that the current size of the market for data integration tools is approximately $1.2 billion as of the end of 2006, and that it is growing at a compound annual rate of more than 17% (see "Market Size and Forecast: Data Integration Tools, Worldwide, 2005-2010").
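As a rough illustration of what the growth estimate above implies, the following sketch compounds the $1.2 billion base for 2006 at a constant 17% annual rate. The constant rate is an assumption (the text says "more than 17%"), the function name is hypothetical, and the results are simple arithmetic rather than Gartner forecast figures.

```python
def project_market_size(base_billion: float, annual_rate: float, years: int) -> float:
    """Compound a starting market size forward by a fixed annual growth rate."""
    return base_billion * (1 + annual_rate) ** years


BASE_2006 = 1.2  # USD billions, taken from the estimate in the text
RATE = 0.17      # assumed constant rate; the text gives this only as a lower bound

for year in range(2006, 2011):
    size = project_market_size(BASE_2006, RATE, year - 2006)
    print(f"{year}: ${size:.2f}B")
```

Under these assumptions, the market would reach roughly $2.25 billion by 2010.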

Market Definition/Description
The data integration tools market comprises vendors that offer software products to enable the construction and implementation of data access and delivery infrastructure for a variety of data integration scenarios, including:
- Data acquisition for business intelligence (BI) and data warehousing: extracting data from operational systems, transforming and merging that data, and delivering it to integrated data structures for analytic purposes. BI and data warehousing remain a mainstay of the demand for data integration tools.
- Creation of integrated master data stores: enabling the consolidation and rationalization of the data representing critical business entities such as customers, products and employees. Master data management may or may not be subject-based, and data integration tools can be used to build the data consolidation and synchronization processes that are key to success.
- Data migrations/conversions: although traditionally addressed most often via the custom coding of conversion programs, the data movement and transformation challenges inherent in the replacement of legacy applications and in consolidation efforts during merger and acquisition activities are increasingly being addressed by data integration tools.
- Synchronization of data between operational applications: similar in concept to each of the prior scenarios. Data integration tools provide the capability to ensure database-level consistency across applications, both on an internal and interenterprise basis, and in a bidirectional or unidirectional manner.
- Creation of federated views of data from multiple data stores: data federation, often referred to as enterprise information integration (EII), is growing in popularity as an approach for providing real-time integrated views across multiple data stores without physical movement of data. Data integration tools are increasingly including this type of virtual federation capability.
- Delivery of data services in a service-oriented architecture (SOA) context: an architectural technique rather than a data integration usage itself, data services are the emerging trend for the role and implementation of data integration capabilities within SOAs. Data integration tools will increasingly enable the delivery of many types of data services.
- Unification of structured and unstructured data: also not a specific use case itself, and relevant to each of the above scenarios. There is an early but growing trend toward leveraging data integration tools for merging both structured and unstructured data sources, as organizations work on delivering a holistic information infrastructure that addresses all data types.
Gartner has defined several classes of functional capabilities that vendors of data integration tools must possess in order to deliver optimal value to organizations in support of a full range of data integration scenarios:

Connectivity/Adapter Capabilities (Data Source and Target Support)
The ability to interact with a range of different data structure types, including:
- Relational databases.
- Legacy and non-relational databases.
- Various file formats.
- XML.
- Packaged applications such as customer relationship management (CRM) and supply chain management (SCM).
- Industry-standard message formats such as electronic data interchange (EDI), Society for Worldwide Interbank Financial Telecommunication (SWIFT) and Health Level 7 (HL7).
- Message queues, including those provided by application integration middleware products (such as MQ) and standards-based products (such as Java Messaging Service [JMS]).
- Semi-structured data, such as e-mail, Web sites, office productivity tools and content repositories.
In addition, data integration tools must support different modes of interaction with this range of data structure types, including:
- Bulk acquisition and delivery.
- Granular trickle-feed acquisition and delivery.
- Changed-data capture (ability to identify and extract modified data).
- Event-based acquisition (time-based or data value-based).
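To make changed-data capture concrete, here is a minimal sketch that identifies modified data by comparing two snapshots of a table keyed by primary key. This is only one possible approach, chosen for simplicity; production tools often read database logs or use triggers instead. The function name and the snapshot layout are assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]


def capture_changes(previous: Dict[int, Row], current: Dict[int, Row]) -> Dict[str, List[int]]:
    """Return the primary keys that were inserted, updated or deleted between two snapshots."""
    inserts = [key for key in current if key not in previous]
    deletes = [key for key in previous if key not in current]
    updates = [key for key in current if key in previous and current[key] != previous[key]]
    return {"insert": sorted(inserts), "update": sorted(updates), "delete": sorted(deletes)}


# Hypothetical snapshots of a small customer table.
before = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
after = {2: {"name": "Robert"}, 3: {"name": "Cho"}}
changes = capture_changes(before, after)
print(changes)
```

Only the keys listed in `changes` would need to be extracted and delivered downstream, which is what distinguishes this mode from bulk acquisition.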

Data-Delivery Capabilities
The ability to provide data to consuming applications, processes and databases in a variety of modes, including:
- Physical bulk data movement between data repositories.
- Federated views formulated in memory.
- Message-oriented movement via encapsulation.
- Replication of data between homogeneous or heterogeneous database management systems (DBMSs) and schemas.
In addition, support for delivery of data across the range of latency requirements is important:
- Scheduled batch delivery.
- Streaming/real-time delivery.
- Event-driven delivery.

Data Transformation Capabilities
Built-in capabilities for achieving data transformation operations of varying complexity, including:
- Basic transformations, such as data type conversions, string manipulations and simple calculations.
- Intermediate complexity transformations, such as lookup and replace operations, aggregations, summarizations, deterministic matching and management of slowly changing dimensions.
- Complex transformations, such as sophisticated parsing operations on free-form text and rich media.
In addition, the tools must provide facilities for development of custom transformations and extension of packaged transformations.
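As an example of an intermediate-complexity transformation, the sketch below shows one common way of managing a slowly changing dimension: keeping history by closing the current row and appending a new version (often called "Type 2" handling, a term not used in this report). The row layout, the `valid_from`/`valid_to` column names and the function name are assumptions for illustration, not a description of any vendor's implementation.

```python
from datetime import date
from typing import Dict, List


def apply_scd2(dimension: List[Dict], key: str, incoming: Dict, as_of: date) -> None:
    """Apply one incoming source record to a history-keeping dimension, in place."""
    # Find the open (current) version of this business key, if any.
    current = next(
        (row for row in dimension if row[key] == incoming[key] and row["valid_to"] is None),
        None,
    )
    if current is not None:
        tracked = {k: v for k, v in current.items() if k not in ("valid_from", "valid_to")}
        if tracked == incoming:
            return  # attributes unchanged, so no new version is needed
        current["valid_to"] = as_of  # close the old version
    dimension.append({**incoming, "valid_from": as_of, "valid_to": None})


customers: List[Dict] = []
apply_scd2(customers, "customer_id", {"customer_id": 1, "city": "Boston"}, date(2007, 1, 1))
apply_scd2(customers, "customer_id", {"customer_id": 1, "city": "Boston"}, date(2007, 3, 1))
apply_scd2(customers, "customer_id", {"customer_id": 1, "city": "Austin"}, date(2007, 6, 1))
```

After these three calls the dimension holds two rows for the customer: a closed Boston version and an open Austin version.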

Metadata and Data-Modeling Capabilities
As the increasingly important heart of data integration capabilities, metadata management and data-modeling requirements include:
- Automated discovery and acquisition of metadata from data sources, applications and other tools.
- Data-model creation and maintenance.
- Physical to logical model mapping and rationalization.
- Defining model-to-model relationships via graphical attribute-level mapping.
- Lineage and impact analysis reporting, via graphical and tabular format.
- An open metadata repository, with the ability to share metadata bidirectionally with other tools.
- Automated synchronization of metadata across multiple instances of the tools.
- Ability to extend the metadata repository with customer-defined metadata attributes and relationships.
- Documentation of project/program delivery definitions and design principles in support of requirements definition activities.
- Business analyst/end-user interface to view and work with metadata.

Design and Development Environment Capabilities
Facilities for enabling the specification and construction of data integration processes, including:
- Graphical representation of repository objects, data models and data flows.
- Workflow management for the development process, addressing approvals, promotions and so on.
- Granular role-based and developer-based security.
- Team-based development capabilities, such as version control and collaboration.
- Functionality to support reuse across developers and projects, and facilitate identification of redundancies.
- Testing and debugging.

Data Governance Capabilities (Data Quality, Profiling and Mining)
Mechanisms for aiding the understanding and assurance of quality of data over time, including interoperability with:
- Data profiling tools.
- Data mining tools.
- Data quality tools.

Runtime Platform Capabilities
Breadth of support for hardware and operating systems on which data integration processes may be deployed, specifically:
- Mainframe environments such as IBM z/OS.
- Midrange environments such as IBM System i (AS/400) or HP Tandem.
- Unix-based environments.
- Wintel environments.
- Linux environments.

Operations and Administration Capabilities
Facilities for enabling adequate ongoing support, management, monitoring and control of data integration processes implemented via the tools, such as:
- Error-handling functionality, both pre-defined and customizable.
- Monitoring and control of runtime processes.
- Collection of runtime statistics to determine usage and efficiency, as well as an application-style interface for visualization and evaluation.
- Security controls, for both data "in flight" and administrator processes.
- Runtime architecture that ensures performance and scalability.

Architecture and Integration
The degree of commonality, consistency and interoperability between the various components of the data integration toolset, including:
- Minimal number of products (ideally one) supporting all data delivery modes.
- Common metadata (single repository) and/or the ability to share metadata across all components and data delivery modes.
- Common design environment for supporting all data delivery modes.
- Ability to switch seamlessly and transparently between delivery modes with minimal re-work.
- Interoperability with other integration tools and applications, via certified interfaces and robust application programming interfaces (APIs).
- Efficient support for all data delivery modes regardless of runtime architecture type (centralized server engine vs. distributed runtime).

Service-Enablement Capabilities
As acceptance of data services concepts continues to grow, data integration tools must exhibit service-oriented characteristics and provide support for SOA deployments, such as:
- Ability to deploy all aspects of runtime functionality as data services.
- Management of publication and testing of data services.
- Interaction with service repositories and registries.
- Service-enablement of the development and administration environments, such that external tools and applications can dynamically modify and control runtime behavior of the tools.

Inclusion and Exclusion Criteria
For vendors to be included in this Magic Quadrant they must meet the following requirements:
- Possess within their technology portfolio the subset of capabilities identified by Gartner as most critical from within the overall range of capabilities expected in data integration tools. Specifically, vendors must deliver the following functional requirements:
  - Range of connectivity/adapter support (sources and targets): native access to relational DBMS products, plus access to non-relational legacy data structures, flat files and XML.
  - Mode of connectivity/adapter support (against a range of sources and targets): bulk/batch, plus at least one additional mode (real-time or trickle-feed, changed-data capture, or event capture).
  - Data delivery modes support: bulk/batch (ETL-style) delivery, plus at least one additional mode (federated views, message-oriented delivery or data replication).
  - Data transformation support: at a minimum, packaged capabilities for basic transformations (such as data type conversions, string manipulations and calculations).
  - Metadata and data modeling support: automated metadata discovery, lineage and impact analysis reporting, and an open metadata repository including mechanisms for bidirectional sharing of metadata with other tools.
  - Design and development support: graphical design/development environment and team development capabilities (such as version control and collaboration).
  - Data governance support: ability to interoperate at a metadata level with data profiling and/or data quality tools.
  - Runtime platform support: Windows, Unix or Linux operating systems.
- Generate at least $20 million of annual software revenue from data integration tools or maintain at least 300 production customers.
- Support data integration tools customers in at least two of the major geographic regions (North America, Latin America, Europe and Asia/Pacific).
- Have customer implementations that reflect the use of the tools at an enterprise (cross-departmental and multi-project) level.
Vendors focusing only on one specific data subject area (only customer data integration, for example), a single vertical industry, or just their own data models and architectures are excluded from this market.
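The non-functional inclusion and exclusion rules above can be read as a simple predicate. The sketch below encodes them under stated assumptions: the class, field and function names are hypothetical, the functional requirements are collapsed into a single boolean, and the example vendor is invented. It is not Gartner's actual evaluation procedure.

```python
from dataclasses import dataclass, field
from typing import Set

MAJOR_REGIONS = {"North America", "Latin America", "Europe", "Asia/Pacific"}


@dataclass
class Vendor:
    name: str
    annual_di_revenue_usd: float          # software revenue from data integration tools
    production_customers: int
    regions: Set[str] = field(default_factory=set)
    meets_functional_requirements: bool = False
    enterprise_level_deployments: bool = False
    single_focus_only: bool = False       # one subject area, one vertical, or only own data models


def meets_inclusion_criteria(v: Vendor) -> bool:
    """Check the revenue-or-customers, regional, enterprise-use and exclusion rules."""
    scale_ok = v.annual_di_revenue_usd >= 20_000_000 or v.production_customers >= 300
    regions_ok = len(v.regions & MAJOR_REGIONS) >= 2
    return (
        v.meets_functional_requirements
        and scale_ok
        and regions_ok
        and v.enterprise_level_deployments
        and not v.single_focus_only
    )


# Invented vendor: below the revenue threshold but above the customer threshold.
example = Vendor(
    name="ExampleVendor",
    annual_di_revenue_usd=5_000_000,
    production_customers=350,
    regions={"North America", "Europe"},
    meets_functional_requirements=True,
    enterprise_level_deployments=True,
)
print(meets_inclusion_criteria(example))
```

Note that the revenue and customer thresholds are alternatives, so a vendor needs to satisfy only one of them, while every other condition must hold.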
Gartner notes that many other data integration tools vendors currently exist beyond those included in this Magic Quadrant. However, most do not meet the above criteria and therefore are not included in this analysis. Market trends over the past three years indicate that organizations want to utilize data integration tools that provide flexible data access, delivery and operational management capabilities within a single vendor solution. Excluded vendors frequently provide products to address one very specific style of data delivery (for example, only data federation) but cannot support other styles. Others provide a range of functionality, but operate only in a single region or support only narrow, departmental implementations. Some vendors meet all the functional, deployment and geographic requirements but are very early in their maturity and have limited revenue and few production customers. The following vendors are sometimes considered by Gartner clients alongside those appearing in the Magic Quadrant when deployment needs are aligned with their specific capabilities and/or are newer market entrants with relevant capabilities:
Ab Initio, Lexington, MA, U.S., www.abinitio.com: Application development toolbox (Co>Operating System) and component library for metadata management and data integration.
Alebra Technologies, Minneapolis, MN, U.S., www.alebra.com: Parallel Data Mover for cross-platform file and database copying and sharing.
Apatar, Chicopee, MA, U.S., www.apatar.com: Open source data integration tools focused on ETL and data federation scenarios.
Attunity, Burlington, MA, U.S., www.attunity.com: A range of data integration-oriented products, including adapters (Attunity Connect), changed data capture (Attunity Stream) and data federation (Attunity Federate) for various platforms and database/file types.
BEA Systems, San Jose, CA, U.S., www.bea.com: AquaLogic Data Services Platform for data source abstraction, access and federation in support of data services delivery.
CA, Islandia, NY, U.S., www.ca.com: Advantage Data Transformer provides ETL-oriented data integration. InfoRefiner provides replication and propagation capabilities for mainframe data repositories.
CDB Software, Houston, TX, U.S., www.cdbsoftware.com: CDB/Delta provides changed-data capture and replication capabilities for DB2 on the z/OS platform.
CoSort (IRI), Melbourne, FL, U.S., www.cosort.com: The Fast Extract and SortCL tools provide for rapid unloading and transformation of data in Oracle databases in support of ETL processes.
Composite Software, San Mateo, CA, U.S., www.compositesw.com: Composite Information Server provides data federation/EII capabilities and supports delivery of data services.
Datawatch, Chelmsford, MA, U.S., www.datawatch.com: The Monarch Data Pump product provides ETL functionality with a bias toward extracting data from report text, PDF files, spreadsheets and other less-structured data sources.
Denodo, Palo Alto, CA, U.S. and Madrid, Spain, www.denodo.com: The Denodo Platform provides data federation and mashup capabilities for joining structured data sources with data from Web sites, documents and other less-structured repositories.
Embarcadero, San Francisco, CA, U.S., www.embarcadero.com: The DT/Studio ETL tool provides support for a range of relational and other data sources, and integrates with the vendor's data modeling and database design tools.
ETL Solutions, Blaenau Ffestiniog, North Wales, U.K., www.etlsolutions.com: Transformation Manager provides a metadata-driven toolset for the authoring, testing, debugging and deployment of various data integration requirements.
Exeros, Santa Clara, CA, U.S., www.exeros.com: The DataMapper product automates the process of discerning the business rules that enable mapping and transformation of data between dissimilar data structures.
GoldenGate, San Francisco, CA, U.S., www.goldengate.com: Real-time, heterogeneous data replication capabilities provided by the Transactional Data Management (TDM) Software Platform.
Jitterbit, Alameda, CA, U.S., www.jitterbit.com: Freely downloadable software with a focus on both application integration (event- and message-based) and data integration.
IKAN Software, Mechelen, Belgium, www.etl4all.com: Java-based ETL technology named ETL4ALL, supporting transformation servers on Windows, Linux, Unix and IBM iSeries.
Ipedo, Redwood City, CA, U.S., www.ipedo.com: Ipedo XIP provides data federation/EII capabilities with an XML-oriented approach.
Kalido, Burlington, MA, U.S. and London, U.K., www.kalido.com: The Kalido Active Information Management Software enables dynamic data modeling and change management for data warehouses and master data environments.
Lakeview Technology, Oakbrook Terrace, IL, U.S., www.lakeviewtech.com: Real-time database replication functionality is provided in the MIMIX replicate1 product.
Metatomix, Dedham, MA, U.S., www.metatomix.com: Semantics-based approach to creation of data services and federated views of data across multiple data sources.
Pentaho, Orlando, FL, U.S., www.pentaho.org: A provider of open-source BI solutions, Pentaho has added data integration tools to its portfolio by leveraging the "Kettle" open-source project and providing services and support.
Progress Software, Bedford, MA, U.S., www.progress.com: The DataXtend and DataDirect product lines provide tools for data access, replication and synchronization.
Quest Software, Aliso Viejo, CA, U.S., www.quest.com: SharePlex provides real-time replication support for Oracle DBMS environments and is targeted primarily at high-availability applications.
Raining Data, Irvine, CA, U.S., www.rainingdata.com: TigerLogic XDMS provides XML-based data federation and persistence, as well as delivery of data services.
Red Hat/MetaMatrix, Raleigh, NC, U.S., www.redhat.com: The MetaMatrix Server, Enterprise and Query products support creation of data models and model-driven federated views of data.
Relational Solutions, Westlake, OH, U.S., www.relationalsolutions.com: The BlueSky Integration Studio provides ETL capabilities in a simplified, low-cost toolset that runs in the Windows environment.
SchemaLogic, Kirkland, WA, U.S., www.schemalogic.com: Creation and maintenance of data models (Workshop), business models (SchemaServer), and the ability to propagate models and data across applications (Integration Service).
Software AG, Darmstadt, Germany, www.softwareag.com: The Enterprise Information Integrator product provides data federation capabilities and is oriented toward SOA deployments. The vendor's recent acquisition of webMethods adds process-oriented integration capabilities.
Sypherlink, Dublin, OH, U.S., www.sypherlink.com: Metadata discovery and mapping via Harvester, and access to data sources for creation of integrated views via Exploratory Warehouse.
Talend, Los Altos, CA, U.S. and Suresnes, France, www.talend.com: Open Studio is an open-source tool that primarily supports ETL-oriented implementations and is provided for on-premises deployment as well as in a software-as-a-service (SaaS) delivery model.
Vamosa, Glasgow, U.K. and Cambridge, MA, U.S., www.vamosa.com: Provides content integration and migration, aimed at synchronization and consolidation of document repositories, via its Content X-Change and Content Migrator products.
WhereScape, Atlanta, GA, U.S., www.wherescape.com: WhereScape RED enables rapid creation and maintenance of data warehouses, including ETL functionality.
Xaware, Colorado Springs, CO, U.S., www.xaware.com: Via the XA-Suite product, provides support for the access, integration and service-enablement of data sources.

Hummingbird Connectivity (a division of Open Text).

Ab Initio excluded due to lack of confirmation that the vendor's product capabilities and market presence meet all the inclusion criteria specified for this market analysis.
Embarcadero Technologies excluded because the vendor's product capabilities do not meet all the functional inclusion criteria specified for this market analysis.

In order to emphasize the need to address a range of data delivery styles, metadata management strength and other technical requirements of enterprisewide data integration activities, the "Ability to Execute" criteria in the data integration tools market include a strong emphasis on product capabilities. The Product/Service evaluation criterion includes all the major categories of functionality described in the market definition above. In this iteration of the Magic Quadrant, in addition to the importance of the functional points that are part of the market inclusion criteria, we place an increased emphasis on metadata management capabilities and support for data governance via data quality capabilities. Also, as this is a dynamic and highly competitive market, Sales Execution/Pricing and Customer Experience (which includes the availability and quality of customer references, as well as overall customer satisfaction) are key, and are therefore also heavily weighted. In this iteration of the Magic Quadrant, we place an increased emphasis on Customer Experience, in particular the quality of customer references. While also important criteria, Overall Viability (which includes an assessment of financial strength and growth), Market Responsiveness and Track Record, and Marketing Execution (which reflects the degree of vendor "mind share" in the market) are weighted lower than the other criteria in order to reflect that this is a market that is early in its maturity.
Table 1. Ability to Execute Evaluation Criteria
Evaluation Criteria | Weighting
Product/Service | high
Overall Viability (Business Unit, Financial, Strategy, Organization) | standard
Sales Execution/Pricing | high
Market Responsiveness and Track Record | standard
Marketing Execution | standard
Customer Experience | high
Operations | no rating
Source: Gartner

The "Completeness of Vision" criteria most strongly emphasize an overall market understanding. This criterion includes an assessment of the degree to which the vendor establishes market trends and direction, as well as the ability of the vendor to capitalize on market trends and survive disruptions. Both of these characteristics are crucial in the data integration tools market due to the volatility introduced by merger and acquisition activity, as well as the increasing impact on the market of the largest software vendors in the world. As such, in this iteration of the Magic Quadrant we place additional emphasis on these criteria. In addition, we place a high weighting on Innovation, in the form of the magnitude of R&D investment directed specifically toward data integration tools, and the amount of creativity the vendors exhibit when developing their product and go-to-market strategies. Specific to Offering (Product) Strategy, another highly weighted criterion on which we place an increased emphasis in this iteration of the Magic Quadrant, a key consideration is the degree of openness of the vendors' offerings. For success in this market, vendors must deliver independence from their own proprietary data models and architectures, and be capable of easily interoperating with architectures and technologies of other vendors. The remaining criteria receive moderate weightings, with a slight emphasis on Sales Strategy and Geographic Strategy given the rapidly expanding size of the market.
Table 2. Completeness of Vision Evaluation Criteria
Evaluation Criteria | Weighting
Market Understanding | high
Marketing Strategy | standard
Sales Strategy | standard
Offering (Product) Strategy | high
Business Model | standard
Vertical/Industry Strategy | no rating
Innovation | high
Geographic Strategy | standard
Source: Gartner

Leaders in the data integration tools market will be front runners in the convergence of single-purpose tools into an offering that supports a range of data delivery styles. These vendors will be strong in the more traditional data integration patterns such as ETL, they will support newer patterns such as data federation, and will provide capabilities that enable data services in the context of SOA. Leaders have significant mind share in the market, and resources skilled with their tools are readily available. These vendors establish market trends, to a large degree, by providing new functional capabilities in their products, and by identifying new types of business problems where data integration tools can bring significant value. Examples of deployments that span multiple projects and types of use cases are commonplace within their customer base.

Challengers in the data integration tools market are well positioned in light of the key trends in the market, such as the need to support multiple styles of data delivery. However, they may not provide comprehensive breadth of functionality, or they may be limited to specific technical environments or application domains. In addition, their vision may be hampered by the lack of a coordinated strategy across the various products in their data integration tools portfolio. Because this is a market that is at a relatively early stage in its maturity, Challengers can vary significantly with regard to their financial strength and global presence. The customer base of Challengers is generally substantial in size, though implementations are often of a single project nature, or reflect multiple projects of a single type (for example, all ETL-oriented use cases).

Visionaries in the data integration tools market will have a solid understanding of the key market trends and a position that is well aligned with current demand, but they may lack the market awareness or credibility beyond their existing customer base or outside of a single application domain, or they may not provide a comprehensive set of product capabilities. Visionaries may be new market entrants lacking the installed base and global presence of larger vendors, though they could also be well-established, large players in related markets and have only recently placed an emphasis on data integration tools.

Niche Players in the data integration tools market have gaps in both vision and ability to execute, often lacking key aspects of product functionality and/or exhibiting a narrow focus within their own architecture and installed base. These vendors have little mind share in the market and are not recognized as proven providers of data integration tools for enterprise-class deployments. Many Niche Players have very strong offerings for a specific range of data integration problems (for example, a particular set of technical environments or application domains) and deliver substantial value for their customers in that segment.

Vendor Strengths and Cautions
- Business Objects supports a range of data integration capabilities via Data Integrator (ETL), Data Federator and Metadata Manager. As a prominent vendor in the market for BI platforms, Business Objects enjoys a large global customer base into which it is actively cross-selling its data integration tools. This strategy is gaining traction, as demonstrated by the vendor's rapidly growing revenue and customer base for Data Integrator.
- A centralized metadata management system comes from the rearchitected data integration offering. Impact analysis, documentation, job monitoring, extensible modeling and a development environment that also ties to the metadata repository create an environment in which developers and administrators can quickly determine what development is under way, what is reused and what will occur as a result of implementation or change. The future road map includes a single metadata repository at the center of the software architecture, which will be a significant improvement over the current unified view over many metadata solutions.
- Business Objects offers integrated or interoperable data profiling and data quality (via the Firstlogic acquisition in 1Q06), and data mining and text mining capability (via the InXight acquisition in 3Q07). The development environment supports the introduction of third-party tools or advanced data manipulation/management capabilities from within Business Objects' own toolset, and permits the development of reusable directory objects based on those tools.

- Customers provide mixed feedback about using Business Objects' data integration tools as enterprise solutions, with some references indicating they use Data Integrator for tactical scenarios as an exception to their standard data integration tools. The use of Data Integrator in applications unrelated to BI is increasing (for example, migrating data between applications as old applications are decommissioned).
- While Business Objects' customer base in this market is growing, the best practices knowledge base is still limited. Business Objects needs to cultivate a stable of system integrator (SI) partners and consultancies and build up the base of skills with data integration tools in the marketplace in general. Data Integrator has a good start in this direction, as it is beginning to be applied in the market to non-BI use cases.
- Business Objects does not have runtime capability on the mainframe, although it does have connectivity for accessing mainframe-based data. With many legacy systems and data warehouse sources still operating in large companies on mainframe systems, this is one of the challenges to Data Integrator becoming a legacy-capable, enterprise solution.

- Cognos' positioning in this market is as a specialist, providing data integration capabilities in support of its BI and performance management offerings. Its data integration tools portfolio includes Data Manager for ETL, Framework Manager for metadata modeling (for both data integration and BI), and federation capabilities delivered via an OEM of technology from Composite Software. In addition, Cognos provides a model-driven approach to data integration with its Adaptive Application Framework to automate data-mart generation for analytic applications.
- The vendor's strategy includes partnering with larger data integration tools vendors (such as Informatica and IBM) when its customers' needs for data integration tools extend beyond BI and performance management use cases or the capabilities of the Cognos tools. A recent example of this is Cognos' reseller partnership with Informatica for providing data profiling and data quality capabilities.
- References are focused on using Data Manager in their Cognos-based BI initiatives. They regularly cite ease of use, low cost and integration with other Cognos products as the main reasons for choosing it to address their ETL needs. Recent enhancements to the product remain focused in the same direction, and include the enhanced creation and management of complex hierarchy structures and tighter integration with the Cognos 8 portfolio. In addition, Cognos has service-enabled Data Manager so that ETL capabilities can be delivered as a service to the full range of Cognos solutions.

- Data Manager is optimized for the delivery of data to dimensionally oriented (star schema-based) structures, and is not commonly seen populating enterprise-scale data warehouses with more detailed and complex (normalized) schemas.
- Cognos provides data federation, caching and access to complex data sources (such as salesforce.com and XML) via the Composite Software OEM. However, few Cognos customers seem to be aware of this option or have implemented it.
- Organizations using the Cognos data integration tools are invariably users of Cognos BI and performance management solutions, and use of the tool outside the boundary of BI and performance management activities appears nonexistent.

- ETI has a long association with the discipline of data integration, having been one of the earliest vendors offering technology in the original ETL tools market. With its mature and proven code-generating architecture that supports a range of platforms, ETI has historically been focused on physical, bulk-data movement and delivery. The vendor is now leveraging these strengths and developing new differentiated offerings via alternative delivery and licensing models, including its built-to-order (BTO) and SaaS products, as well as continuing to enhance its support for service-oriented deployments.
- ETI continues to leverage its strength for supporting multiple hardware and operating system environments, including the mainframe and legacy data sources, and is often seen in sectors such as federal government, where these characteristics, along with large data volumes and a high degree of complexity, are common. This reflects the vendor's roots in the market and the majority of its customer base, but also serves as a differentiator from much of the competition.
- Recent positive developments for ETI include the delivery of integrated data profiling and cleansing capabilities (via OEM partnerships with vendors specializing in those topics), as well as support for multibyte data and Japanese language. The vendor continues to form new partnerships and expand existing ones, with a focus on both technology providers (such as Microsoft and Teradata) and system integrators (including CACI, Booz Allen, IBM and various international service providers).

- Among vendors in this market, ETI's modest size (in terms of revenue and employees) and limited global presence remain a challenge. However, over the last 12 months it has reversed the decline in its customer base and experienced growth to around 450 customers. ETI's brand recognition in the market remains minimal, something it will find challenging to address given its limited resources.
- While ETI has extended its business model to offer customers new ways to consume its technology (such as BTO), it still struggles to keep pace with the market from a product functionality point of view. Recent releases of the product demonstrate progress, including improvements in ease of use, but ETI continues to exhibit weaknesses in areas such as modeling and metadata management.
- In addition, the vendor must continue to innovate and deliver deeper support for delivery styles beyond ETL in order to increase the attractiveness of its technology to end users desiring traditional on-premises deployments. Currently, the biggest gap is in support of data federation, an area which ETI could address either via partnerships or through organic extension of its existing technology.

- Hummingbird Connectivity, a division of Open Text, focuses on integration of data and content via its Genio product. It can interact with most relational DBMSs (via native SQL, ADO/OLEDB or ODBC), nonrelational legacy systems (such as CICS, Adabas and OS/390), enterprise resource planning (ERP) systems, message-oriented middleware and structured files. Due to the capabilities of its parent company, Hummingbird Connectivity is strong in dealing with less-structured sources (content management systems, document repositories and so on), a topic that many of its competitors are only just now beginning to address.
- The company's pre-built functionality (for example, aggregation and statistical functionality, string functions, and numeric and date functions) is beginning to rival better-known market leaders, but relies on a script-builder interface for transformation functionality, including text mining.
- Metadata capabilities include dynamic impact analysis that provides feedback to developers relative to design changes vs. prior versions. The tool can also import/export metadata with a wide variety of design tools (via XMI standards).

- Genio provides native connectivity to SAP but relies on Web services for other ERP solutions. With no federation, replication or direct data quality support, the product becomes utilitarian for its current capabilities, and the gap between this tool and those of the market leaders will be difficult to close.
- Genio "Polling" and "Event Mechanism" support event capture via specified schedules or data evaluation rules. However, other real-time support relies on message-oriented middleware (MOM) capabilities, database utilities or third-party changed-data capture (CDC) solutions, potentially weakening real-time and EAI support, depending on organizations' needs. Metadata is also a caution for Genio. It focuses primarily on the documentation role of metadata, with limited emphasis on system-level auditing or on using metadata in support of real-time messaging to end-user analytic tools.
- Hummingbird's mind share and visibility in the data integration tools market have been on the decline over the last few years. Gartner receives minimal inquiries about the vendor in this market, and Open Text retains its major focus on content management applications.

- A division of Information Builders, iWay Software creates and sells Information Builders' integration technologies, with the goal of building an integration software business independent of the BI capabilities for which Information Builders is well known. iWay offers capabilities for physical data movement and delivery (via its Data Migrator ETL tool), data federation (via the iWay Data Hub product) and real-time application integration (supported by the Service Manager product).
- The products are very well integrated and leverage common infrastructure, such as an extensive adapter suite, and can be deployed on a wide range of platforms, including the mainframe. During 2007, iWay added data quality capabilities via a partnership with Trillium Software, but has only basic data profiling capabilities within Data Migrator.
- Information Builders' size and global presence afford iWay a strong foundation from which to execute its growth strategy. Customer references cite a short learning curve and implementation, lower cost, and integration with the Information Builders BI products (specifically WebFOCUS) as main drivers for their selections of iWay data integration tools.

- Most customer references reveal many ETL-only implementations of a tactical nature, such as one-time data conversions, or supporting less-critical portions of broader data integration processes implemented with competitive vendors' technology. However, mission-critical data integration processes and deployments involving multiple products and scenarios (for example, Service Manager capturing events and populating an operational data store, with Data Migrator loading a data warehouse for analysis via Information Builders' BI capabilities) are becoming more common.
- iWay's product capabilities are well aligned with the evolving needs of the data integration tools market, but one of the vendor's biggest challenges is gaining recognition outside the Information Builders customer base. iWay has maintained a low profile, selling into existing Information Builders accounts and entering OEM relationships with other vendors. The fact that some of the products, such as Data Migrator, are co-branded as Information Builders products will make it difficult for iWay to quickly and substantially improve this situation.

- IBM demonstrates the best vision in the market for extensive data integration capabilities, as it continues to progress toward bringing together all its data integration components atop common metadata, common design tooling, and a common look and feel. In addition to the new Information Server flagship product, IBM provides a wide variety of legacy data integration technology, including products such as Classic Federation Server, Data Event Publisher, and the recently acquired replication and changed-data capture technology from DataMirror.
- The IBM Information Server is a comprehensive data integration platform that includes solutions for ETL, data federation, data replication, data profiling and cleansing. With the Metadata Server at its core, Information Server enables various integration modules, such as DataStage, Federation Server, Information Analyzer and QualityStage components for data profiling and cleansing, and the Metadata Workbench. The Transformation Extender (formerly DataStage TX), used for heavy transformation requirements, was moved to the WebSphere business integration family.
- Although the Information Server release is still relatively new and few customers have yet upgraded from previous versions of the various components, initial feedback is positive regarding the scalability, performance and usability. In particular, the new user interface and the central metadata repository are cited as significant improvements.

- IBM is only slowly getting sales traction for the broad data integration platform. Most customers focus on selective products, most often DataStage only, due to relatively high price points and the project-specific nature of their work.
- IBM has acquired a large number of technologies in the recent past and must continue its product integration efforts. The vendor has a number of legacy data integration products (for example, the Classic Connector for z/OS and the Classic Federation Server) that need to be migrated to the Information Server architecture. In addition, the newly acquired products from DataMirror and IBM's own EAI products in the WebSphere Business Integration family have not been integrated, nor is there an official road map. IBM has missed the opportunity to articulate a strategy to further expand its data integration vision by including its customer data integration and master data management solutions.
- IBM has to be mindful of sustaining its marketing messages relative to the various data integration offerings. Because the information-on-demand message has become so broad and includes so many offerings, Information Server does not get the same attention as the offerings of some of its more focused competitors.

- As one of the most widely recognized providers in this market, Informatica continues to grow its presence, with an installed base of nearly 3,000 customers. Based on the PowerCenter platform, which has its roots in traditional ETL-style data delivery, the vendor's data integration capabilities continue to expand with both organic functional additions and via acquisition (such as the 4Q06 purchase of Itemfield's complex transformation development and implementation technology).
- Informatica is respected for its consistent track record of delivering solid technology, regular releases, and a positive service and support experience. In addition, the vendor continues to enjoy a strong "ecosystem" of technology and service partners, including the recent addition of a significant OEM agreement where SAP will offer PowerCenter and Metadata Manager to its customers.
- Informatica continues to further its positioning toward support for SaaS use cases with the initial introduction of Informatica On Demand, a hosted solution for synchronization of data with salesforce.com. Expanded functionality in the 8.5 release of the products, such as masking of sensitive data and test data generation, real-time replication, and broader metadata discovery, will help to further extend Informatica's footprint.

- Informatica still has a minority of customers using its technology for data integration patterns other than bulk, batch-oriented data delivery. However, a significant portion of the customer base is already using PowerCenter and related products in support of master data management initiatives, data migrations, synchronization of data between operational applications, and other use cases outside of the BI and data warehousing domain.
- While it supports data federation (as an option for PowerCenter), real-time changed-data capture and propagation via its PowerExchange products, and the ability to interact with message queues (such as IBM WebSphere MQ, JMS and others via the Real Time Option for PowerCenter), production implementations using these capabilities are not yet commonplace. As demand for these implementation styles continues to build, Informatica will need to further expand these capabilities and emphasize them in its marketing, as various competitors are more proven in these areas.
- Informatica has continued to post solid financial results, with more than two years of consecutive quarterly license revenue and earnings growth. However, with price points toward the high end of the range in this market, it will come under increasing pricing pressure as lower-cost and open-source offerings gain traction.

- Microsoft's main offering related to this market is SQL Server Integration Services (SSIS). With SSIS, Microsoft provides support for bulk data movement (via ETL) and the ability to interface easily with BizTalk Server, which provides real-time, message-based capabilities. Microsoft also offers data replication and synchronization capabilities, and limited data federation support (all rising components in the high-availability data warehouse), within the SQL Server DBMS. SSIS includes an extensible tool using scripts, SQL and any .NET language, and includes an instance manager in the SSIS controls. Microsoft's offering includes adapters for applications and across platforms with the shipped product (for example, PeopleSoft, SAP or mainframe and midrange DB2).
- SSIS includes standardized data integration best practices for data warehouses, including built-in features to manage dynamic and slowly changing dimensions. This supports data mart-style tables in relational DBMSs or within data marts deployed on SQL Server Analysis Services (SSAS), making it easier to move atomic warehouse data to version-controlled data marts. Microsoft has made a significant effort to ensure the integration offering is independent of its own BI tool for general data integration use but, at the same time, it has built-in capability to directly support SSAS. In addition, Microsoft provides basic data quality functionality within SSIS that is equivalent to other vendors' basic offerings.
- Microsoft's size and global presence provide a huge customer base for best practices, excellent support and a distribution model that supports both direct and channel partner sales.

- "Free" or "included" does not equate to "no cost." Microsoft indicates that it is often desirable to deploy separate servers for SSIS, Analysis Services and Reporting Services as demand increases. Gartner customers report that this is a best-practice usage pattern as well. In an all SQL Server-based deployment, licenses are required on each server (SSIS, SSAS and SSRS), and the hardware itself must be purchased. These costs remain significantly lower than those of many of its competitors.
- There is some lack of native connectivity for various data sources. The product includes support for Extended Binary-Coded Decimal Interchange Code (EBCDIC) text files, but sources of a legacy or non-relational variety must be addressed via adapters provided by Microsoft partners, or by using the separately licensed Host Integration Server (which provides connectivity for DB2/Universal Database [UDB] and Virtual Storage Access Method [VSAM]).
- Customer references show that the Microsoft data integration tools are most often used in Microsoft-centric environments (often seen in midsize businesses) and for more tactical ETL use cases in diverse environments. Microsoft must continue to increase its ability to deal with the heterogeneous platforms and data source types that are commonplace in large enterprises.

- Oracle DBMS licenses include Oracle Warehouse Builder (OWB), and the company also offers Oracle Data Integrator (ODI), Streams and limited federation capabilities (in its BI tools). Several advanced features of OWB (such as increased scalability, impact analysis and lineage reporting, and data profiling) are offered as options at an additional cost. Oracle also provides database replication capabilities and limited support for federated queries in the Oracle DBMS, thus providing additional data integration capability.
- Oracle offers good market acceptance in a large customer base, global presence and proven viability. Oracle has delivered solid, continual releases of OWB and, as a result, has seen significant uptake in Oracle-centric environments due to the architecture of the tool. With add-on gateway products, OWB can also target non-Oracle DBMSs. This is an extension of a philosophy that data resides in locations rather than a directional designation, such as sources or targets, and that a location could be any type of repository. ODI adds to the capability to better support heterogeneity.
- Better integration has been provided in the product set in 2007, with Oracle Database 11g, including the ability to leverage incremental updates to materialized views and cubes in Oracle data warehouses, as well as a database-integrated data-masking capability to support development environments involving sensitive data.

- Application of the Oracle tools remains limited to specific use cases. Customer interactions reflect the implementation of OWB primarily for ETL scenarios. The Sunopsis acquisition of 2006, which introduced a data integration services tool, saw that code base transferred to the Fusion Middleware solution set. Oracle has crafted clear messages about the use of its various tools and users are cautioned to avoid their "misapplication."
- Some former PeopleSoft and Siebel customers have competitive data integration tools in-house (such as Informatica PowerCenter or IBM DataStage) as part of the data warehousing offerings associated with these applications (for example, EPM, OneWorld and Siebel Analytics). While Oracle does not restrict these customers from continuing to leverage their investments in these incumbent tools, this presents another challenge for Oracle in trying to grow market awareness that it is a serious data integration tools player and has heterogeneous capabilities.
- Oracle follows a very traditional import/export metadata sharing approach, which includes manual metadata integration for a complete solution. Oracle has many data management products, including its data hub products for MDM and several data integration tools, each with its own metadata to contribute ultimately to the manual portion of the metadata management environment.

- Pervasive offers a solid, low-priced and moderately advanced data integration tool. The company is an unassuming but capable player in the data integration tools market. With many years of experience and more than 3,500 customers, Pervasive Data Integrator is frequently used for ETL-style data delivery, and also supports EAI and real-time messaging-style solutions. Customers expect more intimate support and service from a small company, and customer references report that Pervasive support was "the most responsive company [sic] ever dealt with."
- Pervasive makes good use of its metadata capability for standards use and SOA. The design time interface produces an XML metadata repository and supports structured, semistructured and unstructured data extraction (including a document schema designer for pulling recurring, untagged data from documents). It supports X.12, Health Insurance Portability and Accountability Act (HIPAA) and HL7 delivered using services- or message-based data transfer for both intra- and interenterprise integration. From an SOA perspective, Pervasive supports a Web service "invoke" utility and the ability to expose any of its data integration capabilities as a service. Additionally, product road maps indicate an even more service-oriented development, deployment and services management future.
- With a wide channel partnering strategy, Pervasive is building up annuity revenue, even though it has a smaller average selling price. Pre-built packaged application adapters include SAP, Siebel and Salesforce/AppExchange (Pervasive was one of the first vendors to provide such an interface), among others. Attractive pricing relative to market leaders draws interest from end-user organizations and independent software vendor (ISV) partners.

- Customers report using the tools primarily in tactical situations, as opposed to enterprisewide. Recent references indicate a broader use of the solution throughout the enterprise and, frequently, embedded Pervasive solutions are transparent to end users.
- With no federation capability (although the "building blocks" of data access, joins and so on are present) and data cleansing provided via partners (the vendor does offer data profiling capabilities), Pervasive remains primarily a data mover and needs to progress in its data quality and various delivery modes before it can move forward.

Pitney Bowes Group 1 Software
- Pitney Bowes Group 1 Software, a division of mailstream hardware and services vendor Pitney Bowes, competes in the data integration tools market via its DataFlow offering. These products were originally developed by Sagent Technology, which was acquired by Group 1 in 2003, and became part of the Pitney Bowes software portfolio through its subsequent acquisition of Group 1 in 2004. DataFlow primarily supports ETL implementation patterns, although limited data federation scenarios can also be achieved.
- Group 1 supports a modest customer base of just over 400 direct DataFlow customers, although the total size of the DataFlow installed base is larger as a result of OEM agreements in which DataFlow is embedded in other vendors' products. Implementations generally reflect traditional ETL use cases in the BI domain, with a mix of departmental implementations and enterprisewide implementations in midsize businesses.
- Customer references consistently indicate strong ease of use, lower cost (relative to market leaders) and solid ETL functionality as the reasons for their selection or continuing use of DataFlow.

- Group 1 is not often seen actively competing against the market leaders for new data integration tools opportunities at the enterprise level. Its base of direct customers appears to be flat to slightly declining. However, Pitney Bowes is attempting to address this by realigning itself for a greater focus on its software offerings and the synergies between them.
- Building brand awareness and recognition remains a major challenge for Group 1, as "customer communications management" and other mailstream-oriented messages seem to take precedence within the vendor's strategy. However, the DataFlow technology is domain-agnostic, and Group 1 is attempting to actively sell it into scenarios that are not related to customer data management.
- With such a strong focus on ETL, the vendor will be increasingly challenged to expand its capabilities as market demand and more competitors turn their attention toward additional modes of data delivery, as well as to non-BI use cases. In addition, Group 1 will need to quickly execute on its product road map to fill in key gaps relative to market demand: specifically, data profiling capabilities, additional connectivity (such as application adapters), and metadata-level integration with, and cross-product leverage of, the SOA capabilities of Group 1's proven customer data quality tools.

- SAP's huge installed base, brand recognition and global presence give the vendor many opportunities to push its data integration capabilities into the market. Heavily SAP-centric organizations will find most of the typically required core data integration functionality in the NetWeaver portfolio. The integration capabilities enable customers to conveniently pull data into NetWeaver BI, and provide EAI and process integration through NetWeaver PI.
- SAP provides data modeling and metadata management that can be used across multiple integration scenarios.
- To provide customers with expanded data integration capabilities, in particular access to many non-SAP data sources feeding SAP NetWeaver BI, SAP has struck an OEM agreement with Informatica. In this, SAP provides Informatica PowerCenter, PowerExchange and Metadata Manager as part of NetWeaver BI, Performance Management or Master Data Management.

- Because SAP's data integration strategy focuses only on SAP environments, the SAP data integration product set should not be considered as a general-purpose data integration platform. Moving SAP NetWeaver BI data to non-SAP targets requires purchasing an OpenHub license.
- Although SAP planned to enhance its data integration portfolio when it acquired the assets of San Francisco-based Callixa (a startup firm building EII software) back in September 2005, there is still no federation product available from the company. In the meantime, SAP still has a relationship with MetaMatrix (now owned by Red Hat), a specialist vendor of metadata management and data federation technology, whose technology SAP has licensed and which is in use by a subset of the SAP customer base.
- SAP has delivered basic data quality functionality for NetWeaver MDM, but has not built general-purpose data quality capabilities for broader use throughout its data integration tools, and the new Informatica OEM agreement does not include Informatica's data quality components. Customers must work with NetWeaver-certified data quality solution providers, such as Trillium Software or Dun & Bradstreet, to provide extensive data cleansing or enrichment functionality for use outside NetWeaver MDM.

- SAS Institute has been enabling companies to do data integration activities for decades. Most of this relates to BI and data warehousing, given the vendor's traditional focus on analytics. SAS's primary product in the data integration tools market is the Enterprise Data Integration Server, providing capabilities such as packaged transformations, metadata management, parallel processing, load-balancing and, through the vendor's DataFlux subsidiary, rich data quality functionality (profiling, cleansing, matching and en