BusinessObjects Data Integrator Features

Deliver Trustworthy Information

Meet compliance requirements for accurate and auditable information.

Data Quality
Advanced data profiling allows you to understand the content, structure, and quality of your source data, as well as the relationships between its tables. The data validation feature helps you build a firewall between your source and target systems to filter out unwanted data based on your business rules. Audit your data throughout the extract, transform, and load (ETL) process. Market-leading data quality capabilities are built into the same tool to parse, cleanse, match, and consolidate data such as product, customer, or service data.
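As a rough illustration of the validation "firewall" idea, the sketch below splits incoming rows into those that pass every business rule and those that are rejected, recording which rules failed. It is generic Python, not Data Integrator syntax; the function name `validate`, the rule names, and the sample rows are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Rule = Callable[[Row], bool]


def validate(rows: List[Row], rules: Dict[str, Rule]) -> Tuple[List[Row], List[Tuple[Row, List[str]]]]:
    """Split rows into (passed, rejected); each rejected row carries the names of the rules it failed."""
    passed: List[Row] = []
    rejected: List[Tuple[Row, List[str]]] = []
    for row in rows:
        failed = [name for name, rule in rules.items() if not rule(row)]
        if failed:
            rejected.append((row, failed))
        else:
            passed.append(row)
    return passed, rejected


# Hypothetical business rules, for illustration only.
RULES: Dict[str, Rule] = {
    "customer_id_present": lambda r: r.get("customer_id") is not None,
    "amount_non_negative": lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
}

# Hypothetical source rows.
SAMPLE: List[Row] = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": None, "amount": 5.0},
    {"customer_id": 3, "amount": -2.0},
]

if __name__ == "__main__":
    good, bad = validate(SAMPLE, RULES)
    print("loaded:", good)
    print("rejected:", bad)
```

Keeping the rejected rows together with the failed rule names, rather than silently dropping them, is what makes the filtering step auditable.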

Impact Analysis
Interchange metadata with the business intelligence (BI) layer. This metadata integration provides end-to-end impact analysis that allows you to see the effect of changes in the source systems on the BI reports. As a result, you can easily manage change in your BI environment.

Data Lineage
BI users can view the context of the data they are looking at in their BI reports. Users can see when it was updated, how it was computed, and where it came from—all the way back to the original transactional source. This visibility is critical to help users gain better trust in their information.

Maximize IT Productivity

Simplify and accelerate the delivery of your BI projects. Data Integrator provides an innovative environment for designing and managing ETL processes.

Single Graphical Designer
The graphical designer is the single interface for performing all tasks involved with building, testing, debugging, and managing both data quality and ETL jobs. Through the interface, you can manage projects; profile data; create ETL jobs; cleanse, validate, and audit data; set parallel job execution; build workflows; and test, debug, and monitor your ETL jobs.

Web-Based Administrator
The web-based administrator allows you to start, stop, schedule, and monitor ETL jobs independently of the design environment.

Powerful Pre-Built Transforms
Perform a complete range of data transformations. Choose from a library of powerful, extensible, and reusable transforms for operations such as hierarchy flattening for XML files, XML pipelining, pivot and reverse pivot of rows and columns, slowly changing dimensions, data cleansing and matching, change data capture, data validation, and more.
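To make one of these operations concrete, here is a minimal sketch of converting columns to rows and back again, which is the general idea behind pivot and reverse pivot transforms. It is plain Python rather than a Data Integrator transform; the function names `columns_to_rows` and `rows_to_columns`, the `attribute` and `value` column names, and the sample data are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, object]


def columns_to_rows(rows: List[Row], key: str, value_columns: List[str]) -> List[Row]:
    """Turn each wide row into one narrow row per listed column."""
    out: List[Row] = []
    for row in rows:
        for col in value_columns:
            out.append({key: row[key], "attribute": col, "value": row[col]})
    return out


def rows_to_columns(rows: List[Row], key: str) -> List[Row]:
    """Collapse narrow (key, attribute, value) rows back into one wide row per key."""
    wide: Dict[object, Row] = {}
    for row in rows:
        record = wide.setdefault(row[key], {key: row[key]})
        record[str(row["attribute"])] = row["value"]
    return list(wide.values())


# Hypothetical quarterly sales figures in wide form.
WIDE: List[Row] = [
    {"id": 1, "q1": 10, "q2": 20},
    {"id": 2, "q1": 30, "q2": 40},
]

if __name__ == "__main__":
    narrow = columns_to_rows(WIDE, "id", ["q1", "q2"])
    print(narrow)
    print(rows_to_columns(narrow, "id"))
```

Applying one function and then the other returns the original rows, which is a simple way to check that the two operations are inverses.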

Developer Collaboration
For team-based development, users can securely check work in and out of a central metadata repository. Your users can share and version their work to accelerate development. And you can easily compare differences between objects.

Design Once, Deploy Many
Data Integrator jobs are portable to different database types, versions, and instances. So developers can design an ETL job once and port it to any database environment to accelerate deployment.

BI Metadata Integration
By deeply integrating the ETL process with the BI product used in your data warehousing project, both your IT and business users will gain measurable benefits such as easy metadata management, simplified and unified administration, improved lifecycle management, and reduced maintenance costs. Using Data Integrator, you can automatically create and update a Business Objects semantic layer (i.e., universe) to accelerate BI deployment and simplify change management.

Prebuilt Data Mart Solutions
BusinessObjects Rapid Marts provide prebuilt data marts sourcing from enterprise applications such as SAP, PeopleSoft, Oracle, and Siebel. Get domain knowledge and data integration best practices in prebuilt data models, transformation logic, and data extraction.

Standardize on an Enterprise-Class Data Integration Platform

Data Integrator scales to enterprise-level data integration demands and supports parallel processing, distributed processing, and real-time data movement, along with a broad range of sources and targets. An open services-based architecture permits third-party integration using standards like CWM, XML, HTTP/HTTPS, JMS, SNMP, and web services.

Parallel Performance
Data Integrator offers comprehensive parallel performance within the same product. This allows you to easily set your Data Integrator job to run in parallel without having to redesign the job in a different UI or product. A new feature, distributed data flows, allows a single data flow to be split into multiple sub-processes that can be run in parallel across multiple CPUs to enhance performance. So each memory-intensive task can become its own process and have its own memory space. And because more memory can be applied to a single job, data integration processes will complete much more quickly.

Push-Down Processing
Data Integrator is the first market-leading data integration product to offer both the extract, transform, and load (ETL) and extract, load, and transform (ELT) approaches to maximize performance. In the traditional ETL approach, the powerful Data Integrator engine processes the data. Where possible or by manual setting, Data Integrator will use the ELT approach, which pushes the processing down to the source or target database to leverage the power of the database server and minimize network traffic.
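The difference between the two approaches can be sketched with Python's built-in sqlite3 module. This is an illustration of the general technique only, not of how Data Integrator generates or executes SQL; the table names, the sample data, and the helper functions are hypothetical, and the example assumes source and target live in the same database so that a single statement can do the work.

```python
import sqlite3
from typing import List, Tuple


def setup() -> sqlite3.Connection:
    """Create an in-memory database with a source table and two target tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (region TEXT, amount REAL);
        CREATE TABLE totals_etl (region TEXT, total REAL);
        CREATE TABLE totals_elt (region TEXT, total REAL);
        INSERT INTO orders VALUES ('east', 10), ('east', 5), ('west', 7);
        """
    )
    return conn


def load_etl(conn: sqlite3.Connection) -> None:
    """ETL style: pull every row out, aggregate in the engine (here, Python), write results back."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
    conn.executemany("INSERT INTO totals_etl VALUES (?, ?)", sorted(totals.items()))


def load_elt(conn: sqlite3.Connection) -> None:
    """Push-down style: one statement, so the database aggregates and no rows cross the wire."""
    conn.execute(
        "INSERT INTO totals_elt SELECT region, SUM(amount) FROM orders GROUP BY region"
    )


def read(conn: sqlite3.Connection, table: str) -> List[Tuple[str, float]]:
    """Return the contents of one of the two target tables, ordered by region."""
    if table not in ("totals_etl", "totals_elt"):
        raise ValueError("unknown table")
    return conn.execute(f"SELECT region, total FROM {table} ORDER BY region").fetchall()


if __name__ == "__main__":
    conn = setup()
    load_etl(conn)
    load_elt(conn)
    print(read(conn, "totals_etl"))
    print(read(conn, "totals_elt"))
```

Both loads produce the same totals. In the first, every source row travels to the client before it is aggregated; in the second, only the SQL statement does.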

Grid Computing
Data Integrator offers extreme grid computing capabilities to distribute the processing across a network of servers. Data Integrator can distribute an entire job or individual components of a job (workflows, data flows, and transformations) over the grid (server group). This allows organizations to deploy many smaller servers, all working together in one server group to execute the components of a single job in parallel, greatly improving performance. This highly scalable solution also allows you to simply add an extra machine to the server group to enhance performance, instead of upgrading or replacing existing machines.

Memory and Caching
Data Integrator can cache huge volumes of data regardless of available memory. In other words, cache is virtually unlimited. This feature is important for 32-bit operating system platforms, where memory constraints are common: your ETL job will continue to process even if the memory limit has been reached. With persistent caching, Data Integrator allows cache memory to be reused for a set period of time. For example, commonly referenced tables, such as a currency conversion table, can be used for multiple processes to calculate both price and discount. Caches can also be shared among different data flow processes, reducing the amount of caching needed, freeing up cache for other processes, and improving performance.

Self-Tuning
Data Integrator provides performance self-tuning: it learns from the data volumes of previous runs and tunes memory, caching, and indexing appropriately for the data in the next run.

Other enterprise-class performance features include native bulk load for all major databases, change data capture, accelerated web services processing, advanced aggregation, table partitioning, workload balancing, and 64-bit architecture support for HP-UX Itanium, Sun Solaris, and IBM AIX.

Broad Source and Target Support
Connect to your data wherever it resides: in a relational database, mainframe system, or packaged application. No other product provides deeper metadata-based connectivity to packaged applications from SAP, Siebel, PeopleSoft, JDE, and Oracle. Connectivity is available natively for virtually all database types, or through ODBC. You can access legacy mainframe applications using integrated IBM technology. Data Integrator also supports flat files, XML, and web services. If your application is proprietary, you can connect to it using a Java software development kit (SDK).

Services-Based Architecture
Data Integrator offers comprehensive support for web services and allows any of your batch or real-time data integration jobs to be published as a web service and called from another application. Data Integrator can also call web service-enabled applications to easily access virtually any data.