The DIA servers are well suited to host a number of industry data loading tools. These tools extract, transform, and load (ETL) source data into Greenplum databases and data warehouses.
Informatica™ is a popular ETL tool, and a large percentage of EMC’s customers use it. The DIA servers, which are RedHat Linux hosts, serve as the Informatica Integration service servers.
Installing Informatica on a DIA server
Before installing Informatica on a DIA server, install the Informatica Repository server, or take note of an existing one. To install Informatica on the DIA server, log in to the DIA host as root or gpadmin, decompress the installation file with gunzip, and run the install.sh script. Follow the on-screen prompts to complete the installation.
When asked whether to create a new domain or join an existing domain, join the existing domain created on the Repository server.
When the install script is done, go to the Repository server, and start the Administrator’s web console. You should be able to add the new Integration server you have just installed (on the DIA server) to the Domain.
Installing Informatica Integration Services for Windows
The previous section took advantage of the DIA servers being RedHat Linux servers and installed the Linux version of the Informatica Integration service. A customer who prefers the Windows version of the Informatica Integration service can use it, but must first install a Windows operating system over the RedHat Linux operating system on the DIA server.
After the Windows operating system is installed on the DIA server, install the Informatica Integration service as usual and add it to the Informatica domain as described above.
On Windows, you must also install the Greenplum loaders package, which provides the gpfdist utility, and the Python language. Python is open source software that can be downloaded freely from the Internet. Currently, version 2.6 and above should work well for GPDB 22.214.171.124 and above. For older versions of GPDB, check the Greenplum Database Administrator’s Guide or try version 2.5.4.
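The two Windows prerequisites above (a suitable Python and a reachable gpfdist) can be checked with a short script before configuring the Integration service. The following is only a minimal sketch: the helper names python_ok and find_on_path are illustrative and not part of any Greenplum or Informatica package, the check encodes just the "2.6 and above" lower bound stated above, and it assumes the loaders package has placed gpfdist on the PATH.

```python
import os
import sys

# Lower bound taken from the text ("version 2.6 and above"); check the
# Greenplum documentation for the range that matches your GPDB release.
MIN_PYTHON = (2, 6)


def python_ok(version_info=None, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


def find_on_path(name, path=None):
    """Return the full path of executable `name`, or None if it is not found."""
    if path is None:
        path = os.environ.get("PATH", "")
    # Windows executables carry extensions such as .EXE; try those as well.
    extensions = [""]
    if os.name == "nt":
        extensions += os.environ.get("PATHEXT", ".EXE").split(os.pathsep)
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None


if __name__ == "__main__":
    print("Python %d.%d acceptable: %s"
          % (sys.version_info[0], sys.version_info[1], python_ok()))
    print("gpfdist: %s" % (find_on_path("gpfdist") or "not found on PATH"))
```

Run the script with the same Python interpreter that the loaders will use. If gpfdist is reported as not found, the loaders package may be installed in a directory that has not been added to the PATH.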