Data Pump Quick Reference

posted Aug 5, 2011, 6:32 AM by Sachchida Ojha

Data Pump Components:

  • Direct Path API (DPAPI): Oracle Database 10g supports a direct path API interface that minimizes data conversion and parsing at both unload and load time.
  • External Table Services: Data Pump uses the new ORACLE_DATAPUMP access driver that provides external tables write and read access to files containing binary streams.
  • The DBMS_METADATA package is used by worker processes for all metadata unloading and loading. Database object definitions are stored using XML rather than SQL.
  • The DBMS_DATAPUMP package embodies the API for high-speed export and import utilities for bulk data and metadata movement.
  • The SQL*Loader client has been integrated with external tables, thereby providing automatic migration of loader control files to external table access parameters.
  • The expdp and impdp clients are thin layers that make calls to the DBMS_DATAPUMP package to initiate and monitor Data Pump operations.
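The thin expdp/impdp layering can be illustrated by driving the DBMS_DATAPUMP API directly from PL/SQL. The sketch below starts a schema-mode export; the job name, file names, and schema are illustrative assumptions, not values from this article:

```sql
DECLARE
  h NUMBER;
BEGIN
  -- Open a schema-mode export job (names here are examples only)
  h := DBMS_DATAPUMP.open(operation => 'EXPORT',
                          job_mode  => 'SCHEMA',
                          job_name  => 'SCOTT_API_EXP');

  -- Dump file and log file go to an existing directory object
  DBMS_DATAPUMP.add_file(h, 'scott_api.dmp', 'DATA_PUMP_DIR');
  DBMS_DATAPUMP.add_file(h, 'scott_api.log', 'DATA_PUMP_DIR',
                         filetype => DBMS_DATAPUMP.ku$_file_type_log_file);

  -- Restrict the job to the SCOTT schema
  DBMS_DATAPUMP.metadata_filter(h, 'SCHEMA_EXPR', 'IN (''SCOTT'')');

  DBMS_DATAPUMP.start_job(h);
  DBMS_DATAPUMP.detach(h);   -- the job keeps running on the server side
END;
/
```

Because the job runs server side, detaching the client (as expdp does on exit_client) does not stop it.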

Master Table

  • During the operation, a master table is maintained in the schema of the user who initiated the Data Pump export; it has the same name as the Data Pump job. The master table is the heart of every Data Pump operation: it maintains one row per object with status information, and in the event of a failure Data Pump uses that information to restart a failed or suspended job. By default, the master table is dropped when the Data Pump job finishes successfully.
  • The master table is written to the dump file set as the last step of the export dump operation and is removed from the user’s schema. For the import dump operation, the master table is loaded from the dump file set to the user’s schema as the first step and is used to sequence the objects being imported.
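While a job is running, the master table is visible as an ordinary table in the invoking user's schema. A quick check, assuming a job named EXPDP as in the examples later in this article:

```sql
-- The master table carries the same name as the Data Pump job
SELECT table_name
  FROM user_tables
 WHERE table_name = 'EXPDP';
```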

Data Pump Processes

All Data Pump work is done through jobs. Unlike DBMS_JOB jobs, Data Pump jobs are server processes that process the data on behalf of the main process. The main process, known as the master control process, coordinates this effort via Advanced Queuing; it does so through a special table created at run time known as the master table.

  • Client process: This process is initiated by a client utility: expdp, impdp, or any other client that makes calls to the Data Pump API. Because Data Pump is completely integrated into the database, once a Data Pump job is initiated this process is not needed for the job to progress.
  • Shadow process: When a client logs in to the Oracle database, a foreground process is created (a standard feature of Oracle). This shadow process services the client's Data Pump API requests; it creates the master table and the Advanced Queuing (AQ) queues used for communication. Once the client process ends, the shadow process goes away as well.
  • Master control process (MCP): The master control process controls the execution of the Data Pump job; there is one MCP per job. The MCP divides the job into various metadata and data load or unload tasks and hands them over to the worker processes. The MCP has a process name of the format ORACLE_SID_DMnn_PROCESS_ID. It maintains the job state, job description, restart information, and file information in the master table.
  • Worker process: The MCP creates worker processes based on the value of the PARALLEL parameter. The worker processes perform the tasks requested by the MCP, mainly loading or unloading data and metadata. They have process names of the format ORACLE_SID_DWnn_PROCESS_ID and maintain their current status in the master table, which can be used to restart a failed job.
  • Parallel Query (PQ) processes: The worker processes can initiate parallel query processes if external tables are used as the data access method for loading or unloading. These are standard parallel query slaves of the parallel execution architecture.
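One way to observe these processes from inside the database is to join DBA_DATAPUMP_SESSIONS with V$SESSION; the PROGRAM column typically reveals whether a session belongs to the MCP (DMnn) or a worker (DWnn). This query is a sketch added here for illustration:

```sql
SELECT dps.job_name, s.sid, s.serial#, s.program
  FROM dba_datapump_sessions dps, v$session s
 WHERE s.saddr = dps.saddr;
```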

Data Pump Benefits

  • Data access methods:
    – Direct path
    – External tables
  • Detachment from and reattachment to long-running jobs
  • Restarting of Data Pump jobs
  • Fine-grained object and data selection
  • Explicit database version specification
  • Parallel execution
  • Estimation of export job space consumption
  • Network mode in a distributed environment: Data Pump operations can be performed from one database to another over a database link, without writing to a dump file.
  • Remapping capabilities during import
  • Data sampling and metadata compression
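Two of the benefits above can be sketched as command lines (directory object, database link, and schema names are assumptions): ESTIMATE_ONLY reports the expected dump size without writing any data, and NETWORK_LINK pulls data over a database link with no intermediate dump file.

```shell
# Estimate the space an export would consume, without creating a dump file
expdp \'/ as sysdba\' DIRECTORY=dpdata SCHEMAS=scott ESTIMATE_ONLY=y ESTIMATE=statistics

# Network-mode import: pull the SCOTT schema straight from a remote database
impdp \'/ as sysdba\' DIRECTORY=dpdata NETWORK_LINK=source_db SCHEMAS=scott LOGFILE=imp_net.log
```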

Create Data Pump Directory

  • Since Data Pump is server based, directory objects must be created in the database where the Data Pump files will be stored.
  • The user executing Data Pump must have been granted permissions on the directory. READ permission is required to perform import, and WRITE permission is required to perform export and to create log files or SQL files.
  • The Oracle software owner (the OS user that owns the software installation and database files) must have READ and WRITE operating system privileges on the directory. The database user, however, does NOT need any operating system privileges on the directory for Data Pump to succeed.

SQL> SELECT * FROM dba_directories;

-- Create a directory object pointing at the OS path
-- (the default directory, if used, must be named DATA_PUMP_DIR)
SQL> CREATE OR REPLACE DIRECTORY data_pump_dir AS 'c:\temp';
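Granting the READ and WRITE directory privileges mentioned above to a database user (the user name is an example):

```sql
GRANT READ, WRITE ON DIRECTORY data_pump_dir TO scott;
```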



Export Modes

1. Full

expdp \'/ as sysdba\'  DIRECTORY=dpdata DUMPFILE=exp%U.dmp FULL=y LOGFILE=exp.log JOB_NAME=expdp PARALLEL=30

2. User (Owner)   
expdp \'/ as sysdba\'  DIRECTORY=dpdata DUMPFILE=scott.dmp SCHEMAS=scott LOGFILE=exp.log JOB_NAME=expdp

3. Table   
expdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=scott.dmp TABLES=scott.emp,blake.dept LOGFILE=exp.log JOB_NAME=expdp

4. Tablespace
expdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=exp.dmp TABLESPACES=users,tools LOGFILE=exp.log JOB_NAME=expdp

5. Transportable Tablespace Export
(the tablespaces must first be made read-only, e.g. ALTER TABLESPACE users READ ONLY;)

expdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=exp_tbs.dmp TRANSPORT_TABLESPACES=users TRANSPORT_FULL_CHECK=y LOGFILE=exp.log

6. Export metadata

expdp \'/ as sysdba\' SCHEMAS=scott DIRECTORY=dpdata DUMPFILE=meta.dmp CONTENT=metadata_only
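The fine-grained object and data selection listed under the benefits can be combined with any of the export modes above; EXCLUDE and QUERY are standard expdp parameters (the object names and filter values here are examples, and the quoting shown is for a Unix shell):

```shell
# Export SCOTT but skip indexes, and take only DEPTNO 10 rows from EMP
expdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=scott_subset.dmp SCHEMAS=scott \
      EXCLUDE=INDEX QUERY=scott.emp:\"WHERE deptno=10\" LOGFILE=exp.log
```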


Import Modes

1. Full

impdp  \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=exp.dmp FULL=y LOGFILE=imp.log JOB_NAME=impdp
impdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=exp%U.dmp FULL=y LOGFILE=imp.log JOB_NAME=DMFV_impdp PARALLEL=30

2. User (Owner)   
impdp  \'/ as sysdba\'  DIRECTORY=dpdata DUMPFILE=scott.dmp SCHEMAS=scott LOGFILE=imp.log JOB_NAME=impdp

3. Table 

impdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=scott.dmp TABLES=scott.emp,blake.dept LOGFILE=imp.log JOB_NAME=impdp

4. Tablespace

impdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=exp.dmp TABLESPACES=users LOGFILE=imp.log JOB_NAME=impdp

5. Transportable Tablespace

impdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=exp_tbs.dmp TRANSPORT_DATAFILES='/u01/oradata/users01.dbf' LOGFILE=imp.log
-- (the datafile path shown is an example; it must match the copied datafiles)
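The remapping capabilities listed under the benefits apply during import; REMAP_SCHEMA, REMAP_TABLESPACE, and SQLFILE are standard impdp parameters (the target schema and tablespace names are examples):

```shell
# Load SCOTT's objects into BLAKE, moving segments from USERS to TOOLS
impdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=scott.dmp REMAP_SCHEMA=scott:blake REMAP_TABLESPACE=users:tools LOGFILE=imp.log

# Write the DDL that would be executed to a script instead of running it
impdp \'/ as sysdba\' DIRECTORY=dpdata DUMPFILE=scott.dmp SQLFILE=scott_ddl.sql
```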



    Data Pump Monitoring

    1. Script: dpstatus.sql

    TTITLE 'Currently Active DataPump Operations'
    COL owner_name          FORMAT A06      HEADING 'Owner'
    COL job_name            FORMAT A20      HEADING 'JobName'
    COL operation           FORMAT A12      HEADING 'Operation'
    COL job_mode            FORMAT A12      HEADING 'JobMode'
    COL state               FORMAT A12      HEADING 'State'
    COL degree              FORMAT 9999     HEADING 'Degr'
    COL attached_sessions   FORMAT 9999     HEADING 'Sess'

    SELECT owner_name, job_name, operation, job_mode, state, degree, attached_sessions
      FROM dba_datapump_jobs;

    TTITLE 'Currently Active DataPump Sessions'
    COL owner_name          FORMAT A06      HEADING 'Owner'
    COL job_name            FORMAT A20      HEADING 'Job'
    COL osuser              FORMAT A12      HEADING 'UserID'

    SELECT DPS.owner_name, DPS.job_name, S.osuser
      FROM dba_datapump_sessions DPS
          ,v$session S
     WHERE S.saddr = DPS.saddr;

    2. DBA_DATAPUMP_JOBS view

    Shows the active jobs, their state, and how many worker processes (column DEGREE) are working on the job.

    3. DBA_DATAPUMP_SESSIONS view

    When joined with the previous view and V$SESSION, gives the SID of the session of the main foreground process.

    SQL> select sid, serial# from v$session s, dba_datapump_sessions d where s.saddr = d.saddr;

    4. Alert log

    When the job starts up, the MCP and the worker processes are shown in the alert log as follows:

    kupprdp: master process DM00 started with pid=23, OS id=20530 to execute - SYS.KUPM$MCP.MAIN('CASES_EXPORT', 'ANANDA');
    kupprdp: worker process DW01 started with worker id=1, pid=24, OS id=20532 to execute - SYS.KUPW$WORKER.MAIN('CASES_EXPORT', 'ANANDA');
    kupprdp: worker process DW03 started with worker id=2, pid=25, OS id=20534 to execute - SYS.KUPW$WORKER.MAIN('CASES_EXPORT', 'ANANDA');

    5. V$SESSION_LONGOPS

    While the job is running, this view can be used to predict the time it will take to complete the job:

    select sid, serial#, sofar, totalwork from v$session_longops where sofar != totalwork;

    Controlling Data Pump Jobs

    1. Switching Between Logging and Interactive Client Mode:

    expdp system/oracle parfile=c:\rmancmd\longexport.dpectl

    1.1 Switch this job to interactive client mode by typing CTRL-C.
    1.2 Switch from interactive client mode back to logging mode:

    Export> continue_client

    2. Viewing Job Status:

    From interactive mode

    Export> status

    3. Closing and Reattaching To A Job

    3.1 detach from a client session, but leave a running job still executing

    Export> exit_client

    3.2 reattach to a running job after closing the client connection, reopen a session by
    $ expdp system/oracle attach=longexport

    4. Halt the execution of the job:

    Export> stop_job

    restart the job:
    Export> start_job

    5. End the job

    Export> kill_job