CopySchema

Christophe Marcel
5y
16.6k
0
0

Article

Introduction

CopySchema is a utility for copying all the data from one Oracle schema to another.

Although similar to the exp/imp and bulk load utilities of Oracle, it has the advantage of offering a simple GUI interface and of handling changes to the schema. For example, if you are copying data from your version 1.x schema to the 1.x+1 schema (in which you have added a few columns or tables and such).

Perhaps of greater interest, I have diligently documented the C# and C++ source code. Most open-source projects I've examined have such poor documentation that it makes modifying the code very difficult, and in particular in the cases where you want to learn from the code.

The code may be interesting to those interested in

Multithreaded programming (C#)
ADO.net to access Oracle (C#)
OTL to access Oracle (C++)
A simple example of .net Interop (accessing COM components from C#, in this case)
A well documented and explained Object-Oriented design

Origin

The utility was originally written for 2ndSight Software while I worked there as a programmer. The company graciously gave me rights to this little program for me to sell as shareware. Ultimately, I prefer to simply offer it for free and open the source code.

Intended Use

This should be useful in a production environment where data is updated by batch and rolled out regularly. SQL scripts would be used to create the new (empty) production schema, after which CopySchema would copy the latest data to it.

One advantage with CopySchema is that it compares the two schemas and skips the tables and fields that do not match between the two schemas. The comparison is made only with respect to the names (not the types of the fields or their lengths), but in a hectic production environment, it helps to see what may have been missed in the rollout scripts, or what has changed since the last rollout.

Another possible use of CopySchema is to start the process of two schemas to see a list of the differences between the two. As long as the user clicks 'No' and doesn't continue after the schema definitions are intersected and the differences listed, nothing will have been modified in any schema.

Alternate Utilities / Methods

The export/import utilities that come with Oracle (exp.exe and imp.exe) can be used but they do not well support importing exported data into a schema name that is different from the original. There is a workaround, but it ends up putting all the data into the new schema owner's default tablespace.

The bulk loader is another option (probably the fastest), and although there must be scripts out there to automate such a task, it is unwieldy. I personally much prefer a GUI interface. I would be very curious to find out how this program compares to the bulk loader, which also probably uses the same OCI calls.

Since the CopySchema does not currently disable indexes (although it disables all constraints, triggers, and primary keys) CopySchema should be a little slower.

In both alternate cases, large intermediate files need to be created, and small differences between the source and target schemas could cause problems.

Program Details

Written in C#, CopySchema uses ADO to execute simple SQL statements (such as enable/disable commands, queries to get the schema definition). It uses 'CopyTable.exe', a COM server that was written in C++ and uses native OCI calls (OTL is used as a wrapper around OCI, actually), to do the bulk inserts.

Before any data is actually copied to the target schema, all foreign keys, primary keys, check constraints and triggers are disabled. This speeds up the copying, and in the case of the foreign key constraints, it permits the program to copy data in any order, without worrying about dependencies between the tables.

If you cancel the copy process while it is running, the 'Start' button will not be re-enabled. In order to re-start the process, you must exit and re-start the program.

The user sets the number of independent threads that will operate simultaneously. Each thread gets its own instance of a CopyTable.exe.

The program saves the settings to the registry when it starts to copy, so when you re-start it later, it will have the names of the source/target schemas, servers, number of threads, etc. recalled from the registry. (the passwords are not saved to the registry and must be re-entered)

Limitations

Only columns of type VARCHAR2, DATE, and NUMBER are supported. Any column data of some other type (ex. BLOB, VARRAYs, etc.) will not be copied.

Sequence Of Actions

Action	Notes
1. Get schema definitions	-
2. Intersect schemas	The user is offered a chance to quit the process, in which case absolutely no changes will have been made to any schema
3. Disable all Foreign / Check Constraints	-
4. Disable Triggers, Primary Keys, Unique Constraints	IOT table primary keys excluded
5. Copy the data for each table	Only columns of type VARCHAR2, DATE, and NUMBER are supported
6. Enable Triggers, Primary Keys, Unique Constraints	IOT table primary keys excluded
7. Rebuild Indexes	IOT table primary keys excluded
8. Synchronize sequences	-
9. Enable Foreign / Check Constraints	-

Installation

The program will only run on Windows and requires that the .Net Framework be installed.

Evidently, you must also have the Oracle Client installed on your machine, along with access to an Oracle instance. The software has been run successfully on Oracle8i and Oracle9i.

Finally, download and unzip the install program and run the setup.exe program.

Wish List

Disable indexes before copying data to the target schema (and re-create them once all the data has been copied)
Handle more column datatypes, esp. the various LOB types
Add a checkbox to make the rebuilding of indexes optional (Rebuilding large indexes takes a long time)
Log the output / allow saving the log of actions
Execute a dbms_utility.analyze_schema() and dbms_utility.compile_schema() at the end of the process (The calls fail when called from ADO, so I would like to try from OCI)
Add a text box where the user can specify what tables to exclude from the process (Ex. WHERE table_name != 'ref_codes' - this would be appended to the SQL that is executed to get the list of table names in the schemas)

ScreenShots

Startup window

Defines the source and destination schemas.

The first checkbox should normally be left checked. However, if you cancel a copy in progress, you risk leaving the target schema with all its constraints disabled. In such a case you could re-run CopySchema with this option unchecked so that all triggers/constraints are disabled and then later re-enabled.
When the 2nd checkbox is checked, the foreign key constraints will take longer to enable, although you will be sure that there are no inconsistencies in the schema. Normally your original data should not have become corrupted, so you can normally run without this option turned on.

Intersected Schemas

Read the list of differences between the two schemas and decide whether you wish to start the copy or not.

While Copying

Here you see a screenshot of what the program looks like while it is copying the data. The current row count for each thread is updated every 10,000 rows.

Continue...