DataDiff: The table-level configuration page

[Applies to: AdeptSQL Diff
version 1.90 and higher]
The first page of the DataDiff comparison setting dialog controls the
comparison as a whole, which includes row filtering, server-to-server
synchronization and buffer management options.
The row filter condition which you can
optionally specify in this dialog allows you to compare
only a subset of the records. DataDiff will use the filter in
the WHERE
clause of the two queries fetching data, so it must be a
valid SQL
expression returning True or False. You can tell the Diff
to use the same filter for both tables (which is the
default), or enter separate conditions for each table.
Separate filters may be useful, for example, in a situation
when one table contains only a subset of rows from the other one, and
you only need to compare those rows.
The program remembers the filters you last used for these
particular tables, as well as keeps the history of several
most recently entered filter expressions, so that you don't
have to re-type them every time.
The Server-side data synchronization option
tells the Diff that the preferrable way to synchronize data (especially
BLOB values) is to obtain new values directly from the "source"
database, as opposite to using the values loaded into DataDiff buffers.
With this option enabled,
server-side scripting may be turned on and off for each particular column.
When both database are on the same server, server-side
scripting is done simply by referring to the "source" table and its
columns using database name. For example, an update statement would
look
like this:
UPDATE dbo.Countries SET
Name=_src.Name
FROM dbo.Countries _tgt INNER
JOIN Bookstore.dbo.Countries _src
ON (_tgt.CountryID=N'wf') AND
(_src.CountryID=N'wf')
When the
databases are on separate servers, DataDiff uses the "linked
server" mechanism to synchronize the data.
You can specify the linked server aliases in the provided
text fields.
If the links between the two servers are not established yet,
the DataDiff can generate "exec sp_addlinkedserver..." for you.
For the login information it uses the same login and
password you've specified in the connection dialog.
If you need the servers to connect to each other under
different login, link the servers manually.
Disposing of
identical records. Usually the DataDiff keeps in
memory
all the rows loaded for comparison. Obviously, this means the maximum
amount of compared data is limited by available system resources
(memory). However, as the DataDiff reads from both tables
simultaneously and compares the rows on the fly, you can tell it to
immediately discard identical rows and reclaim the memory. In this mode
DataDiff can compare tables of any size, provided that the number of changed
rows is not
too big.
Another method of saving on the buffer space (and comparison
time) is to use 16-byte MD5 digests instead of lengthy BLOBs and
textual values.
This MD5 threshold can be configured separately for each column as
described on the next page.
Sorting data on
the server. The data comparison algorithm used
by DataDiff does not require the data to be coming from server ordered
in any particular way. However, with this option checked, the DataDiff
will include "ORDER BY list-of-key-columns" clause into the queries.
This will have the following advantages:
- Immediately after loading, the data will be displayed in
the correct order. This is convenient, but not very
important, because you can re-sort the viewed records by simply
clicking on any column headers.
- If you are comparing two extra-large tables with the "[x]
Don't keep identical records" option, sorting the incoming data on
server insures that matching records will be coming in more or less
simultaneously and therefore can be compared and disposed of
efficiently.
Please note
that sorting on the server does require time. If
there is no index on the
columns which you use as the key columns during the comparison, and the
tables are large, the sorting time can easily exceed the timeout
interval and the data scan would fail.
|