|
The first page of the
DataDiff comparison setting dialog controls the comparison as a
whole, which includes row filtering, server-to-server
synchronization and buffer management options.
The
row filter condition which you can
optionally specify in this dialog allows you to compare only a
subset of the records. DataDiff will use the filter in the WHERE
clause of the two queries fetching data, so it must be a valid SQL
expression returning True or False. You can let the Diff use the
same filter for both tables (which is the default), or enter
separate conditions for each table. Separate filters may be
useful, for example, in a situation when one table contains only a
subset of rows from the other one, and you only need to compare
those rows.
The program remembers the
filters you last used for these particular tables, as well as keeps
a history of several most recently entered filter expressions, so
that you don't have to re-type them every time.
The
Server-side data synchronization option tells the Diff
that the preferable way to synchronize data (especially BLOB
values) is to obtain new values directly from the "source"
database, as opposite to using the values loaded into DataDiff
buffers. With this option enabled, server-side
scripting may be turned on and off for
each particular
column.
When both database are on
the same server, server-side scripting is done simply by referring
to the "source" table and its columns using database name. For
example, an update statement would look like this:
UPDATE dbo.Countries SET
Name=_src.Name
FROM dbo.Countries _tgt INNER JOIN Bookstore.dbo.Countries _src
ON (_tgt.CountryID=N'wf') AND (_src.CountryID=N'wf')
Note
that AdeptSQL Diff 1.9X
doesn't fully support Unicode, so to avoid data loss,
any NCHAR/NVARCHAR/NTEXT columns actually containing multi-language
characters must be synchronized through server-to-server
scripting!
When the databases are on
separate servers, DataDiff uses the "linked server" mechanism to
synchronize the data. You can specify the linked server
aliases in the provided text fields.
If the links between the
two servers are not established yet, the DataDiff can generate
"exec sp_addlinkedserver..." for you. For the login
information it uses the same login and password you've specified in
the connection dialog. If you need the servers to connect to
each other under different login, link the servers manually.
Disposing
of identical records. Usually the
DataDiff keeps in memory all the rows loaded for comparison.
Obviously, this means the maximum amount of compared data is
limited by available system memory. However, the DataDiff
reads from both tables simultaneously and compares the rows on the
fly, so you can tell it to immediately discard identical rows and
reclaim the memory. In this mode DataDiff can compare tables of any
size, provided that the number of changed
rows is not
too big.
Another method of saving
on the buffer space (and comparison time) is to use 16-byte MD5
digests instead of lengthy BLOBs and textual values. This MD5
threshold can be configured separately for each column as
described
on the next
page.
Sorting
data on the server. The data comparison
algorithm used by DataDiff does not require the data to be coming
from server ordered in any particular way. However, with this
option checked, the DataDiff will include "ORDER BY
list-of-key-columns" clause into the queries. This will have the
following advantages:
|
·
|
Immediately after
loading, the data will be displayed in the correct order.
This is convenient, but not very important, because you can re-sort
the viewed records by simply clicking on any column headers.
|
|
·
|
If you
are comparing two extra-large tables with the "[x] Don't keep
identical records" option, sorting the incoming data on server
insures that matching records will be coming in more or less
simultaneously and therefore can be compared and disposed of
efficiently.
|
Please
note that sorting on the
server does require time. If
there is no index on the columns which you use as the key columns
during the comparison, and the tables are large, the sorting time
can easily exceed the timeout interval and the data scan would
fail.
|