|
Missing
columns. Normally you would
expect to compare data in tables whose structures are very similar
or identical. However it might happen that some of the columns
exist in one table but not in the other one. The column selection
dialog displays such columns as "ColumnName / [No column]". By
default such incomplete pairs are marked as "Ignored", however you
can include them into the comparison as well. In which case the
'Compare To' expression
described here
will be used
instead of the missing column. This may be convenient when a new
"flag" column has been added to the newer version of the table and
it contains NULLs for all records except for a few ones where this
flag has been specifically set. The data comparison will clearly
show these particular records.
Incompatible
column types. If a given column has
different data types in the two tables, DataDiff will make a
reasonable effort to compare the data correctly. All data is cached
as VARIANT types anyway, so any details specific to DBMS are lost
(e.g. both 'datetime' and 'smalldatetime' would become the same
type, all flavors of 'int' will become just integer, all
CHARs/VARCHARs/NVARCHARs become OLE strings, etc). Any columns
which are completely incompatible (e.g. BLOB against INT) will be
marked as such in the column selection dialog and you won't be able
to include them into comparison.
Duplicate
rows.
Two rows in the same table are considered duplicate if the
combinations of key column values in the rows are identical. Since
the only purpose of the key columns is to provide unique
identification for each row in the table, you should choose the key
so as to avoid the duplicates. It is recommended that you always
choose PK column(s) or UNIQUE column(s) as the key. If for some
reason you run data comparison with a non-unique key, the DataDiff
will still be able to handle them correctly.
BLOB
values. Although the total size
of the loaded table data is limited only by the maximum size of
system memory, DataDiff can not
keep in the
cache any single column value whose storage size exceeds 64Kb.
Furthermore, by default it only keeps values of up to 4Kb in size
(which is a user-configurable setting described
here). Whenever the Diff
encounters a BLOB value exceeding the 64Kb limit or (more likely)
the user-defined threshold, it processes the whole BLOB into a
16-byte MD5 digest code and keeps that instead. Such digested value
takes part in the comparison like any normal field, it is displayed
as "[BLOB Digest]" in the grid and it generates a warning message
when you try to produce SQL from rows containing digested fields.
The only way to reliably script long variable size columns is to
use
server-side scripting
for
them.
The
hidden keys
. There are two
situations when the DataDiff will load column(s) into the buffer
regardless of the "Ignore" status you've given them. The first such
case is when the ignored columns belong to the actual primary key,
whereas the "key" columns you've chosen are not known to contain
unique values. In this case the program would not display or
compare the PK columns, but will quietly keep them in the buffer
and use their values in the WHERE clause of UPDATE statement, to
make sure that any row can be correctly addressed and
synchronized.
Ignored
NOT NULL-able columns. The second such
situation is when values from the ignored column must be included
into the generated INSERTs anyway. If a column is NOT
NULLable and has no DEFAULT or IDENTITY attributes, it
must
be included
in any valid INSERT statement. If you specify such column as
"Ignored", the DataDiff will either load it into the buffer anyway,
or insert it directly from the server (which is probably
preferable).
Scripting
IDENTITY columns. Before scripting,
DataDiff checks if any of the modifications in the data involve
IDENTITY (auto-incremental) columns. If they do, the script is
modified as follows:
|
·
|
For the
records where the IDENTITY column is updated, DataDiff generated a
pair of DELETE/INSERT statements, deleting and reinserting the
record. Note that it should be a very rare situation, since the
IDENTITY is normally either the key field or the "hidden key" (see
above) which are never synchronized.
|
|
·
|
Whenever
the program needs to insert records with IDENTITY column (including
the DELETE/INSERT situation described above), it will put "SET
IDENTITY_INSERT <table_name> ON" before the INSERTs and the
corresponding "OFF" setting afterwards.
|
|
·
|
Deleting
of records is handled in the usual way.
|
Computed
columns. By default, any
computed columns are marked as "Ignored", don't take part in the
comparison and aren't included into the generated SQL. You may
choose to include them into comparison (so they will appear in the
data grid), but they won't be scripted anyway. If the same column
appears as computed in one table and as a regular column in the
other, it will be compared in the regular way and scripted, when
appropriate (from computed to non-computed, but not the other way
around).
|