DataDiff: The column configuration page
[Applies to: AdeptSQL Diff
version 1.90 and higher]
1. Configuring column roles
The second page of the DataDiff comparison setting dialog deals with
individual columns. In the comparison, each column can be
assigned one of the following three roles:
-
Key
columns. Before the tables can be compared, you must
choose one or more key columns. The key fields are important, as the
program associates a row in one table with its counterpart in the other
one by comparing their key column values (or combinations of values, if
there are several columns in the key).
If there is a primary key constraint in one of both
tables, its columns will be taken as key columns by default. However,
this is not necessarily what you need. Often the PK of a table is based
on some meaningless, internally generated numeric ID, which might have
totally different values in the two databases. In such situation you
might prefer to use some other unique column of the table (e.g.
person's name or email). It is strongly advised that you choose for the
key column(s) which don't contain duplicate data (although the Diff
does handle duplicate keys as well).
In the data comparison windows, the key columns are always
displayed as the left-most columns of the grid and are indicated by
gray
background.
Ignored
columns. These are the columns which
must not take part in this comparison. They are not loaded into the
buffer (although there are exceptions),
not displayed and not scripted. By default, computed columns or columns
present in only one table are marked
as ignored.
BLOB columns may be marked so too, unless you switch off the
appropriate configuration
option. But whichever columns the program initially marks as
ignored, you can always change that before starting the comparison.
Regular
(compared) columns. These are just any
other
columns left after you've chosen the key and the ignored ones. When
the program finds a pair of records with matching key column values, it
proceeds to compare each pair of "regular" columns to see whether the
row has changed or not.
Note that the column list supports multiselection: you can
choose as many items as you like
and then change their state with a single click. The program saves
the configuration of columns for each pair of recently compared tables
and automatically recalls the settings whenever you compare the same
tables again.
2. Configuring comparison and scripting options for specific
columns
Once you select a specific column, one of more of the
following comparison
options will become enabled for it:
MD5 Digest if longer than...
Here
you can specify the maximum column length, in bytes, beyond which the
column value will be digested into a 16-byte checksum before being
loaded into the buffer. This option is only enabled for
string and binary columns. Digested columns can be reliably compared,
although obviously you can't view them. Such columns can be
scripted (regardless of their length) only if you use server-side
scripting option for them.
Compare lone
column to... It is possible that one table has
some columns which don't have their counterparts in the other table.
Normally such columns are not compared at all (automatically marked as
"Ignored", see above), but you can force DataDiff to compare them
against a specific value. Here you can specify such value,
which must be a valid SQL expression type-compatible with the lone
column. DataDiff does not try to parse or verify the expression, but
simply passes it to the server as part of the query. Therefore it need
not be a constant, but may be an expression using column names and
functions. If you leave this input field empty, but use
comparison/scripting options involving the "Compare to...", then NULL
will be used.
Numeric
tolerance... is enabled only for numeric columns (even for
integer ones, although it mostly makes sense for
FLOAT/DOUBLE/DECIMAL/CURRENCY columns). If you specify a
non-zero value here, any difference in column values not exceeding the
tolerance will be interpreted as a "minor" one. Which means
the two column values will be displayed one below the other as normal
changed values, but they won't be color-marked and won't be scripted.
NULL = 0, '',
False... This option is only enabled for
NULLable columns and indicates that differences between NULL and
various "empty" values like 0, '', 0x will be considered minor.
Case
insensitive option
is only enabled for string columns
(CHAR/VARCHAR/NCHAR/NVARCHAR/TEXT/NTEXT). If checked, column
values differing only in character case will be considered "minor
changes".
Strip trailing spaces is only enabled for [N][VAR]CHAR columns. If ON, this option makes the DataDiff to strip off any trailing blanks before comparing column values. This is the
default option for fixed-length character columns, because the values
obtained from them are always space-padded. Otherwise you would
never be able, for example, to compare value 'TEST' in a CHAR(5) column
against 'TEST' in a CHAR(6) one. Trailing spaces in [N]VARCHAR
columns may be significant, so for such columns the option will be
initially OFF and it will be up to you to switch it on if needed.
The following options determine how the selected column(s)
will be scripted. As you can see on the screenshot above, these
scripting modes can be set separately for each of the two direction.
The Diff analyzes a number of factors (such as whether the column is
nullable,
has some default value, is IDENTITY, is computed, etc) to enable only
those script modes that would make sense (produce valid SQL) for this
particular column/direction. For example, if you compare a computed column against a regular one,
the scripting options will obviously be limited to "Exclude from
inserts" on the computed-column side, but can be set differently for
the other side.
Of the enabled options, the Diff chooses an "optimal"
scripting mode for each direction. That means you need not bother
about setting scripting modes for every column if you don't want to. If you do, you can choose between a subset of the following scripting modes:
Client-side - DataDiff will use
data loaded into the
comparison buffers, as literals, to script the column. This was the
only scripting mode available in Diff versions prior to 1.90 and it is
the default scripting mode now. Note that for a variable-size
column that can be MD5-digested (that is, TEXT or BLOB), this is not a
reliable option. Rows
where the column value is long enough to become digested will not be
scripted and DataDiff will report an error for each of such rows.
Server-side
- This is
often the best scripting mode for longish variable-size text and binary
columns, which can be successfully scripted even when only their MD5
signature is present in the buffer. However, server-side
scripting is only possible when the two databases are accessible to
each other (or at least the source database is accessible from the
destination server). For numeric and other short columns
server-side scripting is also possible, but usually doesn't make much
sense, because using loaded values (that is, Client-side
scripting) provides for more compact and efficient queries.
Exclude from INSERTS - this only applies to
columns which are NULLable, are IDENTITY columns or have a default
value. It means that when an INSERT is generated, no value for this
column will be included into the VALUES list, so it will be left NULL
or initialized from its DEFAULT or the next identity number.
Use 'Compare to' value - this only applies to "lone" columns which, obviously, can be
dealt with in INSERTs in only two ways: either not included at all (see
above) or included as a constant. To make things simpler, in the latter
case the DataDiff uses that same 'Compare To' expression described
above.
|