???
It helps to differentiate between configuration
data and transaction data, because they have quite different
requirements.
Examples for configuration data:
- Supported languages: English, German, …
- Accepted currencies: USD, SGD, CHF, …
- Countries: US, SG, CH
Configuration data changes seldom and has usually
only few table entries. A configuration data record has often only 3 fields
(Index, Short Form, Long Form):
|
CountryID
|
Country
|
CountryDesc
|
|
1
|
US
|
United States of
America
|
|
2
|
SG
|
Singapore
|
|
3
|
CH
|
Switzerland
|
If
the configuration data will not change between software updates, it’s better to
implement them as enumeration in the higher tiers and constraints in the data
base (instead using relation integrity checks).
Examples for transactional data:
- Shopping carts
- Orders
- Activity logs (WebPages requested by user, emails sent,
etc.)
Transactional data have usually many fields. Often a
transaction (order) has several states (ordered, paid, processed, invalid).
Once a transaction record is created, most fields shouldn’t change anymore
(except status). Some fields might be added once a status changes (time of
status change, by whom, etc.). Transactional data shouldn’t be deleted, but
marked as invalid.
Transactional data record should have at least a created
(date) and by (user) field. Another possibility is to copy each time a
transactional record is changed the original record to another table with a
trigger. This allows to prove who has made which change.
Every money amount needs usually 3 fields:
- Amount in actual currency
- Currency of money value
- Amount converted in accounting currency (book value)
|
Amount
|
Currency
|
Book
|
|
100
|
USD
|
100.000000
|
|
100
|
SGD
|
65.6789
|
|
100
|
CHF
|
71.2345
|
Calculation of book value:
Amount x exchange rate
of that day
Book values should be stored as data type Money.
Each
table should have a primary key and be linked to related tables.
Use
whenever possible an identity column as primary key, since it’s an integer and
managed by the database server. Integers are perfect for indexes. They can be
stored in few bytes and integer comparison is faster than string comparison
(important for finding a value).
If a string would be the natural candidate as
primary key, use none the less an identity column, but remember to put a unique
index on the string column.
The
handling of null is counterintuitive. The comparison null=null returns false ! But
also null<>null
returns false !!! Many developers “forget” to detect null and to write
additional code for it. Therefore, it’s best to avoid using null. This can
easily be done for strings, where an empty string most of the time has the same
meaning as null (not defined). With numbers it’s trickier, because sometimes 0
means something else than not defined. It that case, null cannot be avoided.
An Index improves searching in a table, but adds overhead
for index maintenance and uses more space. SQL Server creates indexes for
primary keys automatically. Other indexes have to be added manually.
|
Example
|
Meaning
|
|
PK_Xxx
|
Primary Key
|
|
FK_Xxx
|
Foreign Key
|
|
UK_Xxx
|
Unique Key
|
If
2 tables are logically related, create a link between them. This allows the
database to guarantee relational integrity. Normally, one of the linked fields
should be a primary key.
Using a meaningless number (created by server) for
linking tables allows for easy “renaming”: Any value can be renamed without
breaking relationships.
Remember to create an index for the foreign key, SQL
Server doesn’t do it automatically !
Stored procedures provide the .NET developer with a
database interface. They are responsible for the consistency of the database
and to report any problem back to .NET. Stored procedures are often called by
code from different developers. Therefore, special care has to be taken to
write robust code.
Stored procedures should be formatted according the
general formatting rules specified in 3.2 Coding Format and according to the rules from this chapter.
Indent stored procedure code like other programs, i.e. each
statement on its own line and indented by 4 spaces if within a language control
structure:
-- instead of
checking if session exists, just try to update it
UPDATE Session
SET AccessCount=AccessCount+1, LastAccess=getdate()
WHERE SessionID=@SessionID
if @@rowcount=0 begin
-- session
did not exist, create a new one
INSERT INTO
Session (SessionID, PCCookieID)
VALUES (@SessionID,
@PCCookieID)
end
Indent also code between begin transaction and commit.
Use lower case for all Transact SQL key words,
except query statements (SELECT, UPDATE, INSERT, DELETE). Use Pascal Casing for
names created by developer.
•
Use -- for one line comments
•
Use /* xxx xxxx */ for multi line comments
-- read return
parameters
SELECT @CookieUserName = CookieUserName
FROM PCCookie
WHERE PCCookieID=@PCCookieID
/* to increase
performance, this code could be included in the cookie handling section, but
it would bloat
the code and blur the program logic */
•
Write every SQL clause on a new line
•
Use abbreviations for every table name, one letter for each part
in table name
•
Use fully qualified table name (including dbo. prefix)
SELECT lp.ClientIP, lp.RequestURL, lp.PreviousURL,
lp.Browser, lp.Created,
pc.CookieUserName,
pc.PCCookieID, s.SessionID
FROM dbo.PCCookie pc
INNER JOIN dbo.Session s ON s.PCCookieID = pc.PCCookieID
INNER JOIN dbo.LogPage lp ON lp.SessionID = s.SessionID
WHERE (pc.PCCookieID = '{A6FCE542-28F6-436B-BB6A-069B294F336A}')
ORDER BY lp.Created
Special care has to be taken for writing JOIN
clauses. Start with the table which is included in the WHERE clause. Then add
table following the links from this table. If several tables are involved in
the WHERE clause, start with the table which returns the fewest records.
The following graphic shows how the tables from the queries
above are linked.

The Query asks for all LogPages sharing the same PCCookie
(user). The Query starts with PCCookie, then links to Session and finally
LogPage.
Consider the following Query returning all pages accessed
today. It could look nearly like the previous query:
SELECT lp.ClientIP, lp.RequestURL, lp.PreviousURL,
lp.Browser, lp.Created,
pc.CookieUserName,
pc.PCCookieID, s.SessionID
FROM dbo.PCCookie pc
INNER JOIN dbo.Session s ON s.PCCookieID = pc.PCCookieID
INNER JOIN dbo.LogPage lp ON lp.SessionID = s.SessionID
WHERE (lp.Created = getdate()
ORDER BY lp.Created
However, it’s better to write the query as:
SELECT lp.ClientIP, lp.RequestURL, lp.PreviousURL,
lp.Browser, lp.Created,
pc.CookieUserName,
pc.PCCookieID, s.SessionID
FROM dbo.LogPage lp
INNER JOIN dbo.Session s ON s.SessionID = lp.SessionID
INNER JOIN dbo.PCCookie pc ON pc.PCCookieID = s.PCCookieID
WHERE (lp.Created = getdate()
ORDER BY lp.Created
OPTION (FORCE
ORDER)
Especially, if many joins are involved, the database
usually doesn’t find the fastest way to perform the query. If the query is
ordered as recommended, the query runs often faster with the added option OPTION (FORCE ORDER). However, this
needs to be verified with the Query Analyzer and Execution Plan.
SQL
server stores all stored procedures per database. Since there might be many stored
procedures, it’s important to use a naming convention resulting in a meaningful
grouping when the names are sorted alphabetically. The structure is:
Abbreviation - Verb – Object - Index
Since there is no storage hierarchy for stored
procedures, stored procedures can be grouped together only by name. Procedures
accessing the same group of tables should start their name with the same
abbreviation:
Shopping
Cart:
SC_GetShoppingCart
SC_AddItem
SC_PlaceOrder
The verb states which kind of access is intended:
|
Verb
|
Description
|
|
Add
|
create new record (INSERT …)
|
|
Get
|
read database (SELECT …)
|
|
Update
|
change record (UPDATE …)
|
|
Del
|
delete (DELETE …)
|
Other verbs might be used as appropriate. Delete is most of
the time not needed, since data shouldn’t be deleted, only marked as invalid.
Data is usually only deleted after few years !
Table to be accessed, name for group of tables or
entity involved
If no Index is indicated in the procedure name, the
primary key is used.
Examples
SC_AddShoppingCart
SC_GetShoppingCart
SC_GetShoppingCartByUser
SC_GetItemsBySC
(Primary key for Shopping Cart Item is ShoppingCartItemID, not ShoppingCartID)
/******************************************************************************
Ses_WriteCookieInfo
===================
Update user
specific information for existing cookie
Error raised:
1) @PCCookieID not
found
2) Concurrency
problem, @TimeStamp didn't match the value from TimeStamp field
3) This error condition
should never occur. If it does happen, reason needs to be found.
Copyright (c)
Peter Huber, Singapore, 2003
*******************************************************************************/
For important parameters like keys, no
special code is needed to test for missing parameters or if they are null,
because a run time error will be raised by the database during database
modification. For pure data reading (SELECT), simply no results will be
returned (however, no runtime error will be generated).
Since it’s recommended to reduce Allow Nulls
in table columns as much as possible and using defaults instead like empty
string or 0, we need to assign the default value defined in the database to unimportant
parameters if they are missing or null. Otherwise a runtime error would be
raised, which we don’t want to happen for unimportant data.
Value ranges don’t need to be tested. They should be
defined as constraints and should raise a run time error.
The statements INSERT, UPDATE and DELETE are
supposed to change data in the database. The outcome of these statements must
be controlled immediately after every operation. One must be especially careful
with statements having WHERE clauses, because they might not find the record as
expected, but also not raise an error message.
If an error is detected, an appropriate error
message has to be raised (see example) and the return code of the stored
procedure should be >0.
•
The UPDATE needs only 3 lines of code, but error detection and error
reporting over 20 ! This means that if the update is successful, there is
hardly any performance penalty for the concurrency detection.
•
There might be several reasons why the UPDATE didn’t change any
record:
- Error
in UPDATE statement => @@error<>0
- No
record exists for PCCookie
- Record
exists, but concurrency problem (provided timestamp doesn’t match with
timestamp in database).
•
The error message created must report which case
above it was and provide enough data (keys, etc.) to allow further
investigation of the database.
•
Please note that @@rowcount and @@Error must be
immediately checked after UPDATE. For example, it’s not possible to use
2 set statements to assign each to a variable ! Use instead:
SELECT
@ErrorUpdate= @@error, @ErrorCount=@@rowcount
•
The error message created must report which case
above it was and provide enough data (keys, etc.) to allow further
investigation of the database.
•
Note the last raiserror, which reports even a case 4,
which theoretically shouldn’t happen.
-- update
PCCookie
UPDATE dbo.PCCookie
SET CookieUserName=@CookieUserName
WHERE PCCookieID=@PCCookieID and @TimeStamp=[TimeStamp]
-- check if no
record was updated
if (@@rowcount=0) and (@@error=0) begin
-- does
@CookieID exist ?
declare @CookieIDString varchar(50)
set @CookieIDString
= convert(varchar(50), @PCCookieID)
if not
exists (SELECT * FROM dbo.PCCookie WHERE PCCookieID=@PCCookieID)
begin
raiserror('@PCCookieID
not found: %s', 16, 1, @CookieIDString)
end else
begin
-- does
@CookieID exists, is @TimeStamp wrong ?
declare
@DBTimeStamp timestamp
SELECT @DBTimeStamp
= TimeStamp
FROM dbo.PCCookie
WHERE PCCookieID=@PCCookieID
--
prepare @TimeStamp and TimeStamp as strings for error messages
declare
@TimeStampBig bigint
set @TimeStampBig
= convert(bigint, @TimeStamp)
declare
@DBTimeStampBig bigint
set @DBTimeStampBig
= convert(bigint, @DBTimeStamp)
if (@DBTimeStamp<>@TimeStamp)
begin
raiserror('Concurrency
problem: PCCookieID %s was read with timestamp %u but has now timestamp %u',
16, 2, @CookieIDString, @TimeStampBig, @DBTimeStampBig)
end
else begin
--
This error condition should never occur, because the update
-- statement
didn't genereate an error (@@error=0) and there is a
--
cookie for @PCCookieID with a TimeStamp=@TimeStamp, but no record
--
was updated (@@rowcount=0).
raiserror('This
error condition should never occur: PCCookieID %s was read with timestamp %u
and still has timestamp %u, but did not work', 16, 3, @CookieIDString,
@TimeStampBig, @DBTimeStampBig)
end
end
end
Write
for each xyz stored procedure a xyzTest stored procedure. Reasons:
•
Testing of stored procedures without the involvement of .NET
developers
•
Helps to document the proper use of the stored procedure
•
Allows for easy verification of interface to layers above if
underlying database structure changes in the future.
•
It is easier to debug with query analyzer than within Visual
Studio
Delete old test data at the beginning of the test.
This allows inspection of test data after the test has been run.
Precede each test
with print statements:
1) Empty line
2) Explanation of test, starting with ‘---- ‘
3) More lines as needed for expected error messages,
etc., intended by 5 blanks
print ''
print '---- write null parameters'
print ' "@PCCookieID not found: (null)" error
expected'
•
call without any parameters
•
call for non existing keys
•
call with the minimum amount of parameters
•
calls with wrong parameter
•
test concurrency detection
/**********************************************
Ses_WriteCookieInfoTest
-----------------------
Test procedure
for WriteCookieInfoTest
Copyright Peter
Huber, Singapore, 2003
**********************************************/
alter procedure
Ses_WriteCookieInfoTest
as
set nocount on
declare
@PCCookieID
uniqueidentifier,
@CookieUserName
nVarChar(50),
@CookieLanguage
smallint
-- delete old
test data
execute Ses_DeleteTestData
print ''
print '---- write null parameters'
print ' "@PCCookieID not found: (null)" error
expected'
execute Ses_WriteCookieInfo @PCCookieID, @CookieUserName,
@CookieLanguage
print ''
print '---- update non existant PCCookieID'
set @PCCookieID='00000000-0000-0000-0000-000000000000'
print ' "@PCCookieID not found:
00000000-0000-0000-0000-000000000000" error expected'
execute Ses_WriteCookieInfo @PCCookieID
print ''
print '---- write only required parameter @PCCookieID'
INSERT INTO PCCookie
(PCCookieID)
VALUES(@PCCookieID)
execute Ses_WriteCookieInfo @PCCookieID
print ''
print '---- write all parameters'
set @CookieUserName = ''
set @CookieLanguage = 2
execute Ses_WriteCookieInfo @PCCookieID, @CookieUserName,
@CookieLanguage
print ''
print '---- write invalid language parameter'
print ' "UPDATE statement conflicted with COLUMN
CHECK constraint" error expected'
set @CookieLanguage = 999
execute Ses_WriteCookieInfo @PCCookieID, @CookieUserName,
@CookieLanguage
The peer review of the stored procedure is also a
good time to check together if the tables, indexes, constraints, etc. are setup
properly.
Stored Procedure:
•
Proper formatting and documentation ?
•
Is each local variable initialized before use ?
•
Do all stored procedure parameter definitions correspond with the
database table column definitions ?
•
Are varchars’ length compared to the column width definition in
the database ?
•
Are input parameter’s tested or default values assigned ?
•
Is the success of any data modification statement (INSERT, UPDATE,
DELETE) immediately tested (@@error, @@rowcount) ?
•
Is transaction control implemented if several data modification
statements are involved ?
•
Are cases handled properly where one of several data modification
statements fails ?
•
Are concurrency problems detected ?
•
Check Execution Plan
- Does
it contain any full table scans ? Can they be avoided ? Needs an index to be
added ?
- Is
the sequence of table accessed good (from tables with few resulting records to
tables with many records)
•
Has every stored procedure a test procedure ?
Database Setup:
•
Have all tables a primary key (if possible identity column) ?
•
Are unique index defined for unique fields ?
•
Are table properly linked ? Are indexes created for foreign keys
?
•
Are value ranges protected by constraints ?
•
Are the needed test procedures written and reviewed ?
Most data should be kept in a database, even if it
is no longer valid. This allows later investigation if something goes wrong.
Don’t delete configuration data as long transactional data reference them and delete
transactional data only after years according business requirements.
The Don’t Delete Data rule applies for the
users. However, for data maintenance and testing, delete procedures are needed.
Make sure that the content of every table can be deleted by at least one stored
procedure.
Deletion is tricky, because of referential
integrity. Usually, the deletion process is reversed compared to the insertion
process. Delete first the detail records and then delete the master record, if
it is not referenced by any other detail records.
???
We have to prevent 2 users from changing the same
data simultaneously. Because we don’t delete data and inserting is not a
problem, only updates need to be protected. This problem occurs very seldom,
but must be prevented (or at least detected). This can be done with little database
performance overhead, though the quite some code is required for error
detection, by using optimistic locking. A record is only written back, when the
timestamp is still the same as during reading. Otherwise a concurrency runtime
error is raised.
Example code and comments see at 10.3.5 Test Outcome of Data Modification Statements
SQL server protects every single database operation.
An operation is performed completely or not at all. If more than one database operation
is involved, they need to be protected by explicit transaction statements.
The preferred place for transaction control is a
stored procedure, unless several databases are involved. A stored procedure has
to guarantee that the database is in a consistent state when it returns.
??? Example
Run it nightly to test consistency
???
Options ???
The data on the production server are very valuable
and can be easily spoiled by running faulty queries. Therefore, the following
procedure has to be followed:

Chapters Overview