???
It helps to differentiate between configuration
data and transaction data, because they have quite different
requirements.
Examples for configuration data:
- Supported languages: English, German, ?/p>
- Accepted currencies: USD, SGD, CHF, ?/p>
- Countries: US, SG, CH
Configuration data changes seldom and has usually
only few table entries. A configuration data record has often only 3 fields
(Index, Short Form, Long Form):
|
CountryID
|
Country
|
CountryDesc
|
|
1
|
US
|
United States of
America
|
|
2
|
SG
|
Singapore
|
|
3
|
CH
|
Switzerland
|
If
the configuration data will not change between software updates, it抯 better to
implement them as enumeration in the higher tiers and constraints in the data
base (instead using relation integrity checks).
Examples for transactional data:
- Shopping carts
- Orders
- Activity logs (WebPages requested by user, emails sent,
etc.)
Transactional data have usually many fields. Often a
transaction (order) has several states (ordered, paid, processed, invalid).
Once a transaction record is created, most fields shouldn抰 change anymore
(except status). Some fields might be added once a status changes (time of
status change, by whom, etc.). Transactional data shouldn抰 be deleted, but
marked as invalid.
Transactional data record should have at least a created
(date) and by (user) field. Another possibility is to copy each time a
transactional record is changed the original record to another table with a
trigger. This allows to prove who has made which change.
Every money amount needs usually 3 fields:
- Amount in actual currency
- Currency of money value
- Amount converted in accounting currency (book value)
|
Amount
|
Currency
|
Book
|
|
100
|
USD
|
100.000000
|
|
100
|
SGD
|
65.6789
|
|
100
|
CHF
|
71.2345
|
Calculation of book value:
Amount x exchange rate
of that day
Book values should be stored as data type Money.
Each
table should have a primary key and be linked to related tables.
Use
whenever possible an identity column as primary key, since it抯 an integer and
managed by the database server. Integers are perfect for indexes. They can be
stored in few bytes and integer comparison is faster than string comparison
(important for finding a value).
If a string would be the natural candidate as
primary key, use none the less an identity column, but remember to put a unique
index on the string column.
The
handling of null is counterintuitive. The comparison null=null returns false ! But
also null<>null
returns false !!! Many developers 揻orget?to detect null and to write
additional code for it. Therefore, it抯 best to avoid using null. This can
easily be done for strings, where an empty string most of the time has the same
meaning as null (not defined). With numbers it抯 trickier, because sometimes 0
means something else than not defined. It that case, null cannot be avoided.
An Index improves searching in a table, but adds overhead
for index maintenance and uses more space. SQL Server creates indexes for
primary keys automatically. Other indexes have to be added manually.
|
Example
|
Meaning
|
|
PK_Xxx
|
Primary Key
|
|
FK_Xxx
|
Foreign Key
|
|
UK_Xxx
|
Unique Key
|
If
2 tables are logically related, create a link between them. This allows the
database to guarantee relational integrity. Normally, one of the linked fields
should be a primary key.
Using a meaningless number (created by server) for
linking tables allows for easy 搑enaming? Any value can be renamed without
breaking relationships.
Remember to create an index for the foreign key, SQL
Server doesn抰 do it automatically !
Stored procedures provide the .NET developer with a
database interface. They are responsible for the consistency of the database
and to report any problem back to .NET. Stored procedures are often called by
code from different developers. Therefore, special care has to be taken to
write robust code.
Stored procedures should be formatted according the
general formatting rules specified in 3.2 Coding Format and according to the rules from this chapter.
Indent stored procedure code like other programs, i.e. each
statement on its own line and indented by 4 spaces if within a language control
structure:
-- instead of
checking if session exists, just try to update it
UPDATE Session
SET AccessCount=AccessCount+1, LastAccess=getdate()
WHERE SessionID=@SessionID
if @@rowcount=0 begin
?牋-- session
did not exist, create a new one
?牋INSERT INTO
Session (SessionID, PCCookieID)
?牋VALUES (@SessionID,
@PCCookieID)
end
Indent also code between begin transaction and commit.
Use lower case for all Transact SQL key words,
except query statements (SELECT, UPDATE, INSERT, DELETE). Use Pascal Casing for
names created by developer.
•
Use -- for one line comments
•
Use /* xxx xxxx */ for multi line comments
-- read return
parameters
SELECT @CookieUserName = CookieUserName?
FROM PCCookie
WHERE PCCookieID=@PCCookieID
/* to increase
performance, this code could be included in the cookie handling section, but
爄t would bloat
the code and blur the program logic */
•
Write every SQL clause on a new line
•
Use abbreviations for every table name, one letter for each part
in table name
•
Use fully qualified table name (including dbo. prefix)
SELECT lp.ClientIP, lp.RequestURL, lp.PreviousURL,
lp.Browser, lp.Created,
牋?pc.CookieUserName,
pc.PCCookieID, s.SessionID
FROM dbo.PCCookie pc
INNER JOIN dbo.Session s ON s.PCCookieID = pc.PCCookieID
INNER JOIN dbo.LogPage lp ON lp.SessionID = s.SessionID
WHERE (pc.PCCookieID = '{A6FCE542-28F6-436B-BB6A-069B294F336A}')
ORDER BY lp.Created
Special care has to be taken for writing JOIN
clauses. Start with the table which is included in the WHERE clause. Then add
table following the links from this table. If several tables are involved in
the WHERE clause, start with the table which returns the fewest records.
The following graphic shows how the tables from the queries
above are linked.

The Query asks for all LogPages sharing the same PCCookie
(user). The Query starts with PCCookie, then links to Session and finally
LogPage.
Consider the following Query returning all pages accessed
today. It could look nearly like the previous query:
SELECT lp.ClientIP, lp.RequestURL, lp.PreviousURL,
lp.Browser, lp.Created,
牋?pc.CookieUserName,
pc.PCCookieID, s.SessionID
FROM dbo.PCCookie pc
INNER JOIN dbo.Session s ON s.PCCookieID = pc.PCCookieID
INNER JOIN dbo.LogPage lp ON lp.SessionID = s.SessionID
WHERE (lp.Created = getdate()
ORDER BY lp.Created
However, it抯 better to write the query as:
SELECT lp.ClientIP, lp.RequestURL, lp.PreviousURL,
lp.Browser, lp.Created,
牋?pc.CookieUserName,
pc.PCCookieID, s.SessionID
FROM dbo.LogPage lp
INNER JOIN dbo.Session s ON s.SessionID = lp.SessionID
INNER JOIN dbo.PCCookie pc ON pc.PCCookieID = s.PCCookieID
WHERE (lp.Created = getdate()
ORDER BY lp.Created
OPTION (FORCE
ORDER)
Especially, if many joins are involved, the database
usually doesn抰 find the fastest way to perform the query. If the query is
ordered as recommended, the query runs often faster with the added option OPTION (FORCE ORDER). However, this
needs to be verified with the Query Analyzer and Execution Plan.
SQL
server stores all stored procedures per database. Since there might be many stored
procedures, it抯 important to use a naming convention resulting in a meaningful
grouping when the names are sorted alphabetically. The structure is:
Abbreviation - Verb ?Object - Index
Since there is no storage hierarchy for stored
procedures, stored procedures can be grouped together only by name. Procedures
accessing the same group of tables should start their name with the same
abbreviation:
Shopping
Cart:
SC_GetShoppingCart
SC_AddItem
SC_PlaceOrder
The verb states which kind of access is intended:
|
Verb
|
Description
|
|
Add
|
create new record (INSERT ?
|
|
Get
|
read database (SELECT ?
|
|
Update
|
change record (UPDATE ?
|
|
Del
|
delete (DELETE ?
|
Other verbs might be used as appropriate. Delete is most of
the time not needed, since data shouldn抰 be deleted, only marked as invalid.
Data is usually only deleted after few years !
Table to be accessed, name for group of tables or
entity involved
If no Index is indicated in the procedure name, the
primary key is used.
Examples
SC_AddShoppingCart
SC_GetShoppingCart
SC_GetShoppingCartByUser
SC_GetItemsBySC
(Primary key for Shopping Cart Item is ShoppingCartItemID, not ShoppingCartID)
/******************************************************************************
Ses_WriteCookieInfo
===================
Update user
specific information for existing cookie
Error raised:
1) @PCCookieID not
found
2) Concurrency
problem, @TimeStamp didn't match the value from TimeStamp field
3) This error condition
should never occur. If it does happen, reason needs to be found.
Copyright (c)
Peter Huber, Singapore, 2003
*******************************************************************************/
For important parameters like keys, no
special code is needed to test for missing parameters or if they are null,
because a run time error will be raised by the database during database
modification. For pure data reading (SELECT), simply no results will be
returned (however, no runtime error will be generated).
Since it抯 recommended to reduce Allow Nulls
in table columns as much as possible and using defaults instead like empty
string or 0, we need to assign the default value defined in the database to unimportant
parameters if they are missing or null. Otherwise a runtime error would be
raised, which we don抰 want to happen for unimportant data.
Value ranges don抰 need to be tested. They should be
defined as constraints and should raise a run time error.
The statements INSERT, UPDATE and DELETE are
supposed to change data in the database. The outcome of these statements must
be controlled immediately after every operation. One must be especially careful
with statements having WHERE clauses, because they might not find the record as
expected, but also not raise an error message.
If an error is detected, an appropriate error
message has to be raised (see example) and the return code of the stored
procedure should be >0.
•
The UPDATE needs only 3 lines of code, but error detection and error
reporting over 20 ! This means that if the update is successful, there is
hardly any performance penalty for the concurrency detection.
•
There might be several reasons why the UPDATE didn抰 change any
record:
- Error
in UPDATE statement => @@error<>0
- No
record exists for PCCookie
- Record
exists, but concurrency problem (provided timestamp doesn抰 match with
timestamp in database).
•
The error message created must report which case
above it was and provide enough data (keys, etc.) to allow further
investigation of the database.
•
Please note that @@rowcount and @@Error must be
immediately checked after UPDATE. For example, it抯 not possible to use
2 set statements to assign each to a variable ! Use instead:
燬ELECT
@ErrorUpdate= @@error, @ErrorCount=@@rowcount
•
The error message created must report which case
above it was and provide enough data (keys, etc.) to allow further
investigation of the database.
•
Note the last raiserror, which reports even a case 4,
which theoretically shouldn抰 happen.
-- update
PCCookie
UPDATE dbo.PCCookie
SET CookieUserName=@CookieUserName
WHERE PCCookieID=@PCCookieID and @TimeStamp=[TimeStamp]
-- check if no
record was updated
if (@@rowcount=0) and (@@error=0) begin
牋?-- does
@CookieID exist ?
牋?declare @CookieIDString varchar(50)
牋?set @CookieIDString
= convert(varchar(50), @PCCookieID)
牋?if not
exists (SELECT * FROM dbo.PCCookie WHERE PCCookieID=@PCCookieID)
begin
牋牋牋?raiserror('@PCCookieID
not found: %s', 16, 1, @CookieIDString)
牋?end else
begin
牋牋牋?-- does
@CookieID exists, is @TimeStamp wrong ?
牋牋牋?declare
@DBTimeStamp timestamp
牋牋牋?
牋牋牋?SELECT @DBTimeStamp
= TimeStamp
牋牋牋?FROM dbo.PCCookie
牋牋牋?WHERE PCCookieID=@PCCookieID
牋牋牋?
牋牋牋?--
prepare @TimeStamp and TimeStamp as strings for error messages
牋牋牋?declare
@TimeStampBig bigint
牋牋牋?set @TimeStampBig?
= convert(bigint, @TimeStamp)
牋牋牋?declare
@DBTimeStampBig bigint
牋牋牋?set @DBTimeStampBig?
= convert(bigint, @DBTimeStamp)
牋牋牋?if (@DBTimeStamp<>@TimeStamp)
begin
牋牋牋牋牋?raiserror('Concurrency
problem: PCCookieID %s was read with timestamp %u but has now timestamp %u',
16, 2, @CookieIDString, @TimeStampBig, @DBTimeStampBig)
牋?牋牋end
else begin
牋牋牋牋牋?--
This error condition should never occur, because the update
牋牋牋牋牋?-- statement
didn't genereate an error (@@error=0) and there is a
牋牋牋牋牋?--
cookie for @PCCookieID with a TimeStamp=@TimeStamp, but no record
牋 牋牋牋牋?-
was updated (@@rowcount=0).?
牋牋牋牋牋?raiserror('This
error condition should never occur: PCCookieID %s was read with timestamp %u
and still has timestamp %u, but did not work', 16, 3, @CookieIDString,
@TimeStampBig, @DBTimeStampBig)
牋牋?牋end
牋?end
end
Write
for each xyz stored procedure a xyzTest stored procedure. Reasons:
•
Testing of stored procedures without the involvement of .NET
developers
•
Helps to document the proper use of the stored procedure
•
Allows for easy verification of interface to layers above if
underlying database structure changes in the future.
•
It is easier to debug with query analyzer than within Visual
Studio
Delete old test data at the beginning of the test.
This allows inspection of test data after the test has been run.
Precede each test
with print statements:
1) Empty line
2) Explanation of test, starting with ?code>---- ?/p>
3) More lines as needed for expected error messages,
etc., intended by 5 blanks
print ''
print '---- write null parameters'
print '牋牋 "@PCCookieID not found: (null)" error
expected'
•
call without any parameters
•
call for non existing keys
•
call with the minimum amount of parameters
•
calls with wrong parameter
•
test concurrency detection
/**********************************************
Ses_WriteCookieInfoTest
-----------------------
Test procedure
for WriteCookieInfoTest
Copyright Peter
Huber, Singapore, 2003
**********************************************/
alter procedure
Ses_WriteCookieInfoTest
as
set nocount on
declare
?@PCCookieID
uniqueidentifier,?
?@CookieUserName
nVarChar(50),
?@CookieLanguage
smallint
-- delete old
test data
execute Ses_DeleteTestData
print ''
print '---- write null parameters'
print '牋牋 "@PCCookieID not found: (null)" error
expected'
execute Ses_WriteCookieInfo @PCCookieID, @CookieUserName,
@CookieLanguage
print ''
print '---- update non existant PCCookieID'
set @PCCookieID='00000000-0000-0000-0000-000000000000'
print '牋牋 "@PCCookieID not found:
00000000-0000-0000-0000-000000000000" error expected'
execute Ses_WriteCookieInfo @PCCookieID
print ''
print '---- write only required parameter @PCCookieID'
INSERT INTO PCCookie
(PCCookieID)
VALUES(@PCCookieID)
execute Ses_WriteCookieInfo @PCCookieID
print ''
print '---- write all parameters'
set @CookieUserName = ''
set @CookieLanguage = 2
execute Ses_WriteCookieInfo @PCCookieID, @CookieUserName,
@CookieLanguage
print ''
print '---- write invalid language parameter'
print '牋牋 "UPDATE statement conflicted with COLUMN
CHECK constraint" error expected'
set @CookieLanguage = 999
execute Ses_WriteCookieInfo @PCCookieID, @CookieUserName,
@CookieLanguage
The peer review of the stored procedure is also a
good time to check together if the tables, indexes, constraints, etc. are setup
properly.
Stored Procedure:
•
Proper formatting and documentation ?
•
Is each local variable initialized before use ?
•
Do all stored procedure parameter definitions correspond with the
database table column definitions ?
•
Are varchars?length compared to the column width definition in
the database ?
•
Are input parameter抯 tested or default values assigned ?
•
Is the success of any data modification statement (INSERT, UPDATE,
DELETE) immediately tested (@@error, @@rowcount) ?
•
Is transaction control implemented if several data modification
statements are involved ?
•
Are cases handled properly where one of several data modification
statements fails ?
•
Are concurrency problems detected ?
•
Check Execution Plan
- Does
it contain any full table scans ? Can they be avoided ? Needs an index to be
added ?
- Is
the sequence of table accessed good (from tables with few resulting records to
tables with many records)
•
Has every stored procedure a test procedure ?
Database Setup:
•
Have all tables a primary key (if possible identity column) ?
•
Are unique index defined for unique fields ?
•
Are table properly linked ? Are indexes created for foreign keys
?
•
Are value ranges protected by constraints ?
•
Are the needed test procedures written and reviewed ?
Most data should be kept in a database, even if it
is no longer valid. This allows later investigation if something goes wrong.
Don抰 delete configuration data as long transactional data reference them and delete
transactional data only after years according business requirements.
The Don抰 Delete Data rule applies for the
users. However, for data maintenance and testing, delete procedures are needed.
Make sure that the content of every table can be deleted by at least one stored
procedure.
Deletion is tricky, because of referential
integrity. Usually, the deletion process is reversed compared to the insertion
process. Delete first the detail records and then delete the master record, if
it is not referenced by any other detail records.
???
We have to prevent 2 users from changing the same
data simultaneously. Because we don抰 delete data and inserting is not a
problem, only updates need to be protected. This problem occurs very seldom,
but must be prevented (or at least detected). This can be done with little database
performance overhead, though the quite some code is required for error
detection, by using optimistic locking. A record is only written back, when the
timestamp is still the same as during reading. Otherwise a concurrency runtime
error is raised.
Example code and comments see at 10.3.5 Test Outcome of Data Modification Statements
SQL server protects every single database operation.
An operation is performed completely or not at all. If more than one database operation
is involved, they need to be protected by explicit transaction statements.
The preferred place for transaction control is a
stored procedure, unless several databases are involved. A stored procedure has
to guarantee that the database is in a consistent state when it returns.
??? Example
Run it nightly to test consistency
???
Options ???
The data on the production server are very valuable
and can be easily spoiled by running faulty queries. Therefore, the following
procedure has to be followed:

Chapters Overview