
www.Usenet.com
| <-- __Chronological__ --> | <-- __Thread__ --> |
"David Portas" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > > living people than VendorID values. I don't think an identity VendorID > > value is any more artificial than any other unique way of identifying a > > vendor, so long as it is assigned the first time a vendor enters the > > system and is never changed. > > But the point is that a "natural" key is verifiable outside of the system. > > When I see a NG post from "skass[at]drew.edu" I don't care whether that's > based on your "real" name or even whether S.Kass is the same name as on your > passport or driver's licence. What's important to me is that it's determined > by a consistent method outside of the system which gives me some acceptable > degree of confidence that you're the same person who posted here as > "skass[at]drew.edu" yesterday. Of course that validity is destroyed if you > change your email address or if someone spoofs your address. But it's still > intrinsically better than an arbitrary ID allocated by the server. > I am a firm believer that natural keys should only be used to logically design/normalize the data. When it comes to the real reason for keys, data integrity, more often than not I have seen that natural keys are intrinsically not good physical primary keys. 1. Natural keys are, being natural and therefore user entered [i.e., provided to the database by external means], fungible. If a user enters data, they must also be able to modify it. If data can be modified, then its value as a systemic primary key is gone. Yes, you can cascade updates to these, but why do it when it can be avoided to start with. 2. Natural keys are typically a composite of atomic attributes. If using a composite, these must be propagated to referencing tables as foreign keys. Your normalization drops below par, by having these [potentially] massively duplicated columns. Attributes that are single, [supposedly] unique attributes (e.g., SSN), usually represent some official, governmentally recognized ID, and therefore have legal issues with being propagated throughout a system. Also, for amateurs and many professionals, natural keys are very often chosen incorrectly. e.g., I believe some combination of Name and other info has been used by my ISP as their primary key. My last name was entered into their system incorrectly, but they cannot fix it because their system will not allow it. Preposterous and poor design. Surrogate keys generated by using the identity property are ideal for data integrity, because 1. They are static values [i.e., once entered, it does not change] and the DBA has control over allowing values in identity columns to be modified. 2. They are singleton row ids. The fact that they are sequential is irrelevant. That is simply the most efficient means of generating new numeric values. Identity integers can be problematic in two-way replication, but proper management of key ranges can alleviate these issues. GUIDs are the MS recommended way to deal with distributed data and two-way replication, but not as easy to deal with in unreplicated databases.
| <-- __Chronological__ --> | <-- __Thread__ --> |