home *** CD-ROM | disk | FTP | other *** search
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!cs.utexas.edu!natinst.com!hrd769.brooks.af.mil!hrd769.brooks.af.mil!not-for-mail
- From: burgess@hrd769.brooks.af.mil (Dave Burgess)
- Newsgroups: comp.databases
- Subject: Re: Normalizing 2NF -> 3NF
- Date: 23 Nov 1992 11:21:39 -0600
- Organization: Armstrong Lab MIS, Brooks AFB TX
- Lines: 128
- Distribution: world
- Message-ID: <1er3v3INNqfc@hrd769.brooks.af.mil>
- References: <69674@cup.portal.com> <1ejt9qINN340@hrd769.brooks.af.mil> <70064@cup.portal.com>
- NNTP-Posting-Host: hrd769.brooks.af.mil
-
- In article <70064@cup.portal.com> DVaz@cup.portal.com (David D Vazquez) writes:
- >Dave Burges writes:
- >>
- >> Where is the redundancy? In the foreign key element? That is where your
- >>cardinality will help determine the actual implementation, but we are talking
- >>Form Normalization, not implementation. In normalization, the concepts of
- >>query speed and disk space have no place. Save them for the physical
- >>implementation phase.
- >>
- >
- >I agree, but physically implemented tables generally tend to look alot like
- >their abstractly designed relatives. You can also save alot of time and
- >headaches if you implement your tables just like they've been designed. Just
- >think, you've spent 7 hours designing a database only to have your programmer/
- >implementor spend another 7 hours trying to figure out the better implementa-
- >tion (and for something silly like R1(A*,B)/R2(A*,C) vs. R1(A*,B)/R2(B*,C).
- >The database implementor shouldn't have to make these kinds of decisions!
-
- I agree 100%. That is why they should implement according to the design,
- not try to second guess what was intended. I have a 'REAL LIFE' example of this
- that I am not willing to share on the net, but E-Mail me and I will be glad to
- discuss it. It shows what happens when the implementation team codes up what
- they understood the situation to be, rather then implementing what the designer
- had documented. The nightmare continues.
-
- >Supertype/subtype and recursive relationship implementation should be the extent
- >of implementation issues.
- >
-
- That is what makes the Database methodology so inviting. It allows us the
- ability to document these types of relationships in (relatively) easy to do
- pieces that illuminate (versus obfuscate) the issue.
-
- Before I continue, I need to make sure everyone understands what I meant by
- 'Dorm'. In the military, every building at a training site has a 'name',
- whether it is a squadron name DORM or a historical figure HALL or a simple
- building number. In addition, each building on the 'campus' has a street
- address. If the term 'dorm' was taken as a room number, then we are debating
- different issues. For example, the student lives in 'Kilday Hall', but the
- street address is '123 34th Street West'. No one uses the address, except to
- get mail into the mail room. All other correspondence refers to the 'dorm
- name'.
- If the term 'dorm' were to be taken as a 'room number' then the
- relation R1(A*,B,C) could only be further normalized through the use of a
- compound key relationship where B and C are used in R2(C+B*) since this is the
- only way to develop a candidate key from the R2 relation. If this were the
- case and no other data about (dorm+room) were kept, we would still need to
- have it separated out to avoid an anomoly where the rooms disappear when no
- one lives in them unless we implement a kludge like 'VACANT' moves in whenever
- the rooms are emptied.
-
- >'Redundancy?', you ask. Given R(Student, Dorm, Dorm Address); if you implement
- >as R1(Student, Dorm) and R2(Student, Dorm Address), you are repeating values
- >of Dorm Address for each student as well as not enforcing Dorm -> Dorm Address
- >(or vice versa); you'd have to do some extra coding.. Also, you'd need two
- >copies of Student's names (one to find the dorm and one to find the dorm's
- >address); this is a waste.
-
- The dorm address is not dependent on the student; it is dependent on the
- dormitory. In fact, if the students all move out of the dormitory in your
- example, it ceases to exist. I don't think THAT was your intent (although
- it would make urban renewal a lot easier, and I have known students where, when
- they moved out, it was as much work as building a new dorm to fix it back
- up :-)) Therefore, the relationship in the example could never be
- R1(A*,B) R2(A*,C). The dorm address is transitively dependent on the
- student, but functionally dependent on the dormitory. Therefore, the only
- possible 3NF relations from the relationship are:
-
- R1(A*,B)/R2(B*,C)
- or
- R1(A*,C)/R2(C*,B)
-
- Since either the Dorm address or the dormitory name could be a candidate
- key to their relation, the selected foreign key into R2 is moot.
-
- Another way to look at it to further understanding might be to recognize the
- one-to-many relationships at work.
-
- Student (1) to (1) Dormitory
- Student (1) to (1) Dorm address
- Dormitory (1) to (1) Dorm Address
- Dormitory (1) to (m) Student
- Dorm Address (1) to (1) Dormitory
- Dorm Address (1) to (m) Student
-
- I don't want to draw pictures, but the 1-M relationship between (Dorm Name
- Student) implies that the relation should be split to AVOID redundant data
- being entered in the 2NF relation. In other words, the dorm information would
- be entered once (address/dorm name) and all of the students would reference
- that combination through a foreign key (address or name).
-
- >
- >The problem I see is that you never have one correct answer when it comes to
- >getting a database into 3NF. Should there be one, maybe?
- >
- > -Dave. "dvaz@cup.portal.com and dvazquez@sdiserv.cs.uno.edu"
-
- I have seen dozens of database designs for the sample problem in the class
- I teach. None were ever the same. On the other hand, there are certain
- features which every class has implemented the same way:
-
- - All the successful designs relied on the analysis of the anomolies that
- their design would and wouldn't address.
- - The successful ones were reality based, in that the relationships between
- the relations correlated very closely to the discovered real life situations.
- - The unsuccessful ones were over-simplified in order to make postpone
- decision making to the implementation phase.
- - The confusing ones may or may not have worked. They were unmappable
- to a physical implementation method, or relied on a single database product
- (e.g. Oracle) to limit their design. This made implementing them using a
- different database model (Flat File access model/Network Model/
- Hierarchical Model).
- - None have every successfully overdesigned. The ones where a lot of work
- was done on in the design stages were very easy to implement. They were not
- always optimally executable, but the were easy to implement. The optimization
- of the database is a separate issue, and relies on the access patterns, rather
- than any relationships of the data.
-
-
- I understand that I have painted many of these statements with a pretty wide
- brush, but please understand that I get a little religious about implementation
- being where the work is done. I have been bowled down too many times by
- poorly designed but very excellently implemented database systems.
-
- TSgt Dave Burgess
- NCOIC AL/MIS
- Brooks AFB, TX
-
-