Database+Design+Info

=Conceptual Database Design Components=

More information on data modeling []

Entities (Tables)
A person, place, thing or other concept for which data is collected. Example: users, posts, comments, game characters, game weapons, game skills, etc.

Attributes (Fields in Tables)
A thing or unit of information that describes an entity. Examples:


 * Users:** username, password, email
 * Posts:** author, date, content, comments
 * Comments:** author, data, content, post
 * Game Characters:** character name, age, character type, health, weapons, skills
 * Game Weapons:** weapon name, type, damage, weight
 * Game Skills:** skill name, pluses to character attributes, negatives to character attributes

Relationships

 * A user submits a post:** the user is the author for the post so there must be a way to keep track of the user that created each post
 * A user submits a comment for a post:** the user is the author for the comment that is linked to a specific post so there must be a way to keep track of the user that wrote the comment and to which post it belongs
 * A user creates a game character:** there must be a way to know which user created a character
 * A game character carries weapons:** there must be a way to see what weapons a character has
 * A game character has and/or may learn skills:** there must be a way to see what skills a character may call upon

**One-to-One**
In a one-to-one relationship, one entity can be associated (tied to) with at most one instance of the other entity, and vice versa. These types of relationships are rare among entities.

**One-to Many**
In a one-to-many relationship, any instance of the first entity may be associated (tied to) one or more instances of the second.

A user may submit one or more posts A user may submit one or more comments to a post A user may create one or more game characters A game character may carry one or more weapons A game character may know or learn one or more skills

**Many-to-Many**
In a many-to-many relationship, any instance of the first entity may be associated (tied to) zero, one, or more instances of the second and vice versa. This is fine in the conceptual design phase but it does cause problems during the creation of the actual database architecture. The relational model does not directly support many-to-many relationships so we need to make some adaptations in how we organize and keep track of entities and attributes.

Intersection tables are used to keep track of the many-to-many relationships. For example: A game character may carry zero, one, or more game weapons. A game weapon may be carried by zero, one, or more game characters.

Table Columns and Data Types
Data types are used to:


 * Restrict the data (in a column) to characters that make sense for the data type (numeric, a string of text, a date, a time, etc.)
 * Provide a set of behaviors useful to the database user (subtract a date from another date to get time elapsed)
 * Efficient storage of data

Primary Key
A column or set of columns that uniquely identifies each row

Referential Constraints
Use of a foreign key in one table to tie it to a row in another table. For a one-to-many relationship, the primary key from the parent table is used as a foreign key in the child table. If the parent is deleted, then the children should also be deleted (cascading delete). In the very rare event that the parent primary key is updated, the foreign key in the child table should also be updated (cascading update).

Integrity Constraints

 * NOT NULL** (required fields, including primary and foreign keys, may not be NULL or empty)
 * CHECK** (use of a logic statement (outcome is a logical true or false) to validate a column value - example: checking to make sure a birth date is not greater than the current date. If this is true than reject the data because it is an error. All valid birth dates must be less than the current date.)

More information on relational database model http://en.wikipedia.org/wiki/Relational_model

=Normalization=

Normalization helps to ensure that the structure of a relational database will keep its integrity over time as data is added and removed.


 * //Normalization is a technique for producing a set of relations that possess a certain set of properties. Dr. E. F. Codd, the father of the relational database, developed the process in 1972, using three normal forms. The name was a bit of a political gag at the time. President Nixon was "normalizing" relations with China, so Dr. Codd figured if you could normalize relations with a country, you should be able to "normalize" data relations as well. Additional normal forms were added later.//**

from databases DeMYSTiFieD: A Self-Teaching Guide (2004) by Andy Oppel (page 145)

Ways to help maintain the integrity of a relational database:
 * 1) Choose an appropriate primary key.
 * 2) Structure relations in such a way as to remove insert, delete and update anomalies.


 * //The mantra of the skilled database designer is, For each attribute, capture it once, store it once, and use that copy everywhere.//**

from databases DeMYSTiFieD: A Self-Teaching Guide (2004) by Andy Oppel (page 148)

Choosing a Primary Key

 * 1) If there is only one candidate, choose it.
 * 2) Choose the candidate least likely to have its value change.
 * 3) Choose the simplest candidate.
 * 4) Choose the shortest candidate.

Insert Anomaly
Insert anomaly refers to a relationship in which you cannot insert a new row into a table because of an artificial dependency on another relation (table).

Delete Anomaly
A delete anomaly is similar to an insert anomaly but in the opposite. Deletion of a row in one entity (table) unintentionally causes the loss of data in another entity (table).

Update Anomaly
An update anomaly refers to a relationship in which an update of a single value requires multiple rows of data to be updated.

First Normal Form - Eliminate Repeating Data
A relation that contains no multivalued attributes is in first normal form.

For a group of attributes that repeat:
 * 1) Create a new relation.
 * 2) Use the primary key from the original relation as a foreign key in the new one.
 * 3) Make a primary key for the new relation.

Second Normal Form - Eliminate Partial Dependencies
A relation is in second normal form when:
 * 1) The relation is in first normal form
 * 2) All non-key attributes are functionally dependent on the entire primary key.

Third Normal Form - Eliminate Transitive Dependencies
A relation is in third normal form when:
 * 1) The relation is in second normal form.
 * 2) There is no transitive dependence. (All the non-key attributes depend only on the primary key.)

Beyond Third Normal Form

 * Boyce-Codd Normal Form (stronger version of third normal form)
 * Fourth Normal Form (deals with an anomaly occurring when 2 or more multivalued attributes are included in the same relation)
 * Fifth Normal Form (deals with a join dependency that requires knowledge of relational calculus to understand)
 * Domain-Key Normal Form (DKNF) (theory published in a research paper in 1981 but not in widespread use)

Denormalization
Once a database has been normalized, it is analyzed again from a performance perspective. Normalization leads to more relations and thus more tables which may require more "joins" during processing. Adjustments may be made to optimize performance.

More information on database normalization []

.