SQL Interview Tips - 1 ~ Rangoli Kannan

What is data integrity?

Data integrity is an important feature in SQL Server. When used properly, it ensures that data is accurate, correct, and valid.
also acts as a trap for otherwise undetectable bugs within applications.

Explain constraints?
Candidate key :

A candidate key is a combination of attributes that can be uniquely used to identify a database record without any extraneous data.
Each table may have one or more candidate keys.
One of these candidate keys is selected as the table primary key.

Primary Key :

Foreign key :

A foreign key is a field in a relational table that matches the primary key column of another table.
The foreign key can be used to cross-reference tables.
The foreign key is used to prevent actions that would destroy link between tables.
The foreign key also prevents that invalid data is inserted into the foreign key column, because it has to be one of the values contained in the table it points to.

Unique Key :

Unique key constraint will provide you a constraint like the column values should retain uniqueness.
It will allow null value in the column.
It will create non-clustered index by default
Any number of unique constraints can be added to a table.

Surrogate Key / Artificial Key / Identity key:

The value is unique system-wide, hence never reused
The value is system generated
The value is not manipulable by the user or application
The value contains no semantic meaning
The value is not visible to the user or application
Surrogate Key is the solution for critical column problems. For example, the customur purchases different items in differnt locations,for this situation we have to maintain historical data.
By using surrogate key we can introduce the row in the datawarehouse to maintain historical data.
Surrogate is mainly used in slowely changing dimensions,it mantaining the uniqueness in the table.it is used to track the old value with the new one.
We can say "Surrogate key" is a User defined primary key

What's the difference between a primary key and a unique key?

Primary key wont allow nulls, unique key allow nulls.
Primary key constraints are more restrictive than Unique constraints
Unique constraints create unique non-clustered indexes by default;Pprimary key constraints create unique clustered indexes by default.
There can be only one clustered index on a table, so you can specify only one unique clustered or primary key clustered constraint.

What is the difference between a primary key and a surrogate key?

A primary key is a special constraint on a column or set of columns.
A primary key constraint ensures that the column(s) so designated have no NULL values, and that every value is unique.
Physically, a primary key is implemented by the database system using a unique index, and all the columns in the primary key must have been declared NOT NULL.
A table may have only one primary key, but it may be composite (consist of more than one column).
A surrogate key is any column or set of columns that can be declared as the primary key instead of a "real" or natural key.
Sometimes there can be several natural keys that could be declared as the primary key, and these are all called candidate keys. So a surrogate is a candidate key.
A table could actually have more than one surrogate key, although this would be unusual.
The most common type of surrogate key is an incrementing integer, such as an auto_increment column in MySQL, or a sequence in Oracle, or an identity column in SQL Server.

Pros and Cons of Surrogate Key :

Pros:

Business Logic is not in the keys.
Small 4-byte key (the surrogate key will most likely be an integer and SQL Server for example requires only 4 bytes to store it, if a bigint, then 8 bytes).
Joins are very fast.
No locking contentions because of unique constraint (this refers to the waits that get developed when two sessions are trying to insert the same unique business key) as the surrogates get generated by the DB and are cached - very scalable.

Cons :

An additional index is needed. In SQL Server, the PK constraint will always creates a unique index, in Oracle, if an index already exists, PK creation will use that index for uniqueness enforcement (not a con in Oracle).
Cannot be used as a search key.
If it is database controlled, for products that support multiple databases, different implementations are needed, example: identity in SS2k, before triggers and sequences in Oracle, identity/sequence in DB2 UDB.
Always requires a join when browsing the child table(s).

Pros and Cons of Natural Key :

Pros :

Cons:

If not chosen wisely (business meaning in the key(s)), then over a period of time additions may be required to the PK and modifications to the PK can occur.
If using strings, joins are a bit slower as compared to the int data-type joins, storage is more as well. Since storage is more, less data-values get stored per index page. Also, reading strings is a two step process in some RDBMS: one to get the actual length of the string and second to actually perform the read operation to get the value.
Locking contentions can arise if using application driven generation mechanism for the key.
Can’t enter a record until value is known since the value has some meaning.

Use the tips!!!

Rangoli Kannan - Technical Blog