Entity integrity

From ArticleWorld

Entity integrity is the most basic level of data integrity provided in the relational database model. Entity integrity is a property that ensures that no component of the primary key of any base relation can be null. That is, a single-attribute primary key cannot accept null values; neither can any of the attributes in a composite primary key. Entity integrity forms one of the two implicit integrity constraints (a constraint is implicit if the user need not explicitly define it) in the relational model, the other one being referential integrity.

Most relational database management systems (RDBMSs) support entity integrity by rejecting operations (INSERT or UPDATE) that produce an invalid primary key. However, no major RDBMS system forces that primary keys be specified for any relation (i.e. table). It is up to the user to specify a primary key for every new table that he/she creates.

It is incorrect to say that entity integrity states that primary key values should be unique. The reason behind the uniqueness of primary key values arises from the definition of the primary key concept itself (more precisely, from the uniqueness property), and not from entity integrity.


Enforcing entity integrity is important because primary keys serve as identifiers for individual tuples (rows) in a relational DBMS. This means that no primary key should be "missing". If a primary key happens to be null, it would be a contradiction in terms. This is because it would effectively state that there is an entity without a known identity. This is the reason for the use of the term 'entity integrity'.

Also, if nulls are allowed, this implies that some tuples cannot be uniquely identified. For example, if two or more tuples have null as their primary key, then we cannot distinguish between them.