Reaves.dev

v0.1.0

built using

Phoenix v1.7.12

Fundamentals Of Databases

Stephen M. Reaves

::

2023-05-15

Notes about Lesson 1 of CS-6400

Summary

Why use Models

Models can be useful when we want to examine or manage part of the real world.

The costs of using a model are often considerably lower than the cost of using or experimenting with the real world itself.

Models also use conventions to increase speed in making a point.

Message to Model Makers

Users of a model must have a certain amount of knowledge in common (conventions).

A model is:

To Use or Not To Use

Database Management Systems (DBMS) are good at:

Why not use a DBMS:

Data Modeling

The model represents a perception of structures of reality.

The data modeling process is to fix a perception of structures of reality and represent this perception.

In the data modeling process, we select aspects and we abstract.

Process Modeling

The use of the model reflects processes of reality.

Processes may be represented:

Data Models

Data models contain:

DB architecture and DBMS architecture are concerned with the layers of the system.

Example of Data Models

A data model is not the same as a model of data

A database is a model of the structures of reality.

A data model is the tool (or formalism) used to create the model.

A data model includes:

Relational Model Data Structures

Data is represented in tables. Tables consist of a name, and a matrix where columns a fields (consisting of name and type) and rows are entries.

Example:

Email (varchar(50))BirthDate (datetime)Hometown (varchar(50))Salary (integer)
[email protected]1985-11-07Atlanta10,000
[email protected]1969-11-28Austion10,000
[email protected]1967-11-03Portland12,000
[email protected]1988-11-15Atlanta13,000
[email protected]1973-03-12Portland14,000
[email protected]1988-11-09Atlanta15,000

Degree := Number of columns

Schema := The aspect of the table that is considered stable, or not expected to change over time

State := The aspect of the table that is expected to change over time

Relational Model Constraints

Constraints express rules that cannot be expressed by the data structures alone

Examples:

Data Model Operations

Operations support change and retrieval of data.

Keys and Identifiers

Keys are uniqueness constraints

Integrity and Consistency

Integrity answers the question Does the db reflect reality well?

Consistency answers the question Is the db without internal conflicts/contradictions?

Null Values

Email (varchar(50))BirthDate (datetime)Hometown (varchar(50))SSRegistSexSalary (integer)
[email protected]1985-11-07AtlantaYesM10,000
[email protected]1969-11-28AustionINAPPLICABLEF10,000
[email protected]1967-11-03PortlandINAPPLICABLEF12,000
[email protected]1988-11-15AtlantaINAPPLICABLEF13,000
[email protected]1973-03-12PortlandINAPPLICABLEF14,000
[email protected]1988-11-09AtlantaNoM15,000

Null values can be used for unknown, but not for Inapplicable

Surrogates - Things and Names

In name-based representation you are what is known about you, no more, no less.

Surrogate keys solve this problem by adding an artificial ID. This allows you to change everything else about a record and still determine if it is the same record.

ANSI/SPARC

We already separate data and schema, why not separate schema into conceptual and internal schemas?

Conceptual Schemas

Conceptual Schema describes all conceptually relevant, general, time-invariant, structural aspects of reality, and excludes aspects of data representation and physical organization, and access.

This is typically how we normally think of schemas.

External Schemas

Describes parts of the information in the conceptual schema in a form convenient to a particular user group’s view. It is derived from the conceptual schema.

This is typically how we think of views.

Internal Schemas

Describes how the information described in the conceptual schema is physically represented to provide the overall best performance.

This is typically how we think of indexes.

Physical Data Independence

Physical data independence is a measure of how much the internal schema can change without affecting the application programs.

This is similar to encapsulation in OOP.

Logical Data Independence

Logical data independence is a measure of how much the conceptual schema can change without affecting the application programs.

ANSI/SPARC DBMS Framework

Gcluster_queryQuerycluster_compilerSchemaTransformerTransformeruserusersiStorage InternalisInternal Schemasi->iscoeConceptual/Externalis->coecoe->userQueriesdatadatadata->siCompilerCompilermetadatametadatametadata->simetadata->ismetadata->coeespExternal Schema Processormetadata->espispInternal Schema Processorisp->metadatadbaDatabase Administrationdba->ispChange schemacspConceptual Schema Processordba->cspasaApplication System Administratorasa->espcsp->metadatacsp->asaeaEnterprise Administratorea->csp

Framework came out as early as 1975.

Metadata

System metadata are critical in a DBMS.

Business metadata are critical in a data warehouse.