Models, Modelling, and Classification of DBMS

## Models, Modelling, and Classification of DBMS
---
## Outline
* Models 
* Modelleing 
* Classification of DBMS
---
## Models
* define the logical structure of a database
* includes relationships and constraints
* determines how data can be stored and accessed
---
## Well-known Models
* Hierarchical
* Network
* Relational
* Object 
* Document 
---
## Hierarchical Model
* oldest formal database model
* data organized into a tree-like structure
* records are nodes in the tree
* every record has at most 1 parent 
* every record may have multiple children
---
## Hierarchical Model 
![Hierarchical Model](/img/Hierarchical_Model.svg)
---
## Hierarchichal Model - Advantages
* fast and efficient retrieval
* simple predictable structure
* efficient storage
* good performance
---
## Hierarchical Model - Disadvantages
* 
* 
* 
* 
---
## Network Model
* records are nodes in a graph
* records can have multiple parents
* supports more types of relationships than the hierarchical model
---
## Network Model
![Network Model](/img/Bachman_order_processing_model.png)

<a href="https://commons.wikimedia.org/wiki/File:Bachman_order_processing_model.tiff">Mhkay</a>, <a href="https://creativecommons.org/licenses/by-sa/4.0">CC BY-SA 4.0</a>, via Wikimedia Commons
---
## Relational Model
* uses first-order predicate logic
* data is represented in tuples
* tuples grouped into relations
* extremely popular model
---
## Relational Model 
* Relation (or Table, or File)
* Tuple (or Row, or Record)
* Attribute (or Column, or Field)
---
## Relational Model

![Relational Model](https://upload.wikimedia.org/wikipedia/commons/8/8d/Relational_model_concepts.png)

<a href="https://commons.wikimedia.org/wiki/File:Relational_model_concepts.png">User:AutumnSnow</a>, <a href="http://creativecommons.org/licenses/by-sa/3.0/">CC BY-SA 3.0</a>, via Wikimedia Commons 
---
## Object Model
* unit of information is the object
* mirrors the use of objects in object oriented programming
* tightly coupled with object oriented programming language
* persist data from a running object-oriented program
---
## Document Model
* model for semi-structured data
* documents consist of key/value pairs
* documents don't have to have all the same keys but may share some
* key for each document
* retrieval is by document key or individual keys within a document
---
## Data Modelling
* Data Abstratction
* Schemas
* Logical and Physical Data Independence
---
## Data Abstraction
* there are multiple levels of data abstraction
* view (external)
* logical (or conceptual)
* internal (or physical)
---
## Database Abstraction

![Database Abstraction](/img/database-abstraction.png)

<a href="https://afteracademy.com/blog/what-is-data-abstraction-in-dbms-and-what-are-its-three-levels/">Database Abstraction</a>
---
## View or External Level
* how the data looks to end users developers
* typically implemented as a database view
* does not necessarily correspond directly to logical design
* e.g. end user wants a view combining columns from multiple tables
* concern of 
---
## Logical or Conceptual Level
* how the data is designed to be stored
* optimized for minimizing redundancy and increasing efficiency
* not necessarily how the user wants it
* concern of DBMS administrator or designer
---
## Internal or Physical Level
* how data are actually stored in files in the file system
* low-level definition
* tuned for efficient access considering the constraints of the operating system
* concern of DBMS implentor/vendor
---
## Schemas
* formal description of database structure
* External: can be multiple depending needs of individual users
* Conceptual: only one created by database designer
* Physical: only one created by DBMS vendor
---
## Logical and Physical Data Independence
* Logical: End users and developers should be immune to a change in the conceptual schema 
* Physical: Conceptual schema should be immune to a change in the physical implementation of the DBMS
---
## Classification of DBMS
---
## Classification of DBMS
* data model
* user numbers
* distribution
* license 
* in-memory databases
---
## Data Model
* Hierarchical: IMS
* Network: IDS
* Relational: MariaDB, PostgreSQL, Oracle, Microsoft SQL Server, MySQL
* Object-oriented: ObjectStore
* Document: MongoDB
---
## User Numbers
* single-user: Microsoft Access
* multi-user: everything else
---
## Distribution
* When do we need more than one database server?
---
## Distribution
* replication
* sharding
---
## Distribution - Replication
* used for load balancing
* entire database is replicated on another computer
* one server is the "Master"; others are "Slaves" 
* works well when there are few writes and many reads
---
## Distribution - Sharding
* used when there is more data than will fit on a single computer
* there are no duplicates; entire database is spread accross several computers
* can also be used for load balancing
* works well for any access pattern (reads vs writes)
---
## License
* propietary
* open-source
* 
* 
---
## Proprietary
* software is protected through: copyright, patent, trade secrets
* can only run the amount you pay for
* e.g. Oracle, Microsoft SQL Server
---
## Open-source
* software is freely copyable
* can run as many copies as you like
* e.g. MariaDB, PostgreSQL, MongoDB
---
## Proprietary vs. Open-source
* discussion
---
## Location
* on-premises
* cloud
---
## Location - On-premises
* own hardware
* own data center
---
## Location - Cloud
* someone else's hardware
* someone else's data center
---
## On-premises vs. Cloud
* discussion
---
## In-memory databases
* entire database is in main memory
* not on secondary storage
* access is orders of magnitude faster
* e.g. Redis, memcached
---
## Computer memory hierarchy

[Computer Memory Hierarchy](https://commons.wikimedia.org/wiki/File:ComputerMemoryHierarchy.svg)
---
## DRAM vs. SDD vs. HDD
* discussion