Reaves.dev

v0.1.0

built using

Phoenix v1.7.20

Db Index

Stephen M. Reaves

::

2023-07-18

Notes about Lesson 11 of CS-6400

Summary

Computer Architecture

Main memory is fast, volatile, small, and (relatively) expensive

Disks are permanent, slow, big, and cheap

Why should you care

Time:

  • Main memory access time is 30ns (0.3×107s 0.3\times10^{-7}s )
  • Disk access time is 10ms (1×102s 1\times10^{-2}s )

That would be like reading a Phonebook:

  • Read a page in 1 minute
  • Open a page in 200 days

Cost cmputation:

  • Only I/O cost counts
  • CPU cost is ignored

Disk

Typical high-end pack capacity:

  • 4 platters
  • 8 heads
  • ~150K tracks/surface
  • ~1000KB/track
  • 1,2000GB
  • 512 bytes/sector
  • 4, 8, or 16KB/block
  • 600MB/s transfer rate
  • 10,000 RPMs
  • 3-4ms latency

Records

RegularUser(
  Email varchar(50),
  Sex char(1),
  Birthdate datetime,
  CurrentCity varchar(50),
  Hometown varchar(50),
);

Storage used for one record:

  • Record size: 159 bytes
  • block size: 4k
  • filled: 80%
  • records/block: ~20
  • num of records: 4 million
  • num of blocks: ~200,000
  • file size: ~800MB

Assumptions

Page fault:

  • Seek time 3-8ms
  • Rotational delay: 2-3ms
  • Transfer time: 0.5-1ms
  • Total: 5-12 ms

Bulk/Extent transfers (e.g. 250 blocks) save seek time and rotational delay, but require more buffer space.

  • Page faults of 10 ms each would cost about 0.260s
    • compared to 2.5s of individual transfers

buffer needs to be managed

  • LRU strategies
    • excellent for merge joins
    • kill nested loop joins

File Organization

Heap

Unsorted File

Lookup time:

  • N/2
    • where N = num of data blocks
    • 200,000/2 * 0.01s = 16.6 min

Block pointer: 4 bytes

num of records: 4 million

num of data blocks: ~200,000

file size: ~800MB

Sorting

Sorting helps

Allows us to use binary search

  • log2(N) log_2(N)
  • 18 * 0.01s = 0.18s

Primary Index

Skip list holding pointers to the beginning of blocks

This reduces the size of data we search in.

GindexABCDdata_aAaAbAcAdindex:A->data_a:ndata_bBaBbBcBdindex:B->data_b:ndata_cCaCbCcCdindex:C->data_c:ndata_dDaDbDcDdindex:D->data_d:ndata_a:D->data_b:Adata_b:D->data_c:Adata_c:D->data_d:A

a dense key means all the key values in the data are in the index.

a sparse index means only a subset are.

Lookup time:

  • log2(n)+1 log_2(n) +1
    • where n = num of index blocks
    • +1 is to move from index to data block

Secondary Index

Same thing as a primary index, but values in the data are not necessarily sorted

  • This means all secondary indexes are dense indexes
GindexABCDdata_dDaDbDcDdindex:D->data_d:ndata_aAaAbAcAdindex:A->data_a:ndata_cCaCbCcCdindex:C->data_c:ndata_bBaBbBcBdindex:B->data_b:ndata_d:e->data_a:Adata_a:e->data_c:Adata_c:e->data_b:A

This is tricky for non-unique fields

Multi-leveled Index

GindexABCDindex_aABindex:A->index_a:Aindex:B->index_a:Bindex_cCDindex:C->index_c:Cindex:D->index_c:Dindex_a:B->index_c:Cdata_aAaAbAcAdindex_a:A->data_a:ndata_bBaBbBcBdindex_a:B->data_b:ndata_cCaCbCcCdindex_c:C->data_c:ndata_dDaDbDcDdindex_c:D->data_d:ndata_a:D->data_b:Adata_b:D->data_c:Adata_c:D->data_d:A

Lookup Time:

  • logk(n)+1 log_k(n)+1
    • where n is num of index values
    • where k is num of layers (3 in this case)

Multi-level Index B-Tree

Multi-level indexes can be arranged in a B-Tree.

Insertion, deletion, and udpate operations are implemented to keep tree balanced.

Only rarely does an overflow at a lower level of the tree propagate more than 1-2 levels up.

Static Hashing

Instead of indexing, we can have a hash table

Lookup Time:

  • Constant if bucket directory is in main memory
    • 1-2 * 0.01s
    • Dynamic hashing expands bucket directory when needed