Appendix

ACID

Atomicity, Consistency, Isolation and Durability: the transaction guarantees associated with relational DBs.

Atomicity

A transaction is said to be atomic if it is indivisible, or made up of a series of operations such that either they all take place or none do.
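As a minimal sketch of atomicity, the example below uses Python's built-in sqlite3 module with an in-memory database. The `accounts` table, the `transfer` helper and the balances are made up for illustration. The transfer credits one account and then debits the other; if the debit fails, the credit is rolled back as well.

```python
import sqlite3

# Hypothetical schema: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint if src cannot pay.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A failed `transfer(conn, "alice", "bob", 500)` leaves both balances unchanged, even though the credit to bob was executed before the failure.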

Availability

The availability of the system is essentially the uptime of the system, i.e. when it is running and operational.

BASE

Basically Available, Soft state, Eventual consistency: the model associated with NoSQL DBs.

Bloom filter

A space-efficient probabilistic data structure that checks whether an element is a member of a set. It can return false positives but never false negatives.
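A rough sketch of the idea, assuming salted SHA-256 digests as the hash functions and illustrative sizes (the class and method names are hypothetical). For readability it stores one byte per bit, which a real implementation would pack more tightly.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity only

    def _positions(self, item):
        # Derive num_hashes positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))
```

An item that was added always sets all of its positions, which is why false negatives cannot occur. A false positive happens when other items happen to have set the same positions.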

Caching

Cache storage is used to speed up access. Two common setups: 1) every node does its own caching; 2) a distributed cache is shared between different nodes.

  • Cache is not the source of truth
  • Cache data has to be small -> stored in memory
  • Eviction policies, for example LRU (least recently used)
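To make the LRU eviction policy concrete, here is a small sketch built on `collections.OrderedDict`. The `LRUCache` class and its capacity are illustrative, not taken from any particular caching product.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None  # cache miss: the caller goes to the source of truth
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used entry
cache.put("c", 3)  # over capacity: "b" is evicted
```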

CAP Theorem

Consistency | Availability | Partition Tolerance. A distributed system can’t guarantee all three at once. Network partitions always happen in practice, so the real choice is between consistency and availability. Traditional DBs rely on ACID, so consistency wins over availability. NoSQL DBs can choose availability over consistency.

Concurrent Transactions

When transactions can take place as if in parallel, but in reality might take place out of order or in partial order, without affecting the final outcome.

Consistency

Data is often replicated across nodes. The consistency of a system describes how quickly and accurately an update is propagated so that all nodes hold the same data. See also Strong Consistency and Eventual Consistency. See here for a detailed explanation.

Content Delivery Network

Geographically distributed servers that store content. Allows client to contact the server closest to it.

Count Min Sketch

A space-efficient probabilistic data structure that counts the frequency of events, but with some error rate.
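A minimal sketch of a count-min sketch, assuming salted SHA-256 digests as the per-row hash functions and illustrative table dimensions (class and method names are hypothetical). Each event increments one counter per row, and the estimate is the minimum over the rows.

```python
import hashlib


class CountMinSketch:
    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, obtained by salting with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum is the
        # tightest estimate and never undercounts.
        return min(
            self.table[row][self._index(row, item)] for row in range(self.depth)
        )
```

The error is one-sided: an estimate may be too high when items collide in every row, but it is never lower than the true count.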

Cross origin resource sharing

Allows a restricted resource on a web page to be requested from a domain other than the one from which the first resource was served. Before the actual request (GET, PUT, POST), the browser can make an HTTP OPTIONS (preflight) call for a URL, and the server returns a response saying “These other domains are approved to GET this URL”.
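The server side of that exchange can be sketched as a plain function. The allow-list, the `preflight_response` name and the choice of a 403 status for rejected origins are assumptions for illustration. The header names themselves (`Origin`, `Access-Control-Request-Method`, `Access-Control-Allow-Origin`, `Access-Control-Allow-Methods`) are the standard CORS headers.

```python
# Hypothetical allow-list; a real service would load this from configuration.
ALLOWED_ORIGINS = {"https://app.example.com"}
ALLOWED_METHODS = ["GET", "PUT", "POST"]


def preflight_response(request_headers):
    """Answer an OPTIONS preflight with a (status, headers) pair."""
    origin = request_headers.get("Origin")
    method = request_headers.get("Access-Control-Request-Method")
    if origin not in ALLOWED_ORIGINS or method not in ALLOWED_METHODS:
        # Without the Allow-* headers the browser blocks the real request.
        return 403, {}
    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(ALLOWED_METHODS),
        "Vary": "Origin",
    }
```

Note that the browser enforces the outcome: the server only declares which origins and methods it approves.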

Durability

The durability of a system guarantees that data that is committed persists, or is saved permanently.

Edge Caching

Edge caching refers to the use of caching servers to store content closer to end users. For instance, if you visit a popular website and download some static content that gets cached, each subsequent user will be served that content directly from the caching server until it expires.

Eventual Consistency

A system displays eventual consistency when nodes eventually update each other with changes, before which it is possible that clients may see inconsistencies in the data. For example, client C may update system A by changing the value of x from 6 to 7. Since A doesn’t immediately update the other nodes about this change, if C asks for the value of x again and the request is routed to system B, it will respond with the old value of x, i.e. 6. If C repeats this request after some time during which the update takes place, it will get the new value of x.
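The x = 6 to x = 7 scenario can be replayed with a toy simulation. The `Replica` class and its explicit `replicate_to` step are hypothetical stand-ins for whatever background replication a real system uses.

```python
class Replica:
    def __init__(self, name, data=None):
        self.name = name
        self.data = dict(data or {})
        self.pending = []  # local writes not yet sent to other replicas

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))

    def read(self, key):
        return self.data.get(key)

    def replicate_to(self, other):
        # Stand-in for asynchronous replication that happens "eventually".
        for key, value in self.pending:
            other.data[key] = value
        self.pending.clear()


a = Replica("A", {"x": 6})
b = Replica("B", {"x": 6})

a.write("x", 7)
stale = b.read("x")   # B has not heard about the update yet: 6
a.replicate_to(b)
fresh = b.read("x")   # after replication B returns the new value: 7
```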

Indexing

Organizing data for fast retrieval

Lazy space allocation

Rather than allocating space for the file content as soon as it is created, the data is written onto a buffer first. This improves the chance that the data is written in a contiguous group of blocks, reducing fragmentation problems and increasing performance.

Locking

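Optimistic vs Pessimistic Locking.

  • Optimistic locking: no locks are held while working on the data; conflicts are checked only at commit time (for example by comparing a version number), and the transaction is retried if another one got there first
  • Pessimistic locking: getting all the locks beforehand and then committing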
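One common way to implement optimistic locking is a version number that is checked at commit time. The sketch below assumes that approach; `Record`, `read` and `try_commit` are hypothetical names, and a real database would perform the check-and-update as a single atomic step.

```python
class Record:
    def __init__(self, value):
        self.value = value
        self.version = 0


def read(record):
    """Return the value together with the version it was read at."""
    return record.value, record.version


def try_commit(record, new_value, expected_version):
    """Apply the write only if nobody else committed since our read."""
    if record.version != expected_version:
        return False  # conflict: the caller should re-read and retry
    record.value = new_value
    record.version += 1
    return True


stock = Record(10)
value_1, version_1 = read(stock)  # first writer reads version 0
value_2, version_2 = read(stock)  # second writer also reads version 0

first_ok = try_commit(stock, value_1 - 1, version_1)   # succeeds
second_ok = try_commit(stock, value_2 - 1, version_2)  # rejected, stale version
```

The second writer never blocked the first one; it simply finds out at commit time that its read is out of date.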

NoSQL

Typically scales better and offers higher availability, but usually without full ACID guarantees. Types: 1) key-value 2) wide column, where one row can have many different columns 3) document-based 4) graph-based

Partition Tolerance

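The system's ability to keep working when there’s a network partition between hosts, i.e. when they cannot communicate with each other. Partitions always happen in real life.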

Same origin policy

A web browser permits scripts contained in a first web page to access data in a second web page only if both web pages have the same origin. The aim is to prevent attacks such as Cross-Site Scripting (XSS).
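Two URLs share an origin when their scheme, host and port all match. The sketch below shows that comparison using `urllib.parse.urlsplit`. The helper names and the example URLs are illustrative, and a browser's real check has more edge cases than this.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a, url_b):
    return origin(url_a) == origin(url_b)


# Same scheme, host and (default) port: same origin, despite different paths.
same_origin("https://example.com/a", "https://example.com:443/b")
# Different scheme, or a different subdomain: different origins.
same_origin("http://example.com/", "https://example.com/")
same_origin("https://example.com/", "https://api.example.com/")
```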

Scaling

  • Vertical scaling: add more memory/CPU power to a host. Expensive and limited.
  • Horizontal scaling: add more hosts. But you have to worry about distributed-systems concerns.

Sharding

Distribution of data over hosts, for example using consistent hashing.
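A compact sketch of consistent hashing, assuming SHA-256 for placing points on the ring and 100 virtual points per host (both are illustrative choices, as are the class and host names). Each key is assigned to the first host point at or after the key's own hash, wrapping around at the end of the ring.

```python
import bisect
import hashlib


class ConsistentHashRing:
    def __init__(self, hosts, replicas=100):
        self.replicas = replicas  # virtual points per host, to even out load
        self.ring = []            # sorted list of (point, host) tuples
        for host in hosts:
            self.add_host(host)

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add_host(self, host):
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{host}#{i}"), host))

    def remove_host(self, host):
        self.ring = [(point, h) for point, h in self.ring if h != host]

    def host_for(self, key):
        if not self.ring:
            raise LookupError("no hosts in the ring")
        # First point at or after the key's hash, wrapping past the end.
        idx = bisect.bisect_left(self.ring, (self._hash(key),))
        return self.ring[idx % len(self.ring)][1]
```

The benefit over a plain `hash(key) % number_of_hosts` scheme is that removing a host only moves the keys that lived on that host; every other key stays where it was.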

Serializability

A transaction schedule is serializable if its outcome is the same as that of some serial schedule, i.e. one in which the same transactions are executed one after another in some sequence.

Service Level Agreement (SLA)

An agreement between the service provider and customers, which documents the terms of the service, and the standards to which the customer can expect to hold the service.

Split Brain Situation

During a partition, one section may elect another master even though the old master is actually still alive in the other partition. To prevent this, a section that contains half or fewer of the total number of nodes must never elect a master, i.e. the number of votes needed for electing a master must be more than half of the total (at least floor(n/2) + 1 for n nodes).
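The majority rule can be written down in a few lines. The function names below are hypothetical; the point is only that at most one side of any partition can reach the quorum.

```python
def quorum(total_nodes):
    """Smallest number of votes that is strictly more than half."""
    return total_nodes // 2 + 1


def can_elect_master(reachable_nodes, total_nodes):
    """A section may elect a master only if it holds a strict majority."""
    return reachable_nodes >= quorum(total_nodes)


# A 5-node cluster split 3 / 2: only the larger side may elect a master.
can_elect_master(3, 5)  # True
can_elect_master(2, 5)  # False
# A 4-node cluster split 2 / 2: neither side may, so no split brain occurs.
can_elect_master(2, 4)  # False
```

This is also why clusters are often sized with an odd number of nodes: an even split leaves no side able to elect a master.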

Strong Consistency

A system displays strong consistency when it behaves as if it is running on one node, i.e. when all the nodes immediately update each other for any change such that the client never sees any inconsistencies in the data. So any read after a write will always see the result of that write.

Written on October 7, 2018