nosql - How does Cassandra scale horizontally?
I've watched a video on the Cassandra database, which turned out to be effective and explained a lot about Cassandra. I've also read articles and books on Cassandra, but the one thing I do not understand is how Cassandra scales horizontally. By scaling horizontally I mean adding more nodes to gain more space. As I understand it, each node has identical data, i.e. if one node has 1 TB of data and it is replicated to the other nodes, then N nodes each contain 1 TB of data. Am I missing something here?
Yes, you are missing something. The data does not need to be duplicated N times, where N is the number of nodes. You typically configure a replication factor (RF) that is lower than the number of nodes (N).
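For example, RF = 3, N = 5. This means each row is stored on 3 of the 5 nodes (3 copies in total), and the nodes are chosen by hashing the row's partition key onto the ring of nodes. If one node goes down, at least 2 copies remain on other nodes. Each node therefore holds roughly RF/N = 3/5 of the data rather than all of it, and adding nodes lowers that share.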
This works even better in larger clusters, e.g. RF = 5, N = 100, where each node stores on average only about 5% of the data.
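To make the RF versus N point concrete, here is a minimal sketch of ring-style replica placement. It is not Cassandra's actual implementation: the MD5 hash, the single token per node, and the names `token` and `replicas_for` are assumptions chosen for illustration (real clusters use a configurable partitioner and usually many virtual tokens per node).

```python
import hashlib
from bisect import bisect_right
from collections import Counter


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def replicas_for(key: str, ring: list, rf: int) -> list:
    """Return the rf nodes that store `key`.

    The first replica is the node whose token follows the key's token on the
    ring; the remaining replicas are the next nodes clockwise.
    """
    tokens = [t for t, _ in ring]
    start = bisect_right(tokens, token(key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]


# Hypothetical 5-node cluster with RF = 3, as in the example above.
nodes = [f"node{i}" for i in range(1, 6)]
ring = sorted((token(n), n) for n in nodes)
rf = 3
num_keys = 10_000

counts = Counter()
for i in range(num_keys):
    counts.update(replicas_for(f"row-{i}", ring, rf))

# Each node's share of the rows; the shares average out to RF/N.
for node in nodes:
    print(node, round(counts[node] / num_keys, 2))
```

With only one token per node the individual shares can be quite uneven, but they always sum to RF, so the average per node is RF/N. Adding nodes to `nodes` while keeping `rf` fixed lowers that average, which is where the extra space comes from.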
A higher RF improves data redundancy and read speed, but decreases write speed. There is a balance: if RF is high, say RF = N, you'd have high data redundancy, high resilience to node failures, and high read throughput. On the other side, write throughput would be limited, because the data needs to be replicated to all nodes. If one node goes down in this scenario, a write may fail (depending on the client configuration, e.g. the requested consistency level), because the desired number of replicas cannot be reached.
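The "depending on the client config" part can be illustrated with a small sketch, assuming it refers to the consistency level the client requests for a write. The helper names `required_acks` and `write_succeeds` are hypothetical; only the level names ONE, QUORUM, and ALL and the quorum formula (RF // 2 + 1) come from Cassandra itself.

```python
def required_acks(level: str, rf: int) -> int:
    """Number of replicas that must acknowledge a write at a given level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError(f"unknown consistency level: {level}")


def write_succeeds(level: str, rf: int, replicas_down: int) -> bool:
    """True if enough replicas of the row are still up to acknowledge the write."""
    return rf - replicas_down >= required_acks(level, rf)


# RF = 3 with one of the row's replicas down.
for level in ("ONE", "QUORUM", "ALL"):
    print(level, write_succeeds(level, rf=3, replicas_down=1))
# prints:
# ONE True
# QUORUM True
# ALL False
```

So a single failed replica only blocks writes when the client insists on every replica acknowledging; with a lower level the write goes through and the missing replica can catch up later.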