Cassandra Multi-AZ Data Replication

Posted on August 19, 2015 by CloudThat | Comments(7)

Apache Cassandra is an open-source, non-relational (NoSQL) database. It is massively scalable and is designed to handle large amounts of data across multiple servers (here, Amazon EC2 instances) while providing high availability. In this blog, we shall replicate data across nodes running in multiple Availability Zones (AZs) to ensure reliability and fault tolerance. We will also learn how to ensure that the data remains intact even when an entire AZ goes down.

Initial Setup: a Cassandra cluster with six nodes (EC2 instances), two in each of three AZs.
AZ-1a: us-east-1a: Node 1, Node 2
AZ-1b: us-east-1b: Node 3, Node 4
AZ-1c: us-east-1c: Node 5, Node 6

Next, we have to make changes in the Cassandra configuration file. The cassandra.yaml file is the main configuration file for Cassandra. In it we can control how nodes are configured within a cluster, including inter-node communication, data partitioning, and replica placement. The key setting we need to define in this context is the snitch. Basically, a snitch indicates which Region and Availability Zone each node in the

Continue reading…
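As a rough illustration of why spreading replicas across AZs protects against a zone outage, the sketch below models the six-node topology from the excerpt above. It is a simplified toy model and not Cassandra's actual placement algorithm; the function names, the replication factor of 3, and the CRC32-based token are assumptions made only for this example.

```python
import zlib

# Topology from the post: six nodes, two per Availability Zone.
CLUSTER = {
    "us-east-1a": ["node1", "node2"],
    "us-east-1b": ["node3", "node4"],
    "us-east-1c": ["node5", "node6"],
}


def place_replicas(key, cluster, replication_factor=3):
    """Choose one node in each of `replication_factor` distinct zones.

    Toy model only (assumption): real Cassandra placement depends on the
    partitioner, the snitch, and the keyspace's replication strategy.
    """
    zones = sorted(cluster)
    if replication_factor > len(zones):
        raise ValueError("this sketch places at most one replica per zone")
    # Deterministic stand-in for a partition token.
    token = zlib.crc32(key.encode("utf-8"))
    replicas = []
    for i in range(replication_factor):
        zone = zones[(token + i) % len(zones)]
        nodes = cluster[zone]
        replicas.append((zone, nodes[token % len(nodes)]))
    return replicas


def surviving_replicas(replicas, failed_zone):
    """Return the replicas that remain reachable if one zone goes down."""
    return [(zone, node) for zone, node in replicas if zone != failed_zone]


if __name__ == "__main__":
    replicas = place_replicas("user:42", CLUSTER)
    print("replicas:", replicas)
    for zone in CLUSTER:
        print(zone, "down ->", surviving_replicas(replicas, zone))
```

With one replica per zone, losing any single AZ still leaves two copies of every row in this model, which is the property the post sets out to achieve.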

Migration from relational database to NoSQL database

Posted on August 19, 2015 by Harshal Bulsara | Comments(0)

Do we really need NoSQL databases? The relational database model was proposed in 1970, and since then we have used RDBMSs for most applications. But this model is having a hard time keeping pace with the volume, velocity, and variety of data. To keep pace with growing data storage needs, NoSQL databases were introduced, in which the focus has shifted from relationships in the data to a scalable solution for storing large volumes of data. Relational databases focus on the ACID (Atomicity, Consistency, Isolation, and Durability) properties, whereas NoSQL databases focus on the CAP (Consistency, Availability, and Partition tolerance) theorem.

Market share of Relational and NoSQL databases
The chart shows that from March 2013 to 2014, relational databases accounted for 51% of usage and NoSQL databases for 49%. From March 2014 to 2015, the use of NoSQL databases increased to 59%, whereas the use of relational databases fell to 41%.

Advantages of NoSQL Database
Fast
Each table in NoSQL is independent of the others. NoSQL gives us the ability to scale tables horizontally, so we can store frequently required information in one table. All table joins need to be handled at the application level. Thus, data retrieval is

Continue reading…
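To make "joins handled at the application level" concrete, here is a minimal sketch using plain Python dictionaries as stand-ins for two independent NoSQL tables. The table names, fields, and helper functions are hypothetical and are not taken from the post or from any particular database driver.

```python
# Two independent "tables" (hypothetical data), with no foreign-key
# enforcement between them.
users = {
    1: {"name": "Asha"},
    2: {"name": "Ravi"},
}
orders = [
    {"order_id": 100, "user_id": 1, "total": 25.0},
    {"order_id": 101, "user_id": 2, "total": 40.0},
    {"order_id": 102, "user_id": 1, "total": 15.5},
]


def join_orders_with_users(orders, users):
    """Application-level inner join: attach the user's name to each order."""
    joined = []
    for order in orders:
        user = users.get(order["user_id"])
        if user is None:
            continue  # no matching user, so skip the order (inner join)
        joined.append({**order, "user_name": user["name"]})
    return joined


def orders_by_user(joined_rows):
    """Build a denormalized table keyed by user so reads need one lookup."""
    table = {}
    for row in joined_rows:
        table.setdefault(row["user_id"], []).append(row)
    return table


if __name__ == "__main__":
    joined = join_orders_with_users(orders, users)
    denormalized = orders_by_user(joined)
    print(denormalized[1])
```

The join is paid once in application code, and the denormalized table then serves frequent reads with a single key lookup, which is the trade-off the excerpt describes.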