Cassandra howto
Cassandra is a “big data” type database that scales really well across multiple servers, but it’s a bit confusing if you come from regular relational databases like MySQL.
A Cassandra cluster is a bunch of servers (nodes) that each run Cassandra and hold copies of your data. How many copies exist is set by the replication factor, so with more than one copy, if a node gets eaten you don’t have big problems because the others pick up the slack and continue to serve your data.
It’s really fast, but unless you have lots of data you probably don’t care, because MySQL is pretty fast for normal things. If you need Cassandra or other NoSQL (stands for Not Only SQL, not just no SQL) things, you already know it.
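To make the "others pick up the slack" idea concrete, here is a toy Python sketch of how copies of a row could be spread around a ring of nodes. It is a simplification under stated assumptions: the node names, the `token` and `replicas_for` helpers, and the use of MD5 are made up for illustration (a real cluster uses its own partitioner and usually many tokens per node).

```python
import hashlib
from bisect import bisect_right


def token(value):
    """Map a string to a position on the ring (MD5 is only a stand-in here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(key, nodes, replication_factor):
    """Return the nodes holding copies of `key`: the first node clockwise
    from the key's token, then the next nodes around the ring."""
    ring = sorted(nodes, key=token)
    tokens = [token(n) for n in ring]
    start = bisect_right(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))  # cannot have more copies than nodes
    return [ring[(start + i) % len(ring)] for i in range(count)]


# Hypothetical four-node cluster keeping three copies of every row.
nodes = ["node1", "node2", "node3", "node4"]
owners = replicas_for("doug", nodes, replication_factor=3)
print(owners)

# If one owner gets eaten, the row is still available on the others.
survivors = [n for n in owners if n != owners[0]]
print(survivors)
```

With three copies, losing any single node still leaves two nodes that can serve the row.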
Installing Cassandra 3.5.x on Debian Jessie
Right now you have to install it on Debian from the Apache Foundation’s repository, and you also need Java (Oracle’s here, though you could also use OpenJDK if you want), so add the repositories like:
    vi /etc/apt/sources.list

    deb http://www.apache.org/dist/cassandra/debian 35x main
    deb-src http://www.apache.org/dist/cassandra/debian 35x main
    deb http://ppa.launchpad.net/webupd8team/java/ubuntu xenial main
    deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu xenial main
Then import the keys, update apt and install Java and Cassandra:
    gpg --keyserver pgp.mit.edu --recv-keys 749D6EEC0353B12C
    gpg --export --armor 749D6EEC0353B12C | sudo apt-key add -
    apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys EEA14886
    apt-get update
    apt-get install oracle-java8-installer ntp git

    apt-cache madison cassandra
    cassandra | 3.5 | http://www.apache.org/dist/cassandra/debian/ 35x/main amd64 Packages
    cassandra | 3.5 | http://www.apache.org/dist/cassandra/debian/ 35x/main Sources

    apt-cache search cassandra
    ycassa-doc - Documentation for the Pycassa library
    python-pycassa - Client library for Apache Cassandra
    cassandra - distributed storage system for structured data
    cassandra-tools - distributed storage system for structured data

    apt-get install cassandra cassandra-tools
Okay, now we log in to the console (cqlsh) and create a keyspace, which is a container that holds your data, like:
    >: cqlsh
    Connected to Test Cluster at 127.0.0.1:9042.
    [cqlsh 5.0.1 | Cassandra 3.1.1 | CQL spec 3.3.1 | Native protocol v4]
    Use HELP for help.
    cqlsh> DESC KEYSPACES;

    system_traces  system_schema  system_auth  system  system_distributed

    cqlsh> CREATE KEYSPACE somedb WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 1};
    cqlsh> use somedb;
    cqlsh:somedb>
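A replication factor of 1 means there is only one copy of each row in the keyspace, which is fine here because you only have one node, the server you’re working on. The class is the replication strategy, meaning how the copies get spread across nodes. SimpleStrategy is the basic one, and NetworkTopologyStrategy is the one for clusters spread over multiple datacenters. This is important if you have lots of nodes and lots of data.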
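If you end up creating keyspaces from a script, the statement above is easy to generate. The following is a minimal Python sketch, assuming you only want SimpleStrategy; `create_keyspace_cql` is a hypothetical helper made up for this howto, not part of any Cassandra driver.

```python
import re


def create_keyspace_cql(name, replication_factor=1):
    """Build a CREATE KEYSPACE statement using SimpleStrategy.

    Hypothetical helper: it only produces the CQL text and does not
    talk to a server.
    """
    # Accept only plain unquoted identifiers so nothing odd ends up in the CQL.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        raise ValueError("unexpected keyspace name: %r" % name)
    if int(replication_factor) < 1:
        raise ValueError("replication_factor must be at least 1")
    return (
        "CREATE KEYSPACE {name} WITH replication = "
        "{{'class': 'SimpleStrategy', 'replication_factor': {rf}}};"
    ).format(name=name, rf=int(replication_factor))


# One copy of the data, as in this howto.
print(create_keyspace_cql("somedb"))
# Three copies, which is what you might pick once you have three or more nodes.
print(create_keyspace_cql("somedb", replication_factor=3))
```

The printed statements can be pasted into cqlsh as they are.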
A keyspace is roughly what a database is in MySQL, except that it could eventually be spread across lots of servers. You’re not doing that now, but you could as you scale.
Now that you have a keyspace, you have to make a table that holds your data. First we’ll use the DESC command to see that you don’t have any, then create one:
    cqlsh:somedb> desc tables;

    <empty>

    cqlsh:somedb> CREATE TABLE people (
              ... name text,
              ... age int,
              ... sex text,
              ... PRIMARY KEY (name));
    cqlsh:somedb>
This means “name” is the primary key: it identifies each row, and it’s what you look rows up by. To see what you’ve created do:
    cqlsh:somedb> DESC SCHEMA ;

    CREATE KEYSPACE somedb WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

    CREATE TABLE somedb.people (
        name text PRIMARY KEY,
        age int,
        sex text
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
        AND comment = ''
        AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
        AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.1
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99PERCENTILE';
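One thing that trips up people coming from MySQL: because “name” is the primary key, a second INSERT with the same name does not fail with a duplicate key error, it just writes over the existing row. The toy Python model below illustrates that behaviour with a plain dictionary. It is only a sketch of the idea, and the `people` dict and `insert` function are invented for illustration.

```python
# Toy model of the people table: rows are keyed by the primary key "name".
people = {}


def insert(name, age, sex):
    """Mimic INSERT INTO people (name, age, sex): the given columns are
    written into the row for that key, whether or not it already exists."""
    people.setdefault(name, {}).update(age=age, sex=sex)


insert("doug", 46, "male")
insert("doug", 47, "male")  # same primary key: overwrites, no error

print(len(people))            # number of rows in the toy table
print(people["doug"]["age"])  # age stored for doug after both inserts
```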
Now insert something in your database like:
    INSERT INTO people (name, age, sex) VALUES ( 'doug', 46, 'male');
    INSERT INTO people (name, age, sex) VALUES ( 'wendy', 16, 'female');
If that worked you shouldn’t get any errors, and you can see it’s there by doing:
    cqlsh:somedb> SELECT * FROM people ;

     name  | age | sex
    -------+-----+--------
     wendy |  16 | female
      doug |  46 |   male

    (2 rows)
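The same insert and select can also be done from a program. Here is a rough Python sketch, assuming the DataStax cassandra-driver package, which is not otherwise covered in this howto. The `add_person` and `list_people` helpers are made up for illustration, and `FakeSession` is a stand-in so the sketch runs without a server; the comments at the end show how a real session would be obtained.

```python
from collections import namedtuple

INSERT_PERSON = "INSERT INTO people (name, age, sex) VALUES (%s, %s, %s)"
SELECT_PEOPLE = "SELECT name, age, sex FROM people"

Row = namedtuple("Row", ["name", "age", "sex"])


class FakeSession:
    """Stand-in for a driver session so this sketch runs without Cassandra."""

    def __init__(self):
        self.rows = {}

    def execute(self, query, parameters=None):
        if query.startswith("INSERT"):
            name, age, sex = parameters
            self.rows[name] = Row(name, age, sex)
            return []
        return list(self.rows.values())


def add_person(session, name, age, sex):
    """Insert one row; values are passed separately from the query text."""
    session.execute(INSERT_PERSON, (name, age, sex))


def list_people(session):
    """Return (name, age, sex) tuples for every row in the table."""
    return [(row.name, row.age, row.sex) for row in session.execute(SELECT_PEOPLE)]


session = FakeSession()
add_person(session, "doug", 46, "male")
add_person(session, "wendy", 16, "female")
print(sorted(list_people(session)))

# Against a real node (pip install cassandra-driver), the session would
# come from something like:
#   from cassandra.cluster import Cluster
#   session = Cluster(["127.0.0.1"]).connect("somedb")
```

Passing the values as a separate tuple, rather than pasting them into the query string, leaves the quoting to the driver.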
Installing a Cassandra management GUI
Cassandra-Cluster-Admin is one option. It’s a little rough, and you have to get it from GitHub like:

    mkdir /usr/src/cluster-admin
    cd /usr/src/cluster-admin
    git clone https://github.com/sebgiroux/Cassandra-Cluster-Admin
    ln -s /usr/src/cluster-admin/Cassandra-Cluster-Admin/ /var/www/html/cluster-admin
    vi /usr/src/cluster-admin/Cassandra-Cluster-Admin/includes/conf.inc.php

In conf.inc.php, change the port to 9042 from 9160.
Now visit http://localhost/cluster-admin