Overview

This is the main documentation for the gora-cassandra module, which provides Apache Cassandra backend support for Gora.

gora.properties

| Property Key | Property Value | Required | Description |
|---|---|---|---|
| gora.datastore.default= | org.apache.gora.cassandra.store.CassandraStore | Yes | Implementation of the persistent Java storage class |
| gora.cassandra.mapping.file= | /path/to/gora-cassandra-mapping.xml | No | The XML mapping file to be used. If no value is used this defaults to gora-cassandra-mapping.xml |
| gora.cassandra.servers= | localhost:9160 | Yes | This value should specify the host:port for a running Cassandra server or node. In this case the server happens to be running on localhost at port 9160, which is the default Cassandra server configuration. It is important that the host matches that specified in gora-cassandra-mapping.xml |
| gora.cassandrastore.username= | ${username} | No | The authentication details for passing a username to the CassandraHostConfigurator. This will be required if security is required for Cassandra reads and writes. |
| gora.cassandrastore.password= | ${password} | No | The authentication details for passing a password to the CassandraHostConfigurator. This will be required if security is required for Cassandra reads and writes. |

Gora Cassandra mappings

Suppose we want to map some Employee data and store it in the CassandraStore. The mapping file could look like this:

<gora-otd>
  <keyspace name="Employee" host="localhost" placement_strategy="org.apache.cassandra.locator.SimpleStrategy"
  replication_factor="1" cluster="Gora Cassandra Test Cluster">
    <family name="p" gc_grace_seconds="5"/>
    <family name="f" gc_grace_seconds="5"/>
    <family name="sc" type="super" />
  </keyspace>

  <class name="org.apache.gora.examples.generated.Employee" keyClass="java.lang.String" keyspace="Employee">
    <field name="name"  family="p" qualifier="info:nm" ttl="10"/>
    <field name="dateOfBirth"  family="p" qualifier="info:db" ttl="10"/>
    <field name="ssn"  family="p" qualifier="info:sn" ttl="10"/>
    <field name="salary"  family="p" qualifier="info:sl" ttl="10"/>
  </class>
</gora-otd>

The gora-otd mapping configuration requires two child elements:

The keyspace element, where we specify:

  1. a parameter containing the Cassandra keyspace schema name e.g. Employee,

  2. a parameter containing the host e.g. localhost. The value of the host attribute of the keyspace tag must match exactly what is in the gora.properties file. In practice this means that if you specify a port number, you should specify it everywhere, regardless of whether it is the default port or not. Otherwise Gora will try to connect to localhost at runtime. For more information please see here

  3. a parameter containing the Cassandra cluster name e.g. Gora Cassandra Test Cluster,

  4. a parameter containing a placement_strategy: the value of 'placement_strategy' should be a fully qualified class name that is known to the Cassandra cluster, not to the application or Gora. As of this writing, the classes that ship with Cassandra are org.apache.cassandra.locator.SimpleStrategy and org.apache.cassandra.locator.NetworkTopologyStrategy. gora-cassandra will use SimpleStrategy by default if no value for this attribute is specified. Note that the placement_strategy attribute of the keyspace tag only applies if Gora creates the Cassandra keyspace. More about placement strategies can be found here.

  5. a parameter containing a replication_factor attribute with an integer value. Again, the replication_factor value associated with the keyspace tag only applies if Gora creates the keyspace and has no effect when used against an existing keyspace. The default value for 'replication_factor' is '1'. N.B. In Cassandra this property is required if the placement_strategy class is SimpleStrategy; otherwise it is not used. This value is the number of replicas of your data that you want to reside across the nodes of the cluster.

  6. a child element family containing the name, type and gc_grace_seconds parameters for the column families we wish to create within Cassandra. In this case we create three column families: p, f and sc, the last of which carries an optional type attribute that defines it as a super column family. Additionally, column families p and f assign a value of 5 to gc_grace_seconds. In Gora the default value of 'gc_grace_seconds' is '0', which is ONLY VIABLE FOR A SINGLE NODE CLUSTER. You should update this value according to your cluster configuration. gc_grace_seconds is the time Cassandra waits before garbage-collecting the tombstones of deleted data. More information can be found here
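As a rough sketch of the host-matching and gc_grace_seconds points above, a keyspace definition might look like the following. It assumes that gora.properties sets gora.cassandra.servers=localhost:9160, so the host attribute repeats the port. The gc_grace_seconds value of 864000 (10 days, Cassandra's own default) is only an illustrative choice for a multi-node cluster, not a recommendation from the Gora project.

```xml
<!-- Sketch only: assumes gora.properties contains
     gora.cassandra.servers=localhost:9160, so the port is repeated here -->
<keyspace name="Employee" host="localhost:9160"
          placement_strategy="org.apache.cassandra.locator.SimpleStrategy"
          replication_factor="1" cluster="Gora Cassandra Test Cluster">
  <!-- Illustrative value: 864000 seconds = 10 days; pick one that suits your cluster -->
  <family name="p" gc_grace_seconds="864000"/>
  <family name="f" gc_grace_seconds="864000"/>
</keyspace>
```

Remember that placement_strategy and replication_factor only take effect if Gora creates the keyspace itself.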

The class element, which specifies the persistent class and the fields whose values are to be mapped. This element contains:

  1. a parameter containing the Persistent class name e.g. org.apache.gora.examples.generated.Employee,

  2. a parameter containing the keyClass e.g. java.lang.String, which specifies the type of the keys that map to the field values,

  3. a parameter containing the keyspace e.g. Employee, which matches the keyspace definition above,

  4. finally, one or more child field elements, which represent the fields to be persisted into Cassandra. Each of these needs to be configured with the following:

    a parameter name e.g. (name, dateOfBirth, ssn and salary respectively),

    a parameter containing the column family to which the field belongs e.g. (all p in this case),

    an optional parameter qualifier, which enables more granular control over the data to be persisted into Cassandra.

    an optional parameter ttl (time to live): the value of the 'ttl' attribute should most likely always be zero unless you want Cassandra to create tombstones and delete portions of your data once this period expires. Any positive value is interpreted as the number of seconds after which the value for that field will disappear. The default value of ttl is '0'.
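To make the ttl behaviour concrete, here is a minimal sketch of a class mapping that mixes expiring and non-expiring fields. It reuses the Employee class and column family p from the example mapping; the ttl value of 3600 is a hypothetical choice for illustration only.

```xml
<class name="org.apache.gora.examples.generated.Employee" keyClass="java.lang.String" keyspace="Employee">
  <!-- ttl omitted: falls back to the default of 0, so the value does not expire -->
  <field name="name" family="p" qualifier="info:nm"/>
  <!-- ttl="0" stated explicitly: same effect as omitting the attribute -->
  <field name="dateOfBirth" family="p" qualifier="info:db" ttl="0"/>
  <!-- Illustrative value: the salary disappears 3600 seconds (one hour) after it is written -->
  <field name="salary" family="p" qualifier="info:sl" ttl="3600"/>
</class>
```

Note that the example mapping earlier in this page uses ttl="10", which means those field values vanish ten seconds after being written. That is convenient for tests but rarely what you want for real data.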