CassandraStore (Apache Gora 0.4 API)

java.lang.Object
- org.apache.gora.store.impl.DataStoreBase<K,T>
- - org.apache.gora.cassandra.store.CassandraStore<K,T>

All Implemented Interfaces:

Closeable, AutoCloseable, DataStore<K,T>, org.apache.hadoop.conf.Configurable, org.apache.hadoop.io.Writable
```
public class CassandraStore<K,T extends PersistentBase>
extends DataStoreBase<K,T>
```
CassandraStore is the primary class responsible for directing Gora CRUD operations into Cassandra. It delegates many operations to CassandraClient, such as initialization and the creation and deletion of schemas (Cassandra keyspaces).

Field Summary

Fields
Modifier and Type	Field and Description
`static int`	`DEFAULT_UNION_SCHEMA` Default schema index with value "0" used when AVRO Union data types are stored
`static ThreadLocal<org.apache.avro.io.BinaryEncoder>`	`encoders`
`static org.slf4j.Logger`	`LOG` Logging implementation
`static String`	`UNION_COL_SUFIX` Fixed string with value "UnionIndex" used to generate an extra column based on the original field's name
`static ConcurrentHashMap<String,org.apache.avro.specific.SpecificDatumWriter<?>>`	`writerMap` Create a `ConcurrentHashMap` for the datum readers and writers.

Fields inherited from class org.apache.gora.store.impl.DataStoreBase
autoCreateSchema, beanFactory, conf, datumReader, datumWriter, fieldMap, keyClass, persistentClass, properties, schema

Constructor Summary

Constructors
Constructor and Description
`CassandraStore()` The default constructor for CassandraStore

Method Summary

Methods
Modifier and Type	Method and Description
`void`	`close()` Close the DataStore.
`void`	`createSchema()` Creates the optional schema or table (or similar) in the datastore to hold the objects.
`boolean`	`delete(K key)` Deletes the object with the given key
`long`	`deleteByQuery(Query<K,T> query)` Deletes all the objects matching the query.
`void`	`deleteSchema()` Deletes the underlying schema or table (or similar) in the datastore that holds the objects.
`Result<K,T>`	`execute(Query<K,T> query)` When executing Gora Queries in Cassandra we query the Cassandra keyspace by families.
`void`	`flush()` Flush the buffer which is a synchronized `LinkedHashMap` storing fields pending to be stored by `put(Object, PersistentBase)` operations.
`T`	`get(K key, String[] fields)` Returns the object corresponding to the given key.
`List<PartitionQuery<K,T>>`	`getPartitions(Query<K,T> query)` Partitions the given query and returns a list of `PartitionQuery`s, which will execute on local data.
`String`	`getSchemaName()` In Cassandra, schemas are referred to as keyspaces.
`void`	`initialize(Class<K> keyClass, Class<T> persistent, Properties properties)` Initialize is called when the call to `org.apache.gora.store.DataStoreFactory#createDataStore(Class dataStoreClass, Class keyClass, Class persistent, org.apache.hadoop.conf.Configuration conf)` is made.
`Query<K,T>`	`newQuery()` Constructs and returns a new Query.
`void`	`put(K key, T value)` When doing the `put(Object, PersistentBase)` operation, the logic is as follows: Obtain the Avro `Schema` for the object. Create a new duplicate instance of the object. Obtain a `List` of the `Schema`'s `Schema.Field` objects. Iterate through the field `List`.
`boolean`	`schemaExists()` Simple method to check if a Cassandra Keyspace exists.

Methods inherited from class org.apache.gora.store.impl.DataStoreBase
equals, get, getBeanFactory, getConf, getFields, getFieldsToQuery, getKeyClass, getOrCreateConf, getPersistentClass, getSchemaName, newKey, newPersistent, readFields, setBeanFactory, setConf, setKeyClass, setPersistentClass, truncateSchema, write

Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - LOG
```
public static final org.slf4j.Logger LOG
```
    Logging implementation
  - UNION_COL_SUFIX
```
public static String UNION_COL_SUFIX
```
    Fixed string with value "UnionIndex" used to generate an extra column based on the original field's name
  - DEFAULT_UNION_SCHEMA
```
public static int DEFAULT_UNION_SCHEMA
```
    Default schema index with value "0" used when AVRO Union data types are stored
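    To illustrate how the two union-related constants might fit together, here is a minimal sketch. The constant values are the ones stated above; the `unionIndexColumn` helper and the plain concatenation of field name and suffix are assumptions made for illustration, not the documented behaviour of CassandraStore.

```java
// Illustrative sketch only. The two constant values come from the field
// descriptions above; the concatenation rule is an assumption.
class UnionColumnNameSketch {
    static final String UNION_COL_SUFIX = "UnionIndex";
    static final int DEFAULT_UNION_SCHEMA = 0;

    /** Hypothetical helper: name of the extra column that records the union branch. */
    static String unionIndexColumn(String fieldName) {
        return fieldName + UNION_COL_SUFIX;
    }

    public static void main(String[] args) {
        // A union field named "address", stored with the default branch index.
        System.out.println(unionIndexColumn("address") + " = " + DEFAULT_UNION_SCHEMA);
    }
}
```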
  - encoders
```
public static final ThreadLocal<org.apache.avro.io.BinaryEncoder> encoders
```
  - writerMap
```
public static final ConcurrentHashMap<String,org.apache.avro.specific.SpecificDatumWriter<?>> writerMap
```
    Create a ConcurrentHashMap for the datum readers and writers. This is necessary because they are not thread safe, at least not before Avro 1.4.0 (See AVRO-650). When they are thread safe, it is possible to maintain a single reader and writer pair for every schema, instead of one for every thread.
    
    See Also:
    AVRO-650
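    The per-schema caching idea described for writerMap can be sketched with plain JDK classes. The `Writer` class below is a hypothetical stand-in for an Avro datum writer, and `writerFor` is an invented name; the sketch uses `computeIfAbsent` (Java 8 or later), which may differ from how CassandraStore populates its map.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a per-schema cache keyed by schema name.
class PerSchemaWriterCacheSketch {
    /** Hypothetical stand-in for a datum writer bound to one schema. */
    static final class Writer {
        final String schemaName;
        Writer(String schemaName) { this.schemaName = schemaName; }
    }

    private static final ConcurrentHashMap<String, Writer> WRITERS =
            new ConcurrentHashMap<String, Writer>();

    /** Returns the cached writer for the schema, creating it at most once per key. */
    static Writer writerFor(String schemaName) {
        return WRITERS.computeIfAbsent(schemaName, name -> new Writer(name));
    }

    public static void main(String[] args) {
        Writer first = writerFor("Employee");
        Writer second = writerFor("Employee");
        System.out.println(first == second); // same instance is reused
    }
}
```

    Whether a single cached instance is safe to share across threads depends on the writer implementation, which is the concern the field description raises.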
- Constructor Detail
  - CassandraStore
```
public CassandraStore()
               throws Exception
```
    The default constructor for CassandraStore
    
    Throws:
    
    Exception
- Method Detail
  - initialize
```
public void initialize(Class<K> keyClass,
              Class<T> persistent,
              Properties properties)
```
    Initialize is called when the call to org.apache.gora.store.DataStoreFactory#createDataStore(Class dataStoreClass, Class keyClass, Class persistent, org.apache.hadoop.conf.Configuration conf) is made. In this case, we merely delegate the store initialization to org.apache.gora.cassandra.store.CassandraClient#initialize(Class keyClass, Class persistentClass).
    
    Specified by:
    
    initialize in interface DataStore<K,T extends PersistentBase>
    
    Overrides:
    
    initialize in class DataStoreBase<K,T extends PersistentBase>
    
    Parameters:
    keyClass - the class of the keys
    persistent - the class of the persistent objects
    properties - extra metadata
  - close
```
public void close()
```
    Description copied from interface: DataStore
    
    Close the DataStore. This should release any resources held by the implementation, so that the instance is ready for GC. No other DataStore method may be used after this method has been called. Subsequent calls to this method are ignored.
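    The close contract (release once, ignore repeated calls, reject later use) can be sketched as follows. This is a generic illustration of the contract using only JDK classes; the class name, `ensureOpen` and `releaseCount` are hypothetical and do not describe how CassandraStore itself is implemented.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of an idempotent close(): resources are released exactly once.
class IdempotentCloseSketch implements Closeable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private int releaseCount = 0; // counts how often resources were actually released

    @Override
    public void close() {
        // Only the first call flips the flag; later calls are ignored.
        if (closed.compareAndSet(false, true)) {
            releaseCount++;
        }
    }

    /** Guard that other operations would call before doing any work. */
    void ensureOpen() {
        if (closed.get()) {
            throw new IllegalStateException("store is closed");
        }
    }

    int releaseCount() { return releaseCount; }

    public static void main(String[] args) {
        IdempotentCloseSketch store = new IdempotentCloseSketch();
        store.close();
        store.close();
        System.out.println(store.releaseCount()); // resources released once
    }
}
```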
  - createSchema
```
public void createSchema()
```
    Description copied from interface: DataStore
    
    Creates the optional schema or table (or similar) in the datastore to hold the objects. If the schema has already been created, or the underlying data model does not support or need this operation, the operation is ignored.
  - delete
```
public boolean delete(K key)
```
    Description copied from interface: DataStore
    
    Deletes the object with the given key
    
    Parameters:
    key - the key of the object
    
    Returns:
    whether the object was successfully deleted
  - deleteByQuery
```
public long deleteByQuery(Query<K,T> query)
```
    Description copied from interface: DataStore
    
    Deletes all the objects matching the query. See also the note on visibility.
    
    Parameters:
    query - matching records to this query will be deleted
    
    Returns:
    number of deleted records
  - deleteSchema
```
public void deleteSchema()
```
    Description copied from interface: DataStore
    
    Deletes the underlying schema or table (or similar) in the datastore that holds the objects. This also deletes all the data associated with the schema.
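    The schema lifecycle described for createSchema and deleteSchema can be modelled with an in-memory map. This is a sketch under stated assumptions: the `cluster` map stands in for Cassandra, and every name in the block is hypothetical. It only illustrates that creating an existing schema is a no-op and that deleting a schema also discards its data.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of a keyspace lifecycle; not a Cassandra client.
class SchemaLifecycleSketch {
    // Stand-in for the cluster: keyspace name -> rows (key -> value).
    private final Map<String, Map<String, String>> cluster =
            new HashMap<String, Map<String, String>>();
    private final String keyspace;

    SchemaLifecycleSketch(String keyspace) { this.keyspace = keyspace; }

    boolean schemaExists() { return cluster.containsKey(keyspace); }

    /** Creates the keyspace only if it is missing; otherwise does nothing. */
    void createSchema() {
        if (!schemaExists()) {
            cluster.put(keyspace, new HashMap<String, String>());
        }
    }

    /** Removes the keyspace together with all rows stored in it. */
    void deleteSchema() { cluster.remove(keyspace); }

    void put(String key, String value) {
        createSchema();
        cluster.get(keyspace).put(key, value);
    }

    int rowCount() { return schemaExists() ? cluster.get(keyspace).size() : 0; }
}
```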
  - execute
```
public Result<K,T> execute(Query<K,T> query)
```
    When executing Gora queries in Cassandra, we query the Cassandra keyspace by column family. When sub/supercolumns are added, Gora keys are mapped to Cassandra partition keys only. This follows the Cassandra model, in which column family data is partitioned across nodes based on the row key.
    
    Parameters:
    query - the query to execute.
    
    Returns:
    the results as a Result object.
  - flush
```
public void flush()
```
    Flush the buffer which is a synchronized LinkedHashMap storing fields pending to be stored by put(Object, PersistentBase) operations. Invoking this method therefore writes the buffered rows into Cassandra.
    
    See Also:
    DataStore.flush()
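    A buffer of this kind can be sketched with `Collections.synchronizedMap` wrapping a `LinkedHashMap`. In the sketch below the `written` list is a hypothetical stand-in for Cassandra, and the `String` key and value types are simplifications; it shows buffering on put and draining on flush, not the actual CassandraStore code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of a write buffer that is drained by flush().
class BufferedFlushSketch {
    private final Map<String, String> buffer =
            Collections.synchronizedMap(new LinkedHashMap<String, String>());
    private final List<String> written = new ArrayList<String>(); // stand-in for the backend

    /** Buffers the value; a later put for the same key overwrites the pending one. */
    void put(String key, String value) { buffer.put(key, value); }

    /** Writes every pending entry in insertion order, then empties the buffer. */
    void flush() {
        // Iterating a synchronizedMap requires holding its lock manually.
        synchronized (buffer) {
            for (Map.Entry<String, String> e : buffer.entrySet()) {
                written.add(e.getKey() + "=" + e.getValue());
            }
            buffer.clear();
        }
    }

    int pending() { return buffer.size(); }

    List<String> written() { return written; }
}
```

    Nothing reaches the backend stand-in until flush() is invoked, which is why callers that need their writes to be visible should flush explicitly.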
  - get
```
public T get(K key,
    String[] fields)
```
    Description copied from interface: DataStore
    
    Returns the object corresponding to the given key.
    
    Parameters:
    key - the key of the object
    fields - the fields required in the object. Pass null, to retrieve all fields
    
    Returns:
    the Object corresponding to the key or null if it cannot be found
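    The meaning of the fields parameter (a subset of fields, or null for all of them) and of the null return value can be illustrated with an in-memory model. All names below are hypothetical and rows are simplified to maps; this is a sketch of the contract, not of the Cassandra lookup.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory illustration of get(key, fields) semantics.
class FieldProjectionSketch {
    private final Map<String, Map<String, Object>> rows =
            new HashMap<String, Map<String, Object>>();

    void put(String key, Map<String, Object> row) {
        rows.put(key, new LinkedHashMap<String, Object>(row));
    }

    /** Returns the requested fields of the row, all fields if fields is null, or null if the key is unknown. */
    Map<String, Object> get(String key, String[] fields) {
        Map<String, Object> row = rows.get(key);
        if (row == null) {
            return null; // key not found
        }
        if (fields == null) {
            return new LinkedHashMap<String, Object>(row); // null means every field
        }
        Map<String, Object> projected = new LinkedHashMap<String, Object>();
        for (String field : fields) {
            if (row.containsKey(field)) {
                projected.put(field, row.get(field));
            }
        }
        return projected;
    }
}
```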
  - getPartitions
```
public List<PartitionQuery<K,T>> getPartitions(Query<K,T> query)
                                                               throws IOException
```
    Description copied from interface: DataStore
    
    Partitions the given query and returns a list of PartitionQuerys, which will execute on local data.
    
    Parameters:
    query - the base query to create the partitions for. If the query is null, then the data store returns the partitions for the default query (returning every object)
    
    Returns:
    a List of PartitionQuery objects
    
    Throws:
    
    IOException
  - getSchemaName
```
public String getSchemaName()
```
    In Cassandra, schemas are referred to as keyspaces.
    
    Returns:
    Keyspace
  - newQuery
```
public Query<K,T> newQuery()
```
    Description copied from interface: DataStore
    
    Constructs and returns a new Query.
    
    Returns:
    a new Query.
  - put
```
public void put(K key,
       T value)
```
    When doing the put(Object, PersistentBase) operation, the logic is as follows:
    1. Obtain the Avro Schema for the object.
    2. Create a new duplicate instance of the object (explained in more detail below) **.
    3. Obtain a List of the Schema's Schema.Field objects.
    4. Iterate through the field List so that each field can be processed in turn.
    5. Check whether the Schema.Field is dirty. If it is NOT dirty, we DO NOT process this field.
    6. Obtain the element at the current position in the list so we can operate on it directly.
    7. Obtain the Schema.Type of that element and process it accordingly. N.B. For the nested types ARRAY, MAP, RECORD and UNION, we shadow the check in step 5: if the Schema.Field is at position 0 OR it is NOT dirty, we DO NOT process this field. This is carried out in org.apache.gora.cassandra.store.CassandraStore#getFieldValue(Schema, Type, Object).
    8. Insert the key and object into the LinkedHashMap buffer, where they remain until flushed. This performs a structural modification of the map.
    ** We create a duplicate instance of the object to be persisted and insert processed objects into a synchronized LinkedHashMap. This allows us to keep all the objects in memory until flushing.
    Parameters:
    key - for the Avro Record (object).
    value - Record object to be persisted in Cassandra
    See Also:
    DataStore.put(java.lang.Object, org.apache.gora.persistency.Persistent).
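    The dirty-field filtering in the steps above can be sketched without Avro. The `Record` class below is a hypothetical stand-in for a persistent object with per-field dirty flags, type-specific processing (step 7) is reduced to copying the value, and all names are invented for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of put(): copy only dirty fields into a duplicate, then buffer it.
class DirtyFieldPutSketch {
    /** Hypothetical record with positional fields and dirty flags. */
    static final class Record {
        final String[] fieldNames;
        final Object[] values;
        final boolean[] dirty;

        Record(String... fieldNames) {
            this.fieldNames = fieldNames;
            this.values = new Object[fieldNames.length];
            this.dirty = new boolean[fieldNames.length];
        }

        void set(int position, Object value) {
            values[position] = value;
            dirty[position] = true; // a modified field becomes dirty
        }
    }

    private final Map<String, Map<String, Object>> buffer =
            Collections.synchronizedMap(new LinkedHashMap<String, Map<String, Object>>());

    void put(String key, Record value) {
        // Duplicate that receives only the processed (dirty) fields.
        Map<String, Object> copy = new LinkedHashMap<String, Object>();
        for (int pos = 0; pos < value.fieldNames.length; pos++) {
            if (!value.dirty[pos]) {
                continue; // clean fields are skipped
            }
            copy.put(value.fieldNames[pos], value.values[pos]);
        }
        buffer.put(key, copy); // kept in memory until a flush
    }

    Map<String, Object> buffered(String key) { return buffer.get(key); }
}
```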
  - schemaExists
```
public boolean schemaExists()
```
    Simple method to check if a Cassandra Keyspace exists.
    
    Returns:
    true if a Keyspace exists.
