public class CassandraStore<K,T extends PersistentBase> extends DataStoreBase<K,T>
CassandraStore is the primary class
responsible for directing Gora CRUD operations into Cassandra. We (delegate) rely
heavily on CassandraClient for many operations
such as initialization, creating and deleting schemas (Cassandra Keyspaces), etc.| Modifier and Type | Field and Description |
|---|---|
static int |
DEFAULT_UNION_SCHEMA
Default schema index with value "0" used when AVRO Union data types are stored
|
static ThreadLocal<org.apache.avro.io.BinaryEncoder> |
encoders |
static org.slf4j.Logger |
LOG
Logging implementation
|
static String |
UNION_COL_SUFIX
Fixed string with value "UnionIndex" used to generate an extra column based on
the original field's name
|
static ConcurrentHashMap<String,org.apache.avro.specific.SpecificDatumWriter<?>> |
writerMap
Create a
ConcurrentHashMap for the
datum readers and writers. |
autoCreateSchema, beanFactory, conf, datumReader, datumWriter, fieldMap, keyClass, persistentClass, properties, schema| Constructor and Description |
|---|
CassandraStore()
The default constructor for CassandraStore
|
| Modifier and Type | Method and Description |
|---|---|
void |
close()
Close the DataStore.
|
void |
createSchema()
Creates the optional schema or table (or similar) in the datastore
to hold the objects.
|
boolean |
delete(K key)
Deletes the object with the given key
|
long |
deleteByQuery(Query<K,T> query)
Deletes all the objects matching the query.
|
void |
deleteSchema()
Deletes the underlying schema or table (or similar) in the datastore
that holds the objects.
|
Result<K,T> |
execute(Query<K,T> query)
When executing Gora Queries in Cassandra we query the Cassandra keyspace by families.
|
void |
flush()
Flush the buffer which is a synchronized
LinkedHashMap
storing fields pending to be stored by
put(Object, PersistentBase)
operations. |
T |
get(K key,
String[] fields)
Returns the object corresponding to the given key.
|
List<PartitionQuery<K,T>> |
getPartitions(Query<K,T> query)
Partitions the given query and returns a list of
PartitionQuerys,
which will execute on local data. |
String |
getSchemaName()
In Cassandra Schemas are referred to as Keyspaces
|
void |
initialize(Class<K> keyClass,
Class<T> persistent,
Properties properties)
Initialize is called when then the call to
org.apache.gora.store.DataStoreFactory#createDataStore(Class
is made. |
Query<K,T> |
newQuery()
Constructs and returns a new Query.
|
void |
put(K key,
T value)
When doing the
put(Object, PersistentBase)
operation, the logic is as follows:
Obtain the Avro Schema for the object.
Create a new duplicate instance of the object (explained in more detail below) **.
Obtain a List of the Schema
Schema.Field's.
Iterate through the field List. |
boolean |
schemaExists()
Simple method to check if a Cassandra Keyspace exists.
|
equals, get, getBeanFactory, getConf, getFields, getFieldsToQuery, getKeyClass, getOrCreateConf, getPersistentClass, getSchemaName, newKey, newPersistent, readFields, setBeanFactory, setConf, setKeyClass, setPersistentClass, truncateSchema, writepublic static final org.slf4j.Logger LOG
public static String UNION_COL_SUFIX
public static int DEFAULT_UNION_SCHEMA
public static final ThreadLocal<org.apache.avro.io.BinaryEncoder> encoders
public static final ConcurrentHashMap<String,org.apache.avro.specific.SpecificDatumWriter<?>> writerMap
ConcurrentHashMap for the
datum readers and writers.
This is necessary because they are not thread safe, at least not before
Avro 1.4.0 (See AVRO-650).
When they are thread safe, it is possible to maintain a single reader and
writer pair for every schema, instead of one for every thread.public void initialize(Class<K> keyClass, Class<T> persistent, Properties properties)
org.apache.gora.store.DataStoreFactory#createDataStore(Class dataStoreClass, Class keyClass, Class persistent, org.apache.hadoop.conf.Configuration conf)
is made. In this case, we merely delegate the store initialization to the
org.apache.gora.cassandra.store.CassandraClient#initialize(Class keyClass, Class persistentClass) .initialize in interface DataStore<K,T extends PersistentBase>initialize in class DataStoreBase<K,T extends PersistentBase>keyClass - the class of the keyspersistent - the class of the persistent objectsproperties - extra metadatapublic void close()
DataStorepublic void createSchema()
DataStorepublic boolean delete(K key)
DataStorekey - the key of the objectpublic long deleteByQuery(Query<K,T> query)
DataStorequery - matching records to this query will be deletedpublic void deleteSchema()
DataStorepublic Result<K,T> execute(Query<K,T> query)
query - the query to execute.Result object.public void flush()
LinkedHashMap
storing fields pending to be stored by
put(Object, PersistentBase)
operations. Invoking this method therefore writes the buffered rows
into Cassandra.DataStore.flush()public T get(K key, String[] fields)
DataStorekey - the key of the objectfields - the fields required in the object. Pass null, to retrieve all fieldspublic List<PartitionQuery<K,T>> getPartitions(Query<K,T> query) throws IOException
DataStorePartitionQuerys,
which will execute on local data.query - the base query to create the partitions for. If the query
is null, then the data store returns the partitions for the default query
(returning every object)IOExceptionpublic String getSchemaName()
public Query<K,T> newQuery()
DataStorepublic void put(K key, T value)
put(Object, PersistentBase)
operation, the logic is as follows:
Schema for the object.List of the Schema
Schema.Field's.List. This allows us to
consequently process each item.Schema.Field is NOT dirty.
If this condition is true then we DO NOT process this field.Schema.Type of the element obtained
above and process it accordingly. N.B. For nested type ARRAY, MAP
RECORD or UNION, we shadow the checks in bullet point 5 above to infer that the
Schema.Field is either at
position 0 OR it is NOT dirty. If one of these conditions is true then we DO NOT
process this field. This is carried out in
org.apache.gora.cassandra.store.CassandraStore#getFieldValue(Schema, Type, Object)LinkedHashMap buffer
before being flushed. This performs a structural modification of the map.LinkedHashMap. This allows
us to keep all the objects in memory till flushing.key - for the Avro Record (object).value - Record object to be persisted in CassandraDataStore.put(java.lang.Object,
org.apache.gora.persistency.Persistent).public boolean schemaExists()
Copyright © 2010-2014 The Apache Software Foundation. All Rights Reserved.