public class CassandraStore<K,T extends PersistentBase> extends DataStoreBase<K,T>
CassandraStore
is the primary class
responsible for directing Gora CRUD operations into Cassandra. We (delegate) rely
heavily on CassandraClient
for many operations
such as initialization, creating and deleting schemas (Cassandra Keyspaces), etc.Modifier and Type | Field and Description |
---|---|
static int |
DEFAULT_UNION_SCHEMA
Default schema index with value "0" used when AVRO Union data types are stored
|
static ThreadLocal<org.apache.avro.io.BinaryEncoder> |
encoders |
static org.slf4j.Logger |
LOG
Logging implementation
|
static String |
UNION_COL_SUFIX
Fixed string with value "UnionIndex" used to generate an extra column based on
the original field's name
|
static ConcurrentHashMap<String,org.apache.avro.specific.SpecificDatumWriter<?>> |
writerMap
Create a
ConcurrentHashMap for the
datum readers and writers. |
autoCreateSchema, beanFactory, conf, datumReader, datumWriter, fieldMap, keyClass, persistentClass, properties, schema
Constructor and Description |
---|
CassandraStore()
The default constructor for CassandraStore
|
Modifier and Type | Method and Description |
---|---|
void |
close()
Close the DataStore.
|
void |
createSchema()
Creates the optional schema or table (or similar) in the datastore
to hold the objects.
|
boolean |
delete(K key)
Deletes the object with the given key
|
long |
deleteByQuery(Query<K,T> query)
Deletes all the objects matching the query.
|
void |
deleteSchema()
Deletes the underlying schema or table (or similar) in the datastore
that holds the objects.
|
Result<K,T> |
execute(Query<K,T> query)
When executing Gora Queries in Cassandra we query the Cassandra keyspace by families.
|
void |
flush()
Flush the buffer which is a synchronized
LinkedHashMap
storing fields pending to be stored by
put(Object, PersistentBase)
operations. |
T |
get(K key,
String[] fields)
Returns the object corresponding to the given key.
|
List<PartitionQuery<K,T>> |
getPartitions(Query<K,T> query)
Partitions the given query and returns a list of
PartitionQuery s,
which will execute on local data. |
String |
getSchemaName()
In Cassandra Schemas are referred to as Keyspaces
|
void |
initialize(Class<K> keyClass,
Class<T> persistent,
Properties properties)
Initialize is called when then the call to
org.apache.gora.store.DataStoreFactory#createDataStore(Class
is made. |
Query<K,T> |
newQuery()
Constructs and returns a new Query.
|
void |
put(K key,
T value)
When doing the
put(Object, PersistentBase)
operation, the logic is as follows:
Obtain the Avro Schema for the object.
Create a new duplicate instance of the object (explained in more detail below) **.
Obtain a List of the Schema
Schema.Field 's.
Iterate through the field List . |
boolean |
schemaExists()
Simple method to check if a Cassandra Keyspace exists.
|
equals, get, getBeanFactory, getConf, getFields, getFieldsToQuery, getKeyClass, getOrCreateConf, getPersistentClass, getSchemaName, newKey, newPersistent, readFields, setBeanFactory, setConf, setKeyClass, setPersistentClass, truncateSchema, write
public static final org.slf4j.Logger LOG
public static String UNION_COL_SUFIX
public static int DEFAULT_UNION_SCHEMA
public static final ThreadLocal<org.apache.avro.io.BinaryEncoder> encoders
public static final ConcurrentHashMap<String,org.apache.avro.specific.SpecificDatumWriter<?>> writerMap
ConcurrentHashMap
for the
datum readers and writers.
This is necessary because they are not thread safe, at least not before
Avro 1.4.0 (See AVRO-650).
When they are thread safe, it is possible to maintain a single reader and
writer pair for every schema, instead of one for every thread.public void initialize(Class<K> keyClass, Class<T> persistent, Properties properties)
org.apache.gora.store.DataStoreFactory#createDataStore(Class dataStoreClass, Class keyClass, Class persistent, org.apache.hadoop.conf.Configuration conf)
is made. In this case, we merely delegate the store initialization to the
org.apache.gora.cassandra.store.CassandraClient#initialize(Class keyClass, Class persistentClass)
.initialize
in interface DataStore<K,T extends PersistentBase>
initialize
in class DataStoreBase<K,T extends PersistentBase>
keyClass
- the class of the keyspersistent
- the class of the persistent objectsproperties
- extra metadatapublic void close()
DataStore
public void createSchema()
DataStore
public boolean delete(K key)
DataStore
key
- the key of the objectpublic long deleteByQuery(Query<K,T> query)
DataStore
query
- matching records to this query will be deletedpublic void deleteSchema()
DataStore
public Result<K,T> execute(Query<K,T> query)
query
- the query to execute.Result
object.public void flush()
LinkedHashMap
storing fields pending to be stored by
put(Object, PersistentBase)
operations. Invoking this method therefore writes the buffered rows
into Cassandra.DataStore.flush()
public T get(K key, String[] fields)
DataStore
key
- the key of the objectfields
- the fields required in the object. Pass null, to retrieve all fieldspublic List<PartitionQuery<K,T>> getPartitions(Query<K,T> query) throws IOException
DataStore
PartitionQuery
s,
which will execute on local data.query
- the base query to create the partitions for. If the query
is null, then the data store returns the partitions for the default query
(returning every object)IOException
public String getSchemaName()
public Query<K,T> newQuery()
DataStore
public void put(K key, T value)
put(Object, PersistentBase)
operation, the logic is as follows:
Schema
for the object.List
of the Schema
Schema.Field
's.List
. This allows us to
consequently process each item.Schema.Field
is NOT dirty.
If this condition is true then we DO NOT process this field.Schema.Type
of the element obtained
above and process it accordingly. N.B. For nested type ARRAY, MAP
RECORD or UNION, we shadow the checks in bullet point 5 above to infer that the
Schema.Field
is either at
position 0 OR it is NOT dirty. If one of these conditions is true then we DO NOT
process this field. This is carried out in
org.apache.gora.cassandra.store.CassandraStore#getFieldValue(Schema, Type, Object)
LinkedHashMap
buffer
before being flushed. This performs a structural modification of the map.LinkedHashMap
. This allows
us to keep all the objects in memory till flushing.key
- for the Avro Record (object).value
- Record object to be persisted in CassandraDataStore.put(java.lang.Object,
org.apache.gora.persistency.Persistent).
public boolean schemaExists()
Copyright © 2010-2014 The Apache Software Foundation. All Rights Reserved.