Riak Client Usage Introduction ------ This document assumes that you have already started your Riak cluster. For instructions on that prerequisite, refer to riak/doc/basic-setup.txt. Overview --- To talk to riak, all you need is an Erlang node with the Riak ebin directories in its code path. Once this shell is up, use riak:client_connect/1 to get connected. The client returned from client_connect is defined by the riak_client module, and supports the simple functions get, put, delete, and others. Starting Your Client Erlang Node --- Riak client nodes must use "long names" and have the Riak ebin directories in their code path. The easiest way to start a node of this nature is: $ export ERL_LIBS=$PATH_TO_RIAK/apps $ erl -name myclient@127.0.0.1 -setcookie cookie Note: If you are using a precompiled version of Riak your ERL_LIBS would be /usr/lib/riak/lib You'll know you've done this correctly if you can execute the following commands and get a path to a beam file, instead of the atom 'non_existing': (myclient@127.0.0.1)1> code:which(riak). "../riak_kv/ebin/riak.beam" Connecting --- Once you have your node running, pass your Riak server nodename to riak:client_connect/1 to connect and get a client. This can be as simple as: 3> {ok, Client} = riak:client_connect('riak@127.0.0.1'). {ok,{riak_client,'riak@127.0.0.1', <<1,112,224,226>>}} Storing New Data --- Each bit of data in Riak is stored in a "bucket" at a "key" that is unique to that bucket. The bucket is intended as an organizational aid, for example to help segregate data by type, but Riak doesn't care what values it stores, so choose whatever scheme suits you. Buckets and keys must both be binaries. Before storing your data, you must wrap it in a riak_object: 4> Object = riak_object:new(<<"groceries">>, <<"mine">>, ["eggs", "bacon"]). {r_object,<<"groceries">>,<<"mine">>, [{r_content,{dict,0,16,16,8,80,48, {[],[],[],[],[],[],[],[],[],[],[],[],[],[],...}, {{[],[],[],[],[],[],[],[],[],[],[],[],...}}}, ["eggs","bacon"]}], [], {dict,0,16,16,8,80,48, {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],...}, {{[],[],[],[],[],[],[],[],[],[],[],[],[],...}}}, undefined} Then, using the client you opened earlier, store the object: 5> Client:put(Object, 1). ok If the return value of the last command was anything but the atom 'ok', then the store failed. The return value may give you a clue as to why the store failed, but check the Troubleshooting section below if not. The object is now stored in Riak, but you may be wondering about the additional parameter to Client:put. There are five different 'put' functions: put/1, put/2, put/3, put/4 and put/5. The lower-arity functions pass defaults for the parameters they leave out of the higher-arity functions. The available parameters, in order are: Object: the riak_object to store W: the minimum number of nodes that must respond with success for the write to be considered successful DW: the minimum number of nodes that must respond with success *after durably storing* the object for the write to be considered successful Timeout: the number of milliseconds to wait for W and DW responses before exiting with a timeout Options: List of options that control certain behavior. returnbody: Return the stored object. The default timeout is currently 60 seconds, and put/2 passes its W value as the DW value. So, the example above asks the client to store Object, waiting for 1 successful durable write response, waiting a maximum of 60 seconds for success. See riak/doc/architecture.txt for more information about W and DW values. Fetching Data --- At some point you'll want that data back. Using the same bucket and key you used before: 6> {ok, O} = Client:get(<<"groceries">>, <<"mine">>, 1). {ok,{r_object,<<"groceries">>,<<"mine">>, [{r_content,{dict,2,16,16,8,80,48, {[],[],[],[],[],[],[],[],[],[],[],[],...}, {{[],[],[],[],[],[], [["X-Riak-Last-Modified",87|...]], [],[],[],...}}}, ["eggs","bacon"]}], [{"20090722142711-myclient@127.0.0.1-riak@127.0.0.1-916345", {1,63415492187}}], {dict,0,16,16,8,80,48, {[],[],[],[],[],[],[],[],[],[],[],[],[],...}, {{[],[],[],[],[],[],[],[],[],[],[],...}}}, undefined}} 7> riak_object:get_value(O). ["eggs","bacon"] Like 'put', there are multiple 'get' functions: get/2, get/3, and get/4. Their parameters are: Bucket: the bucket in which the object is stored Key: the key under which the object is stored R: the minimum number of nodes that must respond with success for the read to be considered successful Timeout: the number of milliseconds to wait for R responses before exiting with a timeout So, the example 'get' above requested the "mine" object in the "groceries" bucket, demanding at least one successful response in 60 seconds. Modifying Data --- Say you had the "grocery list" from the examples above, reminding you to get ["eggs","bacon"], and you want to add "milk" to it. The easiest way is: 8> {ok, Oa} = Client:get(<<"groceries">>, <<"mine">>, 1). ... 9> Ob = riak_object:update_value(Oa, ["milk"|riak_object:get_value(Oa)]). ... 10> Client:put(Ob). ok That is, fetch the object from Riak, modify its value with riak_object:update_value/2, then store the modified object back in Riak. You can get your updated object to convince yourself that your list is updated: 11> {ok, Oc} = Client:get(<<"groceries">>, <<"mine">>, 1). ... 12> riak_object:get_value(Oc). ["milk","eggs","bacon"]. Siblings --- By default, Riak does not expose siblings to clients. It is, however, important to be aware of their presence. Riak is able to provide high availability, in part, due to its ability to accept put requests, even when it can't tell if that put would overwrite data. Through the use of vector clocks Riak is able to track and resolve conflicting versions of an object. For more information about how Riak uses vclocks, see riak/doc/architecture.txt. Siblings occur when Riak accepts conflicting values for an object that cannot be resolved with vector clocks. Let's continue the example from above but this time we will tell Riak to return siblings to the client. Siblings can be allowed per bucket by setting 'allow_mult' to 'true': 13> Client:set_bucket(<<"groceries">>, [{allow_mult, true}]). ok To create a sibling, we'll fire up a new client and define our grocery list with different items: 14> {ok, Client2} = riak:client_connect('riak@127.0.0.1'). ... 15> Client2:put(riak_object:new(<<"groceries">>, <<"mine">>, ["bread","cheese"]), 1). ... 16> {ok, O2} = Client2:get(<<"groceries">>, <<"mine">>, 1). ... 17> riak_object:get_value(O2). ** exception error: no match of right hand side value [{{dict,2,16,16,8,80,48, {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}, {{[],[],[],[],[],[],[],[],[],[], [[<<"X-Riak-VTag">>,49,54,106,80|...]], [],[], [[<<"X-Ri"...>>|{...}]], [],[]}}}, ["bread","cheese"]}, {{dict,2,16,16,8,80,48, {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}, {{[],[],[],[],[],[],[],[],[],[], [[<<"X-Riak-VTag">>,50,97,53|...]], [],[], [[<<...>>|...]], [],[]}}}, ["milk","eggs","bacon"]}] in function riak_object:get_value/1 Now that Riak is returning siblings it seems our get_value/1 function is broken. So, what happened? The function get_value/1 does not know how to handle siblings; to see both sets of conflicting data, use riak_object:get_values/1: 18> riak_object:get_values(O2). [["bread","cheese"],["milk","eggs","bacon"]] It is up to the client to "merge" the siblings in whatever way suits your application. In this case, we really did just want bread and cheese in the list, so: 19> O3 = riak_object:update_value(O2, ["bread","cheese"]). ... 20> Client2:put(O3, 1). ok Now, we can fetch the grocery list and check its value: 21> {ok, O4} = Client2:get(<<"groceries">>, <<"mine">>, 1). ... 22> riak_object:get_value(O4). ["bread","cheese"] To return to the default behavior we can set 'allow_mult' back to false: 23> Client2:set_bucket(<<"groceries">>, [{allow_mult, false}]). Listing Keys --- Most uses of key-value stores are structured in such a way that requests know which keys they want in a bucket. Sometimes, though, it's necessary to find out what keys are available (when debugging, for example). For that, there is list_keys: 1> Client:list_keys(<<"groceries">>). {ok, ["mine"]}. Note that keylist updates are asynchronous to the object storage primitives, and may not be updated immediately after a put or delete. This function is primarily intended as a debugging aid. Deleting Data --- Throwing away data is quick and simple: just use the delete function in the riak_client module: 1> Client:delete(<<"groceries">>, <<"mine">>, 1). ok As with get, delete has arity-2, arity-3 and arity-4 functions, with parameters: Bucket: the bucket the object is in Key: the key to delete RW: the number of nodes to wait for responses from Timeout: the number of milliseconds to wait for responses So, the command demonstrated above tries to delete the object with the key "mine" in the bucket "groceries", waiting up to 60 seconds for at least one node to respond. Issuing a delete for an object that does not exist returns an error tuple. For example, calling the same delete as above a second time: 2> Client:delete(<<"groceries">>, <<"mine">>, 1). {error,notfound} Bucket Properties --- As seen in the examples above, simply storing a key/value in a bucket causes the bucket to come into existence with some default parameters. To view the settings for a bucket, use the get_bucket function in the riak_client module: 1> Client:get_bucket(<<"groceries">>). [{name,<<"groceries">>}, {n_val,3}, {allow_mult,false}, {last_write_wins,false}, {precommit,[]}, {postcommit,[]}, {chash_keyfun,{riak_core_util,chash_std_keyfun}}, {linkfun,{modfun,riak_kv_wm_link_walker,mapreduce_linkfun}}, {old_vclock,86400}, {young_vclock,20}, {big_vclock,50}, {small_vclock,10}, {r,quorum}, {w,quorum}, {dw,quorum}, {rw,quorum}] If the default parameters do not suit your application, you can alter them on a per-bucket basis with set_bucket. You should do this before storing any data in the bucket, but many of the settings will "just work" if they are modified after the bucket contains data. An interesting bucket setting to take note of is 'n_val'. n_val tells Riak how many copies of an object to make. More properly, for a bucket with an n_val of N, Riak will store each object in the bucket in N different vnodes (see riak/doc/architecture.txt for a description of the term "vnode"). Most of the time, the default n_val of 3 works perfectly. Three provides some durability and availability over 1 or 2. If you are attempting to enhance either durability or speed, however, increasing n_val may make sense *if you also modify the R, W, and DW* values in your get and put calls. Changing n_val on its own will have little effect without also modifying the put- and get-time parameters. Troubleshooting --- {nodedown, ...} - If all of your riak_client calls are exiting with exceptions that describe themselves as, roughly: ** exception exit: {{nodedown,'riak@127.0.0.1'}, {gen_server,call, [{riak_api,'riak@127.0.0.1'}, {get,groceries,"mine",1,15000}, 15000]}} in function gen_server:call/3 The node that your client was connected to is down. Try connecting a new client. riak:client_connect/1 returns {error,timeout} - The Riak node you are connecting to is down. {error,notfound} - The bucket/key combination you requested was not found.