Sophie: riak-1.3.1-2.fc18 i686

riak-1.3.1-2.fc18.i686.rpm

Riak Setup Instructions
------

This document explains how to set up a Riak cluster.  It assumes that
you have already downloaded and successfully built Riak.  For help with
those steps, please refer to riak/README.


Overview
---

Riak has many knobs to tweak, affecting everything from distribution
to disk storage.  This document will attempt to give a description of
the common configuration parameters, then describe two typical setups,
one small, one large.


Configuration
---

Configurations are stored in the simple text files vm.args and
app.config.  Initial versions of these files are stored in the
rel/files/ subdirectory of the riak source tree.  When a release
is generated, these files are copied to rel/riak/etc/.

vm.args
---

The vm.args configuration file sets the parameters passed on the
command line to the Erlang VM.  Lines starting with a '#' are
comments, and are ignored.  The other lines are concatenated and
passed on the command line verbatim.

Two important parameters to configure here are "-name", the name to
give the Erlang node running Riak, and "-setcookie", the cookie that
all Riak nodes need to share in order to communicate.

app.config
---

The app.config configuration file is formatted as an Erlang VM config
file.  The syntax is simply:

[
 {AppName, [
            {Option1, Value1},
            {Option2, Value2},
            ...
           ]},
 ...
].

Normally, this will look something like:

[
 {riak_kv, [
         {storage_backend, riak_kv_dets_backend},
         {riak_kv_dets_backend_root, "data/dets"}
        ]},
 {sasl, [
         {sasl_error_logger, {file, "log/sasl-error.log"}}
        ]}
].

This would set the 'storage_backend' and 'riak_kv_dets_backend_root'
options for the 'riak_kv' application, and the 'sasl_error_logger' option
for the 'sasl' application.

The following parameters can be used in app.config to configure Riak
behavior.  Some of the terminology used below is better explained in
riak/doc/architecture.txt.

cluster_name: string
  The name of the cluster.  Can be anything.  Used mainly in saving
  ring configuration.  All nodes should have the same cluster name.

gossip_interval: integer
  The period, in milliseconds, at which ring state gossiping will
  happen.  A good default is 60000 (sixty seconds).  Best not to
  change it unless you know what you're doing.

ring_creation_size: integer
  The number of partitions to divide the keyspace into.  This can be
  any number, but you probably don't want to go lower than 16, and
  production deployments will probably want something like 1024 or
  greater.  This is a very difficult parameter to change after your
  ring has been created, so choose a number that allows for growth, if
  you expect to add nodes to this cluster in the future.

ring_state_dir: string
  Directory in which the ring state should be stored.  Ring state is
  stored to allow an entire cluster to be restarted.

storage_backend: atom

  Name of the module that implements the storage for a vnode.  The
  backends that ship with Riak are riak_kv_bitcask_backend, 
  riak_kv_cache_backend, riak_dets_backend, riak_ets_backend, 
  riak_fs_backend, and riak_kv_gb_trees_backend. Some backends have 
  their own set of configuration parameters.

  riak_kv_bitcask_backend:
    Stores data in the Bitcask key/value store.

  riak_kv_cache_backend:
    Behaves as an LRU-with-timed-expiry cache.
  
  riak_kv_dets_backend:
    A backend that uses DETS to store its data.

    riak_kv_dets_backend_root: string
      The directory under which this backend will store its files.
  
  riak_kv_ets_backend:
    A backend that uses ETS to store its data.
  
  riak_kv_fs_backend:
    A backend that uses the filesystem directly to store data.  Data
    are stored in Erlang binary format in files in a directory
    structure on disk.

    riak_fs_backend_root: string
      The directory under which this backend will store its files.

  riak_kv_gb_trees_backend:
     A backend that uses Erlang gb_trees to store data.


Single-node Configuration
---

If you're running a single Riak node, you likely don't need to change
any configuration at all.  After compiling and generating the release
("./rebar compile generate"), simply start Riak from the rel/
directory.  (Details about the "riak" control script can be found in
the README.)


Large (Production) Configuration
---

If you're running any sort of cluster that could be labeled
"production", "deployment", "scalable", "enterprise", or any other
word implying that the cluster will be running interminably with
on-going maintenance, then you will want to change configurations a
bit.  Some recommended changes:

* Uncomment the "-heart" line in vm.args.  This will cause the "heart"
  utility to monitor the Riak node, and restart it if it stops.

* Change the name of the Riak node in vm.args from riak@127.0.0.1 to
  riak@VISIBLE.HOSTNAME.  This will allow Riak nodes on separate machines
  to communicate.

* Change 'web_ip' in the 'riak_core' section of app.config if you'll be 
  accessing that interface from a non-host-local client.

* Consider adding a 'ring_creation_size' entry to app.config, and
  setting it to a number higher than the default of 64.  More
  partitions will allow you to add more Riak nodes later, if you need
  to.

To get the cluster up and running, first start Riak on each node with
the usual "riak start" command.  Next, tell each node to join the
cluster with the riak-admin script:

box2$ bin/riak-admin join riak@box1.example.com
Sent join request to riak@box1.example.com

To check that all nodes have joined the cluster, attach a console to
any node, and request the ring from the ring manager, then check that
all nodes are represented:

$ bin/riak attach
Attaching to /tmp/erlang.pipe.1 (^D to exit)
(riak@box1.example.com)1> {ok, R} = riak_core_ring_manager:get_my_ring().
{ok,{chstate,'riak@box1.example.com',
...snip...
(riak@box1.example.com)2> riak_core_ring:all_members(R).
['riak@box1.example.com','riak@box2.example.com']


Your cluster should now be ready to accept requests.  See
riak/doc/basic-client.txt for simple instructions on connecting,
storing and fetching data.

Starting more nodes in production is just as easy:

1. Install Riak on another host, modifying hostnames in configuration
   files, if necessary.
2. Start the node with "riak start"
3. Add the node to the cluster with
   "riak-admin join ExistingClusterNode"


Developer Configuration
---

If you're hacking on Riak, and you need to run multiple nodes on a
single physical machine, use the "devrel" make command:

$ make devrel
mkdir -p dev
(cd rel && ../rebar generate target_dir=../dev/dev1 overlay_vars=vars/dev1_vars.config)
==> rel (generate)
mkdir -p dev
(cd rel && ../rebar generate target_dir=../dev/dev2 overlay_vars=vars/dev2_vars.config)
==> rel (generate)
mkdir -p dev
(cd rel && ../rebar generate target_dir=../dev/dev3 overlay_vars=vars/dev3_vars.config)
==> rel (generate)

This make target creates a release, and then modifies configuration
files such that each Riak node uses a different Erlang node name
(dev1-3), web port (8091-3), data directory (dev/dev1-3/data/), etc.

Start each developer node just as you would a regular node, but use
the 'riak' script in dev/devN/bin/.