* Riak 1.1.2 Release Notes
** Features and Improvements for Riak

This is mainly a stabilization release, adding a few stability improvements
for heavily loaded networks and improving support for running mixed clusters
while migrating to the 1.1.x series.

The interval at which Riak checks whether primary partitions need to be
transferred (along with a few other administrative tasks) has been lowered
from every 30s to every 10s by default.  It is also configurable via an
application environment variable in app.config:

#+BEGIN_SRC erlang
  {riak_core, [{vnode_management_timer, #Millisecs}, ...]}
#+END_SRC
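
For example, the 10s default corresponds to ={riak_core, [{vnode_management_timer, 10000}]}= (the value is in milliseconds).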

** Known Issues

*** [[https://github.com/basho/riak/issues/130][https://github.com/basho/riak/issues/130]]

The /usr/sbin/riak script expects /var/run/riak to exist.  On Fedora, and possibly
other Linux distributions, /var/run is mounted on tmpfs, so the directory created
at package install time goes away on reboot, and the /usr/sbin/riak script does not
have permission to recreate it.

As a workaround, add this to /etc/rc.local:
#+BEGIN_SRC sh
mkdir /var/run/riak && chown -R riak /var/run/riak
#+END_SRC
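
On Fedora and other distributions that manage /var/run with systemd-tmpfiles,
an alternative is a tmpfiles.d drop-in that recreates the directory at boot.
The file name below is only a suggestion, and it assumes a =riak= group exists
alongside the =riak= user:

#+BEGIN_SRC
# /etc/tmpfiles.d/riak.conf -- recreate the runtime directory at boot
d /var/run/riak 0755 riak riak -
#+END_SRC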


** Bugs Fixed

- [[https://github.com/basho/erlang_js/issues/18][erlang_js inlines do not work on XCode 4.3]]
- [[https://github.com/basho/riak_sysmon/issues/5][Work-around for net_kernel:get_status() race condition]]
- [[https://github.com/basho/riak_core/issues/152][Handoff reported as started even if rejected for exceeding handoff concurrency]]
- [[https://github.com/basho/riak_core/issues/153][Intermittent hang with handoff sender]]
- [[https://github.com/basho/riak_core/issues/154][handoff of fallback data can be delayed or never performed]]
- [[https://github.com/basho/riak_core/issues/155][Chicken & egg problem with vnode proxy process start & registration?]]
- [[https://github.com/basho/riak_kv/issues/300][Put FSM crashes if forwarded node goes down]]

* Riak 1.1.1 Release Notes
** Major Features and Improvements for Riak
*** MapReduce Changes
This release includes fixes for systems with heavy MapReduce load.

Riak Pipe maintains a limit on the number of active worker processes in the
system.  Workers are created on demand when the pipe executes; they are not
reserved at pipe creation time.

Prior to 1.1.1, not all fittings (e.g. get object from Riak, map value, reduce)
checked that their output was actually being delivered to the next stage of the
pipeline.  This has been changed so that the pipe returns an error when
processing errors occur or system limits are hit.

As a result of this change, if the pipe is overloaded, clients will see errors of this form:
={error,<<"Error sending inputs: {error,worker_limit_reached}">>}=

Clients can either retry (preferably with a backoff and retry-limit mechanism)
or, if the hardware has sufficient capacity, the number of workers per vnode
can be increased by setting =worker_limit= in =etc/app.config=:

={riak_pipe, [{worker_limit, #Workers}]}=

Different numbers of workers are used depending on the type of MapReduce query,
so tuning will be very workload-dependent.
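
If you take the retry route, the sketch below shows one way to wrap a query
submission.  =with_backoff/2= is a hypothetical helper, not part of Riak or
its client libraries, and it only assumes the error tuple shape shown above:

#+BEGIN_SRC erlang
%% Retry an overloaded MapReduce request with exponential backoff.
%% RunQuery is any zero-arity fun that submits the query and returns either
%% the normal client result or the "Error sending inputs" error shown above.
with_backoff(RunQuery, MaxRetries) ->
    with_backoff(RunQuery, MaxRetries, 100).

with_backoff(RunQuery, RetriesLeft, SleepMs) ->
    case RunQuery() of
        {error, <<"Error sending inputs: ", _/binary>>} when RetriesLeft > 0 ->
            timer:sleep(SleepMs),                  %% back off, then try again
            with_backoff(RunQuery, RetriesLeft - 1, SleepMs * 2);
        Result ->
            %% success, a different error, or retries exhausted
            Result
    end.
#+END_SRC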

If you are using JavaScript MapReduce you will also need to increase the number of
JavaScript virtual machines in the =riak_kv= section of =app.config=:

#+BEGIN_SRC erlang
            {map_js_vm_count,  #MapVMs},
            {reduce_js_vm_count, #ReduceVMs}
#+END_SRC

Again, tuning will be workload-dependent.  If sufficient resources are
available, setting =#MapVMs= to the maximum number of JS MapReduce jobs run in
parallel will give the best results, with =#ReduceVMs= set to the number of
jobs divided by the number of nodes, plus a margin for overload.
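
For example (illustrative numbers only), a 4-node cluster expected to run at
most 8 JavaScript MapReduce jobs in parallel might use:

#+BEGIN_SRC erlang
            %% illustrative values: 8 parallel JS jobs across a 4-node cluster
            {map_js_vm_count, 8},
            {reduce_js_vm_count, 3}    %% 8 jobs / 4 nodes, plus a margin of 1
#+END_SRC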

** Bugs Fixed
- [[https://github.com/basho/riak_core/issues/144][Map-reduce jobs fail during rolling upgrade to 1.1]]
- [[https://github.com/basho/riak_kv/issues/287][Race Condition causes Javascript VMs to never be returned to VM Pool]]
- [[https://issues.basho.com/show_bug.cgi?id=1258][Avoid {case_clause, ok} errors when vnode backend fails to start]]
- [[https://github.com/basho/bitcask/issues/39][Bitcask - Fix incorrect NIF error tuples]]

* Riak 1.1.0 Release Notes
** Major Features and Improvements for Riak
*** Riak Control
- [[http://basho.com/blog/technical/2012/01/30/Riak-in-Production-at-Posterous-Riak-Control-Preview/][Riak Control preview demo]]
- [[https://github.com/basho/riak_control][Riak Control repository with documentation on how to get started]]
*** Riaknostic
- [[http://basho.com/blog/technical/2011/12/15/announcing-riaknostic/][Blog post announcing riaknostic]]
- [[http://riaknostic.basho.com/][Riaknostic Homepage]]
*** Bitcask Improvements
- CRC checks have been added to hintfiles for bitcask
*** LevelDB Improvements
- Snappy (compression algorithm from Google) has been enabled
- The compaction algorithm has been tuned to reduce delays due to compaction
*** Other backend changes
- Multi-backend now supports 2i properly
*** Lager Improvements
- Tracing support (see the [[https://github.com/basho/lager/blob/master/README.org][README]])
- Term printing is ~4x faster and much more correct (compared to io:format)
- Bitstring printing support was added
*** MapReduce Improvements
- The MapReduce interface now supports requests with empty queries. This allows the 2i, list-keys, and search inputs to return matching keys to clients without needing to include a reduce_identity query phase.
- MapReduce error messages have been improved.  Most error cases should now return helpful information all the way to the client, while also producing less spam in Riak's logs.

*** Riak Core Improvements
There have been numerous changes to =riak_core= that address issues
with cluster scalability and enable Riak to better handle large
clusters and large rings.  Specifically, the warnings about potential
gossip overload and large ring sizes in the Riak 1.0 release notes
are no longer valid.  However, a few user-visible configuration
changes accompany these improvements.

First, Riak has a new routing layer that determines how requests are
sent to vnodes for servicing. In order to support mixed-version clusters
during a rolling upgrade, Riak 1.1 starts in a legacy routing mode that
is slower than the traditional behavior in prior releases. After upgrading
or deploying a cluster that is composed solely of Riak 1.1 nodes, the legacy
mode can be disabled by adding the following to the riak_core section of
app.config:

={legacy_vnode_routing, false}=

Second, the =handoff_concurrency= setting that can be set in the riak_core
section of app.config has been changed to limit both incoming and outgoing
handoff on a node.  Prior to 1.1, this setting only limited the amount of
outgoing handoff traffic from a node and was unable to prevent a node from
becoming overloaded with too much inbound traffic.

Finally, the =gossip_interval= setting that could previously be set in the
riak_core section of app.config has been removed and replaced with a new
=gossip_limit= setting.  The =gossip_limit= setting takes the form
={allowed_gossips, interval_in_ms}=.  For example, the default setting of
={45, 10000}= limits gossip to 45 gossip messages every 10 seconds.
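
Taken together, the riak_core section of app.config for a cluster running only
1.1 nodes might look like the sketch below; the =handoff_concurrency= value is
purely an example, and =gossip_limit= is shown at its stated default:

#+BEGIN_SRC erlang
{riak_core, [
             {legacy_vnode_routing, false},  %% only once every node runs 1.1
             {handoff_concurrency, 2},       %% example value; limits inbound and outbound handoff
             {gossip_limit, {45, 10000}},    %% default: 45 gossip messages per 10 seconds
             ...
            ]}
#+END_SRC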

*** New Ownership Claim Algorithm 
The new ring ownership claim algorithm introduced as an optional
setting in the 1.0 release has been set as the default for 1.1.

The new claim algorithm significantly reduces the amount of ownership
shuffling for clusters with more than N+2 nodes in them.  Changes to
the cluster membership code in 1.0 made the original claim algorithm
fall back to just reordering nodes in sequence in more cases than it
used to, causing massive handoff activity on larger clusters.  Mixed
clusters including pre-1.0 nodes will still use the original algorithm
until all nodes are upgraded.

NOTE: The semantics of the new algorithm differ from those of the old
algorithm when the number of nodes in the cluster is less than the
=target_n_val= setting.  When the number of nodes in your cluster is
less than =target_n_val=, Riak will not guarantee that each of your
replicas is on a distinct physical node.  By default, Riak is set up
with a replication value of N=3 and a =target_n_val= of 4.  As
such, you must have at least a 4-node cluster in order to have your
replicas placed on different physical nodes.  If you want to deploy a
3-replica, 3-node cluster, you can override the =target_n_val= setting
in the riak_core section of app.config:

={target_n_val, 3}=

In practice, you should set =target_n_val= to match the largest replication
value you plan to use across your entire cluster.

*** Riak KV Improvements
**** Listkeys Backpressure

Backpressure has been added to listkeys to prevent the node listing keys from
being overwhelmed.  This required a protocol change so that the key lister
can limit the rate at which it receives data.

In mixed clusters where some of the nodes are running versions earlier than 1.1,
please set =listkeys_backpressure= to =false= in the riak_kv section of
app.config until all nodes are upgraded:

={listkeys_backpressure, false}=

Once all nodes are upgraded, set =listkeys_backpressure= to =true= in the riak_kv section of app.config:

={listkeys_backpressure, true}=

Running nodes can be updated without restarting by running this snippet from
the Riak console:

=application:set_env(riak_kv, listkeys_backpressure, true).=
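
To apply the same change to every running node from a single =riak attach=
session, one option (plain Erlang rpc, shown here only as a sketch) is
=rpc:multicall([node() | nodes()], application, set_env, [riak_kv, listkeys_backpressure, true]).=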

Fresh installs of 1.1.0 and above default to using listkeys backpressure;
adjust app.config if different behavior is desired.

**** Don't drop post-commit errors on the floor

In previous releases there was no easy way to determine whether a
post-commit hook was failing.  In this release two counters have been
added to =riak-admin status= that indicate pre- and post-commit hook
failures: =precommit_fail= and =postcommit_fail=.  By default the
errors themselves are not logged; the reasoning is that logging every
failure from a bad hook could cause unnecessary I/O overload.

If the errors need to be examined, Lager, Riak's logging system,
allows you to dynamically change the logging level to debug on the
function executing the hook.  To do that you need to =riak attach= on
one of the nodes and run the following:

={ok, Trace} = lager:trace_file("<path>/failing-postcommits", [{module, riak_kv_put_fsm}, {function, decode_postcommit}], debug).=

This will output all post-commit errors to
=<path>/failing-postcommits=.  When you have collected enough samples you can
stop the trace like so:

=lager:stop_trace(Trace).=

** Other Additions
*** Default =small_vclock= to be equal to =big_vclock=

If you are using bidirectional cluster replication and you have
overridden the defaults for either of these then you should consider
setting both to the same value.

The default value of =small_vclock= has been changed to be equal to
=big_vclock= in order to delay or even prevent unnecessary sibling
creation in a Riak deployment with bidirectional cluster replication.
When you replicate a pruned vector clock, the other cluster will think
it isn't a descendant, even though it is, and create a sibling.  By
raising =small_vclock= to match =big_vclock= you reduce the frequency
of pruning and thus of siblings.  Combined with vnode vclocks, sibling
creation for this particular reason may be avoided entirely, since
the number of entries will almost always stay below the threshold in a
well-behaved cluster (i.e. one not under constant node membership
change or network partitions).
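
As a sketch of setting both values equal, assuming your overrides live under
=default_bucket_props= in the riak_core section of app.config (adjust to
wherever you currently set them) and using 50 purely as an example value:

#+BEGIN_SRC erlang
{riak_core, [
             {default_bucket_props, [
                                     %% example values; keep small_vclock equal to big_vclock
                                     {small_vclock, 50},
                                     {big_vclock, 50}
                                    ]},
             ...
            ]}
#+END_SRC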

** Known Issues
- Luwak has been deprecated in the 1.1 release
- [[https://issues.basho.com/show_bug.cgi?id=1160][bz1160 - Bitcask fails to merge on corrupt file]]

*** Innostore on 32-bit systems

Depending on the partition count and the default stack size of the
operating system, using Innostore as the backend on 32-bit systems can
cause problems. A symptom of this problem would be a message similar
to this on the console or in the Riak error log:

#+BEGIN_SRC
10:22:38.719 [error] Failed to start riak_kv_innostore_backend for index 1415829711164312202009819681693899175291684651008. Reason: {eagain,[{erlang,open_port,[{spawn,innostore_drv},[binary]]},{innostore,connect,0},{riak_kv_innostore_backend,start,2},{riak_kv_vnode,init,1},{riak_core_vnode,init,1},{gen_fsm,init_it,6},{proc_lib,init_p_do_apply,3}]}
10:22:38.871 [notice] "backend module failed to start."
#+END_SRC

The workaround for this problem is to reduce the stack size for the
user Riak is run as (=riak= for most distributions). For example
on Linux, the current stack size can be viewed using =ulimit -s= and
can be altered by adding entries to the =/etc/security/limits.conf=
file such as these:

#+BEGIN_SRC
riak             soft    stack           1024
riak             hard    stack           1024
#+END_SRC

*** Ownership Handoff Stall

It has been observed that 1.1.0 clusters can end up in a state where ownership
handoff fails to complete.  This state should not happen under normal
circumstances, but has been observed in cases where Riak nodes crashed due
to unrelated issues (e.g. running out of disk space or memory) during cluster
membership changes.  The behavior can be identified by looking at the output
of =riak-admin ring_status= for transfers that are waiting on an
empty set.  As an example:

#+BEGIN_SRC
============================== Ownership Handoff ==============================
Owner:      riak@host1
Next Owner: riak@host2

Index: 123456789123456789123456789123456789123456789123
  Waiting on: []
  Complete:   [riak_kv_vnode,riak_pipe_vnode]
#+END_SRC

To fix this issue, copy and paste the following code sequence into an Erlang
shell for each =Owner= node in =riak-admin ring_status= for which
this case has been identified.  The Erlang shell can be accessed with =riak attach=.

#+BEGIN_SRC erlang
fun() ->
  Node = node(),
  {_Claimant, _RingReady, _Down, _MarkedDown, Changes} =
      riak_core_status:ring_status(),
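  %% Keep only transfers owned by this node that are still 'awaiting' but
  %% have nothing left in their Waiting list (i.e. stuck).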
  Stuck =
      [{Idx, Complete} || {{Owner, NextOwner}, Transfers} <- Changes,
                          {Idx, Waiting, Complete, Status} <- Transfers,
                          Owner =:= Node,
                          Waiting =:= [],
                          Status =:= awaiting],

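  %% Ring transaction that marks each stuck {Index, Module} transfer as complete.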
  RT = fun(Ring, _) ->
           NewRing = lists:foldl(fun({Idx, Mods}, Ring1) ->
                         lists:foldl(fun(Mod, Ring2) ->
                             riak_core_ring:handoff_complete(Ring2, Idx, Mod)
                         end, Ring1, Mods)
                     end, Ring, Stuck),
           {new_ring, NewRing}
       end,

  case Stuck of
      [] ->
          ok;
      _ ->
          riak_core_ring_manager:ring_trans(RT, []),
          ok
  end
end().
#+END_SRC

** Bugs Fixed
- [[https://issues.basho.com/show_bug.cgi?id=775][bz775 - Start-up script does not recreate /var/run/riak]]
- [[https://issues.basho.com/show_bug.cgi?id=1283][bz1283 - erlang_js uses non-thread-safe driver function]]
- [[https://issues.basho.com/show_bug.cgi?id=1333][bz1333 - Bitcask attempts to open backup/other files]]
- [[http://basho.com/blog/technical/2012/01/27/Quick-Checking-Poolboy-for-Fun-and-Profit/][Poolboy - Lots of potential bugs fixed, see detailed post by Andrew Thompson]]
*** Lager Specific Bugs Fixed
- #26 - don't make a crash log called 'undefined'
- #28 - R13A support
- #29 - Don't unnecessarily quote atoms
- #31 - Better crash reports for proc_lib processes
- #33 - Don't assume supervisor children are named with atoms
- #35 - Support printing bitstrings (binaries with trailing bits)
- #37 - Don't generate dynamic atoms