<!--$Id: get_bulk.so,v 10.7 2004/09/03 19:47:57 mjc Exp $--> <!--Copyright (c) 1997,2007 Oracle. All rights reserved.--> <!--See the file LICENSE for redistribution information.--> <html> <head> <title>Berkeley DB Reference Guide: Retrieving records in bulk</title> <meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> <meta name="keywords" content="embedded,database,programmatic,toolkit,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,Java,C,C++"> </head> <body bgcolor=white> <a name="2"><!--meow--></a> <table width="100%"><tr valign=top> <td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> <td align=right><a href="../am_misc/align.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../am_misc/partial.html"><img src="../../images/next.gif" alt="Next"></a> </td></tr></table> <p> <h3 align=center>Retrieving records in bulk</h3> <p>When retrieving large numbers of records from the database, the number of method calls can often dominate performance. Berkeley DB offers bulk get interfaces which can significantly increase performance for some applications. To retrieve records in bulk, an application buffer must be specified to the <a href="../../api_c/db_get.html">DB->get</a> or <a href="../../api_c/dbc_get.html">DBcursor->get</a> methods. This is done in the C API by setting the <b>data</b> and <b>ulen</b> fields of the <b>data</b> <a href="../../api_c/dbt_class.html">DBT</a> to reference an application buffer, and the <b>flags</b> field of that structure to <a href="../../api_c/dbt_class.html#DB_DBT_USERMEM">DB_DBT_USERMEM</a>. In the Berkeley DB C++ and Java APIs, the actions are similar, although there are API-specific methods to set the <a href="../../api_c/dbt_class.html">DBT</a> values. Then, the <a href="../../api_c/dbc_get.html#DB_MULTIPLE">DB_MULTIPLE</a> or <a href="../../api_c/dbc_get.html#DB_MULTIPLE_KEY">DB_MULTIPLE_KEY</a> flags are specified to the <a href="../../api_c/db_get.html">DB->get</a> or <a href="../../api_c/dbc_get.html">DBcursor->get</a> methods, which cause multiple records to be returned in the specified buffer.</p> <p>The difference between <a href="../../api_c/dbc_get.html#DB_MULTIPLE">DB_MULTIPLE</a> and <a href="../../api_c/dbc_get.html#DB_MULTIPLE_KEY">DB_MULTIPLE_KEY</a> is as follows: <a href="../../api_c/dbc_get.html#DB_MULTIPLE">DB_MULTIPLE</a> returns multiple data items for a single key. For example, the <a href="../../api_c/dbc_get.html#DB_MULTIPLE">DB_MULTIPLE</a> flag would be used to retrieve all of the duplicate data items for a single key in a single call. The <a href="../../api_c/dbc_get.html#DB_MULTIPLE_KEY">DB_MULTIPLE_KEY</a> flag is used to retrieve multiple key/data pairs, where each returned key may or may not have duplicate data items.</p> <p>Once the <a href="../../api_c/db_get.html">DB->get</a> or <a href="../../api_c/dbc_get.html">DBcursor->get</a> method has returned, the application will walk through the buffer handling the returned records. This is implemented for the C and C++ APIs using four macros: <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_INIT">DB_MULTIPLE_INIT</a>, <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_NEXT">DB_MULTIPLE_NEXT</a>, <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_KEY_NEXT">DB_MULTIPLE_KEY_NEXT</a>, and <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_RECNO_NEXT">DB_MULTIPLE_RECNO_NEXT</a>. For the Java API, this is implemented as three iterator classes: <a href="../../java/com/sleepycat/db/MultipleDataEntry.html">MultipleDataEntry</a>, <a href="../../java/com/sleepycat/db/MultipleKeyDataEntry.html">MultipleKeyDataEntry</a>, and <a href="../../java/com/sleepycat/db/MultipleRecnoDataEntry.html">MultipleRecnoDataEntry</a>.</p> <p>The <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_INIT">DB_MULTIPLE_INIT</a> macro is always called first. It initializes a local application variable and the <b>data</b> <a href="../../api_c/dbt_class.html">DBT</a> for stepping through the set of returned records. Then, the application calls one of the remaining three macros: <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_NEXT">DB_MULTIPLE_NEXT</a>, <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_KEY_NEXT">DB_MULTIPLE_KEY_NEXT</a>, and <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_RECNO_NEXT">DB_MULTIPLE_RECNO_NEXT</a>.</p> <p>If the <a href="../../api_c/dbc_get.html#DB_MULTIPLE">DB_MULTIPLE</a> flag was specified to the <a href="../../api_c/db_get.html">DB->get</a> or <a href="../../api_c/dbc_get.html">DBcursor->get</a> method, the application will always call the <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_NEXT">DB_MULTIPLE_NEXT</a> macro. If the <a href="../../api_c/dbc_get.html#DB_MULTIPLE_KEY">DB_MULTIPLE_KEY</a> flag was specified to the <a href="../../api_c/db_get.html">DB->get</a> or <a href="../../api_c/dbc_get.html">DBcursor->get</a> method, and, the underlying database is a Btree or Hash database, the application will always call the <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_KEY_NEXT">DB_MULTIPLE_KEY_NEXT</a> macro. If the <a href="../../api_c/dbc_get.html#DB_MULTIPLE_KEY">DB_MULTIPLE_KEY</a> flag was specified to the <a href="../../api_c/db_get.html">DB->get</a> or <a href="../../api_c/dbc_get.html">DBcursor->get</a> method, and, the underlying database is a Queue or Recno database, the application will always call the <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_RECNO_NEXT">DB_MULTIPLE_RECNO_NEXT</a> macro. The <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_NEXT">DB_MULTIPLE_NEXT</a>, <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_KEY_NEXT">DB_MULTIPLE_KEY_NEXT</a>, and <a href="../../api_c/dbt_bulk.html#DB_MULTIPLE_RECNO_NEXT">DB_MULTIPLE_RECNO_NEXT</a> macros are called repeatedly, until the end of the returned records is reached. The end of the returned records is detected by the application's local pointer variable being set to NULL.</p> <p>The following is an example of a routine that displays the contents of a Btree database using the bulk return interfaces.</p> <blockquote><pre>int rec_display(dbp) DB *dbp; { DBC *dbcp; DBT key, data; size_t retklen, retdlen; char *retkey, *retdata; int ret, t_ret; void *p; <p> memset(&key, 0, sizeof(key)); memset(&data, 0, sizeof(data)); <p> /* Review the database in 5MB chunks. */ #define BUFFER_LENGTH (5 * 1024 * 1024) if ((data.data = malloc(BUFFER_LENGTH)) == NULL) return (errno); data.ulen = BUFFER_LENGTH; data.flags = DB_DBT_USERMEM; <p> /* Acquire a cursor for the database. */ if ((ret = dbp->cursor(dbp, NULL, &dbcp, 0)) != 0) { dbp->err(dbp, ret, "DB->cursor"); free(data.data); return (ret); } <p> for (;;) { /* * Acquire the next set of key/data pairs. This code does * not handle single key/data pairs that won't fit in a * BUFFER_LENGTH size buffer, instead returning DB_BUFFER_SMALL * to our caller. */ if ((ret = dbcp->c_get(dbcp, &key, &data, DB_MULTIPLE_KEY | DB_NEXT)) != 0) { if (ret != DB_NOTFOUND) dbp->err(dbp, ret, "DBcursor->c_get"); break; } <p> for (DB_MULTIPLE_INIT(p, &data);;) { DB_MULTIPLE_KEY_NEXT(p, &data, retkey, retklen, retdata, retdlen); if (p == NULL) break; printf("key: %.*s, data: %.*s\n", (int)retklen, retkey, (int)retdlen, retdata); } } <p> if ((t_ret = dbcp->c_close(dbcp)) != 0) { dbp->err(dbp, ret, "DBcursor->close"); if (ret == 0) ret = t_ret; } <p> free(data.data); <p> return (ret); }</pre></blockquote> <table width="100%"><tr><td><br></td><td align=right><a href="../am_misc/align.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../am_misc/partial.html"><img src="../../images/next.gif" alt="Next"></a> </td></tr></table> <p><font size=1>Copyright (c) 1996,2007 Oracle. All rights reserved.</font> </body> </html>