.. -*- rst -*-

.. highlightlang:: none

Different results with the same keyword
=======================================

Even with the same search keyword, full text search can return different results depending on the other conditions specified in the same query. This section explains the cause and how to work around it.

Example
-------

First, here is an example in which the search results actually differ.

The DDL is as follows. The body column of the Blogs table is tokenized with the TokenMecab tokenizer, and an index is created from the result.::

  table_create Blogs TABLE_NO_KEY
  column_create Blogs body COLUMN_SCALAR ShortText
  column_create Blogs updated_at COLUMN_SCALAR Time
  table_create Terms TABLE_PAT_KEY|KEY_NORMALIZE ShortText --default_tokenizer TokenMecab
  column_create Terms blog_body COLUMN_INDEX|WITH_POSITION Blogs body

Only one record of test data is loaded.::

  load --table Blogs
  [
    ["body", "updated_at"],
    ["東京都民に深刻なダメージを与えました。", "2010/9/21 10:18:34"],
  ]

First, search with the full text condition alone. In this case the record hits.::

  > select Blogs --filter 'body @ "東京都"'
  [[0,4102.268052438,0.000743783],[[[1],[["_id","UInt32"],["updated_at","Time"],["body","ShortText"]],[1,1285031914.0,"東京都民に深刻なダメージを与えました。"]]]]

Next, combine a range condition with the full text condition (1285858800 is 2010/10/1 0:0:0 expressed in seconds). In this case the record also hits.::

  > select Blogs --filter 'body @ "東京都" && updated_at < 1285858800'
  [[0,4387.524084839,0.001525487],[[[1],[["_id","UInt32"],["updated_at","Time"],["body","ShortText"]],[1,1285031914.0,"東京都民に深刻なダメージを与えました。"]]]]

Finally, swap the order of the range condition and the full text condition. The individual conditions are the same, but in this case the record does not hit.::

  > select Blogs --filter 'updated_at < 1285858800 && body @ "東京都"'
  [[0,4400.292570838,0.000647716],[[[0],[["_id","UInt32"],["updated_at","Time"],["body","ShortText"]]]]]

The following explains why groonga behaves this way.

Cause
-----

This behavior occurs because full text search switches between several search behaviors. Only a brief explanation is given here; see :doc:`/spec/search` for details.

There are three search behaviors:

1. Exact match search
2. Unsplit search
3. Partial match search

groonga basically performs exact match search only. In the example above, the text "東京都民に深刻なダメージを与えました。" is searched with the query "東京都", but this query does not match when the TokenMecab tokenizer is used.

The target text "東京都民に深刻なダメージを与えました。" is tokenized as

  東京 / 都民 / に / 深刻 / な / ダメージ / を / 与え / まし / た / 。

whereas the query "東京都" is tokenized as

  東京 / 都

so the two do not match exactly.

groonga performs unsplit search only when the number of hits from exact match search does not exceed a given threshold, and performs partial match search only when the number of hits still does not exceed the threshold (the default threshold is 1). The record in this example hits with partial match search, so it hits when only the "東京都" query is specified.

However, when the threshold has already been exceeded before the full text search runs, as in the query below (``updated_at < 1285858800`` hits one record, which exceeds the threshold), groonga does not perform partial match search or the like, even if exact match search hits no records at all.::

  select Blogs --filter 'updated_at < 1285858800 && body @ "東京都"'

As a result, changing the order of the conditions changes the search results. Two ways to avoid this situation are introduced below. Each involves a trade-off, so consider carefully whether to adopt it.

Remedy 1: change the tokenizer
------------------------------

The TokenMecab tokenizer tokenizes text using a dictionary prepared in advance, so it can be described as a tokenizer that favors precision over recall. N-gram tokenizers such as TokenBigram, on the other hand, favor recall. For example, with TokenMecab "東京都" and "京都" never match exactly, but with TokenBigram they do. Likewise, with TokenMecab "東京都" does not exactly match "東京都民", but with TokenBigram it does.

Specifying an N-gram tokenizer therefore raises recall, but it lowers precision and makes search noise more likely. To control this balance, specify a weight for each index used in match_columns of :doc:`/reference/commands/select`.

Here is a concrete example based on the one above. First, add an index that uses TokenBigram.::

  table_create Bigram TABLE_PAT_KEY|KEY_NORMALIZE ShortText --default_tokenizer TokenBigram
  column_create Bigram blog_body COLUMN_INDEX|WITH_POSITION Blogs body

With this index in place, the record that did not match before now hits.::

  > select Blogs --filter 'updated_at < 1285858800 && body @ "東京都"'
  [[0,7163.448064902,0.000418127],[[[1],[["_id","UInt32"],["updated_at","Time"],["body","ShortText"]],[1,1285031914.0,"東京都民に深刻なダメージを与えました。"]]]]

However, an N-gram tokenizer produces more term hits than the TokenMecab tokenizer, so the hit score from the N-gram index ends up carrying more weight. Because N-gram tokenizers often have lower precision than the TokenMecab tokenizer, search noise becomes more likely to appear near the top of the results if nothing else is done.

To address this, specify weights so that the index created with the TokenMecab tokenizer counts for more than the index created with the TokenBigram tokenizer. This can be done with the match_columns option.::

  > select Blogs --match_columns 'Terms.blog_body * 10 || Bigram.blog_body * 3' --query '東京都' --output_columns '_score, body'
  [[0,8167.364602632,0.000647003],[[[1],[["_score","Int32"],["body","ShortText"]],[13,"東京都民に深刻なダメージを与えました。"]]]]

In this case the score is 13. The breakdown is 10 for the match on the Terms.blog_body index (which uses the TokenMecab tokenizer) plus 3 for the match on the Bigram.blog_body index (which uses the TokenBigram tokenizer), for a total of 13. Giving the TokenMecab index a higher weight in this way raises recall while keeping search noise from rising to the top.

The TokenBigram tokenizer was sufficient for this example because the text is Japanese, but for alphabetic text a tokenizer such as TokenBigramSplitSymbolAlpha is also needed. For example, the TokenBigram tokenizer tokenizes "楽しいbilliard" as

  楽し / しい / billiard

so "bill" does not match exactly. With the TokenBigramSplitSymbolAlpha tokenizer, the same text is tokenized as

  楽し / しい / いb / bi / il / ll / li / ia / ar / rd / d

so "bill" also matches exactly.

Weighting still needs to be considered when the TokenBigramSplitSymbolAlpha tokenizer is used.

The available bigram tokenizers are listed below.

* TokenBigram: Tokenizes by bigram. Consecutive symbols, alphabetic characters, and digits are each treated as one word.
* TokenBigramSplitSymbol: Also tokenizes symbols by bigram. Consecutive alphabetic characters and digits are each treated as one word.
* TokenBigramSplitSymbolAlpha: Also tokenizes symbols and alphabetic characters by bigram. Consecutive digits are treated as one word.
* TokenBigramSplitSymbolAlphaDigit: Also tokenizes symbols, alphabetic characters, and digits by bigram.
* TokenBigramIgnoreBlank: Tokenizes by bigram. Consecutive symbols, alphabetic characters, and digits are each treated as one word. Blanks are ignored.
* TokenBigramIgnoreBlankSplitSymbol: Also tokenizes symbols by bigram. Consecutive alphabetic characters and digits are each treated as one word. Blanks are ignored.
* TokenBigramIgnoreBlankSplitSymbolAlpha: Also tokenizes symbols and alphabetic characters by bigram. Consecutive digits are treated as one word. Blanks are ignored.
* TokenBigramIgnoreBlankSplitSymbolAlphaDigit: Also tokenizes symbols, alphabetic characters, and digits by bigram. Blanks are ignored.

Remedy 2: raise the threshold
-----------------------------

The threshold that decides whether unsplit search and partial match search are used can be changed with the ``--with-match-escalation-threshold`` configure option. With the setting below, unsplit search and partial match search are performed whenever the number of hits is 100 or fewer, even if exact match search has already produced hits.::

  % ./configure --with-match-escalation-threshold=100

As with remedy 1, note that search noise becomes more likely to appear near the top of the results. If search noise increases, lower the specified value.
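
The interaction between the threshold and the order of conditions can be illustrated with a small Python sketch. It is a toy model and not groonga's implementation. The ``Record`` type, the ``select`` function, and the condition tuples are hypothetical names introduced here. Unsplit search is omitted, and partial match is reduced to a prefix match on the last query token. The ``<=`` comparison follows the "100 or fewer" wording above and is an assumption. The threshold values 0 and 100 are illustrative and are not groonga defaults.

.. code-block:: python

  from dataclasses import dataclass


  @dataclass(frozen=True)
  class Record:
      tokens: tuple    # body text as tokens (assumed tokenizer output)
      updated_at: int  # seconds since the epoch


  def exact_match(record, query):
      """The query tokens appear as a contiguous run of whole tokens."""
      n = len(query)
      return any(record.tokens[i:i + n] == query
                 for i in range(len(record.tokens) - n + 1))


  def partial_match(record, query):
      """Simplified: the last query token may be a prefix of a body token."""
      n = len(query)
      for i in range(len(record.tokens) - n + 1):
          window = record.tokens[i:i + n]
          if window[:-1] == query[:-1] and window[-1].startswith(query[-1]):
              return True
      return False


  def select(records, conditions, threshold):
      """Apply AND-ed conditions from left to right (toy model)."""
      hits = None  # None means that no condition has been evaluated yet
      for kind, arg in conditions:
          candidates = records if hits is None else hits
          if kind == "before":
              hits = [r for r in candidates if r.updated_at < arg]
              continue
          # kind == "match": try exact match first
          matched = [r for r in candidates if exact_match(r, arg)]
          # Assumed rule: escalate only while the hits accumulated so far
          # do not exceed the threshold.
          count = len(matched) if hits is None else len(hits)
          if count <= threshold:
              matched = [r for r in candidates if partial_match(r, arg)]
          hits = matched
      return hits


  blog = Record(("東京", "都民", "に", "深刻", "な", "ダメージ",
                 "を", "与え", "まし", "た", "。"), 1285031914)
  match = ("match", ("東京", "都"))
  before = ("before", 1285858800)

  print(len(select([blog], [match, before], threshold=0)))    # 1
  print(len(select([blog], [before, match], threshold=0)))    # 0
  print(len(select([blog], [before, match], threshold=100)))  # 1

In this model the range condition leaves one record in the result set before the full text condition runs. With the low threshold, that one record blocks the escalation to partial match. With the higher threshold the escalation still happens, so the record hits regardless of the order of the conditions.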