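Integrating Sphinx search into CodeIgniter

This setup uses the Coreseek Chinese search engine; see the official website for installation instructions.

First, a screenshot of the result.

Add the Sphinx client library (/application/libraries/sphinx_client.php):

0001 <?php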
0002
0003 //
0004 // $Id: sphinxapi.php 2055 2009-11-06 23:09:58Z shodan $
0005 //
0006
0007 //
0008 // Copyright (c) 2001-2008, Andrew Aksyonoff. All rights reserved.
0009 //
0010 // This program is free software; you can redistribute it and/or modify
0011 // it under the terms of the GNU General Public License. You should have
0012 // received a copy of the GPL license along with this program; if you
0013 // did not, you can find it at http://www.gnu.org/
0014 //
0015
0016 /////////////////////////////////////////////////////////////////////////////
0017 // PHP version of Sphinx searchd client (PHP API)
0018 /////////////////////////////////////////////////////////////////////////////
0019
0020 /// known searchd commands
0021 define ( "SEARCHD_COMMAND_SEARCH", 0 );
0022 define ( "SEARCHD_COMMAND_EXCERPT", 1 );
0023 define ( "SEARCHD_COMMAND_UPDATE", 2 );
0024 define ( "SEARCHD_COMMAND_KEYWORDS",3 );
0025 define ( "SEARCHD_COMMAND_PERSIST", 4 );
0026 define ( "SEARCHD_COMMAND_STATUS", 5 );
0027 define ( "SEARCHD_COMMAND_QUERY", 6 );
0028
0029 /// current client-side command implementation versions
0030 define ( "VER_COMMAND_SEARCH", 0x116 );
0031 define ( "VER_COMMAND_EXCERPT", 0x100 );
0032 define ( "VER_COMMAND_UPDATE", 0x102 );
0033 define ( "VER_COMMAND_KEYWORDS", 0x100 );
0034 define ( "VER_COMMAND_STATUS", 0x100 );
0035 define ( "VER_COMMAND_QUERY", 0x100 );
0036
0037 /// known searchd status codes
0038 define ( "SEARCHD_OK", 0 );
0039 define ( "SEARCHD_ERROR", 1 );
0040 define ( "SEARCHD_RETRY", 2 );
0041 define ( "SEARCHD_WARNING", 3 );
0042
0043 /// known match modes
0044 define ( "SPH_MATCH_ALL", 0 );
0045 define ( "SPH_MATCH_ANY", 1 );
0046 define ( "SPH_MATCH_PHRASE", 2 );
0047 define ( "SPH_MATCH_BOOLEAN", 3 );
0048 define ( "SPH_MATCH_EXTENDED", 4 );
0049 define ( "SPH_MATCH_FULLSCAN", 5 );
0050 define ( "SPH_MATCH_EXTENDED2", 6 ); // extended engine V2 (TEMPORARY, WILL BE REMOVED)
0051
0052 /// known ranking modes (ext2 only)
0053 define ( "SPH_RANK_PROXIMITY_BM25", 0 ); ///< default mode, phrase proximity major factor and BM25 minor one
0054 define ( "SPH_RANK_BM25", 1 ); ///< statistical mode, BM25 ranking only (faster but worse quality)
0055 define ( "SPH_RANK_NONE", 2 ); ///< no ranking, all matches get a weight of 1
0056 define ( "SPH_RANK_WORDCOUNT", 3 ); ///< simple word-count weighting, rank is a weighted sum of per-field keyword occurrence counts
0057 define ( "SPH_RANK_PROXIMITY", 4 );
0058 define ( "SPH_RANK_MATCHANY", 5 );
0059 define ( "SPH_RANK_FIELDMASK", 6 );
0060
0061 /// known sort modes
0062 define ( "SPH_SORT_RELEVANCE", 0 );
0063 define ( "SPH_SORT_ATTR_DESC", 1 );
0064 define ( "SPH_SORT_ATTR_ASC", 2 );
0065 define ( "SPH_SORT_TIME_SEGMENTS", 3 );
0066 define ( "SPH_SORT_EXTENDED", 4 );
0067 define ( "SPH_SORT_EXPR", 5 );
0068
0069 /// known filter types
0070 define ( "SPH_FILTER_VALUES", 0 );
0071 define ( "SPH_FILTER_RANGE", 1 );
0072 define ( "SPH_FILTER_FLOATRANGE", 2 );
0073
0074 /// known attribute types
0075 define ( "SPH_ATTR_INTEGER", 1 );
0076 define ( "SPH_ATTR_TIMESTAMP", 2 );
0077 define ( "SPH_ATTR_ORDINAL", 3 );
0078 define ( "SPH_ATTR_BOOL", 4 );
0079 define ( "SPH_ATTR_FLOAT", 5 );
0080 define ( "SPH_ATTR_BIGINT", 6 );
0081 define ( "SPH_ATTR_MULTI", 0x40000000 );
0082
0083 /// known grouping functions
0084 define ( "SPH_GROUPBY_DAY", 0 );
0085 define ( "SPH_GROUPBY_WEEK", 1 );
0086 define ( "SPH_GROUPBY_MONTH", 2 );
0087 define ( "SPH_GROUPBY_YEAR", 3 );
0088 define ( "SPH_GROUPBY_ATTR", 4 );
0089 define ( "SPH_GROUPBY_ATTRPAIR", 5 );
0090
0091 // important properties of PHP's integers:
0092 // - always signed (one bit short of PHP_INT_SIZE)
0093 // - conversion from string to int is saturated
0094 // - float is double
0095 // - div converts arguments to floats
0096 // - mod converts arguments to ints
0097
0098 // the packing code below works as follows:
0099 // - when we got an int, just pack it
0100 // if performance is a problem, this is the branch users should aim for
0101 //
0102 // - otherwise, we got a number in string form
0103 // this might be due to different reasons, but we assume that this is
0104 // because it didn't fit into PHP int
0105 //
0106 // - factor the string into high and low ints for packing
0107 // - if we have bcmath, then it is used
0108 // - if we don't, we have to do it manually (this is the fun part)
0109 //
0110 // - x64 branch does factoring using ints
0111 // - x32 (ab)uses floats, since we can't fit unsigned 32-bit number into an int
0112 //
0113 // unpacking routines are pretty much the same.
0114 // - return ints if we can
0115 // - otherwise format number into a string
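The "just pack it" branch described in the comments above can be tried on its own. The snippet below is only a sketch and is not part of sphinxapi.php: `demo_pack_u64` is a made-up helper name, and it assumes a 64-bit PHP build (PHP_INT_SIZE >= 8) running PHP 7 or later.

```php
<?php
// Hypothetical helper (not in sphinxapi.php): split a non-negative integer
// into its high and low 32-bit words and pack both in network byte order,
// mirroring the "x64, int" branch. Assumes PHP_INT_SIZE >= 8.
function demo_pack_u64(int $v): string
{
    $hi = $v >> 32;          // upper 32 bits
    $lo = $v & 0xFFFFFFFF;   // lower 32 bits
    return pack("NN", $hi, $lo);
}

// 5000000000 does not fit into 32 bits, so the high word becomes 1.
echo bin2hex(demo_pack_u64(5000000000)), "\n"; // 000000012a05f200
```

The result is always 8 bytes, which is what searchd expects for 64-bit document IDs and attribute values.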
0116
0117 /// pack 64-bit signed
0118 function sphPackI64 ( $v )
0119 {
0120 assert ( is_numeric($v) );
0121
0122 // x64
0123 if ( PHP_INT_SIZE>=8 )
0124 {
0125 $v = (int)$v;
0126 return pack ( "NN", $v>>32, $v&0xFFFFFFFF );
0127 }
0128
0129 // x32, int
0130 if ( is_int($v) )
0131 return pack ( "NN", $v < 0 ? -1 : 0, $v );
0132
0133 // x32, bcmath
0134 if ( function_exists("bcmul") )
0135 {
0136 if ( bccomp ( $v, 0 ) == -1 )
0137 $v = bcadd ( "18446744073709551616", $v );
0138 $h = bcdiv ( $v, "4294967296", 0 );
0139 $l = bcmod ( $v, "4294967296" );
0140 return pack ( "NN", (float)$h, (float)$l ); // conversion to float is intentional; int would lose 31st bit
0141 }
0142
0143 // x32, no-bcmath
0144 $p = max(0, strlen($v) - 13);
0145 $lo = abs((float)substr($v, $p));
0146 $hi = abs((float)substr($v, 0, $p));
0147
0148 $m = $lo + $hi*1316134912.0; // (10 ^ 13) % (1 << 32) = 1316134912
0149 $q = floor($m/4294967296.0);
0150 $l = $m - ($q*4294967296.0);
0151 $h = $hi*2328.0 + $q; // (10 ^ 13) / (1 << 32) = 2328
0152
0153 if ( $v<0 )
0154 {
0155 if ( $l==0 )
0156 $h = 4294967296.0 - $h;
0157 else
0158 {
0159 $h = 4294967295.0 - $h;
0160 $l = 4294967296.0 - $l;
0161 }
0162 }
0163 return pack ( "NN", $h, $l );
0164 }
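The two magic constants in the no-bcmath branch come from writing the decimal string as hi*10^13 + lo and using 10^13 = 2328 * 2^32 + 1316134912. The sketch below checks that identity and replays the factoring with plain integers. It assumes a 64-bit PHP 7+ build; `demo_factor` is a hypothetical name used only for illustration.

```php
<?php
// Sketch only: assumes a 64-bit PHP build and PHP 7+ (for intdiv()).
$base = 10 ** 13;   // weight of the high decimal chunk
$word = 1 << 32;    // 2^32, the size of one packed word

$quot = intdiv($base, $word);
$rem  = $base % $word;
echo $quot, " ", $rem, "\n"; // 2328 1316134912

// Hypothetical helper: factor a non-negative decimal string into
// [high word, low word] the way the integer (x64) no-bcmath code does.
function demo_factor(string $v): array
{
    $p  = max(0, strlen($v) - 13);
    $lo = (int)substr($v, $p);       // last 13 decimal digits
    $hi = (int)substr($v, 0, $p);    // leading digits ("" casts to 0)

    $m = $lo + $hi * 1316134912;
    $l = $m % 4294967296;
    $h = $hi * 2328 + intdiv($m, 4294967296);
    return [$h, $l];
}

list($h, $l) = demo_factor("5000000000");
echo $h, " ", $l, "\n"; // 1 705032704
```

On 32-bit builds the same steps have to be done with floats, which is why the listing keeps separate branches.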
0165
0166 /// pack 64-bit unsigned
0167 function sphPackU64 ( $v )
0168 {
0169 assert ( is_numeric($v) );
0170
0171 // x64
0172 if ( PHP_INT_SIZE>=8 )
0173 {
0174 assert ( $v>=0 );
0175
0176 // x64, int
0177 if ( is_int($v) )
0178 return pack ( "NN", $v>>32, $v&0xFFFFFFFF );
0179
0180 // x64, bcmath
0181 if ( function_exists("bcmul") )
0182 {
0183 $h = bcdiv ( $v, 4294967296, 0 );
0184 $l = bcmod ( $v, 4294967296 );
0185 return pack ( "NN", $h, $l );
0186 }
0187
0188 // x64, no-bcmath
0189 $p = max ( 0, strlen($v) - 13 );
0190 $lo = (int)substr ( $v, $p );
0191 $hi = (int)substr ( $v, 0, $p );
0192
0193 $m = $lo + $hi*1316134912;
0194 $l = $m % 4294967296;
0195 $h = $hi*2328 + (int)($m/4294967296);
0196
0197 return pack ( "NN", $h, $l );
0198 }
0199
0200 // x32, int
0201 if ( is_int($v) )
0202 return pack ( "NN", 0, $v );
0203
0204 // x32, bcmath
0205 if ( function_exists("bcmul") )
0206 {
0207 $h = bcdiv ( $v, "4294967296", 0 );
0208 $l = bcmod ( $v, "4294967296" );
0209 return pack ( "NN", (float)$h, (float)$l ); // conversion to float is intentional; int would lose 31st bit
0210 }
0211
0212 // x32, no-bcmath
0213 $p = max(0, strlen($v) - 13);
0214 $lo = (float)substr($v, $p);
0215 $hi = (float)substr($v, 0, $p);
0216
0217 $m = $lo + $hi*1316134912.0;
0218 $q = floor($m / 4294967296.0);
0219 $l = $m - ($q * 4294967296.0);
0220 $h = $hi*2328.0 + $q;
0221
0222 return pack ( "NN", $h, $l );
0223 }
0224
0225 // unpack 64-bit unsigned
0226 function sphUnpackU64 ( $v )
0227 {
0228 list ( $hi, $lo ) = array_values ( unpack ( "N*N*", $v ) );
0229
0230 if ( PHP_INT_SIZE>=8 )
0231 {
0232 if ( $hi<0 ) $hi += (1<<32); // because php 5.2.2 to 5.2.5 is totally fucked up again
0233 if ( $lo<0 ) $lo += (1<<32);
0234
0235 // x64, int
0236 if ( $hi<=2147483647 )
0237 return ($hi<<32) + $lo;
0238
0239 // x64, bcmath
0240 if ( function_exists("bcmul") )
0241 return bcadd ( $lo, bcmul ( $hi, "4294967296" ) );
0242
0243 // x64, no-bcmath
0244 $C = 100000;
0245 $h = ((int)($hi / $C) << 32) + (int)($lo / $C);
0246 $l = (($hi % $C) << 32) + ($lo % $C);
0247 if ( $l>$C )
0248 {
0249 $h += (int)($l / $C);
0250 $l = $l % $C;
0251 }
0252
0253 if ( $h==0 )
0254 return $l;
0255 return sprintf ( "%d%05d", $h, $l );
0256 }
0257
0258 // x32, int
0259 if ( $hi==0 )
0260 {
0261 if ( $lo>0 )
0262 return $lo;
0263 return sprintf ( "%u", $lo );
0264 }
0265
0266 $hi = sprintf ( "%u", $hi );
0267 $lo = sprintf ( "%u", $lo );
0268
0269 // x32, bcmath
0270 if ( function_exists("bcmul") )
0271 return bcadd ( $lo, bcmul ( $hi, "4294967296" ) );
0272
0273 // x32, no-bcmath
0274 $hi = (float)$hi;
0275 $lo = (float)$lo;
0276
0277 $q = floor($hi/10000000.0);
0278 $r = $hi - $q*10000000.0;
0279 $m = $lo + $r*4967296.0;
0280 $mq = floor($m/10000000.0);
0281 $l = $m - $mq*10000000.0;
0282 $h = $q*4294967296.0 + $r*429.0 + $mq;
0283
0284 $h = sprintf ( "%.0f", $h );
0285 $l = sprintf ( "%07.0f", $l );
0286 if ( $h=="0" )
0287 return sprintf( "%.0f", (float)$l );
0288 return $h . $l;
0289 }
0290
0291 // unpack 64-bit signed
0292 function sphUnpackI64 ( $v )
0293 {
0294 list ( $hi, $lo ) = array_values ( unpack ( "N*N*", $v ) );
0295
0296 // x64
0297 if ( PHP_INT_SIZE>=8 )
0298 {
0299 if ( $hi<0 ) $hi += (1<<32); // because php 5.2.2 to 5.2.5 is totally fucked up again
0300 if ( $lo<0 ) $lo += (1<<32);
0301
0302 return ($hi<<32) + $lo;
0303 }
0304
0305 // x32, int
0306 if ( $hi==0 )
0307 {
0308 if ( $lo>0 )
0309 return $lo;
0310 return sprintf ( "%u", $lo );
0311 }
0312 // x32, int
0313 elseif ( $hi==-1 )
0314 {
0315 if ( $lo<0 )
0316 return $lo;
0317 return sprintf ( "%.0f", $lo - 4294967296.0 );
0318 }
0319
0320 $neg = "";
0321 $c = 0;
0322 if ( $hi<0 )
0323 {
0324 $hi = ~$hi;
0325 $lo = ~$lo;
0326 $c = 1;
0327 $neg = "-";
0328 }
0329
0330 $hi = sprintf ( "%u", $hi );
0331 $lo = sprintf ( "%u", $lo );
0332
0333 // x32, bcmath
0334 if ( function_exists("bcmul") )
0335 return $neg . bcadd ( bcadd ( $lo, bcmul ( $hi, "4294967296" ) ), $c );
0336
0337 // x32, no-bcmath
0338 $hi = (float)$hi;
0339 $lo = (float)$lo;
0340
0341 $q = floor($hi/10000000.0);
0342 $r = $hi - $q*10000000.0;
0343 $m = $lo + $r*4967296.0;
0344 $mq = floor($m/10000000.0);
0345 $l = $m - $mq*10000000.0 + $c;
0346 $h = $q*4294967296.0 + $r*429.0 + $mq;
0347 if ( $l==10000000 )
0348 {
0349 $l = 0;
0350 $h += 1;
0351 }
0352
0353 $h = sprintf ( "%.0f", $h );
0354 $l = sprintf ( "%07.0f", $l );
0355 if ( $h=="0" )
0356 return $neg . sprintf( "%.0f", (float)$l );
0357 return $neg . $h . $l;
0358 }
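For the reverse direction, here is a small sketch of the x64 route: read two big-endian 32-bit words and recombine them. `demo_unpack_i64` is a hypothetical helper, not part of the library, and it assumes a 64-bit PHP build where `unpack("N")` yields unsigned values.

```php
<?php
// Hypothetical helper (not in sphinxapi.php). Assumes PHP_INT_SIZE >= 8.
function demo_unpack_i64(string $bytes): int
{
    // named keys avoid relying on the positional result of unpack()
    $w = unpack("Nhi/Nlo", $bytes);

    // shifting the high word left by 32 pushes its top bit into the sign
    // bit, which gives the two's complement reading for negative values
    return ($w["hi"] << 32) + $w["lo"];
}

echo demo_unpack_i64("\x00\x00\x00\x01\x2a\x05\xf2\x00"), "\n"; // 5000000000
echo demo_unpack_i64(str_repeat("\xff", 8)), "\n";              // -1
```

The unsigned variant cannot take this shortcut once the high word reaches 2^31, since the value no longer fits into a signed PHP int and has to be returned as a string.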
0359
0360
0361 function sphFixUint ( $value )
0362 {
0363 if ( PHP_INT_SIZE>=8 )
0364 {
0365 // x64 route, workaround broken unpack() in 5.2.2+
0366 if ( $value<0 ) $value += (1<<32);
0367 return $value;
0368 }
0369 else
0370 {
0371 // x32 route, workaround php signed/unsigned braindamage
0372 return sprintf ( "%u", $value );
0373 }
0374 }
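To see what the x64 route of sphFixUint() does, consider a 32-bit value whose top bit is set and which has arrived as a negative PHP int. The sketch below uses a made-up helper name (`demo_fix_uint`) and assumes a 64-bit build.

```php
<?php
// Hypothetical helper mirroring the x64 route of sphFixUint().
// Assumes PHP_INT_SIZE >= 8 so that 2^32 fits into an int.
function demo_fix_uint(int $value): int
{
    // a negative input is a 32-bit value that was read as signed;
    // adding 2^32 restores the unsigned reading
    return $value < 0 ? $value + (1 << 32) : $value;
}

echo demo_fix_uint(-1), "\n";   // 4294967295
echo demo_fix_uint(42), "\n";   // 42
```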
0375
0376
0377 /// sphinx searchd client class
0378 class Sphinx_client
0379 {
0380 var $_host; ///< searchd host (default is "localhost")
0381 var $_port; ///< searchd port (default is 9312)
0382 var $_offset; ///< how many records to seek from result-set start (default is 0)
0383 var $_limit; ///< how many records to return from result-set starting at offset (default is 20)
0384 var $_mode; ///< query matching mode (default is SPH_MATCH_ALL)
0385 var $_weights; ///< per-field weights (default is 1 for all fields)
0386 var $_sort; ///< match sorting mode (default is SPH_SORT_RELEVANCE)
0387 var $_sortby; ///< attribute to sort by (default is "")
0388 var $_min_id; ///< min ID to match (default is 0, which means no limit)
0389 var $_max_id; ///< max ID to match (default is 0, which means no limit)
0390 var $_filters; ///< search filters
0391 var $_groupby; ///< group-by attribute name
0392 var $_groupfunc; ///< group-by function (to pre-process group-by attribute value with)
0393 var $_groupsort; ///< group-by sorting clause (to sort groups in result set with)
0394 var $_groupdistinct;///< group-by count-distinct attribute
0395 var $_maxmatches; ///< max matches to retrieve
0396 var $_cutoff; ///< cutoff to stop searching at (default is 0)
0397 var $_retrycount; ///< distributed retries count
0398 var $_retrydelay; ///< distributed retries delay
0399 var $_anchor; ///< geographical anchor point
0400 var $_indexweights; ///< per-index weights
0401 var $_ranker; ///< ranking mode (default is SPH_RANK_PROXIMITY_BM25)
0402 var $_maxquerytime; ///< max query time, milliseconds (default is 0, do not limit)
0403 var $_fieldweights; ///< per-field-name weights
0404 var $_overrides; ///< per-query attribute values overrides
0405 var $_select; ///< select-list (attributes or expressions, with optional aliases)
0406
0407 var $_error; ///< last error message
0408 var $_warning; ///< last warning message
0409 var $_connerror; ///< connection error vs remote error flag
0410
0411 var $_reqs; ///< requests array for multi-query
0412 var $_mbenc; ///< stored mbstring encoding
0413 var $_arrayresult; ///< whether $result["matches"] should be a hash or an array
0414 var $_timeout; ///< connect timeout
0415
0416 /////////////////////////////////////////////////////////////////////////////
0417 // common stuff
0418 /////////////////////////////////////////////////////////////////////////////
0419
0420 /// create a new client object and fill defaults
0421 function __construct ()
0422 {
0423 // per-client-object settings
0424 $this->_host = "localhost";
0425 $this->_port = 9312;
0426 $this->_path = false;
0427 $this->_socket = false;
0428
0429 // per-query settings
0430 $this->_offset = 0;
0431 $this->_limit = 20;
0432 $this->_mode = SPH_MATCH_ALL;
0433 $this->_weights = array ();
0434 $this->_sort = SPH_SORT_RELEVANCE;
0435 $this->_sortby = "";
0436 $this->_min_id = 0;
0437 $this->_max_id = 0;
0438 $this->_filters = array ();
0439 $this->_groupby = "";
0440 $this->_groupfunc = SPH_GROUPBY_DAY;
0441 $this->_groupsort = "@group desc";
0442 $this->_groupdistinct= "";
0443 $this->_maxmatches = 1000;
0444 $this->_cutoff = 0;
0445 $this->_retrycount = 0;
0446 $this->_retrydelay = 0;
0447 $this->_anchor = array ();
0448 $this->_indexweights= array ();
0449 $this->_ranker = SPH_RANK_PROXIMITY_BM25;
0450 $this->_maxquerytime= 0;
0451 $this->_fieldweights= array();
0452 $this->_overrides = array();
0453 $this->_select = "*";
0454
0455 $this->_error = ""; // per-reply fields (for single-query case)
0456 $this->_warning = "";
0457 $this->_connerror = false;
0458
0459 $this->_reqs = array (); // requests storage (for multi-query case)
0460 $this->_mbenc = "";
0461 $this->_arrayresult = false;
0462 $this->_timeout = 0;
0463 }
0464
0465 function __destruct()
0466 {
0467 if ( $this->_socket !== false )
0468 fclose ( $this->_socket );
0469 }
0470
0471 /// get last error message (string)
0472 function GetLastError ()
0473 {
0474 return $this->_error;
0475 }
0476
0477 /// get last warning message (string)
0478 function GetLastWarning ()
0479 {
0480 return $this->_warning;
0481 }
0482
0483 /// get last error flag (to tell network connection errors from searchd errors or broken responses)
0484 function IsConnectError()
0485 {
0486 return $this->_connerror;
0487 }
0488
0489 /// set searchd host name (string) and port (integer)
0490 function SetServer ( $host, $port = 0 )
0491 {
0492 assert ( is_string($host) );
0493 if ( $host[0] == '/')
0494 {
0495 $this->_path = 'unix://' . $host;
0496 return;
0497 }
0498 if ( substr ( $host, 0, 7 )=="unix://" )
0499 {
0500 $this->_path = $host;
0501 return;
0502 }
0503
0504 assert ( is_int($port) );
0505 $this->_host = $host;
0506 $this->_port = ( $port==0 ) ? 9312 : $port; // port 0 (the default argument) means "use the default searchd port"
0507 $this->_path = '';
0508
0509 }
0510
0511 /// set server connection timeout (0 to remove)
0512 function SetConnectTimeout ( $timeout )
0513 {
0514 assert ( is_numeric($timeout) );
0515 $this->_timeout = $timeout;
0516 }
0517
0518
0519 function _Send ( $handle, $data, $length )
0520 {
0521 if ( feof($handle) || fwrite ( $handle, $data, $length ) !== $length )
0522 {
0523 $this->_error = 'connection unexpectedly closed (timed out?)';
0524 $this->_connerror = true;
0525 return false;
0526 }
0527 return true;
0528 }
0529
0530 /////////////////////////////////////////////////////////////////////////////
0531
0532 /// enter mbstring workaround mode
0533 function _MBPush ()
0534 {
0535 $this->_mbenc = "";
0536 if ( ini_get ( "mbstring.func_overload" ) & 2 )
0537 {
0538 $this->_mbenc = mb_internal_encoding();
0539 mb_internal_encoding ( "latin1" );
0540 }
0541 }
0542
0543 /// leave mbstring workaround mode
0544 function _MBPop ()
0545 {
0546 if ( $this->_mbenc )
0547 mb_internal_encoding ( $this->_mbenc );
0548 }
0549
0550 /// connect to searchd server
0551 function _Connect ()
0552 {
0553 if ( $this->_socket!==false )
0554 {
0555 // we are in persistent connection mode, so we have a socket
0556 // however, need to check whether it's still alive
0557 if ( !@feof ( $this->_socket ) )
0558 return $this->_socket;
0559
0560 // force reopen
0561 $this->_socket = false;
0562 }
0563
0564 $errno = 0;
0565 $errstr = "";
0566 $this->_connerror = false;
0567
0568 if ( $this->_path )
0569 {
0570 $host = $this->_path;
0571 $port = 0;
0572 }
0573 else
0574 {
0575 $host = $this->_host;
0576 $port = $this->_port;
0577 }
0578
0579 if ( $this->_timeout<=0 )
0580 $fp = @fsockopen ( $host, $port, $errno, $errstr );
0581 else
0582 $fp = @fsockopen ( $host, $port, $errno, $errstr, $this->_timeout );
0583
0584 if ( !$fp )
0585 {
0586 if ( $this->_path )
0587 $location = $this->_path;
0588 else
0589 $location = "{$this->_host}:{$this->_port}";
0590
0591 $errstr = trim ( $errstr );
0592 $this->_error = "connection to $location failed (errno=$errno, msg=$errstr)";
0593 $this->_connerror = true;
0594 return false;
0595 }
0596
0597 // send my version
0598 // this is a subtle part. we must do it before (!) reading back from searchd.
0599 // because otherwise under some conditions (reported on FreeBSD for instance)
0600 // TCP stack could throttle write-write-read pattern because of Nagle.
0601 if ( !$this->_Send ( $fp, pack ( "N", 1 ), 4 ) )
0602 {
0603 fclose ( $fp );
0604 $this->_error = "failed to send client protocol version";
0605 return false;
0606 }
0607
0608 // check version
0609 list(,$v) = unpack ( "N*", fread ( $fp, 4 ) );
0610 $v = (int)$v;
0611 if ( $v<1 )
0612 {
0613 fclose ( $fp );
0614 $this->_error = "expected searchd protocol version 1+, got version '$v'";
0615 return false;
0616 }
0617
0618 return $fp;
0619 }
0620
0621 /// get and check response packet from searchd server
0622 function _GetResponse ( $fp, $client_ver )
0623 {
0624 $response = "";
0625 $len = 0;
0626
0627 $header = fread ( $fp, 8 );
0628 if ( strlen($header)==8 )
0629 {
0630 list ( $status, $ver, $len ) = array_values ( unpack ( "n2a/Nb", $header ) );
0631 $left = $len;
0632 while ( $left>0 && !feof($fp) )
0633 {
0634 $chunk = fread ( $fp, $left );
0635 if ( $chunk )
0636 {
0637 $response .= $chunk;
0638 $left -= strlen($chunk);
0639 }
0640 }
0641 }
0642 if ( $this->_socket === false )
0643 fclose ( $fp );
0644
0645 // check response
0646 $read = strlen ( $response );
0647 if ( !$response || $read!=$len )
0648 {
0649 $this->_error = $len
0650 ? "failed to read searchd response (status=$status, ver=$ver, len=$len, read=$read)"
0651 : "received zero-sized searchd response";
0652 return false;
0653 }
0654
0655 // check status
0656 if ( $status==SEARCHD_WARNING )
0657 {
0658 list(,$wlen) = unpack ( "N*", substr ( $response, 0, 4 ) );
0659 $this->_warning = substr ( $response, 4, $wlen );
0660 return substr ( $response, 4+$wlen );
0661 }
0662 if ( $status==SEARCHD_ERROR )
0663 {
0664 $this->_error = "searchd error: " . substr ( $response, 4 );
0665 return false;
0666 }
0667 if ( $status==SEARCHD_RETRY )
0668 {
0669 $this->_error = "temporary searchd error: " . substr ( $response, 4 );
0670 return false;
0671 }
0672 if ( $status!=SEARCHD_OK )
0673 {
0674 $this->_error = "unknown status code '$status'";
0675 return false;
0676 }
0677
0678 // check version
0679 if ( $ver<$client_ver )
0680 {
0681 $this->_warning = sprintf ( "searchd command v.%d.%d older than client's v.%d.%d, some options might not work",
0682 $ver>>8, $ver&0xff, $client_ver>>8, $client_ver&0xff );
0683 }
0684
0685 return $response;
0686 }
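The reply header that _GetResponse() reads is 8 bytes: a 16-bit status, a 16-bit command version and a 32-bit body length, all big-endian. A standalone sketch of that parsing step follows; `demo_parse_header` is a hypothetical helper and the sample header is built locally rather than read from a real searchd.

```php
<?php
// Hypothetical helper; the field layout follows the unpack() call
// in _GetResponse() above.
function demo_parse_header(string $header): array
{
    if (strlen($header) !== 8) {
        throw new InvalidArgumentException("header must be exactly 8 bytes");
    }
    return unpack("nstatus/nver/Nlen", $header);
}

// Sample header: status 0 (SEARCHD_OK), version 0x116, 42 body bytes.
$h = demo_parse_header(pack("nnN", 0, 0x116, 42));
printf("status=%d ver=%d.%d len=%d\n",
    $h["status"], $h["ver"] >> 8, $h["ver"] & 0xff, $h["len"]);
// status=0 ver=1.22 len=42
```

Named unpack keys make the three fields easier to follow than the positional `n2a/Nb` format, at the cost of differing from the upstream code.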
0687
0688 /////////////////////////////////////////////////////////////////////////////
0689 // searching
0690 /////////////////////////////////////////////////////////////////////////////
0691
0692 /// set offset and count into result set,
0693 /// and optionally set max-matches and cutoff limits
0694 function SetLimits ( $offset, $limit, $max=0, $cutoff=0 )
0695 {
0696 assert ( is_int($offset) );
0697 assert ( is_int($limit) );
0698 assert ( $offset>=0 );
0699 assert ( $limit>0 );
0700 assert ( $max>=0 );
0701 $this->_offset = $offset;
0702 $this->_limit = $limit;
0703 if ( $max>0 )
0704 $this->_maxmatches = $max;
0705 if ( $cutoff>0 )
0706 $this->_cutoff = $cutoff;
0707 }
0708
0709 /// set maximum query time, in milliseconds, per-index
0710 /// integer, 0 means "do not limit"
0711 function SetMaxQueryTime ( $max )
0712 {
0713 assert ( is_int($max) );
0714 assert ( $max>=0 );
0715 $this->_maxquerytime = $max;
0716 }
0717
0718 /// set matching mode
0719 function SetMatchMode ( $mode )
0720 {
0721 assert ( $mode==SPH_MATCH_ALL
0722 || $mode==SPH_MATCH_ANY
0723 || $mode==SPH_MATCH_PHRASE
0724 || $mode==SPH_MATCH_BOOLEAN
0725 || $mode==SPH_MATCH_EXTENDED
0726 || $mode==SPH_MATCH_FULLSCAN
0727 || $mode==SPH_MATCH_EXTENDED2 );
0728 $this->_mode = $mode;
0729 }
0730
0731 /// set ranking mode
0732 function SetRankingMode ( $ranker )
0733 {
0734 assert ( $ranker==SPH_RANK_PROXIMITY_BM25
0735 || $ranker==SPH_RANK_BM25
0736 || $ranker==SPH_RANK_NONE
0737 || $ranker==SPH_RANK_WORDCOUNT
0738 || $ranker==SPH_RANK_PROXIMITY || $ranker==SPH_RANK_MATCHANY || $ranker==SPH_RANK_FIELDMASK );
0739 $this->_ranker = $ranker;
0740 }
0741
0742 /// set matches sorting mode
0743 function SetSortMode ( $mode, $sortby="" )
0744 {
0745 assert (
0746 $mode==SPH_SORT_RELEVANCE ||
0747 $mode==SPH_SORT_ATTR_DESC ||
0748 $mode==SPH_SORT_ATTR_ASC ||
0749 $mode==SPH_SORT_TIME_SEGMENTS ||
0750 $mode==SPH_SORT_EXTENDED ||
0751 $mode==SPH_SORT_EXPR );
0752 assert ( is_string($sortby) );
0753 assert ( $mode==SPH_SORT_RELEVANCE || strlen($sortby)>0 );
0754
0755 $this->_sort = $mode;
0756 $this->_sortby = $sortby;
0757 }
0758
0759 /// bind per-field weights by order
0760 /// DEPRECATED; use SetFieldWeights() instead
0761 function SetWeights ( $weights )
0762 {
0763 assert ( is_array($weights) );
0764 foreach ( $weights as $weight )
0765 assert ( is_int($weight) );
0766
0767 $this->_weights = $weights;
0768 }
0769
0770 /// bind per-field weights by name
0771 function SetFieldWeights ( $weights )
0772 {
0773 assert ( is_array($weights) );
0774 foreach ( $weights as $name=>$weight )
0775 {
0776 assert ( is_string($name) );
0777 assert ( is_int($weight) );
0778 }
0779 $this->_fieldweights = $weights;
0780 }
0781
0782 /// bind per-index weights by name
0783 function SetIndexWeights ( $weights )
0784 {
0785 assert ( is_array($weights) );
0786 foreach ( $weights as $index=>$weight )
0787 {
0788 assert ( is_string($index) );
0789 assert ( is_int($weight) );
0790 }
0791 $this->_indexweights = $weights;
0792 }
0793
0794 /// set IDs range to match
0795 /// only match records if document ID is between $min and $max (inclusive)
0796 function SetIDRange ( $min, $max )
0797 {
0798 assert ( is_numeric($min) );
0799 assert ( is_numeric($max) );
0800 assert ( $min<=$max );
0801 $this->_min_id = $min;
0802 $this->_max_id = $max;
0803 }
0804
0805 /// set values set filter
0806 /// only match records where $attribute value is in given set
0807 function SetFilter ( $attribute, $values, $exclude=false )
0808 {
0809 assert ( is_string($attribute) );
0810 assert ( is_array($values) );
0811 assert ( count($values) );
0812
0813 if ( is_array($values) && count($values) )
0814 {
0815 foreach ( $values as $value )
0816 assert ( is_numeric($value) );
0817
0818 $this->_filters[] = array ( "type"=>SPH_FILTER_VALUES, "attr"=>$attribute, "exclude"=>$exclude, "values"=>$values );
0819 }
0820 }
0821
0822 /// set range filter
0823 /// only match records if $attribute value is between $min and $max (inclusive)
0824 function SetFilterRange ( $attribute, $min, $max, $exclude=false )
0825 {
0826 assert ( is_string($attribute) );
0827 assert ( is_numeric($min) );
0828 assert ( is_numeric($max) );
0829 assert ( $min<=$max );
0830
0831 $this->_filters[] = array ( "type"=>SPH_FILTER_RANGE, "attr"=>$attribute, "exclude"=>$exclude, "min"=>$min, "max"=>$max );
0832 }
0833
0834 /// set float range filter
0835 /// only match records if $attribute value is between $min and $max (inclusive)
0836 function SetFilterFloatRange ( $attribute, $min, $max, $exclude=false )
0837 {
0838 assert ( is_string($attribute) );
0839 assert ( is_float($min) );
0840 assert ( is_float($max) );
0841 assert ( $min<=$max );
0842
0843 $this->_filters[] = array ( "type"=>SPH_FILTER_FLOATRANGE, "attr"=>$attribute, "exclude"=>$exclude, "min"=>$min, "max"=>$max );
0844 }
0845
0846 /// setup anchor point for geosphere distance calculations
0847 /// required to use @geodist in filters and sorting
0848 /// latitude and longitude must be in radians
0849 function SetGeoAnchor ( $attrlat, $attrlong, $lat, $long )
0850 {
0851 assert ( is_string($attrlat) );
0852 assert ( is_string($attrlong) );
0853 assert ( is_float($lat) );
0854 assert ( is_float($long) );
0855
0856 $this->_anchor = array ( "attrlat"=>$attrlat, "attrlong"=>$attrlong, "lat"=>$lat, "long"=>$long );
0857 }
0858
0859 /// set grouping attribute and function
0860 function SetGroupBy ( $attribute, $func, $groupsort="@group desc" )
0861 {
0862 assert ( is_string($attribute) );
0863 assert ( is_string($groupsort) );
0864 assert ( $func==SPH_GROUPBY_DAY
0865 || $func==SPH_GROUPBY_WEEK
0866 || $func==SPH_GROUPBY_MONTH
0867 || $func==SPH_GROUPBY_YEAR
0868 || $func==SPH_GROUPBY_ATTR
0869 || $func==SPH_GROUPBY_ATTRPAIR );
0870
0871 $this->_groupby = $attribute;
0872 $this->_groupfunc = $func;
0873 $this->_groupsort = $groupsort;
0874 }
0875
0876 /// set count-distinct attribute for group-by queries
0877 function SetGroupDistinct ( $attribute )
0878 {
0879 assert ( is_string($attribute) );
0880 $this->_groupdistinct = $attribute;
0881 }
0882
0883 /// set distributed retries count and delay
0884 function SetRetries ( $count, $delay=0 )
0885 {
0886 assert ( is_int($count) && $count>=0 );
0887 assert ( is_int($delay) && $delay>=0 );
0888 $this->_retrycount = $count;
0889 $this->_retrydelay = $delay;
0890 }
0891
0892 /// set result set format (hash or array; hash by default)
0893 /// PHP specific; needed for group-by-MVA result sets that may contain duplicate IDs
0894 function SetArrayResult ( $arrayresult )
0895 {
0896 assert ( is_bool($arrayresult) );
0897 $this->_arrayresult = $arrayresult;
0898 }
0899
0900 /// set attribute values override
0901 /// there can be only one override per attribute
0902 /// $values must be a hash that maps document IDs to attribute values
0903 function SetOverride ( $attrname, $attrtype, $values )
0904 {
0905 assert ( is_string ( $attrname ) );
0906 assert ( in_array ( $attrtype, array ( SPH_ATTR_INTEGER, SPH_ATTR_TIMESTAMP, SPH_ATTR_BOOL, SPH_ATTR_FLOAT, SPH_ATTR_BIGINT ) ) );
0907 assert ( is_array ( $values ) );
0908
0909 $this->_overrides[$attrname] = array ( "attr"=>$attrname, "type"=>$attrtype, "values"=>$values );
0910 }
0911
0912 /// set select-list (attributes or expressions), SQL-like syntax
0913 function SetSelect ( $select )
0914 {
0915 assert ( is_string ( $select ) );
0916 $this->_select = $select;
0917 }
0918
0919 //////////////////////////////////////////////////////////////////////////////
0920
0921 /// clear all filters (for multi-queries)
0922 function ResetFilters ()
0923 {
0924 $this->_filters = array();
0925 $this->_anchor = array();
0926 }
0927
0928 /// clear groupby settings (for multi-queries)
0929 function ResetGroupBy ()
0930 {
0931 $this->_groupby = "";
0932 $this->_groupfunc = SPH_GROUPBY_DAY;
0933 $this->_groupsort = "@group desc";
0934 $this->_groupdistinct= "";
0935 }
0936
0937 /// clear all attribute value overrides (for multi-queries)
0938 function ResetOverrides ()
0939 {
0940 $this->_overrides = array ();
0941 }
0942
0943 //////////////////////////////////////////////////////////////////////////////
0944
0945 /// connect to searchd server, run given search query through given indexes,
0946 /// and return the search results
0947 function Query ( $query, $index="*", $comment="" )
0948 {
0949 assert ( empty($this->_reqs) );
0950
0951 $this->AddQuery ( $query, $index, $comment );
0952 $results = $this->RunQueries ();
0953 $this->_reqs = array (); // just in case it failed too early
0954
0955 if ( !is_array($results) )
0956 return false; // probably network error; error message should be already filled
0957
0958 $this->_error = $results[0]["error"];
0959 $this->_warning = $results[0]["warning"];
0960 if ( $results[0]["status"]==SEARCHD_ERROR )
0961 return false;
0962 else
0963 return $results[0];
0964 }
0965
0966 /// helper to pack floats in network byte order
0967 function _PackFloat ( $f )
0968 {
0969 $t1 = pack ( "f", $f ); // machine order
0970 list(,$t2) = unpack ( "L*", $t1 ); // int in machine order
0971 return pack ( "N", $t2 );
0972 }
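The three steps in _PackFloat() (pack in machine order, reread the same bytes as an unsigned 32-bit integer, write that integer big-endian) can be checked in isolation. This is a sketch with a hypothetical function name, `demo_pack_float`, assuming the usual IEEE 754 single-precision float layout.

```php
<?php
// Hypothetical standalone version of the _PackFloat() idea.
function demo_pack_float(float $f): string
{
    $raw = pack("f", $f);               // 4 bytes, machine byte order
    list(, $bits) = unpack("L", $raw);  // same bits as an unsigned 32-bit int
    return pack("N", $bits);            // rewritten in network byte order
}

echo bin2hex(demo_pack_float(1.0)), "\n";   // 3f800000
echo bin2hex(demo_pack_float(-0.5)), "\n";  // bf000000
```

On PHP 7.0.15 and later, `pack("G", $f)` produces a big-endian float directly; the round trip through "L" works on older versions too, which may be why the library is written this way.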
0973
0974 /// add query to multi-query batch
0975 /// returns index into results array from RunQueries() call
0976 function AddQuery ( $query, $index="*", $comment="" )
0977 {
0978 // mbstring workaround
0979 $this->_MBPush ();
0980
0981 // build request
0982 $req = pack ( "NNNNN", $this->_offset, $this->_limit, $this->_mode, $this->_ranker, $this->_sort ); // mode and limits
0983 $req .= pack ( "N", strlen($this->_sortby) ) . $this->_sortby;
0984 $req .= pack ( "N", strlen($query) ) . $query; // query itself
0985 $req .= pack ( "N", count($this->_weights) ); // weights
0986 foreach ( $this->_weights as $weight )
0987 $req .= pack ( "N", (int)$weight );
0988 $req .= pack ( "N", strlen($index) ) . $index; // indexes
0989 $req .= pack ( "N", 1 ); // id64 range marker
        $req .= sphPackU64 ( $this->_min_id ) . sphPackU64 ( $this->_max_id ); // id64 range

        // filters
        $req .= pack ( "N", count($this->_filters) );
        foreach ( $this->_filters as $filter )
        {
            $req .= pack ( "N", strlen($filter["attr"]) ) . $filter["attr"];
            $req .= pack ( "N", $filter["type"] );
            switch ( $filter["type"] )
            {
                case SPH_FILTER_VALUES:
                    $req .= pack ( "N", count($filter["values"]) );
                    foreach ( $filter["values"] as $value )
                        $req .= sphPackI64 ( $value );
                    break;

                case SPH_FILTER_RANGE:
                    $req .= sphPackI64 ( $filter["min"] ) . sphPackI64 ( $filter["max"] );
                    break;

                case SPH_FILTER_FLOATRANGE:
                    $req .= $this->_PackFloat ( $filter["min"] ) . $this->_PackFloat ( $filter["max"] );
                    break;

                default:
                    assert ( 0 && "internal error: unhandled filter type" );
            }
            $req .= pack ( "N", $filter["exclude"] );
        }

        // group-by clause, max-matches count, group-sort clause, cutoff count
        $req .= pack ( "NN", $this->_groupfunc, strlen($this->_groupby) ) . $this->_groupby;
        $req .= pack ( "N", $this->_maxmatches );
        $req .= pack ( "N", strlen($this->_groupsort) ) . $this->_groupsort;
        $req .= pack ( "NNN", $this->_cutoff, $this->_retrycount, $this->_retrydelay );
        $req .= pack ( "N", strlen($this->_groupdistinct) ) . $this->_groupdistinct;

        // anchor point
        if ( empty($this->_anchor) )
        {
            $req .= pack ( "N", 0 );
        } else
        {
            $a =& $this->_anchor;
            $req .= pack ( "N", 1 );
            $req .= pack ( "N", strlen($a["attrlat"]) ) . $a["attrlat"];
            $req .= pack ( "N", strlen($a["attrlong"]) ) . $a["attrlong"];
            $req .= $this->_PackFloat ( $a["lat"] ) . $this->_PackFloat ( $a["long"] );
        }

        // per-index weights
        $req .= pack ( "N", count($this->_indexweights) );
        foreach ( $this->_indexweights as $idx=>$weight )
            $req .= pack ( "N", strlen($idx) ) . $idx . pack ( "N", $weight );

        // max query time
        $req .= pack ( "N", $this->_maxquerytime );

        // per-field weights
        $req .= pack ( "N", count($this->_fieldweights) );
        foreach ( $this->_fieldweights as $field=>$weight )
            $req .= pack ( "N", strlen($field) ) . $field . pack ( "N", $weight );

        // comment
        $req .= pack ( "N", strlen($comment) ) . $comment;

        // attribute overrides
        $req .= pack ( "N", count($this->_overrides) );
        foreach ( $this->_overrides as $key => $entry )
        {
            $req .= pack ( "N", strlen($entry["attr"]) ) . $entry["attr"];
            $req .= pack ( "NN", $entry["type"], count($entry["values"]) );
            foreach ( $entry["values"] as $id=>$val )
            {
                assert ( is_numeric($id) );
                assert ( is_numeric($val) );

                $req .= sphPackU64 ( $id );
                switch ( $entry["type"] )
                {
                    case SPH_ATTR_FLOAT:  $req .= $this->_PackFloat ( $val ); break;
                    case SPH_ATTR_BIGINT: $req .= sphPackI64 ( $val ); break;
                    default:              $req .= pack ( "N", $val ); break;
                }
            }
        }

        // select-list
        $req .= pack ( "N", strlen($this->_select) ) . $this->_select;

        // mbstring workaround
        $this->_MBPop ();

        // store request to requests array
        $this->_reqs[] = $req;
        return count($this->_reqs)-1;
    }

    /// connect to searchd, run queries batch, and return an array of result sets
    function RunQueries ()
    {
        if ( empty($this->_reqs) )
        {
            $this->_error = "no queries defined, issue AddQuery() first";
            return false;
        }

        // mbstring workaround
        $this->_MBPush ();

        if (!( $fp = $this->_Connect() ))
        {
            $this->_MBPop ();
            return false;
        }

        // send query, get response
        $nreqs = count($this->_reqs);
        $req = join ( "", $this->_reqs );
        $len = 4+strlen($req);
        $req = pack ( "nnNN", SEARCHD_COMMAND_SEARCH, VER_COMMAND_SEARCH, $len, $nreqs ) . $req; // add header

        if ( !( $this->_Send ( $fp, $req, $len+8 ) ) ||
             !( $response = $this->_GetResponse ( $fp, VER_COMMAND_SEARCH ) ) )
        {
            $this->_MBPop ();
            return false;
        }

        // query sent ok; we can reset reqs now
        $this->_reqs = array ();

        // parse and return response
        return $this->_ParseSearchResponse ( $response, $nreqs );
    }

    /// parse and return search query (or queries) response
    function _ParseSearchResponse ( $response, $nreqs )
    {
        $p = 0; // current position
        $max = strlen($response); // max position for checks, to protect against broken responses

        $results = array ();
        for ( $ires=0; $ires<$nreqs && $p<$max; $ires++ )
        {
            $results[] = array();
            $result =& $results[$ires];

            $result["error"] = "";
            $result["warning"] = "";

            // extract status
            list(,$status) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
            $result["status"] = $status;
            if ( $status!=SEARCHD_OK )
            {
                list(,$len) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                $message = substr ( $response, $p, $len ); $p += $len;

                if ( $status==SEARCHD_WARNING )
                {
                    $result["warning"] = $message;
                } else
                {
                    $result["error"] = $message;
                    continue;
                }
            }

            // read schema
            $fields = array ();
            $attrs = array ();

            list(,$nfields) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
            while ( $nfields-->0 && $p<$max )
            {
                list(,$len) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                $fields[] = substr ( $response, $p, $len ); $p += $len;
            }
            $result["fields"] = $fields;

            list(,$nattrs) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
            while ( $nattrs-->0 && $p<$max )
            {
                list(,$len) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                $attr = substr ( $response, $p, $len ); $p += $len;
                list(,$type) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                $attrs[$attr] = $type;
            }
            $result["attrs"] = $attrs;

            // read match count
            list(,$count) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
            list(,$id64) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;

            // read matches
            $idx = -1;
            while ( $count-->0 && $p<$max )
            {
                // index into result array
                $idx++;

                // parse document id and weight
                if ( $id64 )
                {
                    $doc = sphUnpackU64 ( substr ( $response, $p, 8 ) ); $p += 8;
                    list(,$weight) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                }
                else
                {
                    list ( $doc, $weight ) = array_values ( unpack ( "N*N*",
                        substr ( $response, $p, 8 ) ) );
                    $p += 8;
                    $doc = sphFixUint($doc);
                }
                $weight = sprintf ( "%u", $weight );

                // create match entry
                if ( $this->_arrayresult )
                    $result["matches"][$idx] = array ( "id"=>$doc, "weight"=>$weight );
                else
                    $result["matches"][$doc]["weight"] = $weight;

                // parse and create attributes
                $attrvals = array ();
                foreach ( $attrs as $attr=>$type )
                {
                    // handle 64bit ints
                    if ( $type==SPH_ATTR_BIGINT )
                    {
                        $attrvals[$attr] = sphUnpackI64 ( substr ( $response, $p, 8 ) ); $p += 8;
                        continue;
                    }

                    // handle floats
                    if ( $type==SPH_ATTR_FLOAT )
                    {
                        list(,$uval) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                        list(,$fval) = unpack ( "f*", pack ( "L", $uval ) );
                        $attrvals[$attr] = $fval;
                        continue;
                    }

                    // handle everything else as unsigned ints
                    list(,$val) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                    if ( $type & SPH_ATTR_MULTI )
                    {
                        $attrvals[$attr] = array ();
                        $nvalues = $val;
                        while ( $nvalues-->0 && $p<$max )
                        {
                            list(,$val) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                            $attrvals[$attr][] = sphFixUint($val);
                        }
                    } else
                    {
                        $attrvals[$attr] = sphFixUint($val);
                    }
                }

                if ( $this->_arrayresult )
                    $result["matches"][$idx]["attrs"] = $attrvals;
                else
                    $result["matches"][$doc]["attrs"] = $attrvals;
            }

            list ( $total, $total_found, $msecs, $words ) =
                array_values ( unpack ( "N*N*N*N*", substr ( $response, $p, 16 ) ) );
            $result["total"] = sprintf ( "%u", $total );
            $result["total_found"] = sprintf ( "%u", $total_found );
            $result["time"] = sprintf ( "%.3f", $msecs/1000 );
            $p += 16;

            while ( $words-->0 && $p<$max )
            {
                list(,$len) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                $word = substr ( $response, $p, $len ); $p += $len;
                list ( $docs, $hits ) = array_values ( unpack ( "N*N*", substr ( $response, $p, 8 ) ) ); $p += 8;
                $result["words"][$word] = array (
                    "docs"=>sprintf ( "%u", $docs ),
                    "hits"=>sprintf ( "%u", $hits ) );
            }
        }

        $this->_MBPop ();
        return $results;
    }

    /////////////////////////////////////////////////////////////////////////////
    // excerpts generation
    /////////////////////////////////////////////////////////////////////////////

    /// connect to searchd server, and generate excerpts (snippets)
    /// of given documents for given query. returns false on failure,
    /// an array of snippets on success
    function BuildExcerpts ( $docs, $index, $words, $opts=array() )
    {
        assert ( is_array($docs) );
        assert ( is_string($index) );
        assert ( is_string($words) );
        assert ( is_array($opts) );

        $this->_MBPush ();

        if (!( $fp = $this->_Connect() ))
        {
            $this->_MBPop();
            return false;
        }

        /////////////////
        // fixup options
        /////////////////

        if ( !isset($opts["before_match"]) )    $opts["before_match"] = "<b>";
        if ( !isset($opts["after_match"]) )     $opts["after_match"] = "</b>";
        if ( !isset($opts["chunk_separator"]) ) $opts["chunk_separator"] = " ... ";
        if ( !isset($opts["limit"]) )           $opts["limit"] = 256;
        if ( !isset($opts["around"]) )          $opts["around"] = 5;
        if ( !isset($opts["exact_phrase"]) )    $opts["exact_phrase"] = false;
        if ( !isset($opts["single_passage"]) )  $opts["single_passage"] = false;
        if ( !isset($opts["use_boundaries"]) )  $opts["use_boundaries"] = false;
        if ( !isset($opts["weight_order"]) )    $opts["weight_order"] = false;

        /////////////////
        // build request
        /////////////////

        // v.1.0 req
        $flags = 1; // remove spaces
        if ( $opts["exact_phrase"] )   $flags |= 2;
        if ( $opts["single_passage"] ) $flags |= 4;
        if ( $opts["use_boundaries"] ) $flags |= 8;
        if ( $opts["weight_order"] )   $flags |= 16;
        $req = pack ( "NN", 0, $flags ); // mode=0, flags=$flags
        $req .= pack ( "N", strlen($index) ) . $index; // req index
        $req .= pack ( "N", strlen($words) ) . $words; // req words

        // options
        $req .= pack ( "N", strlen($opts["before_match"]) ) . $opts["before_match"];
        $req .= pack ( "N", strlen($opts["after_match"]) ) . $opts["after_match"];
        $req .= pack ( "N", strlen($opts["chunk_separator"]) ) . $opts["chunk_separator"];
        $req .= pack ( "N", (int)$opts["limit"] );
        $req .= pack ( "N", (int)$opts["around"] );

        // documents
        $req .= pack ( "N", count($docs) );
        foreach ( $docs as $doc )
        {
            assert ( is_string($doc) );
            $req .= pack ( "N", strlen($doc) ) . $doc;
        }

        ////////////////////////////
        // send query, get response
        ////////////////////////////

        $len = strlen($req);
        $req = pack ( "nnN", SEARCHD_COMMAND_EXCERPT, VER_COMMAND_EXCERPT, $len ) . $req; // add header
        if ( !( $this->_Send ( $fp, $req, $len+8 ) ) ||
             !( $response = $this->_GetResponse ( $fp, VER_COMMAND_EXCERPT ) ) )
        {
            $this->_MBPop ();
            return false;
        }

        //////////////////
        // parse response
        //////////////////

        $pos = 0;
        $res = array ();
        $rlen = strlen($response);
        for ( $i=0; $i<count($docs); $i++ )
        {
            list(,$len) = unpack ( "N*", substr ( $response, $pos, 4 ) );
            $pos += 4;

            if ( $pos+$len > $rlen )
            {
                $this->_error = "incomplete reply";
                $this->_MBPop ();
                return false;
            }
            $res[] = $len ? substr ( $response, $pos, $len ) : "";
            $pos += $len;
        }

        $this->_MBPop ();
        return $res;
    }


    /////////////////////////////////////////////////////////////////////////////
    // keyword generation
    /////////////////////////////////////////////////////////////////////////////

    /// connect to searchd server, and generate keyword list for a given query
    /// returns false on failure,
    /// an array of words on success
    function BuildKeywords ( $query, $index, $hits )
    {
        assert ( is_string($query) );
        assert ( is_string($index) );
        assert ( is_bool($hits) );

        $this->_MBPush ();

        if (!( $fp = $this->_Connect() ))
        {
            $this->_MBPop();
            return false;
        }

        /////////////////
        // build request
        /////////////////

        // v.1.0 req
        $req = pack ( "N", strlen($query) ) . $query; // req query
        $req .= pack ( "N", strlen($index) ) . $index; // req index
        $req .= pack ( "N", (int)$hits );

        ////////////////////////////
        // send query, get response
        ////////////////////////////

        $len = strlen($req);
        $req = pack ( "nnN", SEARCHD_COMMAND_KEYWORDS, VER_COMMAND_KEYWORDS, $len ) . $req; // add header
        if ( !( $this->_Send ( $fp, $req, $len+8 ) ) ||
             !( $response = $this->_GetResponse ( $fp, VER_COMMAND_KEYWORDS ) ) )
        {
            $this->_MBPop ();
            return false;
        }

        //////////////////
        // parse response
        //////////////////

        $pos = 0;
        $res = array ();
        $rlen = strlen($response);
        list(,$nwords) = unpack ( "N*", substr ( $response, $pos, 4 ) );
        $pos += 4;
        for ( $i=0; $i<$nwords; $i++ )
        {
            list(,$len) = unpack ( "N*", substr ( $response, $pos, 4 ) ); $pos += 4;
            $tokenized = $len ? substr ( $response, $pos, $len ) : "";
            $pos += $len;

            list(,$len) = unpack ( "N*", substr ( $response, $pos, 4 ) ); $pos += 4;
            $normalized = $len ? substr ( $response, $pos, $len ) : "";
            $pos += $len;

            $res[] = array ( "tokenized"=>$tokenized, "normalized"=>$normalized );

            if ( $hits )
            {
                list($ndocs,$nhits) = array_values ( unpack ( "N*N*", substr ( $response, $pos, 8 ) ) );
                $pos += 8;
                $res [$i]["docs"] = $ndocs;
                $res [$i]["hits"] = $nhits;
            }

            if ( $pos > $rlen )
            {
                $this->_error = "incomplete reply";
                $this->_MBPop ();
                return false;
            }
        }

        $this->_MBPop ();
        return $res;
    }

    /// escape characters that have a special meaning in the query syntax
    function EscapeString ( $string )
    {
        $from = array ( '\\', '(',')','|','-','!','@','~','"','&', '/', '^', '$', '=' );
        $to   = array ( '\\\\', '\(','\)','\|','\-','\!','\@','\~','\"', '\&', '\/', '\^', '\$', '\=' );

        return str_replace ( $from, $to, $string );
    }

    /////////////////////////////////////////////////////////////////////////////
    // attribute updates
    /////////////////////////////////////////////////////////////////////////////

    /// batch update given attributes in given rows in given indexes
    /// returns amount of updated documents (0 or more) on success, or -1 on failure
    function UpdateAttributes ( $index, $attrs, $values, $mva=false )
    {
        // verify everything
        assert ( is_string($index) );
        assert ( is_bool($mva) );

        assert ( is_array($attrs) );
        foreach ( $attrs as $attr )
            assert ( is_string($attr) );

        assert ( is_array($values) );
        foreach ( $values as $id=>$entry )
        {
            assert ( is_numeric($id) );
            assert ( is_array($entry) );
            assert ( count($entry)==count($attrs) );
            foreach ( $entry as $v )
            {
                if ( $mva )
                {
                    assert ( is_array($v) );
                    foreach ( $v as $vv )
                        assert ( is_int($vv) );
                } else
                    assert ( is_int($v) );
            }
        }

        // mbstring workaround, so that strlen() counts bytes as in the other request builders
        $this->_MBPush ();

        // build request
        $req = pack ( "N", strlen($index) ) . $index;

        $req .= pack ( "N", count($attrs) );
        foreach ( $attrs as $attr )
        {
            $req .= pack ( "N", strlen($attr) ) . $attr;
            $req .= pack ( "N", $mva ? 1 : 0 );
        }

        $req .= pack ( "N", count($values) );
        foreach ( $values as $id=>$entry )
        {
            $req .= sphPackU64 ( $id );
            foreach ( $entry as $v )
            {
                $req .= pack ( "N", $mva ? count($v) : $v );
                if ( $mva )
                    foreach ( $v as $vv )
                        $req .= pack ( "N", $vv );
            }
        }

        // connect, send query, get response
        if (!( $fp = $this->_Connect() ))
        {
            $this->_MBPop ();
            return -1;
        }

        $len = strlen($req);
        $req = pack ( "nnN", SEARCHD_COMMAND_UPDATE, VER_COMMAND_UPDATE, $len ) . $req; // add header
        if ( !$this->_Send ( $fp, $req, $len+8 ) )
        {
            $this->_MBPop ();
            return -1;
        }

        if (!( $response = $this->_GetResponse ( $fp, VER_COMMAND_UPDATE ) ))
        {
            $this->_MBPop ();
            return -1;
        }

        // parse response
        list(,$updated) = unpack ( "N*", substr ( $response, 0, 4 ) );
        $this->_MBPop ();
        return $updated;
    }

    /////////////////////////////////////////////////////////////////////////////
    // persistent connections
    /////////////////////////////////////////////////////////////////////////////

    function Open()
    {
        if ( $this->_socket !== false )
        {
            $this->_error = 'already connected';
            return false;
        }
        if ( !$fp = $this->_Connect() )
            return false;

        // command, command version = 0, body length = 4, body = 1
        $req = pack ( "nnNN", SEARCHD_COMMAND_PERSIST, 0, 4, 1 );
        if ( !$this->_Send ( $fp, $req, 12 ) )
            return false;

        $this->_socket = $fp;
        return true;
    }

    function Close()
    {
        if ( $this->_socket === false )
        {
            $this->_error = 'not connected';
            return false;
        }

        fclose ( $this->_socket );
        $this->_socket = false;

        return true;
    }

    //////////////////////////////////////////////////////////////////////////
    // status
    //////////////////////////////////////////////////////////////////////////

    function Status ()
    {
        $this->_MBPush ();
        if (!( $fp = $this->_Connect() ))
        {
            $this->_MBPop();
            return false;
        }

        $req = pack ( "nnNN", SEARCHD_COMMAND_STATUS, VER_COMMAND_STATUS, 4, 1 ); // len=4, body=1
        if ( !( $this->_Send ( $fp, $req, 12 ) ) ||
             !( $response = $this->_GetResponse ( $fp, VER_COMMAND_STATUS ) ) )
        {
            $this->_MBPop ();
            return false;
        }

        $res = substr ( $response, 4 ); // just ignore length, error handling, etc
        $p = 0;
        list ( $rows, $cols ) = array_values ( unpack ( "N*N*", substr ( $response, $p, 8 ) ) ); $p += 8;

        $res = array();
        for ( $i=0; $i<$rows; $i++ )
            for ( $j=0; $j<$cols; $j++ )
            {
                list(,$len) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
                $res[$i][] = substr ( $response, $p, $len ); $p += $len;
            }

        $this->_MBPop ();
        return $res;
    }
}

//
// $Id: sphinxapi.php 2055 2009-11-06 23:09:58Z shodan $
//

Test controller (/application/controllers/search_page.php)

<?php if ( ! defined('BASEPATH')) die('No Access');

class Search_page extends CI_Controller {

    public function __construct() {
        parent::__construct();
    }

    // Show the search form.
    public function search() {
        $this->load->helper('url');
        $this->load->view('search');
    }

    // Query searchd and print the highlighted results.
    public function result() {
        header('content-type: text/html;charset=utf-8');
        // get() returns FALSE (CodeIgniter 2.x) or NULL (3.x) when the parameter is missing
        $words = $this->input->get('words');
        if ( ! is_string($words)) $words = '';
        $this->load->library('sphinx_client', NULL, 'sphinx');
        $index = "test1";
        $opts = array(
            "before_match"    => '<span style="color:red;">',
            "after_match"     => "</span>",
            "chunk_separator" => " ... ",
            "limit"           => 60,
            "around"          => 3,
        );
        $this->sphinx->SetServer('192.168.23.128', 9312);
        $this->sphinx->SetConnectTimeout(3);
        $this->sphinx->SetArrayResult(TRUE);
        $this->sphinx->SetMatchMode(SPH_MATCH_ANY);
        $this->sphinx->SetLimits(0, 20);

        $res = $this->sphinx->Query($words, $index);
        if ($res === FALSE) {
            var_dump($this->sphinx->GetLastError());
            exit;
        }

        // escape the user-supplied keyword before echoing it into the page
        $safe_words = htmlspecialchars($words, ENT_QUOTES, 'UTF-8');
        echo "Keyword <b>{$safe_words}</b>: about <b>{$res['total_found']}</b> results found in <b>{$res['time']}</b>s";
        echo '<br/><hr/><br/>';
        // per-word statistics: word : matching documents - total hits
        if (array_key_exists('words', $res) && is_array($res['words'])) {
            foreach ($res['words'] as $k => $v) {
                echo htmlspecialchars($k, ENT_QUOTES, 'UTF-8') . ' : ' . $v['docs'] . ' - ' . $v['hits'] . '<br/>';
            }
        }
        echo '<br/><hr/><br/>';
        $this->load->database();
        // collect the ids of the matched documents
        $idarr = array();
        if (array_key_exists('matches', $res) && is_array($res['matches'])) {
            foreach ($res['matches'] as $v) {
                $idarr[] = $v['id'];
            }
        }
        if (count($idarr) > 0) {
            $this->db->from('shop_goods_info');
            $this->db->select('pname,cretime');
            $this->db->where_in('id', $idarr);
            $result = $this->db->get()->result_array();
            echo '<ul>';
            $name_arr = array();
            foreach ($result as $k => $v) {
                $name_arr[$k] = $v['pname'];
            }
            // BuildExcerpts() returns FALSE on failure; fall back to the plain names
            $excerpts = $this->sphinx->BuildExcerpts($name_arr, $index, $words, $opts);
            if ($excerpts !== FALSE) $name_arr = $excerpts;
            foreach ($result as $k => $v) {
                echo '<li>' . $name_arr[$k] . '(' . date('Y-m-d H:i:s', $v['cretime']) . ')</li>';
            }
            echo '</ul>';
        }
        // Close() is omitted: it only applies to a persistent connection created with Open()
    }

}
?>

Search form (/application/views/search.php)

<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="content-type" content="text/html;charset=utf-8" />
    <title>Search</title>
    <meta name="keywords" content="keywords" />
    <meta name="description" content="description" />
    <style type="text/css">
        #panel {
            margin:20px;
        }
    </style>
</head>
<body>
    <div id="panel">
        <form name="form" method="get" action="<?php echo site_url(array('search_page','result')); ?>">
            <label for="words">Keyword:</label>
            <input type="text" id="words" name="words" value="" size="60" />
            <input type="submit" name="submit" value="Search" />
        </form>
    </div>
</body>
</html>
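
One detail of the result() action is worth a note: it loads the matched rows with where_in('id', ...), and SQL does not promise to return them in the order of the id list, so the relevance ranking computed by Sphinx can be lost on the page. Below is a minimal sketch of one way to restore it. The helper order_rows_by_ids() is hypothetical (it is not part of sphinxapi.php or CodeIgniter), and it assumes the id column is added to the select() list so that each row carries its id.

```php
<?php
/**
 * Reorder database rows so they follow the ranking of the Sphinx matches.
 * Hypothetical helper for illustration; not part of sphinxapi.php or CodeIgniter.
 *
 * @param array  $rows rows as returned by result_array(); each must contain $key
 * @param array  $ids  document ids in Sphinx ranking order
 * @param string $key  name of the id column in each row
 * @return array rows sorted by ranking; ids without a matching row are skipped
 */
function order_rows_by_ids(array $rows, array $ids, $key = 'id')
{
    // Index the rows by id so each lookup is a single array access.
    $by_id = array();
    foreach ($rows as $row) {
        $by_id[(string) $row[$key]] = $row;
    }

    // Walk the ids in ranking order and pick up the matching rows.
    $ordered = array();
    foreach ($ids as $id) {
        $k = (string) $id;
        if (isset($by_id[$k])) {
            $ordered[] = $by_id[$k];
        }
    }
    return $ordered;
}

// Sample data: Sphinx ranked id 7 first, but the rows came back in primary-key order.
$ids  = array(7, 3, 5);
$rows = array(
    array('id' => 3, 'pname' => 'B'),
    array('id' => 5, 'pname' => 'C'),
    array('id' => 7, 'pname' => 'A'),
);
$ordered = order_rows_by_ids($rows, $ids);
```

In the controller this would be called as order_rows_by_ids($result, $idarr) right after result_array(), before the names are passed to BuildExcerpts(). Rows deleted from the database since the last indexing run are simply skipped.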