William Jiang

JavaScript,PHP,Node,Perl,LAMP Web Developer – http://williamjxj.com; https://github.com/williamjxj?tab=repositories

Setup coreseek_4.1-sphinx_2.0.1

Steps to set up coreseek_4.1-sphinx_2.0.1

Coreseek is a Chinese-language version of Sphinx that can search Chinese words: full-text search software for Chinese, an improvement on Sphinx for Chinese text.
Here I list the core steps to set up coreseek_4.1-sphinx_2.0.1, which worked successfully on a CentOS 6.2 server:
1. Dump the MySQL table from the production environment to the development environment for testing:

$ mysqldump  --databases production --tables contents | mysql -D development

2. Edit $HOME/etc/coreseek_sphinx.conf to set up the coreseek index.
Note: the sph_counter table is optional. If it is used, it must be created first, with two simple int columns, counter_id and max_id.
This table is used when processing the existing data; for data that arrives later, the setting here is ignored.


source contents {
	type	= mysql
	...	
	sql_query_pre		= SET NAMES 'utf8'
	sql_query_pre		= SET SESSION query_cache_type=OFF
	sql_query_pre		= REPLACE INTO sph_counter SELECT 1, MAX(cid) FROM contents
	sql_query_range		= SELECT MIN(cid), MAX(cid) FROM contents
	sql_query		= SELECT * FROM contents WHERE cid >= $start AND cid <= $end
	sql_attr_uint 		= cate_id
	sql_attr_uint	 	= iid
	sql_attr_str2ordinal	= language
	sql_attr_str2ordinal	= createdby
	sql_attr_timestamp	= pubdate
}
...
index contents {
	source			= contents
	path			= /var/data/demo/contents
	min_word_len		= 3
	charset_type		= zh_cn.utf-8
	charset_dictpath	= /usr/local/mmseg/etc/
	stopwords		= /usr/local/mmseg/etc/stopwords.txt
}
searchd {
	port			= 9312
	log			= /var/log/demo/searchd.log
	query_log		= /var/log/demo/query.log
	pid_file		= /var/log/demo/searchd.pid
}
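
As a rough sketch of the optional sph_counter table described in step 2, the shell snippet below writes a small SQL file that creates it with the two int columns counter_id and max_id. The file name sph_counter.sql, the INT column types, the primary key, and loading it into the development database are assumptions for illustration, not details taken from the original setup.

```shell
# Sketch: generate the DDL for the optional sph_counter table.
# Assumption: the file name, INT types and primary key are illustrative.
cat > sph_counter.sql <<'SQL'
CREATE TABLE IF NOT EXISTS sph_counter (
    counter_id INT NOT NULL PRIMARY KEY,
    max_id     INT NOT NULL
);
SQL

# To load it (not run here; assumes the 'development' database from step 1):
#   mysql -D development < sph_counter.sql
```

The primary key on counter_id is what lets the REPLACE INTO statement in the config overwrite row 1 on each indexer run instead of adding a new row.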

3. Create the two directories referenced in the coreseek_sphinx.conf file above.

$ sudo mkdir -p /var/data/demo /var/log/demo

4. Run the coreseek ‘indexer’ to build the index for the MySQL table ‘contents’.
Then start the ‘searchd’ daemon, which serves the PHP scripts that call the Sphinx API (sphinxapi.php).

$ sudo /usr/local/coreseek/bin/indexer -c $HOME/etc/coreseek_sphinx.conf contents
  
# first, check whether port 9312 is already in use:
$ netstat -ant | grep 9312

# if nothing is listening on the port, start the daemon:
$ sudo /usr/local/coreseek/bin/searchd -c $HOME/etc/coreseek_sphinx.conf

# double-check that the daemon is running:
$ ps -ef | grep searchd | grep -v grep
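
The plain grep for 9312 in the port check can also match unrelated lines, such as port 19312 or a remote address. A slightly stricter check might look like the sketch below. The function name port_listening is hypothetical, and the code assumes the usual Linux netstat -ant column layout (local address in column 4, state in column 6).

```shell
# port_listening PORT
# Reads `netstat -ant`-style output on stdin and succeeds (exit 0)
# only if some socket is in LISTEN state on exactly that local port.
port_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

For example: netstat -ant | port_listening 9312 && echo "port 9312 is already in use"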

5. Appendix

Here is a very good example:

url: http://www.shroomery.org/forums/search.php
source: http://www.shroomery.org/forums/dosearch.php.txt

The Sphinx PHP API, which Coreseek inherits, is documented at:
Sphinx for PHP: http://php.net/manual/en/sphinx.examples.php
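
Before wiring up the PHP API, it can help to run a quick test query from the command line. The sketch below assumes the Coreseek install also ships the Sphinx 'search' utility next to indexer and searchd (as Sphinx 2.0.x did); that tool reads the index files directly rather than going through searchd. The function name query_contents and the SEARCH_BIN and CONF variables are hypothetical.

```shell
# Assumption: the 'search' tool lives beside indexer/searchd; override
# SEARCH_BIN or CONF if your layout differs.
SEARCH_BIN="${SEARCH_BIN:-/usr/local/coreseek/bin/search}"
CONF="${CONF:-$HOME/etc/coreseek_sphinx.conf}"

# query_contents WORD...
# Runs a test query against the 'contents' index built in step 4.
query_contents() {
    "$SEARCH_BIN" -c "$CONF" -i contents "$@"
}
```

For example: query_contents sphinx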
