User:Daveh/Lucene Search
The UESPWiki – Your source for The Elder Scrolls since 1995
This article details the experience of installing and configuring Lucene search for the wiki.
Dev Wiki Installation
- Lucene Installation
- Change all instances of uesp_net_wiki5 to uesp_net_wikidev in LocalSettings.php on dev.uesp.net (prevents issues later on).
- Download Lucene Search 2.1.3 and uncompress.
- Run: ./configure /home/uespdev/www/w
- Edit lsearch-global.conf to look like the following:
    [Database]
    uesp_net_wikidev : (single) (spell,4,2) (prefix) (language,en)
- Update the [Namespace-Prefix] section in lsearch-global.conf to include:
    [100] : 100
    [101] : 101
    [102] : 102
    [103] : 103
    [104] : 104
    [105] : 105
    [106] : 106
    [107] : 107
    [108] : 108
    [109] : 109
    [110] : 110
    [111] : 111
    [112] : 112
    [113] : 113
    [114] : 114
    [115] : 115
    [116] : 116
    [117] : 117
    [118] : 118
    [119] : 119
    [120] : 120
    [121] : 121
    [122] : 122
    [123] : 123
    [124] : 124
    [125] : 125
    [126] : 126
    [127] : 127
    [128] : 128
    [129] : 129
    [130] : 130
    [131] : 131
    [132] : 132
    [133] : 133
    [134] : 134
    [135] : 135
    [136] : 136
    [137] : 137
    [138] : 138
    [139] : 139
    [140] : 140
    [141] : 141
    [142] : 142
    [143] : 143
    [144] : 144
    [145] : 145
    [146] : 146
    [147] : 147
    all_talk : 1,3,5,7,9,11,13,15,103,105,107,109,111,113,115,117,119,121,123,125,127,129,131,133,135,137,139,141,143,145
    all_content : 0,102,104,106,108,110,112,114,116,118,120,122,124,126,128,130,132,134,136,138,140,142,144,146
- Run ./build to build the index (run time = 25 min).
- Start the Lucene daemon and note any error messages (./lsearchd &).
- Check that Lucene is listening on port 8123 via netstat.
- Run a local test search via: wget http://content3.uesp.net:8123/search/uesp_net_wikidev/test.
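The last two checks (port 8123 listening, then a test search) can also be scripted. The following is a minimal sketch, not part of the original setup: the helper names `is_listening` and `search_url` are hypothetical, and the URL layout simply mirrors the wget example above.

```python
import socket
from urllib.parse import quote


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def search_url(host, port, dbname, query):
    """Build a test-search URL in the same form as the wget example."""
    return f"http://{host}:{port}/search/{dbname}/{quote(query)}"
```

For example, `is_listening("content3.uesp.net", 8123)` should return True once the daemon is up, and `search_url("content3.uesp.net", 8123, "uesp_net_wikidev", "test")` reproduces the URL used in the wget test.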
- MWSearch Installation
- Download and unzip the MWSearch extension for MediaWiki 1.19.
- Install to /home/uespdev/www/w/extensions/MWSearch
- Edit LocalSettings.php to add the following lines:
    $wgSearchType = 'LuceneSearch';
    $wgLuceneHost = '10.2.212.14';
    $wgLucenePort = 8123;
    require_once( "$IP/extensions/MWSearch/MWSearch.php" );
    // The following must be after the extension is included
    $wgEnableLucenePrefixSearch = true;
    $wgLucenePrefixHost = '10.2.212.14';
    $wgLuceneSearchVersion = 2.1;
- Ensure the following parameters are set in LocalSettings.php:
    $wgUseAjax = true;
    $wgAjaxSearch = true;
    $wgEnableMWSuggest = true;
- Check the Special:Version page to confirm installation.
- Edit extensions/UespCustomCode/ and ensure the following lines are commented/uncommented as shown:
    # $wgAutoloadClasses['SiteSearchMySQL'] = $dir . 'SiteSearchMySQL.php';
    # $wgSearchType = 'SiteSearchMySQL';
    # $aSpecialPages['Search'] = array( 'SpecialPage', 'Search', '', true, 'efSiteSpecialSearch', $dir . 'SiteSpecialSearch.php');
- Test the search.
- OAIRepository Installation
- Download the OAI extension.
- Install into the MediaWiki extension folder: /home/uespdev/www/w/extensions/OAI
- Add a line to LocalSettings.php
require_once( "$IP/extensions/OAI/OAIRepo.php" );
- Run the maintenance/update.php script to create the OAI tables.
- Ensure that lsearch-global.conf has the following lines:
    [OAI]
    <default> : http://dev.uesp.net/w/index.php
- Run the Lucene update script (run time = 1 min).
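The per-namespace entries added to [Namespace-Prefix] during the Lucene installation are repetitive enough to generate rather than type. The following is a minimal sketch, assuming the custom namespaces run contiguously from 100 to 147 as in the listing above; the function name is hypothetical. The all_talk and all_content lines are deliberately not generated, since which namespaces they include is a choice rather than a mechanical pattern.

```python
def namespace_prefix_lines(first, last):
    """Return one '[N] : N' entry per namespace ID from first to last inclusive."""
    return [f"[{ns}] : {ns}" for ns in range(first, last + 1)]


# Entries for the custom namespaces 100..147, ready to paste into
# the [Namespace-Prefix] section of lsearch-global.conf.
lines = namespace_prefix_lines(100, 147)
print("\n".join(lines))
```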
Live Installation
- For now content2 will be the indexer and search host.
- Add entries to MediaWiki:Common.css.
- Install OAI and MWSearch extensions.
- Run maintenance/update.php on content3 (this updates the master db1).
- Create the sub-domain search.uesp.net which points to the appropriate server. Note that the indexer appears to need to be run on a content server.
- Copy /lsearch folder from dev.
- Clear the dump and indexes folders.
- Change hostname in lsearch.initd to search.uesp.net.
- Set up the start script: cp lsearch.initd /etc/init.d/lsearch.
- Edit lsearch.conf and update the localization URL:
Localization.url = file:///home/uesp/www/w/languages/messages
- Edit lsearch-global.conf:
    [Database]
    uesp_net_wiki5 : (single,true,20,100) (spell,10,3) (prefix) (language,en)

    [Search-Group]
    content2.uesp.net : *

    [Index]
    content2.uesp.net : *

    [Index-Path]
    <default> : /search

    [OAI]
    <default> : http://content2.uesp.net/w/index.php

    # Namespaces as before
- Edit config.inc with the correct paths and hostnames (change hostname as appropriate):
    dbname=uesp_net_wiki5
    wgScriptPath=/w
    hostname=content2.uesp.net
    indexes=/lsearch/indexes
    mediawiki=/home/uesp/www/w
    base=/lsearch
    wgServer=http://www.uesp.net
- To build the index, run ./build, which should use the db2 slave database. Monitor the load on content1/db2 and interrupt the build if it stalls at any point; it appears to lock up on the database side occasionally. A build takes approximately 5-6 minutes.
- Start lsearchd on content2 and note any errors. Test to ensure it is working correctly.
- Run update on content2 and ensure it works correctly.
- Set up the init.d run script. Kill the current instance of Lucene manually and start it via the script. Ensure it has started and works correctly.
- Set up the update script to run hourly in cron.hourly.
- Edit LocalSettings.php to add the following lines:
    $wgUseAjax = true;
    $wgAjaxSearch = true;
    $wgEnableMWSuggest = true;
    require_once( "$IP/extensions/OAI/OAIRepo.php" );
    $wgSearchType = 'LuceneSearch';
    $wgLuceneHost = '10.2.212.12';
    $wgLucenePort = 8123;
    require_once( "$IP/extensions/MWSearch/MWSearch.php" );
    $wgEnableLucenePrefixSearch = true;
    $wgLucenePrefixHost = '10.2.212.12';
    $wgLuceneSearchVersion = 2.1;
- Test search in the live site. Ensure that the suggest/prefix search works in addition to the normal search and the search options.
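The suggest/prefix check in the final step can be exercised through MediaWiki's opensearch API. The following is a rough sketch only: the helper names are hypothetical, the server and script path are taken from the config.inc values above, and the sample response body in the usage note is made up for illustration.

```python
import json
from urllib.parse import urlencode


def opensearch_url(server, script_path, prefix, limit=10):
    """Build an api.php opensearch request for the given title prefix."""
    params = urlencode({"action": "opensearch", "search": prefix, "limit": limit})
    return f"{server}{script_path}/api.php?{params}"


def suggestions(body):
    """Extract the suggested titles from an opensearch JSON response.

    The response is a JSON array whose first element is the query and
    whose second element is the list of matching titles.
    """
    return json.loads(body)[1]
```

Fetching `opensearch_url("http://www.uesp.net", "/w", "Mor")` and passing the body to `suggestions` should give a non-empty list if prefix search is working; a body such as `["Mor",["Morrowind:Morrowind"]]` (illustrative data) yields `["Morrowind:Morrowind"]`.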
Benchmarking
- Tests were done using ApacheBench in the format: ab -kc 10 -t 30 http://...
- Index changes in lsearch-global.conf seemed to have only a very minor effect on performance.
- Index build times are typically 5-6 minutes on live servers using db2 (main database slave).
- Index update times are typically ~30 seconds.
- Simple prefix suggestions via the opensearch API typically take 70-120 ms. Remote or local requests didn't change the time required very much (remote requests were slightly slower at 100-150 ms).
- Simple searches directly to Lucene took 20-90 ms from the local host.
- Search throughput maxes out around 350 req/s. Prefix rate is around 100 req/s.
- Average search time according to the search log is 50 ms.
- Load on the search server (content2):
- Average load = 0.03.
- Rare spikes from 0.5 - 1.0.
- Hourly spike up to 0.5 for 1 minute from the index update.
- Search request rate averages 0.67/sec.
- Prefix request rate averages 2.1/sec (estimate from Squid logs).
- Based on the benchmarks, the search host is currently running at about 2-3% of its capacity.
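The 2-3% figure is consistent with the rates listed above. As a rough check, the sketch below divides each average request rate by its benchmarked maximum and adds the two fractions; treating the search and prefix loads as simply additive is an assumption made here for illustration, not something the benchmarks established.

```python
def utilization(avg_rate, max_rate):
    """Fraction of benchmarked throughput consumed by the average request rate."""
    return avg_rate / max_rate


# Figures from the benchmarking notes (requests per second).
search = utilization(0.67, 350.0)   # normal searches
prefix = utilization(2.1, 100.0)    # prefix suggestions

# Assumption: the two loads add linearly on the same host.
total = search + prefix
print(f"search {search:.2%}, prefix {prefix:.2%}, total {total:.1%}")
```

Prefix suggestions dominate the total, so the prefix rate is the number to watch as traffic grows.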
Custom Search Changes
A few skin changes were made at the same time as the Lucene search installation:
- skins/MonoBook.php
- skins/monobook/main.css
- extensions/UespCustomCode/files/search-icon.png
- extensions/UespCustomCode/SiteSpecialSearch.php
- extensions/UespCustomCode/SiteCustomCode.php
- MediaWiki:Common.css