strfilter
SYNOPSIS
strfilter($s, $lang)
DESCRIPTION
strfilter returns a character string containing the words of the character string $s in lowercase, without accents, without duplicates and without the words which are not significant in the language $lang, separated by one space. If $s is empty, strfilter returns false.
The insignificant words are defined in the file includes/stopwords.inc:
- includes
- stopwords.inc
- global $stopwords;
- $stopwords = array(
- 'en' => array(
- 'a',
- 'about',
- 'above',
- 'fr' => array(
- 'a',
- 'au',
- 'aussi',
stopwords.inc defines the global variable $stopwords. $stopwords holds a table which associates each language managed by the program with the list of words which are not significant in an index.
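As a rough illustration of how this table can be queried, the sketch below tests whether a word is a stop word for a given language. The array is a shortened excerpt of the listing above, and is_stopword is a hypothetical helper written for this example, not a function of the library.

```php
<?php
// Illustrative excerpt only; the complete lists live in includes/stopwords.inc.
$stopwords = array(
    'en' => array('a', 'about', 'above'),
    'fr' => array('a', 'au', 'aussi'),
);

// Hypothetical helper (an assumption for this example, not part of the library).
// Returns true if $word is listed as a stop word for the language $lang.
function is_stopword($word, $lang, $stopwords) {
    return array_key_exists($lang, $stopwords)
        && in_array($word, $stopwords[$lang], true);
}

var_dump(is_stopword('about', 'en', $stopwords)); // listed for 'en'
var_dump(is_stopword('about', 'fr', $stopwords)); // not listed for 'fr'
var_dump(is_stopword('a', 'de', $stopwords));     // 'de' is not in this excerpt
```

A language with no entry in the table simply has no stop words, which matches the array_key_exists check that strfilter performs before filtering.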
CODE
- global $stopwords;
- $stopwords = array();
- @include 'stopwords.inc';
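Loads the global variable $stopwords from the file stopwords.inc. Because $stopwords is first initialized to an empty array and the include is prefixed with @, a missing file raises no warning and strfilter then removes no stop words.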
- require_once 'strflat.php';
- function strfilter($s, $lang) {
- global $stopwords;
- if ($s) {
- $wlist=array_unique(array_map('strtolower', array_map('strflat', preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY))));
- if ($lang && array_key_exists($lang, $stopwords)) {
- $wlist=array_diff($wlist, $stopwords[$lang]);
- }
- return implode(' ', $wlist);
- }
- return false;
- }
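The steps performed by strfilter can be followed one at a time on a sample string. The sketch below is a minimal walk-through under stated assumptions: strflat_sketch is a hypothetical stand-in for strflat (whose source is not shown here) that handles only a few accented letters, and the stop word table is a shortened excerpt. Removing duplicates after normalization is a choice of this sketch, made so that words differing only by case or accents would also collapse.

```php
<?php
// Hypothetical stand-in for strflat() from strflat.php, which is not shown here.
// It only maps a handful of accented letters to their unaccented form.
function strflat_sketch($s) {
    return strtr($s, array(
        'é' => 'e', 'è' => 'e', 'ê' => 'e',
        'à' => 'a', 'ç' => 'c', 'ù' => 'u',
    ));
}

// Illustrative excerpt only; the complete lists live in includes/stopwords.inc.
$stopwords = array(
    'en' => array('a', 'about', 'above'),
    'fr' => array('a', 'au', 'aussi'),
);

$s = 'Un café au lait aussi café';
$lang = 'fr';

// 1. Split on runs of whitespace, dropping empty pieces.
$words = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);

// 2. Remove accents, then convert to lowercase.
$words = array_map('strtolower', array_map('strflat_sketch', $words));

// 3. Drop duplicate words, keeping the first occurrence of each.
$words = array_unique($words);

// 4. Remove the stop words of the language, if the language is known.
if (array_key_exists($lang, $stopwords)) {
    $words = array_diff($words, $stopwords[$lang]);
}

// 5. Join the remaining words with a single space.
$result = implode(' ', $words);
echo $result, "\n"; // prints "un cafe lait"
```

Both array_unique and array_diff preserve the original keys, so the surviving words keep their relative order when implode joins them.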