includes/search.php
author Dan
Tue, 18 Dec 2007 23:44:55 -0500
changeset 322 5f1cd51bf1be
parent 320 112debff64bd
child 334 c72b545f1304
child 339 5d62ef764b0d
permissions -rw-r--r--
Many changes. Installer with PostgreSQL is broken badly and will be for some time.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
     1
<?php
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
     2
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
     3
/*
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
     4
 * Enano - an open-source CMS capable of wiki functions, Drupal-like sidebar blocks, and everything in between
322
5f1cd51bf1be Many changes. Installer with PostgreSQL is broken badly and will be for some time.
Dan
parents: 320
diff changeset
     5
 * Version 1.0.3 (Dyrad)
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
     6
 * Copyright (C) 2006-2007 Dan Fuhry
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
     7
 * search.php - algorithm used to search pages
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
     8
 *
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
     9
 * This program is Free Software; you can redistribute and/or modify it under the terms of the GNU General Public License
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    10
 * as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    11
 *
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    12
 * This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    13
 * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for details.
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    14
 */
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    15
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    16
/**
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    17
 * Implementation of array_merge() that preserves key names. $arr2 takes precedence over $arr1.
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    18
 * @param array $arr1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    19
 * @param array $arr2
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    20
 * @return array
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    21
 */
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
    22
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    23
function enano_safe_array_merge($arr1, $arr2)
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    24
{
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    25
  $arr3 = $arr1;
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    26
  foreach($arr2 as $k => $v)
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    27
  {
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    28
    $arr3[$k] = $v;
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    29
  }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    30
  return $arr3;
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    31
}
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    32
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    33
/**
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
    34
 * In Enano versions prior to 1.0.2, this class provided a search function that was keyword-based and allowed boolean searches. It was
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
    35
 * cut from Coblynau and replaced with perform_search(), later in this file, because of speed issues. Now mostly deprecated. The only
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
    36
 * thing remaining is the buildIndex function, which is still used by the path manager and the new search framework.
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
    37
 *
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    38
 * @package Enano
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    39
 * @subpackage Page management frontend
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
    40
 * @license GNU General Public License <http://enanocms.org/Special:GNU_General_Public_License>
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    41
 */
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    42
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    43
class Searcher
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    44
{
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
    45
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    46
  var $results;
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    47
  var $index;
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    48
  var $warnings;
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    49
  var $match_case = false;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
    50
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    51
  function buildIndex($texts)
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    52
  {
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    53
    $this->index = Array();
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
    54
    $stopwords = get_stopwords();
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
    55
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    56
    foreach($texts as $i => $l)
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    57
    {
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    58
      $seed = md5(microtime(true) . mt_rand());
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    59
      $texts[$i] = str_replace("'", 'xxxApoS'.$seed.'xxx', $texts[$i]);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    60
      $texts[$i] = preg_replace('#([\W_]+)#i', ' ', $texts[$i]);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    61
      $texts[$i] = preg_replace('#([ ]+?)#', ' ', $texts[$i]);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    62
      $texts[$i] = preg_replace('#([\']*){2,}#s', '', $texts[$i]);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    63
      $texts[$i] = str_replace('xxxApoS'.$seed.'xxx', "'", $texts[$i]);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    64
      $l = $texts[$i];
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    65
      $words = Array();
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    66
      $good_chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789\' ';
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    67
      $good_chars = enano_str_split($good_chars, 1);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    68
      $letters = enano_str_split($l, 1);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    69
      foreach($letters as $x => $t)
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    70
      {
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    71
        if(!in_array($t, $good_chars))
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    72
          unset($letters[$x]);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    73
      }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    74
      $letters = implode('', $letters);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    75
      $words = explode(' ', $letters);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    76
      foreach($words as $c => $w)
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    77
      {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
    78
        if(strlen($w) < 2 || in_array($w, $stopwords))
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    79
          unset($words[$c]);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    80
        else
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    81
          $words[$c] = $w;
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    82
      }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    83
      $words = array_values($words);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    84
      foreach($words as $c => $w)
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    85
      {
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    86
        if(isset($this->index[$w]))
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    87
        {
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    88
          if(!in_array($i, $this->index[$w]))
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    89
            $this->index[$w][] = $i;
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    90
        }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    91
        else
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    92
        {
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    93
          $this->index[$w] = Array();
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    94
          $this->index[$w][] = $i;
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    95
        }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    96
      }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    97
    }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    98
    foreach($this->index as $k => $v)
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
    99
    {
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   100
      $this->index[$k] = implode(',', $this->index[$k]);
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   101
    }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   102
  }
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   103
}
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   104
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   105
/**
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   106
 * Searches the site for the specified string and returns an array with each value being an array filled with the following:
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   107
 *   page_id: string, self-explanatory
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   108
 *   namespace: string, self-explanatory
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   109
 *   page_length: integer, the length of the full page in bytes
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   110
 *   page_text: string, the contents of the page (trimmed to ~150 bytes if necessary)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   111
 *   score: numerical relevance score, 1-100, rounded to 2 digits and calculated based on which terms were present and which were not
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   112
 * @param string Search query
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   113
 * @param string Will be filled with any warnings encountered whilst parsing the query
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   114
 * @param bool Case sensitivity - defaults to false
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   115
 * @return array
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   116
 */
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   117
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   118
function perform_search($query, &$warnings, $case_sensitive = false)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   119
{
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   120
  global $db, $session, $paths, $template, $plugins; // Common objects
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   121
  $warnings = array();
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   122
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   123
  $query = parse_search_query($query, $warnings);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   124
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   125
  // Segregate search terms containing spaces
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   126
  $query_phrase = array(
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   127
    'any' => array(),
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   128
    'req' => array()
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   129
    );
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   130
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   131
  foreach ( $query['any'] as $i => $_ )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   132
  {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   133
    $term =& $query['any'][$i];
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   134
    $term = trim($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   135
    // the indexer only indexes words a-z with apostrophes
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   136
    if ( preg_match('/[^A-Za-z\']/', $term) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   137
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   138
      $query_phrase['any'][] = $term;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   139
      unset($term, $query['any'][$i]);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   140
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   141
  }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   142
  unset($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   143
  $query['any'] = array_values($query['any']);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   144
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   145
  foreach ( $query['req'] as $i => $_ )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   146
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   147
    $term =& $query['req'][$i];
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   148
    $term = trim($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   149
    if ( preg_match('/[^A-Za-z\']/', $term) )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   150
    {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   151
      $query_phrase['req'][] = $term;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   152
      unset($term, $query['req'][$i]);
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   153
    }
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   154
  }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   155
  unset($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   156
  $query['req'] = array_values($query['req']);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   157
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   158
  $results = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   159
  $scores = array();
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   160
  $ns_list = '(' . implode('|', array_keys($paths->nslist)) . ')';
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   161
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   162
  // FIXME: Update to use FULLTEXT algo when available.
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   163
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   164
  // Build an SQL query to load from the index table
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   165
  if ( count($query['any']) < 1 && count($query['req']) < 1 && count($query_phrase['any']) < 1 && count($query_phrase['req']) < 1 )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   166
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   167
    // This is both because of technical restrictions and devastation that would occur on shared servers/large sites.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   168
    $warnings[] = 'You need to have at least one keyword in your search query. Searching only for pages not containing a term is not allowed.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   169
    return array();
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   170
  }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   171
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   172
  //
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   173
  // STAGE 1
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   174
  // Get all possible result pages from the search index. Tally which pages have the most words, and later sort them by boolean relevance
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   175
  //
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   176
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   177
  // Skip this if no indexable words are included
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   178
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   179
  if ( count($query['any']) > 0 || count($query['req']) > 0 )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   180
  {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   181
    $where_any = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   182
    foreach ( $query['any'] as $term )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   183
    {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   184
      $term = escape_string_like($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   185
      if ( !$case_sensitive )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   186
        $term = strtolower($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   187
      $where_any[] = $term;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   188
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   189
    foreach ( $query['req'] as $term )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   190
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   191
      $term = escape_string_like($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   192
      if ( !$case_sensitive )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   193
        $term = strtolower($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   194
      $where_any[] = $term;
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   195
    }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   196
320
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   197
    $col_word = ( $case_sensitive ) ? 'word' : ENANO_SQLFUNC_LOWERCASE . '(word)';
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   198
    $where_any = ( count($where_any) > 0 ) ? '( ' . $col_word . ' = \'' . implode('\' OR ' . $col_word . ' = \'', $where_any) . '\' )' : '';
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   199
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   200
    // generate query
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   201
    // using a GROUP BY here ensures that the same word with a different case isn't counted as 2 words - it's all melted back
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   202
    // into one later in the processing stages
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   203
    // $group_by = ( $case_sensitive ) ? '' : ' GROUP BY lcase(word);';
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   204
    $sql = "SELECT word, page_names FROM " . table_prefix . "search_index WHERE {$where_any}";
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   205
    if ( !($q = $db->sql_unbuffered_query($sql)) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   206
      $db->_die('Error is in perform_search(), includes/search.php, query 1');
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   207
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   208
    $word_tracking = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   209
    if ( $row = $db->fetchrow() )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   210
    {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   211
      do
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   212
      {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   213
        // get page list
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   214
        $pages =& $row['page_names'];
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   215
        if ( strpos($pages, ',') )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   216
        {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   217
          // the term occurs in more than one page
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   218
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   219
          // Find page IDs that contain commas
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   220
          // This should never happen because commas are escaped by sanitize_page_id(). Nevertheless for compatibility with older
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   221
          // databases, and to alleviate the concerns of hackers, we'll accommodate for page IDs with commas here by checking for
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   222
          // IDs that don't match the pattern for stringified page ID + namespace. If it doesn't match, that means it's a continuation
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   223
          // of the previous ID and should be concatenated to the previous entry.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   224
          $matches = explode(',', $pages);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   225
          $prev = false;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   226
          foreach ( $matches as $i => $_ )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   227
          {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   228
            $match =& $matches[$i];
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   229
            if ( !preg_match("/^ns=$ns_list;pid=(.+)$/", $match) && $prev )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   230
            {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   231
              $matches[$prev] .= ',' . $match;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   232
              unset($match, $matches[$i]);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   233
              continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   234
            }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   235
            $prev = $i;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   236
          }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   237
          unset($match);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   238
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   239
          // Iterate through each of the results, assigning scores based on how many times the page has shown up.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   240
          // This works because this phase of the search is strongly word-based not page-based. If a page shows up
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   241
          // multiple times while fetching the result rows from the search_index table, it simply means that page
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   242
          // contains more than one of the terms the user searched for.
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   243
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   244
          foreach ( $matches as $match )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   245
          {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   246
            $word_cs = (( $case_sensitive ) ? $row['word'] : strtolower($row['word']));
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   247
            if ( isset($word_tracking[$match]) && in_array($word_cs, $word_tracking[$match]) )
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   248
            {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   249
              continue;
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   250
            }
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   251
            if ( isset($word_tracking[$match]) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   252
            {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   253
              if ( isset($word_tracking[$match]) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   254
              {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   255
                $word_tracking[$match][] = ($word_cs);
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   256
              }
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   257
            }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   258
            else
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   259
            {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   260
              $word_tracking[$match] = array($word_cs);
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   261
            }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   262
            $inc = 1;
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   263
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   264
            // Is this search term present in the page's title? If so, give extra points
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   265
            preg_match("/^ns=$ns_list;pid=(.+)$/", $match, $piecesparts);
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   266
            $pathskey = $paths->nslist[ $piecesparts[1] ] . sanitize_page_id($piecesparts[2]);
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   267
            if ( isset($paths->pages[$pathskey]) )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   268
            {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   269
              $test_func = ( $case_sensitive ) ? 'strstr' : 'stristr';
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   270
              if ( $test_func($paths->pages[$pathskey]['name'], $row['word']) || $test_func($paths->pages[$pathskey]['urlname_nons'], $row['word']) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   271
              {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   272
                $inc = 1.5;
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   273
              }
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   274
            }
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   275
            if ( isset($scores[$match]) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   276
            {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   277
              $scores[$match] = $scores[$match] + $inc;
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   278
            }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   279
            else
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   280
            {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   281
              $scores[$match] = $inc;
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   282
            }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   283
          }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   284
        }
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   285
        else
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   286
        {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   287
          // the term only occurs in one page
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   288
          $word_cs = (( $case_sensitive ) ? $row['word'] : strtolower($row['word']));
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   289
          if ( isset($word_tracking[$pages]) && in_array($word_cs, $word_tracking[$pages]) )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   290
          {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   291
            continue;
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   292
          }
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   293
          if ( isset($word_tracking[$pages]) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   294
          {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   295
            if ( isset($word_tracking[$pages]) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   296
            {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   297
              $word_tracking[$pages][] = ($word_cs);
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   298
            }
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   299
          }
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   300
          else
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   301
          {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   302
            $word_tracking[$pages] = array($word_cs);
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   303
          }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   304
          $inc = 1;
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   305
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   306
          // Is this search term present in the page's title? If so, give extra points
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   307
          preg_match("/^ns=$ns_list;pid=(.+)$/", $pages, $piecesparts);
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   308
          $pathskey = $paths->nslist[ $piecesparts[1] ] . sanitize_page_id($piecesparts[2]);
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   309
          if ( isset($paths->pages[$pathskey]) )
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   310
          {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   311
            $test_func = ( $case_sensitive ) ? 'strstr' : 'stristr';
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   312
            if ( $test_func($paths->pages[$pathskey]['name'], $row['word']) || $test_func($paths->pages[$pathskey]['urlname_nons'], $row['word']) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   313
            {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   314
              $inc = 1.5;
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   315
            }
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   316
          }
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   317
          if ( isset($scores[$pages]) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   318
          {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   319
            $scores[$pages] = $scores[$pages] + $inc;
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   320
          }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   321
          else
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   322
          {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   323
            $scores[$pages] = $inc;
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   324
          }
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   325
        }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   326
      }
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   327
      while ( $row = $db->fetchrow() );
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   328
    }
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   329
    $db->free_result();
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   330
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   331
    //
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   332
    // STAGE 2: FIRST ELIMINATION ROUND
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   333
    // Iterate through the list of required terms. If a given page is not found to have the required term, eliminate it
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   334
    //
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   335
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   336
    foreach ( $query['req'] as $term )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   337
    {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   338
      foreach ( $word_tracking as $i => $page )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   339
      {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   340
        if ( !in_array($term, $page) )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   341
        {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   342
          unset($word_tracking[$i], $scores[$i]);
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   343
        }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   344
      }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   345
    }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   346
  }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   347
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   348
  //
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   349
  // STAGE 3: PHRASE SEARCHING
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   350
  // Use LIKE to find pages with specified phrases. We can do a super-picky single query without another elimination round because
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   351
  // at this stage we can search the full page_text column instead of relying on a word list.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   352
  //
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   353
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   354
  // We can skip this stage if none of these special terms apply
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   355
320
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   356
  $text_col = ( $case_sensitive ) ? 'page_text' : ENANO_SQLFUNC_LOWERCASE . '(page_text)';
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   357
  $name_col = ( $case_sensitive ) ? 'name' : ENANO_SQLFUNC_LOWERCASE . '(name)';
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   358
  $text_col_join = ( $case_sensitive ) ? 't.page_text' : ENANO_SQLFUNC_LOWERCASE . '(t.page_text)';
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   359
  $name_col_join = ( $case_sensitive ) ? 'p.name' : ENANO_SQLFUNC_LOWERCASE . '(p.name)';
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   360
    
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   361
  $concat_column = ( ENANO_DBLAYER == 'MYSQL' ) ?
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   362
    'CONCAT(\'ns=\',t.namespace,\';pid=\',t.page_id)' :
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   363
    "'ns=' || t.namespace || ';pid=' || t.page_id";
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   364
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   365
  if ( count($query_phrase['any']) > 0 || count($query_phrase['req']) > 0 )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   366
  {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   367
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   368
    $where_any = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   369
    foreach ( $query_phrase['any'] as $term )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   370
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   371
      $term = escape_string_like($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   372
      if ( !$case_sensitive )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   373
        $term = strtolower($term);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   374
      $where_any[] = "( $text_col LIKE '%$term%' OR $name_col LIKE '%$term%' )";
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   375
    }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   376
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   377
    $where_any = ( count($where_any) > 0 ) ? implode(" OR\n  ", $where_any) : '';
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   378
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   379
    // Also do required terms, but use AND to ensure that all required terms are included
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   380
    $where_req = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   381
    foreach ( $query_phrase['req'] as $term )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   382
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   383
      $term = escape_string_like($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   384
      if ( !$case_sensitive )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   385
        $term = strtolower($term);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   386
      $where_req[] = "( $text_col LIKE '%$term%' OR $name_col LIKE '%$term%' )";
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   387
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   388
    $and_clause = ( $where_any != '' ) ? 'AND ' : '';
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   389
    $where_req = ( count($where_req) > 0 ) ? "{$and_clause}" . implode(" AND\n  ", $where_req) : '';
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   390
320
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   391
    $sql = 'SELECT ' . $concat_column . ' AS id, p.name FROM ' . table_prefix . "page_text AS t\n"
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   392
            . "  LEFT JOIN " . table_prefix . "pages AS p\n"
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   393
            . "    ON ( p.urlname = t.page_id AND p.namespace = t.namespace )\n"
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   394
            . "  WHERE\n  $where_any\n  $where_req;";
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   395
    if ( !($q = $db->sql_unbuffered_query($sql)) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   396
      $db->_die('Error is in perform_search(), includes/search.php, query 2. Parsed query dump follows:<pre>(indexable) ' . htmlspecialchars(print_r($query, true)) . '(non-indexable) ' . htmlspecialchars(print_r($query_phrase, true)) . '</pre>');
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   397
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   398
    if ( $row = $db->fetchrow() )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   399
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   400
      do
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   401
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   402
        $id =& $row['id'];
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   403
        $inc = 1;
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   404
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   405
        // Is this search term present in the page's title? If so, give extra points
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   406
        preg_match("/^ns=$ns_list;pid=(.+)$/", $id, $piecesparts);
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   407
        $pathskey = $paths->nslist[ $piecesparts[1] ] . sanitize_page_id($piecesparts[2]);
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   408
        if ( isset($paths->pages[$pathskey]) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   409
        {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   410
          $test_func = ( $case_sensitive ) ? 'strstr' : 'stristr';
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   411
          foreach ( array_merge($query_phrase['any'], $query_phrase['req']) as $term )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   412
          {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   413
            if ( $test_func($paths->pages[$pathskey]['name'], $term) || $test_func($paths->pages[$pathskey]['urlname_nons'], $term) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   414
            {
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   415
              $inc = 1.5;
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   416
              break;
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   417
            }
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   418
          }
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   419
        }
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   420
        if ( isset($scores[$id]) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   421
        {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   422
          $scores[$id] = $scores[$id] + $inc;
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   423
        }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   424
        else
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   425
        {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   426
          $scores[$id] = $inc;
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   427
        }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   428
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   429
      while ( $row = $db->fetchrow() );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   430
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   431
    $db->free_result();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   432
  }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   433
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   434
  //
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   435
  // STAGE 4 - SELECT PAGE TEXT AND ELIMINATE NOTS
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   436
  // At this point, we have a complete list of all the possible pages. Now we want to obtain the page text, and within the same query
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   437
  // eliminate any terms that shouldn't be in there.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   438
  //
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   439
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   440
  // Generate master word list for the highlighter
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   441
  $word_list = array_values(array_merge($query['any'], $query['req'], $query_phrase['any'], $query_phrase['req']));
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   442
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   443
  $text_where = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   444
  foreach ( $scores as $page_id => $_ )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   445
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   446
    $text_where[] = $db->escape($page_id);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   447
  }
320
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   448
  $text_where = '( ' . $concat_column . ' = \'' . implode('\' OR ' . $concat_column . ' = \'', $text_where) . '\' )';
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   449
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   450
  if ( count($query['not']) > 0 )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   451
    $text_where .= ' AND';
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   452
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   453
  $where_not = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   454
  foreach ( $query['not'] as $term )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   455
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   456
    $term = escape_string_like($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   457
    if ( !$case_sensitive )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   458
      $term = strtolower($term);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   459
    $where_not[] = $term;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   460
  }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   461
  $where_not = ( count($where_not) > 0 ) ? "$text_col NOT LIKE '%" . implode("%' AND $text_col NOT LIKE '%", $where_not) . "%'" : '';
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   462
320
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   463
  $sql = 'SELECT ' . $concat_column . ' AS id, t.page_id, t.namespace, CHAR_LENGTH(t.page_text) AS page_length, t.page_text, p.name AS page_name FROM ' . table_prefix . "page_text AS t
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   464
            LEFT JOIN " . table_prefix . "pages AS p
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   465
              ON ( p.urlname = t.page_id AND p.namespace = t.namespace )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   466
            WHERE $text_where $where_not;";
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   467
  if ( !($q = $db->sql_unbuffered_query($sql)) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   468
    $db->_die('Error is in perform_search(), includes/search.php, query 3');
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   469
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   470
  $page_data = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   471
  if ( $row = $db->fetchrow() )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   472
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   473
    do
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   474
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   475
      $row['page_text'] = htmlspecialchars($row['page_text']);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   476
      $row['page_name'] = htmlspecialchars($row['page_name']);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   477
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   478
      // Highlight results (this is wonderfully automated)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   479
      $row['page_text'] = highlight_and_clip_search_result($row['page_text'], $word_list, $case_sensitive);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   480
      if ( strlen($row['page_text']) > 250 && !preg_match('/^\.\.\.(.+)\.\.\.$/', $row['page_text']) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   481
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   482
        $row['page_text'] = substr($row['page_text'], 0, 150) . '...';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   483
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   484
      $row['page_name'] = highlight_search_result($row['page_name'], $word_list, $case_sensitive);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   485
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   486
      $page_data[$row['id']] = $row;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   487
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   488
    while ( $row = $db->fetchrow() );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   489
  }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   490
  $db->free_result();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   491
  
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   492
  //
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   493
  // STAGE 5 - SPECIAL PAGE TITLE SEARCH
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   494
  // Iterate through $paths->pages and check the titles for search terms. Score accordingly.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   495
  //
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   496
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   497
  foreach ( $paths->pages as $id => $page )
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   498
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   499
    if ( $page['namespace'] != 'Special' )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   500
      continue;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   501
    if ( !is_int($id) )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   502
      continue;
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   503
    $idstring = 'ns=' . $page['namespace'] . ';pid=' . $page['urlname_nons'];
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   504
    $any = array_values(array_unique(array_merge($query['any'], $query_phrase['any'])));
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   505
    foreach ( $any as $term )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   506
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   507
      if ( $case_sensitive )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   508
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   509
        if ( strstr($page['name'], $term) || strstr($page['urlname_nons'], $term) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   510
        {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   511
          ( isset($scores[$idstring]) ) ? $scores[$idstring] = $scores[$idstring] + 1.5 : $scores[$idstring] = 1.5;
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   512
        }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   513
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   514
      else
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   515
      {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   516
        if ( stristr($page['name'], $term) || stristr($page['urlname_nons'], $term) )
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   517
        {
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   518
          ( isset($scores[$idstring]) ) ? $scores[$idstring] = $scores[$idstring] + 1.5 : $scores[$idstring] = 1.5;
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   519
        }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   520
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   521
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   522
    if ( isset($scores[$idstring]) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   523
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   524
      $page_data[$idstring] = array(
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   525
          'page_name' => $page['name'],
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   526
          'page_text' => '',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   527
          'page_id' => $page['urlname_nons'],
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   528
          'namespace' => $page['namespace'],
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   529
          'score' => $scores[$idstring],
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   530
          'page_length' => 1,
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   531
          'page_note' => '[Special page]'
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   532
        );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   533
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   534
  }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   535
  
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   536
  //
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   537
  // STAGE 6 - SECOND ELIMINATION ROUND
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   538
  // Iterate through the list of required terms. If a given page is not found to have the required term, eliminate it
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   539
  //
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   540
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   541
  $required = array_merge($query['req'], $query_phrase['req']);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   542
  foreach ( $required as $term )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   543
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   544
    foreach ( $page_data as $id => $page )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   545
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   546
      if ( ( $page['namespace'] == 'Special' || ( $page['namespace'] != 'Special' && !strstr($page['page_text'], $term) ) ) && !strstr($page['page_id'], $term) && !strstr($page['page_name'], $term) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   547
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   548
        unset($page_data[$id]);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   549
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   550
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   551
  }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   552
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   553
  // At this point, all of our normal results are in. However, we can also allow plugins to hook into the system and score their own
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   554
  // pages and add text, etc. as necessary.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   555
  // Plugins are COMPLETELY responsible for using the search terms and handling Boolean logic properly
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   556
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   557
  $code = $plugins->setHook('search_global_inner');
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   558
  foreach ( $code as $cmd )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   559
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   560
    eval($cmd);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   561
  }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   562
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   563
  // a marvelous debugging aid :-)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   564
  // die('<pre>' . htmlspecialchars(print_r($page_data, true)) . '</pre>');
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   565
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   566
  //
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   567
  // STAGE 7 - HIGHLIGHT, TRIM, AND SCORE RESULTS
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   568
  // We now have the complete results of the search. We need to trim text down to show only portions of the page containing search
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   569
  // terms, highlight any search terms within the page, and sort the final results array in descending order of score.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   570
  //
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   571
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   572
  // Sort scores array
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   573
  arsort($scores);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   574
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   575
  // Divisor for calculating relevance scores
320
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   576
  $divisor = ( count($query['any']) + count($query_phrase['any']) + count($query['req']) + count($query['not']) ) * 1.5;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   577
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   578
  foreach ( $scores as $page_id => $score )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   579
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   580
    if ( !isset($page_data[$page_id]) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   581
      // It's possible that $scores contains a score for a page that was later eliminated because it contained a disallowed term
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   582
      continue;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   583
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   584
    // Make a copy of the datum, then delete the original (it frees up a LOT of RAM)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   585
    $datum = $page_data[$page_id];
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   586
    unset($page_data[$page_id]);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   587
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   588
    // This is an internal value used for sorting - it's no longer needed.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   589
    unset($datum['id']);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   590
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   591
    // Calculate score
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   592
    // if ( $score > $divisor )
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   593
    //   $score = $divisor;
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   594
    $datum['score'] = round($score / $divisor, 2) * 100;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   595
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   596
    // Store it in our until-now-unused results array
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   597
    $results[] = $datum;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   598
  }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   599
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   600
  // Our work here is done. :-D
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   601
  return $results;
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   602
}
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   603
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   604
/**
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   605
 * Parses a search query into an associative array. The resultant array will be filled with the following values, each an array:
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   606
 *   any: Search terms that can optionally be present
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   607
 *   req: Search terms that must be present
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   608
 *   not: Search terms that should not be present
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   609
 * @param string Search query
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   610
 * @param array Will be filled with parser warnings, such as query too short, words too short, etc.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   611
 * @return array
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   612
 */
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   613
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   614
function parse_search_query($query, &$warnings)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   615
{
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   616
  $stopwords = get_stopwords();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   617
  $ret = array(
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   618
    'any' => array(),
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   619
    'req' => array(),
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   620
    'not' => array()
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   621
    );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   622
  $warnings = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   623
  $terms = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   624
  $in_quote = false;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   625
  $start_term = 0;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   626
  $just_finished = false;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   627
  for ( $i = 0; $i < strlen($query); $i++ )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   628
  {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   629
    $chr = $query{$i};
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   630
    $prev = ( $i > 0 ) ? $query{ $i - 1 } : '';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   631
    $next = ( ( $i + 1 ) < strlen($query) ) ? $query{ $i + 1 } : '';
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   632
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   633
    if ( ( $chr == ' ' && !$in_quote ) || ( $i + 1 == strlen ( $query ) ) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   634
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   635
      $len = ( $next == '' ) ? $i + 1 : $i - $start_term;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   636
      $word = substr ( $query, $start_term, $len );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   637
      $terms[] = $word;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   638
      $start_term = $i + 1;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   639
    }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   640
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   641
    elseif ( $chr == '"' && $in_quote && $prev != '\\' )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   642
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   643
      $word = substr ( $query, $start_term, $i - $start_term + 1 );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   644
      $start_pos = ( $next == ' ' ) ? $i + 2 : $i + 1;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   645
      $in_quote = false;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   646
    }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   647
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   648
    elseif ( $chr == '"' && !$in_quote )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   649
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   650
      $in_quote = true;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   651
      $start_pos = $i;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   652
    }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   653
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   654
  }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   655
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   656
  $ticker = 0;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   657
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   658
  foreach ( $terms as $element => $__unused )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   659
  {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   660
    $atom =& $terms[$element];
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   661
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   662
    $ticker++;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   663
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   664
    if ( $ticker == 20 )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   665
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   666
      $warnings[] = 'Some of your search terms were excluded because searches are limited to 20 terms to prevent excessive server load.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   667
      break;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   668
    }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   669
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   670
    if ( substr ( $atom, 0, 2 ) == '+"' && substr ( $atom, ( strlen ( $atom ) - 1 ), 1 ) == '"' )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   671
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   672
      $word = substr ( $atom, 2, ( strlen( $atom ) - 3 ) );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   673
      if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   674
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   675
        $warnings[] = 'One or more of your search terms was excluded because either it was less than 2 characters in length or is a common word (a stopword) that is typically found on a large number of pages. Examples of stopwords include "the", "this", "which", "with", etc.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   676
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   677
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   678
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   679
      if(in_array($word, $ret['req']))
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   680
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   681
        $warnings[] = 'One or more of your search terms was excluded because duplicate terms were encountered.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   682
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   683
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   684
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   685
      $ret['req'][] = $word;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   686
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   687
    elseif ( substr ( $atom, 0, 2 ) == '-"' && substr ( $atom, ( strlen ( $atom ) - 1 ), 1 ) == '"' )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   688
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   689
      $word = substr ( $atom, 2, ( strlen( $atom ) - 3 ) );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   690
      if ( strlen ( $word ) < 4 )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   691
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   692
        $warnings[] = 'One or more of your search terms was excluded because terms must be at least 4 characters in length.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   693
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   694
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   695
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   696
      if(in_array($word, $ret['not']))
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   697
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   698
        $warnings[] = 'One or more of your search terms was excluded because duplicate terms were encountered.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   699
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   700
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   701
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   702
      $ret['not'][] = $word;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   703
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   704
    elseif ( substr ( $atom, 0, 1 ) == '+' )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   705
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   706
      $word = substr ( $atom, 1 );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   707
      if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   708
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   709
        $warnings[] = 'One or more of your search terms was excluded because either it was less than 2 characters in length or is a common word (a stopword) that is typically found on a large number of pages. Examples of stopwords include "the", "this", "which", "with", etc.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   710
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   711
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   712
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   713
      if(in_array($word, $ret['req']))
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   714
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   715
        $warnings[] = 'One or more of your search terms was excluded because duplicate terms were encountered.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   716
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   717
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   718
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   719
      $ret['req'][] = $word;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   720
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   721
    elseif ( substr ( $atom, 0, 1 ) == '-' )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   722
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   723
      $word = substr ( $atom, 1 );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   724
      if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   725
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   726
        $warnings[] = 'One or more of your search terms was excluded because either it was less than 2 characters in length or is a common word (a stopword) that is typically found on a large number of pages. Examples of stopwords include "the", "this", "which", "with", etc.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   727
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   728
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   729
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   730
      if(in_array($word, $ret['not']))
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   731
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   732
        $warnings[] = 'One or more of your search terms was excluded because duplicate terms were encountered.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   733
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   734
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   735
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   736
      $ret['not'][] = $word;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   737
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   738
    elseif ( substr ( $atom, 0, 1 ) == '"' && substr ( $atom, ( strlen($atom) - 1 ), 1 ) == '"' )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   739
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   740
      $word = substr ( $atom, 1, ( strlen ( $atom ) - 2 ) );
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   741
      if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   742
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   743
        $warnings[] = 'One or more of your search terms was excluded because either it was less than 2 characters in length or is a common word (a stopword) that is typically found on a large number of pages. Examples of stopwords include "the", "this", "which", "with", etc.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   744
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   745
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   746
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   747
      if(in_array($word, $ret['any']))
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   748
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   749
        $warnings[] = 'One or more of your search terms was excluded because duplicate terms were encountered.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   750
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   751
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   752
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   753
      $ret['any'][] = $word;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   754
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   755
    else
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   756
    {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   757
      $word = $atom;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   758
      if ( strlen ( $word ) < 2 || in_array($word, $stopwords) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   759
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   760
        $warnings[] = 'One or more of your search terms was excluded because either it was less than 2 characters in length or is a common word (a stopword) that is typically found on a large number of pages. Examples of stopwords include "the", "this", "which", "with", etc.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   761
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   762
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   763
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   764
      if(in_array($word, $ret['any']))
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   765
      {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   766
        $warnings[] = 'One or more of your search terms was excluded because duplicate terms were encountered.';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   767
        $ticker--;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   768
        continue;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   769
      }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   770
      $ret['any'][] = $word;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   771
    }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   772
  }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   773
  return $ret;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   774
}
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   775
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   776
/**
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   777
 * Escapes a string for use in a LIKE clause.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   778
 * @param string
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   779
 * @return string
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   780
 */
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   781
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   782
function escape_string_like($string)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   783
{
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   784
  global $db, $session, $paths, $template, $plugins; // Common objects
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   785
  $string = $db->escape($string);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   786
  $string = str_replace(array('%', '_'), array('\%', '\_'), $string);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   787
  return $string;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   788
}
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   789
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   790
/**
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   791
 * Wraps <highlight></highlight> tags around all words in both the specified array. Does not perform any clipping.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   792
 * @param string Text to process
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   793
 * @param array Word list
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   794
 * @param bool If true, searches case-sensitively when highlighting words
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   795
 * @return string
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   796
 */
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   797
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   798
function highlight_search_result($pt, $words, $case_sensitive = false)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   799
{
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   800
  $words2 = array();
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   801
  for ( $i = 0; $i < sizeof($words); $i++)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   802
  {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   803
    if(!empty($words[$i]))
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   804
      $words2[] = preg_quote($words[$i]);
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   805
  }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   806
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   807
  $flag = ( $case_sensitive ) ? '' : 'i';
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   808
  $regex = '/(' . implode('|', $words2) . ')/' . $flag;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   809
  $pt = preg_replace($regex, '<highlight>\\1</highlight>', $pt);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   810
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   811
  return $pt;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   812
}
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   813
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   814
/**
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   815
 * Wraps <highlight></highlight> tags around all words in both the specified array and the specified text and clips the text to
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   816
 * an appropriate length.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   817
 * @param string Text to process
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   818
 * @param array Word list
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   819
 * @param bool If true, searches case-sensitively when highlighting words
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   820
 * @return string
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   821
 */
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   822
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   823
function highlight_and_clip_search_result($pt, $words, $case_sensitive = false)
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   824
{
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   825
  $cut_off = false;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   826
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   827
  $space_chars = Array("\t", "\n", "\r", " ");
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   828
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   829
  $pt = highlight_search_result($pt, $words, $case_sensitive);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   830
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   831
  foreach ( $words as $word )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   832
  {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   833
    // Boldface searched words
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   834
    $ptlen = strlen($pt);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   835
    for ( $i = 0; $i < $ptlen; $i++ )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   836
    {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   837
      $len = strlen($word);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   838
      if ( strtolower(substr($pt, $i, $len)) == strtolower($word) )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   839
      {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   840
        $chunk1 = substr($pt, 0, $i);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   841
        $chunk2 = substr($pt, $i, $len);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   842
        $chunk3 = substr($pt, ( $i + $len ));
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   843
        $pt = $chunk1 . $chunk2 . $chunk3;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   844
        $ptlen = strlen($pt);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   845
        // Cut off text to 150 chars or so
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   846
        if ( !$cut_off )
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   847
        {
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   848
          $cut_off = true;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   849
          if ( $i - 75 > 0 )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   850
          {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   851
            // Navigate backwards until a space character is found
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   852
            $chunk = substr($pt, 0, ( $i - 75 ));
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   853
            $final_chunk = $chunk;
320
112debff64bd SURPRISE! Preliminary PostgreSQL support added. The required schema file is not present in this commit and will be included at a later date. No installer support is implemented. Also in this commit: several fixes including <!-- SYSMSG ... --> was broken in template compiler; set fixed width on included images to prevent the thumbnail box from getting huge; added a much more friendly interface to AJAX responses that are invalid JSON
Dan
parents: 292
diff changeset
   854
            for ( $j = strlen($chunk) - 1; $j > 0; $j = $j - 1 )
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   855
            {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   856
              if ( in_array($chunk{$j}, $space_chars) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   857
              {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   858
                $final_chunk = substr($chunk, $j + 1);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   859
                break;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   860
              }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   861
            }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   862
            $mid_chunk = substr($pt, ( $i - 75 ), 75);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   863
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   864
            $clipped = '...' . $final_chunk . $mid_chunk . $chunk2;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   865
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   866
            $chunk = substr($pt, ( $i + strlen($chunk2) + 75 ));
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   867
            $final_chunk = $chunk;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   868
            for ( $j = 0; $j < strlen($chunk); $j++ )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   869
            {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   870
              if ( in_array($chunk{$j}, $space_chars) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   871
              {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   872
                $final_chunk = substr($chunk, 0, $j);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   873
                break;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   874
              }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   875
            }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   876
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   877
            $end_chunk = substr($pt, ( $i + strlen($chunk2) ), 75 );
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   878
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   879
            $clipped .= $end_chunk . $final_chunk . '...';
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   880
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   881
            $pt = $clipped;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   882
          }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   883
          else if ( strlen($pt) > 200 )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   884
          {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   885
            $mid_chunk = substr($pt, ( $i - 75 ), 75);
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   886
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   887
            $clipped = $chunk1 . $chunk2;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   888
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   889
            $chunk = substr($pt, ( $i + strlen($chunk2) + 75 ));
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   890
            $final_chunk = $chunk;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   891
            for ( $j = 0; $j < strlen($chunk); $j++ )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   892
            {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   893
              if ( in_array($chunk{$j}, $space_chars) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   894
              {
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   895
                $final_chunk = substr($chunk, 0, $j);
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   896
                break;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   897
              }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   898
            }
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   899
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   900
            $end_chunk = substr($pt, ( $i + strlen($chunk2) ), 75 );
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   901
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   902
            $clipped .= $end_chunk . $final_chunk . '...';
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   903
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   904
            $pt = $clipped;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   905
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   906
          }
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   907
          break 2;
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   908
        }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   909
      }
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   910
    }
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   911
    $cut_off = false;
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   912
  }
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   913
  return $pt;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   914
}
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   915
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   916
/**
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   917
 * Returns a list of words that shouldn't under most circumstances be indexed for searching. Kudos to MySQL.
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   918
 * @return array
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   919
 * @see http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   920
 */
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   921
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   922
function get_stopwords()
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   923
{
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   924
  static $stopwords;
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   925
  if ( is_array($stopwords) )
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   926
    return $stopwords;
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   927
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   928
  $stopwords = array('a\'s', 'able', 'after', 'afterwards', 'again',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   929
                     'against', 'ain\'t', 'all', 'almost', 'alone', 'along', 'already', 'also', 'although', 'always',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   930
                     'am', 'among', 'amongst', 'an', 'and', 'another', 'any', 'anybody', 'anyhow', 'anyone', 'anything', 'anyway',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   931
                     'anyways', 'anywhere', 'apart', 'appear', 'appreciate', 'appropriate', 'are', 'aren\'t', 'around', 'as', 'aside',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   932
                     'ask', 'asking', 'associated', 'at', 'available', 'away', 'awfully', 'be', 'became', 'because', 'become', 'becomes',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   933
                     'becoming', 'been', 'before', 'beforehand', 'behind', 'being', 'believe', 'below', 'beside', 'besides', 'best',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   934
                     'better', 'between', 'beyond', 'both', 'brief', 'but', 'by', 'c\'mon', 'c\'s', 'came', 'can', 'can\'t', 'cannot',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   935
                     'cant', 'cause', 'causes', 'certain', 'certainly', 'changes', 'clearly', 'co', 'com', 'come', 'comes', 'concerning',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   936
                     'consequently', 'consider', 'considering', 'contain', 'containing', 'contains', 'corresponding', 'could',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   937
                     'couldn\'t', 'course', 'despite', 'did', 'didn\'t', 'different', 'do',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   938
                     'does', 'doesn\'t', 'doing', 'don\'t', 'done', 'down', 'downwards', 'during', 'each', 'edu', 'eg', 'eight',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   939
                     'either', 'else', 'elsewhere', 'enough', 'entirely', 'especially', 'et', 'etc', 'even', 'ever', 'every',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   940
                     'everybody', 'everyone', 'everything', 'everywhere', 'ex', 'exactly', 'example', 'except', 'far', 'few', 'fifth',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   941
                     'first', 'five', 'followed', 'following', 'follows', 'for', 'former', 'formerly', 'forth', 'four', 'from',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   942
                     'further', 'get', 'gets', 'getting', 'given', 'gives', 'go', 'goes', 'going', 'gone', 'got',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   943
                     'gotten', 'had', 'hadn\'t', 'happens', 'hardly', 'has', 'hasn\'t', 'have', 'haven\'t', 'having',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   944
                     'he', 'he\'s', 'hello', 'help', 'hence', 'her', 'here', 'here\'s', 'hereafter', 'hereby', 'herein', 'hereupon',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   945
                     'hers', 'herself', 'hi', 'him', 'himself', 'his', 'hither', 'hopefully', 'how', 'howbeit', 'however', 'i\'d',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   946
                     'i\'ll', 'i\'m', 'i\'ve', 'ie', 'if', 'ignored', 'immediate', 'in', 'inasmuch', 'inc', 'indeed', 'indicate',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   947
                     'indicated', 'indicates', 'inner', 'insofar', 'instead', 'into', 'inward', 'is', 'isn\'t', 'it', 'it\'d', 'it\'ll',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   948
                     'it\'s', 'its', 'itself', 'just', 'keep', 'keeps', 'kept', 'know', 'knows', 'known', 'last', 'lately', 'later',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   949
                     'latter', 'latterly', 'least', 'less', 'lest', 'let', 'let\'s', 'like', 'liked', 'likely', 'little', 'look',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   950
                     'looking', 'looks', 'ltd', 'mainly', 'many', 'may', 'maybe', 'me', 'mean', 'meanwhile', 'merely', 'might', 'more',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   951
                     'moreover', 'most', 'mostly', 'much', 'must', 'my', 'myself', 'name', 'namely', 'nd', 'near', 'nearly', 'necessary',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   952
                     'need', 'needs', 'neither', 'never', 'nevertheless', 'new', 'next', 'nine', 'no', 'nobody', 'non', 'none', 'noone',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   953
                     'nor', 'normally', 'not', 'nothing', 'novel', 'now', 'nowhere', 'obviously', 'of', 'off', 'often', 'oh', 'ok',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   954
                     'okay', 'old', 'on', 'once', 'one', 'ones', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'ought', 'our',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   955
                     'ours', 'ourselves', 'out', 'outside', 'over', 'overall', 'own', 'particular', 'particularly', 'per', 'perhaps',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   956
                     'placed', 'please', 'plus', 'possible', 'presumably', 'probably', 'provides', 'que', 'quite', 'qv', 'rather', 'rd',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   957
                     're', 'really', 'reasonably', 'regarding', 'regardless', 'regards', 'relatively', 'respectively', 'right', 'said',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   958
                     'same', 'saw', 'say', 'saying', 'says', 'second', 'secondly', 'see', 'seeing', 'seem', 'seemed', 'seeming', 'seems',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   959
                     'seen', 'self', 'selves', 'sensible', 'sent', 'serious', 'seriously', 'seven', 'several', 'shall', 'she', 'should',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   960
                     'shouldn\'t', 'since', 'six', 'so', 'some', 'somebody', 'somehow', 'someone', 'something', 'sometime', 'sometimes',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   961
                     'somewhat', 'somewhere', 'soon', 'sorry', 'specified', 'specify', 'specifying', 'still', 'sub', 'such', 'sup',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   962
                     'sure', 't\'s', 'take', 'taken', 'tell', 'tends', 'th', 'than', 'thank', 'thanks', 'thanx', 'that', 'that\'s',
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   963
                     'thats', 'the', 'their', 'theirs', 'them', 'then', 'thence', 'there', 'there\'s', 'thereafter',
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   964
                     'thereby', 'therefore', 'therein', 'theres', 'thereupon', 'these', 'they', 'they\'d', 'they\'ll', 'they\'re',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   965
                     'they\'ve', 'think', 'third', 'this', 'thorough', 'thoroughly', 'those', 'though', 'three', 'through', 'throughout',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   966
                     'thru', 'thus', 'to', 'together', 'too', 'took', 'toward', 'towards', 'tried', 'tries', 'truly', 'try', 'trying',
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   967
                     'twice', 'two', 'un', 'under', 'unfortunately', 'unless', 'unlikely', 'until', 'unto', 'upon', 'use',
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   968
                     'used', 'useful', 'uses', 'using', 'usually', 'value', 'various', 'very',
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   969
                     'was', 'wasn\'t', 'way', 'we', 'we\'d', 'we\'ll', 'we\'re', 'we\'ve', 'welcome', 'well', 'went', 'were', 'weren\'t',
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   970
                     'what', 'what\'s', 'whatever', 'when', 'whence', 'whenever', 'where', 'where\'s', 'whereafter', 'whereas',
292
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   971
                     'which', 'while', 'who', 'who\'s', 'whole', 'whom', 'whose', 'why', 'will', 'willing', 'wish', 'with', 'within',
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   972
                     'without', 'won\'t', 'wonder', 'would', 'would', 'wouldn\'t', 'yes', 'yet', 'you', 'you\'d', 'you\'ll', 'you\'re',
b3cfaf0a505c Fixed highlighting in search results; changed search algorithm to give more score for terms found in page title; hopefully (hackishly) fixed login_key_cache getting too long
Dan
parents: 272
diff changeset
   973
                     'you\'ve', 'your', 'yours', 'zero');
272
e0ec986c0af3 Searching sucks, and Enano's search algorithm was complete bullcrap. So I rewrote it. No, it does not use Google search technology. Like they have a patent for using the Arial font on search result pages anyway.
Dan
parents: 166
diff changeset
   974
  return $stopwords;
1
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   975
}
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   976
fe660c52c48f Adding /includes
dan@scribus.fuhry.local.fuhry.local
parents:
diff changeset
   977
?>