Zend Search Lucene : Allowed Memory Exhausted

Zend Search Lucene : Allowed Memory Exhausted

lekshmi
Hi,

    I am getting an "allowed memory exhausted" error when using find() to read indexed records. Here is the code I'm using.

Zend_Search_Lucene_Analysis_Analyzer::setDefault( new StandardAnalyzer_Analyzer_Standard_English() );
$index = Zend_Search_Lucene::open(INDEX_PATH);
$userQuery = Zend_Search_Lucene_Search_QueryParser::parse($query);
$userQuery = $index->find($userQuery, 'biz_type', SORT_NUMERIC, SORT_ASC);

If I use the $index->find() statement alone, there is no memory exhausted error. I need to get all the ids and biz_type values from the index, sorted by biz_type. I don't know where I'm going wrong. Can someone please help me solve this problem? This is very urgent.

Thanks,
Lekshmi.

RE: Zend Search Lucene : Allowed Memory Exhausted

Alexander Veremyev
Hi,

Your result set is probably too large.

If a non-default sort order is used, Zend_Search_Lucene has to
retrieve all matched documents from the index (otherwise only the set
of document IDs and scores is collected).

This dramatically increases search time and memory usage.

The first way to solve this problem is to use the result set limit
functionality:
-------------------------------
<?php
...
Zend_Search_Lucene::setResultSetLimit($N);
$hits = $index->find($userQuery, 'biz_type', SORT_NUMERIC, SORT_ASC);
...
---------------
Keep in mind, though, that these are the "first N results" found, not
the "best N" by score or by the sort field.

The second way is to retrieve no more than N results with the best
scores (without additional sorting options), then retrieve the field
values and sort the results manually.
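
A minimal sketch of that manual sort, using stand-in data rather than a
real index. The $rows array, its values and the topNSortedByField()
helper are hypothetical names introduced only for illustration; in real
code each row would be built from a hit's id and score plus the stored
biz_type field. It uses closures, so it assumes PHP 5.3 or later.

```php
<?php
// Stand-in rows (made-up values); not real index output.
$rows = array(
    array('id' => 7, 'score' => 0.91, 'biz_type' => 3),
    array('id' => 2, 'score' => 0.40, 'biz_type' => 1),
    array('id' => 5, 'score' => 0.75, 'biz_type' => 2),
    array('id' => 9, 'score' => 0.10, 'biz_type' => 1),
);

/**
 * Hypothetical helper: keep the $n best-scoring rows,
 * then order them by $field in ascending numeric order.
 */
function topNSortedByField(array $rows, $n, $field)
{
    // Best scores first.
    usort($rows, function ($a, $b) {
        if ($a['score'] == $b['score']) {
            return 0;
        }
        return ($a['score'] > $b['score']) ? -1 : 1;
    });

    // Keep only the N best rows.
    $rows = array_slice($rows, 0, $n);

    // Re-order the survivors by the requested field, ascending.
    usort($rows, function ($a, $b) use ($field) {
        if ($a[$field] == $b[$field]) {
            return 0;
        }
        return ($a[$field] < $b[$field]) ? -1 : 1;
    });

    return $rows;
}

$top = topNSortedByField($rows, 3, 'biz_type');
foreach ($top as $row) {
    echo $row['id'], ' ', $row['biz_type'], "\n";
}
```

Only N rows are ever sorted by biz_type here, so memory use is bounded
by N rather than by the size of the full result set.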


The third way (which may be combined with the second) is to walk
through the complete result set one hit at a time, store only the
necessary fields in arrays, and destroy the already processed $hit
objects:
-------------------------------
<?php
...
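$hits = $index->find($userQuery);

// Collect only the IDs and scores from the hits.
$docIDs    = array();
$docScores = array();
foreach ($hits as $hit) {
    $docIDs[]    = $hit->id;
    $docScores[] = $hit->score;
}
unset($hits); // free the hit objects

// Load each document and keep only the needed fields.
$biz_types      = array();
$someOtherField = array();
foreach ($docIDs as $id) {
    $doc = $index->getDocument($id);

    $biz_types[]      = $doc->biz_type;
    $someOtherField[] = $doc->someOtherField;
}
// Sort all arrays in parallel: biz_type, then score, then ID.
array_multisort($biz_types, SORT_NUMERIC, SORT_ASC,
                $docScores, SORT_NUMERIC, SORT_DESC,
                $docIDs,    SORT_NUMERIC, SORT_ASC,
                $someOtherField);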
---------------
This avoids the memory overhead of storing the completely retrieved
result set wrapped in hit objects (retrieving any stored field triggers
full document loading).
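
To show what the array_multisort() call does to parallel arrays, here is
a small standalone sketch. The values are made up and stand in for what
would be read from the index; only the array names follow the snippet
above.

```php
<?php
// Stand-in values; position i in each array describes the same document.
$biz_types = array(2, 1, 2, 1);
$docScores = array(0.5, 0.9, 0.8, 0.3);
$docIDs    = array(10, 11, 12, 13);

// Order by biz_type ascending; ties are broken by score descending,
// then by document ID ascending. All three arrays are reordered together.
array_multisort($biz_types, SORT_NUMERIC, SORT_ASC,
                $docScores, SORT_NUMERIC, SORT_DESC,
                $docIDs,    SORT_NUMERIC, SORT_ASC);

foreach ($docIDs as $i => $id) {
    echo $id, ' ', $biz_types[$i], ' ', $docScores[$i], "\n";
}
```

After the call, $docIDs is in the order 11, 13, 12, 10, and index i in
every array still refers to the same document.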


With best regards,
   Alexander Veremyev.

> -----Original Message-----
> From: lekshmi [mailto:[hidden email]]
> Sent: Thursday, July 24, 2008 9:20 AM
> To: [hidden email]
> Subject: [fw-formats] Zend Search Lucene : Allowed Memory Exhausted