Quantcast

Question about SQL (not ZF related, sorry)

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Question about SQL (not ZF related, sorry)

Aurimas Likas
I want to ask Im sure a really simple question (at laest for most of you). Here is the situation:

I have 3 DB tables: products, tags and products_tags. products_tags has 2 columns product_id and tag_id (for many-to-many relationship). So the question is: how can I select products which have tag1 AND tag2 and tag3 and .... tagN? I need a best way in terms of performance. The only way I can think of now is to create a query with self joins. The query joins number would equal tags number - 1. Is there a better way? Maybe I miss here something fundamental (think). My project products need a way to be filtered by various tags...

Thanks in advance.

--
Aurimas L.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Question about SQL (not ZF related, sorry)

Daniel Latter-3
If this is your only use case you could just have one product table
with an extra "tags" column, here you could list the tags space
separated then just do a full text search on that column, this would
only take one query.

Thanks

On Wednesday, June 30, 2010, Aurimas Likas <[hidden email]> wrote:
> I want to ask Im sure a really simple question (at laest for most of you). Here is the situation:
>
> I have 3 DB tables: products, tags and products_tags. products_tags has 2 columns product_id and tag_id (for many-to-many relationship). So the question is: how can I select products which have tag1 AND tag2 and tag3 and .... tagN? I need a best way in terms of performance. The only way I can think of now is to create a query with self joins. The query joins number would equal tags number - 1. Is there a better way? Maybe I miss here something fundamental (think). My project products need a way to be filtered by various tags...
>
> Thanks in advance.
> --
> Aurimas L.
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Question about SQL (not ZF related, sorry)

Henry Umansky-2
In reply to this post by Aurimas Likas
On Jun 30, 2010, at 3:28 PM, Aurimas Likas wrote:

> I want to ask Im sure a really simple question (at laest for most of you). Here is the situation:
>
> I have 3 DB tables: products, tags and products_tags. products_tags has 2 columns product_id and tag_id (for many-to-many relationship). So the question is: how can I select products which have tag1 AND tag2 and tag3 and .... tagN? I need a best way in terms of performance. The only way I can think of now is to create a query with self joins. The query joins number would equal tags number - 1. Is there a better way? Maybe I miss here something fundamental (think). My project products need a way to be filtered by various tags...
>
> Thanks in advance.
>
> --
> Aurimas L.

I had to do something similar a few years ago. Here is my code, except articles = products, and keywords = tags:

$pKeywords = array("any", "keywords", "array"); //keyword array sent into function

$kwSelect = $articles->select ();
$kwSelect->setIntegrityCheck ( false );
$kwSelect->from ( array ('ak' => 'articles_keywords' ), array ('ak.article_id' ) );
$kwSelect->joinInner ( array ('k' => 'keywords' ), 'k.id = ak.keyword_id', array () );

foreach ( $pKeywords as $kw ) {
        $kwSelect->orWhere ( 'k.slug = ?', $kw );
}

$kwSelect->group ( array ('ak.article_id' ) );

if($this->_kwIntersection == true)
        $kwSelect->having ( 'COUNT(ak.article_id)=' . count ( $pKeywords ) );

$select->where ( 'a.id IN (?)', new Zend_Db_Expr ( $kwSelect ) );

Not sure if this is the best way to handle your scenario, but it seemed to work for me. In this example, if _kwIntersection is true, then it's an OR operation, if it's false, then it's an AND operation. Also, there is another field in the database called "slug". The "slug" is the URL-safe version of the keyword, so for example, "Mechanical & Aerospace Engineering" slug would be "mechanical-aerospace-engineering", but your script might not need a slug. Hope this helps.

-Henry

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Question about SQL (not ZF related, sorry)

Andrew Ballard
In reply to this post by Aurimas Likas
On Wed, Jun 30, 2010 at 3:28 PM, Aurimas Likas <[hidden email]> wrote:

> I want to ask Im sure a really simple question (at laest for most of you).
> Here is the situation:
>
> I have 3 DB tables: products, tags and products_tags. products_tags has 2
> columns product_id and tag_id (for many-to-many relationship). So the
> question is: how can I select products which have tag1 AND tag2 and tag3 and
> .... tagN? I need a best way in terms of performance. The only way I can
> think of now is to create a query with self joins. The query joins number
> would equal tags number - 1. Is there a better way? Maybe I miss here
> something fundamental (think). My project products need a way to be filtered
> by various tags...
>
> Thanks in advance.
>
> --
> Aurimas L.
>

Does this work?

<?php

$tags = array('foo', 'bar', 'widget');

$inner = $db->select()
            ->from('product_tags', 'product_id')
            ->join('tags', '`product_tags`.`tag_id` = `tags`.`tag_id`')
            ->where('`tags`.`tag` IN (?)', $tags)
            ->where('`products`.`product_id` = `product_tags`.`product_id`')
            ->group('`products`.`product_id`')
            ->having('COUNT(*) = ?', count($tags));


$select = $db->select()
             ->from('products',
                    array('product_id', 'product_name', 'price'))
             ->where('EXISTS (?)', $inner);

?>

It should generate a statement that looks something like this:

SELECT      `products`.`product_id`, `products`.`product_name`,
`products`.`price`
FROM        `products`
WHERE       EXISTS (
                SELECT      `product_id`
                FROM        `product_tags`
                INNER JOIN  `tags`
                                ON  `tags`.`tag_id` = `product_tags`.`tag_id`
                WHERE       `tags`.`tag` IN ( 'foo', 'bar', 'widget' )
                  AND       `product_tags`.`product_id` =
`products`.`product_id`
                GROUP BY    `products`.`product_id`
                HAVING      COUNT(*) = 3
            )


It uses an aggregate sub-query to find rows for product IDs that match
all the tags, and then applies the result set to the products table.

Andrew
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Question about SQL (not ZF related, sorry)

BillKarwin
In reply to this post by Aurimas Likas

On Jun 30, 2010, at 12:28 PM, Aurimas Likas wrote:

> I have 3 DB tables: products, tags and products_tags. products_tags  
> has 2 columns product_id and tag_id (for many-to-many relationship).  
> So the question is: how can I select products which have tag1 AND  
> tag2 and tag3 and .... tagN? I need a best way in terms of  
> performance. The only way I can think of now is to create a query  
> with self joins. The query joins number would equal tags number - 1.

Using N-1 self-joins is the best solution for performance, especially  
if you use MySQL.  It  appears to require a lot of joins, and common  
wisdom is that you should avoid joins.

SELECT p.* FROM products p
JOIN products_tags pt1 ON p.product_id = pt1.product_id
JOIN tags t1 ON pt1.tag_id = t1.tag_id AND t1.tag = 'medical'
JOIN products_tags pt2 ON p.product_id = pt2.product_id
JOIN tags t2 ON pt1.tag_id = t2.tag_id AND t2.tag = 'performance'
JOIN products_tags pt3 ON p.product_id = pt3.product_id
JOIN tags t3 ON pt3.tag_id = t3.tag_id AND t3.tag = 'reporting';

If you use natural keys for the tags table instead of a pseudokey,  
then your products_tags table contains the value you need, and you an  
eliminate half of those joins.  For example:

CREATE TABLE tags (
   tag VARCHAR(20) PRIMARY KEY   -- natural key, not pseudokey
);

CREATE TABLE products_tags (
   product_id INT NOT NULL,
   tag VARCHAR(20) NOT NULL,  -- natural key, not pseudokey
   PRIMARY KEY (product_id, tag),
   FOREIGN KEY (product_id) REFERENCES products(product_id),
   FOREIGN KEY (tag) REFERENCES tags(tag)
);

SELECT p.* FROM products p
JOIN products_tags pt1 ON p.product_id = pt1.product_id AND pt1.tag =  
'medical'
JOIN products_tags pt2 ON p.product_id = pt2.product_id AND pt2.tag =  
'performance'
JOIN products_tags pt3 ON p.product_id = pt3.product_id AND pt3.tag =  
'reporting';

The primary key index in products_tags gives this good performance.  
Your EXPLAIN report should say "Using index" which means it can  
satisfy the query simply by reading the index, without reading the  
data for the products_tags table at all.  If your index is cached in  
database memory, this runs very fast.

A different solution that some people prefer is to use GROUP BY and  
include groups that have as many rows as the number of tags:

SELECT p.*
FROM products p
JOIN products_tags pt ON p.product_id = pt.product_id
WHERE pt.tag IN ('medical', 'performance', 'reporting')
GROUP BY p.product_id    -- EXPLAIN reports "Using temporary; Using  
filesort"
HAVING COUNT(*) = 3;

Common wisdom says that one join is always better than three or six,  
but this actually performs much *worse* (if you use MySQL), because  
virtually any query using GROUP BY uses a temporary table, which kills  
performance because disk I/O is expensive, and temporary tables  
usually write to disk.

This might not be such a bad solution if you use some other brand of  
database that optimizes GROUP BY better than MySQL.  So you should  
analyze queries with EXPLAIN (or equivalent) and profile carefully  
with sample data of realistic size.

Regards,
Bill Karwin
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Question about SQL (not ZF related, sorry)

BillKarwin

On Jun 30, 2010, at 2:12 PM, Aurimas Likas wrote:

> So, Bill, if my only choice is MySQL and if I dont need to join tags  
> table itself (I provide tag IDs to the function selecting products  
> and I dont need any tags data (their IDs are used only for  
> filtering)), I think I can leave pseudo keys (they should be faster  
> than natural ones) and get pretty good performance?? Tags number  
> should be relatively small (up to 10, usually up to 3-5).

Yes, if you know the integer tag_id values, you can continue to use  
pseudokeys and still eliminate the extra three joins.

There's a performance advantage to using integers instead of strings,  
because integer comparisons are a bit faster and also the total space  
required for your index is smaller, so it takes less memory to keep it  
in cache.

But these advantages are actually pretty small compared to the  
advantage of avoiding the cost of using a temporary table.  So you  
have already achieved a big win by using joins instead of group by.  
Any extra benefit from using integer pseudokeys is icing on the  
cake.  :-)

Regards,
Bill Karwin

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Question about SQL (not ZF related, sorry)

Andrew Ballard
In reply to this post by BillKarwin
On Wed, Jun 30, 2010 at 4:46 PM, Bill Karwin <[hidden email]> wrote:

> Common wisdom says that one join is always better than three or six, but
> this actually performs much *worse* (if you use MySQL), because virtually
> any query using GROUP BY uses a temporary table, which kills performance
> because disk I/O is expensive, and temporary tables usually write to disk.
>
> This might not be such a bad solution if you use some other brand of
> database that optimizes GROUP BY better than MySQL.  So you should analyze
> queries with EXPLAIN (or equivalent) and profile carefully with sample data
> of realistic size.
>
> Regards,
> Bill Karwin
>

Interesting. I'll defer to you on MySQL since you know it much better
than I (and I don't have anything running on MySQL that I could test
quickly right now anyway). Still, the idea left me somewhat curious,
so I ran a test on one of our SQL Server databases to see how some
different strategies compared.

In the interest of disclosure, the actual tables I used (renamed in
the examples below) are not very large as databases go. The "products"
table has almost 54000 rows, the "tags" table has almost 350, and the
"product_tags" table has about 132000 rows. (The nature of the real
tables that I tested is that each row in the "products" table has at
least one row in the "product_tags" table and only six distinct values
from "tags" have been used so far in "product_tags".)

I compared three different query strategies (shown below). Of these,
METHOD 1 had the lowest overall cost in resources, followed closely by
METHOD 3 (although the total elapsed time for METHOD3 was slightly
faster than that for METHOD 1). I did notice that METHOD 3 grew more
expensive the more tags I included in the search since each tag
requires additional self-joins.

Given that the code for METHOD 1 was the least complex (and therefore
the easiest to define using Zend_Db_Select, just to keep this somewhat
on topic) and had the lowest cost and second lowest overall elapsed
time, I would go that route on SQL Server. As I understand, you're
saying this is not necessarily the case for MySQL.

-- METHOD 1
SELECT      P.product_id, P.product_name, P.price
FROM        products AS P
INNER JOIN  product_tags AS PT
                ON  P.product_id = PT.product_id
INNER JOIN  tags AS T
                ON  T.tag_id = PT.tag_id
WHERE       T.tag IN ( 'foo', 'bar', 'baz', 'floob', 'widget' )
GROUP BY    P.product_id, P.product_name, P.price
HAVING      COUNT(DISTINCT T.tag_id) = 5


-- METHOD 2
SELECT      products.product_id, products.product_name, products.price
FROM        products
WHERE       EXISTS (
                SELECT      product_tags.product_id
                FROM        product_tags
                INNER JOIN  tags
                                ON  tags.tag_id = product_tags.tag_id
                WHERE       tags.tag IN ( 'foo', 'bar', 'baz',
'floob', 'widget' )
                  AND       product_tags.product_id = products.product_id
                GROUP BY    product_tags.product_id
                HAVING      COUNT(*) = 5
            )


-- METHOD 3
SELECT      P.product_id, P.product_name, P.price
FROM        products AS P
INNER JOIN  product_tags AS PT1
                ON  PT1.product_id = P.product_id
INNER JOIN  tags AS T1
                ON  T1.tag_id = PT1.tag_id
                AND T1.tag = 'foo'
INNER JOIN  product_tags AS PT2
                ON  PT2.product_id = P.product_id
INNER JOIN  tags AS T2
                ON  T2.tag_id = PT2.tag_id
                AND T2.tag = 'bar'
INNER JOIN  product_tags AS PT3
                ON  PT3.product_id = P.product_id
INNER JOIN  tags AS T3
                ON  T3.tag_id = PT3.tag_id
                AND T3.tag = 'baz'
INNER JOIN  product_tags AS PT4
                ON  PT4.product_id = P.product_id
INNER JOIN  tags AS T4
                ON  T4.tag_id = PT4.tag_id
                AND T4.tag = 'floob'
INNER JOIN  product_tags AS PT5
                ON  PT5.product_id = P.product_id
INNER JOIN  tags AS T5
                ON  T5.tag_id = PT5.tag_id
                AND T5.tag = 'widget'



Results:
METHOD 1: Query cost as calculated by Query Analyzer (relative to batch) 3%
----------------
Table 'products'. Scan count 1, logical reads 294, physical reads 0,
read-ahead reads 0.
Table 'product_tags'. Scan count 1, logical reads 344, physical reads
0, read-ahead reads 0.
Table 'tags'. Scan count 1, logical reads 5, physical reads 0,
read-ahead reads 0.
SQL Server Execution Times:
   CPU time = 156 ms,  elapsed time = 510 ms.

(7220 row(s) affected)

METHOD 2: Query cost as calculated by Query Analyzer (relative to batch) 89%
----------------
Table 'tags'. Scan count 131715, logical reads 263430, physical reads
0, read-ahead reads 0.
Table 'product_tags'. Scan count 53968, logical reads 108421, physical
reads 0, read-ahead reads 0.
Table 'products'. Scan count 1, logical reads 294, physical reads 0,
read-ahead reads 0.
SQL Server Execution Times:
   CPU time = 953 ms,  elapsed time = 1021 ms.

(7220 row(s) affected)

METHOD 3: Query cost as calculated by Query Analyzer (relative to batch) 9%
----------------
Table 'tags'. Scan count 5, logical reads 25, physical reads 0,
read-ahead reads 0.
Table 'product_tags'. Scan count 5, logical reads 1720, physical reads
0, read-ahead reads 0.
Table 'products'. Scan count 1, logical reads 294, physical reads 0,
read-ahead reads 0.
SQL Server Execution Times:
   CPU time = 266 ms,  elapsed time = 488 ms.

(7220 row(s) affected)


Andrew
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

RE: Question about SQL (not ZF related, sorry)

Vincent de Lau
Hi Andrew,

Just jumping in, here is another approach:

SELECT P.product_id, P.product_name, P.price
FROM  
      products AS P
        INNER JOIN  product_tags AS PT
                ON  P.product_id = PT.product_id
        INNER JOIN  tags AS T
                ON  T.tag_id = PT.tag_id AND T.tag IN ( 'foo', 'bar', 'baz', 'floob', 'widget' )
GROUP BY    P.product_id
HAVING      COUNT(DISTINCT T.tag_id) = 5

Theoretically, this should result in a smaller temporary table, since moving the IN check to the join condition will limit the records stored in a temporary table. Also, assuming that product_id is a primary key, you should be save in grouping on just P.product_id.

Related to this, I recently had a curious issue. While rewriting a subquery into a join, performance dropped dramatically. My hypothesis was that the amount of columns MySQL had to store in the temporary table was so large, that creating a reduced set of columns with a subquery was actually beneficial.

Vincent de Lau
 [hidden email]



> -----Oorspronkelijk bericht-----
> Van: Andrew Ballard [mailto:[hidden email]]
> Verzonden: donderdag 1 juli 2010 17:36
> Aan: Bill Karwin
> CC: [hidden email]
> Onderwerp: Re: [fw-general] Question about SQL (not ZF related, sorry)
>
> On Wed, Jun 30, 2010 at 4:46 PM, Bill Karwin <[hidden email]> wrote:
> > Common wisdom says that one join is always better than three or six,
> but
> > this actually performs much *worse* (if you use MySQL), because
> virtually
> > any query using GROUP BY uses a temporary table, which kills
> performance
> > because disk I/O is expensive, and temporary tables usually write to
> disk.
> >
> > This might not be such a bad solution if you use some other brand of
> > database that optimizes GROUP BY better than MySQL.  So you should
> analyze
> > queries with EXPLAIN (or equivalent) and profile carefully with
> sample data
> > of realistic size.
> >
> > Regards,
> > Bill Karwin
> >
>
> Interesting. I'll defer to you on MySQL since you know it much better
> than I (and I don't have anything running on MySQL that I could test
> quickly right now anyway). Still, the idea left me somewhat curious,
> so I ran a test on one of our SQL Server databases to see how some
> different strategies compared.
>
> In the interest of disclosure, the actual tables I used (renamed in
> the examples below) are not very large as databases go. The "products"
> table has almost 54000 rows, the "tags" table has almost 350, and the
> "product_tags" table has about 132000 rows. (The nature of the real
> tables that I tested is that each row in the "products" table has at
> least one row in the "product_tags" table and only six distinct values
> from "tags" have been used so far in "product_tags".)
>
> I compared three different query strategies (shown below). Of these,
> METHOD 1 had the lowest overall cost in resources, followed closely by
> METHOD 3 (although the total elapsed time for METHOD3 was slightly
> faster than that for METHOD 1). I did notice that METHOD 3 grew more
> expensive the more tags I included in the search since each tag
> requires additional self-joins.
>
> Given that the code for METHOD 1 was the least complex (and therefore
> the easiest to define using Zend_Db_Select, just to keep this somewhat
> on topic) and had the lowest cost and second lowest overall elapsed
> time, I would go that route on SQL Server. As I understand, you're
> saying this is not necessarily the case for MySQL.
>
> -- METHOD 1
> SELECT      P.product_id, P.product_name, P.price
> FROM        products AS P
> INNER JOIN  product_tags AS PT
>                 ON  P.product_id = PT.product_id
> INNER JOIN  tags AS T
>                 ON  T.tag_id = PT.tag_id
> WHERE       T.tag IN ( 'foo', 'bar', 'baz', 'floob', 'widget' )
> GROUP BY    P.product_id, P.product_name, P.price
> HAVING      COUNT(DISTINCT T.tag_id) = 5
>
>
> -- METHOD 2
> SELECT      products.product_id, products.product_name, products.price
> FROM        products
> WHERE       EXISTS (
>                 SELECT      product_tags.product_id
>                 FROM        product_tags
>                 INNER JOIN  tags
>                                 ON  tags.tag_id = product_tags.tag_id
>                 WHERE       tags.tag IN ( 'foo', 'bar', 'baz',
> 'floob', 'widget' )
>                   AND       product_tags.product_id =
> products.product_id
>                 GROUP BY    product_tags.product_id
>                 HAVING      COUNT(*) = 5
>             )
>
>
> -- METHOD 3
> SELECT      P.product_id, P.product_name, P.price
> FROM        products AS P
> INNER JOIN  product_tags AS PT1
>                 ON  PT1.product_id = P.product_id
> INNER JOIN  tags AS T1
>                 ON  T1.tag_id = PT1.tag_id
>                 AND T1.tag = 'foo'
> INNER JOIN  product_tags AS PT2
>                 ON  PT2.product_id = P.product_id
> INNER JOIN  tags AS T2
>                 ON  T2.tag_id = PT2.tag_id
>                 AND T2.tag = 'bar'
> INNER JOIN  product_tags AS PT3
>                 ON  PT3.product_id = P.product_id
> INNER JOIN  tags AS T3
>                 ON  T3.tag_id = PT3.tag_id
>                 AND T3.tag = 'baz'
> INNER JOIN  product_tags AS PT4
>                 ON  PT4.product_id = P.product_id
> INNER JOIN  tags AS T4
>                 ON  T4.tag_id = PT4.tag_id
>                 AND T4.tag = 'floob'
> INNER JOIN  product_tags AS PT5
>                 ON  PT5.product_id = P.product_id
> INNER JOIN  tags AS T5
>                 ON  T5.tag_id = PT5.tag_id
>                 AND T5.tag = 'widget'
>
>
>
> Results:
> METHOD 1: Query cost as calculated by Query Analyzer (relative to
> batch) 3%
> ----------------
> Table 'products'. Scan count 1, logical reads 294, physical reads 0,
> read-ahead reads 0.
> Table 'product_tags'. Scan count 1, logical reads 344, physical reads
> 0, read-ahead reads 0.
> Table 'tags'. Scan count 1, logical reads 5, physical reads 0,
> read-ahead reads 0.
> SQL Server Execution Times:
>    CPU time = 156 ms,  elapsed time = 510 ms.
>
> (7220 row(s) affected)
>
> METHOD 2: Query cost as calculated by Query Analyzer (relative to
> batch) 89%
> ----------------
> Table 'tags'. Scan count 131715, logical reads 263430, physical reads
> 0, read-ahead reads 0.
> Table 'product_tags'. Scan count 53968, logical reads 108421, physical
> reads 0, read-ahead reads 0.
> Table 'products'. Scan count 1, logical reads 294, physical reads 0,
> read-ahead reads 0.
> SQL Server Execution Times:
>    CPU time = 953 ms,  elapsed time = 1021 ms.
>
> (7220 row(s) affected)
>
> METHOD 3: Query cost as calculated by Query Analyzer (relative to
> batch) 9%
> ----------------
> Table 'tags'. Scan count 5, logical reads 25, physical reads 0,
> read-ahead reads 0.
> Table 'product_tags'. Scan count 5, logical reads 1720, physical reads
> 0, read-ahead reads 0.
> Table 'products'. Scan count 1, logical reads 294, physical reads 0,
> read-ahead reads 0.
> SQL Server Execution Times:
>    CPU time = 266 ms,  elapsed time = 488 ms.
>
> (7220 row(s) affected)
>
>
> Andrew

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Question about SQL (not ZF related, sorry)

Andrew Ballard
On Thu, Jul 1, 2010 at 11:55 AM, Vincent de Lau <[hidden email]> wrote:

> Hi Andrew,
>
> Just jumping in, here is another approach:
>
> SELECT P.product_id, P.product_name, P.price
> FROM
>      products AS P
>        INNER JOIN  product_tags AS PT
>                ON  P.product_id = PT.product_id
>        INNER JOIN  tags AS T
>                ON  T.tag_id = PT.tag_id AND T.tag IN ( 'foo', 'bar', 'baz', 'floob', 'widget' )
> GROUP BY    P.product_id
> HAVING      COUNT(DISTINCT T.tag_id) = 5
>
> Theoretically, this should result in a smaller temporary table, since moving the IN check to the join condition will limit the records stored in a temporary table. Also, assuming that product_id is a primary key, you should be save in grouping on just P.product_id.
>


I don't see any noticeable difference between the two in SQL Server.
I'm pretty sure that the optimizer knows that they are the same thing
and uses the same plan for both.


> Related to this, I recently had a curious issue. While rewriting a subquery into a join, performance dropped dramatically. My hypothesis was that the amount of columns MySQL had to store in the temporary table was so large, that creating a reduced set of columns with a subquery was actually beneficial.
>


I suppose it is possible. When faced with something like this, I
frequently write a few variations to a query and compare them against
approximately real data to see which approach performs best in a given
situation. Often the results are what I expected, but many times they
are not. That's why so many threads on a DBA board I follow include
answers of "it depends."

Andrew
Loading...