PostgreSQL slow COUNT() - is trigger the only solution? -
i have table posts, categorized by:
- type
- tag
- language
all of "categories" stored in next tables (posts_types) , connected via next tables (posts_types_assignment).
counting in postgresql slow (i have more 500k records in table) , need number of posts categorized combination of type/tag/lang.
if solve through triggers, full of many multi-level loops, doesn't nice , hard maintenance.
is there other solution how actual number of posts categorized in type/tag/language?
let me straight.
you have table posts. have table posts_types. 2 have many many join on posts_types_assignment. , have query slow:
select count(*) posts p join posts_types_assigment pta1 on p.id = pta1.post_id join posts_types pt1 on pt1.id = pta1.post_type_id , pt1.type = 'language' , pt1.name = 'english' join posts_types_assigment pta2 on p.id = pta2.post_id join posts_types pt2 on pt2.id = pta2.post_type_id , pt2.type = 'tag' , pt2.name = 'awesome' and know why painfully slow.
my first note postgresql have lot less work if had identifiers in posts table rather in joins. moot issue, decision has been made.
my more useful note believe postgresql has similar query optimizer oracle. in case limit combinatorial explosion of possible query plans has consider, considers plans start table, , repeatedly joins on 1 more data set @ time. no such query plan work here. can start pt1, 1 record, go pta1, bunch of records, join p, wind same number of records, join pta2, , huge number of records, join pt2, few records. joining pta2 slow step, because database has no idea records want, , therefore has create temporary result set every combination of post , piece of metadata (type, language or tag) on it.
if indeed problem, right plan looks this. join pt1 pta1, put index on it. join pt2 pta2, join result of first query, join p. count. means don't huge result sets.
if case, there no way tell query optimizer once want think new type of execution plan. there way force it.
create temporary table t1 select pta* posts_types pt join posts_types_assignment pta on pt.id = pta.post_type_id pt.type = 'language' , pt.name = 'english'; create index idx1 on t1 (post_id); create temporary table t2 select pta* posts_types pt join posts_types_assignment pta on pt.id = pta.post_type_id join t1 on t1.post_id = pta.post_id pt.type = 'language' , pt.name = 'english'; select count(*) posts p join t1 on p.id = t1.post_id; barring random typos, etc, perform better. if doesn't, double check indexes on tables.
Comments
Post a Comment