Saturday, March 14, 2009

Find an article with a list of tags

Let's say I have a web site with articles, and each article can have one or more tags, just like the blog posts in this web site (where they are called "labels").
I save all of the article's tags in a table, called ArticleTags, that have two columns: ArticleId and TagId.
Now let's say I have a list of tags - ASP.NET, SQL and WEB, for example, with IDs 1,2,3.
I want to find all of the articles that have all 3 tags.
If I try the simple query

SELECT ArticleID FROM ArticleTags WHERE TagId in (1,2,3)

I will get all the articles with either tag 1 or tag 2 or tag 3 - not the articles with all three of them.

When I looked for solutions on the web, I was surprised to find that many implemented this by saving all the tags as one long text field of the articles = "ASP.NET,SQL,WEB" - and used "LIKE" in the query. It is inefficient, because of the use of "like", and because the search is on the tag text itself and not on the tag ID, and also limits the number of tags to the capacity of one field, and I also want to save the text in a separate table, for localizations and other uses.

I finally got the following solution that gives me exactly what I was looking for:

SELECT ArticleID
FROM ArticleTags
WHERE (TagId IN (1, 2, 3))
GROUP BY ArticleID
HAVING (COUNT(*) = 3)

I am using the GROUP option to group all the same ArticleID together, and then count them. If their count is exactly the number of Tag IDs I am looking for, it means that this article has all the tags in my list. This condition has to be in the HAVING part of the query and not in the WHERE clause because it uses an aggregation field.

No comments:

Post a Comment