The Evil Query

Here is the use case: I want to search all of my bookmarks by multiple tags with an “and” operation. That means I want to see bookmarks that are tagged with Java AND JavaEE, not just Java and not just JavaEE. I want to see bookmarks tagged with C# AND 3.5 not just C# and not just 3.5. For most of the project I’m using Hibernate and the Criteria API with a lot of success. In this case I was not able to massage Hibernate into creating the correct query and that’s fine. Hibernate has its uses and it is acceptable to write custom SQL when the need arises. I wrote a method to build up the query and execute it. Here is part of the method that builds up the query:

StringBuilder sqlString =
new StringBuilder(
“select distinct(b.id)
from bookmarks b “);
int counter = 0;
for(Tag tag : tags) {
String alias = “j” + Integer.toString(counter);
String joinFragment = ” inner join bookmarks_to_tags {alias}
on b.id = {alias}.bookmark_id
and {alias}.tag_id = ? “;
joinFragment = joinFragment.replace(“{alias}”, alias);
sqlString.append(joinFragment);
counter++;
}

Right away it looks like this query will get very expensive for large tag sets. For smaller sets the query performs in a reasonable amount of time but more than 12 tags makes the query take quite a while. This is exacerbated by the fact that this query will be executed multiple times if there is a sufficiently large result set (default is over 25) to give the user a paginated view. This totally flies in the face of being nice to the database. Hammering it with a complex query over and over won’t do the users any favors.

What are my options here? I can work on making this query less complex (something that I will certainly do), but there is still the problem that, no matter how nice this query becomes, the query will be executed over and over as a user pages through the results. The complexity of the query and its repeated execution makes the result set a good candidate for caching. Caching the result set in memory with a key tied to the user or their session will allow the expensive query to be executed once with the bookmarks stored in a cache. When the user pages through the results the cache will be consulted first where the result set will be stored. I’m in the process of selecting a caching package for this project so I will post my selection and integration process soon.

Until then, I’d be happy to hear any suggestions on making this query a little nicer while preserving its “andness”.

Spring Web MVC and the benefits of pagination

Lately I’ve been working on a project that keeps my web bookmarks sync’d across multiple computers and different web browsers. The project is implemented in Java using JBoss 4.2.2, MySQL 5.0 and Spring 2.0. It’s not a very big project right now, only about 40 class files and some small number of configuration files, but the amount of data the application deals with can get fairly large.
Continue reading →

Spring Web MVC lists

I recently came across a blog post describing how *not* to use the Spring Web MVC form tag library. Spring Web MVC is one of those things that has a high learning curve but it pays off big once you get the hang of it. The form tag library is no exception. The original blog poster seems to want to display a list of things the form user can manipulate and then send the changes back to the app server. There are a few different ways we can do this with Spring Web MVC and some JSTL libraries so let’s take a look at the way I prefer to do it. Continue reading →

Response to Web Application: To Couple or Not to Couple?

I just read this thread on The Server Side and it makes me nervous that people still ask this kind of question. The thread starts off by asking “An usually annoying question when designing an action-based web application is: “Should I place everything in my action or should I separate the web logic from the business logic?” and the alarm bells in my head are going off already. Continue reading →

RE: Generics Microbenchmark

After the popularity of my “How’d this String get into my List<Integer>?!” post I decided to follow up with a few small benchmarks. I got the idea from Geoffrey Wiseman who read my previous post and wondered about Generics performance. Continue reading →