So, last week I hit a point where I had to post a question to the Hibernate Forums. I thought that I was doing something fairly straight-forward, but as far as I could tell, there was an undocumented limitation in Hibernate.
Imagine this scenario: you’ve got a persistent class, Account, and it has a one-to-one relationship with another class, AccountTransactionSummary. AccountTransactionSummary contains summary information — the last transaction date, and the like. As you might expect, the underlying table for AccountTransactionSummary is really a view, defined something like this:
create view ACCT_TRANS_SUMM (INSTITUTION, TRANSIT, ACCOUNT_NUMBER, LAST_TRANSACTION_DATE) as select INSTITUTION, TRANSIT, ACCOUNT_NUMBER, MAX(TRANSACTION_DATE) from TRANSACTION group by INSTITUTION, TRANSIT, ACCOUNT_NUMBER;
Although rare, it’s possible for an account to have no transactions, and therefore no AccountTransactionSummary.
The code all worked, but the performance implication was huge. Hibernate would try to populate the objects by issuing a statement like this:
SELECT account_.*, accountTran_.* FROM ACCOUNT account_ LEFT OUTER JOIN ACCT_TRANS_SUMM accountTran_ on account_.INSTITUTION = accountTran_.INSTITUTION and account_.TRANSIT = accountTran_.TRANSIT on account_.ACCOUNT_NUMBER = accountTran_.ACCOUNT_NUMBER where account_.ACCOUNT_HOLDER = ?
Our database product, DB2 for mainframe (z/OS), reacts to a statement like this by performing something called “view materialization”. It essentially creates, in memory, a copy of the contents of the entire view. It doesn’t even try to limit the view just to the few accounts held by one account holder — instead it materializes the data for every account. That had a significant performance slowdown.
So I flicked some switches in Hibernate. I indicated, in my mapping file, that the AccountTransactionSummary object had a fetch mode of “SELECT”, rather than “JOIN”. Like so:
<one-to-one name="transactionSummary" class="com.example.model.AccountTransactionSummary" fetch="select" />
That prevents the left outer join, but introduces the “N+1 SELECT” problem. That is to say, if my first query returns “N” Accounts, Hibernate issues an additional select statement for each of the Accounts. Hence, I get “N+1” total select statements.
No probs, I thought. I could mitigate that problem using the “batch-size” parameter on the class element in my AccountTransactionSummary.hbm.xml file. The batch-size parameter says, in essence, when trying to resolve N relationships, do it in chunks. So rather than issue “N” additional select statements, I issue “N/batch-size” additional select statements. Set the batch-size to be a large-ish number and most cases will only take a single additional statement.
Except, it didn’t work. Hibernate, for all that I could tell, simply ignored my batch-size parameter and happily sat there spitting out N+1 select statements.
I pushed and prodded all the settings that I thought might be affecting things. I figured that I must have done something wrong. Eventually, to maintain my sanity, I changed my settings so that rather than be a one-to-one relationship, I defined the AccountTransactionSummary as a many-to-one relationship. That’s not, strictly speaking, the “correct” relationship, but I needed to try something. As a many-to-one, the batch-size took effect immediately, and the N+1 problem disappeared.
(It’s worth noting that a many-to-one relationship behaves differently if the relationship can’t be resolved, though).
Those findings surprised me. Hibernate is such a great tool to use that I get shocked when I run into what appears to be an arbitrary limitation: that something that works fine for one relationship type doesn’t work for another relationship type, even when I’d expect internal code to be the same in both cases. It’s at times like this that I’m inclined to go crawling through the Hibernate code to find the problem and then create a patch.