CodeSOD: A Split in the Database

Oracle is… special. While their core product is their database software, what they actually sell is layers and layers of ERPs and HR systems that run on top of that database. And what they really make money on is the consulting required to keep those monsters from eating your company's IT team alive.

Because these ERPs are meant to be all things to all customers, you also will find that there are a lot of columns named things like attribute3. Your company's custom logic can stuff anything you want in there. "Do as thou wilt," as they say. And suffer the consequences.

For Steven H, his company had a requirement. The order lines needed to join to the manufactured batch that was used to fill that order. This makes sense, and is easy to implement- you add a join table that links BatchId and OrderLineId And, if the folks who build this feature did that, we wouldn't have an article.

To "solve" this problem, they simply mashed together all the order line IDs fulfilled by a batch into a text field called attribute7. The data looked like:

413314|413315|413329

That fulfilled the requirement, in someone's mind, and the ticket was closed and folks moved on to other work. And then a few years later, someone asked if they could actually display that data on a report. It seemed like a simple request, so it got kicked off to an offshore team.

This was their solution:

CREATE VIEW batch_order_lines_vw AS SELECT bh.batch_id, ol.header_id, <other fields go here> FROM order_lines ol, batch_header bh WHERE 1=1 AND ol.line_id IN ( SELECT TRIM( REGEXP_SUBSTR( bh.attribute7, '[^|]+', 1, LEVEL)) FROM DUAL CONNECT BY REGEXP_SUBSTR( bh.attribute7, '[^|]+', 1, LEVEL) IS NOT NULL) ORDER BY line_id ASC

This query joins the batches to the order lines by using a REGEXP_SUBSTR to split those pipe-separated order lines. In fact, it needs to run the same regex twice to actually handle the split. In a subquery that is going to be executed for every combination of rows in order_lines and batch_header. Each table has millions of rows, so you already know exactly what this query does: it times out.

Speaking of things timing out, Steven has this to say about where this went:

We reported this to the database development team and marked the request as blocked. It's been maybe 2 years since then and it's still in that same state. I have since transferred to another team.

[Advertisement] ProGet’s got you covered with security and access controls on your NuGet feeds. Learn more.

This post originally appeared on The Daily WTF.

Leave a Reply

Your email address will not be published. Required fields are marked *