News/blog Contact

Sort Order Problems in Relational Databases.
Kim S. Larsen.
International Journal of Foundations of Computer Science, 9(4): 399-429, 1998.
A relation of degree k can be sorted lexicographically in k! different ways, i.e., according to any one of the possible permutations of the schema of the relation. Such permutations are referred to as sort orders. When evaluating unary and binary relational algebra operators using sort-merge algorithms, sort orders fulfilling the constraints enforced by the operators are chosen for the operand relations. The relations are then sorted according to their assigned sort orders, and the result is obtained by merging. Should the operands already be sorted according to one of the permissible sort orders, then only a merging is required. The sort order of the result will depend on the sort orders of the operands.

When evaluating whole relational algebra expressions, the result of one operation will be used as an operand to the next. It is desirable to choose sort orders in such a way that the result of one operation will automatically fulfill the requirements of the next. In general, one would like to find a minimal number of operators in the expression for which this cannot be obtained, bearing in mind the overall goal of minimizing the total work.

We show that this problem is NP-hard, and that the corresponding decision problem is NP-complete. However, most simplifications of the original problem give rise to efficient algorithms. In fact, most frequently occurring queries can be analyzed in linear time in the size of the query. This is due to the fact that only a very limited number of subsets of all permutations of schemas can be encountered in the algorithms, which means that compact representations for these subsets can be found.


publication
Link to the publication at the publisher's site - subscription may be required.
Text required by the publisher (if any): The publication is available via EBSCO Publishing.

open access (268 KB)
The same as the publisher's version, when the publisher permits. Otherwise, the author's last version before the publisher's copyright; this is often exactly the same, but sometimes fonts, page numbers, figure numbers, etc. are different. It may also be a full version. However, it is safe to read this version, and at the same time cite the official version, as long as references to concrete locations, numbered theorems, etc. inside the article are avoided.

other publications
Other publications by the author.