Andrew Mobbs (mobbsy) wrote,
Andrew Mobbs

I was asked a question at work, so I was thinking about all this stuff, and it occurred to me that there might be somebody else in the world who was vaguely interested.

Also, people have asked me what exactly I do for a living, so far today; answer this question, look at what CPU configurations of HP servers can be bought, point somebody at the Oracle documentation on supplied PL/SQL packages, email a colleague about the differences between cheap and expensive network attached storage, and start writing up the analysis for a mechanism I've invented for transfering data between two parts of our software. In other weeks I've run benchmarks, coded prototypes, gone on-site to fix customer problems or as a consultant and been a group sysadmin. I've no idea what my job description is.

Oracle normally stores data in "heap tables", which are just packed row data. There's no better way of finding a given row than searching through the whole table, so the table data is referenced by B*Tree indexes. The index maps a key to a set of (datablock,offset) addresses. Given a key you traverse the tree, read the address and go straight to the appropriate data blocks.

There's an alternative sort of table called an "Index Organized Table" (IOT), which essentially has the whole table being a B*Tree structure, with some of the columns forming the primary key (which as a whole must uniquely identify a given row). This is efficient if virtually all data accesses to the table go via the primary key; you don't need to read the index blocks then the data blocks since everything is there together. You need not use all columns of the primary key to retrieve data, just the leading ones, for example the primary key may be (account,order_number) and you can still efficiently retrieve all rows for a given account.

There's a problem though if you don't access the data in an IOT via the primary key, it's expensive since you need to search many blocks to find the data you're looking for. It gets a little worse since you can't simply make a traditional secondary index that points toward the datablocks, since the B*tree structure of the IOT means data will shift around as the tree grows and it would be very inefficient to go and renumber all the index references every time that happened.

So, the secondary index on an IOT actually stores a mapping between the secondary key value and a set of primary keys (the secondary index need not refer to a unique row). To do a lookup by a secondary key means first using the secondary key to find one or more primary keys, then searching through the IOT by primary key to find the required data. This is quite bad, but not as bad as searching the whole IOT.

Oracle mostly¹ uses something called a "Cost Based Optimizer" (CBO) to decide how best to get to data given a query. Often it's most efficient to use an index, but sometimes there's a choice of index, or if you're getting back lots of rows then it's sometimes faster to just go to the data than go through the whole mechanism I described above for many rows. To use the CBO effectively you have to get Oracle to measure statistics about the size and shape of the data, otherwise it just has some default values that tend to make it prefer to use an index if there's one there.

After we made some changes² to an IOT, Oracle seems to have changed its mind about how to best get back the data. It used to see that there was a very simple query asking for a few values and specifying the first couple of elements of the primary key. The obviously correct method is to go directly to the IOT traverse the key and give back all rows with matching values. I was surprised to see that Oracle was now reading the entire secondary index, which had nothing to do with the query.

What I think is happening is that because all the values the query requested are part of the primary key, Oracle can retrieve all the requested data without ever looking at the IOT, as all the values are in the secondary key. The CBO is making some assumption about the size of the secondary index and the size of IOT and guessing wrongly that it's going to be faster to get the data by reading all of the secondary index than just bits of the IOT. I hope to fix this by getting Oracle to measure the statistics about the index and IOT to give it the information to make better guesses. Any DBA would tell me that we should be doing that anyway, but see ¹ below for the excuse.

¹ mostly> The other, older, method is the "Rule Based Optimizer", which uses a set of rules to determine the access path. The advantage of this to the software vendor is that it won't do the sort of random things I just described the CBO doing. However, it is less flexible, and using newer features of Oracle (such as IOTs) will silently enable the CBO as the RBO knows nothing about them. The version of software in question primarily uses the RBO, so we don't generally gather statistics; but because it uses an IOT, queries on that table use the CBO regardless.

² changes> For the curious, we partitioned the table; it's a big table, which can be difficult to manage as a single object (for example, backups). Oracle allows you to split a table into partitions, which leaves the table looking identical, but each partition can be managed independently.

  • Post a new comment


    default userpic

    Your IP address will be recorded 

    When you submit the form an invisible reCAPTCHA check will be performed.
    You must follow the Privacy Policy and Google Terms of use.