I'm a big fan of Method R, both the company and the way to optimize SQL (or any other process that can possibly be measured). The method is explained in details in their classic book Optimizing Oracle Performance. But it is hard to practice method R without a good profiler. In every installation of Oracle database the profiler tkprof is included and it does serve for some problems, but hides a lot of information from you.
When I have a tasks that runs for a few hours I enable extended Oracle trace from start to end, avoiding as best as possible anything outside the time interval for the task itself, meaning I coordinate the tracing with the end user. When I have the trace file ready I check it briefly to make sure I don't have some rubbish before or after the time window. Especially on Windows it is important to check that the tracefile has not already been written to by a previous session, may be it is the same on Unix, I haven't checked since I'm in Windows land for the moment. I use two free tools that create a profile in html-format:
Oracle Session Resource Profiler (OraSRP) by Egor Starostin and TVD$XTAT by Christian Antognini, the author of the book Troubleshooting Oracle Performance. In that book he describes his profiler that you can download from his site. Excellent book by the way.
I often use both profilers because I like to compare the results. They usually agree surprisingly well, the only difference is what is considered unaccounted for time. Whenever unaccounted for time is large I check the tracefile again to see if there is a long period of time spent waiting for the client at the end. Sometimes the missing time is located inside the session, and at one occasion when I was stuck I bought a license of mrls that parse the tracefile and makes it easy to track down where this happens.
In addition to create an overall profile for your session they have profiles for individual events, OraSRP is especially good at this. This is interesting because it adds another level. One of the tricks you encounter on the web is to increase the parameter DB_FILE_MULTIBLOCK_READ_COUNT. This parameter decides how much Oracle can try to fetch in each call when reading many blocks at a time, typically db file scattered read or direct path read usually when doing table scans as opposed to single key search in an index. Reading 128 at once must be better than 16 that is often the default, but this profile will in some cases show you that Oracle cannot read a maximum number of blocks for each read call. In one query I was working on the event db file scattered read contributed to 51% of the response time, but a profile on this event showed that the number of blocks pr read call was almost flat with a variation from 1 block pr call (7.1%), 128 blocks pr call (1.0%) to 73 blocks pr call (0.1%). The benefit of increasing the parameter was not as much as I hoped for, and the solution was found elsewhere.
One of the important points with profiling is to discover where not to waste time. You might suspect a full table scan to be the problem (because some people believe FTS is always bad), but large tables can be scanned really fast these days. With tracing and profiling you don't have to guess, you measure and draw a conclusion. There are so many things that can go wrong in a complex system, if you can measure and do simple math you are much more likely to reach your goal early and know that you actually have reached it than when you apply all those tricks and guesswork.
"Filter early" is important when optimizing, with profiling you filter out early where you don't want to apply the focus of your brain.