Database Tuning‎ > ‎

How Oracle uses memory on AIX - Part1

posted Sep 19, 2010, 6:01 PM by Sachchida Ojha   [ updated Sep 27, 2010, 6:41 AM ]


In this post I am going to talk about how ORACLE allocates and uses memory when running on AIX, but I will also talk about the power of approximation and how it can sometimes be misused for ill purposes 

On the outset, ORACLE/AIX memory “deal” seems simple enough – obviously, ORACLE will use memory when it runs and many AIX commands (such as vmstat or ps) will show memory usage both system wide and specific to particular process. But, as always, the devil is in the details and the effect of those “details” may be far from subtle.

So, why don’t we go ahead and find that devil, shall we ?

The straightforward (but naive) approach

Let’s say that we have an ORACLE instance running on one of our AIX servers. How do we find out how much memory it is using ?

First of all, let’s consider what ORACLE memory consists of. Here we actually have 2 separate parts:

  1. Memory used by System Global Area or SGA, that is implemented in AIX as a shared memory segment.
  2. Total memory used by ORACLE processes that represent the instance.

Next, let’s determine how much memory is used by each part.

Fortunately, SGA size is readily available -> just run ipcs -bm command and it’s there (The size of shared segment should match sga_max_size in 10g or memory_max_target in 11g).

Calculating “process” memory is a bit more involved as we need to sum memory used by ALL instance processes, but it is not very difficult either. ps -l command will show individual process memory size and a simple awk script can sum all of them up, such as:

AIX> ps -elf |
egrep " oracle${ORACLE_SID} | ora_.*_${ORACLE_SID} " |
grep -v egrep | awk '{sum += $10} END {print sum/1024}'

The sum of SGA size and process size will obviously represent total memory usage by ORACLE instance and you can see the entire calculation in the slide below.

Well, let me tell you right away where the punch line is – this calculation is WRONG – in a typical case, it overestimates ORACLE memory usage by, at least, a factor of 5-10 or more.

But where exactly have we made a mistake ?

Have we identified ORACLE memory parts incorrectly ? No, ORACLE memory does indeed take 2 parts: SGA and combined per-process memory from instance processes.

Have we made a mistake in the summing formula somewhere ? Well, not really, the formula is trivial – not much room for errors here …

The reason our answer is wrong is more subtle and is related to the fact that modern operating systems (AIX included) employ a number of smart tricks to allocate and manage system memory.

For whatever reason, most AIX commands do not take these tricks into account and display memory “sizes” as if nothing is going on (read: the way it was done in the 70s).

Nevertheless, the tricks are there and their effects are real, so let’s see how we can “out trick the trickster” and find out the real memory allocation.

We will start with process memory.

Estimating Process Memory Use in AIX

Trick #1 – Some memory may be shared

To simplify a little, memory that is used by a typical AIX process usually is divided into 2 major parts:

  1. User Data – Variables, dynamically allocated data, function parameters and return values etc
  2. Program Code (program itself, shared libraries etc This is also known as: Text)

While user data is obviously unique to each process and will change in size slightly (or not so slightly) as the process runs, the code part is different – it is static and, moreover, it is exactly the same for all programs that run it.

Since all ORACLE processes that make ORACLE instance are “instantiated” from the same binary disk image – $ORACLE_HOME/bin/oracle (you did know that, didn’t you? ), there is no reason for operating system to duplicate ORACLE code segment – instead AIX loads it in memory once and then links to each process.

Let’s prove this:

AIX CODE Segments vs DATA Segments

The interesting observation here is that under normal circumstances and especially for idle (ORACLE) processes, code segment will be much larger than data (yes, ORACLE has a big code!), reaching up to 90-95% of the size reported by PS. That means that for ORACLE processes:

ps -l usually significantly over reports memory size, and the real size is MUCH less

Alternatively, I guess, you could say that ps does report size properly but only if this was the only process in the system – you pick your poison …

A simple shortcut to see the real memory size of the process (excluding memory that is shared) is to use ps v command:

AIX> ps v 880802
880802   - A    86:59 89065  7088 58020    xx 88839 52048  2.0  0.0 ora_s00

AIX> ps -elf | head -1;ps -elf | grep 880802 | grep -v grep
F       S      UID     PID    PPID   C PRI NI ADDR   SZ    WCHAN    STIME  TTY  TIME CMD
240001 A   oracle  880802    1   8  64 20 4d345400 95924   Mar 01    - 86:59 ora_s000_qaten

Here, the SIZE column shows the virtual size of process DATA segment (we will talk about what virtual means shortly) and as you can see it is much smaller (7088) than SZ (95924) that is reported by ps -l.

A more precise way is to use svmon command, that displays all process memory segments and can neatly group them into SHARED and EXCLUSIVE categories (well, there is also SYSTEM, but that is another story).

AIX> svmon -P 880802 -O segment=category -O filterprop=data

EXCLUSIVE segments                   Inuse      Pin     Pgsp  Virtual
1486       22      306     1765

Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
144370        11 work text data BSS heap           s   1108     0  135  1211
7d4e3  ffffffff work application stack            s    132     0   18   150

SHARED segments                      Inuse      Pin     Pgsp  Virtual
1528419        0   418040  1565766

Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
1e483c        10 clnt text data BSS heap,          s  13012     0    -     -
30a0  90000000 work shared library text          s   4914     0   76  9685

Ok, we can see now that ps -l process size is bloated because it does not take sharable segments into account. But is this the whole story ?

Not quite, there is still one other notable trick in AIX bag …

Trick #2: Some memory may be swapped away

Let me ask you this – would it be possible for the process to allocate 2 Gb of RAM on the system that only has 1 Gb of physical memory ?

The answer is: of course, and it happens every day on many systems (albeit, the ratio in this example is somewhat extreme). That is: most modern operating systems (AIX is no exception) are designed to handle workloads that require more memory than the system has.

So, right of the bat, when we are talking about memory, we may actually mean 2 quite different things:

  1. Memory that is requested by system processes (we will call it Virtual)
  2. Physical memory that the system has (we will cal it Real or Physical)

It should be obvious that if Virtual > Real, something must happen to the portion of Virtual memory that does not fit into Real memory for the system to continue working properly. What usually happens is that the excess of Virtual memory (normally the oldest or least used pages) is saved to a special area on disk called SWAP (or Paging Space in AIX).

I guess you see where I’m going with this – how do we know whether the memory allocated by ORACLE processes is really in memory or has it been swapped to disk?

Here, svmon and ps v will tell us the details again.

Let’s look at ps v shortcut first:

ps v 1282210
1282210      - A     1:05 14598 19224 63308    xx 88839 52048  0.0  0.0 ora_arc

Remember that SIZE represents process Virtual size (or memory that is requested by the process). RSS represents In Memory size of CODE+DATA, while TRS represents In Memory size of CODE. So, In Memory size of DATA (what we really want to know) is RSS-TRS=11260 (Kb), which means that 19224-11260=7964 (Kb) is probably in paging space.

Once again, svmon will show a more precise information:

AIX> svmon -P 1282210 -O filtercat=exclusive -O filterprop=data

Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB
1282210 oracle            2815       22     1963     4806      Y     N     N

Notice that the Virtual size is 4806 (4K pages) and 4806*4=19224, so we have a match here. InUse (In Memory) size is 2815 (4K pages) and 2815*4=11260, so we have a match again.

Yet, look at Pgsp statistics. 1963*4=7852 which does not match our estimation from ps. Here we can see that the formula:

Virtual Memory Size

Virtual Size = Size(In Memory) + Size(In Paging Space)

is not precise all the time – my guess is: some of paging space may be reserved even though the page is really in memory, conversely, there might be some resident memory leftovers after pages have been moved to paging space. In either case, svmon will generally provide a more accurate estimation (and you can see why ps v is only a shortcut).

To summarize, to see the true memory usage by ORACLE processes in AIX, you should adjust your estimation keeping these 2 facts in mind:

  • Do not over count segments that are shared (that is – code)
  • Recognize that some process memory may be in the paging space.

If you do NOT want to do these calculations by hand, you can download and tools from this site.

Finally, is there any way to control how much memory ORACLE instance processes are using ?

How to control ORACLE process memory usage

There are several process memory controls that can be implemented.

The simplest way is to use AIX ulimits – you can set maximum memory allocation limits, separately for process data (User Data), stack and rss (virtual memory) components. You can set these (per user) settings in /etc/security/limits (or through smit) and you can view them with ulimit -a command:

AIX> ulimit -a
data(kbytes)         unlimited
stack(kbytes)        4194304
memory(kbytes)       unlimited

But limiting process memory usage in this way is like having a firing squad enforce parking rules – one small mistake and you are dead! Plus, ORACLE processes do not really know that they are NOT supposed to exceed AIX ulimits and some of them oftentimes (you know ORACLE …) might need to have a lot more additional memory.

So, we need another mechanism, one that is gentler and aware what ORACLE processes are doing, but at the same time, sane enough to not let ORACLE kill the system with unreasonable memory demands.

This mechanism is provided by ORACLE and there are actually two of them:

The older one – manual process memory management sets memory usage limits individually per process. MAX per-process memory allocation is controlled by parameter sort_area_size (with sort_area_retained_size acting as the required minimum). This is the only mechanism available up to ORACLE 8i and it will still be used in later versions if parameter workarea_size_policy is set to MANUAL.

The newer one – automatic process memory management sets memory usage limit collectively for ALL ORACLE server processes. It is controlled by pga_aggregate_target parameter and works when workarea_size_policy=AUTO.

A couple of things to remember about pga_aggregate_target:

  • It is an advisory upper target that ORACLE will try to enforce, but might not be able to under extraordinary circumstances (i.e. you if have 5000 concurrent active sessions that need to sort data but only allocated 200M of pga_aggregate_target – there is no way 200M target will be met)
  • It might not cover all the process memory. Rather what it covers is – various work areas: sort, hash, bitmap merge etc but if you decide to allocate another 1,000,000 item PL/SQL array in your session – ORACLE has no choice but to let you use memory for that (however, subsequent sessions will have to use less memory for sorting,hashing etc).

Ok, so I think we have a better idea now how to see the real memory usage by ORACLE processes and how to control that usage.

But what about the other (and arguably bigger) chunk of memory that ORACLE uses – SGA ? Stay tuned as we will talk about that in Part 2.

Useful Commands

# Regular AIX commands that should be available on any system ...

# ps v: Shortcut for real memory usage
AIX> ps v pid

# ps v statistics for all processes that belong to $ORACLE_SID instance
AIX> ps gvw | head -1; ps gvw |
egrep " oracle${ORACLE_SID} | ora_.*_${ORACLE_SID} " | grep -v egrep

# Detailed memory usage by process:
# All memory segments allocated to particular process
AIX> svmon -P pid

# Relevant memory segments allocated to particular process
# and categorized as SHARED/EXCLUSIVE
AIX> svmon -P pid -O filteprop=data -O segment=category

# The same information as before but summarized
AIX> svmon -P pid -O filteprop=data

# The same information as before but for all $ORACLE_SID instance processes

AIX> svmon -P $(ps -elf | egrep " oracle${ORACLE_SID} | ora_.*_${ORACLE_SID} " |
grep -v egrep | awk '{print $4}') -O segment=category -O filterprop=data

# ORACLE memory utilization scripts - you can find these files in the TOOLS section

# Total memory allocation for ORACLE instance(processes and SGA)

# Memory allocation by ORACLE processes (from ps v)

# Memory allocation by ORACLE processes (from svmon)
# -c part is optional but will display useful session information
# (you should connect to user with SELECT ANY DICTIONARY privilege)

AIX> -i $ORACLE_SID -c db_connection

# Detailed memory report for ORACLE instance processes
# See where memory is spent: user data, stack, shared library vars etc ...
#  (you need to have a BIG screen here)
AIX> -i $ORACLE_SID -r detailed