SunSolve Internal

Infodoc ID   Synopsis   Date
21622   Performance and Tuning on Solaris 2.6 and 7   1 Aug 2000

Description

When a system is running slowly and performance is degrading,
it is difficult to know the cause. Whether it is a lack
of memory or a particular application, there are ways to diagnose
the situation. This article provides performance monitoring tips,
explains concepts such as Intimate Shared Memory (ISM) and priority paging,
lists web sites for further reading, and offers diagnostic suggestions,
such as how to read vmstat output. The emphasis is on Solaris 2.6
and Solaris 7.

CONTENTS  

I.  PERFORMANCE MONITORING

	A. SE Toolkit
				
	B. Sun Performance Information 
				
	C. Looking for a Performance Bottleneck?
   		1. Using the vmstat Command
   		2. About Free Memory
		3. Sample Quick Performance Script
			
II. TOPICS ON KERNEL PARAMETERS AND TUNING THE SOLARIS KERNEL

	A. Adrian Cockcroft's Perspective on Tuning
				
	B. Understanding 64-bit Sizing and Capacities		
		1. Definitions and Concepts
		2. Performance
		3. Sample Commands
				
	C. Commonly Asked About Kernel Parameters
		1. KAIO and MAXASYNCHIO	
		2. About File Descriptors and Their Limitations
			2a. Using select() library function
		3. Adjusting Pseudo-terminal Devices (ptys)
				
	D. Performance and Database Tuning
		1. Priority Paging	
		2. Intimate Shared Memory (ISM)
			2a. ISM Defaults
			2b. Swap Configurations Related to Shared Memory
	 	3. Interprocess Communication Parameters (IPC)
	 		3a. Shared Memory Parameters
			3b. Semaphore Parameters
			3c. Message Queue Parameters
	
III. APPENDIX

	A. Sample Quick Performance Script
	
IV. REFERENCES
  		 		  	
	A. IPC articles in SunSolve SRDBs and InfoDoc Collections
	B. More SunWorld Online Articles
	C. New Book on Oracle8 and Unix Performance Tuning
	D. What Are The Kernel Patch Numbers?
	
==========================
I.  PERFORMANCE MONITORING

A. SE Toolkit 

	There are commercial products for performance monitoring
	but the best is the SE Toolkit, produced by Sun engineers
	and available using the following URL:

		http://www.sun.com/sun-on-net/performance/se3/
 
	It reports disk activity, cpu usage, TCP and network connections,
	memory and more. It is easy to install, does not require a reboot,
	and has a graphical display that is easy to understand.


B. Sun Performance Information 

	The Sun Performance Information page is available using this URL:

		http://www.sun.com/sun-on-net/performance.html
    
	It has links to articles written by Adrian Cockcroft, including all
	of his SunWorld Online Q&A Columns, and the SE Toolkit.  Adrian Cockcroft 
	co-authored Sun's authoritative book on performance and tuning:
	
		"Sun Performance and Tuning - Java and the Internet"
		
	It is also available at the above URL.  There are new chapters
	in the Second Edition (1998) including the last 2 chapters dedicated
	entirely to the SE Toolkit and its performance rules.


C. Looking for a Performance Bottleneck?
  				
	When a system is not performing to expectations, it is necessary
	to determine where the bottleneck is. Entire system hangs and process 
	hangs are beyond the scope of this article. In the case of what appears
	to be an entire system hang, kernel core files may need to be generated.
	First call your Sun Solution Center to develop a course of action.

	When looking for a performance bottleneck, review and monitor the following
	as the initial step for diagnosing a problem:

		/etc/system       - review configuration parameters and what has
		                    been recently changed
		/var/adm/messages - look for WARNINGS, errors, reboots, panics etc
		prtconf           - contains system information including
		                    amount of RAM
		prtdiag           - check for errors and hardware failures               
		swap -l           - review swap configuration
		/etc/release      - contains operating system release information
		pkginfo -l        - provides a complete listing of packages installed
		                    and their release and version numbers
		showrev -p        - a listing of patches installed
       		
	To see the top processes using CPU and memory:

		ps -eo pid,pcpu,args | sort +1n      %cpu
		ps -eo pid,vsz,args | sort +1n       kilobytes of virtual memory

		/usr/ucb/ps aux |more               output is sorted with highest
	                                            users (processes) of cpu and
	                                            memory  at the top
	                                    
      
	1. Using the vmstat Command
	---------------------------

	The command vmstat is more concise and provides more information per
	line than the sar command. Here we can see a classic example of not
	enough cpu capacity for the executing applications.  
	
% vmstat 15

 procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr m0 m1 m2 m3   in   sy   cs us sy id
 45 0 0 2887216 182104 3 707 449 6 455 0 80 2  6  1  0 1531 5797  983 61 30  9
 58 0 0 2831312 46408 5 983 582 56 3211 0 492 0 0 0  0 1413 4797 1027 69 31  0
 55 0 0 2830944 56064 2 649 656 3 806 0 121 0  0  0  0 1441 4627  989 69 31  0
 57 0 0 2827704 48760 4 818 723 6 800 0 121 0  0  1  0 1606 4316 1160 66 34  0
 56 0 0 2824712 47512 6 857 604 56 1736 0 261 0 0 1  0 1584 4939 1086 68 32  0
 58 0 0 2813400 47056 7 856 673 33 2374 0 355 0 0 0  0 1676 5112 1114 70 30  0
 60 1 0 2816712 49464 7 861 720 6 731 0 110 7  0  3  0 2329 6131 1067 64 36  0
 58 0 0 2817552 48392 4 585 521 0 996 0 146 0  0  0  0 1357 6724 1059 71 29  0
 56 0 0 2821256 52616 3 675 435 1 148 0 24  0  0  0  0 1554 5914 1150 68 32  0
 
 ^                                                                           ^
 |                                                                           |
 
	The column labeled "r" under the procs section is the run queue of 
	processes waiting to get on the cpus.  The "id" column is cpu idle 
	time. This machine lacks the cpu resources to keep up with the process
	demand.  If adding cpus does not help this condition then a review of
	application code would be needed. As it turned out in this case, upgrading
	from an Ultra2 to an Ultra 4500 fixed the problem and the applications 
	are running fine.
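
	The reading of the "r" and "id" columns described above can be
	automated. The following is a minimal sketch, assuming a POSIX
	shell with awk; the function name vmstat_cpu_summary is made up
	for this illustration. It averages the run queue (first column)
	and cpu idle time (last column) over the samples it is given.

```shell
#!/bin/sh
# Summarize vmstat output read from stdin: average run queue ("r", first
# column) and average cpu idle ("id", last column). Header lines are
# skipped because their first field is not a number.
# vmstat_cpu_summary is a hypothetical helper name, not a Solaris command.
vmstat_cpu_summary() {
    awk '$1 ~ /^[0-9]+$/ { r += $1; id += $NF; n++ }
         END { if (n) printf "samples=%d avg_r=%.1f avg_id=%.1f\n", n, r/n, id/n }'
}

# Three rows copied from the sample vmstat output in this section.
vmstat_cpu_summary <<'EOF'
 procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr m0 m1 m2 m3   in   sy   cs us sy id
 45 0 0 2887216 182104 3 707 449 6 455 0 80 2  6  1  0 1531 5797  983 61 30  9
 58 0 0 2831312 46408 5 983 582 56 3211 0 492 0 0 0  0 1413 4797 1027 69 31  0
 55 0 0 2830944 56064 2 649 656 3 806 0 121 0  0  0  0 1441 4627  989 69 31  0
EOF
```

	For the three sample rows this prints "samples=3 avg_r=52.7 avg_id=3.0".
	On a live system the function would be fed from a pipe, for example
	"vmstat 15 10 | vmstat_cpu_summary". Note that the first data line
	vmstat prints is an average since boot, so it may be worth discarding.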
   
        2. About Free Memory
        --------------------
        
	To look for a shortage of memory, do not rely upon the "free" column.
	That column is not an indication of a lack of memory.  Much has
	been written on this subject in the SunWorld articles and "Sun Performance
	and Tuning".  To determine if there is a lack of memory, examine
	the 12th column, "sr", or scan rate.
	
	If that column sustains high numbers (hundreds to thousands of pages/second),
	then there is not enough memory for the processes or files wanting to be loaded
	into memory, and adding more memory is likely to help.  By default, 
	lotsfree is 1/64 of memory.  That is why the free column usually hovers
	just above 1/64 of physical memory (RAM).
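
	As a rough illustration of watching the scan rate rather than the
	"free" column, the sketch below counts how many vmstat samples have
	an "sr" value at or above a threshold. It assumes a POSIX shell with
	awk and the column layout shown earlier (sr as the 12th field). The
	function name vmstat_scan_check and the default threshold of 200
	pages/second are assumptions chosen for the example, not Sun
	recommendations.

```shell
#!/bin/sh
# Count vmstat samples whose scan rate ("sr", 12th column) is at or above
# a threshold given as $1. The default of 200 pages/second is an assumed
# value used only for illustration.
# vmstat_scan_check is a hypothetical helper name.
vmstat_scan_check() {
    awk -v limit="${1:-200}" '
        $1 ~ /^[0-9]+$/ { n++; if ($12 + 0 >= limit) high++ }
        END { printf "samples=%d high_sr=%d\n", n, high }'
}

# Four rows copied from the sample vmstat output earlier in this article.
vmstat_scan_check 200 <<'EOF'
 r b w   swap  free  re  mf pi po fr de sr m0 m1 m2 m3   in   sy   cs us sy id
 45 0 0 2887216 182104 3 707 449 6 455 0 80 2  6  1  0 1531 5797  983 61 30  9
 58 0 0 2831312 46408 5 983 582 56 3211 0 492 0 0 0  0 1413 4797 1027 69 31  0
 55 0 0 2830944 56064 2 649 656 3 806 0 121 0  0  0  0 1441 4627  989 69 31  0
 56 0 0 2824712 47512 6 857 604 56 1736 0 261 0 0 1  0 1584 4939 1086 68 32  0
EOF
```

	For these four rows the result is "samples=4 high_sr=2". A high count
	sustained over many samples is the condition described above; a
	single spike is not.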

	The pageout scanner runs only when the free list shrinks below a threshold
	(lotsfree in pages). Any process or file inactive and not locked in memory
	will be paged out. The size of the freelist will appear to shrink to a very
	small value (determined by the tunable parameter "lotsfree"), and will remain 
	at that value.  The page daemon will "kick in" and look for more memory to 
	reclaim from exited and idle processes when the amount on the freelist drops
	below the lotsfree threshold.  There is no way for the "free" value to 
	grow much above the threshold, because there is no way to get the page 
	daemon to work to reclaim memory beyond the threshold. lotsfree can be
	increased but then memory is wasted.

	What all of this means is that the size of the freelist is no indication of
	how much free memory there really is, because there may be a great amount of
	unused memory which has yet to be reclaimed by the page daemon, because the
	page daemon has no need to reclaim it.

	Files are also mapped directly into memory by processes using the mmap(2)
	system call, which maps in the code and shared libraries for a process.
	Pages may have multiple references from several processes while also
	being resident in the filesystem cache. A recent optimization is that
	pages with eight or more
	references are skipped by the scanner, even if they are inactive. This feature
	helps shared libraries and multiprocess server code stay resident in memory. 
	(p. 332 "Performance & Tuning - Java and the Internet")
  
  	See also:
  	
	The Memory-Go-Round - Confused by "free memory?"
	http://www.sunworld.com/swol-05-1997/swol-05-perf.html
  
	Clearing up swap space confusion - Why don't swap space numbers add up? 
	http://www.sunworld.com/swol-07-1998/swol-07-perf.html

	3. Sample Quick Performance Script
	----------------------------------  
   
	It is important to get a "snapshot" of the system when
	trying to determine a performance bottleneck. The
	"Sample Quick Performance Script" (see Appendix) uses
	the netstat, mpstat, vmstat, iostat, and ps commands to
	monitor system activity. Running just one of these commands
	does not provide the context necessary for a complete
	debugging of the problem.

	This script does not need to be run by cron. It stops execution
	after 200 iterations (while [ $X != 200 ]) and is easily modified.
	It does not collect crash(1M) kmastat and map kernelmap output,
	which other scripts available in the SunSolve SRDB/InfoDoc
	collections include, primarily because kernel memory leaks are rare.
	However, crash is useful for determining kernel memory fragmentation.

	See Appendix A for the Sample Quick Performance Script
		
		
=============================================================
II. TOPICS ON KERNEL PARAMETERS AND TUNING THE SOLARIS KERNEL

A. Adrian Cockcroft's Perspective on Tuning

	To start our discussion of whether to tune the Solaris 
	kernel, here are important points to keep in mind from 
	Adrian Cockcroft, author of SE Toolkit (see I. A. above):
		
		Tuning the kernel is a hard subject to deal with. Some
		tunables are well known and easy to explain. Others
		are more complex or change from one release to the
		next. The settings in use are often based on out-of-date
		folklore. (page 349 - "Sun Performance and Tuning - 
		Java and the Internet")
	
		So why is there so much emphasis on kernel tuning?
		And why are there such high expectations of the 
		performance boost available from kernel tweaks? I
		think the reasons are historical, and I'll return
		to my car analogy to explain it. (page 351 - "Sun
		Performance and Tuning - Java and the Internet")

	The author goes on to contrast 1970s cars, which had carburetors
	that needed to be rebuilt and came with workshop manuals for
	car repair, with the cars of today. In the 1990s, computerized
	ignition and fuel injection have made self-repair a thing of the
	past, as there are no user-serviceable engine components.

	Continuing on page 351:

		Unix started out in an environment where the end users
		had source code and did their own tuning and support. If you like
		this way of working, you probably already run the free
		Unix clone, Linux, on your PC at home. As Unix became a
		commercial platform for running applications, the end users
		changed. Commercial users just want to run their application,
		and tinkering with the operating system is a distraction.
		SunSoft engineers have put a lot of effort into automating
		the tuning for Solaris 2. It adaptively scales according to
		the hardware capabilities and the workload it is running.
		... The self-configuring and tuning nature of Solaris
		contributes to its ease of use and greatly reduces the
		gains from tweaking it yourself. Each successive version
		of Solaris 2 has removed tuning variables by converting
		hand-adjusted values into adaptively managed limits.
			
		If SunSoft can describe a tunable variable and when it should be
		tuned in detail, they could either document this in the
		manual or implement the tuning automatically. In most
		cases, automatic tuning has been implemented. The tuning
		manual should really tell you which things don't need to be
		tuned any more, but it doesn't.  This is one of my complaints
		about the manual, which is really in need of a complete
		rewrite. It is too closely based on the original Unix
		System V manual from many years ago when things did need tuning.
		(page 351 - "Sun Performance and Tuning - Java and the Internet")
	
B. Understanding 64-bit Sizing and Capacities	
		
	1. Definitions and Concepts
	---------------------------
		
	What is meant by 64 bits and what are the performance gains?
	Here's what 64 bits means and how it is used:

	1) 64-bit arithmetic operations - 
	   Most computer languages support 64-bit arithmetic options.
	   Full-speed, 64-bit, floating-point and integer arithmetic has been
	   available since the SuperSPARC shipped in 1992. A full set of 
	   64-bit floating-point registers and accelerations are available
	   in V8plus mode on UltraSPARC and further accelerations are
	   available in SPARC V9.  (See p. 136 and 137 "Performance and Tuning
	   - Java and the Internet")
	   
	2) 64-bit data types -
	   The sizes of the int and long data types have been agreed on
	   by the Unix industry. The agreement is to leave int as 32 bits 
	   but to make long 64 bits. Prepare your code in advance by
	   using long to perform pointer arithmetic. This approach
	   is known as the LP64 option. (See p. 137 "Performance and Tuning
	   - Java and the Internet")
	   
	3) 64-bit internal and external buses -
	   A good description of CPU datapaths and external interfaces
	   and caches can be found on page 138 of "Performance and Tuning
	   - Java and the Internet". SuperSPARC and UltraSPARC (and earlier) 
	   are at least 64 bits wide and the fastest designs are wider still.
	    
	4) 64-bit addressing -
	   This is the latest development in the history and evolution
	   of 64-bit capability. Pointers and addresses (process address space
	   and memory address space) go from 32-bit quantities to 64-bit
	   quantities. 64-bit addressing capability refers to the size of
	   the linear address space that can be manipulated directly by the CPU;
	   with 64-bit addressing it is greater than 4 Gbytes.


	The 64-bit Solaris 7 kernel needs 64-bit device drivers.
	Previous device drivers will only work in a 32-bit kernel.
   
	64-bit applications can only use 64-bit libraries.
	A 64-bit application cannot link with a 32-bit library.
   
   
	2. Performance
	--------------
	
   	To summarize how performance is affected by 64-bit addressing,
   	here is an excerpt from the SunWorld article titled "What's new
   	in Solaris 7?" 

		From a performance point of view, the ability to run 64-bit
		applications on Solaris 7 has two main benefits. One is that
		much larger problems can be solved efficiently using a
		bigger process address space; the other is that integer
		arithmetic computations get to use 64-bit registers and operations.
		Overall, programs get slightly larger due to larger pointer
		values in code and data structures. This in turn means that
		CPU caches are a little less likely to have enough cache lines, 
		and a slight slowdown might occur in programs that
		could run just as well in a 32-bit environment. 

	For the complete article visit the URL:
	http://www.sunworld.com/swol-11-1998/swol-11-perf.html
	
	3. Sample Commands
	------------------
	
	To determine the running kernel and supported instruction
	sets on Solaris 7, see the man page for the isainfo command.
	For example:

		% isainfo -v
		64-bit sparcv9 applications
		32-bit sparc applications

	The file command can be used to identify binaries or libraries.
	For example:
	
		% file /kernel/drv/sparcv9/poll
		/kernel/drv/sparcv9/poll:	ELF 64-bit MSB relocatable
		SPARCV9 Version 1
		
		% file /kernel/drv/poll
		/kernel/drv/poll:	ELF 32-bit MSB relocatable SPARC Version 1

	In addition, see the man page on largefile(5):

		% man largefile
		NAME
	  	largefile -  large file status of utilities
     
		DESCRIPTION  
   
   	  	On a 32-bit system, a large file is a regular file whose
   	  	size is greater than or equal to 2 Gbyte (2**31 bytes). A
    	  	small file is a regular file whose size is less than 2
    	  	Gbyte.  
	  	...
	  
	For more information see "Porting Performance Tools to 64bit Solaris
	White Paper" (URL http://www.sun.com/sun-on-net/performance.html)

C. Commonly Asked About Kernel Parameters

	1. KAIO and MAXASYNCHIO	
	-----------------------
	
	One kernel parameter that cannot be adjusted is the kernel
	async I/O parameter (also known as KAIO, or MAXASYNCHIO).
	Starting with Solaris 2.4 the kernel provides direct support for KAIO.
	The Solaris 2.5.1 implementation of KAIO is the first to permit 
	operation on full 64-bit devices. Previous versions supported 
	device sizes of only 2 GB or less. 
	
	2. About File Descriptors and Their Limitations
	-----------------------------------------------
		
	All versions of Solaris (including Solaris 7 64-bit) have a default
	"soft" limit of 64 and a default "hard" limit of 1024 on the number
	of file descriptors per process.

	Processes may need to open many files or sockets as file
	descriptors. Standard I/O (stdio) library functions have
	a defined limit of 256 file descriptors because stdio stores
	the descriptor in a char: the fopen() call will fail if it
	cannot get a file descriptor between 0 and 255. The open()
	system call returns an int, removing this limitation.  However,
	if open() has already used file descriptors 0 to 255 without
	closing any, fopen() will not be able to open any files, as all
	the low-numbered descriptors have been used up. Applications
	that need to use many file descriptors to open a large number
	of sockets, or other raw files, should be forced to use
	descriptors numbered above 256 for them. This allows system
	functions such as name services to work, as they depend upon
	stdio routines.
	(See p. 368 "Performance and Tuning - Java and the Internet").

	There are limitations on the number of file descriptors
	available to the current shell and its descendants. (See the ulimit
	man page.)  The maximum number of file descriptors that can
	be safely used for the shell and Solaris processes is 1024.
	This limitation has been lifted for Solaris 7 64-bit, where the
	limit can be as high as 64K (65536).

	Therefore the recommended maximum values to be added to /etc/system are:

		set rlim_fd_cur=1024
		set rlim_fd_max=1024

	To use the limit command with csh:

		% limit descriptors 1024
	
	To use the ulimit command with Bourne or ksh:

		$ ulimit -n 1024

	However, some third-party applications need the max raised.  
	A possible recommendation would be to increase  rlim_fd_max, 
	but not the default (rlim_fd_cur). Then rlim_fd_cur can be
	raised on a per-process basis if needed, but the higher setting
	for rlim_fd_max doesn't affect all processes.
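
	The per-process approach described above can be sketched as a small
	wrapper that raises only the soft limit, and only for the shell that
	runs it and the processes it starts. This is an illustration under
	assumptions: it needs a shell whose ulimit accepts the -S (soft) and
	-H (hard) options together with -n, and a POSIX-style $( ) command
	substitution. The function name set_fd_soft_limit and the value 1024
	are examples only.

```shell
#!/bin/sh
# Set the soft file descriptor limit for this shell and its children,
# never asking for more than the current hard limit allows.
# set_fd_soft_limit is a hypothetical helper name.
set_fd_soft_limit() {
    want=$1
    hard=$(ulimit -Hn)
    # Clamp the request to the hard limit unless the hard limit is unlimited.
    if [ "$hard" != "unlimited" ] && [ "$want" -gt "$hard" ]; then
        want=$hard
    fi
    ulimit -Sn "$want"
}

set_fd_soft_limit 1024          # example value
echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
# The application that needs the higher limit would be started from here,
# so that other processes on the system keep the default soft limit.
```

	Because the limit is inherited, only programs launched from this
	shell see the raised value.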
 

	2a. Using select() library function
	-----------------------------------
	
	The select(3C) library function uses a fixed-size
	bitfield that, before Solaris 7, can handle only 1024 file
	descriptors. The alternative is to use poll(2),
	which has no such limit. This is explained in the man pages.
	For example, the following is an excerpt from the
	select(3C) man page on Solaris 2.6:
	
		% man -s3c select
		C Library Functions                                    select(3C)
		...
		NOTES
		The default value for FD_SETSIZE (currently 1024) is larger
		than the default limit on the number of open files. It is
		not possible to increase the size of the fd_set  data type
		when used with select().

		SunOS 5.6           Last change: 18 Apr 1997                    4


	The only safe way to use more than 1024 file descriptors with
	select() is to change the application and recompile it.
	Solaris 7 allows up to 65536 fds to be passed to select() when
	the application is recompiled with a larger value of FD_SETSIZE.
	An application compiled for 64-bit gets the larger FD_SETSIZE
	by default; a 32-bit application defaults to 1024.
	Refer to /usr/include/sys/select.h.

	Developers of 32-bit applications can, however, define a larger
	FD_SETSIZE at compile time.  A careful reading of the man pages is
	necessary. From the select() man page on Solaris 7:
	
		% man -s3c select
		C Library Functions                                    select(3C)
		...
		NOTES
		The default value for FD_SETSIZE (currently 1024) is  larger
		than  the  default  limit  on  the  number of open files. To
		accommodate 32-bit applications that  wish to use  a  larger
		number  of  open  files  with  select(),  it  is possible to
		increase this size at compile time  by  providing  a  larger
		definition   of   FD_SETSIZE   before   the   inclusion   of
		<sys/types.h>. The maximum supported size for FD_SETSIZE  is
		65536.  The default value is already 65536 for 64-bit appli-
		cations.

		SunOS 5.7           Last change: 22 Apr 1998                    4

		
	3. Adjusting Pseudo-terminal Devices (ptys)
	-------------------------------------------
		
	Increasing ptys is accomplished by editing the /etc/system
	file using the following syntax:
	
		set pt_cnt=128
			
	A reconfiguration reboot (boot -r) is necessary to create the new
	devices.  If the system was rebooted without the -r option, it is not
	necessary to reboot again; the following commands can be used to create
	the new devices:

		# drvconfig
		# devlinks

	Note: The parameter pt_cnt needs to be preset in the kernel at boot time
	by way of the /etc/system file because the kernel sizes a structure based
	on pt_cnt at boot. If it is changed on a live kernel using the adb command
	or the crash command, it can cause a system panic.
 
	To verify the number of ptys do the following:

  		# cd /dev
		# ls pty* | wc -l
  		# cd /dev/pts
		# ls                  

	The output from the ls command should show the highest number as 1 less
	than the number of pty* in the /dev directory (because of zero).
	(See the man page for pty and pts for more information.)  

	To see how many ptys are in use, run the crash command as root,
	as shown in this example:
 		
  		# crash
  		dumpfile = /dev/mem, namelist = /dev/ksyms, outfile = stdout
		> pty
		ptms_tty TABLE SIZE = 48
		SLOT   MWQPTR   SWQPTR  PT_BUFP  TTYPID STATE
   		0 f5e84708 f5e84328 f5abcd08     243 mopen sopen
   		1 f5f2ad90 f5f2a9b0 f5abc498     291 mopen sopen
   		2 f5f2a3e0 f5f34ba8 f5f309c8     297 mopen sopen
   		3 f5f342f0 f5f4cca8 f5abc2b8     302 mopen sopen
   		4        0 f5f4c3f0        0       0 sopen
   		5 f5f568d0 f5f564f0 f5f30590     312 mopen sopen
   		6 f5933410 f5fdfad0 f605dcc8    1351 mopen sopen
		> q

	The above output shows 6 ptys in use and a table size of 48,
	which is the default value. After increasing the pt_cnt parameter
	in the /etc/system file and rebooting the system, the table size
	will reflect the new setting.


D. Performance and Database Tuning

	1. Priority Paging
	------------------
 
	Priority paging is new with Solaris 7 and was backported to
	Solaris 2.6 (kernel patch 105181-09) and Solaris 2.5.1
	(kernel patch 103640-25) and the equivalent x86 kernel
	patches listed at the end of this article.

	Priority paging provides an improved paging algorithm
	which can significantly enhance system response when the file system
	is being used. Priority paging introduces an additional
	watermark, cachefree. The paging parameters are now:

   		minfree < desfree < lotsfree < cachefree
   
	By default the new behavior is turned off, so it is important
	to enable this functionality on systems that are paging noticeably. 
	cachefree is set to lotsfree if priority_paging is not enabled.
	If it is enabled then cachefree is set to 2 times lotsfree.

	Enabling priority paging makes switching between windows on
	desktop systems faster, and is a big help for systems running
	databases that read large files into memory from the filesystem.
	On systems that page heavily, speed increases of several hundred percent
	have been seen for compute-intensive jobs with a large dataset.
                         
	For more information on priority paging refer to the following URLs:

		http://www.sunworld.com/swol-11-1998/swol-11-perf.html                         
 		http://www.sun.com/sun-on-net/performance/priority_paging.html
                   
	From INFODOC ID: 17946:

		These tunables provide a mechanism to allow priority paging, 
		whereby filesystem pages will be paged out before application,
		executable, and/or shared library pages.  This can provide a 
		significant performance benefit when using applications which
		perform frequent, random access to filesystem data which is
		used once and then no longer required, by flushing filesystem
		pages in preference to other types of pages.

		By default, priority paging is disabled (priority_paging = 0);
		it is enabled by adding the following line to the /etc/system
		file and rebooting:

			set priority_paging=1

		The default value of cachefree depends on the value of the
		priority_paging tunable.  If priority_paging is 0, then
		cachefree is set equal to lotsfree.
		If priority_paging=1, then cachefree is set to 2 times lotsfree.
	
		When priority paging is enabled and free memory drops below 
		cachefree, the page scanner will start to run, but will only 
		mark pages used for filesystem data for pageout.  If freemem
		drops below lotsfree, the scanning algorithm performs just as
		it did before, regardless of the setting of priority_paging.

		cachefree can be adjusted by setting it in the /etc/system file; 
		if the value specified in /etc/system is less than lotsfree, 
		it will be set to lotsfree instead.  Setting cachefree equal
		to lotsfree is the equivalent to not having priority paging enabled.
		A reboot is necessary.
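
	The relationship between lotsfree, cachefree and priority_paging
	described above can be worked through with simple arithmetic. The
	sketch below is an illustration only: it assumes lotsfree is exactly
	1/64 of physical memory (the default stated earlier in this article),
	and the function name paging_thresholds and the memory size are made
	up for the example. It does not read anything from a running kernel.

```shell
#!/bin/sh
# Compute the default paging thresholds, in pages, as described in the text.
#   $1 = physical memory in pages
#   $2 = priority_paging setting (0 or 1)
#   $3 = optional cachefree value as it might be set in /etc/system
# paging_thresholds is a hypothetical helper name.
paging_thresholds() {
    physmem=$1
    pp=$2
    override=${3:-}
    lotsfree=$((physmem / 64))          # assumed default: 1/64 of memory
    if [ "$pp" -eq 1 ]; then
        cachefree=$((2 * lotsfree))
    else
        cachefree=$lotsfree
    fi
    if [ -n "$override" ]; then
        cachefree=$override
        # A value below lotsfree is raised to lotsfree.
        if [ "$cachefree" -lt "$lotsfree" ]; then
            cachefree=$lotsfree
        fi
    fi
    echo "lotsfree=$lotsfree cachefree=$cachefree"
}

# 65536 pages is 512 Mbytes with 8-Kbyte pages (an example size).
paging_thresholds 65536 0        # lotsfree=1024 cachefree=1024
paging_thresholds 65536 1        # lotsfree=1024 cachefree=2048
paging_thresholds 65536 1 500    # lotsfree=1024 cachefree=1024
```

	The last call shows the clamping rule: a cachefree setting below
	lotsfree ends up equal to lotsfree, which is the same as running
	without priority paging.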

	2. Intimate Shared Memory (ISM)
	-------------------------------

	Large database applications on servers benefit from large
	shared caches of data. By default, applications such as
	Oracle, Informix, and Sybase use a special flag to specify
	that they want intimate shared memory (ISM).  With ISM, the
	shared memory is locked and cannot be paged
	out. Memory management data structures that are 
	normally created on a per process basis are created once and
	then shared by every process.  In Solaris 2.6 a further
	optimization takes place as the kernel tries to find 4-Mbyte
	contiguous blocks of physical memory that can be used as
	pages to map the shared memory. This greatly reduces 
	memory management unit overhead. (p.333 "Performance and Tuning
	- Java and the Internet")

	Excerpts from "Shared memory uncovered" by Jim Mauro:

http://www.sunworld.com/sunworldonline/swol-09-1997/swol-09-insidesolaris-2.html

	Intimate shared memory (ISM) is an optimization introduced 
	first in Solaris 2.2. It allows for the sharing of the translation 
	tables involved in the virtual to physical address translation
	for shared memory pages, as opposed to just sharing the 
	actual physical memory pages. Typically, non-ISM systems maintain
	a per-process mapping for the shared memory pages. With many processes
	attaching to shared memory, this creates a lot of redundant 
	mappings to the same physical pages that the kernel must maintain.
	Additionally, all modern processors implement some form of a translation
	lookaside buffer (TLB), which is (essentially) a hardware cache 
	of address translation information. SPARC processors are no exception,
	and, just like an instruction and data cache, the TLB has limits as to
	how many translations it can maintain at any one time. As processes 
	get context switched in and out, we can reduce the effectiveness 
	of the TLB. If those processes are sharing memory, and we can share
	the memory mappings also, we can make more effective use of the 
	hardware TLB.                 
 	...
	Closing notes - Shared memory is a powerful and relatively simple
	way to share data between processes. The use of shared memory by
	applications requires setting the shared memory tunable parameters 
	to provide sufficient resources for the application. Hopefully,
	the use and implementation of these tunables has been made clear. 

	Intimate shared memory is an important optimization that makes
	more efficient use of the kernel and hardware resources
	involved in the implementation of virtual memory and provides 
	a means of keeping heavily used shared pages locked in memory. 
	...
	

	2a. ISM Defaults
	----------------
			                                    
	Intimate shared memory is enabled by default and there is no
	need to edit the /etc/system file to turn on this feature. With a 
	currently patched kernel, turning off ISM can cause system
	degradation and possibly a hang condition.  In addition,
	database configuration files, such as Oracle's init.ora file,
	should not contain "use_ism=false", which turns ISM off.


	2b. Swap Configurations Related to Shared Memory
	------------------------------------------------
	
	To understand swap configurations related to shared memory,
	see the SunWorld Online article "Swap space implementation part 2",
	also by Jim Mauro:

	    http://www.sun.com/sunworldonline/swol-01-1998/swol-01-insidesolaris.html

	Here are some excerpts:

		So, how much swap should be configured? Well, the basic requirement
		for swap is driven directly by the amount of anonymous memory the 
		system needs to run the application. As we said earlier, if there 
		is enough physical RAM to hold all the required process pages,
		then the system can run literally with no physical swap configured.
		How much anonymous memory a system needs is not an easy thing to 
		determine with currently available tools. Your best bet is to use 
		the pmap command to examine process address space maps and measure
		the heap and anon sizes. This is time consuming, but  reasonably
		accurate. 

		As an additional data point, experience has shown that large 
		applications (e.g., SAP/R3, BAAN, Oracle) tend to have processes
		with large virtual address spaces. This is typically the result
		of attaching to large shared memory segments used by relational 
		databases and large copy-on-write (COW) segments that get mapped
		but sometimes never actually get touched. The net effect of this
		is that on large systems supporting these commercial applications,
		the virtual address space requirements grow to be quite large, 
		typically exceeding the physical memory size. Consequently,
		such systems often require a fair amount of swap disk configured
		to support many processes with large VA space running
		concurrently. You need to configure 1 to 1.5 times the amount
		of RAM you have for swap so the systems can fully utilize
		all of the physical memory without running out of virtual
		swap space. 


	3. Interprocess Communication Parameters (IPC)
	----------------------------------------------

	The values for the following IPC parameters need to be determined
	by the Database Administrator (DBA). The Sun Solution Centers
	cannot give recommendations for what the actual IPC parameter
	settings should be. These values are application dependent.

	For Solaris releases previous to 2.6, more swap space ("backing
	store") is needed for shared memory. Use swap -l, which reports
	sizes in 512-byte blocks: divide the block numbers by 2 to get
	kilobytes, or by 2048 to get megabytes. There should be at least
	twice as much swap as allocated shared memory (shmmax).
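
	The swap check can be scripted. The following sketch assumes a POSIX
	shell with awk and that swap -l reports the "blocks" column as the
	4th field in 512-byte units, so that 2048 blocks make one megabyte.
	The function name swap_check_for_shm and the sample swap -l output
	are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Read `swap -l` output on stdin, total the "blocks" column (4th field,
# 512-byte blocks), and compare it with twice the shmmax value in $1 (bytes).
# swap_check_for_shm is a hypothetical helper name.
swap_check_for_shm() {
    awk -v shmmax="$1" '
        NR > 1 { blocks += $4 }                 # skip the header line
        END {
            mb   = blocks / 2048                # 512-byte blocks to Mbytes
            need = 2 * shmmax / 1048576         # twice shmmax, in Mbytes
            printf "swap_mb=%d needed_mb=%d ok=%s\n", mb, need,
                   (mb >= need) ? "yes" : "no"
        }'
}

# Made-up swap -l output; device names and sizes are examples only.
swap_check_for_shm 1048576000 <<'EOF'
swapfile             dev  swaplo blocks   free
/dev/dsk/c0t0d0s1   32,1      16 2097152 2097152
/dev/dsk/c0t1d0s1   32,9      16 1048576 1048576
EOF
```

	With the sample input this prints "swap_mb=1536 needed_mb=2000 ok=no".
	On a live system the input would come from "swap -l" through a pipe.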

	Here are the default and maximum values for shmmax:

		        Default          Maximum
		shmmax  1048576 (1 MB)   4294967295 (4 GB)  2.5.1, 2.6, 32-bit Solaris 7
		                         2147483647 (2 GB)  2.5 or lower

		Solaris 2.6 shmmax and shmmin are unsigned ints (32 bit).
		Solaris 7 "32-bit" shmmax and shmmin are unsigned ints (32 bit).
		Solaris 7 "64-bit" shmmax and shmmin are unsigned longs (64 bit).
		In all cases, shmmni and shmseg are signed ints (31 bit).

	shmmax limits the maximum size of a shared memory segment. It is the
	largest size that can be requested from shmget(2). The memory it governs
	is not preallocated. It is allocated on demand.
	
	Solaris 7 64-bit breaks the 4GB barrier. The maximum size is theoretical. 
	The actual settings need to be based on the system resources like
	memory and database sizes and configurations. The maximum size of 
	the segment itself (shmmax) is just an upper limit. 
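
	A proposed shmmax can be checked against the documented maximum
	for a release before it goes into /etc/system. The sketch below
	is only an illustration: the function name shmmax_ok is
	hypothetical, the limit is passed in by hand, and a POSIX awk
	(nawk on Solaris) is assumed for the comparison.

```shell
#!/bin/sh
# Hypothetical check: succeed if a proposed shmmax (bytes) does not
# exceed the given limit (bytes). awk is used because its arithmetic
# handles values above 2^31 even where the shell's does not.
shmmax_ok() {
    value=$1
    limit=$2
    awk "BEGIN { exit !($value <= $limit) }"
}

# 4294967295 is the maximum for 2.5.1, 2.6 and 32-bit Solaris 7.
if shmmax_ok 1048576000 4294967295; then
    echo "shmmax fits within the 32-bit limit"
fi
```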
  	
  	
	                    ******
	See REFERENCES A. IPC articles available in the SunSolve collections
	See REFERENCES B. More SunWorld Online Articles
	See REFERENCES C. New Book on Oracle8 and Unix Performance Tuning
	                    ******
	
   	-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
   	3a, 3b and 3c below show examples of the syntax for setting
	IPC parameters in the /etc/system file using decimal values.
	Hex values can also be used but must be preceded by "0x".
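
	To see the same value in both notations, printf can do the
	conversion. This is a minimal sketch. The helper name to_hex is
	hypothetical, and the value is the shmmax used in the 3a example.

```shell
#!/bin/sh
# Hypothetical helper: print a decimal value in the 0x-prefixed
# hex form that /etc/system accepts.
to_hex() {
    printf '0x%x\n' "$1"
}

value=1048576000
echo "set shmsys:shminfo_shmmax=$value"
echo "set shmsys:shminfo_shmmax=`to_hex $value`"
```

	Both lines set the same limit. The second one prints the value
	as 0x3e800000.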
	
	3a. Shared Memory Parameters  
	----------------------------

	Example of shared memory parameters in /etc/system:
	
	#
	set shmsys:shminfo_shmmax=1048576000
	set shmsys:shminfo_shmmin=1
	set shmsys:shminfo_shmmni=200
	set shmsys:shminfo_shmseg=10
	#
	
	The above parameters are displayed in the order in which they appear
	in the /usr/include/sys/shm.h file on Solaris 2.6 and Solaris 7.
	Excerpts from the header files shm.h, sem.h and msg.h are included
	below to demonstrate the following:

	 	1) show the exact parameter name 
	 	2) show the definition of each parameter 
	 	3) show how to construct the syntax for
	   	   adding an IPC parameter to the /etc/system file
	    
	For example, setting the maximum shared memory segment size
	in the /etc/system file would be as follows:

		set shmsys:shminfo_shmmax = <decimal or hex number>
   
	shmsys:shminfo_shmmax is constructed from: 
	 
		shm       (header file name - shm.h)
		sys       (directory abbreviation - /usr/include/sys)
		shminfo   (the structure name - struct shminfo)
		shmmax    (a parameter in struct shminfo)
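
	The naming rule above can be sketched as a small shell function.
	The function name ipc_set_line is hypothetical. It only builds
	the text of the line and does not edit /etc/system.

```shell
#!/bin/sh
# Hypothetical helper: build an /etc/system line from the header
# file prefix (shm, sem or msg), the parameter name, and the value.
ipc_set_line() {
    prefix=$1
    param=$2
    value=$3
    echo "set ${prefix}sys:${prefix}info_${param}=${value}"
}

ipc_set_line shm shmmax 1048576000
ipc_set_line sem semmap 15
ipc_set_line msg msgmap 150
```

	The three calls print the same lines used as examples in
	sections 3a, 3b and 3c.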

	From /usr/include/sys/shm.h:

	struct  shminfo {
		size_t  shmmax,         /* max shared memory segment size */
			shmmin;         /* min shared memory segment size */
		int     shmmni,         /* # of shared memory identifiers */
			shmseg;         /* max attached shared memory     */
					/* segments per process           */
	};                                
	
	Note that the type size_t is defined differently on Solaris 2.6 and Solaris 7:

	Solaris 2.6 /usr/include/sys/types.h:

	#ifndef _SIZE_T
	#define	_SIZE_T
	typedef	uint_t	size_t;		/* len param for string funcs */
	#endif

	Solaris 7 /usr/include/sys/types.h:

	#ifndef _SIZE_T
	#define _SIZE_T
	#if defined(_LP64) || defined(_I32LPx)
	typedef ulong_t size_t;         /* size of something in bytes */
	#else
	typedef uint_t  size_t;         /* (historical version) */
	#endif
	#endif  /* _SIZE_T */

                
	3b. Semaphore Parameters  
	------------------------

	Example of a semaphore parameter in /etc/system:
	
	#
	set semsys:seminfo_semmap=15
	#
	  
	From /usr/include/sys/sem.h:
	/*
	 * Semaphore information structure
	 */
	struct  seminfo {
        	int     semmap;         /* # of entries in semaphore map */
        	int     semmni;         /* # of semaphore identifiers */
        	int     semmns;         /* # of semaphores in system */
        	int     semmnu;         /* # of undo structures in system */
        	int     semmsl;         /* max # of semaphores per id */
        	int     semopm;         /* max # of operations per semop call */
        	int     semume;         /* max # of undo entries per process */
        	int     semusz;         /* size in bytes of undo structure */
        	int     semvmx;         /* semaphore maximum value */
        	int     semaem;         /* adjust on exit max value */
	};


	3c. Message Queue Parameters  
	---------------------------

	Example of a message queue parameter in /etc/system:
	
	#
	set msgsys:msginfo_msgmap=150
	#
	
	From /usr/include/sys/msg.h:
	/*
	 * Message information structure.
	 */
 
	struct msginfo {
        	int             msgmap; /* # of entries in msg map */
        	int             msgmax; /* max message size */
        	int             msgmnb; /* max # bytes on queue */
        	int             msgmni; /* # of message queue identifiers */
        	int             msgssz; /* msg segment size (should be word size */
                          		/* multiple) */
        	int             msgtql; /* # of system message headers */
        	ushort_t        msgseg; /* # of msg segments (MUST BE < 32768) */
	};

  	
	                    ******
	See REFERENCES A. IPC articles available in the SunSolve collections
	See REFERENCES B. More SunWorld Online Articles
	See REFERENCES C. New Book on Oracle8 and Unix Performance Tuning
	                    ******

============
III. APPENDIX

		**********************************************
		*** This script is provided as a courtesy. ***
		*** Usage, support and modifications       ***
		*** are left solely up to the user.        *** 
		**********************************************
		
	A. Sample Quick Performance Script
	
	----------------------------------------------------------
	----------- Begin Quick Performance Script ---------------   
	#!/bin/ksh
	#
	# Collect ps, mpstat, vmstat, iostat and netstat samples in the
	# background, one batch per pass, appending to files in $LOGDIR.

	LOGDIR="/tmp/test"
	mkdir -p "$LOGDIR" || exit 1	# -p: no error if the directory exists

	NETSTAT_OUTFILE="$LOGDIR/netstat.out"
	MPSTAT_OUTFILE="$LOGDIR/mpstat.out"
	VMSTAT_OUTFILE="$LOGDIR/vmstat.out"
	IOSTAT_OUTFILE="$LOGDIR/iostat.out"
	PS_OUTFILE="$LOGDIR/ps_aux.out"
	INTERVAL=5			# seconds between samples
	COUNT=10			# samples per pass
	TIME=`expr $INTERVAL \* $COUNT`	# length of one pass in seconds
	TIMEPLUS=`expr $TIME + 5`	# extra time for the tools to finish
	X=1

	# Run 199 passes (X = 1 through 199).
	while [ "$X" -lt 200 ]
	do

		echo "" >> "$PS_OUTFILE"
		date >> "$PS_OUTFILE"
		/usr/ucb/ps -aux >> "$PS_OUTFILE" &

		echo "" >> "$MPSTAT_OUTFILE"
		date >> "$MPSTAT_OUTFILE"
		mpstat $INTERVAL $COUNT >> "$MPSTAT_OUTFILE" &
		MPSTATPID=$!

		echo "" >> "$VMSTAT_OUTFILE"
		date >> "$VMSTAT_OUTFILE"
		vmstat $INTERVAL $COUNT >> "$VMSTAT_OUTFILE" &
		VMSTATPID=$!

		echo "" >> "$IOSTAT_OUTFILE"
		date >> "$IOSTAT_OUTFILE"
		iostat -xp $INTERVAL $COUNT >> "$IOSTAT_OUTFILE" &
		IOSTATPID=$!

		echo "" >> "$NETSTAT_OUTFILE"
		date >> "$NETSTAT_OUTFILE"
		netstat -i $INTERVAL >> "$NETSTAT_OUTFILE" &
		NETSTATPID=$!

	#echo "mpstat $MPSTATPID"
	#echo "vmstat $VMSTATPID"
	#echo "iostat $IOSTATPID"
	#echo "netstat $NETSTATPID"

	#echo "sleeping for $TIMEPLUS"
		sleep $TIMEPLUS

	#echo "sleeping done"

		# netstat -i with an interval and no count never exits on
		# its own, so stop it before the next pass starts another.
		kill $NETSTATPID

		X=`expr $X + 1`
	done

	-------------- End Quick Performance Script ---------  
	-----------------------------------------------------
   
 
==============
IV. REFERENCES
  		 		  			
	A. IPC articles in SunSolve SRDBs and InfoDoc Collections
	---------------------------------------------------------
	
	There are numerous articles that have been written by the Sun Solution
	Centers on the subject of IPC parameters.  They are available on the
	SunSolve web site (sunsolve.sun.com) after logging in as a contract
	customer. Here is a partial list:

	If modifications to the /etc/system file do not seem to have taken
	effect, read the following:

	SRDB ID:    12824: sysdef -i does not report IPC parameters set
	                   in /etc/system

	For general information on the IPC parameters:

	INFODOC ID: 13421: Shared Memory Commands Explained
	INFODOC ID:  6328: All about Shared Memory Parameters in 2.X
	INFODOC ID:  2270: Understanding semaphores, seminfo_ semaphore info
	INFODOC ID: 13523: Semaphores Explained
	SRDB    ID: 12075: How to configure the IPC semaphores and shared memory 
	SRDB    ID:  5288: How to determine the IPC semaphore parameter values
	INFODOC ID:  2273: Kernel tuning parameters for message queues
	INFODOC ID:  7241: Determine the message queue parameters

	For debugging problems:

	SRDB    ID: 14854: Using adb to verify shared memory and
	                   semaphore parameters
	SRDB    ID: 12174: How to check how much shared memory is used by system
	INFODOC ID: 13480: How to use the UNDO feature in semaphore operations
	SRDB    ID: 13414: semop reports ENOSPC or "no space left on device"
	SRDB    ID: 16985: A process using shared memory has terminated, but
	                   swap space doesn't seem to get reclaimed.
	                   (Using ipcrm)
 
		 
	B. More SunWorld Online Articles
	--------------------------------
	
		Shared Memory Uncovered
http://www.sunworld.com/sunworldonline/swol-09-1997/swol-09-insidesolaris.html
	
		Setting Our Sights on Semaphores	
http://www.sunworld.com/sunworldonline/swol-10-1997/swol-10-insidesolaris.html
	
		Demangling Message Queues 
http://www.sunworld.com/swol-11-1997/swol-11-insidesolaris.html	
	
	
	C. New Book on Oracle8 and Unix Performance Tuning
	-----------------------------------------------------
	
	Oracle8 & Unix Performance Tuning 
	by Ahmed Alomari
	Prentice Hall PTR 1999
	
	This book has a wealth of information and covers Solaris 2.6.
	
	
	D. What Are The Kernel Patch Numbers?
	-------------------------------------
	
	These are the kernel patch numbers:
	
	-----------------------------------------------
	| Solaris 7   - 106541         7_x86 - 106542 |
	|       2.6   - 105181       2.6_x86 - 105182 |
	|       2.5.1 - 103640     2.5.1_x86 - 103641 |
	-----------------------------------------------
	
	Here's the command to run to determine which
	kernel patch revision is installed:

   		% showrev -p | grep <patch #>

	(showrev -p on its own produces a lot of output.)
	Alternatively, use uname, for example:

		% uname -a
		SunOS scotty 5.6 Generic_105181-14 sun4u sparc SUNW,Ultra-5_10
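
	The patch ID and revision can be pulled out of the version
	string with sed. This is a minimal sketch. The function name
	kernel_patch is hypothetical, and it assumes the version string
	has the Generic_<patch>-<rev> form shown in the uname output.

```shell
#!/bin/sh
# Hypothetical helper: extract "<patch>-<rev>" from a kernel version
# string such as "Generic_105181-14". Prints nothing if the string
# does not have that form.
kernel_patch() {
    echo "$1" | sed -n 's/^Generic_\([0-9][0-9]*-[0-9][0-9]*\)$/\1/p'
}

kernel_patch "Generic_105181-14"
```

	On a live system the argument would come from uname -v, for
	example: kernel_patch "`uname -v`".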


                  
Applies To Hardware
Attachments (none)

Sun Proprietary/Confidential: Internal Use Only