System Performance Tuning: Help for Unix Administrators
2%
two books by John Hennessy and David Patterson; they are titled Computer Organization and Design: The Hardware/Software Interface and Computer Architecture: A Quantitative Approach (both published by Morgan Kaufmann).
Bhaskar Chowdhury
Noted.
11%
NCA uses a kernel module to transparently cache static web content in a kernel memory buffer, and replies to HTTP document requests for documents in its cache without ever waking up the application web server.
Bhaskar Chowdhury
Sun's NCA
15%
processor performance doubles roughly every eighteen months, but memory performance doubles roughly every seven years.
Bhaskar Chowdhury
Kinda a bummer, but true.
16%
Caches are organized into equal-sized chunks called lines.
Bhaskar Chowdhury
hmm
17%
Linux addresses this issue by adopting an empirical rule related to the processor’s cache size: the larger the processor’s cache, the longer a process will wait for a chance to run on that processor.
Bhaskar Chowdhury
Yup.
18%
Every LWP has a kernel thread, but every kernel thread need not have an LWP:
Bhaskar Chowdhury
Unidirectional.
18%
Threads generally fall into five categories for scheduling: timesharing (ts), interactive (ia), kernel, real-time (rt), and interrupt.
Bhaskar Chowdhury
yup
20%
Buses implement either circuit-switched or packet-switched protocols.
Bhaskar Chowdhury
Righto!
22%
Spin locking is accomplished entirely within the processor’s cache, so that it does not cause excess bus traffic.
22%
On a system with many processors that is under load, the mutex locking itself can actually become a bottleneck.[16] If this is the case, adding additional processors will hinder, rather than help, performance.
Bhaskar Chowdhury
A classic case, which busts the myth.
22%
poor application design and implementation is a possible, indeed a likely, root cause of poor performance on multiprocessor systems.
Bhaskar Chowdhury
Caveat! Adding more processors doesn't help.
22%
larger processor caches often cause a great performance increase on multiprocessor systems.
Bhaskar Chowdhury
That's because of the lack of bus travel.
24%
PCI is a synchronous bus architecture, which means that all data transfers are performed relative to a system clock. The initial PCI specification permitted a maximum clock rate of 33 MHz, but the later Revision 2.1 specification extended this to 66 MHz.
25%
The high speed of the PCI bus (up to 528 MB/second, at 64-bit data paths and a 66 MHz clock rate) limits the number of expansion slots on a single bus to no more than three or four slots due to electrical concerns.
Bhaskar Chowdhury
PCI limits.
25%
Each PCI device includes a set of registers that contain configuration data. These registers define what the type of the card is (SCSI, Ethernet, a framebuffer, etc.), as well as who manufactured the card, what the interrupt level of the card is, and so on.
Bhaskar Chowdhury
PCI Config store.
25%
PCI supports both 5-volt and 3.3-volt signaling levels.
Bhaskar Chowdhury
PCI Voltage range.
25%
In general, interrupt priorities are assigned in decreasing order of IRQ; that is, the system timer (IRQ 0) has priority over all other IRQs.
Bhaskar Chowdhury
System IRQ
26%
If idle time is zero, as reported by vmstat, the first thing you should check is whether your system has I/O throughput problems.
32%
The difference between them is how each memory cell is designed. Dynamic cells are charge-based, where each bit is represented by a charge stored in a tiny capacitor. The charge leaks away in a short period of time, so the memory must be continually refreshed to prevent data loss. The act of reading a bit also serves to drain the capacitor, so it’s not possible to read that bit again until it has been refreshed. Static cells, however, are based on gates, and each bit is stored in four or six connected transistors. SRAM memories retain data as long as they have power; refreshing is not necessary.
32%
DRAM is cheaper and offers the highest densities of cells per chip; it is smaller, less power-intensive, and runs cooler. However, SRAM is as much as an order of magnitude faster, and therefore is used in high-performance environments.
32%
The first represents the amount of time required to read or write a given location in memory, and is called the memory access time.
Bhaskar Chowdhury
MAT
32%
second, the memory cycle time, describes how frequently you can repeat a memory reference.
Bhaskar Chowdhury
MCT
32%
(SDRAM) memory, which uses a clock to synchronize the input and output of signals. This clock is coordinated with the CPU clock, so the timings of all the components are synchronized. SDRAM also implements two memory banks on each module, which essentially doubles the memory throughput; it also allows multiple memory requests to be pending at once. A variation on SDRAM, called double-data rate SDRAM (DDR SDRAM) is able to read data on both the rising and falling edges of the clock, which doubles the data rate of the memory chip.
32%
The virtual memory system is responsible for managing the associations between the used portions of this virtual address space and physical memory.
34%
If a process tries to write to a shared page, it incurs a copy-on-write fault.[5
Bhaskar Chowdhury
COW
34%
kswapd’s behavior is controlled by three parameters, called tries_base, tries_min, and swap_cluster,
Bhaskar Chowdhury
kswapd
35%
A system that is paging is writing selected, infrequently used pages of memory to disk,
Bhaskar Chowdhury
Paging
35%
while a system that is swapping is writing entire processes from memory to disk.
Bhaskar Chowdhury
Swapping
35%
Paging is not necessarily indicative of a problem; it is the action of the page scanner to try and increase the size of the free list by moving inactive pages to disk.
Bhaskar Chowdhury
how paging works
36%
Memory is consumed by four things: the kernel, filesystem caches, processes, and intimately shared memory.
Bhaskar Chowdhury
Memory consumptions
69%
In 10 Mb/s Ethernet, the actual signals placed on the wire use a technique known as Manchester encoding, which allows the clock signal and the data to be transmitted in one logical parcel.
69%
This parcel, formally called a bit-symbol, includes the logical inverse of the encoded bit followed by the actual value of the encoded bit, so that there is always a signal transition in the middle of the bit-symbol. For example, the bit “0” would be encoded in Manchester as the bit-symbol “01.” This seems silly, since it appears to double the amount of work required to send a bit of data, but just like differential signaling, it is useful in long-distance communications. Its biggest disadvantage is that it generates signal changes on the wire twice as fast as the data rate, which makes the …
70%
The original standard used thick coaxial cable, and was known as 10BASE5. The 10 encodes the network data rate in Mb/s, the “BASE” refers to the use of a signaling method known as baseband, and the 5 describes the maximum segment length in 100 meter increments.
77%
The default values for tcp_conn_req_max_q0 and tcp_conn_req_max_q are 1,024 and 128, respectively.
77%
A certain type of denial-of-service attack, called SYN flooding, involves sending a large number of SYN packets with nonexistent source addresses. Because the second SYN is never acknowledged, the listen queue fills up and new connections get through only as old ones time out and are discarded from the queue. Whenever a dubious connection is discarded, the tcpHalfOpenDrop counter is incremented; a high value indicates that a SYN flood was likely attempted. If you observe this behavior, you can improve your protection by increasing tcp_conn_req_max_q0.
79%
Because NFS is stateless, the server and client need a mechanism to determine the other’s state in order to know when to reacquire a lock (e.g., when the server is rebooted) and when to invalidate a lock (e.g., when the client unmounts the filesystem); this is the role played by statd.
80%
A Version 2 mount will default to UDP, and a Version 3 mount will default to TCP.
87%
gethrtime(), a function call that returns the current time in nanoseconds, and directly accessing the TICK register.