Arguably the most important resource in a computer is the memory. Running processes have to fit in the available memory (as the CPU can only operate on data in memory).
Almost all current general-purpose operating systems use a mechanism called paging. Some special-purpose systems (in particular some embedded devices) might not use virtual memory, but most often these do not even have an operating system at all.
Paging uses an intermediate layer between a running process and physical memory. A process uses logical addresses which live in pages. Pages are loaded into physical memory frames as needed. Under Linux a page is typically 4 KiB in size.
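The page size can be queried on a given machine rather than assumed. A minimal sketch, assuming the POSIX getconf utility is available; the variable names and the 1 MiB request are illustrative only:

```bash
# Ask the system for its page size in bytes.
page_size=$(getconf PAGESIZE)
echo "Page size: ${page_size} bytes"

# How many whole pages does a 1 MiB request occupy? (rounded up)
request=$((1024 * 1024))
pages=$(( (request + page_size - 1) / page_size ))
echo "A request of ${request} bytes occupies ${pages} pages"
```

Rounding up reflects the fact that memory is handed out in whole pages.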
A process can ask the OS for more memory; this is called allocation. The kernel allocates memory for a process by assigning a new contiguous area aligned on page boundaries, or it returns an error. In most cases the precise address is not relevant for the process, only the amount obtained.
The translation between logical and physical addresses has to be very fast. For this reason all modern CPUs have special hardware for it: the memory management unit (MMU), which performs the translation, and the translation lookaside buffer (TLB), which caches recent translations. Together they ensure that very little time is needed for the translation.
Unused pages can be written to disk to save precious memory space. Under Windows the pages end up in a so-called swap file (growing as needed). Under Linux one usually has a dedicated swap partition or a swap file of fixed size.
If the processor determines that an address requested by a process belongs to a page that is currently not loaded into memory, we have a page fault. The process is paused and the operating system is asked to load the page. A free frame is either found or, if memory is full, another frame (usually belonging to another process) is swapped to disk. The requested page is put in the empty frame.
Each process has its own page table, which contains the mapping of pages to frames for this particular process. The corresponding page table is loaded whenever the process is being run.
When there is little free memory left, the system has to decide which pages (of the running or maybe paused processes) can be swapped to disk. It should pick a page that will not be needed for some time. Usually the least recently used (LRU) page is chosen.
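The LRU idea can be illustrated with a toy simulation. This is only a sketch: the three-frame capacity, the access sequence and all names are made up for illustration, and real kernels use approximations of LRU rather than an exact list like this one.

```bash
# 'frames' holds the loaded pages, ordered from least to most recently used.
frames=()
capacity=3
faults=0

access_page() {
    local page=$1 i found=0
    local kept=()
    # Take the page out of its current position, if it is loaded.
    for i in "${frames[@]}"; do
        if [[ $i == "$page" ]]; then
            found=1
        else
            kept+=("$i")
        fi
    done
    if (( !found )); then
        faults=$((faults + 1))            # page fault: the page was not loaded
        # Memory full: evict the least recently used page (front of the list).
        if (( ${#kept[@]} >= capacity )); then
            kept=("${kept[@]:1}")
        fi
    fi
    # The accessed page becomes the most recently used one.
    frames=("${kept[@]}" "$page")
}

for p in 1 2 3 1 4 2; do
    access_page "$p"
done
echo "faults=$faults frames=${frames[*]}"   # prints: faults=5 frames=1 4 2
```

Page 1 survives the access to page 4 because it was touched again shortly before, whereas page 2 is evicted and then faults when it is needed again.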
If pages are moved often between swap and main memory we speak of thrashing. This degrades performance and responsiveness considerably and is a sign that the system needs more physical memory.
Process address space is strictly checked by the operating system. Pages can be marked read-only or non-executable. If a process tries to access an address that is outside the range of pages it was given, or writes to a read-only page, or tries to execute a page that was not marked as executable, a segmentation fault is generated. By default the process is killed.
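From the shell, a process killed by a segmentation fault can be recognized by its exit status. The sketch below does not perform a real invalid access; it assumes that a child shell sending the SIGSEGV signal to itself is an acceptable stand-in for what happens on a real fault.

```bash
# Start a child shell that sends SIGSEGV to itself.
bash -c 'kill -SEGV $$'
status=$?

# The shell reports a process killed by signal N as status 128 + N.
echo "exit status: $status"

# 'kill -l' translates such a status back into a signal name.
signal_name=$(kill -l "$status")
echo "terminated by: SIG$signal_name"
```

The parent shell may also print a "Segmentation fault" message to standard error when the child dies.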
Multiple processes may use the same memory area. This is used when a program requires shared libraries (DLLs in Windows). Only one copy of a library needs to be loaded into memory and it will be mapped into address spaces of multiple processes that need it. Each such process would have private pages for private variables used by the library and would share the code.
Often a page can be shared between processes even though each of them still needs to be able to write data there. This is done by marking the page as copy-on-write. When a write occurs, a copy of the original page is created and the process writes into the copy.
On 64-bit x86 Linux, the memory addresses ranging from 0xFFFF800000000000 to 0xFFFFFFFFFFFFFFFF are reserved in each process for kernel data, and they are the same in every process. Accessing this memory is possible only for the kernel, not for user programs.
In each process addresses ranging from 0x0000000000000000 to 0x00007FFFFFFFFFFF are available for the process’s data, code etc.
The current address space of a process (valid addresses in virtual memory) can be looked up in the file
PID is the process number.
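On Linux, the procfs entry /proc/PID/maps provides such a listing of mapped address ranges; whether this is the exact file meant above is an assumption. A minimal sketch that inspects the mappings of the current shell (whose PID is available as $$):

```bash
# Columns: address range, permissions (r/w/x/p), offset, device, inode, path.
maps_file="/proc/$$/maps"

if [[ -r $maps_file ]]; then
    head -n 5 "$maps_file"
    # Show only the stack and heap regions, if they are labelled.
    grep -E '\[(stack|heap)\]' "$maps_file" || true
else
    echo "no $maps_file on this system"
fi
```

The permission column shows the read-only and non-executable markings described earlier.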
Every running process has a private memory area called the stack. The stack is used for temporary variables during function calls, e.g., all the function arguments and automatic variables in C live on the stack. If a process overflows its stack, e.g., through too deep recursion, it receives the segmentation fault signal. Although the stack can be very large if requested, it is usually not that big, and large amounts of data should not be put on it. By convention the stack is located close to the upper limit of the address space and grows downwards.
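The stack size limit that applies to programs started from a shell can be inspected with the ulimit builtin. A small sketch, assuming bash, where the value is reported in units of 1024 bytes:

```bash
# Soft limit on the stack size, in KiB (or the word "unlimited").
stack_limit=$(ulimit -s)
echo "stack limit: $stack_limit"

if [[ $stack_limit != unlimited ]]; then
    echo "that is $((stack_limit / 1024)) MiB"
fi
```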
The data that does not belong to the currently executed function lives in the heap. The only way to access the heap is through pointers (directly or indirectly). New space on the heap is obtained in C by calling the malloc function or in C++ with the new operator. The heap is the right place for large amounts of data.
Variables in bash can contain any text. A variable name has to start with a letter or an underscore. Names are case-sensitive. They are created with the following syntax:
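A minimal sketch of such assignments (the names and values are purely illustrative):

```bash
# No spaces are allowed around '='.
greeting=hello
Greeting="Hello, world"   # a different variable: names are case-sensitive
_count=3                  # an underscore may also start a name

echo "$greeting / $Greeting / $_count"   # prints: hello / Hello, world / 3
```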
A shell variable can be promoted to an environment variable that is passed to commands. This is done by the export command. Once exported, the variable does not need to be re-exported if its value changes.
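The effect of export can be observed by asking a child process what it sees. A sketch with a hypothetical variable named project:

```bash
unset project                 # start from a clean state

# A plain shell variable is not visible to child processes...
project=demo
before=$(bash -c 'echo "${project:-unset}"')

# ...until it is exported.
export project
after=$(bash -c 'echo "$project"')

# Changing the value later needs no second export.
project=changed
later=$(bash -c 'echo "$project"')

echo "$before -> $after -> $later"   # prints: unset -> demo -> changed
```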
This variable contains the list of directories that are searched for executables (commands). The list is separated by colons.
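Assuming the variable described here is PATH, the following sketch splits it at the colons and shows how a command name is resolved through it. The array name dirs is illustrative.

```bash
# Split the search path at colons into an array.
IFS=: read -r -a dirs <<< "$PATH"
printf '%s\n' "${dirs[@]:0:3}"        # first few directories

# 'type -P' reports which file on the search path a name resolves to.
sh_path=$(type -P sh)
echo "sh is $sh_path"
```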
This variable allows one to customize the prompt.
This variable contains the characters that separate arguments (default: space, tab, newline).
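Assuming the variable described here is IFS, its effect on splitting can be sketched as follows; the sample record is made up.

```bash
record="alice:x:1000"

# With the default separators the string is a single word.
set -- $record
default_count=$#

# With ':' as the separator the same unquoted expansion yields three words.
old_ifs=$IFS
IFS=:
set -- $record
colon_count=$#
IFS=$old_ifs                  # restore the previous separators

echo "$default_count word by default, $colon_count with a colon separator"
```

Restoring the old value matters, because a changed separator affects every later unquoted expansion in the same shell.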
This variable is used by many programs (by the C library, in fact) to determine the locale, i.e., the language of the system.
The shell substitutes the variable value when it encounters the character $ followed by the variable name. Non-existing variables are expanded to empty sequences. If we want to tell the shell where the variable name ends, we can use braces.
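A short sketch of why the braces matter (the variable names are illustrative):

```bash
file=report
unset file_old                # make sure this name does not exist

# Without braces the shell looks for a variable named 'file_old'.
wrong="$file_old.txt"
# Braces mark where the variable name ends.
right="${file}_old.txt"

echo "wrong='$wrong' right='$right'"   # prints: wrong='.txt' right='report_old.txt'
```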
Sometimes we need a raw $. To suppress the variable expansion and leave the string as-is, use single quotes:
Double quotes do not suppress expansion, but they prevent the result from being split into separate arguments at spaces.
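The difference between the quoting styles can be sketched as follows; count_args is a hypothetical helper that simply reports how many arguments it received.

```bash
name="two words"

single='$name'     # single quotes: no expansion, the text stays as typed
double="$name"     # double quotes: expanded, kept as one argument

count_args() { echo $#; }
unquoted_count=$(count_args $name)     # split at the space: 2 arguments
quoted_count=$(count_args "$name")     # one argument

echo "$single | $double | $unquoted_count | $quoted_count"
# prints: $name | two words | 2 | 1
```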
A very important and useful construction does something similar with commands. When the shell encounters text in backticks or in $( ) it runs it as a command and replaces the text with the command’s output.
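Both forms can be sketched side by side. The path /usr/local/bin/tool is hypothetical and does not need to exist, since dirname and basename only manipulate text.

```bash
# $( ) form: the command's output (minus trailing newlines) replaces the text.
word_count=$(printf 'one two three\n' | wc -w)

# Backtick form: same effect, but harder to nest and to read.
upper=`echo hello | tr 'a-z' 'A-Z'`

# $( ) nests without extra escaping.
nested=$(basename "$(dirname /usr/local/bin/tool)")

echo "$word_count $upper $nested"
```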
The old way of doing calculations employs the let command. A more modern way (which works in bash but not in older shells) employs double parentheses. One then calculates almost like in C.
Observe that inside the double parentheses one does not need the $ character in front of variable names. Every variable is understood to contain a number.
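A brief sketch of both styles, with illustrative variable names:

```bash
a=7
b=3

let "sum = a + b"          # older style
(( product = a * b ))      # double parentheses, C-like syntax
(( a++ ))                  # increment works as in C
remainder=$(( a % b ))     # $(( )) yields the result as text

echo "$sum $product $a $remainder"   # prints: 10 21 8 2
```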
Sometimes we need a list of elements that follows a strictly prescribed pattern. Brace expansion can be helpful:
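A few typical patterns, sketched with made-up names and stored in arrays so the results can be inspected:

```bash
# A comma list expands to every listed alternative.
files=(backup_{mon,wed,fri}.tar)

# A range expands to consecutive numbers or letters.
numbers=({1..5})
letters=({a..c})

# Braces combine: every pairing is generated.
grid=({x,y}{1,2})

echo "${files[*]}"     # prints: backup_mon.tar backup_wed.tar backup_fri.tar
echo "${numbers[*]} | ${letters[*]} | ${grid[*]}"
# prints: 1 2 3 4 5 | a b c | x1 x2 y1 y2
```

The expansion is purely textual; the files do not have to exist.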
Every command, when run, leaves behind a small integer that can be checked: the so-called status code. In C this is the value returned from the main function. By convention, a value of 0 means that the command completed successfully. Non-zero values mean some kind of error.
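In the shell, the status code of the most recent command is available in the special parameter $?. A minimal sketch:

```bash
true
ok_status=$?         # 0: success

false
fail_status=$?       # non-zero: failure

# A command can choose its own code; here a subshell exits with 3.
(exit 3)
custom_status=$?

echo "$ok_status $fail_status $custom_status"   # prints: 0 1 3
```

The value has to be saved right away, because every subsequent command overwrites it.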
The status code is used instead of boolean values in conditional expressions.
Often a boolean test is needed (check if a file exists, compare numbers). The old way of testing these conditions is the test program. This program can also be run as [. Details can be found by reading
The new way of testing conditions is the [[ ]] construct, which is built into bash. It works similarly to [ but has a sane meaning for > and is generally considered better.
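A small sketch contrasting the two forms, using made-up values:

```bash
a=apple
b=banana

# Inside [[ ]] the '<' and '>' operators compare strings;
# inside [ ] an unescaped '>' would be taken as output redirection.
if [[ $a < $b ]]; then order="a first"; else order="b first"; fi

# Numeric comparison uses -lt, -gt, -eq and friends in both forms.
if [ 10 -gt 9 ]; then numeric=yes; else numeric=no; fi

echo "$order, $numeric"   # prints: a first, yes
```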
The if construction looks like this:
fi. True and false are chosen based on the status code of the
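A minimal sketch of a complete if construction. The tested command here is a grep pipeline chosen for illustration; any command can take its place, and the branch is selected by its status code.

```bash
text="swap partition"

# Status 0 from the command selects 'then'; anything else moves on
# to the 'elif' test and finally to 'else'.
if echo "$text" | grep -q swap; then
    result="mentions swap"
elif [[ -z $text ]]; then
    result="empty"
else
    result="something else"
fi

echo "$result"   # prints: mentions swap
```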
There is also an equivalent of the switch instruction:
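In bash this is the case construction, which matches a value against shell patterns rather than constants. A sketch with a hypothetical classify function:

```bash
classify() {
    case "$1" in
        *.c|*.h)   echo "C source" ;;            # alternatives separated by |
        *.sh)      echo "shell script" ;;
        [0-9]*)    echo "starts with a digit" ;;
        *)         echo "unknown" ;;             # default branch
    esac
}

classify main.c       # prints: C source
classify backup.sh    # prints: shell script
classify 42.txt       # prints: starts with a digit
classify notes        # prints: unknown
```

Unlike the C switch, only the first matching branch runs; there is no fall-through to guard against.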
The for loop iterates over a list of values; in each iteration the given variable is set to the next value from the list. There can be many commands inside the loop.
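A minimal sketch with an explicit list of values and two commands in the loop body:

```bash
total=0
names=""

for n in 3 5 8; do
    total=$(( total + n ))      # first command in the body
    names+="item$n "            # second command in the body
done

echo "total=$total names=${names% }"   # prints: total=16 names=item3 item5 item8
```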
The while loop works similarly to the while loop in C.
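A short sketch; as with if, the loop continues as long as the tested command returns status 0. Here the test is an arithmetic expression in double parentheses.

```bash
# Count how many times 100 can be halved before reaching zero.
value=100
steps=0

while (( value > 0 )); do
    value=$(( value / 2 ))
    steps=$(( steps + 1 ))
done

echo "steps=$steps"   # prints: steps=7
```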