CSAPP Class Notes(3)

  • program state

    • jumps and branches
    • calls and return
  • system state: exceptional control flow(ECF)

    • exceptions (low level)
    • process context switch: OS software + hardware timer
    • signals: OS software
    • non-local jumps: C runtime library
  • An exception is a transfer of control to OS kernel in response to some event(eg: control-c)

image

  • There is an exception table and corresponding handler

image

  • Asynchronous exceptions(Interrupts): eg–Timer interrupt

  • Synchronous exceptions: caused by an instruction

    • trap: intentional, eg – system calls(like read, open, fork, kill etc)
    • fault: unintentional, eg – page fault(recoverable), protection fault(unrecoverable)
    • abort: unintentional and unrecoverable, eg – illegal instruction
  • example of fault

image

here the OS sends SIGSEGV signal to user process, and process exits with segment fault

  • A process is an instance of running program

  • two key abstractions

    • logical control flow: program seems to have exclusive use of CPU, provided by kernel mechanism called context switching(saved register and switch to another process). Note: physical control flow is the level of instructions.
    • private address space: program seems to have exclusive use of main memory, provided by kernel mechanism called virtual memory
  • concurrent process(one single core)

image

concurrent: flows overlap in time – A & B, A & C

sequential: otherwise – B & C

  • context switch

image

  • pid = fork(); to create a child process (Note: always check the pid)

  • Child is almost identical to parent:

    • Child get an identical (but separate) copy of parent’s virtual address space
    • identical copies of parent’s open file descriptors
    • different PID
  • fork example

image

Note that fork() calls once, but return twice (one in parent, one in child)

image

  • reap child process

    • When process terminates, it still consumes system resources (zombie process)
    • parent terminate child (wait/waitpid) and kernel delete zombie process
    • if parent does nothing, init process(pid=1) will reap the child.
  • execve: loading and running programs

    • different from process, it has different codes to run

linux process hierarchy

image

login shell: user interface, eg: ls

A shell is an application program (a process) that runs user programs

Note that waitpid is used when using background job

If no signals, the background job will become zombies

A signal is a message that notifies a process that an event has occurred

image

Sending signals: kernel sends/delivers a signal to a destination process

Receiving signals: destination process receives a signal when it is forced by the kernel to react to the delivery of the signal. Some possible ways to react:

  • ignore
  • terminate(optional core dump–save the memory state or something to a file for debugging)
  • catch and do something by signal handler (then to an asynchronous interrupt)

Pending signal: a signal is pending if sent but not yet received. at most one pending signal since signals are not queued

Block signal: a process can block the receipt of certain signals

Process group:

image

example of process group: bin/kill

image

example of ctrl-c(SIGINT) and ctrl-z(suspend a process by sending SIGTSTP, putting the process into background and stop running):

image

STAT:

  • first letter: S: sleeping, T: stopped, R: running, Z: zombie
  • second letter: s: session leader, +: foreground process group

signal handling example:

image

signal handler as concurrent flow:

image

and handler can be interrupted by other handlers(nested handlers):

image

The ways to temporarily block signal: Sigprocmask

image

async-signal-safe: reentrant(eg: all variables stored on stack) or non-interruptible by signals.

  • keep your handlers as simple as possible (eg: set flag and return)
  • call only async-signal-safe functions in your handler(printf, malloc, exit are not safe!)
  • save and restore errno on entry and exit (so other handlers don’t overwrite)
  • protect access to shared data structure(so avoid concurrent issue)
  • declare global variables as volatile(to prevent compiler from storing them in register, only in memory)
  • declare global flag as volatile sig_atomic_t(sig_atomic_t is a safe type)

why is printf unsafe? It has a lock, if one thread requires a lock, and before release it, it’s interrupted and the handler wants to printf again — dead lock.

you can’t use signals to count events, since pending signals are not queued (at most one pending signal of any type)

pause() can be dangerous: parent sets pid=0, if pid is 0 and the signal comes between while and pause(), the pause() may wait forever.

image

so here we can use Sigsuspend

jump(return) from one function directly to another function without calling it by using setjmp, longjmp.

  • regular file

    • text files: ASCII or Unicode
    • Binary files: .jpg image or object files… everything else
  • directories

    • consists of an array of links
    • tree hierarchy
  • Note: Do not call system call (read, write) too many times because of the cost of context switch unless you have to. Try read more bytes at one time

A robust wrapper for basic I/O.

Metadata: maintained by kernel, accessed by stat and fstat

How unix kernel represents open files:

image

So file sharing makes sense:

image

Note: fork, a child inherits its parent’s open files

image

Remember to call close() both in parent and child codes.

I/O redirection by using dup2:

image

image

fopen, fead, fflush etc.

Note: there is a buffer in standard IO to reduce multiple system calls

image

image
\

  • standard IO is not suitable for network sockets(not designed for it)
  • standard IO functions are not async-signal-safe (unix IO functions are), and not appropriate for signal handlers
  • standard IO can help you decrease the number of system calls and handle short counts(a = read() < 0)

image

MMU: Memory Management Unit, does address translation

VM can help:

  • use main memory efficiently(cache)
  • simplify memory management
  • isolate address space(to protect privileged kernel and code)

image

Virtual memory is an array of N contiguous bytes, the content of the array on disk are cached in physical memory(DRAM Cache)

image

Page table is an array of PTEs(page table entries) that maps virtual pages to physical pages.

image

image

page miss causes page fault(an exception), page fault handler selects a victim to be evicted(here VP4), then put VP3 to the place of VP4

image

Here we allocate a new page(VP5).

The virtual memory here seems inefficient, but it works because of locality. If **working set(the active pages program tend to access)** < main memory size, good performance. Otherwise, since we need to swap pages, performance are down.

image

Key idea: each process has its own virtual address space

they can also share code and data(read-only)

image

It can also simplify linking and loading:

  • each program has similar virtual address space. Code, data and heap always start at the same address.

image

Extend PTEs with permission bits, MMU checks these bits on each access.

image

Just like a fully-associated cache system. Started by a virtual memory address, and finally get the physical memory address.

Note that offsets are the same.

image

The process of page hit. Note for 2~3: MMU fetches PTE from page table in memory

image

The process of page fault. Note for 7: Handler returns to original process, restarting faulting instruction.

Translation Lookaside buffer(TLB): a small set-associative hardware cache in MMU, contains complete page table entries for small number of pages.

image

TLB-hit:

image

TLB-miss:

image

1MB of virtual memory, 4KB page size, 256KB of physical memory, TLB: 8 entries, 2-way set associative

  • How many bits are needed to represent virtual address space? 20 (2^20 = 1MB)
  • How many bits are needed to represent physical address space? 18
  • How many bits are needed to represent offset? 12 (2^12 = 4KB)
  • How many bits are needed to represent VPN(virtual page number)? 8
  • How many bits are in the TLB index? 2 (since we have four sets totally)
  • How many bits are in the TLB Tag? 6 (just the rest)

Another example:

image

One-Level page table can take up a lot of space.

Eg: 32 bit-environment, 4 KB(2^12) page size, we should have 2^32 * 2^-12 = 2^20 pages, 4-byte PTE, totally 2^22 bytes = 4MB for one process. 1000 process = 4GB

Eg2: 64 bit-environment, 4KB page size, 48-bit address space(24 bits reserved), we should have 2^36 pages, 8-byte PTE, totally 2^39 bytes = 512GB page table!

Note: page size means dividing physical memory into several blocks, the size of block is page size. PTE is for offset in a page(and some privileged bits).

image

The way to translate multi-level-page-table:

image

image

image

Mapping memory to file.

image

Private Copy-on-write(COW) objects.

And Note that this is why fork can be efficient. It doesn’t really copy data that can be shared by two processes.

image

The process of execve(load and run a new program in the current process):

  • free vm_area and page table for old areas
  • create vm_area and page table for new areas
  • set PC to entry point in .text

Using memory mapping a lot here.

mmap: allows you to do memory mapping like kernel does.

image

Require virtual memory by using ways such as malloc(to Heap, maintained by blocks).

Allocators:

  • explicit allocator: C malloc and free
  • implicit allocator: java allocates memory but do not need free (garbage collection)

Note: in this lecture, the unit of malloc is word(4 bytes)

Example:

image

Fragmentation:

  • Internal fragmentation: Blocks have to be aligned, when payload is smaller than one single block, this happens.
  • External fragmentation: As the picture shows above, the fragmentation appears.

Q: How much to free? (since we only have a pointer passed to the free function?)

image

A: by using a header

Q: How to keep track of free blocks?

A: Implicit list to links all of (headers of) the blocks. Additional bit for allocation status. Trick: if blocks are aligned, some low-order bits are always zero(4 bytes, 1000…)

image

example:

image

final block(size 0 / status 1) to be the end of search.

Q: ways to use implicit list?(placement policy)

A:

  • first fit: scan from the beginning
  • next fit: scan starts from the last malloc(fast but maybe more fragments)
  • best fit: search all of the list and find the best malloc to reduce fragments

Q: How to free by using implicit list?

A: Coalescing

image

But in this way we couldn’t coalesce the block we used before.

So we can use boundary tags(constant free even with coalescing):

image

Note: boundary tags can be also optimized. For example, we can add additional status bit to represent whether the previous block needs to coalesce.

Coalescing policy:

  • immediate coalescing: coalesce each time free is called
  • deferred coalescing: coalesce as you scan the list for malloc or external fragmentation reaches some threshold.

image

Maintain list of free blocks. That’s why we can use payload area.

image

Note that blocks can be in any order.

image

It’s easy to allocate now.

For free, tow policies:

  • LIFO(last-in-first-out)

    • insert freed block at the beginning of the list
    • simple and constant time, more fragmentation
  • Address-ordered policy

    • make sure addr(prev) < addr(curr) < addr(next)
    • requires search but less fragmentation.

LIFO eg:

image

LIFO eg2: with coalesce(remember to use boundary tag)

image

LIFO eg3: with coalesce

image

Explicit list only care for free blocks instead of all blocks, much faster when memory is nearly full.

Each size class of blocks has its own free list.

image

separate classes for small size; two-power size class for larger sizes.

  • Higher throughput(log time for power-of-two size classes)
  • Better memory utilization: approximates a best-fit search.

Automatic garbage collection: application never has to free.

Classical GC algorithms

  • Mark-and-sweep collection: introduce later
  • Reference-counting: reference count to indicate whether an object can be freed.
  • Copying collection: copy all of used object in A to B, then free A. Not used too much
  • Generational Collectors: based on lifetimes

Mark-and-sweep:

image

root: Locations not in the heap and contain pointers into the heap(e.g. registers, global variables) reachable: a path from root to node. garbage: non-reachable node.

When out of space:

  1. Mark: start at roots and set mark(extra mark bit) on reachable node recursively.
  2. Sweep: scan all blocks and free blocks that are not marked

image

Note: mark-and-sweep in C is conservative, you don’t know a large number is a pointer or a long type. So we can use a balanced tree to keep track of all allocated blocks.

Malloc Lab is out!

Using gprof is a good way to analyze efficiency.