OS 筆記 - CH9 Virtual Memory Management

Posted on 2019-12-31 Edited on 2023-05-09 In OS Views: 1672 Views: 1279 Disqus: 0 Comments Word count in article: 7k Reading time ≈ 6 mins.

CH9 Virtual Memory Management 筆記

CH9 Virtual Memory Management 筆記 !

Background
Demand Paging
- step
- Demand Paging Performance
Process Creation
Page Replacement
- Page Replacement Algorithms
Allocation of Frames
Trashing
- Working-Set Model
- Page Fault Frequency Scheme

Background

我們不用一開始就全都載入，如以前介紹的 dynamic loading
還有其他很多用不到的東西，像是 array, same library
總之目的就是有更好的 memory utilization
- 不是說能放更多東西而是放的東西是真的會用到的 ~~斷捨離~~
Virtual Memory
- 優勢
  - 可以放 large process (會變慢但至少"可以放")
  - 增加 CPU/Resource utilization (因為可以放更多了)
  - Simplify programming :
    因為切割了，所以 Compiler 等等的不用管 physical address，簡單化
  - Program faster : 以前要全部 load 到 memory 才會執行，搭配 demand paging
    可以加速 (尤其是一開始啟動)
Virtual memory can be implemented via
- Demand paging
- Demand segmentation : more complicated due to variable sizes
  - 但通常不會用這個。再搬回來就很可能找不到，要花 O(n) 時間在找。
    相對於 Paging : O(1)
- Paging 比較偏硬體那邊，Segment 則是對 programmer 比較友善

Demand Paging

需要時才載入
- Less I/O needed → Faster response
- Less memory needed → More users
但是 Access 時要檢查
1. 有沒有效
2. 是不是真的在 Memory
Pure Demand Paging
- 一開始什麼都沒有，load 一個 process → create page table
  → 要 run 的時候才真的開始抓
- 有點極端的例子，現今還是會至少 load 一些 memory content
Swapper (midterm scheduler) 以 Page 為單位在做搬移 (否則會是以 process 為單位)
Hardware Support
- Page table :
  - 1 → page in memory
  - 0 → page not in memory : 一開始預設全都是 0
- Secondary Memory (swap space, backing store) :
  通常會是 high-speed disk (swap device)
Page Fault 步驟
1. First reference to a page will trap to OS → page-fault trap
2. OS 查看 internal table (in PCB) to decide
  → Invalid reference : abort
  → Just not in memory : continue
3. Get an empty frame
4. Swap the page from disk (swap space) into the frame
5. Reset page table, valid-invalid bit = 1
6. Restart instruction
Page Replacement
- 為了決定 "誰該被踢出去"，需要 Replacement Algorithms
Demand Paging Performance
- Effective Access Time (EAT): (1 – p) x ma + p x pft
  - P : page fault rate, ma : mem. access time, pft : page fault time
  - 其實就很簡單的機率公式而已，知道一下就好
- EX : mem. access = 200ns, page faut time = 8ms
  EAT = (1 - p) * 200ns + p * 8ms = 200ns + 7,999,800ns x p
- 上面這個例子可以發現 bottleneck 是 Page fault time，畢竟硬碟真的太慢
  - Access time is proportional to the page fault rate
  - 我們可以調整的就剩下命中率，所以要盡量讓命中率高一點
- Locality 非常重要。(和 TLB 同樣道理)
  (Spatial Locality 和 Temporal Locality)
  因為有 Locality 的存在，一次搬一整個 Page，很多 ( 通常是 4KB )
  Hit rate 得以上升，讓 VM 對電腦不會有太大的影響
- Read in the page from disk (MOST expensive) → bottleneck

Process Creation

Demand Paging : only bring in the page containing the first instruction
- 只有 Load time 很小，每次 Swapping 很快
Copy-on-Write :
- the parent and the child process share the same frames initially, and frame-copy when a page is written
- 有 virtual address 的話，可以讓 fork 搭配 C-o-W，run time 再 create
Memory-Mapped File: map a file into the virtual address space to bypass file system calls (e.g., read(), write())
- 一般有點慢，若 file address 綁在 memory address 綁在一起 (跳過 file system)
  則可以 disk content 當 memory content 來用

Copy-on-Write

允許 parent 和 child 共用同一個 frames in memory
- 如果 frame 被更動了，就會 copy 一塊出來 → 更有效
  但會跑的時候會慢一點點
- btw, 找 Free frame 後記得要清空 Frame

Memory-Mapping Files

Approach :
讓使用者用 **Memory access**的方式去 r/w a file → 快 !
MMF allows file I/O to be treated as routine memory access by mapping a disk block to a memory frame
和 Memory-mapping I/O 有關
On modern operating systems, it is possible to mmap (pronounced “em-map”) a file to a region of memory. When this is done, the file can be accessed just like an array in the program.
reference
Benefit :
- 比起 read() and write() system calls，memory access 更快。
- 可以做 Share
  - 但有安全性問題，要另外 maintain
Concerns : Security, data lost, more programming efforts
- Data lost : 之前 file system 會幫你固定弄還原點，但 memory 斷電就會掉檔
flow :
- 注意後面的 munmap，最後要再 copy 進去。

Page Replacement

Page fault 發生時，總要找個傢伙把它踢回去硬碟。
為了效能，我們需要有個演算法幫我們決定。
Dirty bit : 確認 page 有沒有被寫過，如果沒被寫過
反正它本來就在硬碟了，也不用特別寫回。可以省很多時間
Solve two major problems for demand paging
- Frame-allocation algorithm : 決定分配多少 frame 給 process
- Page Replacement

Page Replacement Algorithms

目標 : lowest page-fault rate
Reference string : 依照時間排列，第幾個 page 準備要被 reference
(單純名詞解釋一下)

First-In-First-Out (FIFO) Algorithm

越老的越欠踢
Example :
Reference string : 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
→ 9 個 page faults (數紅色的)

FIFO Illustrating Belady's Anomaly
- 增加 frame 數量，結果竟然不一定會改善 = =
  這種毫無原因無法調整的東西不被接受，故大家更愛 LRU (可被證明)
- Example : 10 page faults

Optimal (Belady) Algorithm

最晚會被使用到的，把它踢出去。
- 但這要預知未來 ... 無法實做
Example : 6 page faults

LRU Algorithm (Least Recently Used)

An approximation of optimal algorithm
- 不敢稱王，但是接近最好
- 往回看 log 去篩選
Implement operation O(1)，用 double-linked list + hash map
- double-linked list : 想調整一個從 tail 到 head，就 pop tail & push to head
- hash map : 用來找 "那個 page number 是在第幾個 index"
  (這樣才能快速找到它，把它給 dequeue)
想法 : 用 time stamp 去調整。
觀察 :
cycle = 5，ref 1，把 1 調到 head ( the youngest )
cycle = 6，ref 2，把 2 調到 head

Stack Algorithm

A property of algorithms (可被證明，不會跑掉 ex: Optimal, LRU)
n+1 size 的 problem 一定是 n size 的 superset (超集合，總之就是更大的集合)
Stack algorithms do not suffers from Belady's anomaly
查了一下，最新的在 stack.top()，舊的會堆積在底下

Counting Algorithms (LFU, MFU)

LFU Algorithm : (least frequently used)
- 對於每次 ref 就 ++，越不常用(value低)的就踢
MFU Algorithm (most frequently used)
- 考慮到有可能 OS 用某東西用很久 ... 結果突然不用了卻一直占用
  所以反其道而行，值最高的踢。
這兩個都有共同的缺點 :
- 可能加到很大 (for迴圈...)，硬體會 overflow 等等的 issue
- 也不一定會接近 optimal。

The Clock Page Replacement (Second Chance)

補充一下，因為有些地方還是會考 ...
概念上類似 LRU，能保留最近使用過的頁框，但比 LRU 還要容易實作得多。
觀察 :
- 灰色代表這個格子下一個 cycle 是待宰對象
- 字母跟著的數字，剛被插入的時候是 1，每次輪到它就會 -1
  如果它是 0 的時候又是灰色，減下去變成 -1，代表它要被踢出去了。
- 如果 hit 就把該字母 +1，不用進入偵測灰格子在哪
- Cycle 4 : E1 -1 → E0，逃過一劫；F1 -1 → F0，也逃過，A1同理，最後還是回到 E0
  E0 -1 → E(-1)，所以 cycle 4 就決定是 E 出去。
- Cycle 5 : F hit, 所以 F0 → F1，且灰色格子不動，因為 hit，沒有人要被踢。
  Cycle 10 也是
- Cycle 7 : F hit, 所以 F0 → F1，且灰色格子不動 (停在 frame 1)

Allocation of Frames

每個 process 最低所需 frame 的數量，可以透過以下兩種去決定分配時機點
Fixed allocation : 固定，跟 run time 無關，事先決定好。
Priority allocation : Run time 決定，有 priority。

從其他角度 ...

Local allocation : 你在 replace 的時候只能在自己被分配的數量內做 replace
Global allocation : 從任何一個程式的 page 都可以去搶。
- 但這樣有些人太可憐了，通常會有 mimimun 限制，避免 trashing

其實 Fix 和 local 這種事先被決定好的，通常都不太好。(程式變化很大)

Trashing

Page fault 非常頻繁在做 → 一直做 I/O → CPU idle → 效能低落
→ OS 自作聰明增加 degree of multiprogramming (為了拯救 utilization)
→ 然後就什麼都做不了，完全 GG
只要花在 paging 的時間比 executing 多，就叫 trasing。
Prevent : Working-set model, Page-fault frequency

Working-Set Model

去算一段時間內用了多少 page，那個數量就是差不多該分配給它的數量。
Locality : a set of pages that are actively used together
- 會隨時間改變，所需的數量是會變動的
名詞 :
- window : a parameter ∆ (delta)，時間，time window
- working set : 那段時間所需 page 數量 (an approximation locality)
以這張圖舉例，左邊 : 固定保持過去 10 個紀錄，並且只有 {1, 2, 5, 6, 7} 被 ref
所以 WS size = 5

如果 Total WS Size > availabe frames → 可能會有 trashing
- 相反的，如果 WS Size << available frames，則可以提高 Multi 程度
- 注意，是 Total，用全局來看，調整整個環境可用的 frame 數量
但 ... too expensive for tracking (counting)

Page Fault Frequency Scheme

控制每個 Process Page fault rate 在某個 range 裡面
- 太高可能 trashing，太低(0) 就是 frame 太多了可以 multi 高一點
根據 process 去調整，給某某 process 多一點、挖誰的來補 ... etc.

Summary

核心 : Locality 會隨時間改變，我們需要動態調整。
一個週期大概是 peak 的開始到下一個 peak 開始之前。(如圖)
這張圖雖然 y 軸是統計 page fault rate
但每經過這樣的時間就應該重新調整，跟 working set 概念還是挺相似的。
Working-Set 還是比較少用，因為成本高

Thrashing 考題補充

這題簡直老套，中央超愛。

何者可以改善 Thrashing ?
- Increase multiprogramming degree
  - 你是要忙到死 ?
- Decrease multiprogramming degree
  - 給一點喘息空間才是對的
- Install faster CPU
  - 做得快跟你有沒有空間做是兩回事
- Install more main memory
  - 這是主要解法，有更多的 memory 可以用就能執行(load)更多 process
- Install bigger disk
  - 並無正負相關，你用了 10TB 的 disk 記憶體還是不夠用
    請不要把 VM 當作你可以直接執行程式的地方
- Install faster disk
  - 可能可以，畢竟 Thrashing 就是卡在大家都在 I/O，但治標不治本
- Local replacement policy used
  - 可能可以，有可能是原本的替換方式太爛了，所以用 LRU 可能可以改善 page falut 次數
- Prepaging used
  - 原本的 demand paging 會是 consecutive 的慢慢載入 page，但這樣會有一大堆 Page Falut
    所以這邊使用 Prepaging 來做改善，直接同時拿一堆可能程式剛執行會先要的
  - 但這也只有一開始有改善，後面程式執行如果不好好控制 PF 仍然會發生 Thrashing
- Use bigger page size
  - 一次搬比較多，這樣 I/O request 的頻率可以減少
- Use smaller page size
  - 狂做 I/O ...

Review

p.25

Virtual memory ? Physical Memory ?
Demand paging ?
Page table support for demand paging ?
OS handling steps for page fault ?
Page replacement ?
Copy-on-write ? Usage ?
Memory-mapped file ? Usage ?

p.45

Page replacement steps ?
Place replacement algorithm goal ?
Dirty bit usage ?
Belady’s anomaly ?
FIFO ? Optimal? LRU?
Fixed vs. priority frame allocation ?
Global vs. local frame allocation ?

p.55

Thrashing definition ?
Process locality ?
When will thrashing happen ? Solution ?

參考

Mage 大的筆記、周志遠教授 PPT 與影片

END

-----------------------------------

2019.12.31