第6章: 並行システムのテストと検証

6.1 並行システムテストの理論

Dijkstraのテスト不可能性（1969年）

Edsger W. Dijkstraは1969年に、テストの本質的な限界を指摘した（"Notes on Structured Programming", 1969）:

> "Program testing can be used to show the presence of bugs, but never to show their absence." > （プログラムテストはバグの存在を示すことはできるが、バグの不在を証明することは決してできない）

並行システムでは、この問題はさらに深刻である:

逐次プログラム:
  同じ入力 → 同じ出力
  テスト可能なパス数: 有限（条件分岐の組み合わせ）

並行プログラム:
  同じ入力 → 異なる出力（可能）
  可能なインターリービング: 指数的に増大
  n個のスレッドがそれぞれm命令 → (nm)! / (m!)^n 通り

Heisenbugs（1987年）

Jim GrayとAndreas Reuterは1987年に、観察によって挙動が変化するバグをHeisenbugと命名した（"Transaction Processing: Concepts and Techniques", 1993年出版）:

語源: 量子力学のハイゼンベルクの不確定性原理から

Heisenbugの特徴:
1. 再現が困難（または不可能）
2. デバッガで観察すると消失
3. ログ出力を追加すると消失
4. タイミング依存

原因:
- printf()やデバッガの追加 → タイミング変化
- タイミング変化 → スケジューリング変化
- スケジューリング変化 → レース条件が発現しない

Bohrbugs（対照概念）:

決定的で再現可能なバグ
逐次プログラムの典型的なバグ
デバッグが容易

Beizer のテスト分類（1990年）

Boris Beizerは"Software Testing Techniques"（1990年）において、テスト技法を体系化した:

ブラックボックステスト:

内部構造を考慮しない
入力と出力の関係のみをテスト
境界値分析、同値分割

ホワイトボックステスト:

内部構造を考慮する
コードパスのカバレッジ
条件カバレッジ、分岐カバレッジ

並行システムでの追加考慮:

1. インターリービングカバレッジ
2. 同期ポイントのテスト
3. デッドロック条件の網羅
4. レース条件の誘発

6.2 形式的検証

形式的手法の歴史

形式的検証は、数学的手法でプログラムの正しさを証明する分野である。

Floyd-Hoare論理（1969年）:

前提条件と事後条件を用いた証明
{P} S {Q}
Pが成り立つ状態でSを実行すると、Qが成り立つ

TLA+（1994年）:

Leslie Lamportが開発したTLA+（Temporal Logic of Actions）は、並行・分散システムの仕様記述と検証のための言語である:

TLA+ での食事する哲学者問題の仕様（簡略版）:

---- MODULE DiningPhilosophers ----
CONSTANT N
VARIABLES state, forks

TypeOK ==
    /\ state \in [1..N -> {"thinking", "hungry", "eating"}]
    /\ forks \in [0..N-1 -> {"free", "taken"}]

Init ==
    /\ state = [i \in 1..N |-> "thinking"]
    /\ forks = [i \in 0..N-1 |-> "free"]

TakeLeft(i) ==
    /\ state[i] = "hungry"
    /\ forks[i-1] = "free"
    /\ forks' = [forks EXCEPT ![i-1] = "taken"]
    /\ UNCHANGED state

Eat(i) ==
    /\ state[i] = "hungry"
    /\ forks[i-1] = "taken"
    /\ forks[i % N] = "taken"
    /\ state' = [state EXCEPT ![i] = "eating"]
    /\ UNCHANGED forks

NoDeadlock == \E i \in 1..N: ENABLED(Eat(i))
====

モデル検査（Model Checking）

Edmund M. ClarkeとE. Allen Emersonは1981年にモデル検査を提案した（1981年、2007年チューリング賞受賞）:

モデル検査の原理:
1. システムを有限状態機械としてモデル化
2. 検証したい性質を時相論理で記述
3. 全ての状態を探索
4. 性質違反があれば反例を出力

利点:
- 自動化された検証
- 反例（デバッグ情報）の生成

欠点:
- 状態爆発（State Explosion）
- 大規模システムへの適用困難

SPINモデルチェッカー（Gerard J. Holzmann, 1991）:

/* Promela での食事する哲学者問題 */

#define N 5

byte fork[N] = 0;

proctype Philosopher(byte id) {
    byte left = id;
    byte right = (id + 1) % N;

    do
    :: /* thinking */
       atomic {
           fork[left] == 0 -> fork[left] = 1
       };
       atomic {
           fork[right] == 0 -> fork[right] = 1
       };
       /* eating */
       fork[left] = 0;
       fork[right] = 0
    od
}

init {
    byte i;
    for (i : 0 .. N-1) {
        run Philosopher(i)
    }
}

/* 検証したい性質 */
ltl no_deadlock { []<>somePhiloEating }

6.3 動的解析ツールの理論

Valgrindのアーキテクチャ

Nicholas Nethercoteらが開発したValgrind（2002年）は、動的バイナリ計装フレームワークである:

Valgrindの動作原理:

1. プログラムをVEX中間表現に変換
2. 計装コードを挿入
3. 変換されたコードを実行
4. 各ツールが特定の解析を実行

ツール:
- Memcheck: メモリエラー検出
- Helgrind: データレース検出
- DRD: データレース検出（別実装）
- Callgrind: プロファイリング

Helgrindのデータレース検出アルゴリズム:

Helgrindは、Happens-Before関係を追跡してデータレースを検出する:

各メモリアクセスについて:
1. アクセスの種類（読み取り/書き込み）を記録
2. アクセス時のロック状態を記録
3. スレッドのベクタークロックを更新
4. 前回のアクセスとの順序関係をチェック

順序付けがない場合 → データレース報告

ThreadSanitizerのアルゴリズム

Google開発のThreadSanitizer（TSan）は、コンパイル時計装による高速なデータレース検出を実現する:

TSanの最適化:
1. シャドウメモリ: 各メモリ位置の最近のアクセスを記録
2. ベクタークロック圧縮: 効率的なHappens-Before追跡
3. ロックセット: 保持ロックの集合を追跡

コンパイル時計装:
- 全てのメモリアクセス前にチェックコードを挿入
- ロック取得/解放をフック
- スレッド作成/合流をフック

TSanの使用:

# コンパイル
gcc -fsanitize=thread -g -O1 philo.c -lpthread -o philo

# 実行
./philo 5 800 200 200

# 典型的な出力
# WARNING: ThreadSanitizer: data race (pid=12345)
#   Write of size 4 at 0x... by thread T1:
#     #0 set_death philo.c:156
#   Previous read of size 4 at 0x... by thread T2:
#     #0 check_death philo.c:123

アドレスサニタイザ（ASan）

AddressSanitizerは、メモリエラー検出ツールである:

検出可能なエラー:
- ヒープバッファオーバーフロー
- スタックバッファオーバーフロー
- use-after-free
- double-free
- メモリリーク

動作原理:
- レッドゾーン: バッファ周囲に「毒」領域を配置
- シャドウメモリ: 各バイトの有効性を追跡
- フック: malloc/freeを置き換え

6.4 ストレステストの理論

負荷テストの原則

リトルの法則（1961年）:

John D.C. Littleが証明した待ち行列理論の基本定理:

L = λ × W

L: システム内の平均アイテム数
λ: 到着率（単位時間あたりの到着数）
W: 平均滞在時間

並行システムへの適用:

食事する哲学者問題:
- L: 食事中の平均哲学者数
- λ: 食事開始の頻度
- W: 食事時間

安定条件: λ × W < N/2
（N人の哲学者がN本のフォークを共有）

ストレステストのパターン

負荷増加テスト:

スレッド数を段階的に増加
1 → 2 → 5 → 10 → 50 → 100 → 200

各段階で:
- スループットを測定
- レイテンシを測定
- エラー率を測定

持続負荷テスト:

一定の負荷を長時間維持
目的:
- メモリリークの検出
- リソース枯渇の検出
- 長期的な安定性確認

スパイクテスト:

急激な負荷変動
低負荷 → 高負荷 → 低負荷
目的:
- システムの回復力確認
- バッファリングの検証

6.5 Philosophersのテスト戦略

機能テストケース

基本機能テスト:

#!/bin/bash
# test_basic.sh

RED='\033[0;31m'
GREEN='\033[0;32m'
NC='\033[0m'

test_case() {
    local name="$1"
    local args="$2"
    local timeout="$3"
    local expect_death="$4"

    echo -n "Testing: $name... "

    output=$(timeout "$timeout" ./philo $args 2>&1)
    exit_code=$?

    if [ $exit_code -eq 124 ]; then
        if [ "$expect_death" = "no" ]; then
            echo -e "${GREEN}PASS${NC} (no death, timeout expected)"
        else
            echo -e "${RED}FAIL${NC} (timeout)"
        fi
        return
    fi

    if echo "$output" | grep -q "died"; then
        if [ "$expect_death" = "yes" ]; then
            echo -e "${GREEN}PASS${NC} (death detected)"
        else
            echo -e "${RED}FAIL${NC} (unexpected death)"
            echo "$output" | grep "died"
        fi
    else
        if [ "$expect_death" = "no" ]; then
            echo -e "${GREEN}PASS${NC} (no death)"
        else
            echo -e "${RED}FAIL${NC} (expected death)"
        fi
    fi
}

echo "=== Basic Tests ==="
test_case "No death (generous time)" "5 800 200 200" 3 "no"
test_case "One philosopher dies" "1 400 200 200" 2 "yes"
test_case "Tight timing survives" "4 410 200 200" 5 "no"
test_case "Too tight dies" "4 310 200 100" 2 "yes"
test_case "Two philosophers" "2 800 200 200" 3 "no"

食事回数テスト:

echo "=== Must Eat Tests ==="

test_must_eat() {
    local philos="$1"
    local must_eat="$2"

    echo -n "Testing: $philos philosophers, $must_eat meals each... "

    output=$(timeout 10 ./philo $philos 800 200 200 $must_eat 2>&1)

    # 各哲学者が指定回数食べたか確認
    for i in $(seq 1 $philos); do
        eat_count=$(echo "$output" | grep " $i is eating" | wc -l)
        if [ $eat_count -lt $must_eat ]; then
            echo -e "${RED}FAIL${NC} (philosopher $i ate $eat_count times)"
            return
        fi
    done

    if echo "$output" | grep -q "died"; then
        echo -e "${RED}FAIL${NC} (unexpected death)"
    else
        echo -e "${GREEN}PASS${NC}"
    fi
}

test_must_eat 5 7
test_must_eat 4 10
test_must_eat 100 3

タイミング検証

#!/bin/bash
# test_timing.sh

check_death_timing() {
    local time_to_die="$1"
    local tolerance="$2"  # 許容誤差（ms）

    echo -n "Testing death timing (time_to_die=$time_to_die)... "

    output=$(timeout 5 ./philo 1 $time_to_die 200 200 2>&1)

    death_line=$(echo "$output" | grep "died")
    death_time=$(echo "$death_line" | awk '{print $1}')

    if [ -z "$death_time" ]; then
        echo -e "${RED}FAIL${NC} (no death detected)"
        return
    fi

    min_time=$((time_to_die))
    max_time=$((time_to_die + tolerance))

    if [ "$death_time" -ge "$min_time" ] && [ "$death_time" -le "$max_time" ]; then
        echo -e "${GREEN}PASS${NC} (died at ${death_time}ms)"
    else
        echo -e "${RED}FAIL${NC} (died at ${death_time}ms, expected ${min_time}-${max_time})"
    fi
}

echo "=== Timing Tests ==="
check_death_timing 100 15
check_death_timing 200 15
check_death_timing 500 15
check_death_timing 1000 20

データレーステスト

#!/bin/bash
# test_races.sh

echo "=== Data Race Detection with ThreadSanitizer ==="

# TSan付きでコンパイル
echo "Compiling with ThreadSanitizer..."
gcc -fsanitize=thread -g -O1 -pthread srcs/*.c -o philo_tsan 2>/dev/null

if [ $? -ne 0 ]; then
    echo "TSan compilation failed (may not be supported)"
    exit 1
fi

echo "Running tests..."

# 複数回実行（レース条件は確率的）
for i in {1..10}; do
    echo -n "Run $i: "
    output=$(timeout 3 ./philo_tsan 5 800 200 200 2>&1)

    if echo "$output" | grep -q "ThreadSanitizer: data race"; then
        echo -e "${RED}DATA RACE DETECTED${NC}"
        echo "$output" | grep -A 10 "ThreadSanitizer"
        exit 1
    else
        echo -e "${GREEN}clean${NC}"
    fi
done

echo "All runs passed without data races"

メモリリークテスト

#!/bin/bash
# test_memory.sh

echo "=== Memory Leak Detection with Valgrind ==="

valgrind --leak-check=full --error-exitcode=1 \
    --suppressions=/dev/null \
    ./philo 5 800 200 200 5 2>&1 | tee valgrind_output.txt

if grep -q "All heap blocks were freed" valgrind_output.txt; then
    echo -e "${GREEN}No memory leaks${NC}"
else
    echo -e "${RED}Memory leaks detected${NC}"
fi

# Helgrind でデータレースチェック
echo ""
echo "=== Data Race Detection with Helgrind ==="

valgrind --tool=helgrind --error-exitcode=1 \
    ./philo 5 800 200 200 3 2>&1 | tee helgrind_output.txt

if grep -q "ERROR SUMMARY: 0 errors" helgrind_output.txt; then
    echo -e "${GREEN}No data races${NC}"
else
    echo -e "${RED}Possible data races detected${NC}"
fi

6.6 デッドロック検出と回避

動的デッドロック検出

待機グラフ（Wait-for Graph）分析:

/* 動的デッドロック検出の実装概念 */

typedef struct s_wait_graph {
    int         thread_count;
    int         **edges;       /* edges[i][j] = 1: thread i waits for thread j */
    pthread_mutex_t mutex;
}   t_wait_graph;

/* サイクル検出（深さ優先探索） */
int has_cycle(t_wait_graph *graph)
{
    int *visited = calloc(graph->thread_count, sizeof(int));
    int *rec_stack = calloc(graph->thread_count, sizeof(int));
    int result = 0;

    for (int i = 0; i < graph->thread_count; i++)
    {
        if (!visited[i] && dfs_cycle(graph, i, visited, rec_stack))
        {
            result = 1;
            break;
        }
    }

    free(visited);
    free(rec_stack);
    return (result);
}

/* ロック取得時にグラフを更新 */
void before_lock(t_wait_graph *graph, int thread_id, int lock_owner)
{
    pthread_mutex_lock(&graph->mutex);
    graph->edges[thread_id][lock_owner] = 1;

    if (has_cycle(graph))
        printf("DEADLOCK DETECTED!\n");

    pthread_mutex_unlock(&graph->mutex);
}

静的デッドロック検出

コンパイル時にロック取得順序を分析:

ロック順序グラフ:
1. 全てのロック取得パターンを抽出
2. ロック間の順序関係をグラフ化
3. サイクルを検出

例:
関数A: lock(m1); lock(m2); unlock(m2); unlock(m1);
関数B: lock(m2); lock(m1); unlock(m1); unlock(m2);

→ m1 → m2 と m2 → m1 のエッジ存在
→ サイクル検出 → デッドロックの可能性

Philosophersでのデッドロック回避実装

/* 戦略1: リソース順序付け */
int take_forks_ordered(t_philo *philo)
{
    int first, second;

    /* 常にIDの小さいフォークから取得 */
    if (philo->left_fork_id < philo->right_fork_id)
    {
        first = philo->left_fork_id;
        second = philo->right_fork_id;
    }
    else
    {
        first = philo->right_fork_id;
        second = philo->left_fork_id;
    }

    pthread_mutex_lock(&philo->data->forks[first]);
    if (check_simulation_end(philo->data))
    {
        pthread_mutex_unlock(&philo->data->forks[first]);
        return (FALSE);
    }
    print_status(philo, "has taken a fork");

    pthread_mutex_lock(&philo->data->forks[second]);
    if (check_simulation_end(philo->data))
    {
        pthread_mutex_unlock(&philo->data->forks[second]);
        pthread_mutex_unlock(&philo->data->forks[first]);
        return (FALSE);
    }
    print_status(philo, "has taken a fork");

    return (TRUE);
}

/* 戦略2: 開始時刻のずらし */
void *philosopher_routine(void *arg)
{
    t_philo *philo = (t_philo *)arg;

    /* 偶数IDは開始を遅延 */
    if (philo->id % 2 == 0)
        precise_sleep(philo->data->time_to_eat / 2);

    /* ループ開始 */
    while (!check_simulation_end(philo->data))
    {
        /* ... */
    }
    return (NULL);
}

6.7 GDBによる並行プログラムのデバッグ

マルチスレッドデバッグの基本

# デバッグシンボル付きコンパイル
gcc -g -pthread srcs/*.c -o philo

# GDB起動
gdb ./philo

# GDB内コマンド
(gdb) set args 5 800 200 200
(gdb) break philosopher_routine
(gdb) run

スレッド操作

# 全スレッド表示
(gdb) info threads

# 出力例:
#   Id   Target Id         Frame
# * 1    Thread 0x7... (LWP 12345) "philo" in main at main.c:10
#   2    Thread 0x7... (LWP 12346) "philo" philosopher_routine at routine.c:25
#   3    Thread 0x7... (LWP 12347) "philo" philosopher_routine at routine.c:30

# スレッド切り替え
(gdb) thread 2

# 全スレッドのバックトレース
(gdb) thread apply all bt

# 特定のスレッドでコマンド実行
(gdb) thread apply 2-5 print philo->id

ミューテックスのデバッグ

# ミューテックスの状態確認
(gdb) print data->forks[0]

# ロックの所有者を確認（glibc実装依存）
(gdb) print data->forks[0].__data.__owner

# ブレークポイントで停止後、ミューテックスを解放
(gdb) call pthread_mutex_unlock(&data->forks[0])

非停止デバッグモード

# 非停止モード有効化（他のスレッドを停止しない）
(gdb) set target-async on
(gdb) set non-stop on

# 個別スレッドの続行
(gdb) continue &

# スレッドを停止
(gdb) interrupt -a

6.8 出力検証

ログ解析スクリプト

#!/usr/bin/env python3
# validate_philo_output.py

import sys
import re
from collections import defaultdict

def parse_line(line):
    """ログ行をパース"""
    match = re.match(r'^(\d+)\s+(\d+)\s+(.+)
継続的テスト
#!/bin/bash
# continuous_test.sh

RUNS=100
FAILURES=0

echo "Running $RUNS iterations..."

for i in $(seq 1 $RUNS); do
    # 進捗表示
    printf "\rRun %d/%d" $i $RUNS

    # 実行と出力キャプチャ
    timeout 5 ./philo 5 800 200 200 5 > output.txt 2>&1

    # 検証
    python3 validate_philo_output.py output.txt 800 > /dev/null 2>&1
    if [ $? -ne 0 ]; then
        FAILURES=$((FAILURES + 1))
        echo ""
        echo "Failure detected at run $i"
    fi

    # データレースチェック（オプション）
    if [ "$CHECK_RACES" = "1" ]; then
        timeout 5 ./philo_tsan 5 800 200 200 5 > /dev/null 2>&1
        if [ $? -ne 0 ]; then
            echo ""
            echo "Race detected at run $i"
        fi
    fi
done

echo ""
echo "Results: $((RUNS - FAILURES))/$RUNS passed, $FAILURES failures"
6.9 パフォーマンステスト
スループット測定
/* パフォーマンス測定用の計装 */

typedef struct s_perf_counters {
    atomic_long total_meals;
    atomic_long total_wait_time;
    long        start_time;
    long        end_time;
}   t_perf_counters;

void philo_eat(t_philo *philo)
{
    long before_eat = get_timestamp_ms();

    print_status(philo, "is eating");

    pthread_mutex_lock(&philo->data->meal_mutex);
    philo->last_meal_time = get_timestamp_ms();
    philo->meals_eaten++;
    pthread_mutex_unlock(&philo->data->meal_mutex);

    /* パフォーマンスカウンタ更新 */
    atomic_fetch_add(&philo->data->perf->total_meals, 1);

    precise_sleep(philo->data->time_to_eat);
}

void print_performance_stats(t_data *data)
{
    long duration = data->perf->end_time - data->perf->start_time;
    long total_meals = atomic_load(&data->perf->total_meals);

    printf("\n=== Performance Statistics ===\n");
    printf("Duration: %ld ms\n", duration);
    printf("Total meals: %ld\n", total_meals);
    printf("Throughput: %.2f meals/sec\n",
           (double)total_meals / duration * 1000);
    printf("Avg meals per philosopher: %.2f\n",
           (double)total_meals / data->num_philos);
}
ベンチマークスクリプト
#!/bin/bash
# benchmark.sh

echo "=== Philosopher Benchmark ==="
echo ""

benchmark() {
    local philos=$1
    local time=$2
    local must_eat=$3

    echo "Config: $philos philosophers, $must_eat meals each"

    start=$(date +%s%N)
    ./philo $philos $time 200 200 $must_eat > /dev/null 2>&1
    end=$(date +%s%N)

    duration=$(( (end - start) / 1000000 ))  # ミリ秒
    total_meals=$(( philos * must_eat ))
    throughput=$(echo "scale=2; $total_meals / $duration * 1000" | bc)

    echo "  Duration: ${duration}ms"
    echo "  Throughput: ${throughput} meals/sec"
    echo ""
}

benchmark 5 800 10
benchmark 10 800 10
benchmark 50 800 10
benchmark 100 800 5
benchmark 200 800 3
6.10 まとめ
本章では、並行システムのテストと検証について学んだ:
テスト理論: Dijkstraのテスト限界、Heisenbugs
形式的検証: TLA+、モデル検査
動的解析: Valgrind、ThreadSanitizer
ストレステスト: 負荷テストの原則、リトルの法則
機能テスト: タイミング検証、データレース検出
デッドロック: 検出と回避戦略
GDBデバッグ: マルチスレッドデバッグ技法
出力検証: ログ解析、継続テスト
並行プログラムのテストは、その非決定性により本質的に困難である。しかし、適切なツールと体系的なアプローチにより、高品質な並行ソフトウェアを開発することは可能である。
---
参考文献
Dijkstra, E. W. (1969). "Notes on Structured Programming", Technical Report EWD-249
Gray, J. & Reuter, A. (1993). "Transaction Processing: Concepts and Techniques", Morgan Kaufmann
Beizer, B. (1990). "Software Testing Techniques", 2nd Edition, Van Nostrand Reinhold
Clarke, E. M., Emerson, E. A., & Sifakis, J. (2009). "Model Checking: Algorithmic Verification and Debugging", Communications of the ACM, 52(11)
Lamport, L. (1994). "The Temporal Logic of Actions", ACM TOPLAS, 16(3)
Nethercote, N. & Seward, J. (2007). "Valgrind: A Framework for Heavyweight Dynamic Binary Instrumentation", PLDI 2007
Serebryany, K. & Iskhodzhanov, T. (2009). "ThreadSanitizer: Data Race Detection in Practice", WBIA 2009
Holzmann, G. J. (2004). "The SPIN Model Checker", Addison-Wesley
Savage, S., et al. (1997). "Eraser: A Dynamic Data Race Detector for Multithreaded Programs", ACM TOCS, 15(4)
Downey, A. B. (2016). "The Little Book of Semaphores", Green Tea Press
, line.strip())
    if not match:
        return None
    return {
        'timestamp': int(match.group(1)),
        'philo_id': int(match.group(2)),
        'action': match.group(3)
    }

def validate_output(filename, time_to_die):
    """出力を検証"""
    with open(filename, 'r') as f:
        lines = f.readlines()

    events = []
    errors = []

    for i, line in enumerate(lines):
        parsed = parse_line(line)
        if not parsed:
            errors.append(f"Line {i+1}: Invalid format: {line.strip()}")
            continue
        parsed['line_num'] = i + 1
        events.append(parsed)

    # タイムスタンプの単調増加チェック
    prev_ts = -1
    for event in events:
        if event['timestamp'] < prev_ts:
            errors.append(f"Line {event['line_num']}: Timestamp decreased")
        prev_ts = event['timestamp']

    # 死亡後の出力チェック
    death_found = False
    for event in events:
        if death_found and event['action'] != 'died':
            errors.append(f"Line {event['line_num']}: Output after death")
        if event['action'] == 'died':
            death_found = True

    # 状態遷移の検証
    philo_states = defaultdict(list)
    for event in events:
        philo_states[event['philo_id']].append(event)

    for philo_id, state_events in philo_states.items():
        last_meal_time = 0
        for event in state_events:
            if event['action'] == 'is eating':
                # 死亡までの時間チェック
                if event['timestamp'] - last_meal_time > time_to_die:
                    errors.append(
                        f"Philosopher {philo_id} ate after time_to_die "
                        f"(gap: {event['timestamp'] - last_meal_time}ms)"
                    )
                last_meal_time = event['timestamp']

    return errors

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python3 validate.py <output_file> <time_to_die>")
        sys.exit(1)

    errors = validate_output(sys.argv[1], int(sys.argv[2]))

    if errors:
        print("Validation FAILED:")
        for error in errors[:10]:  # 最初の10個のエラー
            print(f"  - {error}")
        if len(errors) > 10:
            print(f"  ... and {len(errors) - 10} more errors")
        sys.exit(1)
    else:
        print("Validation PASSED")
        sys.exit(0)

継続的テスト

___CODE_BLOCK_34___

6.9 パフォーマンステスト

スループット測定

___CODE_BLOCK_35___

ベンチマークスクリプト

___CODE_BLOCK_36___

6.10 まとめ

本章では、並行システムのテストと検証について学んだ:

テスト理論: Dijkstraのテスト限界、Heisenbugs
形式的検証: TLA+、モデル検査
動的解析: Valgrind、ThreadSanitizer
ストレステスト: 負荷テストの原則、リトルの法則
機能テスト: タイミング検証、データレース検出
デッドロック: 検出と回避戦略
GDBデバッグ: マルチスレッドデバッグ技法
出力検証: ログ解析、継続テスト

並行プログラムのテストは、その非決定性により本質的に困難である。しかし、適切なツールと体系的なアプローチにより、高品質な並行ソフトウェアを開発することは可能である。

---

参考文献

Dijkstra, E. W. (1969). "Notes on Structured Programming", Technical Report EWD-249
Gray, J. & Reuter, A. (1993). "Transaction Processing: Concepts and Techniques", Morgan Kaufmann
Beizer, B. (1990). "Software Testing Techniques", 2nd Edition, Van Nostrand Reinhold
Clarke, E. M., Emerson, E. A., & Sifakis, J. (2009). "Model Checking: Algorithmic Verification and Debugging", Communications of the ACM, 52(11)
Lamport, L. (1994). "The Temporal Logic of Actions", ACM TOPLAS, 16(3)
Nethercote, N. & Seward, J. (2007). "Valgrind: A Framework for Heavyweight Dynamic Binary Instrumentation", PLDI 2007
Serebryany, K. & Iskhodzhanov, T. (2009). "ThreadSanitizer: Data Race Detection in Practice", WBIA 2009
Holzmann, G. J. (2004). "The SPIN Model Checker", Addison-Wesley
Savage, S., et al. (1997). "Eraser: A Dynamic Data Race Detector for Multithreaded Programs", ACM TOCS, 15(4)
Downey, A. B. (2016). "The Little Book of Semaphores", Green Tea Press