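fix(api): fix timed-out GFW check tasks holding the active state, and clarify status display

Automatically mark pending/checking tasks as failed once they have gone more
than 5 minutes without being claimed or reported, so they no longer hold the
active state indefinitely and block new checks.

Also distinguish "waiting for node to claim" from "checking" in the frontend,
show the skip reason as a hint, and update the related tests and docs.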
yinjianm
2026-04-28 01:44:07 +08:00
parent ff50030364
commit 329d52f89f
7 changed files with 104 additions and 17 deletions
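The timeout rule described in the commit message can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not the code in this commit: the `GfwCheck` interface, the in-memory array, and the English error text are hypothetical stand-ins. The actual change is the PHP method `expireStaleActiveTasks` in `ServerGfwCheckService`, which performs a bulk database update.

```typescript
// Hypothetical in-memory model of a server_gfw_checks row (illustration only).
type CheckStatus = 'pending' | 'checking' | 'normal' | 'blocked' | 'partial' | 'failed' | 'skipped'

interface GfwCheck {
  id: number
  serverId: number
  status: CheckStatus
  updatedAt: number // Unix timestamp in seconds
  errorMessage?: string
}

// Mirrors ACTIVE_TASK_TIMEOUT_SECONDS = 300 from the commit (5 minutes).
const ACTIVE_TASK_TIMEOUT_SECONDS = 300

// A task is "active" while it is waiting to be claimed or waiting for a report.
function isActive(check: GfwCheck): boolean {
  return check.status === 'pending' || check.status === 'checking'
}

// Marks active tasks of the given servers as failed when their last update is
// at least the timeout old. Returns how many tasks were expired.
function expireStaleActiveTasks(checks: GfwCheck[], serverIds: number[], nowSeconds: number): number {
  const cutoff = nowSeconds - ACTIVE_TASK_TIMEOUT_SECONDS
  let expired = 0
  for (const check of checks) {
    if (serverIds.indexOf(check.serverId) !== -1 && isActive(check) && check.updatedAt <= cutoff) {
      check.status = 'failed'
      check.errorMessage = 'GFW check task timed out' // hypothetical wording
      expired += 1
    }
  }
  return expired
}

// Lists the servers that still hold an active task and therefore block a new check.
function activeServerIds(checks: GfwCheck[], serverIds: number[]): number[] {
  const ids: number[] = []
  for (const check of checks) {
    const wanted = serverIds.indexOf(check.serverId) !== -1
    if (wanted && isActive(check) && ids.indexOf(check.serverId) === -1) {
      ids.push(check.serverId)
    }
  }
  return ids
}
```

Running the expiry step before computing the active set is what lets a stale task's server start a fresh check in the same pass, which is the order the service uses in the diff below.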
+7
@@ -1,5 +1,12 @@
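 # CHANGELOG
+## [0.6.4] - 2026-04-28
+### Fixed
+- **[node-gfw-check]**: Fixed GFW check tasks that got stuck in `pending/checking` and held the active state indefinitely; tasks that the node has not claimed or reported on for more than 5 minutes are now marked as failed, and the admin panel distinguishes "waiting for node to claim" from "checking". Also corrected the ping success check in mi-node so that a reachable target whose average latency cannot be parsed is no longer misreported as a timeout (by yinjianm)
+- Type: quick change (no plan package)
+- Files: app/Services/ServerGfwCheckService.php, app/Console/Commands/SyncServerGfwChecks.php, admin-frontend/src/utils/nodes.ts, admin-frontend/src/views/nodes/NodesView.vue, E:/code/go/mi-node/internal/gfwcheck/gfwcheck.go
 ## [0.6.3] - 2026-04-28
 ### Added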
+2 -2
@@ -13,7 +13,7 @@
 - The child node list shows the parent node's latest `gfw_check` by inheritance and returns `inherited=true` and `source_node_id`
 - `server_gfw_checks.status` takes one of `pending / checking / normal / blocked / partial / failed / skipped`
 - The admin endpoint `POST server/manage/checkGfw` accepts `{ ids: number[] }`, and its response distinguishes `started` from `skipped`
-- The scheduled backend command `sync:server-gfw-checks` automatically creates check tasks for parent nodes with `gfw_check_enabled=1`; nodes that already have a `pending/checking` task are skipped to avoid duplicate checks
+- The scheduled backend command `sync:server-gfw-checks` automatically creates check tasks for parent nodes with `gfw_check_enabled=1`; nodes that already have a `pending/checking` task that has not timed out are skipped, and tasks that go more than 5 minutes without being claimed or reported are automatically marked `failed`
 - The node endpoint `GET server/gfw/task` returns pending tasks only to parent nodes; the node endpoint `POST server/gfw/report` must verify that `check_id` belongs to the current node
 - `v2_server.gfw_check_enabled` controls whether a node takes part in automatic GFW checks and in automatic show/hide based on GFW status; enabling it on a parent node creates check tasks automatically, while a child node is never checked independently but can individually opt out of being hidden / restored along with its parent
 - A `blocked` result automatically hides the parent node and its child nodes that still have GFW check management enabled and are currently shown, and sets `gfw_auto_hidden=1`
@@ -21,7 +21,7 @@
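 - `sync:server-auto-online` treats a latest GFW status of `blocked` and an unrestored `gfw_auto_hidden` as veto conditions for showing a node, so that auto-online does not republish nodes suspected of being blocked
 - Checks currently run in one direction only: the node server actively pings targets on the three major mainland China carriers; probing from IPs inside the firewall can be added later within the same task model
 - The Telegram notification, chat_id, bot token, and automatic dependency installation logic from the reference script must not enter the project implementation
-- mi-node uses a Go-native runner that calls the system `ping`, checks the three-carrier targets concurrently, and reports structured `summary / operator_summary / raw_result`
+- mi-node uses a Go-native runner that calls the system `ping`, checks the three-carrier targets concurrently, and reports structured `summary / operator_summary / raw_result`; a successful ping command now counts as the target being reachable, and a failure to parse the average latency no longer causes a reachable target to be misreported as a timeout
 - The Docker runtime image must provide `ping`; this is currently satisfied by Alpine `iputils`
 ## Dependencies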
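The ping rule noted in the mi-node bullet above (command success means reachable, even when the average latency cannot be parsed) can be illustrated with a small TypeScript sketch. The real logic lives in mi-node's Go code, which is not part of this diff, so everything here is an assumption made for illustration: the names `PingOutcome` and `interpretPing` are hypothetical, and the regular expression assumes the `min/avg/max` summary line printed by iputils or BusyBox `ping`.

```typescript
// Hypothetical result shape; not the structure mi-node actually reports.
interface PingOutcome {
  reachable: boolean
  avgLatencyMs: number | null // null when the summary line could not be parsed
}

// Decides reachability from the exit code alone, and treats latency parsing
// as best-effort so a parse failure never turns a success into a timeout.
function interpretPing(exitCode: number, stdout: string): PingOutcome {
  if (exitCode !== 0) {
    return { reachable: false, avgLatencyMs: null }
  }
  // Assumed summary formats:
  //   iputils: "rtt min/avg/max/mdev = 10.123/12.345/15.000/1.234 ms"
  //   BusyBox: "round-trip min/avg/max = 10.123/12.345/15.000 ms"
  const match = /=\s*[\d.]+\/([\d.]+)\//.exec(stdout)
  const avg = parseFloat(match?.[1] ?? '')
  return { reachable: true, avgLatencyMs: isNaN(avg) ? null : avg }
}
```

Keeping the latency optional means an unfamiliar output format only loses the latency figure and does not change the reachability verdict.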
+12 -2
@@ -120,10 +120,20 @@ export function getNodeGfwMeta(node: AdminNodeItem): NodeGfwMeta {
     }
   }

-  if (status === 'pending' || status === 'checking') {
+  if (status === 'pending') {
+    return {
+      label: `${inheritedPrefix}等待节点领取`,
+      searchText: `${inherited ? '随父节点 继承 ' : ''}等待节点领取 等待检测 gfw pending`,
+      tagType: 'primary',
+      tone: 'checking',
+      inherited,
+    }
+  }
+
+  if (status === 'checking') {
     return {
       label: `${inheritedPrefix}检测中`,
-      searchText: `${inherited ? '随父节点 继承 ' : ''}检测中 等待检测 gfw checking pending`,
+      searchText: `${inherited ? '随父节点 继承 ' : ''}检测中 正在检测 gfw checking`,
       tagType: 'primary',
       tone: 'checking',
       inherited,
+3 -2
@@ -376,7 +376,8 @@ async function handleCheckGfw(ids: number[], label: string) {
     if (started > 0) {
       ElMessage.success(`${label}已发起墙状态检测,${started} 个父节点等待上报`)
     } else if (skipped > 0) {
-      ElMessage.info('所选节点均为子节点,墙状态随父节点显示')
+      const reason = response.data?.skipped?.[0]?.reason
+      ElMessage.info(reason || '所选节点暂未发起新的墙状态检测')
     } else {
       ElMessage.info('没有可检测的节点')
     }
@@ -666,7 +667,7 @@ watch(
           <ElOption label="疑似被墙" value="blocked" />
           <ElOption label="部分异常" value="partial" />
           <ElOption label="检测失败" value="failed" />
-          <ElOption label="检测中" value="checking" />
+          <ElOption label="等待/检测中" value="checking" />
           <ElOption label="未检测" value="unchecked" />
           <ElOption label="随父节点" value="inherited" />
         </ElSelect>
+3 -2
@@ -19,11 +19,12 @@ class SyncServerGfwChecks extends Command
         );

         $this->info(sprintf(
-            'Server GFW checks synced: total=%d started=%d skipped=%d active=%d',
+            'Server GFW checks synced: total=%d started=%d skipped=%d active=%d expired=%d',
             $result['total'],
             count($result['started']),
             count($result['skipped']),
-            $result['active']
+            $result['active'],
+            $result['expired'] ?? 0
         ));

         return self::SUCCESS;
+60 -7
@@ -8,6 +8,8 @@ use Illuminate\Support\Collection;
 class ServerGfwCheckService
 {
+    private const ACTIVE_TASK_TIMEOUT_SECONDS = 300;
+
     private const TASK_STATUS = [
         ServerGfwCheck::STATUS_PENDING,
         ServerGfwCheck::STATUS_CHECKING,
@@ -17,6 +19,8 @@ class ServerGfwCheckService
     {
         $ids = array_values(array_unique(array_filter(array_map('intval', $ids))));
         $servers = Server::whereIn('id', $ids)->get()->keyBy('id');
+        $this->expireStaleActiveTasks($ids);
+        $activeLookup = $this->activeTaskServerLookup($ids);
         $started = [];
         $skipped = [];
@@ -46,6 +50,15 @@ class ServerGfwCheckService
                 continue;
             }

+            if (isset($activeLookup[(int) $server->id])) {
+                $skipped[] = [
+                    'id' => $id,
+                    'status' => ServerGfwCheck::STATUS_SKIPPED,
+                    'reason' => '已有检测任务等待节点领取或上报',
+                ];
+                continue;
+            }
+
             $check = $this->createCheck($server, $adminUserId);
             $started[] = [
                 'id' => $server->id,
@@ -74,12 +87,8 @@ class ServerGfwCheckService
         }

         $servers = $query->get();
-        $activeServerIds = ServerGfwCheck::whereIn('server_id', $servers->pluck('id'))
-            ->whereIn('status', self::TASK_STATUS)
-            ->pluck('server_id')
-            ->map(fn ($id) => (int) $id)
-            ->all();
-        $activeLookup = array_flip($activeServerIds);
+        $expired = $this->expireStaleActiveTasks($servers->pluck('id'));
+        $activeLookup = $this->activeTaskServerLookup($servers->pluck('id'));
         $started = [];
         $skipped = [];
@@ -105,7 +114,8 @@ class ServerGfwCheckService
             'started' => $started,
             'skipped' => $skipped,
             'total' => $servers->count(),
-            'active' => count($activeServerIds),
+            'active' => count($activeLookup),
+            'expired' => $expired,
         ];
     }
@@ -169,6 +179,8 @@ class ServerGfwCheckService
             return null;
         }

+        $this->expireStaleActiveTasks([$node->id]);
+
         $check = ServerGfwCheck::where('server_id', $node->id)
             ->whereIn('status', self::TASK_STATUS)
             ->orderByDesc('id')
@@ -185,6 +197,47 @@ class ServerGfwCheckService
         return $this->formatTask($check->refresh());
     }

+    private function activeTaskServerLookup($serverIds): array
+    {
+        $ids = collect($serverIds)
+            ->map(fn ($id) => (int) $id)
+            ->filter()
+            ->unique()
+            ->values();
+
+        if ($ids->isEmpty()) {
+            return [];
+        }
+
+        return array_flip(ServerGfwCheck::whereIn('server_id', $ids)
+            ->whereIn('status', self::TASK_STATUS)
+            ->pluck('server_id')
+            ->map(fn ($id) => (int) $id)
+            ->all());
+    }
+
+    private function expireStaleActiveTasks($serverIds): int
+    {
+        $ids = collect($serverIds)
+            ->map(fn ($id) => (int) $id)
+            ->filter()
+            ->unique()
+            ->values();
+
+        if ($ids->isEmpty()) {
+            return 0;
+        }
+
+        return ServerGfwCheck::whereIn('server_id', $ids)
+            ->whereIn('status', self::TASK_STATUS)
+            ->where('updated_at', '<=', now()->subSeconds(self::ACTIVE_TASK_TIMEOUT_SECONDS))
+            ->update([
+                'status' => ServerGfwCheck::STATUS_FAILED,
+                'error_message' => '墙检测任务超时:节点端未领取或未上报结果',
+                'checked_at' => time(),
+            ]);
+    }
+
     public function reportResult(Server $node, array $payload): bool
     {
         $checkId = (int) ($payload['check_id'] ?? 0);
+17 -2
@@ -16,6 +16,7 @@ class ServerGfwCheckServiceTest extends TestCase
     {
         $eligible = $this->makeServer(['name' => 'eligible-parent']);
         $active = $this->makeServer(['name' => 'active-parent']);
+        $stale = $this->makeServer(['name' => 'stale-parent']);
         $this->makeServer([
             'name' => 'disabled-parent',
             'gfw_check_enabled' => false,
@@ -29,17 +30,31 @@ class ServerGfwCheckServiceTest extends TestCase
             'server_id' => $active->id,
             'status' => ServerGfwCheck::STATUS_PENDING,
         ]);
+        $staleCheck = ServerGfwCheck::create([
+            'server_id' => $stale->id,
+            'status' => ServerGfwCheck::STATUS_PENDING,
+        ]);
+        $staleCheck->forceFill([
+            'created_at' => now()->subMinutes(10),
+            'updated_at' => now()->subMinutes(10),
+        ])->save();

         $result = app(ServerGfwCheckService::class)->startAutomaticChecks();

-        $this->assertSame(2, $result['total']);
+        $this->assertSame(3, $result['total']);
         $this->assertSame(1, $result['active']);
-        $this->assertSame([$eligible->id], array_column($result['started'], 'id'));
+        $this->assertSame(1, $result['expired']);
+        $this->assertSame([$eligible->id, $stale->id], array_column($result['started'], 'id'));
         $this->assertCount(1, $result['skipped']);
         $this->assertDatabaseHas('server_gfw_checks', [
             'server_id' => $eligible->id,
             'status' => ServerGfwCheck::STATUS_PENDING,
         ]);
+        $this->assertDatabaseHas('server_gfw_checks', [
+            'id' => $staleCheck->id,
+            'status' => ServerGfwCheck::STATUS_FAILED,
+            'error_message' => '墙检测任务超时:节点端未领取或未上报结果',
+        ]);
     }

     public function test_report_result_hides_blocked_nodes_and_restores_only_auto_hidden_nodes(): void