Merge pull request 'refactor: optimize AI payload by reducing token usage and streamline findings structure' (#89 ) from feat/壓縮AI內容 into 整理程式碼

Reviewed-on: #89
chore: update ai-review findings [skip ci]
9 changed files with 220 additions and 110 deletions
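@@ -149,6 +149,11 @@
    "location": "app/llm.test.js",
    "suggestion": "The rotation logic behaves identically for every error type (it catches everything); 401, 429, and timeout all trigger the same rotation flow, so testing each error type separately adds no verification value"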
  },
  {
    "role": "Leo",
    "location": "app/main.js",
    "suggestion": "The Step heading comments in main.js describe the pipeline flow; they are not TODOs awaiting cleanup and do not need to be converted into concrete tasks"
  },
  {
    "role": "Rex",
    "location": "app/package.json",
@@ -203,5 +208,20 @@
    "role": "Zara",
    "location": "app/main.js",
    "suggestion": "deduplicateWithAI and filterFalsePositivesWithAI are sequentially dependent (filtering can only run after deduplication), so they cannot be parallelized"
  },
  {
    "role": "Leo",
    "location": "app/comments.js",
    "suggestion": "buildTable is already defined on line 13 of comments.js; it is neither undefined nor un-imported, and it will not cause a runtime error"
  },
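  {
    "role": "Maya",
    "location": "app/gitea.js",
    "suggestion": "Unit tests for filterDiff have been added in gitea.test.js, covering four scenarios: filtering .gitea/, not filtering other paths by mistake, all blocks excluded, and an empty diff"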
  },
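  {
    "role": "Leo",
    "location": "TODO.md",
    "suggestion": "The stage numbers in TODO.md are for internal development tracking only; no external document references them, so renumbering stages does not affect any external consistency"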
  }
 ]
@@ -2,8 +2,22 @@
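  {
    "level": "info",
    "role": "Rex",
-    "location": "action.yaml",
+    "location": "app/gitea.js:19",
-    "suggestion": "This Action requires the `contents: write`, `pull-requests: write`, and `issues: write` permissions. They are necessary for the Action to work (for example, writing findings.json and posting comments) but are fairly broad. Consider stating these requirements and their potential impact explicitly in the documentation or usage notes so that users understand and accept them.",
+    "suggestion": "Changing the diff block filtering logic in `filterDiff` from a regular expression to `startsWith` is an important security improvement. It effectively prevents potential regular expression injection attacks and keeps the filtering logic safe even if the `excludePrefixes` parameter comes under external control in the future.",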
-    "is_new": true
+    "is_new": false
  },
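  {
    "level": "info",
    "role": "Rex",
    "location": "app/main.js:46",
    "suggestion": "Explicitly calling `filterDiff` to exclude sensitive paths such as `.gitea/` before passing the Git Diff content to the AI for analysis is a good security practice. It helps keep the AI from analyzing unnecessary non-business code or code containing sensitive configuration, reducing the risk of information leakage.",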
    "is_new": false
  },
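  {
    "level": "info",
    "role": "Rex",
    "location": "app/main.js:98",
    "suggestion": "Adding a step that validates the JSON format of `findings.json` and `exclusions.json`, and attempts a reset and backup when the format is invalid, is an important robustness and security measure. It prevents service interruption or abnormal behavior caused by file corruption or malicious modification, ensuring system stability and data integrity.",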
    "is_new": false
  }
 ]
@@ -23,6 +23,8 @@
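 5. Place the prompts in ./app/prompts for the program to read
 6. API keys can be passed as a comma-separated list; each key is tried once in random order, and the process exits with code 1 if all of them fail
 7. When reading the Git Diff, exclude every file under the `.gitea/` folder so the AI does not analyze non-business code such as workflow configuration
 8. After stage five completes, validate that `findings.json` and `exclusions.json` are valid JSON; on a format error, first try resetting the file to an empty array and backing up the original, and exit 1 only if the repair fails
 9. Findings sent to the AI keep only the required fields (level, role, location, suggestion) and exclude internal fields such as `is_new`; the system prompt is trimmed to its core instructions; the exclusions hint sends only location and suggestion, reducing token usage
 # Usage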
@@ -30,6 +32,8 @@
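 2. Create `ai-review.yaml` in the `.gitea/workflows` folder
 3. Fill in `ai-review.yaml` with the following content (choose one):
 > **Permissions**: This Action requires three permissions: `contents: write` (to write findings.json), `pull-requests: write` (to post PR comments), and `issues: write` (to post issue comments). All three are necessary for normal operation and cannot be reduced.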
 ### 1. OpenAI
 ```yaml
 name: AI
@@ -3,48 +3,54 @@
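 ## Stage 1: Basic pipeline wiring
 - Goal: Ensure the action can be triggered, the pipeline steps run in order, and the log records entry to and completion of each main stage.
 - Acceptance: The log shows an explicit message for each stage (e.g. "Step1: pipeline start", "Step2: findings merge") and the flow runs to completion (even if no findings are produced yet).
- Done
+- Not yet accepted
-## Stage 2: Findings generation and merging
+## Stage 2: Exclude the .gitea/ folder from the Git Diff
 - Goal: Each role (style/security/performance/maintainability/testing) can produce findings, and new and old findings are merged correctly.
 - Acceptance: The log shows the number of findings per role and the merged findings totals, including messages such as "Step3: merged findings total=...".
 - Done
 ## Stage 3: AI deduplication and role confirmation
 - Goal: Attempt to call the LLM for findings deduplication and role confirmation; when the API quota is insufficient, log the fallback handling.
 - Acceptance: The log shows whether deduplication/resolution confirmation succeeded or failed (e.g. 402), with an explicit message such as "保留所有問題" (keep all findings) on fallback.
 - Done
 ## Stage 4: AI exclusion filtering
 - Goal: Read the exclusions file (`.gitea/ai-review/exclusions.json`) for rule-based filtering, then call the AI to judge whether the remaining findings are false positives or not applicable; the final findings list is produced after both filtering layers.
 - Acceptance: The log shows whether the exclusions file was read successfully or does not exist, the change in count from rule filtering, and either "AI 誤報過濾: N -> M 筆" (AI false-positive filtering: N -> M items) or a fallback message.
 - Done
 ## Stage 5: Writing findings and posting comments
 - Goal: `.gitea/ai-review/findings.json` is written correctly, comments are posted in the correct order (old findings → non-critical → critical), and every step is logged.
 - Acceptance: The log shows the write to `.gitea/ai-review/findings.json` and the detailed comment sync messages in order.
 - Done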
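 ## Stage 6: Memory area commit/push and error handling
 - Goal: The memory area is committed and pushed successfully, errors are logged clearly, and a summary message is printed when the flow ends.
 - Acceptance: The log contains messages such as "persisted findings", "commit=...", and "push=..."; on error it contains an explicit description such as "Runner failed: ...".
 - Done
 ## Stage 7: Block PRs with critical findings (item 8)
 - Goal: If the PR findings table contains a critical finding, the workflow must exit 1 directly so the run does not succeed.
 - Acceptance: The log shows an explicit message such as "critical 問題存在，workflow 結束（exit 1）" (critical findings exist, workflow ends with exit 1), and the workflow status is failed.
 - Done
 ## Stage 8: API key rotation
 - Goal: API keys for every platform can be passed as a comma-separated list; each key is tried once in random order, a failed key is automatically replaced by the next one, and the process exits 1 if all of them fail.
 - Acceptance: The log shows messages such as "key[N/M] 失敗" (key[N/M] failed) and execution continues after switching keys; with a single key, behavior is unchanged; when every key fails, the log shows "所有 API Key 均失敗，終止流程" (all API keys failed, terminating) and the workflow status is failed.
 - Done
 ## Stage 9: Exclude the .gitea/ folder from the Git Diff
 - Goal: When reading the Git Diff, exclude every file under the `.gitea/` folder so the AI does not analyze non-business code such as workflow configuration.
 - Acceptance: When a PR changes paths under `.gitea/`, the diff content contains no blocks for those paths and the AI analysis contains no `.gitea/`-related findings.
- Done
+- Not yet accepted
---
+## Stage 3: Findings generation and merging
 - Goal: Each role (style/security/performance/maintainability/testing) can produce findings, and new and old findings are merged correctly.
 - Acceptance: The log shows the number of findings per role and the merged findings totals, including messages such as "Step3: merged findings total=...".
 - Not yet accepted
-All stages passed acceptance.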
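+## Stage 4: AI deduplication and role confirmation
 - Goal: Attempt to call the LLM for findings deduplication and role confirmation; when the API quota is insufficient, log the fallback handling.
 - Acceptance: The log shows whether deduplication/resolution confirmation succeeded or failed (e.g. 402), with an explicit message such as "保留所有問題" (keep all findings) on fallback.
 - Not yet accepted
 ## Stage 5: AI exclusion filtering
 - Goal: Read the exclusions file (`.gitea/ai-review/exclusions.json`) for rule-based filtering, then call the AI to judge whether the remaining findings are false positives or not applicable; the final findings list is produced after both filtering layers.
 - Acceptance: The log shows whether the exclusions file was read successfully or does not exist, the change in count from rule filtering, and either "AI 誤報過濾: N -> M 筆" (AI false-positive filtering: N -> M items) or a fallback message.
 - Not yet accepted
 ## Stage 6: Writing findings and posting comments
 - Goal: `.gitea/ai-review/findings.json` is written correctly, comments are posted in the correct order (old findings → non-critical → critical), and every step is logged.
 - Acceptance: The log shows the write to `.gitea/ai-review/findings.json` and the detailed comment sync messages in order.
 - Not yet accepted
 ## Stage 7: Validate JSON format after stage 6
 - Goal: After stage 6 completes, validate that `findings.json` and `exclusions.json` are valid JSON; on a format error, first try resetting the file to an empty array and backing up the original, and exit 1 only if the repair fails.
 - Acceptance: The log shows the validation result (success or failure) for both files; on a format error there is a "嘗試修正" (attempting repair) message and the backup path; if the repair fails, the workflow status is failed.
 - Not yet accepted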
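 ## Stage 8: Memory area commit/push and error handling
 - Goal: The memory area is committed and pushed successfully, errors are logged clearly, and a summary message is printed when the flow ends.
 - Acceptance: The log contains messages such as "persisted findings", "commit=...", and "push=..."; on error it contains an explicit description such as "Runner failed: ...".
 - Not yet accepted
 ## Stage 9: Block PRs with critical findings (item 8)
 - Goal: If the PR findings table contains a critical finding, the workflow must exit 1 directly so the run does not succeed.
 - Acceptance: The log shows an explicit message such as "critical 問題存在，workflow 結束（exit 1）" (critical findings exist, workflow ends with exit 1), and the workflow status is failed.
 - Not yet accepted
 ## Stage 10: API key rotation
 - Goal: API keys for every platform can be passed as a comma-separated list; each key is tried once in random order, a failed key is automatically replaced by the next one, and the process exits 1 if all of them fail.
 - Acceptance: The log shows messages such as "key[N/M] 失敗" (key[N/M] failed) and execution continues after switching keys; with a single key, behavior is unchanged; when every key fails, the log shows "所有 API Key 均失敗，終止流程" (all API keys failed, terminating) and the workflow status is failed.
 - Not yet accepted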
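The key rotation behavior described in the stage above (comma-separated input, random order, one attempt per key, failure only when every key fails) could look roughly like the sketch below. The names `shuffle` and `withKeyRotation`, the English log wording, and the thrown error are all hypothetical choices for illustration; the repository's actual rotation code is not part of this diff.

```javascript
// Hypothetical sketch, not the repository's implementation.
// Returns a shuffled copy of the input (Fisher-Yates).
function shuffle(items) {
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Tries each key once, in random order; rejects only if every key fails.
async function withKeyRotation(rawKeys, call) {
  const keys = shuffle(rawKeys.split(',').map(k => k.trim()).filter(Boolean));
  if (keys.length === 0) throw new Error('no API key provided');
  for (let i = 0; i < keys.length; i++) {
    try {
      return await call(keys[i]);
    } catch (e) {
      console.log(`  key[${i + 1}/${keys.length}] failed: ${e.message}`);
    }
  }
  throw new Error('all API keys failed');
}
```

A caller would presumably map the final rejection to `process.exit(1)`. With a single key the loop runs once, so behavior matches the non-rotating case.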
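 ## Stage 11: Compress AI input to reduce token usage
 - Goal: Findings sent to the AI keep only the required fields (level, role, location, suggestion); the system prompt is trimmed to its core instructions; the exclusions hint sends only location and suggestion; after the AI responds, the original full fields (including is_new) are restored.
 - Acceptance: The AI call payload contains no internal fields such as is_new, and findings after deduplication and false-positive filtering still carry the full fields for downstream steps.
 - Not yet accepted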
@@ -18,25 +18,31 @@ export async function analyzeWithRole(role, diff) {
 }
 /**
- * Read old findings (from FINDINGS_PATH in the workspace)
+ * Read a JSON array file; return an empty array if it is missing or fails to parse
 */
-export function loadOldFindings(workspace) {
+function readJSONArray(fullPath, label) {
-  const fullPath = path.join(workspace, FINDINGS_PATH);
  if (!fs.existsSync(fullPath)) {
-    console.log('  舊 findings 檔案不存在，視為空');
+    console.log(`  ${label}檔案不存在，視為空`);
    return [];
  }
  try {
    const data = JSON.parse(fs.readFileSync(fullPath, 'utf8'));
-    const old = (Array.isArray(data) ? data : []).map(f => ({ ...f, is_new: false }));
+    return Array.isArray(data) ? data : [];
-    console.log(`  讀取舊 findings: ${old.length} 筆`);
-    return old;
  } catch (e) {
-    console.log(`  ⚠️  讀取舊 findings 失敗: ${e.message}，視為空`);
+    console.log(`  ⚠️  讀取${label}失敗: ${e.message}，視為空`);
    return [];
  }
 }
+/**
+ * Read old findings (from FINDINGS_PATH in the workspace)
+ */
+export function loadOldFindings(workspace) {
+  const old = readJSONArray(path.join(workspace, FINDINGS_PATH), '舊 findings ').map(f => ({ ...f, is_new: false }));
+  console.log(`  讀取舊 findings: ${old.length} 筆`);
+  return old;
+}
 /**
 * Merge new and old findings, deduplicating on the key (role + location + first 50 characters of suggestion)
 */
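The body of the merge function is outside this hunk, so the following is only a sketch of how a merge on the documented key (role + location + first 50 characters of suggestion) might work. `mergeKey` and `mergeByKey` are hypothetical names, and letting the older entry win on a collision is an assumption, not something the diff confirms.

```javascript
// Hypothetical sketch of key-based merging; the real mergeFindings may differ.
const mergeKey = f => `${f.role}|${f.location}|${String(f.suggestion).slice(0, 50)}`;

function mergeByKey(oldFindings, newFindings) {
  const merged = new Map();
  // Old findings are inserted first, so they win on key collisions (assumption).
  for (const f of [...oldFindings, ...newFindings]) {
    const key = mergeKey(f);
    if (!merged.has(key)) merged.set(key, f);
  }
  return [...merged.values()];
}

const oldFindings = [
  { role: 'Rex', location: 'app/main.js:46', suggestion: 'Call filterDiff before analysis', is_new: false },
];
const newFindings = [
  { role: 'Rex', location: 'app/main.js:46', suggestion: 'Call filterDiff before analysis', is_new: true },
  { role: 'Leo', location: 'TODO.md', suggestion: 'Stage numbers are internal only', is_new: true },
];
const merged = mergeByKey(oldFindings, newFindings);
console.log(merged.length);
```

Because only the first 50 characters of `suggestion` enter the key, two long suggestions that share an opening sentence would collapse into one entry under this scheme.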
@@ -70,22 +76,26 @@ function fallback(label, findings, e) {
  return findings;
 }
+/** Keep only the fields the AI needs, to reduce token usage */
+function toAIPayload(findings) {
+  return findings.map(({ level, role, location, suggestion }) => ({ level, role, location, suggestion }));
+}
 /**
 * Call the LLM for semantic deduplication; on failure, fall back to returning the original findings
 */
 export async function deduplicateWithAI(findings) {
  if (findings.length === 0) return findings;
-  const systemPrompt = `你是一位程式碼審查問題去重專家。
+  const systemPrompt = `移除語意重複的程式碼審查問題（JSON 陣列）。保留等級較高者（critical > warning > info）。只回傳去重後的 JSON 陣列。`;
-給你一份問題清單（JSON 陣列），請移除語意重複的問題（即使描述文字不同，但指的是同一個問題）。
-保留等級較高的版本，優先保留 critical > warning > info。
-只回傳去重後的 JSON 陣列，不要有其他文字。`;
  try {
-    const result = await chatJSON(systemPrompt, `以下是問題清單，請去除語意重複的項目：\n\n${JSON.stringify(findings, null, 2)}`);
+    const result = await chatJSON(systemPrompt, JSON.stringify(toAIPayload(findings)));
    if (Array.isArray(result) && result.length > 0) {
      console.log(`  AI 去重: ${findings.length} -> ${result.length} 筆`);
-      return result;
+      // Using location+suggestion as the key, restore the full fields (including is_new) from the original findings
+      const origMap = new Map(findings.map(f => [`${f.location}|${String(f.suggestion).slice(0, 50)}`, f]));
+      return result.map(r => origMap.get(`${r.location}|${String(r.suggestion).slice(0, 50)}`) ?? r);
    }
    throw new Error('AI 回傳空陣列');
  } catch (e) {
@@ -97,20 +107,9 @@ export async function deduplicateWithAI(findings) {
 * Read the exclusions file (from EXCLUSIONS_PATH in the workspace)
 */
 export function loadExclusions(workspace) {
-  const fullPath = path.join(workspace, EXCLUSIONS_PATH);
+  const exclusions = readJSONArray(path.join(workspace, EXCLUSIONS_PATH), '排除問題');
-  if (!fs.existsSync(fullPath)) {
+  console.log(`  讀取排除問題: ${exclusions.length} 筆`);
-    console.log('  排除問題檔案不存在，跳過過濾');
+  return exclusions;
-    return [];
-  }
-  try {
-    const data = JSON.parse(fs.readFileSync(fullPath, 'utf8'));
-    const exclusions = Array.isArray(data) ? data : [];
-    console.log(`  讀取排除問題: ${exclusions.length} 筆`);
-    return exclusions;
-  } catch (e) {
-    console.log(`  ⚠️  讀取排除問題失敗: ${e.message}，跳過過濾`);
-    return [];
-  }
 }
 /**
@@ -136,22 +135,17 @@ export async function filterFalsePositivesWithAI(findings, exclusions = []) {
  if (findings.length === 0) return findings;
  const exclusionHint = exclusions.length > 0
-    ? `\n\n以下是已知的誤報或不需處理的問題清單（供參考，相同檔案路徑且語意相近的問題應一併排除）：\n${JSON.stringify(exclusions, null, 2)}`
+    ? `\n已知誤報（相同路徑且語意相近者一併排除）：\n${JSON.stringify(exclusions.map(({ location, suggestion }) => ({ location, suggestion })))}`
    : '';
-  const systemPrompt = `你是一位資深程式碼審查專家，負責判斷審查問題是否為誤報或不需處理。
+  const systemPrompt = `判斷以下程式碼審查問題是否為誤報或不適用（如已正確使用 secrets、CI/CD 必要權限等），移除後只回傳需保留的 JSON 陣列。${exclusionHint}`;
-給你一份問題清單（JSON 陣列），每筆包含 level、role、location、suggestion。
-請移除以下類型的問題：
-1. 誤報：問題描述與實際程式碼不符（例如：程式碼已正確使用環境變數或 secrets，卻被標記為硬編碼敏感資料）
-2. 不適用：問題在此專案情境下不需處理（例如：CI/CD action 本來就需要透過環境變數傳遞 token）
-3. 與已知誤報清單語意相近的問題（檔案路徑相同且建議內容相似）
-只回傳需要保留的問題 JSON 陣列，不要有其他文字。${exclusionHint}`;
  try {
-    const result = await chatJSON(systemPrompt, `請判斷以下問題清單，移除誤報或不需處理的問題：\n\n${JSON.stringify(findings, null, 2)}`);
+    const result = await chatJSON(systemPrompt, JSON.stringify(toAIPayload(findings)));
    if (Array.isArray(result) && result.length > 0) {
      console.log(`  AI 誤報過濾: ${findings.length} -> ${result.length} 筆`);
-      return result;
+      const origMap = new Map(findings.map(f => [`${f.location}|${String(f.suggestion).slice(0, 50)}`, f]));
+      return result.map(r => origMap.get(`${r.location}|${String(r.suggestion).slice(0, 50)}`) ?? r);
    }
    throw new Error('AI 回傳空陣列或非陣列');
  } catch (e) {
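The strip-then-restore round trip used by both AI calls above can be exercised in isolation. The sketch below copies `toAIPayload` and the `location|suggestion` key from the diff; `keyOf`, `restoreFields`, the sample findings, and the simulated AI response are hypothetical additions for illustration.

```javascript
// Key taken from the diff: location plus the first 50 characters of suggestion.
const keyOf = f => `${f.location}|${String(f.suggestion).slice(0, 50)}`;

// Copied from the diff: keep only the fields the AI needs.
function toAIPayload(findings) {
  return findings.map(({ level, role, location, suggestion }) => ({ level, role, location, suggestion }));
}

// Hypothetical helper wrapping the origMap lookup shown in the diff.
function restoreFields(original, aiResult) {
  const origMap = new Map(original.map(f => [keyOf(f), f]));
  return aiResult.map(r => origMap.get(keyOf(r)) ?? r);
}

const findings = [
  { level: 'info', role: 'Rex', location: 'app/main.js:46', suggestion: 'Call filterDiff before analysis', is_new: false },
  { level: 'warning', role: 'Leo', location: 'app/gitea.js:19', suggestion: 'Document excludePrefixes', is_new: true },
];
const payload = toAIPayload(findings);
// Simulated AI response: the model keeps only the second finding, unchanged.
const aiResult = [payload[1]];
const restored = restoreFields(findings, aiResult);
console.log(restored);
```

One consequence of this design: if the model rewrites the first 50 characters of a `suggestion`, the lookup misses and the `?? r` fallback returns the AI's object, which has no `is_new` field.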
@@ -3,6 +3,8 @@ import fs from 'fs';
 import path from 'path';
 import { GITEA_SERVER_URL, GITEA_REPOSITORY, GITEA_TOKEN, PR_HEAD_BRANCH, FINDINGS_PATH } from './config.js';
+const remoteUrl = `${GITEA_SERVER_URL.replace(/\/$/, '')}/${GITEA_REPOSITORY}.git`;
 function makeRunner(spawn) {
  return function run(args, cwd, env) {
    const opts = { cwd, encoding: 'utf8' };
@@ -30,7 +32,6 @@ function withAskpass(workspace, fn) {
 */
 export function cloneRepo(workspace, _spawnSync = spawnSync) {
  const run = makeRunner(_spawnSync);
-  const remoteUrl = `${GITEA_SERVER_URL.replace(/\/$/, '')}/${GITEA_REPOSITORY}.git`;
  const repoDir = path.join(workspace, 'repo');
  return withAskpass(workspace, credEnv => {
@@ -48,7 +49,6 @@ export function cloneRepo(workspace, _spawnSync = spawnSync) {
 export async function commitAndPush(workspace, _spawnSync = spawnSync) {
  const run = makeRunner(_spawnSync);
-  const remoteUrl = `${GITEA_SERVER_URL.replace(/\/$/, '')}/${GITEA_REPOSITORY}.git`;
  try {
    const repoDir = cloneRepo(workspace, _spawnSync);
@@ -6,12 +6,20 @@ const httpsAgent = GITEA_SKIP_TLS_VERIFY ? new https.Agent({ rejectUnauthorized:
 const headers = () => ({ Authorization: `token ${GITEA_TOKEN}`, 'Content-Type': 'application/json' });
 const api = (path) => `${GITEA_SERVER_URL.replace(/\/$/, '')}/api/v1${path}`;
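+/**
+ * Get the raw Git Diff content of the PR.
+ * Note: the return value is not path-filtered; callers must use filterDiff to exclude sensitive paths (such as .gitea/) before passing it to the AI.
+ */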
 export async function getPRDiff() {
  const resp = await axios.get(api(`/repos/${GITEA_REPOSITORY}/pulls/${PR_NUMBER}.diff`), { headers: headers(), timeout: 60000, httpsAgent });
-  return filterDiff(resp.data, ['.gitea/']);
+  return resp.data;
 }
-function filterDiff(diff, excludePrefixes) {
+/**
+ * Filter the diff content, removing blocks whose path matches excludePrefixes.
+ * Each block is identified by its leading "diff --git a/<prefix>"; startsWith matches the prefix exactly.
+ */
+export function filterDiff(diff, excludePrefixes) {
  return diff.split(/(?=^diff --git )/m)
    .filter(block => !excludePrefixes.some(p => block.startsWith(`diff --git a/${p}`)))
    .join('');
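To make the prefix semantics of `filterDiff` concrete, the sketch below copies the function body from the diff and runs it on a synthetic three-file diff. The `block` helper mirrors the one in `gitea.test.js`; the file names are made up for illustration.

```javascript
// Copied from the diff above (without the export keyword).
function filterDiff(diff, excludePrefixes) {
  return diff.split(/(?=^diff --git )/m)
    .filter(block => !excludePrefixes.some(p => block.startsWith(`diff --git a/${p}`)))
    .join('');
}

// Builds a minimal single-file diff block (same shape as the test helper).
const block = file =>
  `diff --git a/${file} b/${file}\n--- a/${file}\n+++ b/${file}\n@@ -1 +1 @@\n-old\n+new\n`;

const diff =
  block('.gitea/workflows/review.yaml') +
  block('src/index.js') +
  block('docs/.gitea/notes.md'); // hypothetical nested path, NOT at the repo root

const filtered = filterDiff(diff, ['.gitea/']);
console.log(filtered);
```

Only the block whose old path (`a/...`) begins with `.gitea/` at the repository root is dropped; the nested `docs/.gitea/` path survives because the match is anchored to the start of the block. Since only the `a/` side is checked, a file renamed from another directory into `.gitea/` would also pass through.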
@@ -2,13 +2,10 @@ import { describe, it, afterEach, mock } from 'node:test';
 import assert from 'node:assert/strict';
 import axios from 'axios';
 // gitea.js reads env vars at module load time (ESM cache), so we test
 // the actual values baked in at import time and verify behavior via axios mocks.
 afterEach(() => mock.restoreAll());
 describe('gitea', async () => {
-  const { getPRDiff, postComment } = await import('./gitea.js');
+  const { getPRDiff, filterDiff, postComment } = await import('./gitea.js');
  it('getPRDiff calls Gitea diff API with Authorization header', async () => {
    let capturedUrl, capturedOpts;
@@ -48,7 +45,6 @@ describe('gitea', async () => {
      return { data: '' };
    });
    await getPRDiff();
    // httpsAgent is undefined when GITEA_SKIP_TLS_VERIFY !== 'true'
    assert.equal(capturedOpts.httpsAgent, undefined);
  });
@@ -62,3 +58,32 @@ describe('gitea', async () => {
    await assert.rejects(() => postComment('test'), /api error/);
  });
 });
 describe('filterDiff', async () => {
  const { filterDiff } = await import('./gitea.js');
  const block = (file) => `diff --git a/${file} b/${file}\n--- a/${file}\n+++ b/${file}\n@@ -1 +1 @@\n-old\n+new\n`;
  it('filters out .gitea/ blocks', () => {
    const diff = block('.gitea/workflows/review.yaml') + block('src/index.js');
    const result = filterDiff(diff, ['.gitea/']);
    assert.ok(!result.includes('.gitea/'));
    assert.ok(result.includes('src/index.js'));
  });
  it('does not filter non-.gitea/ blocks', () => {
    const diff = block('src/index.js') + block('README.md');
    const result = filterDiff(diff, ['.gitea/']);
    assert.equal(result, diff);
  });
  it('returns empty string when all blocks are excluded', () => {
    const diff = block('.gitea/workflows/review.yaml') + block('.gitea/ai-review/findings.json');
    const result = filterDiff(diff, ['.gitea/']);
    assert.equal(result, '');
  });
  it('returns empty string for empty diff', () => {
    assert.equal(filterDiff('', ['.gitea/']), '');
  });
 });
@@ -1,6 +1,8 @@
-import { GITEA_REPOSITORY, PR_NUMBER, PR_HEAD_BRANCH, PR_BASE_BRANCH, getLLMConfig } from './config.js';
+import fs from 'fs';
+import path from 'path';
+import { GITEA_REPOSITORY, PR_NUMBER, PR_HEAD_BRANCH, PR_BASE_BRANCH, getLLMConfig, FINDINGS_PATH, EXCLUSIONS_PATH } from './config.js';
 import { loadRoles, getRoleIntro } from './roles.js';
-import { getPRDiff, postComment } from './gitea.js';
+import { getPRDiff, filterDiff, postComment } from './gitea.js';
 import { analyzeWithRole, loadOldFindings, mergeFindings, sortByLevel, deduplicateWithAI, loadExclusions, applyExclusions, filterFalsePositivesWithAI } from './findings.js';
 import { saveFindings, postOldFindingsComment, postNewNonCriticalComment, postNewCriticalComments } from './comments.js';
 import { cloneRepo, commitAndPush } from './git.js';
@@ -45,8 +47,18 @@ async function main() {
    console.log(`  ⚠️  comment 發布失敗（繼續執行）: ${e.message}`);
  }
-  // Step2: each role analyzes the diff to produce new findings
+  // Step2: exclude every file under the .gitea/ folder
-  console.log('\n📊 Step2: Findings 產生');
+  console.log('\n🗂️  Step2: Git Diff 過濾');
  diff = filterDiff(diff, ['.gitea/']);
  console.log(`  排除 .gitea/ 後 diff 長度: ${diff.length} 字元`);
  if (!diff.trim()) {
    console.log('  ⚠️  過濾後 diff 為空，無需審查');
    process.exit(0);
  }
  // Step3: each role analyzes the diff to produce new findings
  console.log('\n📊 Step3: Findings 產生');
  const results = await Promise.allSettled(roles.map(role => analyzeWithRole(role, diff)));
  const newFindings = [];
  for (let i = 0; i < results.length; i++) {
@@ -56,10 +68,11 @@ async function main() {
      console.log(`  ⚠️  [${roles[i].name}] 分析失敗（跳過）: ${results[i].reason?.message}`);
    }
  }
-  console.log(`  Step2 完成: 新 findings 總計 ${newFindings.length} 筆`);
+  console.log(`  Step3 完成: 新 findings 總計 ${newFindings.length} 筆`);
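-  // Step3: read old findings, then merge and deduplicate (including AI semantic deduplication)
+  // Step4: read old findings, then merge and deduplicate (including AI semantic deduplication)
-  console.log('\n🔀 Step3: Findings 合併');
+  console.log('\n🔀 Step4: Findings 合併');
  // Clone the repo to read old findings and the exclusions list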
  let repoDir;
  try {
    repoDir = cloneRepo(WORKSPACE);
@@ -68,38 +81,64 @@ async function main() {
  }
  const oldFindings = loadOldFindings(repoDir || WORKSPACE);
  const mergedFindings = mergeFindings(oldFindings, newFindings);
-  console.log(`  Step3 merged findings total=${mergedFindings.length}`);
+  console.log(`  Step4 merged findings total=${mergedFindings.length}`);
-  console.log('\n🤖 Step3b: AI 語意去重');
+  console.log('\n🤖 Step4b: AI 語意去重');
  const deduped = await deduplicateWithAI(mergedFindings);
  const sorted = sortByLevel(deduped);
-  console.log(`  Step3b dedup findings total=${sorted.length} (critical=${sorted.filter(f=>f.level==='critical').length} warning=${sorted.filter(f=>f.level==='warning').length} info=${sorted.filter(f=>f.level==='info').length})`);
+  console.log(`  Step4b dedup findings total=${sorted.length} (critical=${sorted.filter(f=>f.level==='critical').length} warning=${sorted.filter(f=>f.level==='warning').length} info=${sorted.filter(f=>f.level==='info').length})`);
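-  // Step4: read the exclusions file, filter the PR findings table, and ask the AI to judge false positives
+  // Step5: read the exclusions file, filter the PR findings table, and ask the AI to judge false positives
-  console.log('\n🚫 Step4: AI 排除問題過濾');
+  console.log('\n🚫 Step5: AI 排除問題過濾');
  // findings are the input to AI false-positive filtering; exclusions also serve as the known false-positive reference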
  const exclusions = loadExclusions(repoDir || WORKSPACE);
  const ruleFiltered = applyExclusions(sorted, exclusions);
  const filtered = await filterFalsePositivesWithAI(ruleFiltered, exclusions);
-  console.log(`  Step4 完成: findings total=${filtered.length}`);
+  console.log(`  Step5 完成: findings total=${filtered.length}`);
-  // Step5: write findings.json, then post comments in order
+  // Step6: write findings.json, then post comments in order
-  console.log('\n📝 Step5: Findings 寫入與 Comment 發布');
+  console.log('\n📝 Step6: Findings 寫入與 Comment 發布');
  saveFindings(WORKSPACE, filtered);
  try {
    await postOldFindingsComment(filtered);
    await postNewNonCriticalComment(filtered);
    await postNewCriticalComments(filtered);
-    console.log('  Step5 完成');
+    console.log('  Step6 完成');
  } catch (e) {
    console.log(`  ⚠️  comment 發布失敗（繼續執行）: ${e.message}`);
  }
-  // Step6: commit/push findings.json to the source branch
+  // Step7: validate that findings.json and exclusions.json are valid JSON
-  console.log('\n💾 Step6: 記憶區 Commit/Push');
+  console.log('\n🔎 Step7: JSON 格式驗證');
  for (const relPath of [FINDINGS_PATH, EXCLUSIONS_PATH]) {
    const fullPath = path.join(repoDir || WORKSPACE, relPath);
    if (!fs.existsSync(fullPath)) {
      console.log(`  ⚠️  ${relPath} 不存在，跳過驗證`);
      continue;
    }
    try {
      JSON.parse(fs.readFileSync(fullPath, 'utf8'));
      console.log(`  ✅ ${relPath} JSON 格式正確`);
    } catch (e) {
      console.error(`  ❌ ${relPath} JSON 格式錯誤: ${e.message}，嘗試修正...`);
      try {
        const backupPath = fullPath + '.bak';
        fs.copyFileSync(fullPath, backupPath);
        fs.writeFileSync(fullPath, '[]\n', 'utf8');
        console.log(`  ✅ ${relPath} 已重置為空陣列（原檔備份至 ${relPath}.bak）`);
      } catch (repairErr) {
        console.error(`  ❌ ${relPath} 修正失敗: ${repairErr.message}`);
        process.exit(1);
      }
    }
  }
  // Step8: commit/push findings.json to the source branch
  console.log('\n💾 Step8: 記憶區 Commit/Push');
  await commitAndPush(WORKSPACE);
-  // Step7: exit 1 if there are critical findings
+  // Step9: exit 1 if there are critical findings
-  console.log('\n🚦 Step7: 嚴重問題檢查');
+  console.log('\n🚦 Step9: 嚴重問題檢查');
  const criticalCount = filtered.filter(f => f.level === 'critical').length;
  if (criticalCount > 0) {
    console.log(`  ❌ 發現 ${criticalCount} 個嚴重問題，workflow 結束（exit 1）`);
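The Step7 validation loop earlier in this file mixes the parse check with file I/O. As a sketch under stated assumptions, the decision part can be isolated in a pure function that takes the file text and reports what should be written back. `planJSONRepair` and its return shape are hypothetical; the actual code performs `copyFileSync` and `writeFileSync` inline.

```javascript
// Hypothetical pure helper: decides the repair action for one file's text.
// Mirrors the loop in the diff: valid JSON is left alone; invalid JSON is
// backed up and replaced with an empty array.
function planJSONRepair(text) {
  try {
    JSON.parse(text);
    return { valid: true, backup: null, content: text };
  } catch (e) {
    return { valid: false, backup: text, content: '[]\n', error: e.message };
  }
}

const ok = planJSONRepair('[{"level":"info"}]');
const broken = planJSONRepair('[{"level": "info"'); // truncated, so it fails to parse
console.log(ok.valid, broken.valid);
```

Like the loop in the diff, this accepts any valid JSON document, not only arrays, so a file containing `{}` would pass validation here.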
Author	SHA1	Message	Date
admin	505cf6d30d	Merge pull request 'refactor: optimize AI payload by reducing token usage and streamline findings structure' (#89 ) from feat/壓縮AI內容 into 整理程式碼 Reviewed-on: #89	2026-05-13 01:41:12 +00:00
AI Review Bot	c3e57ff442	chore: update ai-review findings [skip ci]	2026-05-13 01:40:58 +00:00
jiantw83	5876154dbb	refactor: optimize AI payload by reducing token usage and streamline findings structure	2026-05-13 01:39:13 +00:00
AI Review Bot	0e0cd252b0	chore: update ai-review findings [skip ci]	2026-05-13 01:33:15 +00:00
jiantw83	fcc8d59f7a	refactor: add suggestion for TODO.md clarification and enhance filterDiff test documentation	2026-05-13 01:31:25 +00:00
AI Review Bot	a92b6440ff	chore: update ai-review findings [skip ci]	2026-05-13 01:29:11 +00:00
jiantw83	8d8ace636e	refactor: add new suggestion for filterDiff unit tests and update getPRDiff documentation for clarity	2026-05-13 01:27:33 +00:00
AI Review Bot	fdeceee52f	chore: update ai-review findings [skip ci]	2026-05-13 01:21:17 +00:00
jiantw83	fade942267	refactor: add new suggestion for comments.js and enhance filterDiff tests for better coverage	2026-05-13 01:18:21 +00:00
jiantw83	4834396652	refactor: streamline JSON file reading logic and improve error handling in findings.js and git.js	2026-05-13 01:16:34 +00:00
AI Review Bot	0108a05886	chore: update ai-review findings [skip ci]	2026-05-13 01:14:09 +00:00
jiantw83	6db660f872	refactor: update TODO stages to reflect current status and improve clarity; modify diff filtering logic in gitea.js and main.js	2026-05-13 01:12:20 +00:00
jiantw83	45468d89d3	refactor: reorganize TODO stages for clarity and update section titles	2026-05-13 01:08:04 +00:00
jiantw83	6c6680fd3e	feat: enhance JSON format validation with backup and reset mechanism on error	2026-05-13 01:06:11 +00:00
jiantw83	49a02ebb6b	feat: add JSON format validation for findings and exclusions after processing	2026-05-13 01:02:33 +00:00
AI Review Bot	37cf5f82fa	chore: update ai-review findings [skip ci]	2026-05-13 00:51:30 +00:00