2026-05-12 01:46:34 +08:00

Hawkeye

1. Function Overview

Hawkeye is an advanced analytics module in the MES system's Data Middle Platform, designed for statistical correlation analysis between NG (Non-Good/Defective) products and process result parameters across various production stages. The system automatically identifies which process parameters have significant correlations with NG occurrences using two statistical methods: Pearson Correlation Coefficient (PCC) and Chi-Square Test (X²). This helps process engineers quickly identify root causes of defective products and provides data support for process optimization.

Core Features:

  • NG Correlation Analysis: Select production batches and NG codes to automatically analyze statistical correlations between the NG and all process result parameters
  • Pearson Correlation Analysis (PCC): Display correlation coefficients and p-values in a volcano plot, visually distinguishing significantly correlated from non-correlated parameters
  • Chi-Square Independence Test (X²): Perform chi-square tests on categorical data to determine independence between NG and parameter distributions

Feature Screenshot:

Correlation Analysis Hawkeye Screenshot
Figure 1: Correlation Analysis Hawkeye

2. Term Definitions

| Term | Definition | Description |
| --- | --- | --- |
| Hawkeye | The NG correlation analysis module in the MES system | Advanced analytics feature of the Data Middle Platform |
| Production Batch | A complete production task batch | Basic analysis scope; all analysis data comes from the same batch |
| NG Code | Defective product classification code | The target variable for analysis, such as "Capacity retention rate failure" or "Capacity failure" |
| Process | An operational unit in the production process | Such as formation, grading, or OCV; each process has multiple result parameters |
| Pearson Correlation Coefficient (PCC) | A metric measuring linear correlation between two continuous variables | Ranges from -1 to 1. Positive values indicate positive correlation, negative values indicate negative correlation, and larger absolute values indicate stronger correlation |
| p-value | Probability value for statistical significance testing | p ≤ 0.05 indicates significant correlation |
| Chi-Square Test (X²) | A statistical method used to test independence between categorical variables | Tests whether there is an association between NG and the categorical distribution of each parameter |
| Sample Size | Number of valid data entries participating in the analysis | Larger sample sizes lead to more reliable analysis results |
| Correlated/Not Correlated | Correlation determination based on the p-value | p ≤ 0.05: significant correlation (blue); p > 0.05: no significant correlation (red) |
| Volcano Plot | A scatter plot visualizing the relationship between correlation coefficients and p-values | X-axis represents the correlation coefficient, Y-axis represents the p-value |

Correlation Determination Rules:

| Determination | p-value Condition | Color Indicator | Meaning |
| --- | --- | --- | --- |
| Significant Correlation Exists | p ≤ 0.05 | Blue | NG has a statistically significant association with this process parameter |
| No Significant Correlation | p > 0.05 | Red | No statistically significant association between NG and this process parameter |
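
The determination rule can be sketched as a small function. This is an illustrative sketch only: the function name `determine_correlation` and the `alpha` parameter are hypothetical, and it follows the standard statistical convention that a p-value at or below the 0.05 significance level counts as significant. How Hawkeye treats a p-value of exactly 0.05 is an assumption here.

```python
from typing import Tuple


def determine_correlation(p_value: float, alpha: float = 0.05) -> Tuple[str, str]:
    """Map a p-value to a (determination, color) pair.

    Assumption: p <= alpha is treated as significant (blue),
    anything above alpha as not significant (red).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value <= alpha:
        return ("Significant", "blue")
    return ("Not Significant", "red")
```

For example, `determine_correlation(0.01)` yields the blue "Significant" label, while `determine_correlation(0.2)` yields the red "Not Significant" label.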

3. Hawkeye Analysis Process

3.1 Analysis Process Description

The Hawkeye analysis process consists of three steps: Select Production Batch → Select Analysis Target (Process + NG Code) → Execute Analysis and View Results.

```mermaid
flowchart LR
    A[Select Production Batch] --> B[Select NG Process]
    B --> C[Select NG Code]
    C --> D[Click Analyze]
    D --> E[PCC Pearson Correlation Analysis]
    D --> F[X² Chi-Square Test Analysis]
    E --> G[Volcano Plot Visualization]
    F --> H[Test Result Table]
```

3.2 Select Production Batch

Operation Steps:

  1. Navigate to [Data Middle Platform] → [Hawkeye]
  2. In the left [Analysis Conditions] panel, select the target batch for analysis from the "Production Batch" dropdown list
  3. The system automatically loads process information for this batch

Field Description:

| Field | Description | Required |
| --- | --- | --- |
| Production Batch | Select the batch number for NG analysis | Yes |

Feature Screenshot:

Select Production Batch Screenshot
Figure 1: Select Production Batch

3.3 Select NG Process and NG Code

After selecting a batch, the system automatically loads all processes that can record NG and their corresponding NG codes within the batch's process flow.

Operation Steps:

  1. After selecting a production batch, select the process to analyze from the "Process" dropdown list
  2. The system automatically loads all recordable NG codes for this process
  3. Select the specific NG type from the "NG Code" dropdown list

Field Description:

| Field | Description | Required |
| --- | --- | --- |
| Process | Process to analyze, such as formation, grading, or OCV | Yes |
| NG Code | Specific NG type to analyze under this process, such as "Capacity retention rate failure" | Yes |

Feature Screenshot:

Select Process Flow Screenshot
Figure 1: Select Process Flow
Select NG Category Screenshot
Figure 2: Select NG Category

3.4 Execute Analysis

After making selections, click the [Analyze] button. The system performs the following analysis:

  1. Extracts NG column and target process result parameter data from the BKV temporary data table for this batch
  2. Calculates Pearson Correlation Coefficient (PCC) and p-value for each continuous result parameter
  3. Performs Chi-Square Independence Test (X²) for each categorical result parameter
  4. Summarizes analysis results and displays them in charts and tables
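
Steps 2 and 3 above route each result parameter to one of the two tests. A minimal sketch of that routing follows, assuming (this is not confirmed by the manual) that a parameter is treated as continuous when all of its values are numeric and as categorical otherwise. The function name `plan_analysis` and the sample parameter names are hypothetical.

```python
from typing import Dict, List, Sequence, Union

Value = Union[int, float, str]


def plan_analysis(columns: Dict[str, Sequence[Value]]) -> Dict[str, List[str]]:
    """Assign each result parameter to PCC or the chi-square test.

    Assumption: all-numeric columns are continuous (PCC);
    everything else is categorical (chi-square).
    """
    plan: Dict[str, List[str]] = {"pcc": [], "chi_square": []}
    for name, values in columns.items():
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        plan["pcc" if numeric else "chi_square"].append(name)
    return plan


# Hypothetical result parameters in "Process Name.Parameter Name" form.
example_plan = plan_analysis({
    "Formation.Voltage": [3.61, 3.60, 3.42],
    "Grading.Channel": ["A", "B", "A"],
})
```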

[Note] Analysis takes time, and larger data volumes take longer to analyze. If the selected analysis scope contains no NG data, the system displays the prompt "Analysis table not found".

4. Analysis Result Interpretation

Analysis results are divided into two sections, displaying Pearson Correlation Analysis (PCC) and Chi-Square Test (X²) results respectively.

Feature Screenshot:

Analysis Results Overview Screenshot
Figure 1: Analysis Results Overview

4.1 Pearson Correlation Analysis (PCC)

Pearson Correlation Analysis is used to test the linear correlation between continuous process result parameters (such as voltage, current, temperature, etc.) and NG occurrences.

Volcano Plot Display:

The system displays all process parameters and their Pearson correlation coefficients (X-axis) and p-values (Y-axis) with the target NG in a volcano plot. Each point represents a process parameter:

  • Blue points: p ≤ 0.05, indicating significant correlation between this process parameter and NG
  • Red points: p > 0.05, indicating no significant correlation between this process parameter and NG

Hover over a point to view detailed information about that parameter (parameter name, correlation coefficient, p-value).
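
To make the two plotted quantities concrete, the sketch below computes a Pearson coefficient between a 0/1 NG flag and a continuous parameter, together with a two-sided p-value. It is not Hawkeye's implementation: the manual does not say which significance test is used, so this example substitutes the Fisher z-transformation (a normal approximation) for the exact t-test. The function name `pearson_with_p` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def pearson_with_p(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (r, approximate two-sided p-value) for two equal-length samples."""
    n = len(x)
    if n != len(y) or n <= 3:
        raise ValueError("need two samples of equal length greater than 3")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant column has no defined correlation")
    r = sxy / math.sqrt(sxx * syy)
    # Fisher z approximation (assumption: stands in for the exact t-test).
    r_safe = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r_safe) * math.sqrt(n - 3)
    p = math.erfc(abs(z) / math.sqrt(2))
    return r, p


# Illustrative data: NG cells (flag 1) show a lower voltage.
ng_flags = [0, 0, 0, 0, 1, 1, 1, 1]
voltage = [3.60, 3.62, 3.61, 3.63, 3.40, 3.42, 3.41, 3.39]
r, p = pearson_with_p(ng_flags, voltage)
```

With this made-up data the coefficient is strongly negative and the p-value is far below 0.05, so the parameter would appear as a significantly correlated point.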

Right Table Fields:

| Field | Description |
| --- | --- |
| Process Parameter | Name of the process result parameter participating in analysis, formatted as "Process Name.Parameter Name" |
| Sample Size | Number of valid data entries participating in calculation |
| Correlation Coefficient | Pearson correlation coefficient r value |
| p-value | Significance test p-value |
| Correlation | Determination conclusion: "Significant" in blue or "Not Significant" in red |

Feature Screenshot:

Correlation Result Screenshot
Figure 1: Correlation Results

4.2 Chi-Square Independence Test (X²)

Chi-Square Test is used to analyze whether there is a statistical association between categorical result parameters and NG occurrences.

Table Fields:

| Field | Description |
| --- | --- |
| Process Parameter | Name of the process result parameter participating in analysis |
| Sample Size | Number of valid data entries participating in calculation |
| Chi-Square Value | Chi-Square statistic X² value |
| p-value | Significance test p-value (supports scientific notation display) |
| Correlation | Determination conclusion: "Significant" in blue or "Not Significant" in red |

[Tip] The p-value column in the chi-square test supports scientific notation display. Hover over the "p-value" header to view explanations.
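
The chi-square independence test in this section can be illustrated with a minimal sketch for the simplest case: a 0/1 NG flag against a parameter with exactly two categories (a 2×2 table, one degree of freedom). The restriction to 2×2, the absence of a continuity correction, and the name `chi_square_2x2` are all assumptions of this example. The manual does not describe how Hawkeye builds its contingency tables.

```python
import math
from collections import Counter
from typing import Hashable, Sequence, Tuple


def chi_square_2x2(ng_flags: Sequence[int],
                   categories: Sequence[Hashable]) -> Tuple[float, float]:
    """Return (chi-square statistic, p-value) for a 2x2 contingency table."""
    n = len(ng_flags)
    if n != len(categories):
        raise ValueError("inputs must have equal length")
    row_totals = Counter(categories)   # count per parameter category
    col_totals = Counter(ng_flags)     # count per NG flag value
    if len(row_totals) != 2 or len(col_totals) != 2:
        raise ValueError("this sketch only handles 2x2 tables")
    observed = Counter(zip(categories, ng_flags))
    chi2 = 0.0
    for level, row_total in row_totals.items():
        for flag, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(level, flag)] - expected) ** 2 / expected
    # For one degree of freedom the upper-tail probability is erfc(sqrt(x/2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Illustrative data: channel "A" has 8 NG out of 10, channel "B" has 2 out of 10.
channels = ["A"] * 10 + ["B"] * 10
flags = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
chi2, p = chi_square_2x2(flags, channels)
```

Very small p-values are easier to read in scientific notation, which in Python can be produced with a format specifier such as `f"{p:.2e}"`.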

Feature Screenshot:

Chi-Square Test Result Screenshot
Figure 1: Chi-Square Test Results

4.3 No Analysis Data Prompt

If every value in a process parameter column is identical (e.g., all zeros or all the same value), or every value is different, the system cannot perform statistical analysis on that column. The names of these unanalyzable parameters are listed in a collapsed panel above the analysis results.
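
A pre-check of this kind can be sketched as follows. The function name `is_analyzable` and the `categorical` flag are hypothetical. The manual does not say whether the "all values different" rule also applies to continuous parameters, so this sketch assumes it applies only to categorical ones, since continuous measurements are normally all distinct.

```python
from typing import Hashable, Optional, Sequence


def is_analyzable(values: Sequence[Optional[Hashable]],
                  categorical: bool = False) -> bool:
    """Return False for columns that carry no usable variation.

    Assumptions: None marks a missing entry; a constant column is never
    analyzable; an all-distinct column is rejected only when categorical.
    """
    present = [v for v in values if v is not None]
    distinct = len(set(present))
    if distinct <= 1:
        return False  # empty or constant column
    if categorical and distinct == len(present):
        return False  # every category occurs exactly once
    return True
```

A caller could use this to split parameters into those sent to the statistical tests and those listed in the collapsed panel.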


| Function | Relationship | Description |
| --- | --- | --- |
| Process Model | Upstream Data | Process units and preset result parameters are configured in the process model. Hawkeye analysis is based on these parameter definitions |
| Batch Management | Upstream Data | Production batches are created in the batch management module. Hawkeye selects analysis targets from batches |
| NG Management | Upstream Data | NG code types are uniformly maintained in the system. Hawkeye loads available NG codes for analysis selection |
| Battery Traceability | Downstream Traceability | After Hawkeye identifies significantly correlated parameters, battery traceability can be used to view specific battery process data for further verification of analysis conclusions |