Hawkeye
1. Function Overview
Hawkeye is an advanced analytics module in the MES system's Data Middle Platform. It analyzes the statistical correlation between NG (Not Good, i.e. defective) products and the process result parameters recorded at each production stage. Using two statistical methods, the Pearson Correlation Coefficient (PCC) and the Chi-Square Test (X²), the system automatically identifies which process parameters are significantly correlated with NG occurrences. This helps process engineers narrow down likely root causes of defective products and provides data support for process optimization.
Core Features:
- NG Correlation Analysis: Select production batches and NG codes to automatically analyze statistical correlations between the NG and all process result parameters
- Pearson Correlation Analysis (PCC): Display correlation coefficients and p-values in a volcano plot, visually distinguishing significantly correlated from non-correlated parameters
- Chi-Square Independence Test (X²): Perform chi-square tests on categorical data to determine independence between NG and parameter distributions
2. Term Definitions
| Term | Definition | Description |
|---|---|---|
| Hawkeye | NG correlation analysis module in the MES system | Advanced analytics feature of the Data Middle Platform |
| Production Batch | A complete batch of production tasks | Basic analysis scope; all analysis data comes from the same batch |
| NG Code | Classification code for a type of defective product | For example "Capacity retention rate failure" or "Capacity failure"; this is the target variable for analysis |
| Process | An operational unit in the production flow | For example formation, grading, or OCV; each process has multiple result parameters |
| Pearson Correlation Coefficient (PCC) | Metric measuring linear correlation between two continuous variables | Ranges from -1 to 1. Positive values indicate positive correlation, negative values indicate negative correlation, and larger absolute values indicate stronger correlation |
| p-value | Probability value used for statistical significance testing | p ≤ 0.05 indicates significant correlation |
| Chi-Square Test (X²) | Statistical method for testing independence between categorical variables | Tests whether there is an association between NG and the categorical distribution of each parameter |
| Sample Size | Number of valid data entries participating in the analysis | Larger sample sizes give more reliable analysis results |
| Correlated/Not Correlated | Correlation determination based on the p-value | p ≤ 0.05 means significant correlation (blue); p > 0.05 means no significant correlation (red) |
| Volcano Plot | Scatter plot visualizing the relationship between correlation coefficients and p-values | X-axis represents the correlation coefficient, Y-axis represents the p-value |
Correlation Determination Rules:
| Determination | p-value Condition | Color Indicator | Meaning |
|---|---|---|---|
| Significant Correlation Exists | p ≤ 0.05 | Blue | NG has a statistically significant association with this process parameter |
| No Significant Correlation | p > 0.05 | Red | No statistically significant association between NG and this process parameter |
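The determination rule can be written as a small function. This is a minimal sketch, assuming the standard convention that a p-value at or below the 0.05 threshold counts as significant; the function name and the returned label and color strings are illustrative, not taken from the Hawkeye implementation.

```python
ALPHA = 0.05  # significance threshold from the rules table


def determine_correlation(p_value: float, alpha: float = ALPHA) -> tuple[str, str]:
    """Return a (label, color) pair for a p-value (hypothetical helper)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value <= alpha:
        return ("Significant", "blue")
    return ("Not Significant", "red")
```

A smaller p-value means the observed association is less likely to be due to chance alone, which is why small values map to "Significant".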
3. Hawkeye Analysis Process
3.1 Analysis Process Description
The Hawkeye analysis process consists of three steps: Select Production Batch → Select Analysis Target (Process + NG Code) → Execute Analysis and View Results.
```mermaid
flowchart LR
    A[Select Production Batch] --> B[Select NG Process]
    B --> C[Select NG Code]
    C --> D[Click Analyze]
    D --> E[PCC Pearson Correlation Analysis]
    D --> F[X² Chi-Square Test Analysis]
    E --> G[Volcano Plot Visualization]
    F --> H[Test Result Table]
```
3.2 Select Production Batch
Operation Steps:
- Navigate to [Data Middle Platform] → [Hawkeye]
- In the left [Analysis Conditions] panel, select the target batch for analysis from the "Production Batch" dropdown list
- The system automatically loads process information for this batch
Field Description:
| Field | Description | Required |
|---|---|---|
| Production Batch | Select the batch number for NG analysis | Yes |
3.3 Select NG Process and NG Code
After selecting a batch, the system automatically loads all processes that can record NG and their corresponding NG codes within the batch's process flow.
Operation Steps:
- After selecting a production batch, select the process to analyze from the "Process" dropdown list
- The system automatically loads all recordable NG codes for this process
- Select the specific NG type from the "NG Code" dropdown list
Field Description:
| Field | Description | Required |
|---|---|---|
| Process | Process to analyze, such as formation, grading, OCV, etc. | Yes |
| NG Code | Specific NG type to analyze under this process, such as "Capacity retention rate failure", etc. | Yes |
3.4 Execute Analysis
After making selections, click the [Analyze] button. The system performs the following analysis:
- Extracts NG column and target process result parameter data from the BKV temporary data table for this batch
- Calculates Pearson Correlation Coefficient (PCC) and p-value for each continuous result parameter
- Performs Chi-Square Independence Test (X²) for each categorical result parameter
- Summarizes analysis results and displays them in charts and tables
[Note] Analysis time increases with data volume. If there is no NG data in the selected analysis scope, the system will prompt "Analysis table not found".
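The analysis steps route continuous parameters to PCC and categorical parameters to the chi-square test. How Hawkeye decides which is which is not documented here, so the sketch below assumes one simple rule: a column is continuous when every non-missing value is numeric, and categorical otherwise. The function name, the row layout, and the column names are hypothetical.

```python
from numbers import Real


def split_parameters(rows: list[dict], ng_column: str) -> tuple[list[str], list[str]]:
    """Partition parameter columns into (continuous, categorical).

    Assumed rule: numeric-only columns are continuous; everything else
    (including empty columns) is treated as categorical.
    """
    continuous, categorical = [], []
    for col in (c for c in rows[0] if c != ng_column):
        values = [row[col] for row in rows if row[col] is not None]
        numeric = bool(values) and all(
            isinstance(v, Real) and not isinstance(v, bool) for v in values
        )
        (continuous if numeric else categorical).append(col)
    return continuous, categorical


# Hypothetical batch rows; column names follow "Process Name.Parameter Name".
rows = [
    {"ng": 1, "formation.voltage": 3.61, "grading.channel": "A"},
    {"ng": 0, "formation.voltage": 3.70, "grading.channel": "B"},
]
pcc_columns, chi2_columns = split_parameters(rows, ng_column="ng")
```

Columns in `pcc_columns` would then go to the Pearson analysis and those in `chi2_columns` to the chi-square test.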
4. Analysis Result Interpretation
Analysis results are divided into two sections, displaying Pearson Correlation Analysis (PCC) and Chi-Square Test (X²) results respectively.
4.1 Pearson Correlation Analysis (PCC)
Pearson Correlation Analysis is used to test the linear correlation between continuous process result parameters (such as voltage, current, temperature, etc.) and NG occurrences.
Volcano Plot Display:
The volcano plot shows every analyzed process parameter as a point, positioned by its Pearson correlation coefficient with the target NG (X-axis) and its p-value (Y-axis):
- Blue points: p ≤ 0.05, indicating significant correlation between this process parameter and NG
- Red points: p > 0.05, indicating no significant correlation between this process parameter and NG
Hover over a point to view detailed information about that parameter (parameter name, correlation coefficient, p-value).
Right Table Fields:
| Field | Description |
|---|---|
| Process Parameter | Name of the process result parameter participating in analysis, formatted as "Process Name.Parameter Name" |
| Sample Size | Number of valid data entries participating in calculation |
| Correlation Coefficient | Pearson correlation coefficient r value |
| p-value | Significance test p-value |
| Correlation | Determination conclusion: "Significant" in blue or "Not Significant" in red |
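To make the "Correlation Coefficient" and "p-value" columns concrete, the sketch below computes both for one parameter using only the Python standard library. It assumes the NG column is encoded as 0/1 and that the p-value is the usual two-sided test of "no linear correlation"; Hawkeye's actual implementation is not documented here. The function names are illustrative. The p-value is obtained by numerically integrating the null distribution of r, which is equivalent to the standard t-test on r.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant column: correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


def pearson_p_value(r: float, n: int, steps: int = 20000) -> float:
    """Two-sided p-value for the null hypothesis of no linear correlation.

    Under the null hypothesis the density of r is proportional to
    (1 - t^2)^((n - 4) / 2) on [-1, 1]. The tail area beyond |r| is
    integrated with the composite Simpson rule (steps must be even).
    """
    if n < 4:
        raise ValueError("this sketch needs at least 4 samples")
    k = (n - 4) / 2
    r_abs = min(abs(r), 1.0)  # guard against rounding slightly above 1

    def g(t: float) -> float:
        return max(0.0, 1.0 - t * t) ** k

    def simpson(a: float, b: float) -> float:
        h = (b - a) / steps
        total = g(a) + g(b)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * g(a + i * h)
        return total * h / 3

    return simpson(r_abs, 1.0) / simpson(0.0, 1.0)
```

For a parameter column `values` and a 0/1 NG column `ng_flags`, `r = pearson_r(values, ng_flags)` followed by `pearson_p_value(r, len(values))` gives the two numbers plotted for that point. In practice a statistics library would normally be used instead of hand-written integration.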
4.2 Chi-Square Independence Test (X²)
Chi-Square Test is used to analyze whether there is a statistical association between categorical result parameters and NG occurrences.
Table Fields:
| Field | Description |
|---|---|
| Process Parameter | Name of the process result parameter participating in analysis |
| Sample Size | Number of valid data entries participating in calculation |
| Chi-Square Value | Chi-Square statistic X² value |
| p-value | Significance test p-value (supports scientific notation display) |
| Correlation | Determination conclusion: "Significant" in blue or "Not Significant" in red |
[Tip] The p-value column in the chi-square test supports scientific notation display. Hover over the "p-value" header to view explanations.
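The "Chi-Square Value" and "p-value" columns can be illustrated with a short standard-library sketch. It assumes a plain Pearson chi-square test of independence on the contingency table of parameter category versus NG flag, with no continuity correction; whether Hawkeye applies a correction is not stated in this document. All function names are illustrative.

```python
import math
from collections import Counter


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of the chi-square distribution.

    Uses the closed forms for df = 1 and df = 2 and the recurrence
    Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1) for larger df.
    """
    z = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2:
        if z > 0:
            q += math.exp(a * math.log(z) - z - math.lgamma(a + 1))
        a += 1.0
    return q


def chi_square_test(categories: list, ng_flags: list) -> tuple[float, int, float]:
    """Return (chi-square statistic, degrees of freedom, p-value)."""
    if len(categories) != len(ng_flags):
        raise ValueError("columns must have the same length")
    observed = Counter(zip(categories, ng_flags))
    row_totals, col_totals = Counter(categories), Counter(ng_flags)
    if len(row_totals) < 2 or len(col_totals) < 2:
        raise ValueError("need at least two levels in each variable")
    n = len(categories)
    stat = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            stat += (observed[(row, col)] - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)
```

Very small p-values are common with large samples, which is why scientific notation is useful in this column. Note that some libraries, such as SciPy's `chi2_contingency`, apply Yates' continuity correction to 2x2 tables by default, so their results can differ slightly from this uncorrected sketch.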
4.3 No Analysis Data Prompt
If every value in a process parameter column is identical (for example, all zeros or one repeated value) or every value is different, the system cannot perform statistical analysis on that column. The names of these unanalyzable parameters are listed in a collapsed panel above the analysis results.
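A pre-check of this kind could look like the sketch below. It reflects one reading of the rule above, stated as an assumption: any column with at most one distinct value is skipped, and a categorical column is also skipped when every value is distinct (each category would then occur only once). The function and column names are hypothetical.

```python
def find_unanalyzable(columns: dict[str, list], categorical: frozenset = frozenset()) -> list[str]:
    """Return the names of columns that cannot be analyzed (assumed rule)."""
    skipped = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        distinct = len(set(present))
        constant = distinct <= 1  # empty column or a single repeated value
        all_unique = name in categorical and distinct == len(present)
        if constant or all_unique:
            skipped.append(name)
    return skipped


# Hypothetical columns for one batch.
columns = {
    "ocv.voltage": [3.60, 3.70, 3.65],
    "formation.step": [0, 0, 0],
    "grading.tray_id": ["T1", "T2", "T3"],
}
skipped = find_unanalyzable(columns, categorical=frozenset({"grading.tray_id"}))
```

Here `skipped` holds the names that would appear in the collapsed panel, while the remaining columns proceed to analysis.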
6. Related Functions
| Function | Relationship | Description |
|---|---|---|
| Process Model | Upstream Data | Process units and preset result parameters are configured in the process model. Hawkeye analysis is based on these parameter definitions |
| Batch Management | Upstream Data | Production batches are created in the batch management module. Hawkeye selects analysis targets from batches |
| NG Management | Upstream Data | NG code types are uniformly maintained in the system. Hawkeye loads available NG codes for analysis selection |
| Battery Traceability | Downstream Traceability | After Hawkeye identifies significantly correlated parameters, battery traceability can be used to view specific battery process data for further verification of analysis conclusions |






