FusionCore v0
Physics-Aware Predictive Maintenance Pipeline
Explore the full RUL pipeline from zero-leakage preprocessing and physics-aware feature engineering through baseline modelling, explainability, and operational impact.
FusionCore v0 is a completed, five-phase predictive maintenance pipeline for Remaining Useful Life estimation of commercial turbofan engines. It operates on the NASA C-MAPSS corpus across all four subsets, pooled into a unified dataset designated FD00u, comprising 709 run-to-failure engine trajectories and approximately 160,000 flight cycles.
The programme objective is PHM-grade RUL prediction under aerospace safety constraints: not merely minimising symmetric error metrics such as RMSE and MAE, but satisfying the NASA asymmetric safety scoring function, which penalises late predictions far more strongly than early ones.
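For reference, the standard C-MAPSS/PHM08 form of this scoring function, written for per-engine error $d_i = \widehat{\mathrm{RUL}}_i - \mathrm{RUL}_i$, is:

$$
S = \sum_i s_i, \qquad
s_i =
\begin{cases}
e^{-d_i/13} - 1, & d_i < 0 \ \text{(early prediction)} \\
e^{\,d_i/10} - 1, & d_i \ge 0 \ \text{(late prediction)}
\end{cases}
$$

The smaller denominator on the late branch makes overestimating remaining life exponentially more expensive than underestimating it.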
Commercial turbofan engines degrade continuously due to HPC erosion, thermal fatigue, and compound fan-blade degradation. Traditional time-based maintenance is conservative and inefficient. Condition-based maintenance, enabled by robust RUL estimation, defers intervention until the engine’s actual degradation state warrants action.
The core technical challenge is separating flight-regime-driven variance from genuine degradation signal. Without regime normalisation, a model trained on multi-regime data learns altitude, not wear.
| Phase | Title | Key Output |
|---|---|---|
| 1 | EDA & Physics Grounding | Active sensor set, variance audit, ACF/PACF, lifecycle trajectories |
| 2 | Zero-Leakage Normalisation | FD00u unified manifold, K-Means regime clustering (K = 6), regime Z-score pipeline |
| 3 | Physics-Aware Feature Engineering | 91-feature vector, kinematic expansion, virtual sensors, fatigue indices, survival analysis |
| 4 | XGBoost Baseline + SOTA Benchmarking | XGBoost baseline, TFT and N-HiTS training, DeepAR disqualification, forensic audit |
| 5 | Model Evaluation & Sign-Off | Four-quadrant aerospace evaluation, financial impact analysis |
| Metric | Value |
|---|---|
| RMSE | 14.85 cycles |
| MAE | 10.34 cycles |
| NASA Asymmetric Score | 4,336.3 |
| Critical-Band F2 (β = 2) | 0.9339 |
| Projected net saving (100-aircraft fleet) | +$13,818,171 per annum |
The neural comparators TFT and N-HiTS both returned a NASA Score of 10,061,188.1, a ratio of 2,320:1 relative to XGBoost, and are therefore classified as operationally non-viable in their v0 configurations. DeepAR was formally disqualified on architectural grounds before evaluation. All six Phase 5 programme gates pass.
| Subset | Fault Mode | Operating Regimes | Train Engines | Test Engines | Total Cycles | Median Life | Max Life |
|---|---|---|---|---|---|---|---|
| FD001 | HPC degradation (single) | 1 | 100 | 100 | 20,631 | 199 | 362 |
| FD002 | HPC degradation (single) | 6 | 260 | 259 | 53,759 | 199 | 378 |
| FD003 | HPC + Fan compound | 1 | 100 | 100 | 24,720 | 220 | 525 |
| FD004 | HPC + Fan compound | 6 | 249 | 248 | 61,249 | 234 | 543 |
| FD00u | Combined | — | 709 | 707 | 160,359 | — | — |
The most consequential Phase 1 finding is regime-conditional sensor behaviour: sensors such as s1 and s5 are effectively dead in single-regime subsets but become active in multi-regime subsets because they encode altitude-driven variance rather than degradation directly.
The rule is simple: if σ² > τ the sensor is active; if σ² ≤ τ the sensor is classified as dead. A minimal screen is sketched after the classification table below.
| Sensor | Physical Quantity | Dead in FD001 | Active in FD002 | Classification |
|---|---|---|---|---|
| s1 | T2 — Fan Inlet Temperature | ✓ | ✓ | Regime-conditional |
| s5 | P2 — Fan Inlet Pressure | ✓ | ✓ | Regime-conditional |
| s6 | P15 — Bypass-Duct Total Pressure | ✓ | ✓ | Regime-conditional |
| s10 | epr — Engine Pressure Ratio | ✓ | ✓ | Regime-conditional |
| s16 | farB — Burner Fuel-Air Ratio | ✓ | ✗ | Globally dead |
| s18 | Nf_dmd — Demanded Fan Speed | ✓ | ✗ | Globally dead |
| s19 | PCNfR_dmd — Demanded Corrected Fan Speed | ✓ | ✗ | Globally dead |
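A minimal sketch of the variance screen, assuming per-subset training DataFrames (fd001_train, fd002_train) and an illustrative threshold τ rather than the project's actual value:

```python
import pandas as pd

SENSORS = [f"s{i}" for i in range(1, 22)]
TAU = 1e-6  # illustrative variance threshold τ, not the project's value

def sensor_status(df: pd.DataFrame, tau: float = TAU) -> pd.Series:
    """Label each sensor 'active' if its variance exceeds τ, else 'dead'."""
    return df[SENSORS].var().apply(lambda v: "active" if v > tau else "dead")

# A sensor dead in FD001 but active in FD002 is regime-conditional;
# a sensor dead in both is globally dead.
status = pd.concat({"FD001": sensor_status(fd001_train),
                    "FD002": sensor_status(fd002_train)}, axis=1)
```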
ACF analysis on s4 (EGT) remains significant through lag 17, while PACF exhibits a dominant lag-1 spike. That directly justifies first-order kinematic velocity features in Phase 3.
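The lag analysis is reproducible with statsmodels on a single engine's s4 trace (engine_df is an assumed per-engine DataFrame):

```python
from statsmodels.tsa.stattools import acf, pacf

x = engine_df["s4"].to_numpy()                   # one engine's full lifecycle trace
acf_vals, acf_ci = acf(x, nlags=30, alpha=0.05)  # CI bands expose the lag-17 cutoff
pacf_vals = pacf(x, nlags=30)                    # dominant spike expected at lag 1
```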
Multi-regime data must be normalised by operating regime, but the regime labels are absent. K-Means is therefore fitted on operational settings only, with K = 6, matching the documented physical regime count.
This is the pipeline’s zero-leakage constraint: validation and test statistics are never used when fitting normalisation parameters.
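A minimal sketch of the step, assuming pandas DataFrames train, val and test with illustrative column names; the regime model and per-regime statistics are fitted on training engines only and then frozen:

```python
import pandas as pd
from sklearn.cluster import KMeans

SETTINGS = ["setting_1", "setting_2", "setting_3"]
SENSORS = [f"s{i}" for i in range(1, 22)]

# Fit K-Means on TRAINING operational settings only (K = 6 regimes).
km = KMeans(n_clusters=6, n_init=10, random_state=42).fit(train[SETTINGS])
train["regime"] = km.labels_

# Per-regime mean/std learned on train, then frozen. Dead sensors are
# assumed dropped beforehand; the guard avoids division by zero anyway.
mu = train.groupby("regime")[SENSORS].mean()
sd = train.groupby("regime")[SENSORS].std().replace(0.0, 1.0)

def regime_zscore(df: pd.DataFrame) -> pd.DataFrame:
    """Assign regimes with the frozen model, then z-score per regime."""
    out = df.copy()
    out["regime"] = km.predict(out[SETTINGS])    # assign only, never refit
    out[SENSORS] = ((out[SENSORS].values - mu.loc[out["regime"]].values)
                    / sd.loc[out["regime"]].values)
    return out

val_z, test_z = regime_zscore(val), regime_zscore(test)
```

The resulting z-scores then carry the SPC interpretation below.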
| Z-Score Range | Physical Interpretation | SPC Zone |
|---|---|---|
| |z| < 2 | Healthy variance band | Inner control zone |
| |z| ≥ 2 | Warning-limit breach | Warning zone |
| |z| ≥ 3 | Control-limit breach | Functional failure threshold |
| Partition | Engines | Rows | Purpose |
|---|---|---|---|
| Internal Training | 567 | 129,331 | Model fitting and preprocessing-parameter estimation |
| Internal Validation | 142 | 31,028 | Model selection and tuning |
| NASA Official Test | 707 | 104,897 | Final Iron Wall evaluation |
| Step | Component | Features Added | Cumulative Total |
|---|---|---|---|
| Core Manifold | 3 settings + 21 sensors | 24 | 24 |
| Kinematic Expansion | Δx_t, rolling mean, rolling std on 17 active sensors | 51 | 75 |
| Virtual Sensors | CPR, E_thermal, EGT drift | 3 | 78 |
| Cumulative Fatigue Indices | Miner’s Rule proxies | 3 | 81 |
| Gap Resolution Features | Additional resolved features | 10 | 91 |
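A sketch of the kinematic expansion step: three derived channels per active sensor (17 × 3 = 51 features), grouped by engine so no window crosses an engine boundary. Column names and the window length are illustrative:

```python
import pandas as pd

def kinematic_expand(df: pd.DataFrame, active_sensors: list[str],
                     window: int = 10) -> pd.DataFrame:
    """Add Δx_t, rolling mean and rolling std for each active sensor."""
    out = df.sort_values(["unit_id", "cycle"]).copy()
    g = out.groupby("unit_id")
    for s in active_sensors:
        out[f"{s}_vel"] = g[s].diff()            # first-order velocity Δx_t
        out[f"{s}_rmean"] = g[s].transform(
            lambda x: x.rolling(window, min_periods=1).mean())
        out[f"{s}_rstd"] = g[s].transform(
            lambda x: x.rolling(window, min_periods=1).std())
    return out
```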
The C-index reaches 0.6398, indicating moderate discrimination and motivating future validation against continuous-flight N-CMAPSS data.
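The concordance check itself is a one-liner with lifelines; variable names here are assumptions:

```python
from lifelines.utils import concordance_index

# Higher hazard should rank earlier failures first, hence the negation.
c_index = concordance_index(event_times=engine_life_cycles,
                            predicted_scores=-cox_risk_scores)
```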
| Physical Sensor | Corrected Sensor | Physical Relationship | Pearson r | VIF |
|---|---|---|---|---|
| s8 — Nf | s13 — NRf | NRf = Nf / √θ | 0.9447 | 9.3 |
| s9 — Nc | s14 — NRc | NRc = Nc / √θ | 0.9464 | 9.6 |
XGBoost is configured with 1,000 estimators, learning rate 0.05, max depth 6, subsample 0.8, colsample_bytree 0.8, and early stopping patience of 50 rounds.
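Restated as a runnable sketch (matrix names are assumed; the hyperparameters are the documented ones):

```python
from xgboost import XGBRegressor

model = XGBRegressor(
    n_estimators=1000,
    learning_rate=0.05,
    max_depth=6,
    subsample=0.8,
    colsample_bytree=0.8,
    early_stopping_rounds=50,   # patience of 50 rounds
    random_state=42,
)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)
```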
| Metric | Training Set | Validation Set |
|---|---|---|
| RMSE | 9.68 | 14.99 |
| MAE | 6.47 | 9.96 |
| NASA Score | 213,795.6 | 176,641.2 |
The baseline survives both the Target Null Test and the Physical Perturbation Reactivity Test, indicating the model is learning genuine degradation physics rather than spurious correlations.
DeepAR is disqualified because it cannot ingest the exogenous 91-feature manifold properly and would violate zero-leakage constraints by conditioning on past target values at inference time.
| Model | Validation RMSE | Test RMSE | Validation NASA | Test NASA |
|---|---|---|---|---|
| TFT | 57.88 | 63.40 | 443,438.3 | 10,061,188.1 |
| N-HiTS | 57.88 | 63.40 | 443,438.3 | 10,061,188.1 |
The identical scores to four decimal places strongly suggest a checkpoint-loading collision and are therefore flagged as a blocking item for FusionCore v1.
The official NASA test set is accessed only in Phase 5, with all Phase 2 and Phase 3 artefacts replayed in strict forward-only mode. No retraining is permitted after test contact.
| Model | RMSE | MAE | NASA Score | Critical-Band F2 | Net Save | Status |
|---|---|---|---|---|---|---|
| XGBoost | 14.85 | 10.34 | 4,336.3 | 0.9339 | +$13,818,171 | RECOMMENDED ★ |
| TFT | 63.40 | 48.71 | 10,061,188.1 | 0.0532 | −$1,631,210 | Not viable (v0) |
| N-HiTS | 63.40 | 48.71 | 10,061,188.1 | 0.0532 | −$1,631,210 | Collision unresolved |
| DeepAR | — | — | — | — | — | DISQUALIFIED |
XGBoost correctly classifies 147 of 157 critical-band engines with zero Healthy-to-Critical misclassifications. By contrast, the neural comparators collapse toward ceiling-saturated predictions and miss the critical band almost entirely.
Under baseline MRO assumptions, deploying FusionCore v0 with its XGBoost recommendation yields a projected annual saving of +$13,818,171 for a 100-aircraft fleet.
| Gap ID | Priority | Description | Resolution Path |
|---|---|---|---|
| G-V1-01 (G5) | HIGH | True Yeo-Johnson transformation required for CPR and E_thermal | Fit on X_train only and produce dedicated neural matrices |
| G-V1-02 (G9) | MANDATORY | s16 ablation via SHAP | Residual analysis to confirm whether s16 contributes signal or noise |
| G-V1-03 (G3) | MANDATORY | Terminal class imbalance | Evaluate subset-stratified sampling if SHAP residuals show terminal bias |
FusionCore v1 inherits the Phase 3 FD00u parquet outputs directly, using the identical 80:20 engine-boundary split to preserve strict comparability with FusionCore v0.
FusionCore v1 is the PiNet programme: Predictive In-orbital Network, a compact physics-guided temporal neural architecture for commercial turbofan Remaining Useful Life estimation. It takes the validated 91-feature FD00u manifold produced by FusionCore v0 Phase 3 unchanged, then learns over 30-cycle engine windows using a long-memory Temporal Convolutional Network and a small physics-token MLP.
The output is operational rather than academic: one head predicts scalar RUL, while a second head classifies the engine state into Healthy, Warning, or Critical. The aim is to preserve the physics fidelity of the v0 XGBoost baseline while adding a trainable temporal backbone that can learn degradation trajectory structure directly.
| Area | FusionCore v1 Scope |
|---|---|
| Dataset | NASA C-MAPSS only: FD001, FD002, FD003, and FD004 pooled as FD00u. |
| Input | FusionCore v0 Phase 3 91-feature parquet output, with neural-only Yeo-Johnson treatment for high-skew virtual sensors. |
| Architecture | Two-branch deterministic PiNet backbone: 30-cycle TCN plus physics-token MLP, fused into a shared embedding. |
| Comparator | FusionCore v0 Phase 4 XGBoost carried forward unchanged: RMSE 14.85, NASA Score 4,336.3, Critical-band F2 0.9339. |
| Out of scope | N-CMAPSS deployment, real-time fleet dashboards, and full probabilistic deployment infrastructure. |
PiNet does not discard the engineering work from v0. It uses the same physics-aware manifold and then gives the most interpretable gas-path quantities their own forward-graph location. That matters because a TCN can see all 91 features, but it is not guaranteed to preserve compressor pressure ratio, isentropic efficiency, cumulative fatigue, Cox hazard, and regime-weight information in a way an engineer can audit.
The v1 roadmap is therefore deliberately conservative: fixed hyperparameters from PHM literature, no informal tuning against the NASA test set, a single official test-set touch at programme close, and one clearly pre-declared advanced ablation using triplet metric learning with Shannon regularisation.
| Metric | XGBoost v0 Reference | PiNet v1 Aim |
|---|---|---|
| RMSE | 14.85 cycles | Beat or match without weakening safety performance |
| NASA Asymmetric Score | 4,336.3 | Priority metric when RMSE and operational risk disagree |
| Critical-band F2 | 0.9339 | Maintain or improve safety-weighted terminal detection |
| Critical recall | 0.9363 | Strong pass if PiNet recall is at least the XGBoost recall |
A model that improves headline RMSE but loses the Critical band is not treated as operationally superior. The roadmap is explicit: safety-sensitive recall and NASA asymmetric cost govern the final judgement.
FusionCore v1 introduces PiNet, the Predictive In-orbital Network. The programme is built around three principles: simplification, comparability, and falsifiability. It deliberately avoids a sprawling experimental surface so that any performance difference against the v0 XGBoost reference can be explained rather than waved away.
| Directive | Implementation in v1 |
|---|---|
| Consume the v0 evidence base | Use the FusionCore v0 Phase 3 91-feature FD00u parquet output unchanged as the fixed input manifold. |
| Build the PiNet backbone | Pair a long-memory TCN over 30-cycle windows with a small physics-token MLP over selected gas-path and degradation scalars. |
| Train one primary model | Optimise a two-term loss combining NASA asymmetric RUL cost with class-weighted cross-entropy for Healthy, Warning, and Critical risk bands. |
| Evaluate once | Touch the NASA C-MAPSS official test set exactly once at programme close, reporting RMSE, NASA Score, MAE, and Critical precision/recall/F2. |
| Preserve a novelty hook | Run one pre-declared compute-bounded ablation with triplet metric learning and Shannon regularisation. |
The dataset is deliberately narrow: NASA C-MAPSS, subsets FD001-FD004, pooled as FD00u. N-CMAPSS is not included in v1. This keeps the benchmark comparable with the published PHM literature and with the v0 XGBoost result.
| Partition | Engines | Rows | Role | When Touched |
|---|---|---|---|---|
| Internal Train | 567 (~80%) | 129,331 | Loss optimisation and weight updates | Every batch of every epoch |
| Internal Validation | 142 (~20%) | 31,028 | Convergence monitoring and epoch selection | Once per epoch |
| NASA Official Test | 707 | 104,897 | Headline benchmark and literature comparability | Exactly once at programme close |
All inherited preprocessing artefacts remain read-only. The only v1-specific transformation is the Yeo-Johnson treatment for high-skew virtual sensors in the neural input matrix, fitted on Internal Train only and then frozen for validation and test.
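A minimal sketch of that single transformation, with illustrative column names for the high-skew virtual sensors:

```python
from sklearn.preprocessing import PowerTransformer

HIGH_SKEW = ["cpr", "e_thermal"]   # illustrative high-skew virtual sensors

pt = PowerTransformer(method="yeo-johnson", standardize=True)
train_nn[HIGH_SKEW] = pt.fit_transform(train_nn[HIGH_SKEW])  # fit on train only
val_nn[HIGH_SKEW] = pt.transform(val_nn[HIGH_SKEW])          # frozen replay
test_nn[HIGH_SKEW] = pt.transform(test_nn[HIGH_SKEW])
```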
PiNet is a two-branch backbone. The TCN branch learns temporal structure from 30-cycle windows of the 91-feature matrix. The physics branch processes a curated set of physically interpretable scalars: active gas-path sensors, virtual thermodynamic features, cumulative fatigue indices, Cox hazard score, inverse-frequency regime weight, and fault-family context.
| Component | Input | Mechanism | Roadmap Rationale |
|---|---|---|---|
| TCN branch | 30-cycle windows of the Yeo-Johnson-transformed 91-feature matrix | Dilated causal 1D convolutions with residual connections | Causality prevents look-ahead; dilation covers the full window without excessive parameters. |
| Physics-token branch | Approximately 11 physically meaningful scalars per cycle | Two-layer MLP with ReLU and BatchNorm | Keeps gas-path thermodynamic state auditable instead of burying it in the TCN latent space. |
| Fusion embedding | Last-timestep TCN state, mean-pooled TCN state, and physics embedding | Two-layer fusion MLP with LayerNorm | Combines current condition, distributed degradation evidence, and explicit physics state. |
| Output heads | Shared embedding d_e = 128 | RUL regression head plus three-class softmax risk head | Produces both scalar RUL and operational banding for maintenance decisions. |
The physics branch is not duplicated modelling for its own sake. It exists because feature presence is not the same as representation preservation: a TCN can mix physically interpretable values with 80-plus other channels, while a dedicated branch keeps gas-path interpretation visible and cheap to audit.
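A compact PyTorch sketch of the two-branch shape described in the table, using the fixed v1 hyperparameters (W = 30, K = 3, dilations {1, 2, 4, 8, 16}, 64 hidden channels, d_e = 128). Residual connections are omitted for brevity, and all module names are illustrative rather than the project's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1D convolution with left-only padding so no future cycles leak in."""
    def __init__(self, c_in, c_out, k, dilation):
        super().__init__()
        self.pad = (k - 1) * dilation
        self.conv = nn.Conv1d(c_in, c_out, k, dilation=dilation)

    def forward(self, x):                        # x: (B, C, T)
        return self.conv(F.pad(x, (self.pad, 0)))

class PiNet(nn.Module):
    def __init__(self, n_features=91, n_phys=11, hidden=64, d_e=128):
        super().__init__()
        layers, c_in = [], n_features
        for d in (1, 2, 4, 8, 16):               # doubling dilations: RF 63 > W 30
            layers += [CausalConv1d(c_in, hidden, 3, d), nn.ReLU()]
            c_in = hidden
        self.tcn = nn.Sequential(*layers)
        self.phys = nn.Sequential(               # shallow physics-token MLP
            nn.Linear(n_phys, 64), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU())
        self.fuse = nn.Sequential(               # last step + mean pool + physics
            nn.Linear(hidden * 2 + 64, d_e), nn.LayerNorm(d_e), nn.ReLU(),
            nn.Linear(d_e, d_e), nn.ReLU())
        self.rul_head = nn.Linear(d_e, 1)        # scalar RUL regression
        self.risk_head = nn.Linear(d_e, 3)       # Healthy / Warning / Critical

    def forward(self, window, phys_tokens):      # window: (B, 91, 30)
        h = self.tcn(window)                     # (B, 64, 30)
        z = torch.cat([h[:, :, -1], h.mean(dim=2),
                       self.phys(phys_tokens)], dim=1)
        e = self.fuse(z)
        return self.rul_head(e).squeeze(-1), self.risk_head(e)
```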
The Stage A primary model uses one composite loss. It combines NASA asymmetric RUL cost with class-weighted cross-entropy, and both terms are observation-weighted by the inverse-frequency regime weight.
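A sketch of that composite objective, assuming the PHM08 exponents given earlier; the 0.5 mixing weight and tensor names are illustrative:

```python
import torch
import torch.nn.functional as F

def nasa_asymmetric(rul_pred, rul_true):
    """Differentiable per-observation NASA cost (PHM08 exponents)."""
    d = (rul_pred - rul_true).clamp(-100, 100)   # clamp for numerical stability
    return torch.where(d >= 0, torch.exp(d / 10.0), torch.exp(-d / 13.0)) - 1.0

def stage_a_loss(rul_pred, rul_true, risk_logits, risk_true,
                 regime_w, class_w, lam=0.5):
    rul_term = nasa_asymmetric(rul_pred, rul_true)
    ce_term = F.cross_entropy(risk_logits, risk_true,
                              weight=class_w, reduction="none")
    # Both terms observation-weighted by inverse-frequency regime weight.
    return (regime_w * (rul_term + lam * ce_term)).mean()
```

The risk bands the cross-entropy term sees are defined below.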
| Risk Band | RUL Range | Natural Fraction | Training Batch Share |
|---|---|---|---|
| Healthy | RUL ≥ 80 | ≈ 60% | 40-50% |
| Warning | 30 ≤ RUL < 80 | ≈ 27% | 25-30% |
| Critical | RUL < 30 | ≈ 13.2% | 25-30% |
Validation and test distributions are never resampled. Batch balancing is only a training stabilisation mechanism to stop the terminal region from being drowned by Healthy-cycle windows.
There is no informal hyperparameter search in v1. The roadmap fixes the primary configuration before Stage A training starts. Any deviation becomes an explicit ablation, separated from the headline benchmark.
| Hyperparameter | Fixed Value | Source / Rationale |
|---|---|---|
| Embedding dim d_e | 128 | Li et al. C-MAPSS trade-off |
| TCN layers L | 5 | ERF = 63 cycles > W = 30 |
| TCN kernel K | 3 | Bai et al. TCN standard |
| Dilations | {1, 2, 4, 8, 16} | Doubling pattern for long memory |
| Hidden channels | 64 | PHM literature precedent |
| Window length W | 30 cycles | Heimes precedent plus v0 autocorrelation lag 17 margin |
| Physics MLP | 2 layers, BatchNorm, ReLU | Shallow by design because tokens are already meaningful |
| Dropout | 0.15 | Regularisation on a 16 GB Apple Silicon MPS budget |
| Batch size | 64 | Apple Silicon headroom |
| Learning rate | 1 × 10⁻³ | Adam default and PHM precedent |
| Optimiser | AdamW | Standard neural PHM practice |
| Epochs | 80, early stopping patience 15 | Empirical from prior SOTA work |
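The ERF entry follows from the standard dilated-stack receptive-field formula, assuming one causal convolution per level as tabled:

$$
\mathrm{RF} = 1 + (K - 1)\sum_l d_l = 1 + 2\,(1 + 2 + 4 + 8 + 16) = 63 > W = 30.
$$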
The sole head-to-head comparator is the v0 XGBoost result carried forward unchanged. DeepAR, TFT, and PatchTST are not treated as meaningful v1 comparators because their v0 NASA Scores were several thousand-fold worse than XGBoost and therefore outside the operational viability band.
| Metric | XGBoost v0 Carry-Forward | PiNet v1 Decision Logic |
|---|---|---|
| RMSE | 14.85 cycles | Report per subset and pooled; improvement is valuable only if safety metrics hold. |
| NASA Score | 4,336.3 | Primary operational cost metric when RMSE and risk conflict. |
| Critical-band F2 | 0.9339 | Safety-weighted measure because recall is twice as important as precision. |
| Critical recall | 0.9363 | Strong pass if PiNet recall − XGBoost recall ≥ 0. |
| Outcome | Condition | Programme Decision |
|---|---|---|
| Strong Pass | PiNet Critical recall − XGBoost Critical recall ≥ 0 | PiNet declared operationally superior. |
| Pass with Advisory | −0.02 ≤ PiNet Critical recall − XGBoost Critical recall < 0 | PiNet declared comparable and flagged as a watch item for the probabilistic extension. |
| Fail | PiNet Critical recall − XGBoost Critical recall < −0.02 | PiNet is not adopted as a replacement; failure analysis is required in Phase v1.6. |
The phase plan runs from data verification through window construction, PiNet implementation, primary training, single-touch NASA test evaluation, diagnostics and SHAP audit, then the advanced ablation. Every phase has numeric assertions and supporting visuals; no phase starts until the preceding gates pass.
The advanced ablation preserves the MSc-dissertation novelty hook without contaminating the headline result. It adds semi-hard triplet metric learning and a Shannon entropy term to the Stage A objective, but only on a representative compute-bounded subset first.
Ablation rollout to the full FD00u Internal Train is authorised only if the sample run produces a tighter held-out RUL distribution, improved Critical-class F2, or stronger same-stage versus different-stage embedding separation under a Kolmogorov-Smirnov gate. Null results are reported as null results rather than hidden.
The roadmap also defines a future probabilistic extension: conformal RUL prediction intervals and temperature-scaled risk-band probabilities over the frozen PiNet backbone, without retraining the deterministic core.
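As a sketch under those assumptions, split-conformal intervals need only a calibration pass over the frozen backbone (pinet_pred and the data splits are placeholders):

```python
import numpy as np

alpha = 0.1                                      # 90% target coverage
res = np.abs(y_cal - pinet_pred(X_cal))          # calibration residuals
n = len(res)
q = np.quantile(res, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
lo, hi = pinet_pred(X_test) - q, pinet_pred(X_test) + q  # symmetric interval
```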
The roadmap keeps a compact leakage register active for v1. The most important controls are engine-boundary disjointness, frozen preprocessing artefacts, within-engine windowing, validation used only for early stopping, and a logged one-time NASA test-set evaluation.
| Risk | Category | Mitigation |
|---|---|---|
| Engine appearing in multiple partitions | Hard leakage | GroupShuffleSplit by (subset_origin, unit_id) with programmatic non-intersection assertion. |
| Validation or test contaminating fitted statistics | Hard leakage | All inherited artefacts are read-only; v1 Yeo-Johnson parameters are fitted on Internal Train only. |
| Windowing across engine boundaries | Hard leakage if unchecked | Composite-key enforcement in the windowing function and per-batch assertion. |
| NASA test accessed before close | Hard leakage | Test set touched exactly once at Phase v1.4; no post-evaluation iteration. |
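The first control in the register reduces to a few lines; df is the assumed pooled frame:

```python
from sklearn.model_selection import GroupShuffleSplit

# Engine identity keyed on (subset_origin, unit_id) so FD002 unit 7 and
# FD004 unit 7 are distinct groups.
groups = df["subset_origin"].astype(str) + "_" + df["unit_id"].astype(str)
gss = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, val_idx = next(gss.split(df, groups=groups))

# Hard-leakage assertion: no engine may appear in both partitions.
assert set(groups.iloc[train_idx]).isdisjoint(groups.iloc[val_idx])
```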
The thesis addresses a compound challenge at the intersection of food security, climate science, and embedded machine learning: automated plant-disease detection in field environments where expert knowledge is scarce, labelled image data is limited, and the deployment platform imposes hard computational constraints.
The research goal is to develop a lightweight deep neural network pipeline capable of determining whether a plant is healthy or unhealthy from a small number of field images, with enough efficiency to run inference on resource-constrained IoT hardware. In the thesis, this capability profile is framed under precision agriculture.
The investigation operates on the PlantVillage dataset, publicly available via Kaggle and Mohanty’s GitHub repository, comprising 20,638 images across 15 plant-disease categories. For the binary classification objective, all 15 categories are collapsed into two classes.
| Class | Frequency | Proportion |
|---|---|---|
| Healthy | 3,221 | 15.61% |
| Unhealthy | 17,417 | 84.39% |
| Total | 20,638 | 100.00% |
A critical early finding is that the healthy class is the minority class, which inverts the usual disease-detection assumption. That imbalance has direct implications for loss design, threshold calibration, and recall prioritisation.
Stage 1 — Baseline Learning Model (BLM): MobileNetV2 configured as a standard binary classifier with random hyperparameter search. Its primary function is to establish three foundational hyperparameters that are inherited by Stage 2.
Stage 2 — Few-Shot Learning Model (FSLM): A Siamese Network with triplet-loss metric learning followed by a 2-way 5-shot episodic FSL evaluation with Bayesian hyperparameter optimisation. Its primary function is to demonstrate high-accuracy binary classification from minimal labelled examples.
| Stage | Model | Accuracy | Recall | MCC |
|---|---|---|---|---|
| Stage 1 (Test) | BLM — MobileNetV2 | 1.0000 | 1.0000 | 1.0000 |
| Stage 2 (Validation) | FSLM — 2-way 5-shot | 0.9969 | — | — |
| Stage 2 (Test) | FSLM — 2-way 5-shot | 0.9875 | 0.9877 | — |
| # | Category | Images |
|---|---|---|
| 1 | Pepper Bell Bacterial Spot | 997 |
| 2 | Pepper Bell Healthy | 1,478 |
| 3 | Potato Early Blight | 1,000 |
| 4 | Potato Late Blight | 1,000 |
| 5 | Potato Healthy | 152 |
| 6 | Tomato Bacterial Spot | 2,127 |
| 7 | Tomato Early Blight | 1,000 |
| 8 | Tomato Late Blight | 1,909 |
| 9 | Tomato Leaf Mold | 952 |
| 10 | Tomato Septoria Leaf Spot | 1,771 |
| 11 | Tomato Two-Spotted Spider Mite | 1,676 |
| 12 | Tomato Target Spot | 1,404 |
| 13 | Tomato Yellow Leaf Curl Virus | 3,209 |
| 14 | Tomato Mosaic Virus | 373 |
| 15 | Tomato Healthy | 1,591 |
| — | Total | 20,639 |
Two anomalous files were identified during the file-extension audit: almost all images are .jpg, with two exceptions in .png and .jpeg. After collapsing the 15 categories into a binary target, the class ratio becomes 5.41:1 in favour of the unhealthy class.
| Partition | Proportion | Role |
|---|---|---|
| Training (7) | 70% | Weight optimisation, random search, Siamese metric learning |
| Validation (2) | 20% | Hyperparameter tuning and episodic support-set pool |
| Test (1) | 10% | Final evaluation and episodic query set |
MobileNetV2 was selected because its depthwise separable convolution structure substantially reduces parameter count while preserving feature extraction capability. The width multiplier α = 0.5 halves the number of convolutional kernels at each layer.
A logistic classification threshold of 0.3, rather than 0.5, is used to improve recall on the unhealthy class because the cost of missing a diseased plant is materially higher than the cost of a false positive.
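A minimal Keras sketch of the Stage 1 configuration, combining the width multiplier, the Trial 2 hyperparameters, and the recall-biased threshold; input size and head layout are assumptions:

```python
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), alpha=0.5,        # width multiplier α = 0.5
    include_top=False, weights="imagenet", pooling="avg")

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dropout(0.3),                # optimal dropout from Trial 2
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),  # optimal learning rate
              loss="binary_crossentropy")

# Recall-biased decision rule: flag unhealthy at p >= 0.3 rather than 0.5.
is_unhealthy = model.predict(images) >= 0.3
```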
| Hyperparameter | Symbol | Search Space | Optimal (Trial 2) |
|---|---|---|---|
| Width Multiplier | α | {0.35, 0.5, 0.75, 1.0} | 0.5 |
| Learning Rate | η | {0.0001, 0.001, 0.01} | 0.0001 |
| Dropout Rate | P | {0.2, 0.3, 0.4, 0.5} | 0.3 |
The primary discriminator is the Matthews Correlation Coefficient (MCC), preferred over accuracy under class imbalance because it uses all four cells of the confusion matrix.
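For reference, MCC is computed from all four cells at once:

$$
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}
{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
$$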
| Dataset | Accuracy | Precision | Recall | F1-Score | MCC |
|---|---|---|---|---|---|
| Train | 0.9999 | 1.0000 | 0.9998 | 0.9999 | 0.9995 |
| Validation | 0.9985 | 0.9989 | 0.9994 | 0.9991 | 0.9945 |
| Test | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
Sub-stage 2a repurposes the MobileNetV2 backbone inside a Siamese Network and trains it with triplet loss. This is metric learning, not episodic FSL. Sub-stage 2b freezes that embedding branch and performs a 2-way 5-shot episodic evaluation with Bayesian hyperparameter optimisation.
| Component | Parameters | Notes |
|---|---|---|
| Trainable (MobileNetV2 + Dense layers) | ~1,180,000 | Updated during triplet-loss training |
| Non-Trainable (ImageNet pretrained weights) | 18,544 | Frozen from Stage 1 |
| Total | ~1,200,000 | |
| Memory footprint | 4.57 MB | IoT-deployable |
The geometric constraint is d⁻ ≥ d⁺ + α_trip: the anchor-negative distance must exceed the anchor-positive distance by at least the margin α_trip. The observed triplet loss falls rapidly from 0.0027 at epoch 1 to approximately zero by epoch 2, indicating the pretrained backbone already provides a highly structured embedding space.
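A minimal sketch of the triplet objective behind that constraint, written in PyTorch for consistency with the earlier sketches; embeddings are assumed L2-normalised:

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, alpha_trip=0.2):
    """Hinge on d⁺ (anchor-positive) vs d⁻ (anchor-negative) with margin."""
    d_pos = F.pairwise_distance(anchor, positive)   # same-class distance d⁺
    d_neg = F.pairwise_distance(anchor, negative)   # cross-class distance d⁻
    # Zero loss exactly when d⁻ ≥ d⁺ + α_trip holds for the triplet.
    return F.relu(d_pos - d_neg + alpha_trip).mean()
```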
t-SNE visualisation confirms clean healthy/unhealthy cluster separation, while the pairwise distance density plot shows that intra-class and inter-class distance distributions are only minimally overlapping.
Each episode is a miniature binary classification task with N = 2 classes and k = 5 labelled support examples per class. A Softmax classifier is then applied over the frozen embedding space.
The sign of λ_Shan determines whether entropy is penalised or rewarded, which is what makes the hyperparameter landscape non-trivial.
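A sketch of one episode's loss under these definitions. The prototype-distance softmax is an assumption about the classifier head; the entropy term shows how the sign of λ_Shan flips penalty into reward:

```python
import torch
import torch.nn.functional as F

def episode_logits(query_emb, support_emb, support_labels, n_way=2):
    # Class prototypes: mean of the k = 5 support embeddings per class.
    protos = torch.stack([support_emb[support_labels == c].mean(0)
                          for c in range(n_way)])
    return -torch.cdist(query_emb, protos)       # nearer prototype, higher logit

def episodic_loss(logits, query_labels, lam_shan=0.0):
    ce = F.cross_entropy(logits, query_labels)
    p = F.softmax(logits, dim=1)
    entropy = -(p * torch.log(p + 1e-9)).sum(1).mean()   # Shannon entropy H(p)
    # lam_shan > 0 penalises entropy; lam_shan < 0 rewards it; 0 disables it.
    return ce + lam_shan * entropy
```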
The Bayesian optimiser uses a Gaussian Process surrogate and Expected Improvement (EI) as the acquisition function.
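A minimal scikit-optimize sketch of that loop over the search space tabled below; run_episodes is a placeholder for the episodic validation routine, and the EI exploration weight maps to gp_minimize's xi argument:

```python
from skopt import gp_minimize
from skopt.space import Categorical

space = [
    Categorical([-10.0, -5.0, -1.0, 0.0, 0.01, 0.5], name="lam_shan"),
    Categorical([1.5, 2.5], name="xi_smooth"),
]

def objective(params):
    lam_shan, xi_smooth = params
    return -run_episodes(lam_shan, xi_smooth)    # minimise negative accuracy

# Gaussian Process surrogate with Expected Improvement acquisition.
result = gp_minimize(objective, space, n_calls=10,
                     acq_func="EI", xi=0.75, random_state=42)
```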
| Hyperparameter | Symbol | Search Space | Optimal (Trial 2) | Role |
|---|---|---|---|---|
| Shannon coefficient | λ_Shan | {−10.0, −5.0, −1.0, 0.0, 0.01, 0.5} | 0.0 | Regularisation weight |
| Smoothing parameter | ξ | {3/2, 5/2} | 5/2 (= 2.5) | Embedding-space sensitivity |
| EI exploration weight | ξ_EI | {0.0, 0.1, 0.5, 0.75, 1.0} | 0.75 | Exploration–exploitation balance |
| Trial | λ_Shan | ξ | ξ_EI | Validation Accuracy | Notes |
|---|---|---|---|---|---|
| 2 ★ | 0.0 | 2.5 | 0.75 | ~0.9969 | Optimal — Shannon term inactive; moderate exploration bias |
| 3 | −5.0 | — | — | ~0.9969 | Inverted regulariser; still highly performant |
| 10 | — | — | — | Worst trial | Demonstrates hyperparameter sensitivity |
| Partition | Accuracy | Recall | F1-Score |
|---|---|---|---|
| Validation | ~0.9969 | — | — |
| Test | 0.9875 | 0.9877 | — |
The 98.77% recall on the test set is the operationally critical result: it quantifies the fraction of genuinely diseased plants correctly identified, which is the central requirement for field deployment under limited expert supervision.
Controlled environment bias: PlantVillage is captured under studio conditions, so the reported performance is not expected to transfer directly to field conditions with illumination variation, occlusion, motion blur, and other sources of noise.
Class imbalance inversion: the minority class is healthy, not unhealthy. Real agricultural deployments may reverse or destabilise this prevalence pattern, implying a need for threshold recalibration or explicit class weighting.
2-way episodic constraint: the thesis addresses a binary problem only. Multi-class severity staging is proposed as a future extension using cosine-similarity Softmax, class-weighted triplet sampling, and zero-bias initialisation prior to L2 normalisation.
| Thesis Component | FusionCore Adaptation | Adaptation Notes |
|---|---|---|
| Siamese Network triplet loss | PiNet metric-learning regularisation via L_triplet | Adapted from binary image classification to multi-stage RUL regression using RUL-band proximity rather than class labels |
| Shannon regulariser (λ_Shan, ξ) | Embedding-collapse prevention in PiNet backbone pretraining | Used to counter collapse risk under terminal-class imbalance in C-MAPSS |
| Bayesian HPO (GP + EI) | Deferred from the FusionCore v1 headline experiment | The v1 PiNet roadmap uses fixed PHM-sourced hyperparameters; Bayesian tuning remains a future research stream rather than a test-set-facing control |
| 2-way 5-shot episodic FSL | True episodic FSL deferred to FusionCore v2 | N-CMAPSS provides the cross-dataset novelty required for a genuine episodic setting |
Technical focus developed through extensive independent study and a growing specialist library covering aerospace predictive maintenance, prognostics, deep learning, and time-series methods. My interest sits in building intelligent systems that can identify early signs of degradation, model temporal behaviour, and support decision-making in high-stakes engineering environments. I am especially drawn to Remaining Useful Life estimation, anomaly detection, and physics-aware machine learning approaches that connect strong technical performance with operational credibility.
Asset health, condition monitoring, and operational reliability.
Run-to-failure simulation, C-MAPSS datasets, RUL estimation, condition monitoring, and Industry 4.0 maintenance strategies.
Neural models for perception, sequence learning, and forecasting.
CNNs, RNNs, Temporal Fusion Transformers, DeepAR, N-HiTS, Siamese Networks, ResNet, MobileNet, and Inception architectures.
Temporal modelling of degradation, anomalies, and lifecycle trends.
Multi-horizon forecasting, degradation modelling, anomaly detection, and physics-informed feature engineering for temporal data.
Probabilistic reasoning, inference, and validation strategy.
Bayesian optimisation, density estimation, survival analysis, hyperparameter tuning, and cross-validation strategies.
Learning efficiently from limited labelled data.
Prototypical networks, Siamese architectures, domain adaptation, and learning from limited labelled data.
Building trust and transparency in high-stakes models.
SHAP values, model interpretability, t-SNE visualisation, and building trust in high-stakes ML systems.
LinkedIn article tiles sit here as a separate stream from the wider study themes, using the same hover-led tile language and outward link treatment used across Mission Control.
I got my Data Science Master’s in my early 50s… then discovered the “abundant job market” had quietly left the building. So instead of waiting for “experience” to magically appear, I’m building it the hard way: FusionCore, a real aerospace predictive maintenance project using NASA turbofan sensor data (C-MAPSS)—focused on time-series anomaly detection and Remaining Useful Life (RUL) prediction.
This is Post 1 of a series where I’ll share updates (weekly, where possible) as I follow a roadmap I’ve laid out in the article: pick a niche, build domain knowledge, ship a real project, get it reviewed by industry people, and make it easy for employers to assess the work. Not glamorous, occasionally chaotic… but at least it’s honest.
If this resonates, feel free to repost it for anyone else career-switching or job-hunting. And if you work in aerospace / predictive maintenance (or you’ve already broken into it), I’d love to connect—even a quick “here’s what I’d do differently” could save me weeks. Also: if you’re on a similar journey, tell me what you’re building (or what’s not working). Misery loves company… but progress loves receipts.
I built my data science project the fastest way to fool myself: grabbed the data, let AI sprint, shipped six notebooks… and called it “progress.” The problem? I still couldn’t explain what the numbers meant, which variables mattered to whom, or what risk hides inside “good model performance.”
So, I scrapped the notebook pile and rebuilt the work like an operating system: compress feedback loops, delete/simplify before automating, iterate fast, and engineer the project so it survives scrutiny without me in the room—think mission readiness review, but the payload is my own competence.
I’ve just shared a new article on how I set up my predictive maintenance project — and why I spent far longer building the roadmap than touching the model itself. Before getting to the glamorous machine learning bit, there was the small matter of understanding the physics, the risk, and the cost of being wrong. Turns out, in aerospace, “just winging it” is not a recognised methodology.
My latest article is about a result that genuinely surprised me.
I built FusionCore v0, a physics-aware predictive maintenance pipeline for turbofan engine Remaining Useful Life estimation, fully expecting the neural networks to lead.
They didn’t.
But the article is about far more than which model won.
It is about how to build AI that can speak to engineering, safety, operations, and finance at the same time. Using FusionCore v0 as the case study, I explain why the strongest result came from a model built on physics grounding, zero-leakage controls, and risk-aware evaluation, and why AI in aerospace has to be judged not just by prediction quality, but by how well it handles the operational consequences of being wrong.