Tropical Cyclone EVOlution Model

TC Evolution is a multimodal tropical cyclone intensity estimation model that estimates the current maximum sustained wind (V_max) of a tropical cyclone from a short time sequence of environmental and satellite data. The system began as a 2025 prototype; this page documents Beta 1.6.

Unlike a purely image-based estimator, TC Evolution combines:

environmental fields from numerical weather prediction data,
satellite structure from infrared and water vapor imagery,
surface context from sea-surface temperature and land masking,
and recent storm history from track-based scalar features.

Its design goal is to estimate current intensity rather than to serve as a long-range forecast model.

Overview

TC Evolution is a sequence model that ingests a short history of storm-centered gridded data and predicts the present tropical cyclone intensity in knots. The operational Beta 1.6 implementation uses a 16-channel input over multiple 6-hour frames and a neural architecture built around:

a 2D residual encoder for full-field (storm-centered outer box) satellite structure,
a 2D residual encoder for zoomed inner-core satellite structure,
a 2D residual encoder for environmental and surface fields,
a temporal Transformer encoder,
and scalar storm-history features.

The final predicted intensity is generated as a delta update relative to the prior 6-hour intensity: <math>\hat{V}_{t} = V_{t-6} + \Delta V</math>

This delta formulation was adopted to encourage the model to learn storm evolution rather than regress directly toward the climatological mean.
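The delta update can be illustrated with a minimal sketch. The function name and the sample numbers are hypothetical; only the relation between the prior intensity, the predicted change, and the estimate comes from the formula above.

```python
def apply_delta(v_prev_kt: float, delta_kt: float) -> float:
    """Return the current-intensity estimate V_t = V_{t-6} + delta_V, in knots."""
    return v_prev_kt + delta_kt


# Illustrative numbers only: a storm analyzed at 100 kt six hours ago
# with a predicted change of +7.5 kt.
estimate_kt = apply_delta(100.0, 7.5)
print(estimate_kt)  # 107.5
```

A predicted delta of zero reproduces the prior intensity, so the network only has to learn departures from persistence.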

History

2025 prototype

TC Evolution began in 2025 as a prototype intensity-analysis system focused on storm-centered gridded inputs and synoptic best-track intensity targets. Early development emphasized:

current-intensity estimation rather than long-lead forecasting,
storm-relative spatial crops,
storm-wise train/validation/test separation,
and explicit use of recent intensity history.

Transition to multimodal beta design

The prototype evolved into a two-stage beta design:

Stage A: Satellite-only pretraining
- Satellite-only pretraining on long-term GridSat data spanning 1998–2024.
- Objective: teach the model to recognize tropical cyclone structure from infrared and water vapor imagery over a long historical span.

Stage B: Multimodal fine-tuning
- Fine-tuning on a multimodal dataset spanning 2015–2024.
- Added environmental reanalysis/forecast-style fields, sea-surface temperature, land masking, and storm-history scalars.

Beta 1.6

Beta 1.6 refers to the operationalized multimodal current-intensity system described on this page. The Beta 1.6 line includes:

fixed 16-channel storm-centered frame construction,
ATCF-based operational track ingestion,
GOES file selection and storm-centered satellite extraction,
GFS environmental field gathering,
OISST sea-surface temperature retrieval,
and checkpoint-based inference from the multimodal final model.

Design principles

TC Evolution was developed around several principles:

1. Current intensity, not long-range forecasting

The model is intended to estimate the current V_max from recent storm evolution and surrounding environmental structure. It is not primarily a long-range track or intensity forecast system.

2. Storm evolution matters

The model uses:

multiple recent synoptic frames,
prior intensity history,
intensity change over the last 6–12 hours,
and storm motion.

This was meant to better represent how tropical cyclones intensify or weaken over time.

3. Inner-core structure matters

The model uses both:

a full storm-centered satellite crop,
and a zoomed inner-core crop.

This reflects the importance of eye structure, ring symmetry, convective organization, and core compactness in intensity estimation.

4. Environment still matters

Major intensity changes are not controlled by cloud structure alone. Environmental wind fields, shear-related diagnostics, SST, and land interaction were included so that the estimate does not rely on cloud appearance alone.

5. Operational practicality

The final implementation was built to run from operational data streams using:

ATCF best-track or operational b-deck style storm position data,
GOES satellite imagery,
GFS gridded fields,
and daily OISST.

Model architecture

Input structure

The model ingests a sequence of storm-centered frames with shape:

[channels, time, height, width]

For Beta 1.6, the input uses 16 channels on a 41 × 41 grid.

Channel layout

Channel group	Channels	Description
Environmental fields	10	u10, v10, u850, v850, u200, v200, gh850, rh850, shear200_850, shear850_sfc
Satellite full field	2	infrared (IR), water vapor (WV)
Satellite zoom field	2	zoomed infrared (IR zoom), zoomed water vapor (WV zoom)
Surface context	2	sea-surface temperature (SST), land mask
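As a rough sketch, the channel groups in the table can be mapped to index ranges along the channel axis. The ordering below simply follows the table and is an assumption; the article does not state the order actually used inside Beta 1.6 frames, and the names CHANNEL_GROUPS and channel_slices are hypothetical.

```python
# Assumed channel order: it mirrors the table above, not a confirmed layout.
CHANNEL_GROUPS = {
    "environment": ["u10", "v10", "u850", "v850", "u200", "v200",
                    "gh850", "rh850", "shear200_850", "shear850_sfc"],
    "sat_full": ["ir", "wv"],
    "sat_zoom": ["ir_zoom", "wv_zoom"],
    "surface": ["sst", "land_mask"],
}


def channel_slices(groups):
    """Return ({group: slice}, total_channels) for consecutive channel groups."""
    slices, start = {}, 0
    for name, channels in groups.items():
        slices[name] = slice(start, start + len(channels))
        start += len(channels)
    return slices, start


SLICES, N_CHANNELS = channel_slices(CHANNEL_GROUPS)
print(N_CHANNELS)           # 16
print(SLICES["sat_zoom"])   # slice(12, 14, None)
```

Under this assumed order, the 12 non-satellite channels are the environment and surface groups taken together.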

Scalar feature vector

A separate scalar feature vector is used alongside the gridded sequence. In Beta 1.6 it includes:

latitude,
longitude encoded as sine/cosine,
month encoded as sine/cosine,
basin one-hot encoding,
zonal and meridional motion estimates,
V_max at t-6 h,
V_max at t-12 h,
6-hour intensity tendency,
missingness flags for previous intensities.
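One way such a scalar vector could be assembled is sketched below. The basin list, the feature order, the zero fill for missing intensities, and the definition of the 6-hour tendency as V_max(t-6) minus V_max(t-12) are all assumptions made for illustration; any scaling applied in Beta 1.6 is not described here and is omitted.

```python
import math

BASINS = ("AL", "EP", "CP", "WP", "IO", "SH")  # hypothetical basin list


def build_scalar_features(lat_deg, lon_deg, month, basin,
                          u_motion, v_motion, vmax_t6=None, vmax_t12=None):
    """Assemble an illustrative scalar feature vector as a flat list."""
    lon_rad = math.radians(lon_deg)
    month_angle = 2.0 * math.pi * (month - 1) / 12.0
    basin_onehot = [1.0 if basin == b else 0.0 for b in BASINS]
    # Assumed tendency: V(t-6) - V(t-12); 0.0 when either value is missing.
    have_both = vmax_t6 is not None and vmax_t12 is not None
    tendency = (vmax_t6 - vmax_t12) if have_both else 0.0
    return [
        lat_deg,
        math.sin(lon_rad), math.cos(lon_rad),          # cyclic longitude
        math.sin(month_angle), math.cos(month_angle),  # cyclic month
        *basin_onehot,
        u_motion, v_motion,
        vmax_t6 if vmax_t6 is not None else 0.0,
        vmax_t12 if vmax_t12 is not None else 0.0,
        tendency,
        1.0 if vmax_t6 is None else 0.0,   # missingness flag for t-6 h
        1.0 if vmax_t12 is None else 0.0,  # missingness flag for t-12 h
    ]


features = build_scalar_features(17.5, -78.0, 10, "AL", -2.0, 1.5,
                                 vmax_t6=150.0, vmax_t12=None)
print(len(features))  # 18
```

The sine/cosine pairs keep longitude and month continuous across the date line and the turn of the year, and the missingness flags let the network distinguish a true zero from an absent value.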

Neural structure

The model is implemented as TropicalCurrentIntensityNet.

Satellite branches

Two separate residual 2D backbones are used:

full_sat_enc for full storm structure,
zoom_sat_enc for inner-core structure.

Each branch is a small residual CNN using:

convolution,
GroupNorm,
GELU activation,
residual blocks,
progressive downsampling,
adaptive global pooling.

Environmental branch

A third residual 2D backbone, env_enc, processes the 12-channel non-satellite input composed of:

10 environmental channels,
SST,
land mask.

Frame fusion

For each time step:

the full satellite branch embedding,
the zoom satellite branch embedding,
and the environmental branch embedding

are concatenated and projected into a shared hidden representation.
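The concatenate-and-project step can be written out in plain Python as a sketch. It assumes the projection is a single learned linear layer (a weight matrix plus a bias), which the article does not confirm, and it uses tiny made-up dimensions rather than the real embedding sizes.

```python
def fuse_frame(full_emb, zoom_emb, env_emb, weight, bias):
    """Concatenate three branch embeddings and apply a linear projection.

    weight has one row per output unit; each row must match the length of
    the concatenated embedding.
    """
    concat = list(full_emb) + list(zoom_emb) + list(env_emb)
    for row in weight:
        if len(row) != len(concat):
            raise ValueError("weight row length must equal concatenated size")
    return [sum(w * x for w, x in zip(row, concat)) + b
            for row, b in zip(weight, bias)]


# Toy example: embeddings of size 2, 1, and 1 projected to 2 hidden units.
hidden = fuse_frame([1.0, 2.0], [3.0], [4.0],
                    weight=[[1, 0, 0, 0], [0, 1, 1, 1]],
                    bias=[0.5, 0.0])
print(hidden)  # [1.5, 9.0]
```

Applying the same projection at every time step yields one hidden vector per frame, which is the sequence passed on to the temporal model.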

Temporal modeling

The time sequence is modeled with a Transformer encoder using:

a learned class token,
learned temporal positional embeddings,
hidden dimension 384,
8 attention heads,
4 Transformer layers.
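The listed hyperparameters can be collected in a small configuration object, sketched here. The class name is hypothetical, and the per-head width assumes standard multi-head attention, where the hidden dimension is split evenly across heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalEncoderConfig:
    """Illustrative container for the temporal encoder settings listed above."""
    hidden_dim: int = 384
    num_heads: int = 8
    num_layers: int = 4

    @property
    def head_dim(self) -> int:
        # Standard multi-head attention needs an even split across heads.
        if self.hidden_dim % self.num_heads != 0:
            raise ValueError("hidden_dim must be divisible by num_heads")
        return self.hidden_dim // self.num_heads

    def num_tokens(self, seq_len: int) -> int:
        # One token per 6-hour frame plus the learned class token.
        return seq_len + 1


cfg = TemporalEncoderConfig()
print(cfg.head_dim)        # 48
print(cfg.num_tokens(4))   # 5
```

With 384 hidden units and 8 heads, each head attends over 48-dimensional projections. The sequence length of 4 is only an example, since the actual length is stored in the checkpoint.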

The Transformer summarizes the evolution of the storm over the input sequence.

Scalar fusion

Scalar storm-history features are passed through a small multilayer scalar network, then fused with the temporal representation.

Output

The Beta 1.6 operational inference model predicts only the intensity delta: <math>\Delta V</math>

The final intensity estimate is: <math>\hat{V}_{t} = V_{t-6} + \Delta V</math>

Data

Stage A pretraining data

Satellite-only pretraining used a GridSat-based dataset spanning 1998–2024. The satellite channels used for this phase were:

IR
WV
IR zoom
WV zoom

The purpose of Stage A was to expose the satellite encoders to a much longer historical archive than was available from the multimodal period.

Stage B fine-tuning data

The multimodal fine-tuning phase used data spanning 2015–2024, including:

storm-centered GFS-derived environmental fields,
storm-centered satellite imagery,
sea-surface temperature,
land masking,
and best-track intensity labels.

Label source

Training labels were derived from HURDAT2-style best-track records at synoptic times. Operational 2025 testing used ATCF/b-deck style best-track information instead, because HURDAT2 is updated after the season and was not the operational source for those cases during testing.

Spatial setup

The model uses storm-centered grids with:

output resolution: 0.25°
outer box size: 10.0°
inner zoom size: 4.0°
final grid size: 41 × 41
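The grid size follows from the box size and resolution: 10.0° / 0.25° gives 40 intervals, hence 41 points per side. The sketch below reproduces that arithmetic. It also assumes the 4.0° zoom box is resampled onto the same 41 × 41 grid, which would imply roughly 0.1° spacing; the article states the final grid size but not the zoom resampling method, and the function names are hypothetical.

```python
def grid_points(box_deg: float, spacing_deg: float) -> int:
    """Number of grid points per side for a box sampled at a fixed spacing."""
    return int(round(box_deg / spacing_deg)) + 1


def axis_offsets(box_deg: float, n_points: int):
    """Storm-relative offsets in degrees, from -box/2 to +box/2 inclusive."""
    half = box_deg / 2.0
    step = box_deg / (n_points - 1)
    return [-half + i * step for i in range(n_points)]


n = grid_points(10.0, 0.25)
outer = axis_offsets(10.0, n)   # outer box, 0.25 degree spacing
zoom = axis_offsets(4.0, n)     # assumed zoom grid, about 0.1 degree spacing

print(n)                        # 41
print(outer[0], outer[-1])      # -5.0 5.0
```

Adding these offsets to the storm-center latitude and longitude gives the target coordinates onto which the source fields would be interpolated.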

Temporal setup

The model uses a short synoptic sequence at 6-hour spacing. The fine-tuned model checkpoint stores the effective sequence length used in training and inference.
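A sketch of how the frame times for one sequence might be generated is shown below. It assumes the sequence ends at the analysis time and steps backward in 6-hour increments; the seq_len argument is a placeholder for the value stored in the checkpoint.

```python
from datetime import datetime, timedelta


def frame_times(analysis_time, seq_len, step_hours=6):
    """Return seq_len frame times ending at analysis_time, oldest first."""
    return [analysis_time - timedelta(hours=step_hours * k)
            for k in range(seq_len - 1, -1, -1)]


# Example with a hypothetical sequence length of 3.
for t in frame_times(datetime(2025, 1, 1, 12), 3):
    print(t.isoformat())
# 2025-01-01T00:00:00
# 2025-01-01T06:00:00
# 2025-01-01T12:00:00
```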

Training approach

Stage A

Satellite-only pretraining emphasized structural recognition across a long historical archive. Development logs shared during testing reported the following Stage A test results:

Metric	Value
Score	8.620
RMSE	5.73 kt
RMSE 96+	9.91 kt
RMSE 137+	13.91 kt

Stage A was viewed primarily as a representation-learning step, so these results were not treated as the final operational benchmark.

Stage B

Stage B added multimodal inputs and reused pretrained satellite weights where compatible. The fine-tuning phase used:

storm-wise stratified splitting,
balanced sampling across intensity bins,
lower learning rate for imported satellite parameters,
temporary freezing of satellite encoders in early epochs,
and a weighted score emphasizing strong-storm performance.
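Two of these ideas, storm-wise splitting and balanced sampling across intensity bins, can be sketched with the standard library. This is not the project's training code: the hash-based split below keeps each storm in a single partition but is not stratified, the split fractions are arbitrary, and only the 96, 113, and 137 kt bin edges appear in this article (the 34 and 64 kt edges are added for illustration).

```python
import bisect
import hashlib
from collections import Counter

# Assumed bin edges in knots; see the note above about which are illustrative.
BIN_EDGES_KT = (34, 64, 96, 113, 137)


def intensity_bin(vmax_kt):
    """Index of the intensity bin containing vmax_kt (0 is the weakest bin)."""
    return bisect.bisect_right(BIN_EDGES_KT, vmax_kt)


def balanced_weights(vmax_list):
    """Per-sample weights that give every occupied bin equal total weight."""
    bins = [intensity_bin(v) for v in vmax_list]
    counts = Counter(bins)
    return [1.0 / (len(counts) * counts[b]) for b in bins]


def storm_split(storm_id, val_frac=0.1, test_frac=0.1):
    """Assign a whole storm to one partition from a hash of its ID."""
    digest = hashlib.sha256(storm_id.encode("utf-8")).hexdigest()
    u = int(digest[:8], 16) / 0x100000000  # deterministic value in [0, 1)
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


weights = balanced_weights([30, 35, 40, 100, 140])
print(weights)  # [0.25, 0.125, 0.125, 0.25, 0.25]
```

Weights of this kind could be passed to a weighted sampler so that rare high-intensity frames are drawn about as often as common weak-storm frames. Splitting by storm ID rather than by frame keeps highly correlated consecutive frames of the same storm out of different partitions.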

Development logs shared during testing reported the following final Stage B test results:

Metric	Value
Score	7.413
RMSE	5.47 kt
RMSE 96+	7.24 kt
RMSE 137+	12.54 kt
RMSE 96–112	6.86 kt
RMSE 113–136	7.39 kt
RMSE 137+ bin	12.54 kt

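These results showed an improvement over the satellite-only stage, especially for stronger storms: overall RMSE fell from 5.73 kt to 5.47 kt, RMSE for storms of 96 kt and above fell from 9.91 kt to 7.24 kt, and RMSE for storms of 137 kt and above fell from 13.91 kt to 12.54 kt.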
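Threshold-based RMSE values like those in the tables can be computed as sketched below. The sketch assumes the thresholds are applied to the best-track intensity rather than to the prediction, and it uses synthetic numbers. The formula for the weighted Score is not given in this article and is not reproduced.

```python
import math


def rmse(pairs):
    """Root-mean-square error over (truth, prediction) pairs."""
    return math.sqrt(sum((p - t) ** 2 for t, p in pairs) / len(pairs))


def rmse_at_or_above(truth, pred, threshold_kt):
    """RMSE restricted to samples whose best-track value meets the threshold."""
    pairs = [(t, p) for t, p in zip(truth, pred) if t >= threshold_kt]
    return rmse(pairs) if pairs else None


# Synthetic example values in knots, not model output.
truth = [50.0, 100.0, 140.0]
pred = [53.0, 96.0, 135.0]

print(rmse_at_or_above(truth, pred, 137))  # 5.0
print(rmse_at_or_above(truth, pred, 200))  # None
```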

Operational inference pipeline

Beta 1.6 operational inference performs the following steps:

Read storm position/intensity history from ATCF-style b-deck data.
Build a sequence of recent storm-centered frames at 6-hour intervals.
Gather the environmental fields from GFS.
Gather GOES IR and WV imagery.
Create both full-field and zoomed satellite crops.
Gather OISST and create a land mask.
Normalize inputs using statistics saved in the trained checkpoint.
Predict the current V_max.

The operational script is designed to estimate intensity for cases where the storm history is available from ATCF and the environmental and satellite files can be fetched.
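The first step, reading storm history from a b-deck, might look like the sketch below. It relies on the standard ATCF comma-separated layout (basin, cyclone number, date-time group, latitude and longitude in tenths of a degree with hemisphere letters, then maximum wind in knots). The function name and the sample line are illustrative; the line is synthetic and does not describe a real storm.

```python
from datetime import datetime


def parse_bdeck_line(line):
    """Parse position and intensity from one ATCF b-deck record."""
    f = [s.strip() for s in line.split(",")]
    # Latitude and longitude are stored in tenths of a degree, e.g. "155N".
    lat = int(f[6][:-1]) / 10.0 * (1 if f[6][-1] == "N" else -1)
    lon = int(f[7][:-1]) / 10.0 * (-1 if f[7][-1] == "W" else 1)
    return {
        # Basin + cyclone number + year of the record (season-year handling
        # for storms that cross a year boundary is not addressed here).
        "storm_id": f"{f[0]}{f[1]}{f[2][:4]}",
        "time": datetime.strptime(f[2], "%Y%m%d%H"),
        "lat": lat,
        "lon": lon,
        "vmax_kt": int(f[8]),
    }


# Synthetic record for illustration only.
sample = "AL, 99, 2025090100,   , BEST,   0, 155N,  452W,  45, 1002, TS"
record = parse_bdeck_line(sample)
print(record["storm_id"], record["lat"], record["lon"], record["vmax_kt"])
# AL992025 15.5 -45.2 45
```

Real b-decks can contain several records for the same time (one per wind-radii threshold), so a full reader would keep one record per time before building the 6-hour sequence.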

Informal 2025 spot-check performance

A small set of 2025 storm spot checks was shared during development. These are not a full validation study, but they are useful operational examples.

Storm ID	Timestamp (UTC)	Best-track Vmax	TC Evolution	Absolute error
AL132025	2025-10-28 12:00	155 kt	151.5 kt	3.5 kt
AL082025	2025-09-27 00:00	120 kt	100.0 kt	20.0 kt
AL022025	2025-06-29 18:00	40 kt	37.1 kt	2.9 kt

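Rounded to the nearest 5 kt, the convention used for operational intensity values:

AL132025 rounds to 150 kt, one 5-kt step below the 155 kt best track, a near-hit at extreme intensity,
AL022025 rounds to 35 kt, one 5-kt step below the 40 kt best track,
AL082025 rounds to 100 kt, a notable 20 kt underestimate.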
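The rounding of the spot-check estimates to the nearest 5 kt can be reproduced with a short sketch. The values are taken from the table above; the function name is hypothetical.

```python
def round_to_5kt(v_kt):
    """Round an intensity in knots to the nearest multiple of 5.

    Python's round() resolves exact ties to the even integer, which does
    not affect the three values used below.
    """
    return int(round(v_kt / 5.0)) * 5


# (storm ID, best-track Vmax in kt, TC Evolution estimate in kt)
SPOT_CHECKS = [
    ("AL132025", 155, 151.5),
    ("AL082025", 120, 100.0),
    ("AL022025", 40, 37.1),
]

for storm_id, best_kt, estimate_kt in SPOT_CHECKS:
    rounded = round_to_5kt(estimate_kt)
    print(storm_id, rounded, best_kt - rounded)
# AL132025 150 5
# AL082025 100 20
# AL022025 35 5
```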

Comparison with ADT in the same spot checks

In the same informal comparison set, ADT values provided during testing indicated:

TC Evolution outperformed ADT on the extreme-intensity case (AL132025),
ADT outperformed TC Evolution on the major-hurricane case (AL082025),
both methods were close to the best track on the weak-storm case (AL022025).

This was treated as a useful sanity check rather than a formal skill ranking.

Strengths

Observed and intended strengths of Beta 1.6 include:

ability to incorporate both structure and environment,
explicit representation of storm evolution,
physically meaningful use of recent intensity history,
capacity to represent very intense hurricanes, in at least some cases without obvious collapse toward the mean,
operational compatibility with ATCF, GOES, GFS, and OISST inputs.

Known limitations

Current limitations include:

performance remains case-dependent, especially in some major-hurricane regimes,
operational inputs can differ from training conditions, because GOES imagery used at inference time is not identical to the historical satellite training distribution,
the system is only as good as the storm-centered data extraction and upstream file availability,
formal multi-season operational benchmarking is still required,
the model should not be treated as a standalone official operational decision system without additional validation.

Intended use

TC Evolution is intended for:

experimental tropical cyclone intensity analysis,
internal benchmarking against satellite-based techniques,
case-study review,
and research-oriented operational prototyping.

It is not intended to replace official agency products without broader validation.

Naming

The name TC Evolution reflects the core concept behind the model:

tropical cyclone structure evolves in time,
environment influences that evolution,
and intensity is estimated as an evolving state rather than treated as a static image-classification problem.