Tropical Cyclone EVOlution Model: Difference between revisions

From Continental Storm Service
No edit summary
No edit summary
 
(One intermediate revision by the same user not shown)
Line 1: Line 1:
{{DISPLAYTITLE:TC Evolution}}


'''TC Evolution''' is a multimodal tropical cyclone intensity estimation model designed to analyze the '''current maximum sustained wind''' (V<sub>max</sub>) of a tropical cyclone from a short time sequence of environmental and satellite data. The system began as a '''2025 prototype''' and is currently documented here as '''Beta 1.6'''.


Unlike a purely image-based estimator, TC Evolution combines:
'''TC Evolution''' is an experimental tropical cyclone current-intensity estimation model developed in 2025 and currently described as '''Beta 1.6'''. It is designed to estimate a storm's present maximum sustained wind (V<sub>max</sub>) from a short sequence of storm-centred environmental and satellite inputs rather than produce a long-range forecast.
* environmental fields from numerical weather prediction data,
* satellite structure from infrared and water vapor imagery,
* surface context from sea-surface temperature and land masking,
* and recent storm history from track-based scalar features.


Its design goal is to estimate ''current intensity'' rather than to serve as a long-range forecast model.
The system combines gridded environmental data, storm-centred infrared and water-vapour satellite imagery, sea-surface temperature, land masking, and recent storm-history scalars into a single neural architecture. In its Beta 1.6 form, TC Evolution uses a multimodal sequence model with separate encoders for full-storm satellite structure, zoomed inner-core satellite structure, and environmental fields, followed by a temporal [[Transformer (deep learning architecture)|Transformer]] encoder.
 
== Overview ==
TC Evolution is a sequence model that ingests a short history of storm-centered gridded data and predicts the present tropical cyclone intensity in knots. The operational Beta 1.6 implementation uses a '''16-channel input''' over multiple 6-hour frames and a neural architecture built around:
* 2D residual encoders for full-disk satellite structure,
* 2D residual encoders for zoomed inner-core satellite structure,
* a 2D residual encoder for environmental and surface fields,
* a temporal Transformer encoder,
* and scalar storm-history features.
 
The final predicted intensity is generated as a '''delta update''' relative to the prior 6-hour intensity:
<math>\hat{V}_{t} = V_{t-6} + \Delta V</math>
 
This delta formulation was adopted to encourage the model to learn storm evolution rather than regress directly toward the climatological mean.


== History ==
== History ==
=== 2025 prototype ===
TC Evolution began as a '''2025 prototype''' focused on tropical cyclone current-intensity analysis from storm-centred gridded data. The earliest versions were built to answer a practical question: whether a neural model could infer present intensity from recent storm evolution, rather than rely on a single image or on long-range forecast logic.
TC Evolution began in 2025 as a prototype intensity-analysis system focused on storm-centered gridded inputs and synoptic best-track intensity targets. Early development emphasized:
* current-intensity estimation rather than long-lead forecasting,
* storm-relative spatial crops,
* storm-wise train/validation/test separation,
* and explicit use of recent intensity history.


=== Transition to multimodal beta design ===
The project later evolved into a two-stage training workflow:
The prototype evolved into a two-stage beta design:


# '''Stage A: Satellite-only pretraining'''
=== Stage A ===
#* Satellite-only pretraining on long-term GridSat data spanning '''1998–2024'''.
The first major training stage used a satellite-only pretraining approach based on long-term storm-centred IR and WV imagery from '''1998 to 2024'''. This stage was intended to teach the model broad tropical-cyclone structural recognition before environmental data were added.
#* Objective: teach the model to recognize tropical cyclone structure from infrared and water vapor imagery over a long historical span.


# '''Stage B: Multimodal fine-tuning'''
=== Stage B ===
#* Fine-tuning on a multimodal dataset spanning '''2015–2024'''.
The second major stage fine-tuned the model on a multimodal dataset spanning '''2015 to 2024'''. This stage added environmental wind and height fields, sea-surface temperature, land masking, and scalar storm-history features. The resulting architecture became the basis of the operational Beta line.
#* Added environmental reanalysis/forecast-style fields, sea-surface temperature, land masking, and storm-history scalars.


=== Beta 1.6 ===
=== Beta 1.6 ===
'''Beta 1.6''' refers to the operationalized multimodal current-intensity system described on this page. The Beta 1.6 line includes:
'''Beta 1.6''' refers to the operational inference implementation using the Stage B multimodal architecture. In this version, the model is driven by:
* fixed 16-channel storm-centered frame construction,
* storm-centred GFS environmental fields,
* ATCF-based operational track ingestion,
* storm-centred GOES infrared and water-vapour imagery,
* GOES file selection and storm-centered satellite extraction,
* OISST sea-surface temperature,
* GFS environmental field gathering,
* land masking,
* OISST sea-surface temperature retrieval,
* and ATCF storm-history data for recent motion and prior intensity.
* and checkpoint-based inference from the multimodal final model.


== Design principles ==
== Operation ==
TC Evolution was developed around several principles:
TC Evolution is designed for '''current intensity analysis'''. It does not operate as a traditional long-range forecasting model. Instead, it takes a short sequence of recent storm-centred frames, processes their structure and environment, and predicts the storm's present intensity.


=== 1. Current intensity, not long-range forecasting ===
In operational use, the system:
The model is intended to estimate the current V<sub>max</sub> from recent storm evolution and surrounding environmental structure. It is not primarily a long-range track or intensity forecast system.
* reads storm history from ATCF best-track or operational b-deck style data,
* gathers recent environmental fields from GFS,
* gathers recent satellite imagery from GOES,
* gathers sea-surface temperature data,
* builds storm-centred grids for several recent synoptic times,
* normalises the data using checkpoint statistics,
* and estimates the current V<sub>max</sub> in knots.


=== 2. Storm evolution matters ===
== Principles ==
The model uses:
TC Evolution was built around several design principles.
* multiple recent synoptic frames,
* prior intensity history,
* intensity change over the last 6–12 hours,
* and storm motion.


This was meant to better represent how tropical cyclones intensify or weaken over time.
=== Evolution over snapshot analysis ===
The model is intended to represent the idea that tropical cyclone intensity is not purely a visual snapshot problem. Instead, intensity depends on how the storm has been evolving over recent hours. For that reason, the model uses:
* previous intensity,
* recent intensity tendency,
* storm motion,
* and a sequence of recent frames rather than a single time step.


=== 3. Inner-core structure matters ===
=== Inner-core structure matters ===
The model uses both:
The system uses both a full storm-centred satellite crop and a zoomed inner-core crop. This was intended to let the model separate broad-scale storm organisation from features such as eye definition, eyewall structure, and core symmetry.
* a full storm-centered satellite crop,
* and a zoomed inner-core crop.


This reflects the importance of eye structure, ring symmetry, convective organization, and core compactness in intensity estimation.
=== Environment matters ===
The model does not treat the cyclone as an isolated image. Environmental wind fields, low- and upper-level flow, shear-related diagnostics, SST, and land interaction are included because storm intensity is strongly linked to environmental context.


=== 4. Environment still matters ===
=== Operational practicality ===
Major intensity changes are not controlled by cloud structure alone. Environmental wind fields, shear-related diagnostics, SST, and land interaction were included to reduce purely visual bias.
The inference system was designed around data sources that can be retrieved operationally, especially for storms not yet present in finalized archival best-track files.


=== 5. Operational practicality ===
== Architecture ==
The final implementation was built to run from operational data streams using:
The Beta 1.6 implementation uses a neural network named '''TropicalCurrentIntensityNet'''.
* ATCF best-track or operational b-deck style storm position data,
* GOES satellite imagery,
* GFS gridded fields,
* and daily OISST.


== Model architecture ==
=== Input format ===
=== Input structure ===
The model ingests a storm-centred sequence with shape:
The model ingests a sequence of storm-centered frames with shape:


<pre>
<pre>
Line 92: Line 65:
</pre>
</pre>


For Beta 1.6, the input uses '''16 channels''' on a '''41 × 41''' grid.
In Beta 1.6, the input uses:
* '''16 channels'''
* '''41 × 41''' spatial grids
* a short sequence of recent 6-hourly frames
 
=== Channel structure ===
The 16 channels are arranged as follows:


=== Channel layout ===
{| class="wikitable"
{| class="wikitable"
! Channel group !! Channels !! Description
! Channel group !! Channels !! Description
Line 100: Line 78:
| Environmental fields || 10 || u10, v10, u850, v850, u200, v200, gh850, rh850, shear200_850, shear850_sfc
| Environmental fields || 10 || u10, v10, u850, v850, u200, v200, gh850, rh850, shear200_850, shear850_sfc
|-
|-
| Satellite full field || 2 || infrared (IR), water vapor (WV)
| Satellite full field || 2 || infrared (IR), water vapour (WV)
|-
|-
| Satellite zoom field || 2 || zoomed infrared (IR zoom), zoomed water vapor (WV zoom)
| Satellite zoom field || 2 || zoomed infrared, zoomed water vapour
|-
|-
| Surface context || 2 || sea-surface temperature (SST), land mask
| Surface context || 2 || sea-surface temperature (SST), land mask
|}
|}


=== Scalar feature vector ===
=== Scalar features ===
A separate scalar feature vector is used alongside the gridded sequence. In Beta 1.6 it includes:
In addition to the gridded input, TC Evolution uses a separate scalar feature vector. In Beta 1.6, this includes:
* latitude,
* latitude,
* longitude encoded as sine/cosine,
* longitude encoded as sine and cosine,
* month encoded as sine/cosine,
* month encoded as sine and cosine,
* basin one-hot encoding,
* basin one-hot encoding,
* zonal and meridional motion estimates,
* zonal and meridional storm motion,
* V<sub>max</sub> at t-6 h,
* V<sub>max</sub> at t−6 h,
* V<sub>max</sub> at t-12 h,
* V<sub>max</sub> at t−12 h,
* 6-hour intensity tendency,
* 6-hour intensity change,
* missingness flags for previous intensities.
* and availability flags for recent intensities.


=== Neural structure ===
=== Branch encoders ===
The model is implemented as '''TropicalCurrentIntensityNet'''.
The model uses three separate 2D residual backbones:


==== Satellite branches ====
==== Full satellite encoder ====
Two separate residual 2D backbones are used:
The '''full_sat_enc''' branch processes the two-channel full storm satellite input.
* '''full_sat_enc''' for full storm structure,
* '''zoom_sat_enc''' for inner-core structure.


Each branch is a small residual CNN using:
==== Zoom satellite encoder ====
* convolution,
The '''zoom_sat_enc''' branch processes the two-channel inner-core zoom satellite input.
* GroupNorm,
* GELU activation,
* residual blocks,
* progressive downsampling,
* adaptive global pooling.


==== Environmental branch ====
==== Environmental encoder ====
A third residual 2D backbone, '''env_enc''', processes the 12-channel non-satellite input composed of:
The '''env_enc''' branch processes the remaining twelve channels:
* 10 environmental channels,
* ten environmental fields,
* SST,
* SST,
* land mask.
* and land mask.


==== Frame fusion ====
Each branch is built from:
For each time step:
* convolution layers,
* the full satellite branch embedding,
* [[Group normalization|GroupNorm]],
* the zoom satellite branch embedding,
* GELU activation,
* and the environmental branch embedding
* residual 2D blocks,
* progressive downsampling,
* and adaptive average pooling.


are concatenated and projected into a shared hidden representation.
=== Temporal encoder ===
 
For each time step, the outputs of the three branches are concatenated and projected into a shared hidden representation. These per-frame embeddings are then fed into a temporal Transformer encoder with:
==== Temporal modeling ====
The time sequence is modeled with a Transformer encoder using:
* a learned class token,
* a learned class token,
* learned temporal positional embeddings,
* learned temporal positional embeddings,
* hidden dimension 384,
* hidden dimension 384,
* 8 attention heads,
* 8 attention heads,
* 4 Transformer layers.
* and 4 Transformer layers.


The Transformer summarizes the evolution of the storm over the input sequence.
The temporal encoder is intended to summarize short-term storm evolution rather than a single static frame.


==== Scalar fusion ====
=== Scalar fusion and output ===
Scalar storm-history features are passed through a small multilayer scalar network, then fused with the temporal representation.
The scalar feature vector is processed by a small scalar network and fused with the Transformer summary. The fused representation is then used to produce the final intensity estimate.


==== Output ====
The Beta 1.6 operational implementation predicts intensity through a delta formulation:
The Beta 1.6 operational inference model predicts only the intensity delta:
<math>\Delta V</math>


The final intensity estimate is:
<math>\hat{V}_{t} = V_{t-6} + \Delta V</math>
<math>\hat{V}_{t} = V_{t-6} + \Delta V</math>
This means the model predicts the present intensity as the previous 6-hour intensity plus a learned adjustment.


== Data ==
== Data ==
=== Stage A pretraining data ===
=== Stage A training data ===
Satellite-only pretraining used a GridSat-based dataset spanning '''1998–2024'''. The satellite channels used for this phase were:
The satellite-only pretraining stage used storm-centred data from '''1998–2024''', consisting of:
* IR
* infrared imagery,
* WV
* water-vapour imagery,
* IR zoom
* zoomed infrared imagery,
* WV zoom
* and zoomed water-vapour imagery.


The purpose of Stage A was to expose the satellite encoders to a much longer historical archive than was available from the multimodal period.
This stage was intended to provide long-horizon structural pretraining for tropical cyclone appearance.


=== Stage B fine-tuning data ===
=== Stage B training data ===
The multimodal fine-tuning phase used data spanning '''2015–2024''', including:
The multimodal fine-tuning stage used storm-centred data from '''2015–2024'''. These data included:
* storm-centered GFS-derived environmental fields,
* GFS-derived environmental fields,
* storm-centered satellite imagery,
* satellite IR and WV imagery,
* sea-surface temperature,
* zoomed satellite imagery,
* land masking,
* SST,
* land mask,
* and best-track intensity labels.
* and best-track intensity labels.


=== Label source ===
=== Label source ===
Training labels were derived from HURDAT2-style best-track records at synoptic times. Operational 2025 testing used ATCF/b-deck style best-track information because full HURDAT updates for those cases were not the primary operational source during testing.
Training targets were based on synoptic tropical cyclone intensity labels. For operational 2025-style testing and inference, ATCF best-track or b-deck style records were used when final archival products were not the intended operational source.


=== Spatial setup ===
=== Spatial setup ===
The model uses storm-centered grids with:
The storm-centred grid configuration used in Beta 1.6 is:
* output resolution: '''0.25°'''
* outer box size: '''10.0°'''
* inner zoom size: '''4.0°'''
* final grid size: '''41 × 41'''


=== Temporal setup ===
{| class="wikitable"
The model uses a short synoptic sequence at '''6-hour spacing'''. The fine-tuned model checkpoint stores the effective sequence length used in training and inference.
! Parameter !! Value
|-
| Output resolution || 0.25°
|-
| Outer box size || 10.0°
|-
| Inner zoom size || 4.0°
|-
| Final grid size || 41 × 41
|}


== Training approach ==
== Training ==
=== Stage A ===
=== Stage A results ===
Satellite-only pretraining emphasized structural recognition across a long historical archive. Development logs shared during testing reported the following Stage A test results:
The satellite-only stage was treated mainly as a representation-learning stage rather than the final operational model. Development logs reported the following Stage A test metrics:


{| class="wikitable"
{| class="wikitable"
Line 216: Line 192:
|}
|}


These Stage A results were viewed primarily as a representation-learning step rather than the final operational benchmark.
=== Stage B results ===
 
The multimodal fine-tuning stage improved performance and became the basis of the operational model. Development logs reported the following final Stage B test metrics:
=== Stage B ===
Stage B added multimodal inputs and reused pretrained satellite weights where compatible. The fine-tuning phase used:
* storm-wise stratified splitting,
* balanced sampling across intensity bins,
* lower learning rate for imported satellite parameters,
* temporary freezing of satellite encoders in early epochs,
* and a weighted score emphasizing strong-storm performance.
 
Development logs shared during testing reported the following final Stage B test results:


{| class="wikitable"
{| class="wikitable"
Line 246: Line 213:
|}
|}


These results showed an improvement over the satellite-only stage, especially for stronger storms.
These results were interpreted as an improvement over the satellite-only stage, especially in the stronger-storm regime.
 
== Operational inference pipeline ==
Beta 1.6 operational inference performs the following steps:
 
# Read storm position/intensity history from ATCF-style b-deck data.
# Build a sequence of recent storm-centered frames at 6-hour intervals.
# Gather the environmental fields from GFS.
# Gather GOES IR and WV imagery.
# Create both full-field and zoomed satellite crops.
# Gather OISST and create a land mask.
# Normalize inputs using statistics saved in the trained checkpoint.
# Predict the current V<sub>max</sub>.
 
The operational script is designed to estimate intensity for cases where the storm history is available from ATCF and the environmental and satellite files can be fetched.


== Informal 2025 spot-check performance ==
== Performance ==
A small set of 2025 storm spot checks was shared during development. These are not a full validation study, but they are useful operational examples.
=== Informal 2025 operational spot checks ===
During 2025-style operational testing, several storm cases were manually compared against best-track values and the [[Advanced Dvorak technique|Advanced Dvorak Technique]] (ADT).


{| class="wikitable"
{| class="wikitable"
! Storm ID !! Timestamp (UTC) !! Best-track Vmax !! TC Evolution !! Absolute error
! Storm ID !! Time (UTC) !! Best-track V<sub>max</sub> !! TC Evolution !! Absolute error
|-
|-
| AL132025 || 2025-10-28 12:00 || 155 kt || 151.5 kt || 3.5 kt
| AL132025 || 2025-10-28 12:00 || 155 kt || 151.5 kt || 3.5 kt
Line 275: Line 229:
|}
|}


Rounded to standard 5-kt operational bins:
In this small informal sample:
* AL132025 was effectively a near-hit at extreme intensity,
* the model performed very well on one extreme-intensity case,
* AL022025 was effectively correct at weak-storm intensity,
* underestimated one major hurricane case,
* AL082025 showed a notable underestimation.
* and performed very well on one weak-storm case.


=== Comparison with ADT in the same spot checks ===
=== ADT comparison ===
In the same informal comparison set, ADT values provided during testing indicated:
The same spot-check sample was also compared against ADT values provided during testing. In that limited comparison:
* TC Evolution outperformed ADT on one extreme-intensity case,
* TC Evolution outperformed ADT on one extreme-intensity case,
* ADT outperformed TC Evolution on one mid-major hurricane case,
* ADT outperformed TC Evolution on one mid-major hurricane case,
* both methods were effectively correct on one weak-storm case.
* and both were effectively correct on one weak-storm case.


This was treated as a useful sanity check rather than a formal skill ranking.
This comparison was informal and not presented as a formal full-season skill study.
 
== Operational implementation ==
The Beta 1.6 inference code includes:
* safe PyTorch checkpoint loading,
* ATCF storm-history retrieval,
* GFS file selection and download,
* GOES file selection and processing,
* OISST retrieval and interpolation,
* storm-centred frame construction,
* and final model inference from a saved multimodal checkpoint.
 
The operational implementation normalises gridded and scalar features using statistics stored in the model checkpoint.


== Strengths ==
== Strengths ==
Observed and intended strengths of Beta 1.6 include:
Observed and intended strengths of TC Evolution include:
* ability to incorporate both structure and environment,
* explicit modelling of storm evolution rather than single-frame regression,
* explicit representation of storm evolution,
* separate treatment of full-storm and inner-core satellite structure,
* physically meaningful use of recent intensity history,
* integration of environmental context,
* capacity to represent very intense hurricanes without obvious mean-collapse in all cases,
* direct use of previous intensity history,
* operational compatibility with ATCF, GOES, GFS, and OISST inputs.
* and demonstrated ability to represent very intense hurricanes in at least some operational spot checks.


== Known limitations ==
== Limitations ==
Current limitations include:
Known limitations of Beta 1.6 include:
* performance remains case-dependent, especially in some major-hurricane regimes,
* case-dependent errors in some major-hurricane situations,
* operational behavior can differ from training conditions because GOES-era inference is not identical to the historical satellite training distribution,
* limited formal operational verification relative to established methods,
* the system is only as good as the storm-centered data extraction and upstream file availability,
* dependence on upstream file availability and storm-centred data extraction quality,
* formal multi-season operational benchmarking is still required,
* potential data-distribution differences between training imagery and operational imagery,
* the model should not be treated as a standalone official operational decision system without additional validation.
* and the fact that the system remains a beta research model rather than an official operational standard.


== Naming ==
The name '''TC Evolution''' reflects the central idea of the model: tropical cyclone intensity is treated as an evolving state shaped by recent storm history, internal structure, and environmental conditions.


== Intended use ==
== Intended use ==
TC Evolution is intended for:
TC Evolution is intended for:
* experimental tropical cyclone intensity analysis,
* experimental tropical cyclone current-intensity analysis,
* internal benchmarking against satellite-based techniques,
* case-study review,
* case-study review,
* and research-oriented operational prototyping.
* research prototyping,
 
* and internal benchmarking against satellite-based techniques.
It is not intended to replace official agency products without broader validation.
 
== Naming ==
The name '''TC Evolution''' reflects the core concept behind the model:
* tropical cyclone structure evolves in time,
* environment influences that evolution,
* and intensity is estimated as an evolving state rather than a static image classification problem.


== See also ==
It is not intended to replace official agency products without broader validation and full operational testing.
* Tropical cyclone intensity estimation
* Advanced Dvorak Technique (ADT)
* HURDAT2
* ATCF
* GOES
* GFS
* OISST

Latest revision as of 11:58, 15 March 2026


TC Evolution is an experimental tropical cyclone current-intensity estimation model developed in 2025 and currently described as Beta 1.6. It is designed to estimate a storm's present maximum sustained wind (Vmax) from a short sequence of storm-centred environmental and satellite inputs rather than produce a long-range forecast.

The system combines gridded environmental data, storm-centred infrared and water-vapour satellite imagery, sea-surface temperature, land masking, and recent storm-history scalars into a single neural architecture. In its Beta 1.6 form, TC Evolution uses a multimodal sequence model with separate encoders for full-storm satellite structure, zoomed inner-core satellite structure, and environmental fields, followed by a temporal Transformer encoder.

History

TC Evolution began as a 2025 prototype focused on tropical cyclone current-intensity analysis from storm-centred gridded data. The earliest versions were built to answer a practical question: whether a neural model could infer present intensity from recent storm evolution, rather than rely on a single image or on long-range forecast logic.

The project later evolved into a two-stage training workflow:

Stage A

The first major training stage used a satellite-only pretraining approach based on long-term storm-centred IR and WV imagery from 1998 to 2024. This stage was intended to teach the model broad tropical-cyclone structural recognition before environmental data were added.

Stage B

The second major stage fine-tuned the model on a multimodal dataset spanning 2015 to 2024. This stage added environmental wind and height fields, sea-surface temperature, land masking, and scalar storm-history features. The resulting architecture became the basis of the operational Beta line.

Beta 1.6

Beta 1.6 refers to the operational inference implementation using the Stage B multimodal architecture. In this version, the model is driven by:

  • storm-centred GFS environmental fields,
  • storm-centred GOES infrared and water-vapour imagery,
  • OISST sea-surface temperature,
  • land masking,
  • and ATCF storm-history data for recent motion and prior intensity.

Operation

TC Evolution is designed for current intensity analysis. It does not operate as a traditional long-range forecasting model. Instead, it takes a short sequence of recent storm-centred frames, processes their structure and environment, and predicts the storm's present intensity.

In operational use, the system:

  • reads storm history from ATCF best-track or operational b-deck style data,
  • gathers recent environmental fields from GFS,
  • gathers recent satellite imagery from GOES,
  • gathers sea-surface temperature data,
  • builds storm-centred grids for several recent synoptic times,
  • normalises the data using checkpoint statistics,
  • and estimates the current Vmax in knots.
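
The frame-building step in the list above depends on choosing the recent synoptic times. The following is a minimal sketch of that selection, assuming the 6-hour spacing described on this page. The function name `recent_synoptic_times` and the default of four frames are illustrative assumptions; the sequence length actually used by Beta 1.6 is not stated here.

```python
from datetime import datetime, timedelta, timezone

SYNOPTIC_STEP_HOURS = 6  # 6-hourly frames, as described on this page


def recent_synoptic_times(now, n_frames=4):
    """Return the n_frames most recent synoptic times (00/06/12/18 UTC), oldest first.

    n_frames=4 is a placeholder assumption, not the Beta 1.6 value.
    """
    # Floor the timestamp to the latest synoptic hour at or before `now`.
    floored_hour = (now.hour // SYNOPTIC_STEP_HOURS) * SYNOPTIC_STEP_HOURS
    latest = now.replace(hour=floored_hour, minute=0, second=0, microsecond=0)
    step = timedelta(hours=SYNOPTIC_STEP_HOURS)
    # Walk backwards from the latest synoptic time, then return oldest first.
    return [latest - step * k for k in range(n_frames - 1, -1, -1)]


# Hypothetical request time; any timezone-aware UTC datetime works.
times = recent_synoptic_times(datetime(2025, 10, 28, 13, 40, tzinfo=timezone.utc))
print([t.strftime("%Y-%m-%d %H:%M") for t in times])
```

Each returned time would then be used to look up the matching GFS, GOES, and SST inputs before the storm-centred grids are built.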

Principles

TC Evolution was built around several design principles.

Evolution over snapshot analysis

The model is intended to represent the idea that tropical cyclone intensity is not purely a visual snapshot problem. Instead, intensity depends on how the storm has been evolving over recent hours. For that reason, the model uses:

  • previous intensity,
  • recent intensity tendency,
  • storm motion,
  • and a sequence of recent frames rather than a single time step.

Inner-core structure matters

The system uses both a full storm-centred satellite crop and a zoomed inner-core crop. This was intended to let the model separate broad-scale storm organisation from features such as eye definition, eyewall structure, and core symmetry.

Environment matters

The model does not treat the cyclone as an isolated image. Environmental wind fields, low- and upper-level flow, shear-related diagnostics, SST, and land interaction are included because storm intensity is strongly linked to environmental context.

Operational practicality

The inference system was designed around data sources that can be retrieved operationally, especially for storms not yet present in finalized archival best-track files.

Architecture

The Beta 1.6 implementation uses a neural network named TropicalCurrentIntensityNet.

Input format

The model ingests a storm-centred sequence with shape:

[channels, time, height, width]

In Beta 1.6, the input uses:

  • 16 channels
  • 41 × 41 spatial grids
  • a short sequence of recent 6-hourly frames

Channel structure

The 16 channels are arranged as follows:

{| class="wikitable"
! Channel group !! Channels !! Description
|-
| Environmental fields || 10 || u10, v10, u850, v850, u200, v200, gh850, rh850, shear200_850, shear850_sfc
|-
| Satellite full field || 2 || infrared (IR), water vapour (WV)
|-
| Satellite zoom field || 2 || zoomed infrared, zoomed water vapour
|-
| Surface context || 2 || sea-surface temperature (SST), land mask
|}

Scalar features

In addition to the gridded input, TC Evolution uses a separate scalar feature vector. In Beta 1.6, this includes:

  • latitude,
  • longitude encoded as sine and cosine,
  • month encoded as sine and cosine,
  • basin one-hot encoding,
  • zonal and meridional storm motion,
  • Vmax at t−6 h,
  • Vmax at t−12 h,
  • 6-hour intensity change,
  • and availability flags for recent intensities.
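
As an illustration of how such a scalar vector could be assembled, the sketch below encodes longitude and month cyclically and adds availability flags. Everything specific in it is an assumption made for the example: the function name `build_scalar_features`, the basin list, the feature order, the use of zero as a fill value for missing intensities, and the reading of "6-hour intensity change" as V(t−6 h) minus V(t−12 h).

```python
import math

BASINS = ["AL", "EP", "CP", "WP", "IO", "SH"]  # hypothetical basin list


def build_scalar_features(lat, lon, month, basin, u_motion, v_motion,
                          vmax_t6=None, vmax_t12=None):
    """Assemble an illustrative scalar feature vector (order is an assumption)."""
    # Cyclic encodings keep 359 deg close to 0 deg and December close to January.
    lon_rad = math.radians(lon)
    month_angle = 2.0 * math.pi * (month - 1) / 12.0

    basin_one_hot = [1.0 if b == basin else 0.0 for b in BASINS]

    has_t6 = vmax_t6 is not None
    has_t12 = vmax_t12 is not None
    v6 = float(vmax_t6) if has_t6 else 0.0    # 0.0 is a placeholder fill value
    v12 = float(vmax_t12) if has_t12 else 0.0
    # The change is only defined when both prior intensities are available.
    dv = v6 - v12 if (has_t6 and has_t12) else 0.0

    return [
        lat,
        math.sin(lon_rad), math.cos(lon_rad),
        math.sin(month_angle), math.cos(month_angle),
        *basin_one_hot,
        u_motion, v_motion,
        v6, v12, dv,
        float(has_t6), float(has_t12),
    ]


# Hypothetical storm: 25.0 N, 75.0 W, January, Atlantic basin.
features = build_scalar_features(25.0, -75.0, 1, "AL", -3.0, 2.0,
                                 vmax_t6=100.0, vmax_t12=90.0)
print(len(features))
```

The availability flags let a downstream network distinguish a genuinely missing prior intensity from a filled-in placeholder value.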

Branch encoders

The model uses three separate 2D residual backbones:

Full satellite encoder

The full_sat_enc branch processes the two-channel full storm satellite input.

Zoom satellite encoder

The zoom_sat_enc branch processes the two-channel inner-core zoom satellite input.

Environmental encoder

The env_enc branch processes the remaining twelve channels:

  • ten environmental fields,
  • SST,
  • and land mask.

Each branch is built from:

  • convolution layers,
  • GroupNorm,
  • GELU activation,
  • residual 2D blocks,
  • progressive downsampling,
  • and adaptive average pooling.
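
The routing of the 16 input channels to the three branches can be sketched as follows. This is only an illustration: the channel names come from the channel table, while their order within the stack and the names `ALL_CHANNELS` and `branch_indices` are assumptions.

```python
# Channel names follow the channel table on this page; the ordering within
# the 16-channel stack is an assumption made for this sketch.
ENV_CHANNELS = ["u10", "v10", "u850", "v850", "u200", "v200",
                "gh850", "rh850", "shear200_850", "shear850_sfc"]
SAT_FULL_CHANNELS = ["ir", "wv"]
SAT_ZOOM_CHANNELS = ["ir_zoom", "wv_zoom"]
SURFACE_CHANNELS = ["sst", "land_mask"]

ALL_CHANNELS = (ENV_CHANNELS + SAT_FULL_CHANNELS
                + SAT_ZOOM_CHANNELS + SURFACE_CHANNELS)


def branch_indices(all_channels):
    """Map each encoder branch to the channel indices it would receive."""
    index = {name: i for i, name in enumerate(all_channels)}
    return {
        "full_sat_enc": [index[c] for c in SAT_FULL_CHANNELS],
        "zoom_sat_enc": [index[c] for c in SAT_ZOOM_CHANNELS],
        # The environmental branch takes the ten environmental fields
        # plus SST and the land mask, twelve channels in total.
        "env_enc": [index[c] for c in ENV_CHANNELS + SURFACE_CHANNELS],
    }


groups = branch_indices(ALL_CHANNELS)
print(len(ALL_CHANNELS), {name: len(idx) for name, idx in groups.items()})
```

Selecting channels by name rather than by hard-coded position keeps the split correct if the stacking order differs from the one assumed here.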

Temporal encoder

For each time step, the outputs of the three branches are concatenated and projected into a shared hidden representation. These per-frame embeddings are then fed into a temporal Transformer encoder with:

  • a learned class token,
  • learned temporal positional embeddings,
  • hidden dimension 384,
  • 8 attention heads,
  • and 4 Transformer layers.

The temporal encoder is intended to summarize short-term storm evolution rather than a single static frame.
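
The hyperparameters above imply a per-head width and a token count that can be checked with a small configuration sketch. The class name `TemporalEncoderConfig` and the default `seq_len` of four frames are assumptions for illustration; the hidden dimension, head count, and layer count are the values listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalEncoderConfig:
    hidden_dim: int = 384  # from this page
    num_heads: int = 8     # from this page
    num_layers: int = 4    # from this page
    seq_len: int = 4       # placeholder assumption; not stated on this page

    @property
    def head_dim(self):
        # Standard multi-head attention splits the hidden dimension evenly.
        if self.hidden_dim % self.num_heads != 0:
            raise ValueError("hidden_dim must be divisible by num_heads")
        return self.hidden_dim // self.num_heads

    @property
    def num_tokens(self):
        # One token per frame plus the learned class token.
        return self.seq_len + 1


cfg = TemporalEncoderConfig()
print(cfg.head_dim, cfg.num_tokens)
```

With these values each attention head works on 48 features, and the learned positional embeddings would need one entry per token, including the class token.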

Scalar fusion and output

The scalar feature vector is processed by a small scalar network and fused with the Transformer summary. The fused representation is then used to produce the final intensity estimate.

The Beta 1.6 operational implementation predicts intensity through a delta formulation:

<math>\hat{V}_{t} = V_{t-6} + \Delta V</math>

This means the model predicts the present intensity as the previous 6-hour intensity plus a learned adjustment.
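
A minimal sketch of the delta formulation follows. The function names and the numeric inputs are hypothetical, and the optional rounding to 5-kt bins is shown only because best-track intensities are reported in 5-kt steps; it is not claimed to be part of the model itself.

```python
def estimate_current_vmax(vmax_prev_kt, delta_kt):
    """Apply the delta formulation: V_hat(t) = V(t-6) + delta V."""
    return vmax_prev_kt + delta_kt


def round_to_operational_bin(vmax_kt, bin_kt=5):
    """Round to the nearest bin_kt knots, with exact halves rounded up."""
    return int(bin_kt * ((vmax_kt + bin_kt / 2) // bin_kt))


# Hypothetical values: a 100 kt storm six hours ago and a predicted
# weakening of 7.5 kt.
v_hat = estimate_current_vmax(100.0, -7.5)
print(v_hat, round_to_operational_bin(v_hat))
```

Because the network only has to output the adjustment, a prediction of zero reproduces persistence of the previous intensity.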

Stage A training data

The satellite-only pretraining stage used storm-centred data from 1998–2024, consisting of:

  • infrared imagery,
  • water-vapour imagery,
  • zoomed infrared imagery,
  • and zoomed water-vapour imagery.

This stage was intended to pretrain the model on tropical cyclone structure and appearance using a long historical satellite record.

Stage B training data

The multimodal fine-tuning stage used storm-centred data from 2015–2024. These data included:

  • GFS-derived environmental fields,
  • satellite IR and WV imagery,
  • zoomed satellite imagery,
  • SST,
  • land mask,
  • and best-track intensity labels.

Label source

Training targets were based on synoptic tropical cyclone intensity labels. For 2025-style operational testing and inference, ATCF best-track or b-deck style records were used in cases where final archival products were not the intended operational source.

Spatial setup

The storm-centred grid configuration used in Beta 1.6 is:

Parameter | Value
Output resolution | 0.25°
Outer box size | 10.0°
Inner zoom size | 4.0°
Final grid size | 41 × 41
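The grid size follows from the box size and resolution: a 10.0° box sampled every 0.25° has 40 intervals and therefore 41 points per side. The sketch below checks that arithmetic. The second call assumes the 4.0° zoom box is resampled onto the same 41 × 41 grid (0.1° spacing), which the table suggests but does not state; the helper names are hypothetical.

```python
from typing import List


def grid_points(box_deg: float, res_deg: float) -> int:
    """Points per side for a box sampled at a fixed spacing, endpoints included."""
    return int(round(box_deg / res_deg)) + 1


def axis_offsets(box_deg: float, n_points: int) -> List[float]:
    """Offsets in degrees from the storm centre along one axis."""
    half = box_deg / 2.0
    step = box_deg / (n_points - 1)
    return [-half + i * step for i in range(n_points)]


n = grid_points(10.0, 0.25)
outer = axis_offsets(10.0, n)   # 0.25 degree spacing, from the table
zoom = axis_offsets(4.0, n)     # 0.1 degree spacing, an assumption
print(n, outer[0], outer[n // 2], outer[-1])  # 41 -5.0 0.0 5.0
```

An odd point count places the storm centre exactly on the middle grid point.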

Training

Stage A results

The satellite-only stage was treated mainly as a representation-learning stage rather than the final operational model. Development logs reported the following Stage A test metrics:

Metric | Value
Score | 8.620
RMSE | 5.73 kt
RMSE 96+ | 9.91 kt
RMSE 137+ | 13.91 kt

Stage B results

The multimodal fine-tuning stage improved performance and became the basis of the operational model. Development logs reported the following final Stage B test metrics:

Metric | Value
Score | 7.413
RMSE | 5.47 kt
RMSE 96+ | 7.24 kt
RMSE 137+ | 12.54 kt
RMSE 96–112 | 6.86 kt
RMSE 113–136 | 7.39 kt
RMSE 137+ bin | 12.54 kt

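These results were interpreted as an improvement over the satellite-only stage, especially for stronger storms: the reported overall RMSE fell from 5.73 kt to 5.47 kt, RMSE 96+ from 9.91 kt to 7.24 kt, and RMSE 137+ from 13.91 kt to 12.54 kt.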
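Intensity-stratified RMSE values of this kind can be computed as sketched below. The bin edges are inferred from the metric labels and assumed to be in knots and to apply to the best-track value; the composite "Score" metric is not defined in this section and is not reproduced. The sample data are synthetic.

```python
import math
from typing import Dict, List, Optional, Tuple

# Bin edges in knots, inferred from the metric labels; None means open-ended.
BINS: Dict[str, Tuple[float, Optional[float]]] = {
    "all": (0.0, None),
    "96+": (96.0, None),
    "137+": (137.0, None),
    "96-112": (96.0, 112.0),
    "113-136": (113.0, 136.0),
}


def rmse(errors: List[float]) -> Optional[float]:
    """Root-mean-square error, or None when the bin is empty."""
    if not errors:
        return None
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def stratified_rmse(truth_kt: List[float], pred_kt: List[float]) -> Dict[str, Optional[float]]:
    """RMSE per intensity bin, with cases binned by the true intensity."""
    scores = {}
    for name, (lo, hi) in BINS.items():
        errs = [p - t for t, p in zip(truth_kt, pred_kt)
                if t >= lo and (hi is None or t <= hi)]
        scores[name] = rmse(errs)
    return scores


# Synthetic example values, not taken from the development logs.
scores = stratified_rmse([40.0, 100.0, 120.0, 140.0], [43.0, 96.0, 125.0, 140.0])
print(scores["96-112"], scores["113-136"], scores["137+"])  # 4.0 5.0 0.0
```

Because the top bin is open-ended, a cumulative "137+" metric and a "137+ bin" metric select the same cases, which would explain the identical 12.54 kt values reported for Stage B.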

Performance

Informal 2025 operational spot checks

During 2025-style operational testing, several storm cases were manually compared against best-track values and the Advanced Dvorak Technique (ADT).

Storm ID | Time (UTC) | Best-track Vmax | TC Evolution estimate | Absolute error
AL132025 | 2025-10-28 12:00 | 155 kt | 151.5 kt | 3.5 kt
AL082025 | 2025-09-27 00:00 | 120 kt | 100.0 kt | 20.0 kt
AL022025 | 2025-06-29 18:00 | 40 kt | 37.1 kt | 2.9 kt

In this small informal sample:

  • the model was within 3.5 kt on one extreme-intensity case (155 kt),
  • underestimated one major hurricane case (120 kt) by 20 kt,
  • and was within 2.9 kt on one weak-storm case (40 kt).
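The error figures for these three cases can be reproduced directly from the table values. The sketch below also computes the mean absolute error and the mean signed error (bias) for the sample; with only three cases, these summary numbers are illustrative and not a skill estimate.

```python
# (storm ID, best-track Vmax in kt, TC Evolution estimate in kt), from the table
SPOT_CHECKS = [
    ("AL132025", 155.0, 151.5),
    ("AL082025", 120.0, 100.0),
    ("AL022025", 40.0, 37.1),
]

# Signed error: estimate minus best track (negative means underestimate).
errors = [pred - truth for _, truth, pred in SPOT_CHECKS]
abs_errors = [round(abs(e), 1) for e in errors]
mae = sum(abs(e) for e in errors) / len(errors)
bias = sum(errors) / len(errors)

print(abs_errors)                   # [3.5, 20.0, 2.9]
print(round(mae, 1), round(bias, 1))  # 8.8 -8.8
```

The bias equals the negative of the mean absolute error because all three estimates fall below the best-track value.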

ADT comparison

The same spot-check sample was also compared against ADT values provided during testing. In that limited comparison:

  • TC Evolution outperformed ADT on one extreme-intensity case,
  • ADT outperformed TC Evolution on the major-hurricane case,
  • and both were effectively correct on one weak-storm case.

This comparison was informal and not presented as a formal full-season skill study.

Operational implementation

The Beta 1.6 inference code includes:

  • safe PyTorch checkpoint loading,
  • ATCF storm-history retrieval,
  • GFS file selection and download,
  • GOES file selection and processing,
  • OISST retrieval and interpolation,
  • storm-centred frame construction,
  • and final model inference from a saved multimodal checkpoint.

The operational implementation normalises gridded and scalar features using statistics stored in the model checkpoint.
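A minimal sketch of this normalisation step is given below, assuming per-feature means and standard deviations are stored under hypothetical keys in the checkpoint dictionary and that a standard z-score is applied. The key names, the example statistics, and the epsilon guard are assumptions, not details of the Beta 1.6 code.

```python
from typing import Dict, List, Sequence


def normalise(values: Sequence[float], mean: Sequence[float],
              std: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Z-score each feature; eps guards against zero-variance features."""
    if not (len(values) == len(mean) == len(std)):
        raise ValueError("values, mean and std must have the same length")
    return [(v - m) / max(s, eps) for v, m, s in zip(values, mean, std)]


# Stand-in for statistics read from a checkpoint (hypothetical keys and values).
checkpoint_stats: Dict[str, List[float]] = {
    "scalar_mean": [80.0, 5.0],
    "scalar_std": [20.0, 0.0],
}

scaled = normalise([100.0, 5.0],
                   checkpoint_stats["scalar_mean"],
                   checkpoint_stats["scalar_std"])
print(scaled)  # [1.0, 0.0]
```

Storing the statistics with the weights means inference uses the same scaling as training without recomputing anything from operational data.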

Strengths

Observed and intended strengths of TC Evolution include:

  • explicit modelling of storm evolution rather than single-frame regression,
  • separate treatment of full-storm and inner-core satellite structure,
  • integration of environmental context,
  • direct use of previous intensity history,
  • and demonstrated ability to represent very intense hurricanes in at least some operational spot checks.

Limitations

Known limitations of Beta 1.6 include:

  • case-dependent errors in some major-hurricane situations,
  • limited formal operational verification relative to established methods,
  • dependence on upstream file availability and storm-centred data extraction quality,
  • potential data-distribution differences between training imagery and operational imagery,
  • and the fact that the system remains a beta research model rather than an official operational standard.


Naming

The name TC Evolution reflects the central idea of the model: tropical cyclone intensity is treated as an evolving state shaped by recent storm history, internal structure, and environmental conditions.

Intended use

TC Evolution is intended for:

  • experimental tropical cyclone current-intensity analysis,
  • case-study review,
  • research prototyping,
  • and internal benchmarking against satellite-based techniques.

It is not intended to replace official agency products without broader validation and full operational testing.