Structured State Space Models and Mamba. Models like Mamba [Gu and Dao, 2023] can be
interpreted within GWO in terms of their Path, Shape, and Weight. The Path is defined by
a structured state-space recurrence, which lets the model capture long-range dependencies efficiently.
The Shape is causal and one-dimensional: information is processed sequentially. Critically, the Weight
function is dynamic and input-dependent. It is realized through selective state parameters that let the
model focus on or forget information based on context, creating an effective content-aware bottleneck for sequences.
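The three components described above can be made concrete with a toy recurrence. The following is a minimal sketch, not the Mamba implementation: it assumes a single channel with a single scalar state, and the names `selective_scan`, `w_delta`, `w_b`, and `w_c` are hypothetical scalar stand-ins for the learned projections that make the step size, the write strength, and the readout depend on the current token. The discretization (an exponential for the state coefficient, a simple product for the input term) is likewise a simplifying assumption.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan(x, a=-1.0, w_delta=1.0, w_b=1.0, w_c=1.0):
    """Toy one-channel, one-state selective recurrence (illustrative only).

    x: list of floats, the input sequence.
    a: fixed negative state coefficient (continuous-time decay rate).
    w_delta, w_b, w_c: hypothetical scalar projections (assumptions).
    Returns (outputs, hidden_states), one entry per input token.
    """
    h = 0.0
    outputs, states = [], []
    for x_t in x:                        # Shape: causal, one token at a time
        delta = softplus(w_delta * x_t)  # input-dependent step size, > 0
        a_bar = math.exp(delta * a)      # in (0, 1): fraction of state kept
        b_bar = delta * (w_b * x_t)      # input-dependent write strength
        h = a_bar * h + b_bar * x_t      # Path: state-space recurrence
        outputs.append((w_c * x_t) * h)  # Weight: input-dependent readout
        states.append(h)
    return outputs, states


if __name__ == "__main__":
    ys, hs = selective_scan([1.0, 0.0, 0.0, 2.0])
    print(ys)
    print(hs)
```

In this sketch, a token with value zero writes nothing new and only decays the existing state, while a larger token both shrinks `a_bar` (more forgetting) and increases `b_bar` (a stronger write). This is the sense in which the recurrence acts as a content-aware bottleneck: what survives in `h` depends on the tokens themselves, not only on their positions.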