Artifact Removal Strategies#

GastroPy provides three composable time-domain preprocessing functions that remove artifacts from the raw EGG signal before bandpass filtering, plus a named pipeline variant in egg_clean that bundles them together.

Function

Method

Reference

hampel_filter

Sliding-window median; replaces spikes with local median

Davies & Gather (1993); Dalmaijer (2025)

mad_filter

Global MAD outlier replacement; faster, less adaptive

Dalmaijer (2025)

remove_movement_artifacts

LMMSE Wiener filter for movement noise

Gharibans et al. (2018)

egg_clean(..., method="dalmaijer2025")

Hampel → LMMSE → IIR Butterworth pipeline

Dalmaijer (2025)

import matplotlib.pyplot as plt
import numpy as np

import gastropy as gp

plt.rcParams["figure.dpi"] = 100
plt.rcParams["figure.facecolor"] = "white"

# Synthetic EGG: gastric sine + pink noise + injected spike
rng = np.random.default_rng(42)
sfreq = 10.0
t = np.arange(0, 300, 1.0 / sfreq)
gastric = np.sin(2 * np.pi * 0.05 * t)           # 3 cpm signal
noise = 0.3 * rng.standard_normal(len(t))          # background noise
movement = 0.5 * np.sin(2 * np.pi * 0.008 * t)    # slow drift / movement
signal = gastric + noise + movement

# Inject two large spike artifacts
signal[300] = 8.0
signal[1800] = -7.5

print(f"Signal length : {len(signal)} samples ({len(signal)/sfreq:.0f} s)")
print(f"Signal range  : [{signal.min():.2f}, {signal.max():.2f}]")
Signal length : 3000 samples (300 s)
Signal range  : [-7.50, 8.00]

Hampel Filter — Spike Removal#

hampel_filter slides a window of length 2k+1 across the signal. Any sample deviating from the local median by more than n_sigma × 1.4826 × MAD is replaced by that median.

hampel_cleaned = gp.hampel_filter(signal, k=5, n_sigma=3.0)

fig, axes = plt.subplots(2, 1, figsize=(10, 5), sharex=True)
axes[0].plot(t, signal, lw=0.8, color="#d62728", label="Raw (with spikes)")
axes[0].set_ylabel("Amplitude (µV)")
axes[0].legend(loc="upper right")
axes[1].plot(t, hampel_cleaned, lw=0.8, color="#1f77b4", label="Hampel filtered")
axes[1].set_ylabel("Amplitude (µV)")
axes[1].set_xlabel("Time (s)")
axes[1].legend(loc="upper right")
fig.suptitle("Hampel Filter — Spike Removal")
plt.tight_layout()
plt.show()

print(f"Max absolute value before: {np.abs(signal).max():.2f}")
print(f"Max absolute value after : {np.abs(hampel_cleaned).max():.2f}")
../_images/00fee0712292ae4bb32c07aae221177073f47f1ef053921c735ccc2abf17d472.png
Max absolute value before: 8.00
Max absolute value after : 2.59

MAD Filter — Global Outlier Removal#

mad_filter computes a single global median and MAD across the entire signal. Faster than hampel_filter but less adaptive to slow signal drift. Suitable as a quick first pass on stationary recordings.

mad_cleaned = gp.mad_filter(signal, n_sigma=3.0)

fig, axes = plt.subplots(2, 1, figsize=(10, 5), sharex=True)
axes[0].plot(t, signal, lw=0.8, color="#d62728", label="Raw (with spikes)")
axes[0].set_ylabel("Amplitude (µV)")
axes[0].legend(loc="upper right")
axes[1].plot(t, mad_cleaned, lw=0.8, color="#2ca02c", label="MAD filtered")
axes[1].set_ylabel("Amplitude (µV)")
axes[1].set_xlabel("Time (s)")
axes[1].legend(loc="upper right")
fig.suptitle("MAD Filter — Global Outlier Removal")
plt.tight_layout()
plt.show()

print(f"Max absolute value after MAD filter: {np.abs(mad_cleaned).max():.2f}")
../_images/e8ef46db91de82aaa64955a928d0a64f650d73ada508ebc14cba3ce6b8520cd4.png
Max absolute value after MAD filter: 2.59

Movement Artifact Filter — LMMSE#

remove_movement_artifacts applies an LMMSE Wiener-filter that estimates local variance in sliding windows and attenuates time segments where variance exceeds the expected signal variance. Effective for movement and drift artifacts (Gharibans et al., 2018).

# First remove spikes, then attenuate movement artifacts
spike_cleaned = gp.hampel_filter(signal, k=5)
movement_cleaned = gp.remove_movement_artifacts(spike_cleaned, sfreq=sfreq, freq=0.05)

# Compare PSDs
freqs_raw, psd_raw = gp.psd_welch(signal, sfreq, fmin=0.0, fmax=0.15, overlap=0.75)
freqs_clean, psd_clean = gp.psd_welch(movement_cleaned, sfreq, fmin=0.0, fmax=0.15, overlap=0.75)

fig, ax = plt.subplots(figsize=(8, 4))
ax.semilogy(freqs_raw * 60, psd_raw, color="#d62728", lw=1.2, label="Raw")
ax.semilogy(freqs_clean * 60, psd_clean, color="#1f77b4", lw=1.2, label="Hampel + LMMSE")
ax.axvspan(2, 4, alpha=0.15, color="green", label="Normogastric (2–4 cpm)")
ax.set_xlabel("Frequency (cpm)")
ax.set_ylabel("Power spectral density")
ax.set_title("Effect of Preprocessing on PSD")
ax.legend()
plt.tight_layout()
plt.show()
../_images/1f9f5caedf9e177d4e3e2bb691769dc0be547b9cf0ed8e85625685b58f4536b6.png

Dalmaijer 2025 — Bundled Pipeline#

egg_clean(method="dalmaijer2025") runs Hampel → LMMSE → IIR Butterworth as a single named pipeline. Compare with the default FIR bandpass on the same noisy signal.

cleaned_fir, info_fir = gp.egg_clean(signal, sfreq=sfreq, method="fir")
cleaned_d25, info_d25 = gp.egg_clean(signal, sfreq=sfreq, method="dalmaijer2025")

print("FIR pipeline:")
print(f"  filter_method : {info_fir['filter_method']}")
print(f"  numtaps       : {info_fir['fir_numtaps']}")

print("\nDalmaijer 2025 pipeline:")
print(f"  cleaning_method : {info_d25['cleaning_method']}")
print(f"  hampel_k        : {info_d25['hampel_k']}")
print(f"  movement_window : {info_d25['movement_window']} cycles")
print(f"  filter_method   : {info_d25['filter_method']}")

fig, axes = plt.subplots(2, 1, figsize=(10, 5), sharex=True)
axes[0].plot(t, cleaned_fir, lw=0.8, color="#1f77b4", label='egg_clean(method="fir")')
axes[0].set_ylabel("Amplitude (µV)")
axes[0].legend(loc="upper right")
axes[1].plot(t, cleaned_d25, lw=0.8, color="#ff7f0e", label='egg_clean(method="dalmaijer2025")')
axes[1].set_ylabel("Amplitude (µV)")
axes[1].set_xlabel("Time (s)")
axes[1].legend(loc="upper right")
fig.suptitle("FIR vs. Dalmaijer 2025 Pipeline")
plt.tight_layout()
plt.show()
FIR pipeline:
  filter_method : fir
  numtaps       : 501

Dalmaijer 2025 pipeline:
  cleaning_method : dalmaijer2025
  hampel_k        : 3
  movement_window : 1.0 cycles
  filter_method   : iir_butter
../_images/4c8fda992874aa3b172ec6d84e9fa9f022a143174b6055b8249860af5a62ca57.png

See also: Bandpass Filtering, One-Call EGG Pipeline, Multi-Channel Processing, Artifact Detection