Proposed system architecture. A hybrid AI-assisted judging stack can reduce marking errors by combining: (1) multi-view video (2–4 synchronized cameras @ ≥60 fps) plus optional wearable IMUs (6–9 axis on shins/thighs/torso) for occlusion-robust kinematics; (2) real-time 2D/3D pose estimation (e.g., heatmap-based keypoint detectors with temporal tracking) and triangulation or video-to-3D lift; (3) a rule-aware scoring model that maps features to event scores; and (4) judge-in-the-loop UI with explainable overlays.
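
As a rough illustration of how these four components could be wired, here is a minimal Python sketch; the stage callables (lift_3d, extract_features, score) are hypothetical placeholders, not a reference implementation:

```python
from dataclasses import dataclass
from typing import Callable, Optional
import numpy as np

@dataclass
class JudgingPipeline:
    # All three stages are injected; names are illustrative only.
    lift_3d: Callable[[list], np.ndarray]        # (2) multi-view pose -> (T, J, 3)
    extract_features: Callable[..., np.ndarray]  # (3) rule-aware features
    score: Callable[[np.ndarray], dict]          # (3) features -> sub-scores

    def score_segment(self, frames: list, imu_quats: Optional[np.ndarray] = None) -> dict:
        kp3d = self.lift_3d(frames)                     # (T, J, 3) keypoint track
        feats = self.extract_features(kp3d, imu_quats)  # angles, symmetry, hold metrics
        return self.score(feats)  # e.g. {"Alignment": ..., "Balance": ...}, shown in the judge UI (4)
```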

Feature engineering aligned to Yogasana rules. From the 3D keypoints and IMU quaternions, compute the following (a sketch of representative features appears after the list):

  • Static alignment: joint angles (e.g., θ_knee, θ_hip), segment perpendicularity/parallelism to floor, spinal curvature proxies.
  • Symmetry & balance: left–right angle deltas; CoM projection inside base-of-support; sway metrics (RMS drift) over the required hold window.
  • Hold validity: continuous time above a pose similarity threshold (e.g., cosine similarity > τ on normalized angle vectors) with uncertainty bands.
  • Transitions (entry/exit): jerk, smoothness (spectral arc length), temporal consistency.
  • Aesthetic proxies (optional): cadence stability, breath-paced micro-motion if a chest band is allowed.
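
A minimal sketch of three representative features from the list above, assuming keypoints arrive as NumPy arrays and using an illustrative default for the threshold τ:

```python
import numpy as np

def joint_angle(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> float:
    """Angle at joint b (degrees) formed by segments b->a and b->c."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def symmetry_delta(theta_left: float, theta_right: float) -> float:
    """Left-right angle difference used as a symmetry feature."""
    return abs(theta_left - theta_right)

def hold_valid_frames(angle_seq: np.ndarray, template: np.ndarray,
                      tau: float = 0.98) -> np.ndarray:
    """Boolean mask of frames whose normalized angle vector stays within
    cosine similarity > tau of the pose template (hold-validity check).
    angle_seq: (T, D) per-frame angle vectors; template: (D,)."""
    a = angle_seq / (np.linalg.norm(angle_seq, axis=1, keepdims=True) + 1e-9)
    t = template / (np.linalg.norm(template) + 1e-9)
    return (a @ t) > tau
```

The longest True run in the mask, divided by the frame rate, gives the continuous hold time to compare against the required hold window.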

Scoring model. Two options:

  • Interpretable pipeline: deterministic rule checks + weighted linear model tuned on expert labels. Pros: transparency, easy appeals.
  • Supervised learner: gradient-boosted trees or temporal transformers consuming feature sequences to predict sub-scores (Alignment, Balance, Hold, Transition). Use multi-task learning with ordinal regression heads to respect rating scales.
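
For the ordinal regression heads, a CORAL-style head is one common choice; the sketch below assumes a 128-dimensional trunk embedding and an 11-level rating scale, both illustrative:

```python
import torch
import torch.nn as nn

class OrdinalHead(nn.Module):
    """CORAL-style ordinal head: one shared score plus K-1 learned
    cumulative thresholds, so predictions respect the rating order."""
    def __init__(self, in_dim: int, n_levels: int):
        super().__init__()
        self.fc = nn.Linear(in_dim, 1, bias=False)
        self.thresholds = nn.Parameter(torch.zeros(n_levels - 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Returns P(score > k) for each of the K-1 cumulative cut points
        return torch.sigmoid(self.fc(x) - self.thresholds)

# Multi-task setup: one head per sub-score on a shared trunk embedding
heads = nn.ModuleDict({name: OrdinalHead(128, 11) for name in
                       ["Alignment", "Balance", "Hold", "Transition"]})
```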
  • Training data & labeling. Curate a benchmark dataset: multi-view videos across levels, body types, attire, mats/backgrounds; annotate frame-accurate landmarks and segment-level sub-scores from ≥5 certified judges. Use adjudication to form a consensus label and keep individual labels to estimate noise. Active learning can target rare poses/edge cases.

    Calibration & fairness.

    • Per-venue calibration: camera intrinsics/extrinsics checkboard routine + latency sync.
    • Anthropometry normalization: scale-invariant features; avoid penalizing body shape/height.
    • Domain shift defenses: test-time augmentation, clothing color/skin-tone invariance tests, IMU fusion in occlusions.
    • Bias audits: slice metrics across gender/age/skin tone; report ΔMAE and equalized error gaps.
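
A minimal sketch of the bias-audit slicing, assuming a results table with hypothetical model_score and consensus_score columns:

```python
import pandas as pd

def slice_audit(df: pd.DataFrame, group_col: str) -> pd.DataFrame:
    """Per-group MAE between model and consensus scores, plus the gap
    (delta MAE) relative to the best-served group, for bias reporting."""
    err = (df["model_score"] - df["consensus_score"]).abs()
    mae = err.groupby(df[group_col]).mean()
    return pd.DataFrame({"MAE": mae, "delta_MAE": mae - mae.min()})

# Example: slice_audit(results, "skin_tone"); repeat for gender and age bands.
```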

Validation protocol. Report:

  • Agreement with the panel: MAE per sub-score, Kendall’s τ rank correlation, Cohen’s κ and ICC vs. judges (a minimal sketch follows this list).
  • Robustness: performance under occlusion, lighting changes, and camera-misalignment stress tests.
  • Ablations: 2D-only vs. 3D vs. 3D+IMU.
  • Field trial: randomized events with/without AI assistance; measure dispute rate, time-to-final-score, and inter-judge variance.
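
A minimal sketch of the agreement metrics using SciPy and scikit-learn (ICC can be added via a package such as pingouin); model and panel scores are assumed to share one ordinal scale:

```python
import numpy as np
from scipy.stats import kendalltau
from sklearn.metrics import cohen_kappa_score

def agreement_report(model: np.ndarray, panel: np.ndarray) -> dict:
    """Agreement between model sub-scores and panel consensus.
    model, panel: 1-D arrays of segment-level scores on the same scale."""
    tau, _ = kendalltau(model, panel)
    return {
        "MAE": float(np.abs(model - panel).mean()),
        "kendall_tau": float(tau),
        # Quadratic weights penalize large ordinal disagreements more heavily
        "weighted_kappa": float(cohen_kappa_score(
            model.round().astype(int), panel.round().astype(int),
            weights="quadratic")),
    }
```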

Real-time deployment. Edge inference on GPUs (≈10–20 ms per stream), with scores and explainable overlays surfaced to the judge-in-the-loop UI.
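
A minimal per-stream latency check against that budget; here infer stands in for whatever model call the venue actually deploys:

```python
import time

def timed_inference(infer, frame, budget_ms: float = 20.0):
    """Run one inference call and flag frames that exceed the real-time budget.
    Note: for async GPU backends, synchronize before reading the clock."""
    t0 = time.perf_counter()
    out = infer(frame)
    ms = (time.perf_counter() - t0) * 1e3
    return out, ms, ms <= budget_ms
```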
