I do historical swordfighting and noticed AI struggles to track it. I’m building an open dataset to help fix this. Does my schema make sense? [P]
Our take
Embodied AI and niche historical practice make an unexpected pairing, and this project from a historical European martial arts (HEMA) practitioner shows how much useful data can come out of it. The Sim2Real gap (the difficulty of transferring models trained in simulated environments to real-world applications) is a well-documented hurdle, and this initiative approaches it with real footage of an unusually difficult subject. We've recently seen work on new architectural approaches such as "EML Trees are Universal Approximators" [R], and a dataset like this could serve as a grounded, practical testbed for evaluating them. Similarly, "Google's Agentic Peer-Reviewer Handled ~10K Papers at ICML/STOC" [R] showcases the rapid progress in AI automation, and the need for robust, specialized datasets becomes more evident as AI systems take on more complex real-world tasks.
The strength of this project lies in its recognition of a specific, overlooked problem space. High-level swordfighting combines several computer vision obstacles at once: rapid, non-linear movement, joints occluded by bulky clothing, and blades that move fast enough to blur or shrink below what standard cameras can resolve. The proposed annotation structure, which covers keypoints, biomechanics, and likely vision hazards, is a thoughtful attempt to capture these nuances. The "occlusion_rating" and "motion_blur_expected" fields show a clear understanding of the challenges involved, and the focus on hyper-trimmed clips keeps each sample centered on the hard frames. The open nature of the initiative, hosted on Hugging Face, is particularly commendable: it invites collaboration, in contrast to the often-guarded nature of proprietary datasets.
Beyond the immediate application to computer vision for HEMA scoring systems, this project has broader implications for embodied AI research. The principles used to analyze and track human movement in this context – accounting for occlusion, motion blur, and complex biomechanics – are directly transferable to other areas like robotics, sports analytics, and even medical imaging. Developing robust tracking algorithms for thin, fast-moving objects is a fundamental challenge, and the lessons learned from analyzing sword blades could inform advancements in tracking other similarly challenging objects, such as surgical instruments or even microscopic particles. The meticulous annotation scheme, focused on biomechanics and specific edge alignments, suggests a depth of analysis that could be valuable for researchers beyond the immediate HEMA community.
Ultimately, the success of this project hinges on feedback from the AI research community. The practitioner's explicit request for "brutal feedback" underscores a genuine desire to create a useful resource, and it reflects a growing recognition that diverse perspectives and collaborative efforts are needed to get past the limitations of current AI models. As threads such as "Cerebras OpenAI deal capacity has effectively killed the waitlist for everyone else" [D] show how contested compute capacity has become, demand for specialized, high-quality datasets is likely to keep growing, which makes initiatives like this one all the more valuable. Will we see a surge of similar, domain-specific datasets emerging from unexpected corners of expertise, further enriching the training landscape for embodied AI?
Hi everyone,
I’m a historical swordfighter (HEMA practitioner), and while I’m not a computer vision engineer or a roboticist, I’ve been reading a lot about the current bottlenecks in embodied AI, specifically around the Sim2Real gap and thin-object tracking.
It occurred to me that high-level swordfighting is basically a perfect nightmare scenario for computer vision. We move at maximum athletic output, we shift our weight rapidly in non-linear ways (great for bipedal balance testing), we are completely covered in thick, bulky black jackets that hide our joints, and our steel blades move at 80 mph, so they either shrink to sub-pixel width or smear into massive motion blur.
I think it would be cool to have a computer vision scoring system for tournaments, so I'm putting together a mini-dataset: 100 hyper-trimmed clips of these specific physics edge cases, captured with a synchronized multi-view setup (120/240 fps).
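As a rough sanity check on those numbers, here is a minimal Python sketch of how far a point on the blade travels between consecutive frames at the quoted 80 mph and 120/240 fps. The function names are made up for illustration, and the 1/1000 s exposure time is an assumed shutter setting, not something specified in the post.

```python
MPH_TO_MPS = 0.44704  # exact: 1 mile = 1609.344 m, 1 hour = 3600 s


def blade_travel_per_frame_m(speed_mph: float, fps: float) -> float:
    """Distance in metres a point moving at speed_mph covers between two frames."""
    return speed_mph * MPH_TO_MPS / fps


def blur_length_m(speed_mph: float, exposure_s: float) -> float:
    """Distance in metres the same point smears across during one exposure."""
    return speed_mph * MPH_TO_MPS * exposure_s


if __name__ == "__main__":
    for fps in (120, 240):
        travel_cm = blade_travel_per_frame_m(80, fps) * 100
        print(f"{fps} fps: about {travel_cm:.1f} cm of travel between frames")
    # Assumed 1/1000 s shutter; the real exposure depends on camera and lighting.
    blur_cm = blur_length_m(80, 1 / 1000) * 100
    print(f"1/1000 s exposure: about {blur_cm:.1f} cm of motion blur")
```

At 80 mph this works out to roughly 30 cm of travel per frame at 120 fps and 15 cm at 240 fps, which is why the frames around blade contact are so hard to annotate and track.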
Since I'm non-technical, I used some AI assistance to help me structure what an AI-ready dataset card should look like, and I've hosted the placeholder page on Hugging Face to test the schema before I start shooting video with my clubmates.
Here is the JSON line structure I'm currently planning to annotate each video with:
    {
      "clip_id": "hema_ls_001",
      "meta": {
        "weapon": "Longsword",
        "source_text": "Joachim Meyer (1570)",
        "capture_fps": 120
      },
      "time_stamps": {
        "start_frame": 120,
        "blade_contact_frame": 165,
        "recovery_end_frame": 210
      },
      "biomechanics": {
        "initial_guard": "Right Vom Tag",
        "ending_guard": "Left Ochs",
        "footwork_type": "Passing step offline",
        "strike_trajectory": "Diagonal Oberhau",
        "edge_alignment": "True edge"
      },
      "computer_vision_hazards": {
        "occlusion_rating": "High (Crossed arms, bulky torso jacket)",
        "motion_blur_expected": true
      },
      "frame_annotations": [
        {
          "frame_index": 165,
          "is_contact_event": true,
          "keypoints_2d_pixel_coordinates": {
            "fencer_a_right_wrist": [412.5, 780.2],
            "fencer_a_left_wrist": [430.1, 795.4],
            "fencer_a_head_center": [425.0, 510.8],
            "fencer_b_right_wrist": [580.4, 765.1],
            "fencer_b_left_wrist": [565.0, 750.3],
            "sword_a_guard": [455.0, 810.0],
            "sword_a_tip": [890.4, 320.1],
            "sword_b_guard": [540.2, 790.6],
            "sword_b_tip": [310.5, 450.2]
          },
          "segmentation_masks": {
            "sword_a_polygon_points": [
              [455.0, 810.0],
              [460.1, 805.2],
              [888.2, 322.5],
              [890.4, 320.1],
              [455.0, 810.0]
            ],
            "occluded_pixels_detected": true
          }
        }
      ]
    }

My questions for the researchers here:
- Does this metadata structure actually give you what you need to test trajectory prediction or pose estimation?
- Are there any specific keypoints (like explicit crossguard coordinates or footwork velocity metrics) that your models are starving for that I should add to the annotations while I'm doing the manual work?
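
One way to pressure-test the structure before annotating 100 clips is to run a few automatic consistency checks on each JSON line. The Python sketch below is only an illustration: `check_record` and `blade_pixel_geometry` are hypothetical helper names, and the rules they enforce (frames ordered start, contact, recovery; a contact-flagged frame matching `blade_contact_frame`; a closed mask polygon) are assumptions about what the fields are meant to express, not part of the dataset card. The sample line is trimmed from the example record above.

```python
import json
import math

# Trimmed copy of the example record from the post, written as one JSON line.
SAMPLE_LINE = (
    '{"clip_id": "hema_ls_001", '
    '"time_stamps": {"start_frame": 120, "blade_contact_frame": 165, "recovery_end_frame": 210}, '
    '"frame_annotations": [{"frame_index": 165, "is_contact_event": true, '
    '"keypoints_2d_pixel_coordinates": {"sword_a_guard": [455.0, 810.0], "sword_a_tip": [890.4, 320.1]}, '
    '"segmentation_masks": {"sword_a_polygon_points": '
    '[[455.0, 810.0], [460.1, 805.2], [888.2, 322.5], [890.4, 320.1], [455.0, 810.0]]}}]}'
)


def check_record(record: dict) -> list:
    """Return a list of human-readable problems found in one annotation record."""
    problems = []
    ts = record["time_stamps"]
    start, contact, end = ts["start_frame"], ts["blade_contact_frame"], ts["recovery_end_frame"]
    if not start <= contact <= end:
        problems.append("frames are not ordered start <= contact <= recovery_end")
    for ann in record.get("frame_annotations", []):
        idx = ann["frame_index"]
        if not start <= idx <= end:
            problems.append(f"frame {idx} lies outside the clip range")
        if ann.get("is_contact_event") and idx != contact:
            problems.append(f"frame {idx} is flagged as contact, but blade_contact_frame is {contact}")
        polygon = ann.get("segmentation_masks", {}).get("sword_a_polygon_points", [])
        if polygon and polygon[0] != polygon[-1]:
            problems.append(f"frame {idx}: sword_a polygon is not closed")
    return problems


def blade_pixel_geometry(keypoints: dict, sword: str) -> tuple:
    """Apparent blade length in pixels and guard-to-tip angle in degrees.

    Image y grows downward, so a negative angle means the tip is above the guard.
    """
    gx, gy = keypoints[f"{sword}_guard"]
    tx, ty = keypoints[f"{sword}_tip"]
    return math.hypot(tx - gx, ty - gy), math.degrees(math.atan2(ty - gy, tx - gx))


if __name__ == "__main__":
    record = json.loads(SAMPLE_LINE)
    print(check_record(record) or "no problems found")
    keypoints = record["frame_annotations"][0]["keypoints_2d_pixel_coordinates"]
    length_px, angle_deg = blade_pixel_geometry(keypoints, "sword_a")
    print(f"sword_a: {length_px:.1f} px long at {angle_deg:.1f} degrees")
```

The second helper hints at why guard and tip keypoints are valuable on their own: the apparent blade length and angle per frame fall straight out of them, and differences between consecutive annotated frames would give a crude velocity estimate without any extra manual labeling.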
You can check out the full dataset description card and leave feedback or join the beta waitlist directly on Hugging Face here: https://huggingface.co/datasets/benito87/longsword-spatial-physics-100
I want to make sure this is actually useful, so any brutal feedback on the structure or parameters is highly appreciated.