Contributing Annotations
SolarHub relies on citizen scientists to provide the ground-truth data needed to train our machine-learning models. Contributing is easy and handled entirely through GitHub Issues.
How to Contribute
The SolarHub UI (Repo A) provides a graphical interface for classification. If you are comfortable with JSON and GitHub, you can also contribute directly by following the steps below.
1. Identify a Task
Check the annotations/ directory for task JSON files (e.g., sunspot.json). These files contain blank templates waiting for labels.
2. Submit a GitHub Issue
To contribute an annotation, create a new issue in this repository and apply the GitHub label `annotation`. Use the following format for the issue body:
Issue Body Template
### Task Type
sunspot
### Record ID
sp-1234
### Your Label
class_a,10 1 5 2 3 ; class_h,2 7 1 4
- Task Type: Must be one of `sunspot`, `magnetogram`, `solar_flare`, etc.
- Record ID: The unique ID found in the task JSON.
- Your Label: One or more annotations in `label,region` format (multiple annotations separated by `;`). Labels must be valid for the task type. The `region` payload is stored exactly as submitted.
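The format above can be illustrated with a minimal Python sketch. This is not the code in `scripts/parse_issue_annotation.py`; the function names `parse_issue_body` and `parse_labels` are hypothetical, and trimming whitespace around each label and region is an assumption (the real parser may keep the region byte-for-byte).

```python
import re


def parse_issue_body(body: str) -> dict[str, str]:
    """Split a '### Heading' style issue body into {heading: value}."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in body.splitlines():
        heading = re.match(r"^###\s+(.*\S)\s*$", line)
        if heading:
            current = heading.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def parse_labels(raw: str) -> list[tuple[str, str]]:
    """Split 'label,region ; label,region' into (label, region) pairs."""
    annotations = []
    for chunk in raw.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing ';'
        # Split on the first comma only, so the region is kept as one string.
        label, sep, region = chunk.partition(",")
        if not sep:
            raise ValueError(f"expected 'label,region', got {chunk!r}")
        # Assumption: surrounding whitespace is not significant.
        annotations.append((label.strip(), region.strip()))
    return annotations


body = """### Task Type
sunspot
### Record ID
sp-1234
### Your Label
class_a,10 1 5 2 3 ; class_h,2 7 1 4
"""
fields = parse_issue_body(body)
pairs = parse_labels(fields["Your Label"])
```

With the template values shown earlier, `fields` holds the three headings as keys and `pairs` contains two `(label, region)` tuples, one for `class_a` and one for `class_h`.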
3. Automated Processing
Once your issue is submitted and labeled as annotation:
- The Parse Annotation Issue workflow is triggered.
- `scripts/parse_issue_annotation.py` validates your input.
- If valid, your contribution is appended to the `annotations` list for the corresponding task in the local `annotations/` directory.
- If you submit again for the same record ID with the same GitHub username, the parser rejects the duplicate.
- The issue is automatically acknowledged and closed.
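The append-and-reject behaviour described in these steps might look roughly like the sketch below. The task layout (a `records` list whose entries carry `id` and `annotations`), the `user` field, and the function name `apply_submission` are all assumptions made for illustration; the actual task JSON schema and parser logic are not documented here.

```python
def apply_submission(task: dict, record_id: str, username: str,
                     annotations: list[tuple[str, str]]) -> bool:
    """Append annotations to a record; return False for a duplicate submission.

    Assumed (hypothetical) task layout:
        {"records": [{"id": "sp-1234", "annotations": [...]}]}
    """
    record = next((r for r in task["records"] if r["id"] == record_id), None)
    if record is None:
        raise KeyError(f"unknown record ID: {record_id}")

    existing = record.setdefault("annotations", [])
    # Reject a second submission for the same record by the same GitHub user.
    if any(a["user"] == username for a in existing):
        return False

    for label, region in annotations:
        # The region string is stored exactly as it was passed in.
        existing.append({"user": username, "label": label, "region": region})
    return True


task = {"records": [{"id": "sp-1234", "annotations": []}]}
first = apply_submission(task, "sp-1234", "octocat", [("class_a", "10 1 5 2 3")])
second = apply_submission(task, "sp-1234", "octocat", [("class_h", "2 7 1 4")])
```

Here the first call is accepted and the second is rejected, so the record ends up with a single annotation from `octocat`.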
Valid Labels
SolarHub uses standard scientific classification systems for solar features. To ensure high data quality, please use only these predefined labels:
| Task Type | Scientific Labels |
|---|---|
| Sunspot | class_a to class_h |
| Magnetogram | alpha, beta, gamma, delta |
| Solar Flare | x_class, m_class, c_class... |
| Coronal Hole | polar, equatorial, mid-latitude |
| Prominence | quiescent, active, eruptive |
| CME | full_halo, partial_halo, normal |
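As a rough illustration, the table can be expressed as a lookup for checking a label before submitting. This sketch covers only the rows whose labels are listed in full; Sunspot and Solar Flare are left out because the table gives a range and a truncated list for them. The keys `coronal_hole`, `prominence`, and `cme` are assumed task-type identifiers, and `is_valid_label` is a hypothetical helper, not part of the repository.

```python
# Labels copied from the table above; only fully enumerated rows are included.
VALID_LABELS = {
    "magnetogram": {"alpha", "beta", "gamma", "delta"},
    "coronal_hole": {"polar", "equatorial", "mid-latitude"},  # assumed key
    "prominence": {"quiescent", "active", "eruptive"},        # assumed key
    "cme": {"full_halo", "partial_halo", "normal"},           # assumed key
}


def is_valid_label(task_type: str, label: str) -> bool:
    """Return True if the label is listed for the given task type."""
    if task_type not in VALID_LABELS:
        raise KeyError(f"task type not covered by this sketch: {task_type}")
    return label in VALID_LABELS[task_type]


ok = is_valid_label("magnetogram", "beta")
bad = is_valid_label("cme", "alpha")
```

A label that belongs to a different task type, such as `alpha` for a CME record, is reported as invalid.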
What Happens Next?
Your contribution is appended to the annotations list for the task in the local annotations/ directory. During the next Nightly Pipeline run (00:30 UTC), all annotations are merged into the master HuggingFace dataset, preserving the history of each contribution. This data is then used to retrain our solar prediction models.