Contributing Annotations

SolarHub relies on citizen scientists to provide the ground-truth data needed to train our machine-learning models. Contributing is easy and handled entirely through GitHub Issues.

How to Contribute

The SolarHub UI (Repo A) provides a graphical interface for classification. If you're comfortable with JSON and GitHub, you can also contribute directly by following the steps below.

1. Identify a Task

Check the annotations/ directory for task JSON files (e.g., sunspot.json). These files contain blank templates waiting for labels.

2. Submit a GitHub Issue

To contribute a label, create a new issue in this repository with the label annotation. Use the following format:

Issue Body Template

### Task Type
sunspot

### Record ID
sp-1234

### Your Label
class_a,10 1 5 2 3 ; class_h,2 7 1 4

  • Task Type: Must be one of the supported task types (e.g., sunspot, magnetogram, solar_flare).
  • Record ID: The unique ID found in the task JSON.
  • Your Label: One or more annotations in label,region format, with multiple annotations separated by ;. Each label must be valid for the task type. The region payload is stored exactly as submitted.
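To make the format concrete, here is a minimal sketch of how an issue body in this shape could be split into its fields. It is not the code in scripts/parse_issue_annotation.py; the function names parse_issue_body and parse_labels are hypothetical, and trimming whitespace around the ; and , separators is an assumption.

```python
import re


def parse_issue_body(body: str) -> dict:
    """Split an issue body into {heading: text} using its '### ' headings."""
    sections = {}
    current = None
    for line in body.splitlines():
        match = re.match(r"^###\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def parse_labels(field: str) -> list:
    """Turn 'label,region ; label,region' into a list of (label, region) pairs."""
    annotations = []
    for chunk in field.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split on the first comma only, so the region text is kept as written.
        label, sep, region = chunk.partition(",")
        if not sep or not label.strip():
            raise ValueError(f"expected 'label,region', got {chunk!r}")
        annotations.append((label.strip(), region.strip()))
    return annotations


body = """### Task Type
sunspot

### Record ID
sp-1234

### Your Label
class_a,10 1 5 2 3 ; class_h,2 7 1 4
"""

sections = parse_issue_body(body)
labels = parse_labels(sections["Your Label"])
```

With the template above, sections holds the three fields keyed by heading, and labels holds one (label, region) pair per annotation.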

3. Automated Processing

Once your issue is submitted and labeled as annotation:

  1. The Parse Annotation Issue workflow is triggered.
  2. scripts/parse_issue_annotation.py validates your input.
  3. If valid, your contribution is appended to the annotations list for the corresponding task in the local annotations/ directory.
  4. If you have already submitted an annotation for the same record ID under the same GitHub username, the parser rejects the new submission as a duplicate.
  5. The issue is automatically acknowledged and closed.
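The validate, de-duplicate, and append steps above can be sketched roughly as follows. This is an illustration under assumptions, not the project's parser: the task layout (a "records" list whose entries have "id" and "annotations" keys), the "user" field, and the function name append_annotation are all hypothetical.

```python
def append_annotation(task, record_id, username, annotations, valid_labels):
    """Validate one submission and append it to the task; return (ok, message).

    `task` is an in-memory dict with a hypothetical layout:
    {"records": [{"id": ..., "annotations": [...]}, ...]}
    `annotations` is a list of (label, region) pairs.
    """
    record = next((r for r in task["records"] if r["id"] == record_id), None)
    if record is None:
        return False, f"unknown record ID: {record_id}"

    invalid = [label for label, _ in annotations if label not in valid_labels]
    if invalid:
        return False, f"invalid labels: {', '.join(invalid)}"

    # Reject a second submission from the same user for the same record.
    if any(entry["user"] == username for entry in record["annotations"]):
        return False, f"duplicate submission from {username}"

    record["annotations"].append(
        {
            "user": username,
            "labels": [{"label": l, "region": r} for l, r in annotations],
        }
    )
    return True, "accepted"


task = {"records": [{"id": "sp-1234", "annotations": []}]}
valid = {"class_a", "class_h"}

first = append_annotation(task, "sp-1234", "octocat", [("class_a", "10 1 5 2 3")], valid)
second = append_annotation(task, "sp-1234", "octocat", [("class_h", "2 7 1 4")], valid)
```

Here the first call is accepted and the second is rejected as a duplicate, so the record keeps a single entry for that user.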

Valid Labels

SolarHub uses standard scientific classification systems for solar features. To ensure high data quality, please use only these predefined labels:

Task Type       Scientific Labels
Sunspot         class_a to class_h
Magnetogram     alpha, beta, gamma, delta
Solar Flare     x_class, m_class, c_class...
Coronal Hole    polar, equatorial, mid-latitude
Prominence      quiescent, active, eruptive
CME             full_halo, partial_halo, normal
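If you want to check a label before opening an issue, the table above can be expressed as a small lookup. This is only a sketch: the Sunspot and Solar Flare rows are omitted because the table gives them as a range or a truncated list, and every key other than magnetogram is a hypothetical task-type identifier that may not match the real one.

```python
# Labels copied from the table above. Sunspot and Solar Flare are left out
# because their label sets are not fully enumerated there.
# Keys other than "magnetogram" are hypothetical task-type identifiers.
VALID_LABELS = {
    "magnetogram": {"alpha", "beta", "gamma", "delta"},
    "coronal_hole": {"polar", "equatorial", "mid-latitude"},
    "prominence": {"quiescent", "active", "eruptive"},
    "cme": {"full_halo", "partial_halo", "normal"},
}


def is_valid_label(task_type: str, label: str) -> bool:
    """Return True if `label` is listed for `task_type` in VALID_LABELS."""
    return label in VALID_LABELS.get(task_type, set())
```

An unknown task type simply yields False here; a real validator would more likely report it as a separate error.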

What Happens Next?

Your contribution is appended to the annotations list for the task in the local annotations/ directory. During the next Nightly Pipeline run (00:30 UTC), all annotations are merged into the master Hugging Face dataset, preserving the history of each contribution. The merged data is then used to retrain our solar prediction models.