Issue Templates for robin-sparkless (Upstream)
The following text can be copied into the robin-sparkless GitHub repo when opening issues.
Issues created from robin_sparkless_needs.md (2026-02-08)
| # | Title | Link |
|---|---|---|
| 182 | select()/with_column() resolve Column expressions by name instead of evaluating them | https://github.com/eddiethedean/robin-sparkless/issues/182 |
| 184 | [Enhancement] Filter: support Column–Column comparisons (col_a > col_b) | https://github.com/eddiethedean/robin-sparkless/issues/184 |
| 185 | [Enhancement] filter(condition): document Column-only or accept literal bool | https://github.com/eddiethedean/robin-sparkless/issues/185 |
| 186 | [Enhancement] lit(): extend to date/datetime types for PySpark parity | https://github.com/eddiethedean/robin-sparkless/issues/186 |
| 187 | [Enhancement] Window API for row_number, rank, sum over window, lag, lead | https://github.com/eddiethedean/robin-sparkless/issues/187 |

Created via gh issue create -R eddiethedean/robin-sparkless with body files in docs/robin_issue_*.md.
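
As a rough illustration of the command mentioned above, the sketch below assembles the argument list for one `gh issue create` call per body file. It only builds the command and does not run it. The title and the file name in the example are hypothetical placeholders; the `-R`, `--title`, and `--body-file` flags are standard `gh issue create` options.

```python
from pathlib import Path

REPO = "eddiethedean/robin-sparkless"  # repository named in this document


def build_gh_command(title: str, body_file: Path, repo: str = REPO) -> list[str]:
    """Return the argv for one `gh issue create` call (nothing is executed)."""
    return [
        "gh", "issue", "create",
        "-R", repo,
        "--title", title,
        "--body-file", str(body_file),
    ]


if __name__ == "__main__":
    # Hypothetical title and file name, used only to show the shape of the command.
    cmd = build_gh_command(
        "[Enhancement] Example title",
        Path("docs/robin_issue_example.md"),
    )
    print(" ".join(cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would file the issue, so it is worth printing and reviewing the command first.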
Issues created 2026-02-06 (from ownership analysis)
| # | Title | Link |
|---|---|---|
| 174 | [Enhancement] Add Python operator overloads to Column for PySpark compatibility | https://github.com/eddiethedean/robin-sparkless/issues/174 |
| 175 | [Enhancement] Join on= parameter: accept string for single column (PySpark compatibility) | https://github.com/eddiethedean/robin-sparkless/issues/175 |

Created via python scripts/create_robin_github_issues_2026_02.py
Sparkless parity issues created (earlier)
- #1–#17: Created from initial subset (join, filter, select, transformations); see scripts/create_robin_github_issues.py.
- 104 additional issues: Created from broad parity run: same tests run in Robin mode (tests/robin_parity_broad_results.txt) and PySpark mode (tests/pyspark_parity_failed_results.txt); issues opened only for tests that fail with Robin and pass with PySpark, excluding the 17 above. Script: scripts/create_robin_github_issues_from_results.py (uses --dry-run to preview).
- Second batch (19 issues): From tests/parity/sql/ and tests/parity/internal/. Robin run saved to tests/robin_parity_sql_internal_results.txt (23 failed, 32 passed). Those 23 run in PySpark → 19 passed, 4 skipped. Issues created for the 19 parity gaps.

Command to reproduce results:

    SPARKLESS_TEST_BACKEND=robin SPARKLESS_BACKEND=robin python -m pytest tests/parity/sql/ tests/parity/internal/ -v --tb=line -q 2>&1 | tee tests/robin_parity_sql_internal_results.txt

Then run failed IDs in PySpark and create issues:

    python scripts/create_robin_github_issues_from_results.py \
      --robin-results tests/robin_parity_sql_internal_results.txt \
      --pyspark-results tests/pyspark_parity_sql_internal_results.txt \
      --no-already-filed

Use --dry-run to preview before creating issues.
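
The selection rule described above (fails with Robin, passes with PySpark, not already filed) comes down to a small set computation. The following is a minimal sketch of that rule, not the actual logic of scripts/create_robin_github_issues_from_results.py. It assumes each result line looks like `<test id> PASSED|FAILED|SKIPPED`, and the test IDs in the example are made up.

```python
import re

# Assumed line shape: "<path>::<test name> STATUS", optionally followed by a
# progress marker such as "[ 33%]". Other pytest output formats are ignored.
_LINE = re.compile(r"^(?P<test_id>\S+::\S+)\s+(?P<status>PASSED|FAILED|SKIPPED)\b")


def parse_results(text: str) -> dict[str, str]:
    """Map each test ID found in the text to its reported status."""
    results = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            results[match.group("test_id")] = match.group("status")
    return results


def parity_gaps(robin, pyspark, already_filed=frozenset()):
    """Tests that FAILED under Robin, PASSED under PySpark, and are not yet filed."""
    return sorted(
        test_id
        for test_id, status in robin.items()
        if status == "FAILED"
        and pyspark.get(test_id) == "PASSED"
        and test_id not in already_filed
    )


if __name__ == "__main__":
    # Hypothetical result lines for illustration only.
    robin_text = """
    tests/parity/sql/test_a.py::test_one FAILED [ 33%]
    tests/parity/sql/test_a.py::test_two PASSED [ 66%]
    tests/parity/sql/test_b.py::test_three FAILED [100%]
    """
    pyspark_text = """
    tests/parity/sql/test_a.py::test_one PASSED [ 50%]
    tests/parity/sql/test_b.py::test_three SKIPPED [100%]
    """
    for gap in parity_gaps(parse_results(robin_text), parse_results(pyspark_text)):
        print(gap)
```

A test that is skipped under PySpark is not counted as a gap here, which matches the 19-of-23 outcome reported for the second batch.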
Sparkless integration note (no upstream feature request needed)
Finding: Robin-sparkless already provides what Sparkless needs:
- Arbitrary schema: Use create_dataframe_from_rows(data, schema) where data is a list of dicts or lists and schema is a list of (column_name, dtype_str) (e.g. [("id", "bigint"), ("name", "string")]). The 3-column restriction applies only to create_dataframe().
- Operations: The DataFrame API already has filter, select, with_column, order_by, order_by_exprs, group_by, limit, union, union_by_name, join, and GroupedData (count, sum, avg, min, max, agg, etc.).

The gap is in Sparkless: our Robin materializer currently uses only create_dataframe (3-column) and supports only filter/select/limit. We will extend it to use create_dataframe_from_rows and to translate more operations to the existing robin-sparkless API. No upstream feature issues are required for "flexible schema" or "more operations."
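
To make the (data, schema) shape described above concrete, here is a small sketch of how a materializer might derive the schema list from a list of dict rows. The helper `infer_schema` is hypothetical and is not part of robin-sparkless. The dtype strings "bigint" and "string" come from the example above, while "double" and "boolean" are assumed names that should be checked against the library.

```python
# Python type -> dtype string. "bigint" and "string" appear in this document;
# "double" and "boolean" are assumptions, not confirmed robin-sparkless names.
_DTYPES = {bool: "boolean", int: "bigint", float: "double", str: "string"}


def infer_schema(rows: list[dict]) -> list[tuple[str, str]]:
    """Build [(column_name, dtype_str), ...] from the first non-None value per column."""
    if not rows:
        raise ValueError("cannot infer a schema from zero rows")
    schema = []
    for name in rows[0]:  # column order follows the first row
        sample = next((r[name] for r in rows if r.get(name) is not None), None)
        if type(sample) not in _DTYPES:
            raise TypeError(f"no dtype mapping for column {name!r}")
        schema.append((name, _DTYPES[type(sample)]))
    return schema


if __name__ == "__main__":
    # Four columns, i.e. more than create_dataframe() is described as accepting.
    rows = [
        {"id": 1, "name": "a", "score": 0.5, "active": True},
        {"id": 2, "name": "b", "score": None, "active": False},
    ]
    schema = infer_schema(rows)
    print(schema)
    # The pair (rows, schema) is what would be handed to
    # create_dataframe_from_rows(data, schema); how that function is reached
    # (session method or module function) is not specified here.
```

Using an exact `type(...)` lookup rather than `isinstance` keeps `True`/`False` from being classified as integers.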
Bug report template
Title: [Bug] Short description of the bug
Body:
Description
[One or two sentences describing the incorrect behavior.]
To reproduce
[Minimal code or steps, e.g. Sparkless snippet that calls robin_sparkless and triggers the bug.]
# Example:
import robin_sparkless
# ...
Expected behavior
[What you expect to happen.]
Actual behavior
[What actually happens (error message, wrong result, etc.).]
Environment
Python version:
robin-sparkless version:
OS:
Additional context
[Optional: stack trace, logs, or links to Sparkless integration code.]
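
The Environment fields in the template can be filled in by hand, or gathered with a short script such as the sketch below. It assumes the installed distribution is named "robin-sparkless", which may not match the name actually used on the package index, and falls back to "not installed" when the lookup fails.

```python
import platform
import sys
from importlib import metadata


def environment_report(dist_name: str = "robin-sparkless") -> str:
    """Return the three Environment lines of the bug report template."""
    try:
        # dist_name is an assumed distribution name; adjust if it differs.
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = "not installed"
    lines = [
        f"Python version: {sys.version.split()[0]}",
        f"robin-sparkless version: {version}",
        f"OS: {platform.platform()}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

The output can be pasted directly under the Environment heading of a new issue.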