Welcome to Sparkless
Test PySpark code at lightning speed, no JVM required
Sparkless is a lightweight PySpark replacement that runs your tests 10x faster by eliminating JVM overhead. Your existing PySpark code works unchanged: just swap the import.
# Before
from pyspark.sql import SparkSession
# After
from sparkless.sql import SparkSession
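If you want the swap to apply only in test runs, one option is to choose the module by name before importing it. The sketch below is an assumption and not part of Sparkless: the `USE_SPARKLESS` environment variable and both helper names are hypothetical. Only the two module paths come from the imports above.

```python
import importlib
import os


def session_module_name(env=None):
    """Return the name of the module to import SparkSession from.

    USE_SPARKLESS is a hypothetical environment variable chosen for this
    sketch; it is not a Sparkless or PySpark setting.
    """
    env = os.environ if env is None else env
    use_sparkless = env.get("USE_SPARKLESS", "0") == "1"
    return "sparkless.sql" if use_sparkless else "pyspark.sql"


def load_spark_session(env=None):
    """Import the selected module and return its SparkSession class.

    Raises ImportError if the selected package is not installed.
    """
    module = importlib.import_module(session_module_name(env))
    return module.SparkSession
```

With this in place, a test suite could set `USE_SPARKLESS=1` and call `load_spark_session()`, while production code keeps using PySpark.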
Key Features
10x Faster - No JVM startup (30s → 0.1s)
Drop-in Replacement - Use existing PySpark code unchanged
Zero Java - Pure Python with Polars backend (thread-safe, no SQL required)
100% Compatible - Full PySpark 3.2-3.5 API support
Lazy Evaluation - Mirrors PySpark's execution model
Production Ready - 2314+ passing tests, 100% mypy typed
Thread-Safe - Polars backend designed for parallel execution
Quick Start
from sparkless.sql import SparkSession, functions as F
# Create session
spark = SparkSession("MyApp")
# Your PySpark code works as-is
data = [{"name": "Alice", "age": 25}, {"name": "Bob", "age": 30}]
df = spark.createDataFrame(data)
# All operations work
result = df.filter(F.col("age") > 25).select("name").collect()
print(result)
# Output: [Row(name='Bob')]
Documentation Contents
Getting Started
API Reference
Guides
Advanced Topics
Additional Resources