Storage Management
Storage Manager
Storage manager module.
This module provides a unified storage manager that can use different backends.
- class sparkless.storage.manager.TableMetadata(name, schema, created_at, table_schema, properties)
Bases: object
Metadata for a table in the catalog.
Initialize table metadata.
- class sparkless.storage.manager.StorageManagerFactory
Bases: object
Factory for creating storage managers.
- static create_memory_manager()
Create a memory storage manager.
- Return type: IStorageManager
- Returns: Memory storage manager instance.
- class sparkless.storage.manager.UnifiedStorageManager(backend)
Bases: IStorageManager
Unified storage manager that can switch between backends.
- __init__(backend)
Initialize unified storage manager.
- Parameters:
backend (IStorageManager) – Storage backend to use.
- get_current_schema()
Return the current schema used for unqualified table references.
- set_current_schema(schema_name)
Set the current schema used for unqualified table references.
- create_table(schema, table, columns)
Create a new table.
- Parameters:
schema (str) – Name of the schema.
table (str) – Name of the table.
columns (Union[List[StructField], StructType]) – Table columns definition.
- get_table_schema(schema_name, table_name)
Get table schema.
- Return type: Union[Any, StructType]
- Returns: Table schema.
- get_table_metadata(schema_name, table_name)
Get table metadata including Delta-specific fields.
- switch_backend(backend)
Switch to a different storage backend.
- Parameters:
backend (IStorageManager) – New storage backend to use.
- save_table_metadata(qualified_name, metadata)
Store table metadata.
- Parameters:
qualified_name (str) – Qualified table name (schema.table).
metadata (TableMetadata) – Table metadata to store.
- get_table_metadata_by_name(qualified_name)
Retrieve table metadata by qualified name.
- Parameters:
qualified_name (str) – Qualified table name (schema.table).
- Returns: Table metadata or None if not found.
- list_table_metadata(schema)
List all table metadata for a schema.
- Parameters:
schema (str) – Schema name.
- Returns: List of table metadata objects.
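The delegation behind a manager that "can switch between backends" can be sketched as below. This is a minimal illustration under stated assumptions, not the library's implementation: `ToyBackend`, `UnifiedManagerSketch`, the `list_tables` helper, and the `"default"` starting schema are hypothetical stand-ins, and the sketch assumes that switching backends does not migrate existing tables.

```python
from typing import Dict, List


class ToyBackend:
    """Hypothetical stand-in for an IStorageManager backend."""

    def __init__(self) -> None:
        # schema name -> table name -> column names
        self._tables: Dict[str, Dict[str, List[str]]] = {}

    def create_table(self, schema: str, table: str, columns: List[str]) -> None:
        self._tables.setdefault(schema, {})[table] = list(columns)

    def list_tables(self, schema: str) -> List[str]:
        # Helper for this sketch only; not taken from the documented API.
        return sorted(self._tables.get(schema, {}))


class UnifiedManagerSketch:
    """Forwards every call to whichever backend is currently active."""

    def __init__(self, backend: ToyBackend) -> None:
        self._backend = backend
        self._current_schema = "default"  # assumed starting schema

    def get_current_schema(self) -> str:
        return self._current_schema

    def set_current_schema(self, schema_name: str) -> None:
        self._current_schema = schema_name

    def create_table(self, schema: str, table: str, columns: List[str]) -> None:
        self._backend.create_table(schema, table, columns)

    def list_tables(self, schema: str) -> List[str]:
        return self._backend.list_tables(schema)

    def switch_backend(self, backend: ToyBackend) -> None:
        # Assumption: only the reference changes; no data is copied across.
        self._backend = backend


first, second = ToyBackend(), ToyBackend()
manager = UnifiedManagerSketch(first)
manager.create_table("default", "users", ["id", "name"])
print(manager.list_tables("default"))  # tables visible through the first backend
manager.switch_backend(second)
print(manager.list_tables("default"))  # the second backend starts empty
```

Because callers hold only the manager, they need no changes when the backend is swapped; whether the real class copies data on a switch is not stated in this reference.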
Storage Backends
Memory Storage
Memory storage backend.
This module provides an in-memory storage implementation.
- class sparkless.storage.backends.memory.MemoryTable(name, schema)
Bases: ITable
In-memory table implementation.
- __init__(name, schema)
Initialize memory table.
- Parameters:
name (str) – Table name.
schema (StructType) – Table schema.
- property schema: StructType
Get table schema.
- class sparkless.storage.backends.memory.MemorySchema(name)
Bases: object
In-memory database schema (namespace) implementation.
Initialize memory schema.
- Parameters:
name (str) – Schema name.
- create_table(table, columns)
Create a new table in this schema.
- Parameters:
table (str) – Name of the table.
columns (Union[List[StructField], StructType]) – Table columns definition.
- class sparkless.storage.backends.memory.MemoryStorageManager
Bases: IStorageManager
In-memory storage manager implementation.
Initialize memory storage manager.
- get_table_schema(schema_name, table_name)
Get table schema.
- Returns: Table schema.
- get_table_metadata(schema_name, table_name)
Get table metadata including Delta-specific fields.
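The `columns` argument of `create_table` is typed as either a list of fields or a full struct type. One way an in-memory schema could normalize both forms is sketched below. `FieldSketch`, `StructSketch`, and `MemorySchemaSketch` are hypothetical stand-ins for `StructField`, `StructType`, and `MemorySchema`, and the internal `tables` dict is an assumption about storage, not the library's actual layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass(frozen=True)
class FieldSketch:
    """Hypothetical stand-in for StructField."""
    name: str
    data_type: str
    nullable: bool = True


@dataclass
class StructSketch:
    """Hypothetical stand-in for StructType."""
    fields: List[FieldSketch] = field(default_factory=list)


class MemorySchemaSketch:
    """Namespace that keeps table schemas in a plain dict."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tables: Dict[str, StructSketch] = {}

    def create_table(
        self, table: str, columns: Union[List[FieldSketch], StructSketch]
    ) -> None:
        # Accept either form and always store a StructSketch.
        if isinstance(columns, StructSketch):
            schema = columns
        else:
            schema = StructSketch(list(columns))
        self.tables[table] = schema


namespace = MemorySchemaSketch("default")
namespace.create_table("from_list", [FieldSketch("id", "long")])
namespace.create_table("from_struct", StructSketch([FieldSketch("id", "long")]))
print(namespace.tables["from_list"] == namespace.tables["from_struct"])
```

Normalizing at creation time means every later lookup can rely on a single stored type.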
File Storage
File-based storage backend.
This module provides a file-based storage implementation using JSON files.
- class sparkless.storage.backends.file.FileTable(name, schema, file_path)
Bases: ITable
File-based table implementation.
- __init__(name, schema, file_path)
Initialize file table.
- Parameters:
name (str) – Table name.
schema (StructType) – Table schema.
file_path (str) – Path to table data file.
- property schema: StructType
Get table schema.
- class sparkless.storage.backends.file.FileSchema(name, base_path)
Bases: ISchema
File-based schema implementation.
Initialize file schema.
- create_table(table, columns)
Create a new table in this schema.
- Parameters:
table (str) – Name of the table.
columns (Union[List[StructField], StructType]) – Table columns definition.
- class sparkless.storage.backends.file.FileStorageManager(base_path='sparkless_storage')
Bases: IStorageManager
File-based storage manager implementation.
- __init__(base_path='sparkless_storage')
Initialize file storage manager.
- Parameters:
base_path (str) – Base path for storage files.
- create_table(schema_name, table_name, fields)
Create a new table.
- Parameters:
schema_name (str) – Name of the schema.
table_name (str) – Name of the table.
fields (Union[List[StructField], StructType]) – Table fields definition.
- get_table_schema(schema_name, table_name)
Get table schema.
- Returns: Table schema.
- get_table_metadata(schema_name, table_name)
Get table metadata including Delta-specific fields.
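A file-based backend that stores tables as JSON under a base path might look roughly like the sketch below. The directory layout (`<base_path>/<schema_name>/<table_name>.json`) and the `{"schema": ..., "rows": ...}` payload are assumptions made for illustration; this reference does not specify the on-disk format, and the functions here are hypothetical stand-ins rather than the library's methods.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def table_path(base_path: str, schema_name: str, table_name: str) -> Path:
    # Assumed layout: <base_path>/<schema_name>/<table_name>.json
    return Path(base_path) / schema_name / f"{table_name}.json"


def create_table(
    base_path: str, schema_name: str, table_name: str, fields: List[Dict[str, str]]
) -> Path:
    path = table_path(base_path, schema_name, table_name)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumed payload: the field definitions plus an initially empty row list.
    payload = {"schema": fields, "rows": []}
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def get_table_schema(
    base_path: str, schema_name: str, table_name: str
) -> List[Dict[str, str]]:
    path = table_path(base_path, schema_name, table_name)
    return json.loads(path.read_text(encoding="utf-8"))["schema"]


with tempfile.TemporaryDirectory() as base:
    create_table(base, "default", "users", [{"name": "id", "type": "long"}])
    print(get_table_schema(base, "default", "users"))
```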
Storage Models
Dataclass models for type-safe DuckDB storage operations.
This module provides dataclass-based models for Sparkless’s storage layer, ensuring type safety for all database operations.
- class sparkless.storage.models.StorageMode(value)
Storage operation modes with type safety.
- APPEND = 'append'
- OVERWRITE = 'overwrite'
- IGNORE = 'ignore'
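The three modes can be illustrated with a small enum and a write helper. This is a sketch only: `StorageModeSketch` and `write_rows` are hypothetical, and the semantics shown (append adds rows, overwrite replaces them, ignore leaves an existing table untouched) are borrowed from Spark's save modes as an assumption, since this reference lists the values without defining their behavior.

```python
from enum import Enum
from typing import Dict, List


class StorageModeSketch(Enum):
    APPEND = "append"
    OVERWRITE = "overwrite"
    IGNORE = "ignore"


def write_rows(
    tables: Dict[str, List[dict]],
    name: str,
    rows: List[dict],
    mode: StorageModeSketch,
) -> None:
    exists = name in tables
    if mode is StorageModeSketch.OVERWRITE or not exists:
        # New table, or explicit overwrite: store a fresh copy of the rows.
        tables[name] = list(rows)
    elif mode is StorageModeSketch.APPEND:
        tables[name].extend(rows)
    # IGNORE with an existing table: leave the stored rows untouched.


tables: Dict[str, List[dict]] = {}
write_rows(tables, "t", [{"id": 1}], StorageModeSketch("append"))  # lookup by value
write_rows(tables, "t", [{"id": 2}], StorageModeSketch.APPEND)
write_rows(tables, "t", [{"id": 3}], StorageModeSketch.IGNORE)
print(len(tables["t"]))
```

Looking a member up by its string value, as in `StorageModeSketch("append")`, raises `ValueError` for unknown strings, which is where the type safety of an enum over bare strings comes from.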
- class sparkless.storage.models.MockDeltaVersion(version, timestamp, operation, data_snapshot)
Bases: object
Represents a single version of a Delta table for time travel.
- class sparkless.storage.models.MockTableMetadata(table_name, schema_name='default', id=None, created_at=<factory>, updated_at=None, row_count=0, schema_version='1.0', storage_format='columnar', is_temporary=False, format=None, version=0, properties=<factory>, version_history=<factory>)
Bases: object
Type-safe table metadata model for DuckDB storage.
- version_history: List[MockDeltaVersion]
- __init__(table_name, schema_name='default', id=None, created_at=<factory>, updated_at=None, row_count=0, schema_version='1.0', storage_format='columnar', is_temporary=False, format=None, version=0, properties=<factory>, version_history=<factory>)
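A version history that supports time travel can be sketched as a list of snapshots. The field names below come from the `MockDeltaVersion` signature, but the field types, the zero-based version numbering, and the `record_version` and `snapshot_at` helpers are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class DeltaVersionSketch:
    """Hypothetical stand-in for MockDeltaVersion; field types are assumed."""
    version: int
    timestamp: datetime
    operation: str
    data_snapshot: List[Dict[str, Any]]


def record_version(
    history: List[DeltaVersionSketch], operation: str, rows: List[Dict[str, Any]]
) -> DeltaVersionSketch:
    # Copy each row so later edits to the caller's data do not alter history.
    entry = DeltaVersionSketch(
        version=len(history),
        timestamp=datetime.now(timezone.utc),
        operation=operation,
        data_snapshot=[dict(row) for row in rows],
    )
    history.append(entry)
    return entry


def snapshot_at(
    history: List[DeltaVersionSketch], version: int
) -> List[Dict[str, Any]]:
    for entry in history:
        if entry.version == version:
            return entry.data_snapshot
    raise KeyError(f"version {version} not found")


history: List[DeltaVersionSketch] = []
rows = [{"id": 1}]
record_version(history, "WRITE", rows)
rows.append({"id": 2})
record_version(history, "APPEND", rows)
print(snapshot_at(history, 0))  # the table as it looked at the first version
```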
- class sparkless.storage.models.ColumnDefinition(column_name, column_type, table_id=None, id=None, is_nullable=True, is_primary_key=False, default_value=None, column_order=0)
Bases: object
Type-safe column definition model for DuckDB tables.
- __init__(column_name, column_type, table_id=None, id=None, is_nullable=True, is_primary_key=False, default_value=None, column_order=0)
- class sparkless.storage.models.DuckDBTableModel(table_name, schema_name='default')
Bases: object
Base model for DuckDB table operations with type safety.
- class sparkless.storage.models.DuckDBConnectionConfig(database_path='sparkless.duckdb', read_only=False, memory_limit=None, thread_count=None, enable_extensions=True)
Bases: object
Type-safe configuration for DuckDB connections.
- class sparkless.storage.models.StorageOperationResult(success, rows_affected, operation_type, table_name, error_message=None, execution_time_ms=None)
Bases: object
Type-safe result model for storage operations.
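A result model of this shape is typically filled in by a wrapper that times the operation and captures any failure. The sketch below mirrors the field names in the `StorageOperationResult` signature; the field types, the `OperationResultSketch` class, and the `run_operation` helper are assumptions, not part of the documented API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OperationResultSketch:
    """Hypothetical stand-in for StorageOperationResult; types are assumed."""
    success: bool
    rows_affected: int
    operation_type: str
    table_name: str
    error_message: Optional[str] = None
    execution_time_ms: Optional[float] = None


def run_operation(
    operation_type: str, table_name: str, action: Callable[[], int]
) -> OperationResultSketch:
    """Run `action` (which returns a row count) and wrap the outcome."""
    start = time.perf_counter()
    try:
        rows = action()
    except Exception as exc:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        return OperationResultSketch(
            False, 0, operation_type, table_name, str(exc), elapsed_ms
        )
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return OperationResultSketch(
        True, rows, operation_type, table_name, None, elapsed_ms
    )


result = run_operation("insert", "users", lambda: 3)
print(result.success, result.rows_affected)
```

Returning a result object instead of raising lets callers handle success and failure through the same code path.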
Serialization
JSON serialization module.
This module provides JSON serialization and deserialization for storage.
- class sparkless.storage.serialization.json.JSONSerializer
Bases: object
JSON serializer for storage operations.
- static serialize_schema(schema, file_path)
Serialize schema to JSON file.
- Parameters:
schema (StructType) – Schema to serialize.
file_path (str) – Path to output file.
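Writing a schema to a JSON file and reading it back could be sketched as follows. The JSON layout (a top-level `"fields"` list), the `FieldSketch` stand-in for `StructField`, and the `deserialize_schema` helper are assumptions for illustration; the actual file format used by `JSONSerializer` is not described in this reference.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class FieldSketch:
    """Hypothetical stand-in for StructField."""
    name: str
    data_type: str
    nullable: bool = True


def serialize_schema(fields: List[FieldSketch], file_path: str) -> None:
    # Assumed layout: {"fields": [{"name": ..., "data_type": ..., "nullable": ...}]}
    payload = {"fields": [asdict(f) for f in fields]}
    with open(file_path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)


def deserialize_schema(file_path: str) -> List[FieldSketch]:
    # Hypothetical inverse of serialize_schema for the layout above.
    with open(file_path, "r", encoding="utf-8") as handle:
        payload = json.load(handle)
    return [FieldSketch(**item) for item in payload["fields"]]


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "schema.json")
    serialize_schema([FieldSketch("id", "long", nullable=False)], target)
    print(deserialize_schema(target))
```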
CSV serialization module.
This module provides CSV serialization and deserialization for storage.
- class sparkless.storage.serialization.csv.CSVSerializer
Bases: object
CSV serializer for storage operations.
- static serialize_schema(schema, file_path)
Serialize schema to CSV file.
- Parameters:
schema (StructType) – Schema to serialize.
file_path (str) – Path to output file.