Storage Management

Storage Manager

Storage manager module.

This module provides a unified storage manager that can use different backends.

class sparkless.storage.manager.TableMetadata(name, schema, created_at, table_schema, properties)[source]

Bases: object

Metadata for a table in the catalog.

Initialize table metadata.

Parameters:
  • name (str) – Table name.

  • schema (str) – Schema/database name.

  • created_at (datetime) – Creation timestamp.

  • table_schema (StructType) – Table schema structure.

  • properties (Dict[str, Any]) – Table properties.

__init__(name, schema, created_at, table_schema, properties)[source]

Initialize table metadata.

Parameters:
  • name (str) – Table name.

  • schema (str) – Schema/database name.

  • created_at (datetime) – Creation timestamp.

  • table_schema (StructType) – Table schema structure.

  • properties (Dict[str, Any]) – Table properties.
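
As a rough illustration of the record described above, the sketch below uses a stand-in dataclass rather than the real class. `TableMetadataSketch` and the plain string passed as `table_schema` are assumptions made so the example runs without Sparkless installed; the real class expects a StructType.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass
class TableMetadataSketch:
    """Hypothetical stand-in mirroring the five documented constructor arguments."""

    name: str
    schema: str
    created_at: datetime
    table_schema: Any  # a StructType in the documented class; untyped here
    properties: Dict[str, Any]


meta = TableMetadataSketch(
    name="events",
    schema="analytics",
    created_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    table_schema="id INT, payload STRING",  # placeholder, not a real StructType
    properties={"format": "delta"},
)

# A "schema.table" name can be derived from the two name fields.
qualified_name = f"{meta.schema}.{meta.name}"
```

All five arguments are required in the documented signature, so the stand-in declares no defaults either.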

class sparkless.storage.manager.StorageManagerFactory[source]

Bases: object

Factory for creating storage managers.

static create_memory_manager()[source]

Create a memory storage manager.

Return type:

IStorageManager

Returns:

Memory storage manager instance.

static create_file_manager(base_path='sparkless_storage')[source]

Create a file storage manager.

Parameters:

base_path (str) – Base path for storage files.

Return type:

IStorageManager

Returns:

File storage manager instance.

static create_polars_manager(db_path=None)[source]

Create a Polars storage manager (default in v3.0.0+).

Parameters:

db_path (Optional[str]) – Optional path for persistent storage. If None, uses in-memory storage.

Return type:

IStorageManager

Returns:

Polars storage manager instance.
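
The factory methods above can be pictured with a small stand-in. The sketch below is not the Sparkless implementation: the three `*BackendSketch` classes and `FactorySketch` are hypothetical, and only the method names and default arguments are taken from the signatures documented here.

```python
from typing import Optional


class MemoryBackendSketch:
    """Hypothetical in-memory backend."""


class FileBackendSketch:
    """Hypothetical file backend rooted at base_path."""

    def __init__(self, base_path: str = "sparkless_storage") -> None:
        self.base_path = base_path


class PolarsBackendSketch:
    """Hypothetical Polars backend; db_path=None is treated as in-memory."""

    def __init__(self, db_path: Optional[str] = None) -> None:
        self.db_path = db_path
        self.persistent = db_path is not None


class FactorySketch:
    """Static constructors, one per backend, as in the documented factory."""

    @staticmethod
    def create_memory_manager() -> MemoryBackendSketch:
        return MemoryBackendSketch()

    @staticmethod
    def create_file_manager(base_path: str = "sparkless_storage") -> FileBackendSketch:
        return FileBackendSketch(base_path)

    @staticmethod
    def create_polars_manager(db_path: Optional[str] = None) -> PolarsBackendSketch:
        return PolarsBackendSketch(db_path)


file_backend = FactorySketch.create_file_manager()
polars_backend = FactorySketch.create_polars_manager()
```

Keeping construction behind static methods lets callers choose a backend without importing the backend classes directly.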

class sparkless.storage.manager.UnifiedStorageManager(backend)[source]

Bases: IStorageManager

Unified storage manager that can switch between backends.

Initialize unified storage manager.

Parameters:

backend (IStorageManager) – Storage backend to use.

__init__(backend)[source]

Initialize unified storage manager.

Parameters:

backend (IStorageManager) – Storage backend to use.

create_schema(schema)[source]

Create a new schema.

Parameters:

schema (str) – Name of the schema to create.

Return type:

None

schema_exists(schema)[source]

Check if schema exists.

Parameters:

schema (str) – Name of the schema to check.

Return type:

bool

Returns:

True if schema exists, False otherwise.

drop_schema(schema_name, cascade=False)[source]

Drop a schema.

Parameters:
  • schema_name (str) – Name of the schema to drop.

  • cascade (bool) – Whether to cascade the drop operation.

Return type:

None

list_schemas()[source]

List all schemas.

Return type:

List[str]

Returns:

List of schema names.

get_current_schema()[source]

Return the current schema used for unqualified table references.

Return type:

str

set_current_schema(schema_name)[source]

Set the current schema used for unqualified table references.

Parameters:

schema_name (str) – Name of the schema to use for unqualified table references.

Return type:

None
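
One way to picture how a current schema interacts with unqualified table references is the helper below. `resolve_table` is a hypothetical function, and the rule that a dot separates schema from table is an assumption based on the "schema.table" form used elsewhere in this reference.

```python
from typing import Tuple


def resolve_table(reference: str, current_schema: str) -> Tuple[str, str]:
    """Split 'schema.table', or fall back to the current schema (hypothetical helper)."""
    schema, dot, table = reference.rpartition(".")
    if not dot:
        # No dot present: the reference is unqualified.
        return current_schema, reference
    return schema, table


unqualified = resolve_table("orders", current_schema="sales")
qualified = resolve_table("hr.people", current_schema="sales")
```

An explicitly qualified reference is left alone; only bare table names pick up the current schema.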

table_exists(schema, table)[source]

Check if table exists.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

Return type:

bool

Returns:

True if table exists, False otherwise.

create_table(schema, table, columns)[source]

Create a new table.

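Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

  • columns – Column definitions for the new table.

Return type: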

None

drop_table(schema, table)[source]

Drop a table.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

Return type:

None

insert_data(schema, table, data, mode='append')[source]

Insert data into table.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

  • data (List[Dict[str, Any]]) – Data to insert.

  • mode (str) – Insert mode ("append", "overwrite", or "ignore").

Return type:

None
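
The three insert modes can be sketched on plain lists. This is an illustration only: `apply_insert` is a hypothetical helper, and the reading of "ignore" as "leave a non-empty table untouched" is an assumption borrowed from Spark's save modes, not something this reference states.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def apply_insert(existing: List[Row], new_rows: List[Row], mode: str = "append") -> List[Row]:
    """Return the table contents after an insert in the given mode (hypothetical)."""
    if mode == "append":
        return existing + new_rows
    if mode == "overwrite":
        return list(new_rows)
    if mode == "ignore":
        # Assumption: existing data wins; new rows are used only for an empty table.
        return list(existing) if existing else list(new_rows)
    raise ValueError(f"unknown insert mode: {mode!r}")


table = [{"id": 1}]
appended = apply_insert(table, [{"id": 2}], mode="append")
overwritten = apply_insert(table, [{"id": 2}], mode="overwrite")
ignored = apply_insert(table, [{"id": 2}], mode="ignore")
```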

query_table(schema, table, filter_expr=None)[source]

Query data from table.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

  • filter_expr (Optional[str]) – Optional filter expression.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

get_table_schema(schema_name, table_name)[source]

Get table schema.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

Return type:

Union[Any, StructType]

Returns:

Table schema.

get_data(schema, table)[source]

Get all data from table.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

create_temp_view(name, dataframe)[source]

Create a temporary view from a DataFrame.

Parameters:
  • name (str) – Name of the temporary view.

  • dataframe (Any) – DataFrame to create view from.

Return type:

None

list_tables(schema_name=None)[source]

List tables in schema.

Parameters:

schema_name (Optional[str]) – Name of the schema. If None, list tables in all schemas.

Return type:

List[str]

Returns:

List of table names.

get_table_metadata(schema_name, table_name)[source]

Get table metadata including Delta-specific fields.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

Return type:

Union[Any, Dict[str, Any]]

Returns:

Table metadata.

update_table_metadata(schema, table, metadata_updates)[source]

Update table metadata fields.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

  • metadata_updates (Dict[str, Any]) – Dictionary of metadata fields to update.

Return type:

None

switch_backend(backend)[source]

Switch to a different storage backend.

Parameters:

backend (IStorageManager) – New storage backend to use.

Return type:

None
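
The delegation behind a switchable backend might look like the sketch below. `DictBackendSketch` and `UnifiedSketch` are hypothetical stand-ins. Whether the real `switch_backend` migrates existing data is not stated in this reference; the sketch assumes it does not.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


class DictBackendSketch:
    """Hypothetical backend that keeps rows in a dict keyed by (schema, table)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], List[Row]] = {}

    def insert_data(self, schema: str, table: str, data: List[Row]) -> None:
        self._rows.setdefault((schema, table), []).extend(data)

    def get_data(self, schema: str, table: str) -> List[Row]:
        return list(self._rows.get((schema, table), []))


class UnifiedSketch:
    """Hypothetical wrapper that forwards every call to the active backend."""

    def __init__(self, backend: DictBackendSketch) -> None:
        self._backend = backend

    def switch_backend(self, backend: DictBackendSketch) -> None:
        # Only the reference changes; this sketch does not copy existing data.
        self._backend = backend

    def insert_data(self, schema: str, table: str, data: List[Row]) -> None:
        self._backend.insert_data(schema, table, data)

    def get_data(self, schema: str, table: str) -> List[Row]:
        return self._backend.get_data(schema, table)


first, second = DictBackendSketch(), DictBackendSketch()
manager = UnifiedSketch(first)
manager.insert_data("default", "events", [{"id": 1}])
manager.switch_backend(second)
rows_after_switch = manager.get_data("default", "events")
```

Under that assumption, rows written before the switch stay in the first backend and are not visible through the manager afterwards.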

save_table_metadata(qualified_name, metadata)[source]

Store table metadata.

Parameters:
  • qualified_name (str) – Qualified table name (schema.table).

  • metadata (TableMetadata) – Table metadata to store.

Return type:

None

get_table_metadata_by_name(qualified_name)[source]

Retrieve table metadata by qualified name.

Parameters:

qualified_name (str) – Qualified table name (schema.table).

Return type:

Optional[TableMetadata]

Returns:

Table metadata or None if not found.

list_table_metadata(schema)[source]

List all table metadata for a schema.

Parameters:

schema (str) – Schema name.

Return type:

List[TableMetadata]

Returns:

List of table metadata objects.
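
The three metadata methods above share the "schema.table" key format. A minimal sketch of such a registry follows; `MetadataRegistrySketch` is hypothetical and stores plain dicts where the documented methods use TableMetadata objects.

```python
from typing import Any, Dict, List, Optional


class MetadataRegistrySketch:
    """Hypothetical registry keyed by qualified table name ('schema.table')."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Dict[str, Any]] = {}

    def save(self, qualified_name: str, metadata: Dict[str, Any]) -> None:
        self._by_name[qualified_name] = metadata

    def get(self, qualified_name: str) -> Optional[Dict[str, Any]]:
        # Returns None when the name is unknown, as documented for the lookup.
        return self._by_name.get(qualified_name)

    def list_for_schema(self, schema: str) -> List[Dict[str, Any]]:
        prefix = schema + "."
        return [meta for name, meta in self._by_name.items() if name.startswith(prefix)]


registry = MetadataRegistrySketch()
registry.save("sales.orders", {"name": "orders"})
registry.save("sales.refunds", {"name": "refunds"})
registry.save("hr.people", {"name": "people"})
sales_tables = registry.list_for_schema("sales")
```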

cleanup_temp_tables()[source]

Clean up temporary tables to free memory.

Return type:

None

optimize_storage()[source]

Optimize storage by cleaning up and compacting data.

Return type:

None

get_memory_usage()[source]

Get current memory usage statistics.

Return type:

Dict[str, Any]

Returns:

Dictionary with memory usage information.

force_garbage_collection()[source]

Force garbage collection to free memory.

Return type:

None

get_table_sizes()[source]

Get estimated sizes of all tables.

Return type:

Dict[str, int]

Returns:

Dictionary mapping table names to estimated sizes.

cleanup_old_tables(max_age_hours=24)[source]

Clean up tables older than specified age.

Parameters:

max_age_hours (int) – Maximum age in hours before cleanup.

Return type:

int

Returns:

Number of tables cleaned up.
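
Age-based cleanup of this kind can be sketched with a cutoff timestamp. `cleanup_old` is a hypothetical helper working on a dict of creation times; how the real method determines a table's age is not described here.

```python
from datetime import datetime, timedelta
from typing import Dict


def cleanup_old(created_at: Dict[str, datetime], now: datetime, max_age_hours: int = 24) -> int:
    """Remove entries created before the cutoff and return how many were removed."""
    cutoff = now - timedelta(hours=max_age_hours)
    stale = [name for name, created in created_at.items() if created < cutoff]
    for name in stale:
        del created_at[name]
    return len(stale)


now = datetime(2024, 1, 2, 12, 0)
tables = {
    "old_table": datetime(2024, 1, 1, 0, 0),   # 36 hours old
    "new_table": datetime(2024, 1, 2, 6, 0),   # 6 hours old
}
removed = cleanup_old(tables, now, max_age_hours=24)
```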

Storage Backends

Memory Storage

Memory storage backend.

This module provides an in-memory storage implementation.

class sparkless.storage.backends.memory.MemoryTable(name, schema)[source]

Bases: ITable

In-memory table implementation.

Initialize memory table.

Parameters:
  • name (str) – Table name.

  • schema (StructType) – Table schema.

__init__(name, schema)[source]

Initialize memory table.

Parameters:
  • name (str) – Table name.

  • schema (StructType) – Table schema.

property name: str

Get table name.

property schema: StructType

Get table schema.

property metadata: Dict[str, Any]

Get table metadata.

insert_data(data, mode='append')[source]

Insert data into table.

Parameters:
  • data (List[Dict[str, Any]]) – Data to insert.

  • mode (str) – Insert mode ("append", "overwrite", or "ignore").

Return type:

None

query_data(filter_expr=None)[source]

Query data from table.

Parameters:

filter_expr (Optional[str]) – Optional filter expression.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

get_schema()[source]

Get table schema.

Return type:

StructType

Returns:

Table schema.

get_metadata()[source]

Get table metadata.

Return type:

Dict[str, Any]

Returns:

Table metadata.

insert(data)[source]

Insert data into table.

Parameters:

data (List[Dict[str, Any]]) – Data to insert.

Return type:

None

query(**filters)[source]

Query data from table.

Parameters:

**filters (Any) – Optional filter parameters.

Return type:

List[Dict[str, Any]]
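
Keyword filters over a list of row dictionaries could be evaluated as below. `query_rows` is a hypothetical helper, and treating each keyword as an equality test on the column of the same name is an assumption; the reference does not define the filter semantics.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def query_rows(rows: List[Row], **filters: Any) -> List[Row]:
    """Keep rows whose columns equal every supplied keyword value (assumed semantics)."""
    return [
        row
        for row in rows
        if all(row.get(column) == value for column, value in filters.items())
    ]


rows = [
    {"id": 1, "country": "DE", "active": True},
    {"id": 2, "country": "FR", "active": True},
    {"id": 3, "country": "DE", "active": False},
]
german_active = query_rows(rows, country="DE", active=True)
everything = query_rows(rows)
```

With no keywords the helper returns every row, since `all()` over an empty sequence is true.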

count()[source]

Count rows in table.

Return type:

int

truncate()[source]

Truncate table.

Return type:

None

drop()[source]

Drop table.

Return type:

None

class sparkless.storage.backends.memory.MemorySchema(name)[source]

Bases: object

In-memory database schema (namespace) implementation.

Initialize memory schema.

Parameters:

name (str) – Schema name.

__init__(name)[source]

Initialize memory schema.

Parameters:

name (str) – Schema name.

create_table(table, columns)[source]

Create a new table in this schema.

Parameters:
  • table (str) – Name of the table.

  • columns – Column definitions for the new table.

Return type:

None

table_exists(table)[source]

Check if table exists in this schema.

Parameters:

table (str) – Name of the table.

Return type:

bool

Returns:

True if table exists, False otherwise.

drop_table(table)[source]

Drop a table from this schema.

Parameters:

table (str) – Name of the table.

Return type:

None

list_tables()[source]

List all tables in this schema.

Return type:

List[str]

Returns:

List of table names.

class sparkless.storage.backends.memory.MemoryStorageManager[source]

Bases: IStorageManager

In-memory storage manager implementation.

Initialize memory storage manager.

__init__()[source]

Initialize memory storage manager.

create_schema(schema)[source]

Create a new schema.

Parameters:

schema (str) – Name of the schema to create.

Return type:

None

schema_exists(schema)[source]

Check if schema exists.

Parameters:

schema (str) – Name of the schema to check.

Return type:

bool

Returns:

True if schema exists, False otherwise.

drop_schema(schema_name, cascade=False)[source]

Drop a schema.

Parameters:
  • schema_name (str) – Name of the schema to drop.

  • cascade (bool) – Whether to cascade the drop operation.

Return type:

None

list_schemas()[source]

List all schemas.

Return type:

List[str]

Returns:

List of schema names.

table_exists(schema, table)[source]

Check if table exists.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

Return type:

bool

Returns:

True if table exists, False otherwise.

create_table(schema_name, table_name, fields)[source]

Create a new table.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

  • fields (Union[List[Any], StructType]) – Table fields definition.

Return type:

None

drop_table(schema_name, table_name)[source]

Drop a table.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

Return type:

None

insert_data(schema_name, table_name, data)[source]

Insert data into table.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

  • data (List[Dict[str, Any]]) – Data to insert.

Return type:

None

query_data(schema_name, table_name, **filters)[source]

Query data from table.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

  • **filters (Any) – Optional filter parameters.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

query_table(schema, table, filter_expr=None)[source]

Query data from table.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

  • filter_expr (Optional[str]) – Optional filter expression.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

get_table_schema(schema_name, table_name)[source]

Get table schema.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

Return type:

StructType

Returns:

Table schema.

get_data(schema, table)[source]

Get all data from table.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

create_temp_view(name, dataframe)[source]

Create a temporary view from a DataFrame.

Parameters:
  • name (str) – Name of the temporary view.

  • dataframe (Any) – DataFrame to create view from.

Return type:

None

list_tables(schema_name=None)[source]

List tables in schema.

Parameters:

schema_name (Optional[str]) – Name of the schema. If None, list tables in all schemas.

Return type:

List[str]

Returns:

List of table names.

get_table_metadata(schema_name, table_name)[source]

Get table metadata including Delta-specific fields.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

Return type:

Dict[str, Any]

Returns:

Table metadata dictionary.

update_table_metadata(schema_name, table_name, metadata_updates)[source]

Update table metadata fields.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

  • metadata_updates (Dict[str, Any]) – Dictionary of metadata fields to update.

Return type:

None

close()[source]

Close storage backend and clean up resources.

For in-memory storage, this is a no-op as there are no external resources.

Return type:

None
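
Because the backends expose `close()`, they can be used with the standard library's `contextlib.closing`, which calls `close()` on exit. The sketch below uses a hypothetical `BackendSketch` class in place of a real storage manager.

```python
from contextlib import closing


class BackendSketch:
    """Hypothetical backend exposing only the close() method documented above."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


with closing(BackendSketch()) as backend:
    # Work with the backend here; close() has not been called yet.
    was_open_inside = not backend.closed
```

This pattern keeps calling code uniform even when `close()` is a no-op, as it is documented to be for in-memory storage.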

File Storage

File-based storage backend.

This module provides a file-based storage implementation using JSON files.

class sparkless.storage.backends.file.FileTable(name, schema, file_path)[source]

Bases: ITable

File-based table implementation.

Initialize file table.

Parameters:
  • name (str) – Table name.

  • schema (StructType) – Table schema.

  • file_path (str) – Path to table data file.

__init__(name, schema, file_path)[source]

Initialize file table.

Parameters:
  • name (str) – Table name.

  • schema (StructType) – Table schema.

  • file_path (str) – Path to table data file.

property name: str

Get table name.

property schema: StructType

Get table schema.

property metadata: Dict[str, Any]

Get table metadata.

insert_data(data, mode='append')[source]

Insert data into table.

Parameters:
  • data (List[Dict[str, Any]]) – Data to insert.

  • mode (str) – Insert mode ("append", "overwrite", or "ignore").

Return type:

None

query_data(filter_expr=None)[source]

Query data from table.

Parameters:

filter_expr (Optional[str]) – Optional filter expression.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

get_schema()[source]

Get table schema.

Return type:

StructType

Returns:

Table schema.

get_metadata()[source]

Get table metadata.

Return type:

Dict[str, Any]

Returns:

Table metadata.

insert(data)[source]

Insert data into table.

Parameters:

data (List[Dict[str, Any]]) – Data to insert.

Return type:

None

query(**filters)[source]

Query data from table.

Parameters:

**filters (Any) – Optional filter parameters.

Return type:

List[Dict[str, Any]]

count()[source]

Count rows in table.

Return type:

int

truncate()[source]

Truncate table.

Return type:

None

drop()[source]

Drop table.

Return type:

None

class sparkless.storage.backends.file.FileSchema(name, base_path)[source]

Bases: ISchema

File-based schema implementation.

Initialize file schema.

Parameters:
  • name (str) – Schema name.

  • base_path (str) – Base path for schema files.

__init__(name, base_path)[source]

Initialize file schema.

Parameters:
  • name (str) – Schema name.

  • base_path (str) – Base path for schema files.

create_table(table, columns)[source]

Create a new table in this schema.

Parameters:
  • table (str) – Name of the table.

  • columns – Column definitions for the new table.

Return type:

None

table_exists(table)[source]

Check if table exists in this schema.

Parameters:

table (str) – Name of the table.

Return type:

bool

Returns:

True if table exists, False otherwise.

drop_table(table)[source]

Drop a table from this schema.

Parameters:

table (str) – Name of the table.

Return type:

None

list_tables()[source]

List all tables in this schema.

Return type:

List[str]

Returns:

List of table names.

property fields: List[Any]

Get schema fields.

add_field(field)[source]

Add field to schema.

Parameters:

field (Any) – Field to add.

Return type:

None

remove_field(field_name)[source]

Remove field from schema.

Parameters:

field_name (str) – Name of the field to remove.

Return type:

None

get_field(field_name)[source]

Get field by name.

Parameters:

field_name (str) – Name of the field to look up.

Return type:

Optional[Any]

field_names()[source]

Get field names.

Return type:

List[str]

field_types()[source]

Get field types.

Return type:

Dict[str, Any]

__eq__(other)[source]

Check equality with another schema.

Parameters:

other (Any) – Object to compare with this schema.

Return type:

bool

__hash__()[source]

Get hash for schema.

Return type:

int

__str__()[source]

Get string representation.

Return type:

str

__repr__()[source]

Get representation.

Return type:

str

class sparkless.storage.backends.file.FileStorageManager(base_path='sparkless_storage')[source]

Bases: IStorageManager

File-based storage manager implementation.

Initialize file storage manager.

Parameters:

base_path (str) – Base path for storage files.

__init__(base_path='sparkless_storage')[source]

Initialize file storage manager.

Parameters:

base_path (str) – Base path for storage files.

create_schema(schema)[source]

Create a new schema.

Parameters:

schema (str) – Name of the schema to create.

Return type:

None

schema_exists(schema)[source]

Check if schema exists.

Parameters:

schema (str) – Name of the schema to check.

Return type:

bool

Returns:

True if schema exists, False otherwise.

drop_schema(schema_name, cascade=False)[source]

Drop a schema.

Parameters:
  • schema_name (str) – Name of the schema to drop.

  • cascade (bool) – Whether to cascade the drop operation.

Return type:

None

list_schemas()[source]

List all schemas.

Return type:

List[str]

Returns:

List of schema names.

table_exists(schema, table)[source]

Check if table exists.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

Return type:

bool

Returns:

True if table exists, False otherwise.

create_table(schema_name, table_name, fields)[source]

Create a new table.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

  • fields (Union[List[StructField], StructType]) – Table fields definition.

Return type:

None

drop_table(schema_name, table_name)[source]

Drop a table.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

Return type:

None

insert_data(schema_name, table_name, data)[source]

Insert data into table.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

  • data (List[Dict[str, Any]]) – Data to insert.

Return type:

None

query_data(schema_name, table_name, **filters)[source]

Query data from table.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

  • **filters (Any) – Optional filter parameters.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

query_table(schema, table, filter_expr=None)[source]

Query data from table.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

  • filter_expr (Optional[str]) – Optional filter expression.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

get_table_schema(schema_name, table_name)[source]

Get table schema.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

Return type:

StructType

Returns:

Table schema.

get_data(schema, table)[source]

Get all data from table.

Parameters:
  • schema (str) – Name of the schema.

  • table (str) – Name of the table.

Return type:

List[Dict[str, Any]]

Returns:

List of data rows.

create_temp_view(name, dataframe)[source]

Create a temporary view from a DataFrame.

Parameters:
  • name (str) – Name of the temporary view.

  • dataframe (Any) – DataFrame to create view from.

Return type:

None

list_tables(schema_name=None)[source]

List tables in schema.

Parameters:

schema_name (Optional[str]) – Name of the schema. If None, list tables in all schemas.

Return type:

List[str]

Returns:

List of table names.

get_table_metadata(schema_name, table_name)[source]

Get table metadata including Delta-specific fields.

Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

Return type:

Dict[str, Any]

update_table_metadata(schema_name, table_name, metadata_updates)[source]

Update table metadata fields.

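Parameters:
  • schema_name (str) – Name of the schema.

  • table_name (str) – Name of the table.

  • metadata_updates (Dict[str, Any]) – Dictionary of metadata fields to update.

Return type: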

None

close()[source]

Close storage backend and clean up resources.

For file-based storage, this is a no-op as files are managed per operation.

Return type:

None

Storage Models

Dataclass models for type-safe DuckDB storage operations.

This module provides dataclass-based models for Sparkless’s storage layer, ensuring type safety for all database operations.

class sparkless.storage.models.StorageMode(value)[source]

Bases: str, Enum

Storage operation modes with type safety.

APPEND = 'append'

OVERWRITE = 'overwrite'

IGNORE = 'ignore'

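
Since the enum derives from both `str` and `Enum`, its members compare equal to plain strings and can be looked up by value. The snippet below redeclares the enum locally from the definition above so that it runs without Sparkless installed.

```python
from enum import Enum


class StorageMode(str, Enum):
    """Local copy of the documented enum, for illustration only."""

    APPEND = "append"
    OVERWRITE = "overwrite"
    IGNORE = "ignore"


# Lookup by value returns the matching member.
mode = StorageMode("overwrite")

# The str mixin makes members equal to their plain-string values.
equals_plain_string = StorageMode.APPEND == "append"

# Unknown values raise ValueError, which catches typos early.
try:
    StorageMode("upsert")
    rejected_unknown = False
except ValueError:
    rejected_unknown = True
```
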
class sparkless.storage.models.MockDeltaVersion(version, timestamp, operation, data_snapshot)[source]

Bases: object

Represents a single version of a Delta table for time travel.

version: int

timestamp: datetime

operation: str

data_snapshot: List[Dict[str, Any]]

__init__(version, timestamp, operation, data_snapshot)

class sparkless.storage.models.MockTableMetadata(table_name, schema_name='default', id=None, created_at=<factory>, updated_at=None, row_count=0, schema_version='1.0', storage_format='columnar', is_temporary=False, format=None, version=0, properties=<factory>, version_history=<factory>)[source]

Bases: object

Type-safe table metadata model for DuckDB storage.

table_name: str

schema_name: str = 'default'

id: int | None = None

created_at: datetime

updated_at: datetime | None = None

row_count: int = 0

schema_version: str = '1.0'

storage_format: str = 'columnar'

is_temporary: bool = False

format: str | None = None

version: int = 0

properties: Dict[str, Any]

version_history: List[MockDeltaVersion]

__init__(table_name, schema_name='default', id=None, created_at=<factory>, updated_at=None, row_count=0, schema_version='1.0', storage_format='columnar', is_temporary=False, format=None, version=0, properties=<factory>, version_history=<factory>)

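
The `<factory>` defaults in the signature indicate dataclass fields built with `default_factory`, so each instance gets its own dict and list. The sketch below reproduces a subset of the fields with stand-in classes. `VersionSketch`, `MetadataSketch`, and the `record_version` helper are hypothetical, and the specific factories (`datetime.now`, `dict`, `list`) are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List


@dataclass
class VersionSketch:
    """Stand-in for a single table version."""

    version: int
    timestamp: datetime
    operation: str
    data_snapshot: List[Dict[str, Any]]


@dataclass
class MetadataSketch:
    """Stand-in with a subset of the documented fields."""

    table_name: str
    schema_name: str = "default"
    created_at: datetime = field(default_factory=datetime.now)  # assumed factory
    version: int = 0
    properties: Dict[str, Any] = field(default_factory=dict)
    version_history: List[VersionSketch] = field(default_factory=list)


def record_version(meta: MetadataSketch, operation: str, rows: List[Dict[str, Any]]) -> None:
    """Append a snapshot and advance the version counter (hypothetical helper)."""
    entry = VersionSketch(
        version=len(meta.version_history),
        timestamp=datetime.now(),
        operation=operation,
        data_snapshot=[dict(row) for row in rows],  # copy so later edits don't leak in
    )
    meta.version_history.append(entry)
    meta.version = entry.version


first = MetadataSketch("events")
second = MetadataSketch("users")
record_version(first, "WRITE", [{"id": 1}])
record_version(first, "APPEND", [{"id": 1}, {"id": 2}])
```

Reading an older snapshot from `version_history` is the basis for the time-travel behavior mentioned for MockDeltaVersion.
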
class sparkless.storage.models.ColumnDefinition(column_name, column_type, table_id=None, id=None, is_nullable=True, is_primary_key=False, default_value=None, column_order=0)[source]

Bases: object

Type-safe column definition model for DuckDB tables.

column_name: str

column_type: str

table_id: int | None = None

id: int | None = None

is_nullable: bool = True

is_primary_key: bool = False

default_value: str | None = None

column_order: int = 0

__init__(column_name, column_type, table_id=None, id=None, is_nullable=True, is_primary_key=False, default_value=None, column_order=0)

class sparkless.storage.models.DuckDBTableModel(table_name, schema_name='default')[source]

Bases: object

Base model for DuckDB table operations with type safety.

Parameters:
  • table_name (str)

  • schema_name (str)

table_name: str

schema_name: str = 'default'

get_full_name()[source]

Get fully qualified table name.

Return type:

str

__init__(table_name, schema_name='default')

Parameters:
  • table_name (str)

  • schema_name (str)

class sparkless.storage.models.DuckDBConnectionConfig(database_path='sparkless.duckdb', read_only=False, memory_limit=None, thread_count=None, enable_extensions=True)[source]

Bases: object

Type-safe configuration for DuckDB connections.

database_path: str = 'sparkless.duckdb'

read_only: bool = False

memory_limit: str | None = None

thread_count: int | None = None

enable_extensions: bool = True

__init__(database_path='sparkless.duckdb', read_only=False, memory_limit=None, thread_count=None, enable_extensions=True)

class sparkless.storage.models.StorageOperationResult(success, rows_affected, operation_type, table_name, error_message=None, execution_time_ms=None)[source]

Bases: object

Type-safe result model for storage operations.

success: bool

rows_affected: int

operation_type: str

table_name: str

error_message: str | None = None

execution_time_ms: float | None = None

__init__(success, rows_affected, operation_type, table_name, error_message=None, execution_time_ms=None)

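
A result record of this shape is typically filled in by a wrapper that times the operation and captures failures. The sketch below shows one way to do that; `ResultSketch` mirrors the documented fields, while `run_operation` is a hypothetical helper and not part of Sparkless.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ResultSketch:
    """Stand-in mirroring the documented result fields."""

    success: bool
    rows_affected: int
    operation_type: str
    table_name: str
    error_message: Optional[str] = None
    execution_time_ms: Optional[float] = None


def run_operation(operation_type: str, table_name: str, action: Callable[[], int]) -> ResultSketch:
    """Run action(), which returns a row count, and wrap the outcome (hypothetical)."""
    start = time.perf_counter()
    try:
        rows = action()
    except Exception as exc:
        elapsed_ms = (time.perf_counter() - start) * 1000
        return ResultSketch(False, 0, operation_type, table_name, str(exc), elapsed_ms)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return ResultSketch(True, rows, operation_type, table_name, None, elapsed_ms)


def failing_action() -> int:
    raise RuntimeError("disk full")


ok = run_operation("insert", "events", lambda: 3)
failed = run_operation("insert", "events", failing_action)
```

Returning a record instead of raising lets callers log failed operations and successful ones through the same path.
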
class sparkless.storage.models.QueryResult(data, row_count, column_count, query, execution_time_ms=None)[source]

Bases: object

Type-safe model for query results.

data: List[Dict[str, Any]]

row_count: int

column_count: int

query: str

execution_time_ms: float | None = None

__init__(data, row_count, column_count, query, execution_time_ms=None)

Serialization

JSON serialization module.

This module provides JSON serialization and deserialization for storage.

class sparkless.storage.serialization.json.JSONSerializer[source]

Bases: object

JSON serializer for storage operations.

static serialize_data(data, file_path)[source]

Serialize data to JSON file.

Parameters:
  • data (List[Dict[str, Any]]) – Data to serialize.

  • file_path (str) – Path to output file.

Return type:

None

static deserialize_data(file_path)[source]

Deserialize data from JSON file.

Parameters:

file_path (str) – Path to input file.

Return type:

List[Dict[str, Any]]

Returns:

Deserialized data.

static serialize_schema(schema, file_path)[source]

Serialize schema to JSON file.

Parameters:
  • schema (StructType) – Schema to serialize.

  • file_path (str) – Path to output file.

Return type:

None

static deserialize_schema(file_path)[source]

Deserialize schema from JSON file.

Parameters:

file_path (str) – Path to input file.

Return type:

StructType

Returns:

Deserialized schema.
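
A JSON round trip for row data can be sketched with the standard library alone. The two functions below borrow the documented method names but are standalone stand-ins; the on-disk layout that the real serializer writes is not specified here.

```python
import json
import os
import tempfile
from typing import Any, Dict, List

Row = Dict[str, Any]


def serialize_data(data: List[Row], file_path: str) -> None:
    """Write rows to file_path as a JSON array (stand-in implementation)."""
    with open(file_path, "w", encoding="utf-8") as handle:
        json.dump(data, handle)


def deserialize_data(file_path: str) -> List[Row]:
    """Read rows back from a JSON file written by serialize_data."""
    with open(file_path, "r", encoding="utf-8") as handle:
        return json.load(handle)


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": None}]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "rows.json")
    serialize_data(rows, path)
    restored = deserialize_data(path)
```

JSON preserves integers, strings, booleans, and nulls, so rows limited to those types come back unchanged.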

CSV serialization module.

This module provides CSV serialization and deserialization for storage.

class sparkless.storage.serialization.csv.CSVSerializer[source]

Bases: object

CSV serializer for storage operations.

static serialize_data(data, file_path)[source]

Serialize data to CSV file.

Parameters:
  • data (List[Dict[str, Any]]) – Data to serialize.

  • file_path (str) – Path to output file.

Return type:

None

static deserialize_data(file_path)[source]

Deserialize data from CSV file.

Parameters:

file_path (str) – Path to input file.

Return type:

List[Dict[str, Any]]

Returns:

Deserialized data.

static serialize_schema(schema, file_path)[source]

Serialize schema to CSV file.

Parameters:
  • schema (StructType) – Schema to serialize.

  • file_path (str) – Path to output file.

Return type:

None

static deserialize_schema(file_path)[source]

Deserialize schema from CSV file.

Parameters:

file_path (str) – Path to input file.

Return type:

StructType

Returns:

Deserialized schema.
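
The CSV counterpart can be sketched with `csv.DictWriter` and `csv.DictReader`. As with the JSON example, these functions are standalone stand-ins that only share names with the documented methods. Note that the `csv` module returns every value as a string, so type information has to be restored separately, for example from a stored schema.

```python
import csv
import os
import tempfile
from typing import Any, Dict, List


def serialize_data(data: List[Dict[str, Any]], file_path: str) -> None:
    """Write rows to a CSV file with a header row (stand-in implementation)."""
    fieldnames = list(data[0].keys()) if data else []
    with open(file_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(data)


def deserialize_data(file_path: str) -> List[Dict[str, str]]:
    """Read rows back; every value is returned as a string."""
    with open(file_path, "r", newline="", encoding="utf-8") as handle:
        return [dict(row) for row in csv.DictReader(handle)]


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "rows.csv")
    serialize_data(rows, path)
    restored = deserialize_data(path)
```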