Delta Column Mapping
Delta Lake supports column mapping, which allows Delta table columns and the
corresponding Parquet columns to use different names. Column mapping enables
Delta schema evolution operations such as RENAME COLUMN
on a Delta table
without the need to rewrite the underlying Parquet files. It also allows users
to name Delta table columns by using characters that are not allowed by Parquet,
such as spaces, so that users can directly ingest CSV or JSON data into Delta
without the need to rename columns due to previous character constraints.
Requirements
- DBR 10.2 or above.
- Column mapping requires the Delta table version to be reader
version 2 and writer version 5. For a Delta table with the required table
version, you can enable column mapping by setting
delta.columnMappingMode
toname
. You can upgrade the table version and enable column mapping by using a singleALTER TABLE
command:
ALTER TABLE <table_name> SET TBLPROPERTIES (
'delta.minReaderVersion' = '2',
'delta.minWriterVersion' = '5',
'delta.columnMapping.mode' = 'name'
)
Note
After you set these properties in the table, you can only read from and write to this Delta table by using DBR 10.2 and above.
Supported characters in column names
When column mapping is enabled for a Delta table, you can include spaces as well
as any of these characters in the table's column names: ,;{}()\n\t=
.
Rename a column
When column mapping is enabled for a Delta table, you can rename a column:
ALTER TABLE <table_name> RENAME COLUMN old_col_name TO new_col_name
For more examples, see _.