Real‑time analytics on SAP HANA thrives on two pillars: speed and efficiency. While the in‑memory architecture delivers sub‑second query response times, the underlying column store compression determines how much data can be kept in memory, how quickly it can be scanned, and ultimately how scalable your analytics platform becomes.
In this article we’ll dive deep into the mechanics of SAP HANA’s column‑store compression, uncover common pitfalls, and walk through actionable steps you can apply today to squeeze every last byte of storage efficiency without sacrificing query performance. Whether you’re a data modeler, a DBA, or an ABAP developer responsible for data provisioning, the techniques below will help you keep the column store lean, hot, and ready for the next wave of real‑time insights.
SAP HANA stores tables in two physical formats:
| Format | Ideal Use‑Case | Compression Potential |
|---|---|---|
| Row Store | Transactional (OLTP) workloads, frequent updates | Limited (mostly dictionary) |
| Column Store | Analytical (OLAP) workloads, bulk reads | High (multiple algorithms) |
In the column store, each column is stored contiguously, enabling vectorized processing and massively parallel scans. Because values of the same column often exhibit redundancy (e.g., repeated status codes, timestamps with small deltas), HANA can apply a suite of compression algorithms that dramatically reduce the memory footprint.
At its core, HANA's column store applies a pipeline of compression techniques: dictionary encoding, run‑length encoding (RLE), delta encoding, and prefix encoding.
The engine automatically selects the best combination per column, but table design and data‑type choices can tip the scales in favor of more aggressive compression.
In short, compression is not a nice‑to‑have; it’s a performance enabler for any real‑time analytics scenario.
When a column contains a limited set of distinct values (e.g., country codes, status flags), HANA builds a dictionary mapping each unique value to a small integer key. The column itself stores only these keys.
-- Example: Inspect dictionary cardinality and the chosen compression for a column
SELECT TABLE_NAME, COLUMN_NAME, DISTINCT_COUNT, COMPRESSION_TYPE
FROM M_CS_COLUMNS
WHERE TABLE_NAME = 'SALES_ORDER' AND COLUMN_NAME = 'ORDER_STATUS';
Because the dictionary stores each distinct value once, its size is driven by cardinality rather than declared length; still, compact and consistent formats (e.g., a fixed three‑character country code) keep every dictionary entry small.
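The mechanism is easy to sketch outside of HANA. Below is a minimal Python illustration of dictionary encoding; the status values are hypothetical, and HANA's actual implementation differs in detail (e.g., it keeps the dictionary sorted):

```python
def dictionary_encode(values):
    """Map each distinct value to a small integer key; the column stores only keys."""
    dictionary = {}  # value -> integer key
    keys = []
    for v in values:
        if v not in dictionary:
            dictionary[v] = len(dictionary)
        keys.append(dictionary[v])
    return dictionary, keys

statuses = ["OPEN", "OPEN", "SHIPPED", "OPEN", "SHIPPED", "CLOSED"]
dictionary, keys = dictionary_encode(statuses)
print(dictionary)  # {'OPEN': 0, 'SHIPPED': 1, 'CLOSED': 2}
print(keys)        # [0, 0, 1, 0, 1, 2]
```

Three distinct values need only two bits per row, regardless of how long the original strings are.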
If a column’s dictionary keys repeat consecutively (common in time‑series data where a status holds for many rows), HANA stores the value once together with a run length.
-- Check whether the engine chose RLE for a column and the ratio achieved
SELECT TABLE_NAME, COLUMN_NAME, COMPRESSION_TYPE, COMPRESSION_RATIO_IN_PERCENTAGE
FROM M_CS_COLUMNS
WHERE TABLE_NAME = 'SENSOR_READINGS' AND COLUMN_NAME = 'STATUS';
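Conceptually, RLE collapses consecutive repeats of a dictionary key into (value, run length) pairs. A toy Python sketch (not HANA's internal format):

```python
def rle_encode(keys):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for k in keys:
        if runs and runs[-1][0] == k:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([k, 1])       # start a new run
    return [(v, n) for v, n in runs]

# A status column where each value holds for many consecutive rows
print(rle_encode([0, 0, 0, 0, 1, 1, 2, 2, 2]))  # [(0, 4), (1, 2), (2, 3)]
```

Nine stored values shrink to three pairs; the longer the runs, the better the ratio.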
Numeric columns that increase monotonically (e.g., timestamps, IDs) are stored as the difference from the previous value. Small deltas compress well with bit‑packing.
-- Inspect the compression chosen for a monotonically increasing column
SELECT TABLE_NAME, COLUMN_NAME, COMPRESSION_TYPE, COMPRESSION_RATIO_IN_PERCENTAGE
FROM M_CS_COLUMNS
WHERE TABLE_NAME = 'EVENT_LOG' AND COLUMN_NAME = 'EVENT_TIME';
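The principle of delta encoding can be shown in a few lines of Python (the timestamps are made up; HANA's actual storage format differs):

```python
def delta_encode(values):
    """Store the first value, then only the difference to the previous value."""
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    """Rebuild the original values by accumulating the deltas."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

timestamps = [1700000000, 1700000003, 1700000004, 1700000010]
encoded = delta_encode(timestamps)
print(encoded)                          # [1700000000, 3, 1, 6]
assert delta_decode(encoded) == timestamps
print(max(encoded[1:]).bit_length())    # 3 – small deltas bit-pack into a few bits
```

Raw epoch seconds need about 31 bits each; the deltas here fit in 3 bits, which is why monotonic columns compress so well.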
Long strings often share common prefixes (e.g., product codes PROD-2023-001). HANA extracts the shared part, storing only the unique suffix per row.
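A simplified Python sketch of the idea (in HANA, prefix compression is applied within the sorted dictionary rather than per row, so treat this purely as an illustration):

```python
import os

def prefix_encode(values):
    """Factor out the longest common prefix; store it once plus per-value suffixes."""
    prefix = os.path.commonprefix(values)
    return prefix, [v[len(prefix):] for v in values]

codes = ["PROD-2023-001", "PROD-2023-002", "PROD-2023-117"]
prefix, suffixes = prefix_encode(codes)
print(prefix)    # PROD-2023-
print(suffixes)  # ['001', '002', '117']
```

Ten shared characters are stored once instead of once per value.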
Before you start tweaking, you need a baseline. SAP HANA provides a set of system views that surface compression statistics at the table, column, and partition level.
| View | Purpose |
|---|---|
| M_CS_TABLES | High‑level table compression metrics (compressed vs. estimated uncompressed size). |
| M_CS_COLUMNS | Per‑column compression details, algorithm used, compression ratio. |
| M_CS_PARTITIONS | Partition‑specific stats – useful when tables are horizontally partitioned. |
/* Overall compression health for all column-store tables */
SELECT t.SCHEMA_NAME,
t.TABLE_NAME,
t.RECORD_COUNT,
t.ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL/1024/1024 AS UNCOMPRESSED_MB,
t.MEMORY_SIZE_IN_TOTAL/1024/1024 AS COMPRESSED_MB,
ROUND(100 * (1 - TO_DECIMAL(t.MEMORY_SIZE_IN_TOTAL) / t.ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL), 1) AS "COMPRESSION_%"
FROM M_CS_TABLES t
WHERE t.ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL > 0
ORDER BY "COMPRESSION_%" DESC
LIMIT 20;
Tables at the top of this ranking already compress well. Flip the sort to ascending to surface the poorly compressed tables; those are the prime candidates for re‑evaluation, for example through tighter data types or additional partitions.
Below are concrete steps you can take, grouped by impact level. Each recommendation includes a code snippet or configuration tip you can apply immediately.
| Column Type | Recommended Choice | Reason |
|---|---|---|
| Integer IDs | INTEGER; move to BIGINT only if values can exceed the ±2.1 billion INTEGER range | Smaller integers allow tighter bit‑packing. |
| Fixed‑Length Text | NVARCHAR(n) with the exact constant length | Uniform values keep dictionary entries compact and prefix‑friendly. |
| Variable Text | NVARCHAR with a tight maximum | Reduces dictionary overhead. |
| Dates/Times | DATE, TIMESTAMP (no timezone) | Delta encoding works best on pure numeric representation. |
ABAP Example – Defining a CDS View with Optimal Types
@AbapCatalog.sqlViewName: 'ZV_SALES_ORDER'
@AbapCatalog.compiler.compareFilter: true
define view Z_C_SalesOrder
as select from sflight
{
key carrid as CarrierID, // CHAR(3) – small, stable dictionary
key connid as ConnectionID, // NUMC(4)
key fldate as FlightDate, // DATS – delta friendly
price as Price, // CURR
seatsmax as SeatsMax, // INT4
seatsocc as SeatsOccupied // INT4
}
Partitioning a massive fact table by a time‑based column (e.g., by month or year) not only improves query pruning but also creates independent compression domains. Each partition develops its own dictionary, often yielding a higher overall compression ratio.
-- Create a partitioned table (monthly partitions)
CREATE COLUMN TABLE SALES_FACT (
SALES_ID BIGINT,
SALES_DATE DATE,
REGION NVARCHAR(20),
PRODUCT_ID BIGINT,
QUANTITY INTEGER,
REVENUE DECIMAL(15,2)
)
PARTITION BY RANGE (SALES_DATE) (
PARTITION '2022-01-01' <= VALUES < '2022-02-01',
PARTITION '2022-02-01' <= VALUES < '2022-03-01',
PARTITION '2022-03-01' <= VALUES < '2022-04-01',
-- … continue for each month …
PARTITION OTHERS
);
After loading data, trigger a delta merge and ask the engine to re‑evaluate compression (and dictionaries) for the table.
MERGE DELTA OF SALES_FACT;
UPDATE SALES_FACT WITH PARAMETERS ('OPTIMIZE_COMPRESSION' = 'FORCE');
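Why independent compression domains help can be shown with a toy Python model (hypothetical product/month data, not HANA internals): when values cluster in time, each monthly partition's dictionary covers only the values present in that month.

```python
from collections import defaultdict

def distinct_per_partition(rows, partition_key):
    """Count distinct products per partition; each set models one partition dictionary."""
    parts = defaultdict(set)
    for row in rows:
        parts[partition_key(row)].add(row["product"])
    return {p: len(s) for p, s in sorted(parts.items())}

# Hypothetical fact rows: the product assortment rotates over time,
# so consecutive months overlap only partially
rows = (
    [{"month": "2022-01", "product": f"P{i}"} for i in range(100)]
    + [{"month": "2022-02", "product": f"P{i}"} for i in range(80, 180)]
)
per_part = distinct_per_partition(rows, lambda r: r["month"])
print(per_part)  # {'2022-01': 100, '2022-02': 100}
```

Each partition dictionary holds 100 entries, while a single global dictionary would need 180; smaller dictionaries mean smaller keys and better scan locality.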
Replace long, repetitive strings with surrogate keys (e.g., CustomerName → CustomerID). The surrogate key benefits from dictionary + RLE, while the original string can be stored in a separate dimension table.

When loading bulk data via SAP HANA Smart Data Integration (SDI), SLT, or ABAP ODP, configure the pipeline to bulk‑load rather than row‑by‑row inserts. Bulk loading allows the engine to build dictionaries once per batch, which is far more efficient than constantly updating them. The flowgraph‑style configuration below is illustrative; exact option names depend on your adapter and release.
{
"source": {
"type": "jdbc",
"connection": "oracle_source",
"query": "SELECT * FROM ORDERS"
},
"target": {
"type": "hana",
"table": "ORDERS_FACT",
"bulkLoad": true,
"batchSize": 50000
},
"options": {
"compression": "auto",
"rebuildDictionary": true
}
}
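The effect of batching on dictionary maintenance can be sketched in Python. This is a toy model, not HANA's actual merge logic: it simply counts how often the dictionary has to be extended.

```python
def load_rows(batches):
    """Extend the dictionary at most once per batch instead of once per row."""
    dictionary = {}
    extensions = 0
    for batch in batches:
        new_values = set(batch) - dictionary.keys()
        if new_values:
            # one dictionary extension per batch, however many rows it holds
            for v in sorted(new_values):
                dictionary[v] = len(dictionary)
            extensions += 1
    return dictionary, extensions

rows = ["DE", "FR", "DE", "US", "FR", "DE"] * 1000
# Row-by-row loading: every row is its own "batch"
_, row_by_row = load_rows([[r] for r in rows])
# Bulk loading: one large batch, as with batchSize = 50000 above
_, bulk = load_rows([rows])
print(row_by_row, bulk)  # 3 1
```

Even in this tiny example the bulk load touches the dictionary once; with realistic cardinalities the row‑by‑row variant pays the extension cost thousands of times.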
SAP HANA provides a global compression parameter that influences the aggressiveness of the dictionary size limit and RLE thresholds.
-- Show current compression-related parameters
SELECT * FROM SYS.M_INIFILE_CONTENTS
WHERE SECTION LIKE '%compression%';
-- Example: change a parameter system-wide
-- (the DICTIONARY_MAX_SIZE key is illustrative – verify it exists in your release)
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
SET ('compression', 'DICTIONARY_MAX_SIZE') = '8192' WITH RECONFIGURE;
Caution: Changing global.ini affects the entire system. Test in a non‑production environment first and monitor memory usage closely.
If a table has undergone heavy updates or deletions, its compression may degrade over time. Force a fresh compression pass by requesting a re‑evaluation and triggering a delta merge:
-- Request a full compression re-evaluation, then merge
UPDATE SALES_FACT WITH PARAMETERS ('OPTIMIZE_COMPRESSION' = 'FORCE');
MERGE DELTA OF SALES_FACT;
While the column store is the default for analytical workloads, some use cases (e.g., write‑intensive master data) still rely on the row store. The row store offers little compression compared with the column store, so the gains are typically modest. If you must keep a table in the row store, consider:
- Keeping the table narrow, with few and compact columns.
- Moving it to the column store later, either by recreating it with the COLUMN keyword in the CREATE TABLE statement or via ALTER TABLE EMPLOYEE_MASTER ALTER TYPE COLUMN.

CREATE ROW TABLE EMPLOYEE_MASTER (
EMP_ID BIGINT,
FIRST_NAME NVARCHAR(30),
LAST_NAME NVARCHAR(30),
DEPT_ID INTEGER
);
SAP HANA Native Storage Extension (NSE) lets you keep warm data page‑loadable on disk instead of fully resident in memory, while still leveraging HANA's compression engine. Warm data can be declared at table, partition, or column granularity:
-- Mark an entire table (and its dependents) as page-loadable (warm) under NSE
ALTER TABLE SALES_FACT PAGE LOADABLE CASCADE;
When provisioning data from SAP ECC or S/4HANA to HANA, some replication frameworks let you set load options programmatically. The snippet below is illustrative pseudo‑code; verify the actual API of your replication technology and release before using it:
" Illustrative only – the interface and option keys depend on your replication stack
DATA lo_replication TYPE REF TO /bobf/if_odb_replication.
lo_replication->set_option( iv_key = 'COMPRESSION'         iv_value = 'TRUE' ).
lo_replication->set_option( iv_key = 'DICTIONARY_MAX_SIZE' iv_value = '4096' ).
If the framework supports these options, the downstream SLT replication honors them, so the initial load arrives already well compressed.
REPORT zpush_sales_data.

DATA: lt_sales TYPE TABLE OF zs_sales_raw,
      lo_conn  TYPE REF TO cl_sql_connection.

SELECT * FROM zs_sales_raw INTO TABLE lt_sales UP TO 100000 ROWS.

" Open a secondary (ADBC) connection – 'HANA' must be configured in DBCON
lo_conn = cl_sql_connection=>get_connection( con_name = 'HANA' ).

" One prepared mass insert instead of 100,000 single-row inserts
DATA(lo_stmt) = lo_conn->prepare_statement(
  `INSERT INTO SALES_FACT VALUES ( ?, ?, ?, ?, ?, ? )` ).
lo_stmt->set_param_table( REF #( lt_sales ) ).
lo_stmt->execute_update( ).
lo_conn->commit( ).
A single mass insert lets the column store extend its dictionaries once per batch rather than once per row, keeping dictionary‑maintenance overhead low.
-- Periodic check (schedule via an XS job or external scheduler):
-- flag tables whose achieved compression falls below 30 %
SELECT SCHEMA_NAME, TABLE_NAME
FROM M_CS_TABLES
WHERE ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL > 0
AND TO_DECIMAL(MEMORY_SIZE_IN_TOTAL) / ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL > 0.70;
A typical maintenance window could include:
| Frequency | Action |
|---|---|
| Daily | Run M_CS_TABLES health check; flag tables with compression drop > 5 %. |
| Weekly | Rebuild partitions that exceed a defined row‑change threshold (e.g., > 10 %). |
| Monthly | Full‑table recompression for static dimension tables. |
| Quarterly | Review data‑type choices and partitioning strategy; adjust global.ini if needed. |
Automation can be achieved via SQLScript procedures combined with SAP HANA XS or SAP Cloud Platform jobs.
CREATE PROCEDURE REBUILD_TABLE_COMPRESSION (IN iv_table_name NVARCHAR(128))
LANGUAGE SQLSCRIPT
AS
BEGIN
DECLARE sql_stmt NVARCHAR(500);
-- Force a compression re-evaluation for the given table
sql_stmt := 'UPDATE "' || :iv_table_name || '" WITH PARAMETERS (''OPTIMIZE_COMPRESSION'' = ''FORCE'')';
EXECUTE IMMEDIATE :sql_stmt;
END;
- Monitor compression continuously via the system views (M_CS_TABLES, M_CS_COLUMNS) and set proactive alerts.
- Tuning global.ini settings can push compression further, but requires careful testing.

The next generation of SAP HANA is already embracing AI‑driven adaptive compression. Early prototypes use machine‑learning models to predict the optimal combination of dictionary size, RLE thresholds, and delta encoding per column based on data‑distribution trends.
