Data Compaction
The query handler "/compact" merges documents containing small chunks of data into documents that each contain one large chunk. In most cases this improves query speed and reduces disk usage.
Here's an example using curl: curl 'http://localhost:8913/solr/chronix/compact?joinKey=metric'
Parameters
Parameter | Description |
---|---|
joinKey | Comma-separated list of field names. It represents the primary key of your time series. All documents with the same values for the given fields are merged. |
ppc | Points per chunk. Defines the number of data points that are merged into one document. |
Note: Choose your joinKey carefully. Otherwise you may merge data chunks of two independent time series into one.
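Putting both parameters together, a hypothetical invocation that compacts per metric with 10,000 points per chunk could look like this (host, port, and core name follow the curl example above; adjust them to your setup, and note this requires a running instance):

```shell
# Hypothetical: compact all documents sharing the same "metric" value,
# merging 10,000 data points into each compacted document.
curl 'http://localhost:8913/solr/chronix/compact?joinKey=metric&ppc=10000'
```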
Performance Factors
The data set contains 112,815,835 timestamp/value pairs (long, double).
Compaction | ppc | Record count | Disk Usage (MB) | Compaction Time (sec) |
---|---|---|---|---|
no | n/a | 610,355 | 212 | n/a |
yes | 100 | 1,422,369 | 357 | 134 |
yes | 500 | 284,815 | 187 | 75 |
yes | 1,000 | 142,573 | 160 | 93 |
yes | 5,000 | 28,850 | 131 | 69 |
yes | 10,000 | 14,797 | 126 | 61 |
yes | 25,000 | 6,408 | 123 | 61 |
yes | 100,000 | 2,051 | 121 | 60 |
yes | 500,000 | 920 | 119 | 63 |
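The record counts above can be reasoned about with a bit of arithmetic: chunks are built per time series (per joinKey value), so the compacted record count is roughly the sum of ceil(points_in_series / ppc) over all series. Because every series contributes a possibly partial last chunk, small ppc values yield noticeably more documents than total_points / ppc alone would suggest. A minimal sketch with hypothetical series sizes:

```shell
# Sketch: estimate the record count after compaction with ppc=500.
# The three series sizes below are hypothetical, not from the data set above.
ppc=500
total=0
for n in 1000 250 4750; do
  # Integer form of ceil(n / ppc): round up so the partial last chunk counts.
  total=$(( total + (n + ppc - 1) / ppc ))
done
echo "$total"   # 2 + 1 + 10 = 13 documents
```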