核心内容摘要
AI写论文必知!4款AI论文写作工具,让你告别论文写作的烦恼!
在翻阅DuckDB的文档时看到bitstring_agg这么个函数, 还提到能代替count(DISTINCT …)获得更高的性能。
但文档没有给出输出的例子。
bitstring_agg(arg)Description The bitstring_agg function takes any integer type as input and returns a bitstring with bits set for each distinct value. The left-most bit represents the smallest value in the column and the right-most bit the maximum value. If possible, the min and max are retrieved from the column statistics. Otherwise, it is also possible to provide the min and max values.Example bitstring_agg(A)TipThe combination of bit_count and bitstring_agg can be used as an alternative to count(DISTINCT …), with possible performance improvements in cases of low cardinality and dense values.bitstring_agg(arg, min, max)Description Returns a bitstring with bits set for each distinct position defined in arg. All positions must be within the range [min, max] or an Out of Range Error will be thrown.Example bitstring_agg(A, 1,
先来看bitstring_agg的输出memory Dselectbitstring_agg(A,1,
from(select11aunionallselect13unionallselect
;┌─────────────────────────┐ │ bitstring_agg(A,1,
│ │bit│ ├─────────────────────────┤ │0000000000101000000000│ └─────────────────────────┘ memory Dselectbitstring(bitstring_agg(A,1,
,
from(select11aunionallselect13unionallselect
;┌────────────────────────────────────────┐ │ bitstring(bitstring_agg(A,1,
,
│ │bit│ ├────────────────────────────────────────┤ │000000000000000000101000000000│ └────────────────────────────────────────┘它返回从左到右第11位和第13位为1其他位为0的二进制字符串。
如果用bitstring(长度)扩充字符串的长度则在左侧补零。
下面用随机100万个整数来测试bitstring_agg函数配合bit_count求不同值的计数并与count(DISTINCT …)比较用时。
memory Dcreatetabletas(select(i*random())::intifromrange(1,