Releases: KxSystems/pykx

3.0.0

12 Nov 11:34
bf05b31

Full up-to-date release notes are available here.

🎉 Major Features/Changes 🎉

  • Addition of functionality to allow for development of end-to-end streaming workflows consisting of data-ingestion, persistence and query. This functionality is outlined in-depth here.
  • Update to the PyKX Query API to support a significantly more Python-first approach to querying kdb+ in-memory and on-disk databases.
>>> table = kx.Table(data={
...     'sym': kx.random.random(100, ['AAPL', 'GOOG', 'MSFT']),
...     'date': kx.random.random(100, kx.q('2022.01.01') + [0,1,2]),
...     'price': kx.random.random(100, 1000.0),
...     'size': kx.random.random(100, 100) 
... })
>>> table.select(columns=kx.Column('price').max(), where=kx.Column('size') > 5)
>>> table.update(column=kx.Column('price').wavg(kx.Column('size')).rename('vwap'), by=kx.Column('sym'))
>>> table.delete(column=kx.Column('sym'))
>>> table.update(column=(kx.Column('price') * kx.Column('size')).rename('total'))

Beta features available in the 2.* versions of PyKX have now been migrated to full support. The full list of these features is as follows:

  • Database Creation and Management
  • Compression and Encryption Module
  • Remote Function Execution
  • Streamlit Integration
  • Multi-threaded use of PyKX

❓ What else? ❓

  • Extension to our integration with Jupyter Notebooks by adding a q-first mode of operation, which allows users working between the two languages to more easily automate workflows depending on both.
  • Addition of a new utility function kx.util.detect_bad_columns, which validates whether the columns of a table conform to the naming conventions supported by kdb+ and whether the table contains duplicate column names. It raises a warning describing any potential issues and returns True if the table contains invalid columns.
  • When generating IPC connections with reconnection_attempts, users can now configure the initial delay between the first and second attempts, and the function which updates the delay on successive attempts, using the reconnection_delay and reconnection_function keywords. See here for a worked example.
  • Two new options added on first initialisation of PyKX to allow users to:
    • Use the path to their already downloaded kc.lic/k4.lic licenses without going through the “Do you want to install a license” workflow
    • Persist for future use that they wish to use the IPC-only unlicensed mode of PyKX. This writes a file ~/.pykx-config containing configuration denoting that unlicensed mode is to be used.
  • Addition of function kx.util.install_q to allow users who do not have access to a q executable at the time of installing PyKX to install q. See here for instructions regarding its use.
  • Addition of function kx.util.start_q_subprocess to allow a q process to be started on a specified port with supplied initialisation arguments.
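
The reconnection keywords described above combine an initial delay with a function that updates the delay between attempts. A minimal plain-Python sketch of that idea follows. It does not call PyKX, the `delay_schedule` helper is hypothetical, and the doubling function and 0.5 second starting delay are assumptions chosen to match the "0.5 seconds" then "1.0 seconds" messages shown in the 2.4.0 notes further down.

```python
def delay_schedule(attempts, initial_delay, update):
    """Return the wait in seconds before each retry that follows the first attempt."""
    delays = []
    delay = initial_delay
    for _ in range(attempts - 1):
        delays.append(delay)
        delay = update(delay)  # compute the wait before the next retry
    return delays

def double(delay):
    # Assumed update rule: exponential backoff with factor 2.
    return delay * 2

schedule = delay_schedule(attempts=4, initial_delay=0.5, update=double)
print(schedule)
```

Swapping `double` for a function that returns its input unchanged would give a fixed wait between attempts instead.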

🔧 Fixes & Improvements 🔧

As with any release, PyKX 3.0 provides a significant number of bug fixes and improvements. The following are a subset:

  • Addition of support for help when interacting with q keywords and operators via PyKX
  • Previously, loading pykx.q during q startup using QINIT or QHOME/q.q resulted in a segfault or corruption.
  • The function kx.util.debug_environment now returns the applied configuration values at startup instead of customised values
  • Previously, operations on kx.GroupbyTable objects which had been indexed would raise an error indicating invalid key access
  • Previously, attempts to load a database using the kx.DB module would raise an nyi error if the path to the database contained a space

2.5.2

05 Jul 12:34
8b9cfb6

PyKX 2.5.2 has been released 🎉 Full release notes for client consumption can be found here.

Highlights:

Converting PyKX generic lists using the keyword parameter raw=True would previously return incorrect results: the values received were the memory addresses of the individual elements of the list. This has now been resolved:

>>> a = kx.q('(1; 3.4f; `asad; "asd")')
>>> a.np(raw=True)
array([1, 3.4, b'asad', b'asd'], dtype=object)
  • Fix to issue where use of kx.SymbolAtom with the getitem method on kx.Table objects would return a table rather than a vector/list. The return now matches that for str type inputs.
>>> import pykx as kx
>>> tab = kx.Table(data={'x': [1, 2, 3], 'y': ['a', 'b', 'c']})
>>> tab['x']
pykx.LongVector(pykx.q('1 2 3'))
>>> tab[kx.SymbolAtom('x')]
pykx.LongVector(pykx.q('1 2 3'))

Fix to issue where loading PyKX on Windows from 2.5.0 could result in a user's working directory being changed to site-packages/pykx

The full list including more fixes and improvements is available here.

2.5.1

11 Jun 15:21
57a42b4

PyKX 2.5.1 has been released 🎉 Full release notes for consumption can be found here.

Highlights:

  • Pandas API additions: isnull, isna, notnull, notna, idxmax, idxmin, kurt, sem.
  • Addition of filter_type, filter_columns, and custom parameters to QReader.csv() to add options for CSV type guessing.
>>> import pykx as kx
>>> reader = kx.QReader(kx.q)
>>> kx.q.read.csv("myFile0.csv", filter_type = "like", filter_columns="*name", custom={"SYMMAXGR":15})
pykx.Table(pykx.q('
firstname  lastname   
----------------------
"Frieda"   "Bollay"   
"Katuscha" "Paton"    
"Devina"   "Reinke"   
"Maurene"  "Bow"      
"Iseabal"  "Bashemeth"
..
'))

Other items of note:

  • Fix to regression in PyKX 2.5.0 where PyKX initialisation on Windows would result in a segmentation fault when using a k4.lic license type.
  • Previously users could not make direct use of kx.SymbolicFunction type objects against a remote process; this has been rectified.
  • Previously use of the context interface for q primitive functions in licensed mode via IPC would partially run the function on the client rather than server, thus limiting usage for named entities on the server.
  • With the release of PyKX 2.5.0 and support of PyKX usage in paths containing spaces the context interface functionality could fail to load a requested context over IPC if PyKX was not loaded on the server.

The full list including more fixes and improvements is available here.

2.5.0

24 May 16:13
de7966c

PyKX 2.5.0 has been released. Full release notes can be found here.

Highlights:

  • table.xbar, table.window_join, table.replace
  • Added as_arrow keyword to the .pd() method to use PyArrow backed data types rather than NumPy.

Other items of note:

  • PyKX can now be installed to locations with spaces in the file path.
  • Updated libq 4.0 to 2024.05.07 and 4.1 to 2024.04.29 for all supported OSes.
  • IPC queries can now pass PyKX Function-like objects as the first query parameter.
  • k4.lic licenses can now be installed using the interactive license helper.
  • To ease license updates, if PyKX fails to start due to a license error it will attempt to replace its license from KDB_LICENSE_B64 or KDB_K4LICENSE_B64 if you have one set.

The full list including more fixes and improvements is available here.

Examples:

table.window_join

>>> trades = kx.Table(data={
...     'sym': ['ibm', 'ibm', 'ibm'],
...     'time': kx.q('10:01:01 10:01:04 10:01:08'),
...     'price': [100, 101, 105]})
>>> quotes = kx.Table(data={
...     'sym': 'ibm',
...     'time': kx.q('10:01:01+til 9'),
...     'ask': [101, 103, 103, 104, 104, 107, 108, 107, 108],
...     'bid': [98, 99, 102, 103, 103, 104, 106, 106, 107]})
>>> windows = kx.q('{-2 1+\:x}', trades['time'])
>>> trades.window_join(quotes,
...                    windows,
...                    ['sym', 'time'],
...                    {'ask_minus_bid': [lambda x, y: x - y, 'ask', 'bid'],
...                     'ask_max': [lambda x: max(x), 'ask']})
pykx.Table(pykx.q('
sym time     price ask_minus_bid ask_max
----------------------------------------
ibm 10:01:01 100   3 4           103
ibm 10:01:04 101   4 1 1 1       104
ibm 10:01:08 105   3 2 1 1       108
'))
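
To make the windowing in the example above easier to follow, here is a plain-Python sketch that rebuilds the same aggregates by hand. It is only an illustration, not how PyKX or q implement window joins: times are written as seconds after 10:01:00, one bid is used per quote time, and only quotes falling inside each closed window are considered (the sketch does not try to reproduce how q treats a quote prevailing at the start of a window).

```python
# Times are seconds after 10:01:00, so 10:01:01 becomes 1.
trade_times = [1, 4, 8]
quote_times = list(range(1, 10))
ask = [101, 103, 103, 104, 104, 107, 108, 107, 108]
bid = [98, 99, 102, 103, 103, 104, 106, 106, 107]

# Counterpart of the q expression -2 1+\:x: a start and an end for each trade.
windows = [(t - 2, t + 1) for t in trade_times]

joined = []
for start, end in windows:
    # Positions of the quotes whose time lies inside the closed window.
    inside = [i for i, t in enumerate(quote_times) if start <= t <= end]
    joined.append({
        "ask_minus_bid": [ask[i] - bid[i] for i in inside],
        "ask_max": max(ask[i] for i in inside),
    })

for row in joined:
    print(row)
```

Each dictionary corresponds to one trade row, with the aggregation functions applied to the quotes in that trade's window.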

table.xbar

>>> kx.random.seed(42)
>>> N = 5
>>> tab = kx.Table(data = {
...     'x': kx.random.random(N, 100.0),
...     'y': kx.random.random(N, 10.0)})
>>> tab
pykx.Table(pykx.q('
x        y
-----------------
77.42128 8.200469
70.49724 9.857311
52.12126 4.629496
99.96985 8.518719
1.196618 9.572477
'))
>>> tab.xbar('x', 10)
pykx.Table(pykx.q('
x  y
-----------
70 8.200469
70 9.857311
50 4.629496
90 8.518719
0  9.572477
'))
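
The bucketing shown above rounds each value down to the nearest multiple of the bar width. A small plain-Python sketch of that arithmetic, using the x values from the example (the `xbar` helper here is a stand-in written for illustration, not the PyKX method):

```python
import math

def xbar(width, values):
    """Round each value down to the nearest multiple of width."""
    return [width * math.floor(v / width) for v in values]

x = [77.42128, 70.49724, 52.12126, 99.96985, 1.196618]
buckets = xbar(10, x)
print(buckets)  # [70, 70, 50, 90, 0]
```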

table.replace

>>> tab = kx.q('([] a:2 2 3; b:4 2 6; c:(1b;0b;1b); d:(`a;`b;`c); e:(1;2;`a))')
>>> tab
pykx.Table(pykx.q('
a b c d e
----------
2 4 1 a 1
2 2 0 b 2
3 6 1 c `a
'))
>>> tab.replace(2, "test")
pykx.Table(pykx.q('
a     b     c d e
---------------------
`test 4     1 a 1
`test `test 0 b `test
3     6     1 c `a
'))
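
The replacement above applies to every column, including the mixed-type column e, while leaving non-matching values untouched. A plain-Python sketch of the same element-wise substitution over a dictionary of columns (the `replace` helper is hypothetical and only mirrors the behaviour visible in the example):

```python
def replace(columns, to_replace, value):
    """Return a copy of columns with every item equal to to_replace swapped for value."""
    return {
        name: [value if item == to_replace else item for item in data]
        for name, data in columns.items()
    }

tab = {
    "a": [2, 2, 3],
    "b": [4, 2, 6],
    "c": [True, False, True],
    "d": ["a", "b", "c"],
    "e": [1, 2, "a"],
}
replaced = replace(tab, 2, "test")
for name, data in replaced.items():
    print(name, data)
```

One caveat of this sketch: Python treats True as equal to 1, so replacing 1 would also alter the boolean column, which is not something the example above demonstrates for PyKX.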

2.4.0

21 Mar 09:00
bf21a0b

Full details on the release can be found here.

Additions:

  • Support for q/kdb+ 4.1 (documentation here) added as an opt-in capability. This functionality is enabled by setting the PYKX_4_1_ENABLED environment variable.
>>> import os
>>> os.environ['PYKX_4_1_ENABLED'] = 'True'
>>> import pykx as kx
>>> kx.q.z.K
pykx.FloatAtom(pykx.q('4.1'))
  • Added support for Python 3.12.
    • Support for PyArrow in this Python version is currently in Beta.
  • Added conversion of NumPy arrays of type datetime64[s], datetime64[ms], datetime64[us] to kx.TimestampVector
  • Added Table.sort_values(), Table.nsmallest() and Table.nlargest() to the Pandas like API for sorting tables.
  • Table.rename() now supports non-numerical index columns and improved the quality of errors thrown.
  • Added the reconnection_attempts keyword argument to the SyncQConnection, SecureQConnection, and AsyncQConnection IPC classes. This argument allows an IPC connection to be automatically re-established when it is lost and the server has reinitialized.
>>> import pykx as kx
>>> conn = kx.SyncQConnection(port = 5050, reconnection_attempts=4)
>>> conn('1+1')    # Following this call the server on port 5050 was closed for 2 seconds
pykx.LongAtom(pykx.q('2'))
>>> conn('1+2')
WARNING: Connection lost attempting to reconnect.
Failed to reconnect, trying again in 0.5 seconds.
Failed to reconnect, trying again in 1.0 seconds.
Connection successfully reestablished.
pykx.LongAtom(pykx.q('3'))
  • Added --reconnection_attempts option to Jupyter %%q magic making use of the above IPC logic changes.
  • Addition of environment variable/configuration value PYKX_QDEBUG which allows debugging backtrace to be displayed for all calls into q instead of requiring a user to specify debugging is enabled per-call. This additionally works for remote IPC calls and utilisation of Jupyter magic commands.
>>> import os
>>> os.environ['PYKX_QDEBUG'] = 'True'
>>> import pykx as kx
>>> kx.q('{x+1}', 'e')
backtrace:
  [2]  {x+1}
         ^
  [1]  (.Q.trp)

  [0]  {[pykxquery] .Q.trp[value; pykxquery; {if[y~();:(::)];2@"backtrace:
                    ^
",.Q.sbt y;'x}]}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pykx/embedded_q.py", line 230, in __call__
    return factory(result, False)
  File "pykx/_wrappers.pyx", line 493, in pykx._wrappers._factory
  File "pykx/_wrappers.pyx", line 486, in pykx._wrappers.factory
pykx.exceptions.QError: type

Fixes and Improvements:

  • Resolved segfaults on Windows when PyKX calls Python functions under q.
>>> import pykx as kx
>>> kx.q('{[f;x] f  x}', sum, kx.q('4 4#til 16'))
pykx.LongVector(pykx.q('24 28 32 36'))
  • Updated kdb Insights Core libraries to 4.0.8, see here for more information.
  • Updated libq 4.0 version to 2024.03.04 for all supported OSes.
  • Fix issue where use of valid C backed q code APIs could result in segmentation faults when called.
>>> import pykx as kx
>>> isf = kx.q('.pykx.util.isf')
>>> isf
pykx.Foreign(pykx.q('code'))
>>> isf(True)
pykx.BooleanAtom(pykx.q('0b'))
  • Each call to the PyKX query API interned 3 new unique symbols. This has now been removed.

Beta Features

  • Addition of Compress and Encrypt classes to allow users to set global configuration and for usage within Database partition persistence.

Standalone

>>> import pykx as kx
>>> compress = kx.Compress(algo=kx.CompressionAlgorithm.gzip, level=8)
>>> kx.q.z.zd
pykx.Identity(pykx.q('::'))
>>> compress.global_init()
pykx.LongVector(pykx.q('17 2 8'))
>>> encrypt = kx.Encrypt(path='/path/to/the.key', password='PassWord')
>>> encrypt.load_key()

Database

>>> import pykx as kx
>>> compress = kx.Compress(algo=kx.CompressionAlgorithm.lz4hc, level=10)
>>> db = kx.DB(path='/tmp/db')
>>> db.create(kx.q('([]10?1f;10?1f)'), 'tab', kx.q('2020.03m'), compress=compress)
>>> kx.q('-21!`:/tmp/db/2020.03/tab/x')
pykx.Dictionary(pykx.q('
compressedLength  | 140
uncompressedLength| 96
algorithm         | 4i
logicalBlockSize  | 17i
zipLevel          | 10i
'))

2.3.2

12 Feb 11:40
6de3d97
Merge pull request #21 from KxSystems/pykx-2.3.2

PyKX 2.3.2 release update

2.3.1

09 Feb 10:18
b4b2bdf

Release 2.3.1

2.2.0

15 Nov 13:40
Release 2.2