Attempting a conky replacement in Python (part 2)
In part 1 we saw that a simple replacement for conky for generating
a statusline for i3 can be achieved. But since it uses the subprocess
module to call external programs, it is pretty CPU intensive.
The question now is whether we can reduce that. For that we’re going to use
mmap to look at the mailbox, and call sysctlbyname(3)
using ctypes to get the remaining system information. Note that sysctl et al. and the names used are specific to FreeBSD.
Using mmap to scan the mailbox
In the “mbox” format, a new mail message starts with a blank line followed by a line starting with “From ”. The exception is the first message, which starts at the beginning of the file and has no preceding blank line. So we search for “\n\nFrom ” and simply add 1 to the number of instances found. Messages that have been read carry a “Status: R” header, so subtracting those from the total gives the number of unread messages.
Instead of reading the complete file into memory, we can use mmap to map the file into memory. This can be done as follows.
In [1]: import os
In [2]: import mmap
In [3]: def mail(previous):
   ...:     """
   ...:     Report unread mail.
   ...:
   ...:     Arguments:
   ...:         previous: a 2-tuple (unread, time) from the previous call, or None
   ...:
   ...:     Returns: A new 2-tuple (unread, time) and a string to display.
   ...:     """
   ...:     mboxname = '/home/rsmith/Mail/received'
   ...:     newtime = os.stat(mboxname).st_mtime
   ...:     if previous is None or newtime > previous[1]:
   ...:         with open(mboxname) as mbox:
   ...:             with mmap.mmap(mbox.fileno(), 0, prot=mmap.PROT_READ) as mm:
   ...:                 start, total = 0, 1
   ...:                 while True:
   ...:                     rv = mm.find(b'\n\nFrom ', start)
   ...:                     if rv == -1:
   ...:                         break
   ...:                     else:
   ...:                         total += 1
   ...:                         start = rv+7
   ...:                 start, read = 0, 0
   ...:                 while True:
   ...:                     rv = mm.find(b'\nStatus: R', start)
   ...:                     if rv == -1:
   ...:                         break
   ...:                     else:
   ...:                         read += 1
   ...:                         start = rv+10
   ...:         unread = total - read
   ...:         newdata = (unread, newtime)
   ...:     else:
   ...:         unread = previous[0]
   ...:         newdata = previous
   ...:     return newdata, f'Mail: {unread}'
   ...:
In [4]: %timeit mail(None)
78.4 ms ± 42.3 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [5]: mail(None)
Out[5]: ((43, 1562166977.0), 'Mail: 43')
In [8]: p, _ = mail(None)
In [9]: %timeit mail(p)
14.5 µs ± 31.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In the most common case the mailbox has not changed, and the cached value from the previous call is returned. This is extremely fast. Doing the real scan still takes around 80 ms, but that is about 4× faster than just reading the file.
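The counting technique can be tried out on a small throwaway mailbox. The following is only a sketch: the helper names count_occurrences and count_unread and the sample messages are made up for illustration, and it uses access=mmap.ACCESS_READ instead of prot=mmap.PROT_READ so that it is not tied to Unix.

```python
import mmap
import os
import tempfile


def count_occurrences(mm, needle):
    """Count non-overlapping occurrences of needle in a mmap (or bytes) object."""
    count, start = 0, 0
    while True:
        pos = mm.find(needle, start)
        if pos == -1:
            return count
        count += 1
        start = pos + len(needle)


def count_unread(path):
    """Return (total, unread) for the mbox file at path."""
    if os.path.getsize(path) == 0:
        return 0, 0  # an empty file cannot be mapped
    with open(path, 'rb') as mbox:
        with mmap.mmap(mbox.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The first message has no preceding blank line, hence the +1.
            total = 1 + count_occurrences(mm, b'\n\nFrom ')
            read = count_occurrences(mm, b'\nStatus: R')
    return total, total - read


# Hypothetical two-message mailbox; only the first message has been read.
SAMPLE = (
    b'From alice@example.org Wed Jul  3 22:00:00 2019\n'
    b'Status: RO\n'
    b'Subject: first\n'
    b'\n'
    b'Body of the first message.\n'
    b'\n'
    b'From bob@example.org Wed Jul  3 22:05:00 2019\n'
    b'Subject: second\n'
    b'\n'
    b'Body of the second message.\n'
)

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(SAMPLE)
result = count_unread(tmp.name)
os.unlink(tmp.name)
print(result)
```

Note that a body line that happens to follow a blank line and start with “From ” would be counted as a message; mail software normally escapes such lines, but that is an assumption about the mailbox, not something this sketch checks.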
Ctypes to the rescue
Via the built-in ctypes module we can call functions in libraries. (There are alternatives like e.g. cffi, but for now I’ve used what was built-in.)
In FreeBSD, sysctlbyname(3) can be found in libc. It can be used as follows.
In [1]: import ctypes
In [2]: import ctypes.util
In [3]: libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
Out[3]: <CDLL 'libc.so.7', handle 80062c000 at 0x809cfde80>
In [4]: def to_degC(value):
   ...:     """Convert binary sysctl value to degree Centigrade."""
   ...:     return round(int.from_bytes(value, byteorder='little')/10 - 273.15, 1)
   ...:
In [5]: def sysctlbyname(name, buflen=4, convert=None):
   ...:     """
   ...:     Python wrapper for sysctlbyname(3) on FreeBSD.
   ...:
   ...:     Arguments:
   ...:         name (str): Name of the sysctl to query
   ...:         buflen (int): Length of the data buffer to use.
   ...:         convert: Optional function to convert the data.
   ...:
   ...:     Returns:
   ...:         The requested binary data, converted if desired.
   ...:     """
   ...:     name_in = ctypes.c_char_p(bytes(name, encoding='ascii'))
   ...:     oldlen = ctypes.c_size_t(buflen)
   ...:     oldlenp = ctypes.byref(oldlen)
   ...:     oldp = ctypes.create_string_buffer(buflen)
   ...:     rv = libc.sysctlbyname(name_in, oldp, oldlenp, None, ctypes.c_size_t(0))
   ...:     if rv != 0:
   ...:         errno = ctypes.get_errno()
   ...:         raise ValueError(f'sysctlbyname error: {errno}')
   ...:     if convert:
   ...:         return convert(oldp.raw[:buflen])
   ...:     return oldp.raw[:buflen]
   ...:
In [6]: sysctlbyname('dev.cpu.0.temperature', convert=to_degC)
Out[6]: 51.0
In [7]: %timeit sysctlbyname('dev.cpu.0.temperature', convert=to_degC)
25.8 µs ± 56.5 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
As can be seen, this is very fast. Where a subprocess call is around 80 ms, this is around 25 µs. That’s 3200x faster! The downside is that these calls return binary data.
From calling sysctl -t dev.cpu.0.temperature we know that this sysctl returns an integer. Knowing that FreeBSD on amd64 is a little-endian system, we can convert the value.
In the source code of the coretemp driver which produces this sysctl, we see a format code “IK” for this integer. According to sysctl(9) this means the value is in deciKelvins, which we can easily convert to centigrade. This is what to_degC() does.
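To make the conversion concrete, here is a small worked example that does not need a FreeBSD machine. It is a sketch only: the function name decikelvin_to_celsius and the raw value of 3241 are made up for illustration, and the struct module is used instead of int.from_bytes to show an equivalent way of decoding.

```python
import struct


def decikelvin_to_celsius(raw):
    """Decode a 4-byte little-endian integer in deciKelvin to degrees Celsius."""
    (decikelvin,) = struct.unpack('<i', raw)
    # deciKelvin -> Kelvin -> Celsius
    return decikelvin / 10 - 273.15


# Hypothetical raw buffer as the sysctl might return it: 3241 dK == 324.1 K
raw = (3241).to_bytes(4, byteorder='little')
celsius = decikelvin_to_celsius(raw)
print(f'{celsius:.2f}')
```

Unlike to_degC(), this version leaves the rounding to the caller.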
Memory
Below, the memory() function is re-implemented by using sysctlbyname(3).
In [5]: def to_int(value):
   ...:     """Convert binary sysctl value to integer."""
   ...:     return int.from_bytes(value, byteorder='little')
   ...:
In [6]: def memory():
   ...:     """
   ...:     Report on the RAM usage on FreeBSD.
   ...:
   ...:     Returns: a formatted string to display.
   ...:     """
   ...:     suffixes = ('page_count', 'free_count', 'inactive_count', 'cache_count')
   ...:     stats = {
   ...:         suffix: sysctlbyname(f'vm.stats.vm.v_{suffix}', convert=to_int)
   ...:         for suffix in suffixes
   ...:     }
   ...:     memmax = stats['page_count']
   ...:     mem = (memmax - stats['free_count'] - stats['inactive_count'] - stats['cache_count'])
   ...:     # Percentage of pages that are in use.
   ...:     used = int(100 * mem / memmax)
   ...:     return f'RAM: {used}%'
   ...:
In [7]: memory()
Out[7]: 'RAM: 18%'
In [8]: %timeit memory()
90.8 µs ± 182 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
This re-implementation yields a speed improvement of about 1000 times compared to calling sysctl(1) in a subprocess!
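The arithmetic inside memory() can be checked separately from the sysctl calls. The page counts below are invented for illustration (2097152 pages of 4 KiB would be 8 GiB), and ram_used_percent is a hypothetical helper name.

```python
def ram_used_percent(page_count, free_count, inactive_count, cache_count):
    """Percentage of pages that are neither free, inactive nor cached."""
    used = page_count - free_count - inactive_count - cache_count
    return int(100 * used / page_count)


# Hypothetical page counts, not measured values.
pct = ram_used_percent(
    page_count=2097152, free_count=1000000, inactive_count=600000, cache_count=0
)
print(f'RAM: {pct}%')
```

Since all counts are in pages, the page size cancels out and does not have to be queried.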
Network
Looking at the py-freebsd module, I found out that network interface statistics can be gathered by reading “opaque” sysctls under net.link.generic. We can use net.link.generic.system.ifcount to get the number of configured interfaces. Let’s look at this first.
In [21]: sysctlbyname('net.link.generic.system.ifcount', convert=to_int)
Out[21]: 3
That matches the information from ifconfig.
> ifconfig -l
age0 rl0 lo0
The interface data is available from nodes under net.link.generic.ifdata. Looking at the code, it seems that these nodes do not have human-readable names. So we have to implement and use a wrapper for plain sysctl, where the “name” is merely an array of integers.
def sysctl(name, buflen=4, convert=None):
    """
    Python wrapper for sysctl(3) on FreeBSD.

    Arguments:
        name: list or tuple of integers.
        buflen (int): Length of the data buffer to use.
        convert: Optional function to convert the data.

    Returns:
        The requested binary data, converted if desired.
    """
    cnt = len(name)
    mib = ctypes.c_int * cnt
    name_in = mib(*name)
    oldlen = ctypes.c_size_t(buflen)
    oldlenp = ctypes.byref(oldlen)
    oldp = ctypes.create_string_buffer(buflen)
    rv = libc.sysctl(
        ctypes.byref(name_in), ctypes.c_uint(cnt), oldp, oldlenp, None, ctypes.c_size_t(0)
    )
    if rv != 0:
        errno = ctypes.get_errno()
        raise ValueError(f'sysctl error: {errno}')
    if convert:
        return convert(oldp.raw[:buflen])
    return oldp.raw[:buflen]
To get the network device statistics, our array name should be:
CTL_NET, PF_LINK, NETLINK_GENERIC, IFMIB_IFDATA, X, IFDATA_GENERAL
Searching through the FreeBSD header files yields:
- CTL_NET = 4
- PF_LINK = 18
- NETLINK_GENERIC = 0
- IFMIB_IFDATA = 2
- X is the number of the interface in this case.
- IFDATA_GENERAL = 1
Note that using interface number 0 does not work.
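Instead of writing bare numbers, the name array could be built from named constants. This is just a sketch: the constant values are the ones listed above, while the helper name ifdata_mib is hypothetical and not part of any library.

```python
# Values as found in the FreeBSD header files (see the list above).
CTL_NET = 4
PF_LINK = 18
NETLINK_GENERIC = 0
IFMIB_IFDATA = 2
IFDATA_GENERAL = 1


def ifdata_mib(ifindex):
    """Build the sysctl name array for the general data of one interface."""
    if ifindex < 1:
        # Interface number 0 does not work; numbering starts at 1.
        raise ValueError('interface numbers start at 1')
    return (CTL_NET, PF_LINK, NETLINK_GENERIC, IFMIB_IFDATA, ifindex, IFDATA_GENERAL)


mib = ifdata_mib(1)
print(mib)
```

The resulting tuple can be passed as the name argument of the sysctl() wrapper.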
According to py-freebsd, this sysctl should return a struct ifmibdata. The definition of this structure is found in /usr/include/net/if_mib.h, while the struct if_data is found in /usr/include/net/if.h. The interface name string is 16 bytes long.
struct ifmibdata {
char ifmd_name[IFNAMSIZ]; /* name of interface */
int ifmd_pcount; /* number of promiscuous listeners */
int ifmd_flags; /* interface flags */
int ifmd_snd_len; /* instantaneous length of send queue */
int ifmd_snd_maxlen; /* maximum length of send queue */
int ifmd_snd_drops; /* number of drops in send queue */
int ifmd_filler[4]; /* for future expansion */
struct if_data ifmd_data; /* generic information and statistics */
};
struct if_data {
/* generic interface information */
uint8_t ifi_type; /* ethernet, tokenring, etc */
uint8_t ifi_physical; /* e.g., AUI, Thinnet, 10base-T, etc */
uint8_t ifi_addrlen; /* media address length */
uint8_t ifi_hdrlen; /* media header length */
uint8_t ifi_link_state; /* current link state */
uint8_t ifi_vhid; /* carp vhid */
uint16_t ifi_datalen; /* length of this data struct */
uint32_t ifi_mtu; /* maximum transmission unit */
uint32_t ifi_metric; /* routing metric (external only) */
uint64_t ifi_baudrate; /* linespeed */
/* volatile statistics */
uint64_t ifi_ipackets; /* packets received on interface */
uint64_t ifi_ierrors; /* input errors on interface */
uint64_t ifi_opackets; /* packets sent on interface */
uint64_t ifi_oerrors; /* output errors on interface */
uint64_t ifi_collisions; /* collisions on csma interfaces */
uint64_t ifi_ibytes; /* total number of octets received */
uint64_t ifi_obytes; /* total number of octets sent */
uint64_t ifi_imcasts; /* packets received via multicast */
uint64_t ifi_omcasts; /* packets sent via multicast */
uint64_t ifi_iqdrops; /* dropped on input */
uint64_t ifi_oqdrops; /* dropped on output */
uint64_t ifi_noproto; /* destined for unsupported protocol */
uint64_t ifi_hwassist; /* HW offload capabilities, see IFCAP */
/* Unions are here to make sizes MI. */
union { /* uptime at attach or stat reset */
time_t tt;
uint64_t ph;
} __ifi_epoch;
#define ifi_epoch __ifi_epoch.tt
union { /* time of last administrative change */
struct timeval tv;
struct {
uint64_t ph1;
uint64_t ph2;
} ph;
} __ifi_lastchange;
#define ifi_lastchange __ifi_lastchange.tv
};
I wrote a trivial C program to determine the size of these structures on my platform, and the offsets of the fields I want to know, ifi_ibytes and ifi_obytes:
> cc -o structinfo structinfo.c
> ./structinfo
sizeof(struct ifmibdata) == 208
sizeof(struct if_data) == 152
offsetof(ifmd_data) == 56
offsetof(ifi_ibytes) == 64
offsetof(ifi_obytes) == 72
In [1]: 56+64
Out[1]: 120
In [2]: 56+72
Out[2]: 128
In [3]: 128+8
Out[3]: 136
Let’s see if that works.
In [11]: def network(previous):
    ...:     """
    ...:     Report on bytes in/out for the network interfaces.
    ...:
    ...:     Arguments:
    ...:         previous: A dict of {interface: (inbytes, outbytes, time)} or None.
    ...:
    ...:     Returns:
    ...:         A new dict of {interface: (inbytes, outbytes, time)}, and a formatted
    ...:         string to display.
    ...:     """
    ...:     cnt = sysctlbyname('net.link.generic.system.ifcount', convert=to_int)
    ...:     newdata = {}
    ...:     items = []
    ...:     for n in range(1, cnt):
    ...:         tm = time.time()
    ...:         data = sysctl([4, 18, 0, 2, n, 1], buflen=208)
    ...:         name = data[:16].strip(b'\x00').decode('ascii')
    ...:         ibytes = to_int(data[120:128])
    ...:         obytes = to_int(data[128:136])
    ...:         if previous:
    ...:             dt = tm - previous[name][2]
    ...:             d_in = int((ibytes - previous[name][0])/dt)
    ...:             d_out = int((obytes - previous[name][1])/dt)
    ...:             items.append(f'{name}: {d_in}B/{d_out}B')
    ...:         else:
    ...:             items.append(f'{name}: 0B/0B')
    ...:         newdata[name] = (ibytes, obytes, tm)
    ...:     return newdata, ' '.join(items)
    ...:
In [12]: p, _ = network(None)
In [13]: %timeit network(p)
86.6 µs ± 202 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
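The offsets found with the C program can also be cross-checked from Python by mirroring the two structures with ctypes.Structure. This is a sketch under a few assumptions: the class names IfData and IfMibData are my own, fixed-width integer types stand in for int, the two unions are replaced by placeholders of the same size, and the platform aligns 64-bit integers on 8 bytes (as amd64 does). The buffer at the end is fabricated; on FreeBSD it would come from the sysctl() wrapper.

```python
import ctypes
import struct

IFNAMSIZ = 16  # length of the interface name field

_U64_FIELDS = (
    'ifi_baudrate', 'ifi_ipackets', 'ifi_ierrors', 'ifi_opackets',
    'ifi_oerrors', 'ifi_collisions', 'ifi_ibytes', 'ifi_obytes',
    'ifi_imcasts', 'ifi_omcasts', 'ifi_iqdrops', 'ifi_oqdrops',
    'ifi_noproto', 'ifi_hwassist',
)


class IfData(ctypes.Structure):
    """Mirror of struct if_data, field order as in the header excerpt."""
    _fields_ = (
        [(name, ctypes.c_uint8) for name in (
            'ifi_type', 'ifi_physical', 'ifi_addrlen',
            'ifi_hdrlen', 'ifi_link_state', 'ifi_vhid')]
        + [('ifi_datalen', ctypes.c_uint16),
           ('ifi_mtu', ctypes.c_uint32),
           ('ifi_metric', ctypes.c_uint32)]
        + [(name, ctypes.c_uint64) for name in _U64_FIELDS]
        # Placeholders with the same size as the two unions.
        + [('ifi_epoch', ctypes.c_uint64),
           ('ifi_lastchange', ctypes.c_uint64 * 2)]
    )


class IfMibData(ctypes.Structure):
    """Mirror of struct ifmibdata."""
    _fields_ = [
        ('ifmd_name', ctypes.c_char * IFNAMSIZ),
        ('ifmd_pcount', ctypes.c_int32),
        ('ifmd_flags', ctypes.c_int32),
        ('ifmd_snd_len', ctypes.c_int32),
        ('ifmd_snd_maxlen', ctypes.c_int32),
        ('ifmd_snd_drops', ctypes.c_int32),
        ('ifmd_filler', ctypes.c_int32 * 4),
        ('ifmd_data', IfData),
    ]


# Fabricated buffer standing in for the data returned by sysctl().
buf = bytearray(ctypes.sizeof(IfMibData))
buf[:4] = b'age0'
offset = IfMibData.ifmd_data.offset + IfData.ifi_ibytes.offset
struct.pack_into('=QQ', buf, offset, 1234, 5678)  # ibytes, obytes
rec = IfMibData.from_buffer_copy(buf)
print(rec.ifmd_name.decode('ascii'), rec.ifmd_data.ifi_ibytes, rec.ifmd_data.ifi_obytes)
```

With such a structure the fields can be accessed by name instead of by slicing the raw bytes at hand-computed offsets.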
CPU
The CPU function displays the workload from 0 to 100% and the average temperature of the cores. This is also re-implemented via sysctlbyname.
The method for calculating the CPU percentage was inspired by conky.
In [14]: import statistics as stat
In [15]: import struct
In [16]: def to_degC(value):
    ...:     """Convert binary sysctl value to degree Centigrade."""
    ...:     return round(int.from_bytes(value, byteorder='little')/10 - 273.15, 1)
    ...:
In [17]: def cpu(previous):
    ...:     """
    ...:     Report the CPU usage and temperature.
    ...:
    ...:     Argument:
    ...:         previous: A 2-tuple (used, total) from the previous run.
    ...:
    ...:     Returns:
    ...:         A 2-tuple (used, total) and a string to display.
    ...:     """
    ...:     temps = [
    ...:         sysctlbyname(f'dev.cpu.{n}.temperature', convert=to_degC)
    ...:         for n in range(4)
    ...:     ]
    ...:     T = round(stat.mean(temps))
    ...:     resbuf = sysctlbyname('kern.cp_time', buflen=40)
    ...:     states = struct.unpack('5L', resbuf)
    ...:     # According to /usr/include/sys/resource.h, these are:
    ...:     # USER, NICE, SYS, INT, IDLE
    ...:     total = sum(states)
    ...:     used = total - states[-1]
    ...:     if previous:
    ...:         prevused, prevtotal = previous
    ...:         frac = int((used - prevused) / (total - prevtotal) * 100)
    ...:     else:
    ...:         frac = 0
    ...:     return (used, total), f'CPU: {frac}%, {T}°C'
    ...:
In [18]: p, _ = cpu(None)
In [19]: p, cpustr = cpu(p)
In [20]: p
Out[20]: (5810156, 193222344)
In [21]: cpustr
Out[21]: 'CPU: 0%, 53°C'
In [22]: %timeit cpu(p)
248 µs ± 658 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Given that this function performs five sysctl calls (four temperature readings plus kern.cp_time), the time used is acceptable, I think.
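The percentage calculation boils down to comparing two samples of the kern.cp_time counters. The sketch below isolates that step. The function name cpu_percent and the sample counter values are invented, the counters are assumed to be 64-bit as on amd64, and integer division is used instead of the float arithmetic in cpu().

```python
import struct


def cpu_percent(prev_raw, cur_raw):
    """Busy percentage between two kern.cp_time samples (5 counters each)."""
    # Order of the counters: USER, NICE, SYS, INT, IDLE
    prev = struct.unpack('<5Q', prev_raw)
    cur = struct.unpack('<5Q', cur_raw)
    d_total = sum(cur) - sum(prev)
    if d_total <= 0:
        return 0  # no ticks elapsed between the samples
    d_idle = cur[-1] - prev[-1]
    return 100 * (d_total - d_idle) // d_total


# Hypothetical samples: 100 ticks elapsed, 60 of them idle.
before = struct.pack('<5Q', 100, 0, 50, 10, 840)
after = struct.pack('<5Q', 130, 0, 60, 10, 900)
pct = cpu_percent(before, after)
print(f'CPU: {pct}%')
```

The explicit check for an empty interval avoids the division by zero that would otherwise occur when two samples are taken in very quick succession.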
Putting it all together
The main function of the program looks like this.
def main():
    """
    Entry point for statusline-i3.py
    """
    netdata, _ = network(None)
    maildata, _ = mail(None)
    cpusage, _ = cpu(None)
    time.sleep(0.1)  # Lest we get divide by zero in cpu()
    while True:
        start = time.time()
        netdata, netstr = network(netdata)
        maildata, mailstr = mail(maildata)
        cpusage, cpustr = cpu(cpusage)
        print(' | '.join([netstr, mailstr, memory(), cpustr, date()]))
        sys.stdout.flush()  # Important to make the line show up!
        end = time.time()
        delta = end - start
        print(f"DEBUG: cycle time = {delta:.3f} s")
        if delta < 1:
            time.sleep(1 - delta)
Let’s see it run.
> python3 statusline-i3.py
age0: 0B/0B rl0: 0B/0B | Mail: 43 | RAM: 19% | CPU: 0%, 52°C | Wed 2019-07-03 22:53:11
DEBUG: cycle time = 0.001 s
age0: 0B/0B rl0: 0B/0B | Mail: 43 | RAM: 19% | CPU: 0%, 52°C | Wed 2019-07-03 22:53:12
DEBUG: cycle time = 0.001 s
age0: 0B/0B rl0: 0B/0B | Mail: 43 | RAM: 19% | CPU: 0%, 52°C | Wed 2019-07-03 22:53:13
DEBUG: cycle time = 0.001 s
age0: 0B/0B rl0: 0B/0B | Mail: 43 | RAM: 20% | CPU: 28%, 55°C | Wed 2019-07-03 22:53:14
DEBUG: cycle time = 0.001 s
age0: 1285B/772B rl0: 0B/0B | Mail: 43 | RAM: 22% | CPU: 49%, 58°C | Wed 2019-07-03 22:53:15
DEBUG: cycle time = 0.001 s
age0: 4647B/2161B rl0: 0B/0B | Mail: 43 | RAM: 24% | CPU: 55%, 58°C | Wed 2019-07-03 22:53:16
DEBUG: cycle time = 0.001 s
age0: 904B/1114B rl0: 0B/0B | Mail: 43 | RAM: 26% | CPU: 18%, 54°C | Wed 2019-07-03 22:53:17
DEBUG: cycle time = 0.001 s
age0: 4730B/1685B rl0: 0B/0B | Mail: 43 | RAM: 26% | CPU: 0%, 53°C | Wed 2019-07-03 22:53:18
DEBUG: cycle time = 0.001 s
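The pacing logic of main() (do the work, measure how long it took, sleep for the rest of the period) can be separated from the status functions. The following is a sketch with a hypothetical helper run_periodically; it uses time.monotonic, which is not affected by changes to the system clock, where main() uses time.time. The clock and sleep parameters only exist to make the function easy to test.

```python
import time


def run_periodically(task, period=1.0, cycles=3,
                     clock=time.monotonic, sleep=time.sleep):
    """Call task() once per period; return the measured cycle times."""
    durations = []
    for _ in range(cycles):
        start = clock()
        task()
        delta = clock() - start
        durations.append(delta)
        if delta < period:
            # Sleep only for the part of the period that is left.
            sleep(period - delta)
    return durations


# Short period so that the demonstration finishes quickly.
cycle_times = run_periodically(lambda: print('tick', flush=True), period=0.01)
print(len(cycle_times))
```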
Conclusion
The cycle time has dropped like a rock compared to the previous version, and CPU usage is now negligible. This rewrite was a total success! The amount of performance improvement was more than I expected, to be honest. As of 2019-07-03, I’m using this program instead of conky.
According to cloc, this program has 134 lines of code, 63 lines of comments and 37 blank lines. That’s not bad.
Writing this program also taught me the basics about using ctypes, which is a good tool to have in the toolbox.
After the above code was written, I made the mailbox location configurable as a command-line parameter. I also changed the internal API to make it easier to account for variable outputs on my machines; my laptop has a battery, for example. The script has been published in my GitHub scripts repo.
For comments, please send me an e-mail.