Writing Encoders¶
A Little Bit About Encoders¶
Each field in the data model has two main properties:
- The value of the field - for example, a number may have the value 1, 2 or -1532 and so on.
- The representation of the field - for example, a number with the value 123 may be represented as decimal (123), hexadecimal (7B) or a 32 bit little endian integer ('\x7b\x00\x00\x00') and so on.
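As a small illustration of the value/representation split, the sketch below renders the value 123 in the three representations mentioned above. It uses only the Python standard library, not Kitty's encoders, and the variable names are chosen just for this example.

```python
import struct

value = 123

# Same value, three different representations.
as_decimal = str(value)                # decimal text
as_hex = format(value, 'X')            # upper-case hexadecimal text
as_le32 = struct.pack('<I', value)     # 32-bit little-endian bytes

print(as_decimal, as_hex, as_le32)
```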
The value of the field is decided by the field itself, using the default value or one of the field’s mutations. The representation of the field is decided by its encoder.
Kitty has a few encoder classes, used to encode integer values, floats, strings and Bits objects.
You can see a list of available encoder classes in the Data Model Syntax documentation.
Kitty also provides many encoder objects that may be used directly in your model, such as ENC_INT_BE to encode an integer as big endian, or ENC_STR_BASE64 to encode a string in base64.
Since most encoders have no state, the same object may be used by many fields without issue,
thus the use of a global encoder object makes sense.
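To make those two encodings concrete, here is a rough sketch of what "big endian" and "base64" output look like. It uses only the Python standard library rather than Kitty's encoder objects, and the 32-bit integer width is an assumption made for the illustration.

```python
import base64
import struct

# Big-endian: most significant byte first (32-bit width assumed here).
int_be = struct.pack('>I', 123)

# Base64 encoding of a short byte string.
str_b64 = base64.b64encode(b'foo')

print(int_be, str_b64)
```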
However, Kitty doesn’t provide every possible encoding, so you might need to implement your own encoder from time to time.
Implementing Your Own Encoder¶
Example 1 - Aligned String¶
Let’s say that you have a String field that must be 4-byte aligned, and you decide that it makes no sense to have mutations of this field that are not aligned.
Since the alignment is only related to the representation of the string, you should probably just encode it as a 4 byte aligned string, meaning, implement an encoder that encodes the string with padding when needed.
There are several ways to implement such an encoder. Here are two of them.
Straightforward Implementation¶
We’ll start with a simple one, which might be just enough in some cases, and then make it more robust.
Since we encode a string, we need to inherit from StrEncoder and override its encode() method. This method receives a string as an argument and returns the encoded string as a Bits object.
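class AlignedStrEncoder(StrEncoder):
    def encode(self, value):
        # Number of padding bytes needed to reach the next multiple of 4
        # (0 if the string is already aligned).
        pad_len = (4 - len(value) % 4) % 4
        new_value = value + '\x00' * pad_len
        return Bits(bytes=new_value)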
Then, we can instantiate it:
ENC_STR_ALIGN = AlignedStrEncoder()
And use it multiple times:
Template(name='two aligned strings', fields=[
    String('foo', encoder=ENC_STR_ALIGN),
    String('bar', encoder=ENC_STR_ALIGN),
])
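The padding arithmetic used by the encoder can be checked in isolation. The sketch below is a standalone helper, not part of Kitty: `pad_to_4` is a hypothetical name, and it works on plain `bytes` instead of returning a Bits object.

```python
def pad_to_4(data: bytes) -> bytes:
    """Pad data with NUL bytes up to the next multiple of 4 bytes."""
    # The outer "% 4" keeps already-aligned input unchanged
    # instead of adding a full extra block of padding.
    pad_len = (4 - len(data) % 4) % 4
    return data + b'\x00' * pad_len


for sample in (b'', b'foo', b'abcd', b'hello'):
    padded = pad_to_4(sample)
    print(sample, '->', padded, len(padded))
```

Each printed length is a multiple of 4, and input that is already aligned is returned unchanged.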
Using StrFuncEncoder¶
You can also implement it by passing the encoding function to the constructor of StrFuncEncoder:
def align_string(value):
    pad_len = (4 - len(value) % 4) % 4
    new_value = value + '\x00' * pad_len
    return Bits(bytes=new_value)

ENC_STR_ALIGN = StrFuncEncoder(align_string)
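The idea behind a function-based encoder can be sketched in a few lines of plain Python. This is only an illustration of the delegation pattern and not Kitty's actual StrFuncEncoder implementation: `FuncEncoderSketch` and `align_bytes` are hypothetical names, and the example works on `bytes` rather than Bits objects.

```python
class FuncEncoderSketch:
    """Hypothetical stand-in showing how an encoder can wrap a function."""

    def __init__(self, func):
        self._func = func  # callable that performs the actual encoding

    def encode(self, value):
        # Delegate the work to the function given at construction time.
        return self._func(value)


def align_bytes(value: bytes) -> bytes:
    pad_len = (4 - len(value) % 4) % 4
    return value + b'\x00' * pad_len


encoder = FuncEncoderSketch(align_bytes)
encoded = encoder.encode(b'kitty')
print(encoded)
```

The trade-off is the usual one: passing a function is shorter for a one-off encoding, while a subclass is easier to extend with parameters or state.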
Generic Implementation¶
We might want to create a generic aligned string encoder class
and pass the alignment size as a parameter.
In this case, we need to override the __init__ method:
class AlignedStrEncoder(StrEncoder):
    def __init__(self, pad_size, pad_char='\x00'):
        self._pad_size = pad_size
        self._pad_char = pad_char

    def encode(self, value):
        pad_len = (self._pad_size - len(value) % self._pad_size) % self._pad_size
        new_value = value + self._pad_char * pad_len
        return Bits(bytes=new_value)
Then, we can instantiate it with different values:
ENC_STR_ALIGN4 = AlignedStrEncoder(4)
ENC_STR_ALIGN8 = AlignedStrEncoder(8)
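To see how the alignment size and padding character change the result, here is a standalone sketch of the same arithmetic. The `pad_to` helper is hypothetical and independent of Kitty; the check on `pad_size` is an addition made for this example and is not something the class above does.

```python
def pad_to(data: bytes, pad_size: int, pad_byte: bytes = b'\x00') -> bytes:
    """Pad data with pad_byte up to the next multiple of pad_size bytes."""
    if pad_size <= 0:
        # Assumption for this sketch: reject sizes that would make
        # the modulo arithmetic meaningless.
        raise ValueError('pad_size must be positive')
    pad_len = (pad_size - len(data) % pad_size) % pad_size
    return data + pad_byte * pad_len


print(pad_to(b'hello', 4))          # padded to 8 bytes
print(pad_to(b'hello', 8, b' '))    # padded to 8 bytes with spaces
print(pad_to(b'12345678', 8))       # already aligned, unchanged
```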
Example 2 - Reversed Bits¶
The encode() method returns a Bits object (from the bitstring package).
The main difference between the encoder types is the type of value that their encode() method accepts. In the first example, it accepted a string; in the case of BitsEncoder, it accepts a Bits object.
The encoder below encodes the bits in reversed order, e.g. if it receives the bits 10101100 it will return 00110101.
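The reversal itself can be illustrated without Kitty or bitstring. The sketch below uses hypothetical helper names and plain Python strings and integers to reproduce the 10101100 to 00110101 example.

```python
def reverse_bit_string(bits: str) -> str:
    """Reverse a string of '0'/'1' characters."""
    return bits[::-1]


def reverse_byte(value: int) -> int:
    """Reverse the bit order of a single 8-bit value."""
    # Format as exactly 8 binary digits, reverse them, parse back.
    return int(format(value, '08b')[::-1], 2)


reversed_text = reverse_bit_string('10101100')
reversed_value = reverse_byte(0b10101100)
print(reversed_text, format(reversed_value, '08b'))
```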
There are two main ways to implement such an encoder.
Using BitsFuncEncoder¶
As with StrFuncEncoder in the previous example, BitsFuncEncoder allows you to just pass an encoding function to the constructor, so you don’t need to create a new class and implement its encode() method. This comes in handy from time to time.
def reverse_bits(value):
    return value[::-1]

ENC_BITS_REVERSED = BitsFuncEncoder(reverse_bits)
Subclassing BitsEncoder¶
Alternatively, you may subclass BitsEncoder directly.
class ReversedBitsEncoder(BitsEncoder):
    # The method must be named encode() and take self, like the
    # StrEncoder examples above.
    def encode(self, value):
        return value[::-1]
And instantiate it:
ENC_BITS_REVERSED = ReversedBitsEncoder()