Advanced Topics

This section gives a more detailed explanation of the features of Kim. If you're looking for a quick overview or if this is your first time using Kim, please check out the quickstart guide.

Mappers

Polymorphic Mappers

It's not uncommon to have collections of objects that are not all the same. Perhaps you have an Activity type with two subtypes, Task and Event. Their serialization requirements differ slightly, meaning you'd typically serialize two lists and manually merge them together.

Kim provides support for Polymorphic Mappers to solve this problem.

Polymorphic Mappers are defined like a normal mapper with a few small differences. First, we define our base "type". This is the Mapper that all of our polymorphic types extend from. Our base type should inherit from kim.mapper.PolymorphicMapper instead of kim.mapper.Mapper.

from kim import PolymorphicMapper, field

class ActivityMapper(PolymorphicMapper):

    __type__ = Activity

    id = field.String()
    name = field.String()
    object_type = field.String(choices=['event', 'task'])
    created_at = field.DateTime(read_only=True)

    __mapper_args__ = {
        'polymorphic_on': object_type,
    }

For users of SQLAlchemy, this API will feel very familiar. We've specified our base mapper with the __mapper_args__ property. The polymorphic_on key is given a reference to the field used to identify our polymorphic types. This can also be passed as a string.

__mapper_args__ = {
    'polymorphic_on': 'object_type'
}

Now we need to define our types.

class TaskMapper(ActivityMapper):

    __type__ = Task

    status = field.String(read_only=True)
    is_complete = field.Boolean()

    __mapper_args__ = {
        'polymorphic_name': 'task'
    }


class EventMapper(ActivityMapper):

    __type__ = Event

    location = field.String(read_only=True)

    __mapper_args__ = {
        'polymorphic_name': 'event'
    }

Our types inherit from our base ActivityMapper and also specify the __mapper_args__ property. Each type provides the polymorphic_name key, which identifies the type to the base mapper.

Serializing Polymorphic Mappers

Serializing Polymorphic Mappers works in the same way as serializing a normal Mapper. When we want to serialize a collection of mixed types, we serialize using the base mapper.

>>> activities = Activity.query.all()
>>> ActivityMapper.many(obj=activities).serialize()
[
    {'name': 'My Test Event', 'id': 1, 'object_type': 'event', 'created_at': '2017-03-11T05:14:43+00:00', 'location': 'London'},
    {'name': 'My Test Task', 'id': 1, 'object_type': 'task', 'created_at': '2016-03-11T05:14:43+00:00', 'status': 'overdue', 'is_complete': False},
]

As you would expect, serializing using one of the child types directly will only serialize its own type.

>>> activities = Event.query.all()
>>> EventMapper.many(obj=activities).serialize()
[
    {'name': 'My Test Event', 'id': 1, 'object_type': 'event', 'created_at': '2017-03-11T05:14:43+00:00', 'location': 'London'},
]

Marshaling Polymorphic Mappers

Marshaling Polymorphic Mappers is also supported but is disabled by default. It is currently considered an experimental feature.

To enable marshaling for Polymorphic Mappers, pass 'allow_polymorphic_marshal': True in the __mapper_args__ property on the base Polymorphic Mapper.

class ActivityMapper(PolymorphicMapper):

    __type__ = Activity

    id = field.String()
    name = field.String()
    object_type = field.String(choices=['event', 'task'])
    created_at = field.DateTime(read_only=True)

    __mapper_args__ = {
        'polymorphic_on': object_type,
        'allow_polymorphic_marshal': True,
    }

We can now marshal a collection of mixed object types using the base ActivityMapper.

data = [
    {'name': 'My Test Event', 'object_type': 'event', 'created_at': '2017-03-11T05:14:43+00:00', 'location': 'London'},
    {'name': 'My Test Task', 'object_type': 'task', 'created_at': '2016-03-11T05:14:43+00:00', 'status': 'overdue', 'is_complete': False},
]
>>> ActivityMapper.many(data=data).marshal()
[Event(name='My Test Event'), Task(name='My Test Task')]

Exception Handling

Kim uses custom exceptions when marshaling to let you access all the errors that occurred while processing the fields in your mapper's marshaling pipelines.

Each pipe in a field's pipeline can raise a kim.exception.FieldInvalid. As the pipeline is processed, the errors for the field are stored against the mapper. Once all the fields have been processed, the mapper checks whether any errors occurred. If there are any, the mapper raises a kim.exception.MappingInvalid.

You should typically only need to handle kim.exception.MappingInvalid when marshaling.

from kim import MappingInvalid

try:
    data = mapper.marshal()
except MappingInvalid as e:
    print(e.errors)

The kim.exception.MappingInvalid exception raised will have an attribute called errors. This is a dictionary mapping field names to error messages. The errors object can also contain nested error objects when marshaling a kim.field.Nested field fails.
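To make the nested error shape concrete, here is a minimal sketch. The dictionary below is a hypothetical example of what e.errors might look like (the field names and messages are assumptions, not Kim output), along with a small helper that flattens nested errors into dotted paths:

```python
# Hypothetical shape of MappingInvalid.errors, including a nested
# Mapper's errors under the 'company' key (illustrative only)
errors = {
    'name': 'This is a required field',
    'company': {'id': 'This is a required field'},
}

def flatten_errors(errors, prefix=''):
    """Flatten nested error dicts into dotted field paths."""
    flat = {}
    for key, value in errors.items():
        path = f'{prefix}{key}'
        if isinstance(value, dict):
            # Recurse into nested error objects
            flat.update(flatten_errors(value, prefix=f'{path}.'))
        else:
            flat[path] = value
    return flat

flat = flatten_errors(errors)
```

A helper like this can be handy when rendering errors from deeply nested mappers in a flat API error response.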

Roles

As described in the quickstart, the Roles system lets you control which fields are available during marshaling and serialization.

Role Inheritance

Mappers inherit Roles from their parents automatically. Consider the following example.

class MapperA(Mapper):

    __type__ = dict

    field_a = field.String()
    field_b = field.String()

    __roles__ = {
        'ab': whitelist('field_a', 'field_b')
    }


class MapperB(MapperA):

    field_c = field.String()

    __roles__ = {
        'abc': blacklist()
    }

MapperB inherits from MapperA and therefore has access to all the roles defined on MapperA. Equally, MapperB can redefine the ab role to override the fields available for that role.

Combining Roles

Under the hood kim.role.Role is a set object. This allows us to combine roles in the ways that sets can be combined. This is useful when you have a role defined on a base type that you need to extend.

When combining whitelist and blacklist roles, the order is not important: the blacklist always takes priority. The following examples are equivalent.

>>> role = blacklist('name', 'id') | whitelist('name', 'email')
>>> assert 'email' in role
>>> assert 'name' not in role
>>> assert 'id' not in role
>>> assert role.whitelist

>>> role = whitelist('name', 'id') | blacklist('name', 'email')
>>> assert 'id' in role
>>> assert 'name' not in role
>>> assert 'email' not in role
>>> assert role.whitelist
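The combination rule the examples above demonstrate can be sketched in plain Python sets (this is not Kim's internals, just an illustration of the semantics: the blacklist wins, and the result behaves as a whitelist of the remaining fields):

```python
def combine(whitelist_fields, blacklist_fields):
    # The blacklist takes priority regardless of order, and the
    # combined role acts as a whitelist of the remaining fields
    return set(whitelist_fields) - set(blacklist_fields)

role = combine({'name', 'email'}, {'name', 'id'})
```

Because set difference is used, combining in either order yields the same result, matching the pair of examples above.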

Default Roles

Every mapper has a special hidden role called __default__. By default the __default__ role contains every field defined on your Mapper.

You can override the __default__ role by specifying it in the __roles__ property on your Mapper.

class MapperA(Mapper):

    __type__ = dict

    field_a = field.String()
    field_b = field.String()

    __roles__ = {
        '__default__': whitelist('field_a')
    }

Now whenever we call kim.mapper.Mapper.marshal or kim.mapper.Mapper.serialize on MapperA without a role, the default role will be used, which now only includes field_a.

Note

The __default__ role does not currently inherit from its parent and must be defined explicitly on all Mappers in the class hierarchy.

Fields

Name and Source

If you'd like the field in your JSON data to have a different name to the field on the object, pass the source attribute to Field.

from kim import Mapper, field

class CompanyMapper(Mapper):
    __type__ = Company
    title = field.String(source='name')

>>> company = Company(name='Wayne Enterprises')
>>> mapper = CompanyMapper(company)
>>> mapper.serialize()
{'title': 'Wayne Enterprises'}

Note

When marshaling, Kim will look for data in the field named in source.
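The effect of source during serialization can be sketched in plain Python (this is an illustration, not Kim's internals; the names here are hypothetical):

```python
class Company:
    def __init__(self, name):
        self.name = name

# JSON key -> object attribute (the field's source)
field_sources = {'title': 'name'}

def serialize(obj, field_sources):
    # Read each value from the attribute named by `source` and emit it
    # under the field's JSON key
    return {key: getattr(obj, attr) for key, attr in field_sources.items()}

result = serialize(Company('Wayne Enterprises'), field_sources)
```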

Similarly, if you'd like the JSON data to have a different name to the attribute name on the mapper class, pass the name attribute to Field. This is useful if you have multiple fields in different roles which should serialize to the same field.

from kim import Mapper, field, role

class CompanyMapper(Mapper):
    __type__ = Company
    short_title = field.String(name='title')
    long_title = field.String(name='title')

    __roles__ = {
        'simple': role.whitelist('short_title'),
        'full': role.whitelist('long_title')
    }


>>> company = Company(short_title='Wayne', long_title='Wayne Enterprises')
>>> mapper = CompanyMapper(company)
>>> mapper.serialize(role='simple')
{'title': 'Wayne'}
>>> mapper.serialize(role='full')
{'title': 'Wayne Enterprises'}

Nested __self__

Sometimes your object model may contain flat data but you'd like the JSON output to be nested. You can do this by setting source='__self__' on a Nested field.

from kim import Mapper, field, role

class AddressMapper(Mapper):
    __type__ = dict

    street = field.String()
    city = field.String()
    zip = field.String()

class CompanyMapper(Mapper):
    __type__ = Company

    name = field.String()
    address = field.Nested(AddressMapper, source='__self__')

>>> company = Company(
    name='Wayne Enterprises',
    street='4 Maple Road',
    city='Sunview',
    zip='90210')
>>> mapper = CompanyMapper(company)
>>> mapper.serialize()
{'name': 'Wayne Enterprises',
 'address': {'street': '4 Maple Road', 'city': 'Sunview', 'zip': '90210'}}

In this example, the address appears as a nested object in the JSON, but its fields are all sourced from the company object.
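The transform that source='__self__' performs during serialization can be sketched in plain Python (illustrative only, not Kim's internals):

```python
def nest(flat, nested_key, nested_fields):
    # Keep every key except the nested ones at the top level,
    # then group the nested keys under a single sub-dict
    out = {k: v for k, v in flat.items() if k not in nested_fields}
    out[nested_key] = {k: flat[k] for k in nested_fields if k in flat}
    return out

company = {'name': 'Wayne Enterprises', 'street': '4 Maple Road',
           'city': 'Sunview', 'zip': '90210'}
result = nest(company, 'address', ['street', 'city', 'zip'])
```

Marshaling with __self__ would run the inverse of this transform, flattening the nested keys back onto the parent object.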

Note

__self__ can also be used to marshal nested objects into flat structures.

Marshaling Nested Fields

Nested fields can be marshaled in a similar manner to serializing, but there are several security concerns you should take into account when using them. Kim's settings default to the most secure and must be overridden to use the full functionality.

Note

This section, and Kim's defaults, assume you are using nested fields to refer to foreign keys (or similar NoSQL relationships) on ORM objects. If you are not using Kim with an ORM, you probably want to enable the allow_create and allow_updates_in_place options for seamless operation.

In general, there are four things you may want to happen when marshaling a nested field. The following sections describe them, and the input data they expect.

For all examples, assume the Mapper looks like this:

from kim import Mapper

class UserMapper(Mapper):
    __type__ = MyUser

    id = field.Integer(read_only=True)
    name = field.String(required=True)
    company = field.Nested('CompanyMapper')  # Set options on this field

1. Retrieve by ID only (default)

{'id': 1,
 'name': 'Bob Jones',
 'company': {
    'id': 5,  # Will be used to look up Company
    # Any other data here will be ignored
 }}

This is the most secure option and the most common thing you will want to do. Only the ID of the target object is used: a getter function you define retrieves the object with this ID from your database (taking into account security concerns, such as ensuring the user has access to the object), and the object returned from the getter function is set on the target attribute.
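As a hedged sketch of what such a getter might do, the example below stands in a plain dict for the database and a SimpleNamespace for the session object (all names here are hypothetical, not part of Kim's API):

```python
from types import SimpleNamespace

# Hypothetical in-memory store standing in for a database query
COMPANIES = {5: {'id': 5, 'name': 'Acme Corp'}}

def company_getter(session):
    # Only the 'id' of the incoming nested data is consulted; a real
    # getter would also enforce access control before returning the object
    return COMPANIES.get(session.data.get('id'))

found = company_getter(SimpleNamespace(data={'id': 5, 'name': 'ignored'}))
missing = company_getter(SimpleNamespace(data={'id': 99}))
```

Returning None (as for the missing ID above) signals that the related object could not be found.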

2. allow_updates - Retrieve by ID, allowing updates

{'id': 1,
 'name': 'Bob Jones',
 'company': {
    'id': 5,  # Will be used to look up Company
    'name': 'New name',  # Will be set on the Company
 }}

This option retrieves the related object via its ID using a getter function, as in scenario 1. However, any other fields passed along with the ID will be updated on the related object, according to the role passed. You are strongly encouraged to use this option only with a restrictive role, in order to avoid introducing security holes where users can change fields on objects they should not be able to (for example, changing the user field on an object to change its ownership).

Use this option like this (role is not required):

company = field.Nested('CompanyMapper', allow_updates=True, role='restrictive_role')

3. allow_create - Retrieve by ID, or create object if no ID passed

# No ID passed - create new
{'id': 1,
 'name': 'Bob Jones',
 'company': {
    'name': 'My new company',  # Will be set on the new company
 }}
# ID passed - works as scenario 1
{'id': 1,
 'name': 'Bob Jones',
 'company': {
    'id': 5,  # Will be used to look up company
    # Any other data here will be ignored
 }}

This option uses your getter function to look up the related object by ID, but if it is not found (i.e. your getter function returns None) then a new instance of the object will be created, using the fields passed according to the role.

This option may be combined with allow_updates in order to provide a field which will accept an existing object, allow it to be updated and allow a new one to be created.

Once again, you should consider carefully the role you use with this option to avoid unexpected consequences (for example, it being possible to set the user field on an object to someone other than the logged-in user).

Use this option like this (role is not required):

company = field.Nested('CompanyMapper', allow_create=True, role='restrictive_role')

Collections

Collections are used to produce arrays of similar fields in the JSON output. They can be scalar fields or nested fields and work when serializing or marshaling.

To create a collection, wrap any field in Collection:

from kim import Mapper, field, role


class CompanyMapper(Mapper):
    __type__ = Company

    name = field.String()
    offices = field.Collection(field.String())

>>> mapper = CompanyMapper(company)
>>> mapper.serialize()
{'name': 'Wayne Enterprises',
 'offices': ['London', 'Berlin', 'New York']}

You can also wrap nested fields:

from kim import Mapper, field, role

class EmployeeMapper(Mapper):
    __type__ = Employee

    name = field.String()
    job = field.String()


class CompanyMapper(Mapper):
    __type__ = Company

    name = field.String()
    employees = field.Collection(field.Nested(EmployeeMapper))

>>> mapper = CompanyMapper(company)
>>> mapper.serialize()
{'name': 'Wayne Enterprises',
 'employees': [
    {'name': 'Jim', 'job': 'Developer'},
    {'name': 'Bob', 'job': 'Manager'},
]}

When marshaling, Nested fields can be forced to be unique on a key to avoid duplicates:

from kim import Mapper, field, role

class EmployeeMapper(Mapper):
    __type__ = Employee

    id = field.Integer()
    name = field.String()


class CompanyMapper(Mapper):
    __type__ = Company

    name = field.String()
    employees = field.Collection(
        field.Nested(EmployeeMapper), unique_on='id')

>>> data = {'employees': [{'id': 1, 'name': 'Jim'}, {'id': 1, 'name': 'Bob'}]}
>>> mapper = CompanyMapper(data=data)
>>> mapper.marshal()
MappingInvalid
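The uniqueness check implied by unique_on can be sketched in plain Python (illustrative only; check_unique_on is a hypothetical helper, not Kim's implementation, and Kim raises MappingInvalid rather than ValueError):

```python
def check_unique_on(items, key):
    # Raise when two items in a collection share a value for `key`
    seen = set()
    for item in items:
        value = item[key]
        if value in seen:
            raise ValueError(f'duplicate value for {key!r}: {value!r}')
        seen.add(value)
    return items

employees = [{'id': 1, 'name': 'Jim'}, {'id': 1, 'name': 'Bob'}]
try:
    check_unique_on(employees, 'id')
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
```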

Pipelines

Fields process their data through a series of pipes, called a pipeline. A pipe is passed some data, performs one operation on it and returns the new data. This is then passed to the next pipe in the chain. This concept is similar to Unix pipes.

There are separate pipelines for serializing and marshaling.

For example, here is the marshal pipeline for the String field. Pipes are grouped into four stages: input, validation, process and output.

input_pipes = [read_only, get_data_from_name]
validation_pipes = [is_valid_string, is_valid_choice, ]
process_pipes = []
output_pipes = [update_output_to_source]

# Order of execution is:
read_only ->                 # Stop execution if field is read only
get_data_from_name ->        # Get the data for this field from the JSON
is_valid_string ->           # Raise exception if data is not a string
is_valid_choice ->           # If choices=[] set on field, raise exception if not valid choice
update_output_to_source ->   # Update the object with this data
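The chaining idea can be sketched in plain Python (this is an illustration of the concept, not Kim's internals; Session and run_pipeline here are hypothetical stand-ins):

```python
class Session:
    # Minimal stand-in for the session object Kim passes to each pipe
    def __init__(self, data):
        self.data = data

def is_valid_string(session):
    # Validation stage: raise if the data is not a string
    if not isinstance(session.data, str):
        raise ValueError('invalid string')
    return session.data

def strip_whitespace(session):
    # Process stage: transform the data
    return session.data.strip()

def run_pipeline(session, pipes):
    # Each pipe reads session.data and returns the new data for the
    # next pipe, like a Unix pipe
    for pipe in pipes:
        session.data = pipe(session)
    return session.data

result = run_pipeline(Session('  hello  '), [is_valid_string, strip_whitespace])
```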

Custom Fields and Pipelines

To define a custom field, you need to create the Field class and its corresponding Pipeline. It's usually easiest to inherit from an existing Field/Pipeline rather than defining an entirely new one.

This example defines a new field with a custom pipeline to convert its output to uppercase:

from kim import pipe, String, Mapper
from kim.pipelines.string import StringSerializePipeline


@pipe()
def to_upper(session):
    if session.data is not None:
        session.data = session.data.upper()
    return session.data

class UpperCaseStringSerializePipeline(StringSerializePipeline):
    process_pipes = StringSerializePipeline.process_pipes + [to_upper]

class UpperCaseString(String):
    serialize_pipeline = UpperCaseStringSerializePipeline

class MyMapper(Mapper):
    __type__ = dict

    name = UpperCaseString()

Note

This is a contrived example; for simple transforms like this, see extra_marshal_pipes below.

Note that we have only overridden the process_pipes stage of StringSerializePipeline. Everything else remains the same. We have extended the process_pipes list from the parent object in order to retain its functionality, and just added our new pipe at the end.

Pipes should find and set their data on session.data. The session object also provides access to the field, the current output object, the parent field (if nested) and the mapper. See the API docs for details.

Custom Validation - extra_marshal_pipes

If you just want to change the pipeline used by a particular instance of a Field on a Mapper, for example to add custom validation logic, you don't need to define an entirely new field. Instead you can pass extra_marshal_pipes:

extra_marshal_pipes are pushed onto the end of the existing list of pipes defined on the field when the Field is instantiated.

from kim import Mapper, String, Integer, pipe


@pipe()
def check_age(session):
    if session.data is not None and session.data < 18:
        raise session.field.invalid('not_old_enough')

    return session.data


class MyMapper(Mapper):
    __type__ = dict

    name = String()
    age = Integer(
        extra_marshal_pipes={
            'validation': [check_age],
        },
        error_msgs={'not_old_enough': 'You must be over 18'}
    )

extra_marshal_pipes takes a dict of the format {stage: [pipe, pipe, pipe]}. Any pipes passed will be added at the end of their respective stage.
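The stage-merging behaviour described above can be sketched in plain Python (illustrative only; merge_pipes is a hypothetical helper, not Kim's implementation):

```python
def merge_pipes(base_stages, extra_pipes):
    # Copy the base stages, then append any extras at the end of
    # their respective stage
    merged = {stage: list(pipes) for stage, pipes in base_stages.items()}
    for stage, pipes in extra_pipes.items():
        merged.setdefault(stage, []).extend(pipes)
    return merged

base = {'validation': ['is_valid_integer'], 'process': []}
merged = merge_pipes(base, {'validation': ['check_age']})
```

Copying the base lists first keeps the field-level extras from mutating the pipeline shared by other Field instances.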