Utilities

ocdskit.util.grouper(iterable, n, fillvalue=None)
class ocdskit.util.SerializableGenerator(iterable)
class ocdskit.util.JSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
default(obj)

Implement this method in a subclass so that it returns a serializable object for its argument (obj in the signature, o in the example below), or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
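
As a usage sketch for the method above, the same default can be placed on a subclass of the standard library's json.JSONEncoder and passed to json.dumps through the cls argument. The class name IterableEncoder and the sample data are hypothetical; this is not ocdskit's own encoder.

```python
import json


class IterableEncoder(json.JSONEncoder):
    """Hypothetical subclass: serializes any iterable as a JSON array."""

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Not iterable: defer to the base class, which raises TypeError.
        return super().default(o)


# A generator is not serializable by default; the subclass turns it into a list.
encoded = json.dumps({"ids": (n * n for n in range(4))}, cls=IterableEncoder)
print(encoded)  # {"ids": [0, 1, 4, 9]}
```

Objects that are neither natively serializable nor iterable still raise TypeError, because the final line hands them to the base implementation.
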
ocdskit.util.iterencode(data, ensure_ascii=False, **kwargs)

Returns a generator that yields each string representation as available.
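
To illustrate what "yields each string representation as available" means, here is a minimal sketch using the standard library's json.JSONEncoder.iterencode, which yields the output in chunks. It does not call ocdskit, and the sample data is made up.

```python
import json

# Made-up sample data with a non-ASCII character.
data = {"ocid": "ocds-213czf-1", "title": "Café"}

# iterencode yields pieces of the JSON text instead of one complete string,
# so a consumer can write each piece out as soon as it is produced.
chunks = list(json.JSONEncoder(ensure_ascii=False).iterencode(data))

# Joining the pieces gives the same text as a one-shot dump.
text = "".join(chunks)
print(text)
```
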

ocdskit.util.json_dump(data, io, ensure_ascii=False, **kwargs)

Dumps JSON to a file-like object.

ocdskit.util.json_dumps(data, ensure_ascii=False, indent=None, sort_keys=False, **kwargs)

Dumps JSON to a string, and returns it.
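
Both dump helpers default to ensure_ascii=False, unlike the standard library, which defaults to True. The sketch below shows the difference using only the standard json module; it assumes the ocdskit helpers pass this argument through unchanged, and the sample data is made up.

```python
import io
import json

data = {"buyer": "Municipalité de Québec"}  # made-up sample data

# Standard-library default: non-ASCII characters are escaped.
escaped = json.dumps(data)

# With ensure_ascii=False, non-ASCII characters are written as-is.
readable = json.dumps(data, ensure_ascii=False)

# The same setting when writing to a file-like object.
buffer = io.StringIO()
json.dump(data, buffer, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same data; only the text representation differs.
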

ocdskit.util.get_ocds_minor_version(data)

Returns the OCDS minor version of the record package, release package, record or release.

ocdskit.util.is_package(data)

Returns whether the data is a record package or release package.

ocdskit.util.is_record_package(data)

Returns whether the data is a record package.

A record package has a required records field. Its other required fields are shared with release packages.

ocdskit.util.is_record(data)

Returns whether the data is a record.

A record has required releases and ocid fields.

ocdskit.util.is_release_package(data)

Returns whether the data is a release package.

A release package has a required releases field. Its other required fields are shared with record packages. To distinguish a release package from a record, we test for the absence of the ocid field.
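
The field tests described for record packages, records and release packages can be sketched as plain key checks. The looks_like_* names and the sample dictionaries are hypothetical; this follows the descriptions above and is not ocdskit's source.

```python
def looks_like_record_package(data):
    # A record package has a required "records" field.
    return "records" in data


def looks_like_record(data):
    # A record has required "releases" and "ocid" fields.
    return "releases" in data and "ocid" in data


def looks_like_release_package(data):
    # A release package has "releases" but, unlike a record, no "ocid".
    return "releases" in data and "ocid" not in data


# Made-up minimal samples, with only the fields the checks look at.
record_package = {"records": []}
record = {"ocid": "ocds-213czf-1", "releases": []}
release_package = {"releases": []}

print(looks_like_record(record), looks_like_release_package(record))
```

Because records and release packages both have a releases field, the ocid check is what keeps the two from being confused.
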

ocdskit.util.is_release(data)

Returns whether the data is a release (embedded or linked, individual or compiled).

ocdskit.util.is_compiled_release(data)

Returns whether the data is a compiled release (embedded or linked).

ocdskit.util.is_linked_release(data)

Returns whether the data is a linked release.

A linked release has required url and date fields and an optional tag field. An embedded release has required date and tag fields (among others), and it can have a url field as an additional field.

To distinguish a linked release from an embedded release, we test for the presence of the required url field and test whether the number of fields is no more than three (a linked release can carry url, date and tag).
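
A minimal sketch of this test, assuming the data is a dictionary. The function name and the max_fields parameter are hypothetical, and treating the limit as inclusive is an assumption made here so that a linked release carrying url, date and tag still passes; this is not ocdskit's source.

```python
def looks_like_linked_release(data, max_fields):
    # The required "url" field must be present, and the object must be small:
    # an embedded release has many more fields than a linked one.
    return "url" in data and len(data) <= max_fields


# Made-up samples.
linked = {
    "url": "https://example.com/releases/1.json",
    "date": "2020-01-01T00:00:00Z",
}
embedded = {
    "ocid": "ocds-213czf-1",
    "id": "1",
    "date": "2020-01-01T00:00:00Z",
    "tag": ["tender"],
    "initiationType": "tender",
    "url": "https://example.com/releases/1.json",  # allowed as an additional field
}

print(looks_like_linked_release(linked, max_fields=3))
print(looks_like_linked_release(embedded, max_fields=3))
```

The url check alone is not enough, since an embedded release may also carry a url field; the field count is what separates the two.
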

ocdskit.util.detect_format(path, root_path='', reader=open)

Returns the format of OCDS data, and whether the OCDS data is concatenated or in an array.

If the OCDS data is concatenated or in an array, assumes that all items have the same format as the first item.

Parameters:
  • path (str) – the path to a file
  • root_path (str) – the path to the OCDS data within the file
Returns:
  the format, whether data is concatenated, and whether data is in an array
Return type:
  tuple
Raises:
  UnknownFormatError – if the format cannot be detected
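
To make the "concatenated" and "in an array" flags concrete, here is a standard-library sketch that inspects a string of JSON text. The function sniff_layout is hypothetical: it works on text rather than a path, does not detect the OCDS format, ignores root_path, and is not how detect_format is implemented.

```python
import json


def sniff_layout(text):
    """Return (is_concatenated, is_array) for a string of JSON text."""
    text = text.lstrip()
    # raw_decode parses the first JSON value and reports where it ended.
    first, end = json.JSONDecoder().raw_decode(text)
    # The data is "in an array" if the top-level value is a list.
    is_array = isinstance(first, list)
    # The data is "concatenated" if more text follows the first value.
    is_concatenated = bool(text[end:].strip())
    return is_concatenated, is_array


single = '{"ocid": "a"}'
array = '[{"ocid": "a"}, {"ocid": "b"}]'
concatenated = '{"ocid": "a"}\n{"ocid": "b"}\n'

for sample in (single, array, concatenated):
    print(sniff_layout(sample))
```
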