JSONAlchemy¶
JSONAlchemy provides an abstraction layer on top of your database to work with JSON objects, helping the administrators to define the data model of their site independent of the master format they are working with and letting the developers work in a controlled and uniform data environment.
WIP, finish the empty sections:
- Module Structure: Explain what each folder is for and maybe the namespaces.
- Readers:
- How it works: explain how everything works together, how the magic happens. Maybe a small example (record centric?).
- How to Extend JSONAlchemy Behaviour
- Invenio Use Cases: pointer to the records, annotations and documents documentation (where the real ‘how to’ style documentation is placed for each of them).
Module Structure¶
Configuration¶
JSONAlchemy works with two different configuration files, one for the field definitions and the second one for the models.
Field Configuration Files¶
This is an example (it might not be 100% semantically correct) of the definition of the field ‘title’:
title:
    schema:
        {'title': {'type': 'dict', 'required': False}}
    creator:
        @legacy((("245", "245__","245__%"), ""),
                ("245__a", "title", "title"),
                ("245__b", "subtitle"),
                ("245__k", "form"))
        marc, '245..', {'title': value['a'], 'subtitle': value['b']}
        dc, 'dc:title', {'title': value}
        unimarc, '200[0,1]_', {'title': value['a'], 'subtitle': value['e']}
    producer:
        json_for_marc(), {'a': 'title', 'b': 'subtitle'}
        json_for_dc(), {'dc:title': ''}
        json_for_unimarc(), {'a': 'title', 'e': 'subtitle'}
A field definition is made up of several sections, each of them identified by its indentation (as in Python).
This example contains the three most common sections that a field can have: schema, creator and producer. There could be more sections, but we will explain only the ones that Invenio provides. The aforementioned ones are part of the JSONAlchemy core, while the rest are defined as extensions by Invenio. Be aware that new sections could come via extensions.
Each of these sections adds some information to the dictionary representing the field definition. For example, the dictionary generated for the field defined above would be something like:
{'aliases': [],
 'extend': False,
 'override': False,
 'pid': None,
 'producer': {
     'json_for_marc': [((), {'a': 'title', 'b': 'subtitle'})],
     'json_for_dc': [((), {'dc:title': ''})],
     'json_for_unimarc': [((), {'a': 'title', 'e': 'subtitle'})]},
 'rules': {
     'json': [
         {'decorators': {'after': {}, 'before': {}, 'on': {}},
          'function': <code object <module> at 0x10f173030, file "", line 1>,
          'source_format': 'json',
          'source_tags': ['title'],
          'type': 'creator'}],
     'marc': [
         {'decorators': {'after': {}, 'before': {}, 'on': {'legacy': None}},
          'function': <code object <module> at 0x10f10de30, file "", line 1>,
          'source_format': 'marc',
          'source_tags': ['245__'],
          'type': 'creator'}]}}
Only one field is shown here, but one file can contain from one up to n field definitions. Check out the atlantis.cfg file from the Invenio demo site to get a quick view of what the configuration file for your fields should look like.
For the BNF lovers, this is something close to the grammar used to parse this:
rule ::= [pid | extend | override]
json_id ["," aliases]":"
body
json_id ::= (letter|"_") (letter|digit|_)*
aliases ::= json_id ["," aliases]
pid ::= @persistent_identifier( level )
extend ::= @extend
override ::= @override
body ::=(creator* | derived | calculated) (extensions)*
creator ::= [decorators] format "," tag "," expr
derived ::= [decorators] expr
calculated ::= [decorators] expr
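The json_id and aliases productions can be approximated with a regular expression. The following is only a minimal sketch, not the parser Invenio uses: the helper name parse_field_header is hypothetical, "letter" is assumed to mean an ASCII letter, and the optional pid/extend/override prefix is ignored.

```python
import re

# json_id ::= (letter|"_") (letter|digit|_)*
JSON_ID = r"[A-Za-z_][A-Za-z0-9_]*"
# json_id ["," aliases] ":"
HEADER = re.compile(r"^\s*({0}(?:\s*,\s*{0})*)\s*:\s*$".format(JSON_ID))


def parse_field_header(line):
    """Return (json_id, aliases) for a header line, or None if it does not match."""
    match = HEADER.match(line)
    if match is None:
        return None
    names = [name.strip() for name in match.group(1).split(",")]
    return names[0], names[1:]


header = parse_field_header("authors, creators:")
```

Here header holds the field name followed by the list of its aliases; a line that starts with a digit is rejected.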
Creator¶
The creator is one of the most important parts of the field definition: inside it, the content of the field is created, while the way this happens depends on its origin.
The creator section is the one used to define the fields that are coming directly from the input file and don’t depend on any type of calculation from another source. We also call this kind of field a real field.
This section can be made out of one or several lines, each one representing the translation of the field, from whatever the input format is, into JSON.
For example:
    marc, '245..', {'title': value['a'], 'subtitle': value['b']}
This tells us that, for the master format marc, any field whose tag matches the regular expression 245.. is translated by applying the transformation {'title': value['a'], 'subtitle': value['b']}. Several regular expressions can be specified, separated by spaces.
The transformation must be a valid Python expression, as it will be evaluated as such. In it, the value of the field we are dealing with is available as value (typically a dictionary). This Python expression can also be a function call. The function can either be imported via the __import__() function or implemented in the /functions folder, the contents of which are imported automatically.
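To make the evaluation of a creator rule more concrete, here is a minimal sketch of the idea. It is not the JSONAlchemy implementation: apply_creator_rule is a hypothetical helper, the rule is reduced to a (tag patterns, expression) pair, and the tag is assumed to have to match a pattern in full.

```python
import re


def apply_creator_rule(rule, tag, value):
    """Apply one creator rule to a master-format field, or return None.

    rule is a (tag_patterns, expression) pair mirroring a line such as
    marc, '245..', {'title': value['a'], 'subtitle': value['b']}
    """
    tag_patterns, expression = rule
    # Several regular expressions may be given, separated by spaces.
    if not any(re.fullmatch(p, tag) for p in tag_patterns.split()):
        return None
    # The expression is evaluated as Python with the field bound to `value`.
    return eval(expression, {}, {"value": value})


rule = ("245..", "{'title': value['a'], 'subtitle': value['b']}")
title = apply_creator_rule(rule, "245__", {"a": "On the Origin", "b": "of Species"})
```

A field with a tag such as 100__ does not match the pattern, so the sketch returns None for it.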
For each master format that we want to deal with we need a Reader. Readers for JSON and for MARC21 are provided by default with Invenio. See Readers for more information about readers and How to Extend JSONAlchemy Behaviour to learn how to write your own reader.
Along with each creator rule there could be one or more decorators (like in python). We will describe the default decorators that are implemented and how to do more later in the Decorators section.
Derived¶
When a field is derived from a source that is not the input file and needs to be calculated only when the source it depends on changes (this is expected to happen infrequently) it is called a derived virtual field.
An example of a virtual field could be something like this:
number_of_authors:
    derived:
        @depends_on('authors')
        len(self['authors'])
This section is similar to the previous one, creator, but in this case each line is just a valid python expression.
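As an illustration, a derived rule can be thought of as an expression evaluated with the current JSON object bound to self. The sketch below assumes exactly that; evaluate_derived is a hypothetical helper, and it simply gives up when a dependency is missing instead of trying to evaluate the missing field first.

```python
def evaluate_derived(json, depends_on, expression):
    """Evaluate a derived rule only if every field it depends on is present."""
    if not all(name in json for name in depends_on):
        return None
    # Inside the rule, the current JSON object is available as `self`.
    return eval(expression, {}, {"self": json})


record = {"authors": ["Ellis, J.", "Higgs, P."]}
number_of_authors = evaluate_derived(record, ["authors"], "len(self['authors'])")
```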
Calculated¶
Another type of virtual field is the one whose value changes a lot over time; for example, the number of comments that a record inside Invenio has, or the number of reviews that a paper has.
In these cases we use calculated field definitions. Following the example of the number of comments, this could be its definition:
number_of_comments:
    calculated:
        @parse_first('recid')
        @memoize(30)
        get_number_of_comments(self.get('recid'))
A calculated rule is defined in the same way as a derived one.
One important point about the calculated fields is caching. One field could be:
- Always cached, until someone (some other module) changes its name,
- Cached for a period of time, like in the example, or
- Not cached at all, so its value is calculated every time.
See the Decorators for more information about this.
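The "cached for a period of time" case can be sketched as a plain Python decorator. This is an illustration of time-limited caching in general, not the code behind @memoize: the clock parameter is an addition made here so the example can be tested, and treating a life_time of 0 as "never cached" is an assumption of this sketch.

```python
import time


def memoize(life_time=0, clock=time.monotonic):
    """Cache a function's result per argument tuple for life_time seconds."""
    def decorator(func):
        cache = {}

        def wrapper(*args):
            now = clock()
            if args in cache:
                value, stored_at = cache[args]
                if now - stored_at < life_time:
                    return value
            value = func(*args)
            cache[args] = (value, now)
            return value

        return wrapper
    return decorator


calls = []
fake_now = [0.0]  # controllable clock, used instead of real time


@memoize(30, clock=lambda: fake_now[0])
def get_number_of_comments(recid):
    calls.append(recid)
    return len(calls)


first = get_number_of_comments(1)   # computed
second = get_number_of_comments(1)  # served from the cache
fake_now[0] = 31.0
third = get_number_of_comments(1)   # cache entry expired, computed again
```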
Schema¶
Here we can specify the schema or structure that the field should follow. This is done using nicolaiarocci/cerberus; you can read the documentation on how to use it on Read the Docs.
JSONAlchemy only adds two things to the default cerberus:
- The force boolean value, which tells whether the value of the field needs to be cast to type.
- The default function (which takes no parameters), which is used if the field has a default value.
An example of the schema section could be:
schema:
    {'uuid': {'type':'uuid', 'required': True, 'default': lambda: str(__import__('uuid').uuid4())}}
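The effect of the two additions, default and force, can be sketched without cerberus. Everything below is an assumption made for illustration: apply_schema_extras is a hypothetical helper, and the CASTS table mapping type names to Python callables is invented for the example.

```python
import uuid

# Hypothetical mapping from schema type names to Python callables.
CASTS = {"integer": int, "string": str, "float": float}


def apply_schema_extras(record, schema):
    """Fill defaults and cast forced fields; return a new dict."""
    result = dict(record)
    for name, rules in schema.items():
        if name not in result and "default" in rules:
            # `default` is a function without parameters.
            result[name] = rules["default"]()
        if name in result and rules.get("force"):
            result[name] = CASTS[rules["type"]](result[name])
    return result


schema = {
    "uuid": {"type": "string", "default": lambda: str(uuid.uuid4())},
    "year": {"type": "integer", "force": True},
}
record = apply_schema_extras({"year": "2014"}, schema)
```

After the call, record contains a freshly generated uuid string, and year has been cast from a string to an integer.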
Description¶
This is a special section, as it can be written without the description block:
uuid:
    """
    This is the main persistent identifier of a document and will be
    used internally as this, therefore the pid important should always
    be '0'.
    """
recid:
    description:
        """Record main identifier. """
Both cases have the same syntax (triple-quoted strings a-la python) and the same end result.
Note
The docstrings are not used anywhere else but inside the configuration files, for now. The plan is to use them to build the site's data model documentation using Sphinx; it is therefore quite important to write them and keep them updated.
JSON¶
Not all the fields that we want to use have a JSON-friendly representation. Consider a date that we would like to use as a datetime object, yet we want to store it as a JSON object.
To solve this issue, we introduced the JSON section, which defines a couple of functions:
- loads, to load the JSON representation from the database into an object, and
- dumps, which does the opposite.
A clear example of that is the creation_date field:
creation_date:
    json:
        dumps, lambda d: d.isoformat()
        loads, lambda d: __import__('dateutil').parser.parse(d)
Both functions take only one argument, which is the value of the field.
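The round trip performed by such a pair of functions can be sketched with the standard library alone. Note that this example swaps dateutil for datetime.fromisoformat (available from Python 3.7), so it only accepts the ISO 8601 strings that isoformat() itself produces.

```python
from datetime import datetime


def dumps(value):
    """Object -> JSON-friendly representation (an ISO 8601 string)."""
    return value.isoformat()


def loads(text):
    """JSON representation -> datetime object."""
    return datetime.fromisoformat(text)


creation_date = datetime(2014, 5, 17, 10, 30, 0)
stored = dumps(creation_date)  # what would be kept in the database
restored = loads(stored)       # what the application works with
```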
Producer¶
Generating a different output from a JSON object is not always easy: there might be implications among fields or rules. For this reason the producer section was introduced. The producer section can also be seen as a kind of documentation on how a field is exported to different formats and which formats those are.
This is an example of its use:
title:
    creator:
        marc, "245..", { 'title':value['a'], 'subtitle': value['b']}
    producer:
        json_for_marc(), {'a': 'title', 'b': 'subtitle'}
        json_for_dc(), {'dc:title': 'title'}
Each rule inside the producer section follows the same pattern. First we specify the function that we want to use (what we want to produce), which should be placed inside the /producers folder. This is not a real function call, but only a way to specify which producer we will use and which parameter we would like to use for this field. In the case of the MARC21 producer we can put 245__ as parameter, so that this function is used to generate the output only if title originated from a 245__ MARC21 field. This parameter could be used differently depending on each producer.
The second part, after the comma, is the rule that we will apply, and it is typically a dictionary. In the case of the MARC21 producer we can put the full name of the field as key, 245__a, or just the subfield like in the example. The value for this key could be a function call, a subfield or even empty (if we want to use the entire field as a value).
For more information about the MARC21 producer please check the JSON for MARC documentation.
Inside any JSONAlchemy object, like records or documents, there is a method, produce(producer_code, fields=None), that uses this and outputs a dictionary with a certain “flavor”. This new representation of the JSON object could be used elsewhere, for example in the formatter module, to generate the desired output in an easier way than only using the JSON object.
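How a producer rule maps a JSON field to its output keys can be sketched as follows. This is an assumption-laden illustration, not the json_for_marc or json_for_dc code: produce_for_field is a hypothetical helper, it handles only subfield names and the empty value, and it ignores both function-call values and the producer parameter. The real producers may also return a differently shaped structure.

```python
def produce_for_field(field_value, rule):
    """Build the produced dict for one field from a producer rule.

    rule maps output keys to subfield names of the JSON field; an empty
    string means "use the entire field as the value".
    """
    produced = {}
    for out_key, subfield in rule.items():
        if subfield == "":
            produced[out_key] = field_value
        elif subfield in field_value:
            produced[out_key] = field_value[subfield]
    return produced


title = {"title": "On the Origin", "subtitle": "of Species"}
marc = produce_for_field(title, {"a": "title", "b": "subtitle"})
dc = produce_for_field(title, {"dc:title": ""})
```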
Decorators¶
Like python decorators, field decorators could be used either to add extra information to the field itself or to modify the translation process that creates the field.
There are two different types of field decorators: one that decorates the entire field, and one that decorates a single creator/derived/calculated rule. As with the sections in the field definition, new decorators can be defined to extend the current ones.
Field Decorators¶
This type of decorator is used outside of the field definition and affects the whole field, for example by adding some information to the dictionary that defines it.
Invenio provides three different field decorators:
- @persistent_identifier(int): Identifies a field as a PID with a priority, which can later be accessed using the persistent_identifiers property.
- @override: As its name points out, it allows us to completely override the field definition.
- @extend: Allows us to extend an existing field with, for example, new creator rules.
Note
There are currently no extensions for this type of decorator. It is on the roadmap to allow each Invenio instance to extend these decorators with any others that it might need.
Rule Decorators¶
This other type of decorators applies to the creator/derived/calculated rules. For example:
authors:
    """List with all the authors, connected with main_author and rest_authors"""
    derived:
        @parse_first('_first_author', '_additional_authors')
        @connect('_first_author', sync_authors)
        @connect('_additional_authors', sync_authors)
        @only_if('_first_author' in self or '_additional_authors' in self)
        util_merge_fields_info_list(self, ['_first_author', '_additional_authors'])
These decorators are applied only if the derived rule of the field authors is applied.
The rule decorators are split into three different kinds depending on when they are evaluated: before the rule gets evaluated, during the evaluation of the rule and after the rule evaluation.
This is the list of rule decorators available in Invenio and what they are used for.
connect(field_name, handler=None)
    A post-evaluation decorator that allows the connection between fields. This connection is bidirectional: if the connected field gets modified, then the decorated field also gets modified, and vice versa.
    The optional handler function will be called whenever there is any modification in any of the fields. The default behavior is to propagate the value across all the connected fields.
depends_on(*field_names)
    This decorator acts before rule evaluation and tells JSONAlchemy whether the rule will be evaluated, depending on the existence of the field_names inside the current JSON object.
    If the fields are not in the JSON object and their rules have not been evaluated yet, then it will try to evaluate them before failing.
legacy(master_format, legacy_field, matching)
    An on-evaluation decorator that adds some legacy information to the rule that is being applied. The master format is not important when dealing with a creator rule (it will be derived from the rule); otherwise it needs to be specified. The matching argument is typically a tuple where we connect the legacy field with the subfields.
memoize(life_time=0)
    This post-evaluation decorator only works with calculated fields. It creates a cached value of the decorated field for a determined time.
only_if_master_value(*boolean_expressions)
    An on-evaluation decorator that gives access to the current master value. It is typically used to evaluate one rule only if the master value matches a series of conditions.
    The boolean expression could be any Python expression that evaluates to True or False.
only_if(*boolean_expressions)
    Like the previous one, but in this case we don’t have access to the current master value, only to the current JSON object.
parse_first(*field_names)
    This could be seen as a lighter version of depends_on. However, in this case the rule will be evaluated even if the field names are not inside the JSON object; it only triggers parsing the rules for the fields.
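The three evaluation phases (before, on and after) can be sketched for a single derived rule. This is a simplified illustration under several assumptions: evaluate_rule and the dictionary layout of rule are hypothetical, depends_on does not try to evaluate missing fields, only_if conditions are given as strings, and connect only applies the default behavior of copying the new value to the connected fields.

```python
def evaluate_rule(json, field_name, rule):
    """Evaluate one derived rule, honouring before/on/after decorators."""
    # before: every field named by depends_on must already be present.
    if not all(name in json for name in rule.get("depends_on", ())):
        return False
    # on: only_if conditions see the current JSON object as `self`.
    for condition in rule.get("only_if", ()):
        if not eval(condition, {}, {"self": json}):
            return False
    json[field_name] = eval(rule["expression"], {}, {"self": json})
    # after: connect propagates the new value to the connected fields
    # (the default behaviour when no handler is given).
    for connected in rule.get("connect", ()):
        json[connected] = json[field_name]
    return True


record = {"creators": ["Ellis, J."]}
rule = {
    "depends_on": ["creators"],
    "only_if": ["len(self['creators']) > 0"],
    "connect": ["contributors"],
    "expression": "list(self['creators'])",
}
applied = evaluate_rule(record, "authors", rule)
```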
For more information about the decorators, and also about the other extensions, check the Parsers section.
Note
Be aware that, right now, the order of the decorators is not respected.
Model Configuration File¶
Readers¶
How it Works¶
How to Extend JSONAlchemy Behaviour¶
Invenio Use Cases¶
API Documentation¶
This documentation is automatically generated from JSONAlchemy’s source code.
Core¶
Bases¶
General extensions for JSON objects.
JSONAlchemy allows the developer to extend the behavior or capabilities of the JSON objects using extensions. For more information about how extensions work, check invenio.modules.jsonalchemy.jsonext.parsers.extension_model_parser.ExtensionModelParser.
Errors¶
JSONAlchemy errors.
exception invenio.modules.jsonalchemy.errors.FieldParserException¶
    Raised when some error happens parsing field definitions.
exception invenio.modules.jsonalchemy.errors.JSONAlchemyException¶
    Base exception.
exception invenio.modules.jsonalchemy.errors.ModelParserException¶
    Raised when some error happens parsing model definitions.
exception invenio.modules.jsonalchemy.errors.ReaderException¶
    Raised when some error happens reading a blob.
Base Model and Field Parser¶
invenio.modules.jsonalchemy.parser._create_field_parser()¶
    Create a parser that can handle field definitions.
    BNF-like grammar:
        rule ::= [pid | extend | override]
                 json_id ["," aliases]":"
                     body
        json_id ::= (letter|"_") (letter|digit|_)*
        aliases ::= json_id ["," aliases]
        pid ::= @persistent_identifier( level )
        extend ::= @extend
        override ::= @override
        hidden ::= @hidden
        body ::=(creator* | derived | calculated) (extensions)*
        creator ::= [decorators] format "," tag "," expr
        derived ::= [decorators] expr
        calculated ::= [decorators] expr
    To check the syntax of the parser extensions or decorators, please go to invenio.modules.jsonalchemy.jsonext.parsers.
invenio.modules.jsonalchemy.parser._create_model_parser()¶
    Create a parser that can handle model definitions.
    BNF-like grammar:
        TODO
    Note: Unlike the field configuration files, where you can specify more than one field inside each file, for the models only one definition is allowed per file.
class invenio.modules.jsonalchemy.parser.FieldParser(namespace)¶
    Field definitions parser.
    classmethod decorator_after_extensions()¶
        TODO.
    classmethod decorator_before_extensions()¶
        TODO.
    classmethod decorator_on_extensions()¶
        TODO.
    classmethod field_definition_model_based(field_name, model_name, namespace)¶
        Get the real field definition based on the model name.
        Based on a model name (and namespace) it gets the real field definition.
    classmethod field_definitions(namespace)¶
        Get all the field definitions from a given namespace.
        If the namespace does not exist, it tries to create it first.
    classmethod field_extensions()¶
        Get the field parser extensions from the parser registry.
    classmethod legacy_field_matchings(namespace)¶
        Get all the legacy mappings for a given namespace.
        If the namespace does not exist, it tries to create it first.
        See: guess_legacy_field_names()
    classmethod reparse(namespace)¶
        Reparse all the fields.
        Invalidate the cached version of all the fields inside the given namespace and parse them again.
class
invenio.modules.jsonalchemy.parser.
ModelParser
(namespace)¶ Record model parser.
-
classmethod
model_definitions
(namespace)¶ Get all the model definitions given a namespace.
If the namespace does not exist, it tries to create it first.
-
classmethod
parser_extensions
()¶ Get only the model parser extensions from the parser registry.
-
classmethod
reparse
(namespace)¶ Invalidate the cached version of all the models.
It does it inside the given namespace and parse it again.
-
classmethod
resolve_models
(model_list, namespace)¶ Resolve all the field conflicts.
From a given list of model definitions resolves all the field conflicts and returns a new model definition containing all the information from the model list. The field definitions are resolved from left-to-right.
Parameters: model_list – It could be also a string, in which case the model definition is returned as it is. Returns: Dictionary containing the union of the model definitions.
-
classmethod
Base Reader¶
class invenio.modules.jsonalchemy.reader.Reader(json, blob=None, **kwargs)¶
    Base reader.
    classmethod add(json, fields, blob=None, fetch_model_info=False)¶
        Add the list of fields to the json structure.
        If fields is None it adds all the possible fields from the current model.
        Parameters:
        - json – Any SmartJson object
        - fields – Dict of fields to be added to the json structure containing field_name:json_id
    classmethod process_model_info(json)¶
        Process model information.
        Fetches all the possible information about the current models and applies the evaluate methods of all the model extensions, if any extension is used.
    classmethod set(json, field, value=None, set_default_value=False)¶
        Set new field value to json object.
        When adding a new field to the json object, it finds as much information about it as possible and attaches it to the json object inside json['__meta_metadata__'][field].
        Parameters:
        - json – Any SmartJson object
        - field – Name of the new field to be added
        - value – New value for the field (if not None)
        - set_default_value – If set to True looks for the default value, if any, and sets it.
    static split_blob(blob, schema=None, **kwargs)¶
        Specify how to split the blob by single record.
        If there are several records inside the blob, this method specifies how to split them so that they can be processed one by one afterwards.
    classmethod translate(blob, json_class, master_format='json', **kwargs)¶
        Transform the incoming blob into a json structure (json_class).
        It uses the rules described in the field and model definitions.
        Parameters:
        - blob – incoming blob (like MARC)
        - json_class – Any subclass of SmartJson
        - master_format – Master format of the input blob.
        - kwargs – parameter to pass to json_class
        Returns: New object of json_class type containing the result of the translation
    classmethod update(json, fields, blob=None, update_db=False)¶
        Update the fields given from the json structure.
        Parameters:
        - json – Any SmartJson object
        - blob – incoming blob (like MARC), if None, json.get_blob will be used to retrieve it if needed.
        - fields – List of fields to be updated, if None all fields will be updated.
        - save – If set to True a ‘soft save’ will be performed with the changes.
    classmethod update_meta_metadata(json, blob=None, fields=None, section=None, keep_core_values=True, store_backup=True)¶
        Update the meta-metadata for a given set of fields.
        If it is None all fields will be used.
Registries¶
Storage Engine Interface¶
class invenio.modules.jsonalchemy.storage.Storage(model, **kargs)¶
    Default storage engine interface.
    create()¶
        Create underlying empty storage.
    drop()¶
        Drop data from underlying storage.
    get_field_values(ids, field, repetitive_values=True, count=False, include_recid=False, split_by=0)¶
        Return a list of field values for field for the given ids.
        Parameters:
        - ids – list (or iterable) of integers
        - repetitive_values – if set to True, returns all values even if they are doubled. If set to False, then return unique values only.
        - count – in combination with repetitive_values=False, adds to the result the number of occurrences of the field.
        - split – specifies the size of the output.
    get_fields_values(ids, fields, repetitive_values=True, count=False, include_recid=False, split_by=0)¶
        Return a dictionary of field values for field for the given ids.
        As in get_field_values(), but in this case returns a dictionary with each of the fields and the list of field values.
    get_many(ids)¶
        Return an iterable of json objects whose id is inside ids.
    get_one(id)¶
        Return the json matching the id.
    save_many(jsons, ids=None)¶
        Store many JSON objects, given as elements of the iterable jsons.
    save_one(json, id=None)¶
        Store one json in the storage system.
    search(query)¶
        Retrieve all entries which match the query JSON prototype document.
        This method should not be used on storage engines without native JSON support (e.g., MySQL). Returns a cursor over the matched documents.
        Parameters: query – dictionary specifying the search prototype document
    update_many(jsons, ids=None)¶
        Update many json objects following the same rule as update_one.
    update_one(json, id=None)¶
        Update one JSON.
        If id is None a field representing the id is expected inside the JSON object.
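To show what an engine following this interface might look like, here is a small dict-backed sketch covering a subset of the methods. It is not one of the Invenio engines: the class name DictStorage, the "_id" field used when no id is passed, and the choice to raise KeyError on duplicate or missing ids are all assumptions of this example.

```python
class DictStorage(object):
    """Hypothetical dict-backed engine following the Storage method names."""

    def __init__(self):
        self._data = {}

    def save_one(self, json, id=None):
        # If id is None, a field representing the id is expected in the JSON.
        key = json["_id"] if id is None else id
        if key in self._data:
            raise KeyError("id %r already stored" % (key,))
        self._data[key] = dict(json)

    def update_one(self, json, id=None):
        key = json["_id"] if id is None else id
        if key not in self._data:
            raise KeyError("id %r not found" % (key,))
        self._data[key] = dict(json)

    def get_one(self, id):
        return self._data[id]

    def get_many(self, ids):
        return (self._data[i] for i in ids)

    def get_field_values(self, ids, field, repetitive_values=True):
        values = [self._data[i][field] for i in ids if field in self._data[i]]
        if repetitive_values:
            return values
        # Keep only the first occurrence of each value, preserving order.
        unique = []
        for value in values:
            if value not in unique:
                unique.append(value)
        return unique


storage = DictStorage()
storage.save_one({"_id": 1, "title": "A"})
storage.save_one({"title": "A"}, id=2)
storage.update_one({"_id": 1, "title": "B"})
```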
Default Validator¶
Wrappers¶
JSONAlchemy wrappers.
class invenio.modules.jsonalchemy.wrappers.SmartJson(json=None, set_default_values=False, process_model_info=False, **kwargs)¶
    Base class for Json structures.
    additional_info¶
        Shortcut to __meta_metadata__.__additional_info__.
    continuable_errors¶
        Shortcut to __meta_metadata__.__continuable_errors__.
    dumps(without_meta_metadata=False, with_calculated_fields=False, clean=False, keywords=None, filter_hidden=False)¶
        Create the JSON friendly representation of the current object.
        Parameters:
        - without_meta_metadata – by default False, if set to True all the __meta_metadata__ will be removed from the output.
        - with_calculated_fields – by default the calculated fields are not dumped, if they are needed in the output set it to True
        - clean – if set to True all the keys starting with _ will be removed from the output
        - keywords – list of keywords to dump. If None, return all
        Returns: JSON friendly object
    errors¶
        Shortcut to __meta_metadata__.__errors__.
    get(key, default=None, reset=False, **kwargs)¶
        Like in dict.get.
    get_blob(*args, **kwargs)¶
        To be overridden in the specific class.
        Should look for the original version of the file where the json came from.
    items(without_meta_metadata=False)¶
        Like in dict.items.
    iteritems(without_meta_metadata=False)¶
        Like in dict.items.
    keys(without_meta_metadata=False)¶
        Like in dict.keys.
    loads(without_meta_metadata=False, with_calculated_fields=True, clean=False)¶
        Create the BSON representation of the current object.
        Parameters:
        - without_meta_metadata – if set to True all the __meta_metadata__ will be removed from the output.
        - with_calculated_fields – by default the calculated fields are in the output, if they are not needed set it to False
        - clean – if set to True all the keys starting with _ will be removed from the output
        Returns: JSON friendly object
    meta_metadata¶
        Shortcut to __meta_metadata__.
    model_info¶
        Shortcut to __meta_metadata__.__model_info__.
    produce(producer_code, fields=None)¶
        Create a different flavor of JSON depending on producer_code.
        Parameters:
        - producer_code – One of the possible producers listed in the producer section inside the field definitions.
        - fields – List of fields that should be present in the output, if None all fields from self will be used.
        Returns: It depends on each producer, see producer folder inside jsonext, typically dict.
    set_default_values(fields=None)¶
        Set default value for the fields using the schema definition.
        Parameters: fields – List of fields to set the default value, if None all.
    validate(validator=None)¶
        Validate the current JSON content using Cerberus.
        See: (Cerberus)[http://cerberus.readthedocs.org/en/latest].
        Parameters: validator – Validator to be used, if None Validator
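The filtering flags described for dumps can be illustrated on a plain dictionary. The function below is a hypothetical stand-in written for this example, not the SmartJson method: it only covers without_meta_metadata, clean and keywords, and it leaves out calculated and hidden fields.

```python
def dump_filtered(json, without_meta_metadata=False, clean=False, keywords=None):
    """Return a filtered copy of json following the documented flags."""
    result = {}
    for key, value in json.items():
        if keywords is not None and key not in keywords:
            continue
        if without_meta_metadata and key == "__meta_metadata__":
            continue
        # Keys starting with "_" include "__meta_metadata__" itself.
        if clean and key.startswith("_"):
            continue
        result[key] = value
    return result


record = {"title": "A", "_first_author": "B", "__meta_metadata__": {}}
public_view = dump_filtered(record, clean=True)
```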
class invenio.modules.jsonalchemy.wrappers.SmartJsonLD(json=None, set_default_values=False, process_model_info=False, **kwargs)¶
    Utility class for JSON-LD serialization.
    get_context(context)¶
        Return the context definition identified by the parameter.
        If the context is not found in the current namespace, the received parameter is returned as is, the assumption being that an IRI was passed.
        Parameters: context – context identifier
    get_jsonld(context, new_context={}, format='full')¶
        Return the JSON-LD serialization.
        Parameters:
        - context – the context to use for raw publishing; each SmartJsonLD instance is expected to have a default context associated.
        - new_context – the context to use for formatted publishing, usually supplied by the client; used by the ‘compacted’, ‘framed’, and ‘normalized’ formats.
        - format – the publishing format; can be ‘full’, ‘inline’, ‘compacted’, ‘expanded’, ‘flattened’, ‘framed’ or ‘normalized’. Note that ‘full’ and ‘inline’ are synonyms, referring to the document form which includes the context; for more information see: [http://www.w3.org/TR/json-ld/]
    translate(context_name, context)¶
        Translate object to fit given JSON-LD context.
        Should not inject context as this will be done at publication time.
class invenio.modules.jsonalchemy.wrappers.StorageEngine(name, bases, dct)¶
    Storage metaclass for parsing application config.
    storage_engine¶
        Return an instance of storage engine defined in application config.
        It looks for the key “ENGINE” prefixed by __storagename__.upper(). For example:
            class Dummy(SmartJson):
                __storagename__ = 'dummy'
        will look for the key “DUMMY_ENGINE”, and “DUMMY_`DUMMY_ENGINE.__name__.upper()`” should contain a dictionary with the keyword arguments of the engine defined in “DUMMY_ENGINE”.
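The key lookup described for storage_engine can be sketched with a plain dictionary standing in for the application config. The helper resolve_storage_engine and the stand-in MemoryStorage class are hypothetical, and the sketch assumes that the engine entry already holds a class object, which may not be how the real configuration stores it.

```python
def resolve_storage_engine(config, storagename):
    """Instantiate the engine configured for a storage name."""
    prefix = storagename.upper()
    engine_class = config[prefix + "_ENGINE"]
    # Keyword arguments live under e.g. DUMMY_MEMORYSTORAGE.
    kwargs_key = "%s_%s" % (prefix, engine_class.__name__.upper())
    return engine_class(**config.get(kwargs_key, {}))


class MemoryStorage(object):
    """Stand-in engine for the example; not the Invenio class."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


config = {"DUMMY_ENGINE": MemoryStorage, "DUMMY_MEMORYSTORAGE": {"size": 10}}
engine = resolve_storage_engine(config, "dummy")
```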
Extensions¶
Engines¶
class invenio.modules.jsonalchemy.jsonext.engines.cache.CacheStorage(**kwargs)¶
    Implement storage engine for Flask-Cache useful for testing.
    get_field_values(ids, field, repetitive_values=True, count=False, include_recid=False, split_by=0)¶
        See get_field_values().
    get_fields_values(ids, fields, repetitive_values=True, count=False, include_recid=False, split_by=0)¶
        See get_fields_values().
    get_many(ids)¶
        See get_many().
    save_many(jsons, ids=None)¶
        See save_many().
    save_one(data, id=None)¶
        See save_one().
    update_many(jsons, ids=None)¶
        See update_many().
    update_one(data, id=None)¶
        See update_one().
class invenio.modules.jsonalchemy.jsonext.engines.memory.MemoryStorage(**kwargs)¶
    Implement in-memory storage engine.
    get_field_values(ids, field, repetitive_values=True, count=False, include_recid=False, split_by=0)¶
        See get_field_values().
    get_fields_values(ids, fields, repetitive_values=True, count=False, include_recid=False, split_by=0)¶
        See get_fields_values().
    get_many(ids)¶
        See get_many().
    save_many(jsons, ids=None)¶
        See save_many().
    save_one(data, id=None)¶
        See save_one().
    update_many(jsons, ids=None)¶
        See update_many().
    update_one(data, id=None)¶
        See update_one().
class invenio.modules.jsonalchemy.jsonext.engines.mongodb_pymongo.MongoDBStorage(model, **kwards)¶
    Storage engine for MongoDB using the driver pymongo.
    get_field_values(ids, field, repetitive_values=True, count=False, include_id=False, split_by=0)¶
        See get_field_values().
    get_fields_values(ids, fields, repetitive_values=True, count=False, include_id=False, split_by=0)¶
        See get_fields_values().
    get_many(ids)¶
        See get_many().
    save_many(jsons, ids=None)¶
        See save_many().
    save_one(json, id=None)¶
        See save_one().
    update_many(jsons, ids=None)¶
        See update_many().
    update_one(json, id=None)¶
        See update_one().
class invenio.modules.jsonalchemy.jsonext.engines.sqlalchemy.SQLAlchemyStorage(model, **kwards)¶
    Implement database backend for SQLAlchemy model storage.
    db¶
        Return SQLAlchemy database object.
    get_field_values(recids, field, repetitive_values=True, count=False, include_recid=False, split_by=0)¶
        See get_field_values().
    get_fields_values(recids, fields, repetitive_values=True, count=False, include_recid=False, split_by=0)¶
        See get_fields_values().
    get_many(ids)¶
        See get_many().
    model¶
        Return SQLAlchemy model.
    save_many(jsons, ids=None)¶
        See save_many().
    save_one(json, id=None)¶
        See save_one().
    update_many(jsons, ids=None)¶
        See update_many().
    update_one(json, id=None)¶
        See update_one().
Functions¶
Parsers¶
JSONAlchemy parsers.
class invenio.modules.jsonalchemy.jsonext.parsers.connect_parser.ConnectParser¶
    Handles the @connect decorator:
        authors:
            derived:
                @connect('creators', handler_function)
                @connect('contributors', handler_function)
                self.get_list('creators') + self.get_list('contributors')
    The handler functions will receive as parameters self and the current value of the field.
    classmethod add_info_to_field(json_id, info, args)¶
        Simply returns the list with the tuples.
    classmethod create_element(rule, field_def, content, namespace)¶
        Simply returns the list with the tuples.
    classmethod evaluate(json, field_name, action, args)¶
        Applies the connect function with the json, field_name and action parameters if any function is available; otherwise it puts the content of the current field into the connected one.
    classmethod parse_element(indent_stack)¶
        Sets the connect attribute to the rule.
class
invenio.modules.jsonalchemy.jsonext.parsers.depends_on_parser.
DependsOnParser
¶ Handle the @depends_on decorator:
authors: derived: @depends_on('creators', 'contributors') self.get_list('creators') + self.get_list(contributors)
-
classmethod
create_element
(rule, field_def, content, namespace)¶ Just returns the list with the field names
-
classmethod
evaluate
(reader, args)¶ Tries to apply the rules for each field, if it fails on one of them returns False
-
classmethod
-
class
invenio.modules.jsonalchemy.jsonext.parsers.description_parser.
DescriptionParser
¶ Handle the description section in model and field definitions.
title: """Description on title""" title: description: """Description on title"""
-
classmethod
create_element
(rule, namespace)¶ Simply return of the string.
-
classmethod
evaluate
(*args, **kwargs)¶ Evaluate parser.
This method is implemented like this because this parser is used for both fields and models, and each of them has a different signature. Moreover, this method does nothing.
-
classmethod
extend_model
(current_value, new_value)¶ The description should remain the one from the child model.
-
classmethod
inherit_model
(current_value, base_value)¶ The description should remain the one from the child model.
-
classmethod
parse_element
(indent_stack)¶ Set the description on the rule.
-
class
invenio.modules.jsonalchemy.jsonext.parsers.extension_model_parser.
ExtensionModelParser
¶ Handles the extension section in the model definitions:
fields:
    ....
extensions:
    'invenio_records.api:RecordIter'
    'invenio.modules.jsonalchemy.bases:Versionable'
-
classmethod
add_info_to_field
(info)¶ Adds the list of extensions to the model information
-
classmethod
create_element
(rule, namespace)¶ Simply returns the list of extensions
-
classmethod
evaluate
(obj, args)¶ Extend the incoming object with all the new things from args
-
classmethod
extend_model
(current_value, new_value)¶ Behaves like inherit_model().
-
classmethod
inherit_model
(current_value, base_value)¶ Extends the list of extensions with the new ones without repeating
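The "without repeating" behaviour of inherit_model can be pictured with a small order-preserving merge. This is only a sketch: the helper name merge_extensions, the second import string and the choice to list the child's extensions first are assumptions made for illustration, not the parser's actual implementation.

```python
def merge_extensions(current, base):
    """Return `current` followed by the entries of `base` not already present."""
    merged = list(current)
    for ext in base:
        if ext not in merged:  # skip repeats, keep first-seen order
            merged.append(ext)
    return merged


# 'example.module:Hypothetical' is a made-up import string for the example.
child = ['invenio_records.api:RecordIter']
parent = ['invenio_records.api:RecordIter', 'example.module:Hypothetical']
print(merge_extensions(child, parent))
```

A plain list is used instead of a set so that the order in which extensions were declared is kept.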
-
classmethod
parse_element
(indent_stack)¶ Sets
extensions
attribute to the rule definition
-
class
invenio.modules.jsonalchemy.jsonext.parsers.json_extra_parser.
JsonExtraParser
¶ JSON extension.
It parses something like this:
json:
    loads, function_to_load(field)
    dumps, function_to_dump(field)
The functions to load and dump must have one parameter which is the field to parse.
The main purpose of this extension is to be able to work with non-JSON fields, such as dates, inside the JSON object. Following the example of dates, the loads function takes a string representing a date and transforms it into a datetime object, whereas the dumps function takes this object and creates a JSON-friendly representation, usually datetime.isoformat.
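A pair of load and dump functions for the date case could look roughly like the following. The names load_date and dump_date are hypothetical, and the sketch assumes the stored string is in ISO 8601 form; it relies only on the standard library (datetime.fromisoformat needs Python 3.7 or later).

```python
from datetime import datetime


def load_date(field):
    """Turn an ISO 8601 string from the stored JSON into a datetime."""
    return datetime.fromisoformat(field)


def dump_date(field):
    """Turn a datetime back into a JSON-friendly ISO 8601 string."""
    return field.isoformat()


stored = '2014-03-01T12:30:00'
loaded = load_date(stored)       # datetime object usable in Python code
print(dump_date(loaded))         # string that can be serialised as JSON
```

Each function takes a single parameter, the field to convert, which matches the requirement stated above.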
-
classmethod
add_info_to_field
(json_id, rule)¶ Add to the field definition the path to get the json functions.
-
classmethod
create_element
(rule, namespace)¶ Create the dictionary with the dump and load functions.
-
classmethod
evaluate
(json, field_name, action, args)¶ Evaluate the dumps and loads functions depending on the action.
-
classmethod
parse_element
(indent_stack)¶ Set
json_ext
in the rule.
-
class
invenio.modules.jsonalchemy.jsonext.parsers.legacy_parser.
LegacyParser
¶ Handle the @legacy decorator:
doi:
    creator:
        @legacy((("024", "0247_", "0247_%"), ""),
                ("0247_a", ""))
        marc, "0247_", get_doi(value)
files:
    calculated:
        @legacy('marc', ("8564_z", "comment"),
                ("8564_y", "caption", "description"),
                ("8564_q", "eformat"),
                ("8564_f", "name"),
                ("8564_s", "size"),
                ("8564_u", "url", "url")
               )
        @parse_first(('recid', ))
        {'url': 'http://example.org'}
-
classmethod
create_element
(rule, field_def, content, namespace)¶ Special case of decorator.
It creates the legacy rules dictionary and has no effect on the field definitions:
{'100' : ['authors[0]'],
 '100__' : ['authors[0]'],
 '100__%': ['authors[0]'],
 '100__a': ['authors[0].full_name'],
 .......
}
-
classmethod
evaluate
(value, namespace, args)¶ Evaluate parser.
This is a special case: the real work of the decorator has already been done before this evaluation step.
-
class
invenio.modules.jsonalchemy.jsonext.parsers.memoize_parser.
MemoizeParser
¶ Handle the @memoize decorator:
number_of_comments:
    calculated:
        @memoize(300)
        get_number_of_comments(self['recid'])
This decorator works only with calculated fields, and there are three ways of using it:
- No decorator is specified. The value of the field is calculated every time somebody asks for it and is not stored in the DB. This is useful for fields that return objects that can't be stored in the DB in a JSON-friendly manner, or for fields whose value changes often and whose calculation function is very light.
- The decorator is used without any time, @memoize(). The value of the field is calculated when the record is created and is stored in the DB; the client that modifies the data used to calculate the field is responsible for updating the field value in the DB. This form should be used for fields that are typically updated by just a few clients, like bibupload, bibrank, etc.
- A lifetime is set with the decorator, @memoize(300). In this case the field value is only calculated when somebody asks for it, and it is stored in a general cache (invenio.ext.cache) using the timeout from the decorator. This form should be used for a field whose value changes often and whose calculation function is not light. Keep in mind that the value someone gets might be outdated. To avoid this, the client that modifies the data the value is calculated from can also invalidate the cache or modify the cached value. A good example is the field number_of_comments.
The cache engine used by this decorator can be set using CFG_JSONALCHEMY_CACHE in your instance configuration; by default invenio.ext.cache:cache is used. CFG_JSONALCHEMY_CACHE must be an importable string pointing to the cache object.
-
DEFAULT_TIMEOUT
= -1¶ Default timeout; -1 means the cache will not be invalidated unless it is explicitly requested.
-
classmethod
add_info_to_field
(json_id, info, args)¶ Set the timeout for the field.
-
classmethod
create_element
(rule, field_def, content, namespace)¶ Try to evaluate the memoize value to int.
If it fails it sets the default value from
DEFAULT_TIMEOUT
.
-
classmethod
evaluate
(json, field_name, action, args)¶ Evaluate the parser.
When getting a JSON field, compare its timestamp with its lifetime and, if the lifetime is over, calculate its value again.
If the value of the field has changed since the last time, it gets updated in the DB.
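The timestamp and lifetime comparison described for the @memoize decorator can be sketched with a small in-memory cache. This is an illustration only: TimedCache and its methods are hypothetical names, and the real decorator works against the configured cache object rather than a dictionary.

```python
import time


class TimedCache:
    """Tiny in-memory stand-in for a cache with per-entry lifetimes."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._store = {}  # key -> (value, timestamp)

    def get(self, key, compute, lifetime):
        """Return the cached value, recomputing it once `lifetime` has passed.

        A negative lifetime means the entry never expires on its own.
        """
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None:
            value, stamp = entry
            if lifetime < 0 or now - stamp < lifetime:
                return value
        value = compute()
        self._store[key] = (value, now)
        return value

    def invalidate(self, key):
        """Drop an entry so that the next `get` recomputes it."""
        self._store.pop(key, None)


# Usage with a fake clock so the example does not need to sleep.
fake_now = [0.0]
cache = TimedCache(clock=lambda: fake_now[0])
calls = []


def count_comments():
    calls.append(1)
    return len(calls)


first = cache.get('number_of_comments', count_comments, 300)
fake_now[0] = 100.0
second = cache.get('number_of_comments', count_comments, 300)  # still fresh
fake_now[0] = 400.0
third = cache.get('number_of_comments', count_comments, 300)   # expired
```

The invalidate method corresponds to the advice that a client changing the underlying data can clear the cached value instead of waiting for the lifetime to run out.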
-
classmethod
parse_element
(indent_stack)¶ Set
memoize
attribute to the rule.
-
class
invenio.modules.jsonalchemy.jsonext.parsers.only_if_master_value_parser.
OnlyIfMasterValueParser
¶ Handle the @only_if_master_value decorator:
files_to_upload:
    creator:
        @only_if_master_value(is_local_url(value['u']), is_available_url(value['u']))
        marc, "8564_", {'hots_name': value['a'],
                        'access_number': value['b'],
                        ........
-
classmethod
create_element
(rule, field_def, content, namespace)¶ Simply return the list of boolean expressions.
-
classmethod
evaluate
(value, namespace, args)¶ Evaluate args with the master value from the input.
Returns: a boolean depending on the evaluated value.
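Evaluating a list of boolean expressions against the master value might look like the sketch below. It assumes that every expression must hold for the rule to apply, and that the expressions come from trusted configuration files, since eval is used. The names passes_conditions and the body of is_local_url are made up for the example.

```python
def is_local_url(url):
    """Hypothetical helper: treat URLs on localhost as local."""
    return url.startswith('http://localhost/')


def passes_conditions(value, conditions, namespace=None):
    """Return True only if every expression is true for the master value."""
    env = dict(namespace or {})  # copy, so the caller's namespace is untouched
    env['value'] = value         # expose the master value to the expressions
    return all(bool(eval(expr, env)) for expr in conditions)


conditions = ["is_local_url(value['u'])", "value['u'].endswith('.pdf')"]
helpers = {'is_local_url': is_local_url}
print(passes_conditions({'u': 'http://localhost/f.pdf'}, conditions, helpers))
```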
-
classmethod
parse_element
(indent_stack)¶ Set
only_if_master_value
attribute to the rule.
-
class
invenio.modules.jsonalchemy.jsonext.parsers.only_if_parser.
OnlyIfParser
¶ Handle the @only_if decorator:
number_of_copies:
    creator:
        @only_if('BOOK' in self.get('collection.primary', []))
        get_number_of_copies(self.get('recid'))
-
classmethod
evaluate
(reader, args)¶ Evaluate parser.
This is a special case: the real work of the decorator is done before this evaluation step.
-
class
invenio.modules.jsonalchemy.jsonext.parsers.parse_first_parser.
ParseFirstParser
¶ Handle the @parse_first decorator:
author_aggregation:
    derived:
        @parse_first('creators', 'contributors')
        self.get_list('creators') + self.get_list('contributors')
-
classmethod
evaluate
(reader, args)¶ Try to parse args first and always return True.
-
class
invenio.modules.jsonalchemy.jsonext.parsers.producer_parser.
ProducerParser
¶ Handles the producer section from a field definition.
An example of this section could be:
recid:
    producer:
        json_for_marc(), {'001': ''}
title:
    producer:
        json_for_marc(), {'a': 'title'}
creator:
    producer:
        json_for_marc('100__'), {....}
        json_for_marc('1001_'), {....}
        json_for_marc('100[^1][^_]'), {....}
The parameter passed to the producer can be used by the producer, for example, to decide whether the current producer rule applies depending on the tag from the master format. It is typically a string or a regex, but this should be double-checked against the producer implementation.
To view the list of available producers, check the producers folder inside jsonext or simply run:
>>> from invenio.modules.jsonalchemy.registry import producers
>>> dict(producers)
-
classmethod
create_element
(rule, namespace)¶ Prepare the list of producers with their names and parameters.
-
classmethod
parse_element
(indent_stack)¶ Set the list of producers in the producer attribute of the rule.
-
class
invenio.modules.jsonalchemy.jsonext.parsers.schema_parser.
SchemaParser
¶ Parse the schema definitions for fields, using cerberus.
modification_date:
    schema:
        {'modification_date': {
            'type': 'datetime',
            'required': True,
            'default': lambda: __import__('datetime').datetime.now()}}
-
classmethod
create_element
(rule, namespace)¶ Just evaluate the content of the schema to a python dictionary.
-
classmethod
parse_element
(indent_stack)¶ Set the
schema
attribute inside the rule.
Producers¶
JSON for MARC¶
MARC formatted as JSON producer.
This producer could be used in several ways.
It could preserve the input tag from marc:
title:
    ...
    producer:
        json_for_marc(), {'a': 'title'}
It will output the old MARC tag followed by the subfield (the dictionary key), and the value of this key will be json['title']['title']. For example:
...
<datafield tag="245" ind1="1" ind2="2">
<subfield code="a">Awesome title</subfield>
</datafield>
...
Will produce:
[..., {'24512a': 'Awesome title'}, ...]
It could also unify the input MARC:
title:
    ...
    producer:
        json_for_marc(), {'245__a': 'title'}
Using the same example as before it will produce:
[..., {'245__a': 'Awesome title'}, ...]
The third way of using it is to create different outputs depending on the input tag. Let's say this time we have this field definition:
title:
    ...
    producer:
        json_for_marc('24511'), {'a': 'title'}
        json_for_marc('245__'), {'a': 'title', 'b': 'subtitle'}
The previous piece of MARC will produce the same output as before:
[..., {'24512a': 'Awesome title'}, ...]
But if we use this one:
...
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">Awesome title</subfield>
<subfield code="b">Awesome subtitle</subfield>
</datafield>
...
This will produce:
[..., {'245__a': 'Awesome title'}, {'245__b': 'Awesome subtitle'},...]
This last approach should be used carefully as all the rules are applied, therefore the rules should not overlap (unless this is the desired behavior).
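The output keys in the examples above follow a simple pattern: tag, then the two indicators, then the subfield code, with a blank indicator written as an underscore. A minimal sketch of that naming convention follows; marc_key is a hypothetical helper written for this illustration and is not part of the producer.

```python
def marc_key(tag, ind1, ind2, code):
    """Build a key such as '24512a' or '245__a' from the parts of a subfield."""
    def indicator(ind):
        # A blank (or missing) indicator is written as an underscore.
        return ind if ind and ind.strip() else '_'

    return '{0}{1}{2}{3}'.format(tag, indicator(ind1), indicator(ind2), code)


print(marc_key('245', '1', '2', 'a'))
print(marc_key('245', ' ', ' ', 'b'))
```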
-
invenio.modules.jsonalchemy.jsonext.producers.json_for_marc.
produce
(self, fields=None)¶ Export the json in marc format.
Produces a list of dictionaries with all the possible MARC tags as keys.
Parameters: fields – list of fields to include in the output, if None or empty list all available tags will be included.
Readers¶
-
class
invenio.modules.jsonalchemy.jsonext.readers.json_reader.
JsonReader
(json, blob=None, **kwargs)¶ JSON reader.
-
static
split_blob
(blob, schema=None, **kwargs)¶ When the blob contains several objects, this method specifies how to split them so that they can be processed one by one afterwards.
-
class
invenio.modules.jsonalchemy.jsonext.readers.marc_reader.
MarcReader
(json, blob=None, **kwargs)¶ Marc reader.
-
guess_model_from_input
()¶ Guess from the input Marc the model to be used in this record.
This is the simplest implementation possible: it just takes all 980__a tags and sets them as the list of models. The guess function can easily be changed by setting CFG_MARC_MODEL_GUESSER to an importable string.
-
static
split_blob
(blob, schema=None, **kwargs)¶ Split the blob using <record.*?>.*?</record> as pattern.
Note 1: Taken from invenio.legacy.bibrecord:create_records
Note 2: Use the DOTALL flag to include newlines.
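The effect of the pattern and of the DOTALL flag can be checked with the standard re module. The function name split_records and the sample blob are assumptions for this sketch; only the pattern itself comes from the description above.

```python
import re

# Lazy matches keep each record separate; DOTALL lets '.' cross newlines.
RECORD_PATTERN = re.compile(r'<record.*?>.*?</record>', re.DOTALL)


def split_records(blob):
    """Return every <record>...</record> chunk found in the blob."""
    return RECORD_PATTERN.findall(blob)


blob = """<collection>
<record>
  <controlfield tag="001">1</controlfield>
</record>
<record type="Bibliographic">
  <controlfield tag="001">2</controlfield>
</record>
</collection>"""

records = split_records(blob)
print(len(records))
```

Without the DOTALL flag the same pattern finds nothing in this blob, because each record spans several lines.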
-