BioC JSON
Encoding the BioC object
Encoding the BioC collection object ``collection``:
from bioc import biocjson
# Serialize ``collection`` to a BioC JSON-formatted ``str``.
biocjson.dumps(collection, indent=2)
# Serialize ``collection`` as a BioC JSON-formatted stream to ``fp``.
with open(filename, 'w') as fp:
    biocjson.dump(collection, fp, indent=2)
Compact encoding:
from bioc import biocjson
biocjson.dumps(collection)
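The difference between the indented and compact forms can be sketched with the standard ``json`` module. This is only an illustration: it assumes the ``indent`` argument of ``biocjson.dumps`` behaves like the ``json.dumps`` parameter of the same name, and ``collection_dict`` with its field names is a hypothetical stand-in, not the library's actual output.

```python
import json

# Hypothetical stand-in for a serialized BioC collection; the field names
# are illustrative and not taken from the library.
collection_dict = {
    "source": "example",
    "documents": [{"id": "doc1", "passages": [{"offset": 0, "text": "Hello"}]}],
}

pretty = json.dumps(collection_dict, indent=2)  # multi-line, human-readable
compact = json.dumps(collection_dict)           # single line, fewer bytes

# Both forms decode to the same data; only the whitespace differs.
assert json.loads(pretty) == json.loads(compact)
print(len(pretty), len(compact))
```

The compact form is usually the better choice for storage or transfer, while the indented form is easier to inspect by eye.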
Decoding the BioC JSON file
from bioc import biocjson
# Deserialize ``s`` to a BioC collection object.
collection = biocjson.loads(s)
# Deserialize ``fp`` to a BioC collection object.
with open(filename, 'r') as fp:
    collection = biocjson.load(fp)
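The ``dumps``/``loads`` and ``dump``/``load`` pairs share their names with Python's standard ``json`` module: the ``s`` variants work on an in-memory ``str``, the others on an open file object. The sketch below shows that distinction with ``json`` itself, using a hypothetical ``data`` dict as a stand-in for a collection; it does not call ``biocjson``.

```python
import json
import os
import tempfile

# Illustrative stand-in for a collection; not the library's actual layout.
data = {"source": "example", "documents": []}

# String form: dumps/loads operate on an in-memory ``str``.
s = json.dumps(data)
from_str = json.loads(s)

# Stream form: dump/load operate on an open file object.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "collection.json")
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(data, fp)
    with open(path, "r", encoding="utf-8") as fp:
        from_file = json.load(fp)
```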
JSON Lines
Incrementally encoding the BioC structure:
from bioc import biocjson
with biocjson.iterwriter(filename) as writer:
    for doc in collection.documents:
        writer.write(doc)
or, to write individual passages with the ``jsonlines`` package:
from bioc import biocjson
import jsonlines
with jsonlines.open(filename, 'w') as writer:
    for doc in collection.documents:
        for passage in doc.passages:
            writer.write(biocjson.toJSON(passage))
Incrementally decoding the BioC JSON Lines file:
from bioc import biocjson
with biocjson.iterreader(filename) as reader:
    for passage in reader:
        # process passage
        ...
or, with the ``jsonlines`` package:
import bioc
from bioc import biocjson
import jsonlines
with jsonlines.open(filename) as reader:
    for obj in reader:
        passage = biocjson.fromJSON(obj, bioctype=bioc.PASSAGE)
        ...
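For readers unfamiliar with the JSON Lines layout, the following minimal sketch shows the idea with the standard library only: one compact JSON object per line, written and read one record at a time. The ``passages`` records and their field names are hypothetical stand-ins, and an in-memory ``io.StringIO`` buffer replaces the file.

```python
import io
import json

# Hypothetical passage-like records; field names are illustrative only.
passages = [
    {"offset": 0, "text": "First passage."},
    {"offset": 15, "text": "Second passage."},
]

# Encoding: write each record as one compact JSON object per line.
buffer = io.StringIO()
for passage in passages:
    buffer.write(json.dumps(passage) + "\n")

# Decoding: parse one line at a time. The records are collected in a list
# here for inspection; a real consumer could process and discard each one.
buffer.seek(0)
decoded = [json.loads(line) for line in buffer if line.strip()]
```

Because every line is a complete JSON value, a reader never needs to hold the whole file in memory, which is what makes incremental encoding and decoding possible.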