Simple Battery Cell Metadata#

Let’s describe an instance of a simple CR2032 coin cell with a capacity defined in a specification sheet from the manufacturer!

This example covers a few topics:

  • How to describe a resource using ontology terms and JSON-LD

  • How machines convert JSON-LD into triples

  • What the subject, predicate, and object identifiers mean

  • How to run a simple query using SPARQL [Moderate]

  • How to use the ontology to fetch more information from other sources [Advanced]

A live version of this notebook is available on Google Colab here

Describe the cell using ontology terms in JSON-LD format#

The JSON-LD data that we will use is:

[45]:
jsonld = {
            "@context": "https://w3id.org/emmo/domain/battery/context",
            "@type": "CR2032",
            "schema:name": "My CR2032 Coin Cell",
            "schema:manufacturer": {
               "@id": "https://www.wikidata.org/wiki/Q3041255",
               "schema:name": "SINTEF"
            },
            "hasProperty": {
               "@type": ["NominalCapacity", "ConventionalProperty"],
               "hasNumericalPart": {
                     "@type": "RealData",
                     "hasNumberValue": 230
               },
               "hasMeasurementUnit": "emmo:MilliAmpereHour"
            }
         }

Parse this description into a graph#

Now let’s see how a machine would process this data by reading it into a Graph!

First, we import the Python dependencies that we need for this example.

[46]:
# Import dependencies
import json
import rdflib
import requests
import sys
from IPython.display import Image, display
import matplotlib.pyplot as plt

We create the graph using a very handy Python package called rdflib, which lets us parse our JSON-LD data, run queries in the SPARQL language, and serialize the graph in any RDF-compatible format (e.g. JSON-LD, Turtle, etc.).

[47]:
# Create a new graph
g = rdflib.Graph()

# Parse our json-ld data into the graph
g.parse(data=json.dumps(jsonld), format="json-ld")

# Create a SPARQL query to return all the triples in the graph
query_all = """
SELECT ?subject ?predicate ?object
WHERE {
  ?subject ?predicate ?object
}
"""

# Execute the SPARQL query
all_the_things = g.query(query_all)

# Print the results
for row in all_the_things:
    print(row)

(rdflib.term.URIRef('https://www.wikidata.org/wiki/Q3041255'), rdflib.term.URIRef('https://schema.org/name'), rdflib.term.Literal('SINTEF'))
(rdflib.term.BNode('Na34f2abdb78d44d69060d98294546cda'), rdflib.term.URIRef('https://w3id.org/emmo#EMMO_8ef3cd6d_ae58_4a8d_9fc0_ad8f49015cd0'), rdflib.term.BNode('N72de31d60ed94fa98a884e7d964226b0'))
(rdflib.term.BNode('Nd4cf6308611c4920a4582b0daeecbf57'), rdflib.term.URIRef('https://w3id.org/emmo#EMMO_e1097637_70d2_4895_973f_2396f04fa204'), rdflib.term.BNode('Na34f2abdb78d44d69060d98294546cda'))
(rdflib.term.BNode('N72de31d60ed94fa98a884e7d964226b0'), rdflib.term.URIRef('https://w3id.org/emmo#EMMO_faf79f53_749d_40b2_807c_d34244c192f4'), rdflib.term.Literal('230', datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#integer')))
(rdflib.term.BNode('Na34f2abdb78d44d69060d98294546cda'), rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), rdflib.term.URIRef('https://w3id.org/emmo#EMMO_d8aa8e1f_b650_416d_88a0_5118de945456'))
(rdflib.term.BNode('Nd4cf6308611c4920a4582b0daeecbf57'), rdflib.term.URIRef('https://schema.org/name'), rdflib.term.Literal('My CR2032 Coin Cell'))
(rdflib.term.BNode('Nd4cf6308611c4920a4582b0daeecbf57'), rdflib.term.URIRef('https://schema.org/manufacturer'), rdflib.term.URIRef('https://www.wikidata.org/wiki/Q3041255'))
(rdflib.term.BNode('Nd4cf6308611c4920a4582b0daeecbf57'), rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), rdflib.term.URIRef('https://w3id.org/emmo/domain/battery#battery_b61b96ac_f2f4_4b74_82d5_565fe3a2d88b'))
(rdflib.term.BNode('Na34f2abdb78d44d69060d98294546cda'), rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), rdflib.term.URIRef('https://w3id.org/emmo/domain/electrochemistry#electrochemistry_8abde9d0_84f6_4b4f_a87e_86028a397100'))
(rdflib.term.BNode('N72de31d60ed94fa98a884e7d964226b0'), rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), rdflib.term.URIRef('https://w3id.org/emmo#EMMO_18d180e4_5e3e_42f7_820c_e08951223486'))
(rdflib.term.BNode('Na34f2abdb78d44d69060d98294546cda'), rdflib.term.URIRef('https://w3id.org/emmo#EMMO_bed1d005_b04e_4a90_94cf_02bc678a8569'), rdflib.term.URIRef('https://w3id.org/emmo#MilliAmpereHour'))

You can see that our human-readable JSON-LD file has been transformed into some nasty looking (but machine-readable!) triples. Let’s look at a couple in more detail to understand what’s going on.
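To make the conversion step less mysterious, here is a toy sketch of how a nested description can be flattened into triples. It is not the JSON-LD algorithm that rdflib implements: the `flatten` function is a hypothetical helper, terms are left unexpanded (no context lookup), and every plain string is treated as a literal, whereas a real processor uses the context to decide which values are IRIs.

```python
import itertools


def flatten(node, triples, counter):
    """Recursively turn a nested JSON-LD-like dict into (s, p, o) triples.

    Returns the identifier of `node` so the caller can link to it.
    """
    # Nodes without an "@id" get a generated blank node label such as "_:b0"
    subject = node.get("@id") or f"_:b{next(counter)}"
    for key, value in node.items():
        if key in ("@id", "@context"):
            continue  # identifiers and the context do not become triples
        predicate = "rdf:type" if key == "@type" else key
        values = value if isinstance(value, list) else [value]
        for item in values:
            # Nested objects become their own subjects; scalars are kept as-is
            obj = flatten(item, triples, counter) if isinstance(item, dict) else item
            triples.append((subject, predicate, obj))
    return subject


doc = {
    "@type": "CR2032",
    "schema:name": "My CR2032 Coin Cell",
    "schema:manufacturer": {
        "@id": "https://www.wikidata.org/wiki/Q3041255",
        "schema:name": "SINTEF",
    },
    "hasProperty": {
        "@type": ["NominalCapacity", "ConventionalProperty"],
        "hasNumericalPart": {"@type": "RealData", "hasNumberValue": 230},
        "hasMeasurementUnit": "emmo:MilliAmpereHour",
    },
}

triples = []
flatten(doc, triples, itertools.count())
for triple in triples:
    print(triple)
```

The sketch yields the same number of triples as the rdflib output above. The three nodes that have no "@id" (the cell, its property, and the numerical part) receive generated labels, which is why blank node identifiers appear in the real output.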

Examine and explore the triples#

Let’s start with this one:

subject

https://www.wikidata.org/wiki/Q3041255

predicate

https://schema.org/name

object

‘SINTEF’

This tells the machine that something with a Wikidata identifier has a property called ‘name’ from the schema.org vocabulary with a literal value ‘SINTEF’. These identifiers serve not only as persistent and unique identifiers for the concepts, but also point to a place where a machine can go to learn more about what it is. Try it yourself! Click on one and see where it takes you!

Neat, right?! Let’s look at another one:

subject

‘Nd4cf6308611c4920a4582b0daeecbf57’

predicate

http://www.w3.org/1999/02/22-rdf-syntax-ns#type

object

https://w3id.org/emmo/domain/battery#battery_b61b96ac_f2f4_4b74_82d5_565fe3a2d88b

This tells the machine that a certain node in the graph is an instance of some class defined in the EMMO domain ‘battery’. The subject is a blank node: it has no IRI of its own, so rdflib generates a label for it, and that label changes from run to run. This also brings us to one of the difficult bits for humans: many ontologies (like EMMO) use UUIDs for term names to ensure that they are universally unique. It works, but it sacrifices human readability. Luckily we can get around this by assigning human-readable annotations to that term and/or mapping the IRI to a human-readable label in a JSON-LD context like we did above.

Go ahead, click the link and see if you can figure out what this thing is…

It’s a CR2032! Now we can see how the simple term in our JSON-LD file has been converted to a machine-readable IRI.
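The mapping between readable terms and IRIs can be illustrated with a small lookup table. The sketch below is a simplified stand-in for a JSON-LD context, not the real one served at the context URL: `CONTEXT`, `expand`, and `compact` are hypothetical names, and the entries are limited to IRIs that appear in the triples printed earlier.

```python
# A hand-written, partial stand-in for a JSON-LD context (assumption: the
# real context contains many more terms and richer term definitions)
CONTEXT = {
    "schema": "https://schema.org/",
    "emmo": "https://w3id.org/emmo#",
    "CR2032": "https://w3id.org/emmo/domain/battery#battery_b61b96ac_f2f4_4b74_82d5_565fe3a2d88b",
    "hasProperty": "https://w3id.org/emmo#EMMO_e1097637_70d2_4895_973f_2396f04fa204",
}


def expand(term, context=CONTEXT):
    """Map a readable term or prefixed name to a full IRI."""
    if term in context:
        return context[term]
    prefix, sep, local = term.partition(":")
    # "schema:name" -> prefix "schema"; full IRIs such as "https://..." are left alone
    if sep and prefix in context and not local.startswith("//"):
        return context[prefix] + local
    return term


def compact(iri, context=CONTEXT):
    """Map a full IRI back to the most readable form the context allows."""
    for term, target in context.items():
        if iri == target:
            return term
    for term, target in context.items():
        # Only namespace-like entries (ending in "/" or "#") act as prefixes
        if target.endswith(("/", "#")) and iri.startswith(target):
            return f"{term}:{iri[len(target):]}"
    return iri


print(expand("CR2032"))
print(expand("schema:name"))
print(compact("https://w3id.org/emmo#MilliAmpereHour"))
```

Expansion is what happens when the JSON-LD is parsed; compaction is the reverse step that makes serialized output readable again.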

Query the graph using SPARQL [Moderate]#

Now, let’s write a SPARQL query to get back some specific thing…like what is the name of the manufacturer?

[48]:
query = """
PREFIX schema: <https://schema.org/>

SELECT ?manufacturerName
WHERE {
  ?thing schema:manufacturer ?manufacturer .
  ?manufacturer schema:name ?manufacturerName .
}
"""

# Execute the SPARQL query
results = g.query(query)

# Print the results
for row in results:
    print(row)

(rdflib.term.Literal('SINTEF'),)
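Conceptually, the two triple patterns in this query are joined on the shared variable ?manufacturer. The plain-Python sketch below mimics that join over a hand-copied subset of the triples, with prefixed names instead of full predicate IRIs and the made-up label `_:cell` for the blank node. It illustrates the idea only and says nothing about how rdflib evaluates queries internally.

```python
# A hand-copied subset of the graph, using short names for readability
triples = [
    ("_:cell", "schema:name", "My CR2032 Coin Cell"),
    ("_:cell", "schema:manufacturer", "https://www.wikidata.org/wiki/Q3041255"),
    ("https://www.wikidata.org/wiki/Q3041255", "schema:name", "SINTEF"),
]

# Pattern 1: ?thing schema:manufacturer ?manufacturer
# Pattern 2: ?manufacturer schema:name ?manufacturerName
# The join keeps only the combinations where ?manufacturer is the same node.
manufacturer_names = [
    name
    for (_thing, p1, manufacturer) in triples
    if p1 == "schema:manufacturer"
    for (s2, p2, name) in triples
    if p2 == "schema:name" and s2 == manufacturer
]

print(manufacturer_names)
```

Note that the cell's own schema:name is not returned, because the cell never appears as the object of schema:manufacturer.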

Retrieve External Identifiers Using EMMOntoPy#

Ontologies contain rich semantic descriptions, but they don’t always embed all external information directly. Instead, they often include references (like Wikidata IDs) to link concepts to authoritative external sources.

In this example, we use EMMOntoPy, a Python interface for OWL ontologies, to query the ontology and extract the Wikidata ID associated with a specific class — in this case, the CR2032 coin cell.

EMMOntoPy allows us to access ontology classes and properties by their human-readable labels (e.g., "CR2032" and "wikidataReference"). Here we look up the CR2032 class by its label and read its wikidataReference annotation, which holds the Wikidata IRI for CR2032.

We can later use this IRI to retrieve additional metadata directly from Wikidata or other linked data endpoints — a common pattern in semantic data workflows.

[49]:
from ontopy import get_ontology

# Loading from web
battinfo = get_ontology('https://w3id.org/emmo/domain/battery/inferred').load()

[50]:
wikidata_url = battinfo.CR2032.wikidataReference[0]
wikidata_id = wikidata_url.split('/')[-1]
print(f"The Wikidata ID for CR2032 is: {wikidata_id}")

The Wikidata ID for CR2032 is: Q5013811
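Because the extracted ID is later substituted into SPARQL query strings, it can be worth validating it first. The helper below is a hypothetical addition, not part of EMMOntoPy or the original notebook; it assumes item IDs have the form "Q" followed by digits and that the IRI ends with the ID, optionally followed by a slash.

```python
import re
from urllib.parse import urlparse

# Wikidata item IDs look like "Q" followed by a positive integer
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def qid_from_iri(iri: str) -> str:
    """Return the item ID at the end of a Wikidata IRI, or raise ValueError."""
    path = urlparse(iri).path.rstrip("/")   # tolerate a trailing slash
    candidate = path.rsplit("/", 1)[-1]     # last path segment
    if not QID_PATTERN.fullmatch(candidate):
        raise ValueError(f"Not a Wikidata item IRI: {iri!r}")
    return candidate


print(qid_from_iri("https://www.wikidata.org/wiki/Q5013811"))
```

Rejecting anything that is not a well-formed ID keeps unexpected text out of the query templates used in the following cells.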

Now that we have the Wikidata ID for CR2032, we can query the Wikidata SPARQL endpoint to retrieve a property. Let’s ask it for the diameter (Wikidata property P2386).

[51]:
# Query the Wikidata knowledge graph for more information about the CR2032 cell
wikidata_endpoint = "https://query.wikidata.org/sparql"

# SPARQL query to get the diameter (P2386) of a CR2032 cell and the label for its unit
query = """
SELECT ?value ?unit ?unitLabel WHERE {
  wd:%s p:P2386 ?statement .
  ?statement ps:P2386 ?value .
  OPTIONAL {
    ?statement psv:P2386 ?valueNode .
    ?valueNode wikibase:quantityUnit ?unit .
    ?unit rdfs:label ?unitLabel .
    FILTER (lang(?unitLabel) = "en")
  }
}
""" % wikidata_id

# Execute the request
response = requests.get(wikidata_endpoint, params={'query': query, 'format': 'json'})
data = response.json()

# Extract and print the diameter value (falls back to the unit IRI if no English label exists)
binding = data['results']['bindings'][0]
diameter = binding['value']['value']
unit_label = binding.get('unitLabel', {}).get('value', binding.get('unit', {}).get('value', ''))

print(f"Wikidata says the diameter of a CR2032 cell is: {diameter} {unit_label}")

Wikidata says the diameter of a CR2032 cell is: 20 millimetre
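The endpoint answers in the SPARQL JSON results format, where each row in "bindings" maps a variable name to an object with "type" and "value" keys, and unbound OPTIONAL variables are simply absent. The sketch below shows one defensive way to read such a response. The `sample_response` dict is hand-written to mirror the value printed above rather than captured from a live call, and `first_binding` is a hypothetical helper.

```python
# Hand-written sample in the SPARQL JSON results shape (not a recorded response)
sample_response = {
    "head": {"vars": ["value", "unit", "unitLabel"]},
    "results": {
        "bindings": [
            {
                "value": {
                    "type": "literal",
                    "datatype": "http://www.w3.org/2001/XMLSchema#decimal",
                    "value": "20",
                },
                "unitLabel": {"type": "literal", "xml:lang": "en", "value": "millimetre"},
            }
        ]
    },
}


def first_binding(response, *names):
    """Return the requested variables from the first result row.

    Returns None if the query matched nothing; variables that are unbound
    in the row (e.g. from an OPTIONAL block) are reported as None.
    """
    bindings = response.get("results", {}).get("bindings", [])
    if not bindings:
        return None
    row = bindings[0]
    return {name: (row[name]["value"] if name in row else None) for name in names}


print(first_binding(sample_response, "value", "unitLabel", "unit"))
print(first_binding({"results": {"bindings": []}}, "value"))
```

Checking for an empty result before indexing avoids an IndexError when an item has no statement for the requested property.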

We can also retrieve more complex data. For example, let’s ask Wikidata to show us an image of a CR2032.

[52]:
# SPARQL query to get the image (P18) of the CR2032 cell
query = """
SELECT ?image WHERE {
  wd:%s wdt:P18 ?image .
}
""" % wikidata_id

# Execute the request
response = requests.get(wikidata_endpoint, params={'query': query, 'format': 'json'})
data = response.json()

# Extract and display the image URL
if data['results']['bindings']:
    image_url = data['results']['bindings'][0]['image']['value']
    print(f"Image of a CR2032 cell: {image_url}")
    display(Image(url=image_url, width=300))  # Adjust the width as needed

else:
    print("No image found.")

Image of a CR2032 cell: http://commons.wikimedia.org/wiki/Special:FilePath/CR2032%20battery%2C%20KTS-2728.jpg

Finally, let’s retrieve the ID for CR2032 in the Google Knowledge Graph and see what it has to say!

[53]:
# SPARQL query to get the Google Knowledge Graph ID of the CR2032 cell
query = """
SELECT ?id WHERE {
  wd:%s wdt:P2671 ?id .
}
""" % wikidata_id

# Execute the request
response = requests.get(wikidata_endpoint, params={'query': query, 'format': 'json'})
data = response.json()

# Extract and display the Google Knowledge Graph ID
if data['results']['bindings']:
    gkgid = data['results']['bindings'][0]['id']['value']
    gkgns = 'https://www.google.com/search?kgmid='
    gkg = gkgns + gkgid
    print(f"The Google Knowledge Graph entry for a CR2032 cell: {gkg}")

else:
    print("None found.")

The Google Knowledge Graph entry for a CR2032 cell: https://www.google.com/search?kgmid=/g/11bc5qf2g9

Summary#

In this notebook, we explored how to combine ontologies, JSON-LD, and external knowledge graphs (like Wikidata) to build semantically rich descriptions of battery components.

What We Did#

  • Described a battery cell (e.g., CR2032) using JSON-LD and ontology terms from the EMMO battery domain.

  • Queried the ontology using EMMOntoPy to retrieve structured metadata, including external identifiers like Wikidata IDs.

  • Parsed JSON-LD into RDF using RDFLib, enabling SPARQL queries over the resulting knowledge graph.

  • Executed live SPARQL queries against Wikidata to retrieve additional information, such as the diameter of the CR2032 cell, an image, and its Google Knowledge Graph ID.

  • Mapped raw IRIs to readable labels to improve interpretability of results (e.g., converting unit IRIs to “millimetre”).

Why This Matters#

By linking structured ontology-based descriptions to external resources like Wikidata:

  • We avoid data duplication while gaining access to curated, up-to-date public knowledge.

  • We create machine-interpretable metadata that can support automated reasoning, traceability, and interoperability.

  • We set the foundation for semantic integration across research domains, data platforms, and digital twin ecosystems.

This approach scales well from simple battery metadata to rich, queryable knowledge graphs that connect batteries with materials, processes, and real-world data.