Interview with Katariina Kari


Katariina Kari is an Ontologist at the Zalando Tech-Hub in Helsinki. She specialises in the semantic web and guides the art business into the digital age. She is the founder of ExCLaM! Digital, a connector of classical music and digital technologies. At Zalando she is modelling the Fashion Knowledge Graph, with which Zalando improves its customer experience.


You talk about building a knowledge graph at Zalando. How do you explain the concept to non-technical people?

The fashion knowledge graph is a collection of fashion concepts, such as bikini or beach, together with the knowledge of how each one of those concepts relates to another. There are two main relations that concepts can have: associative or structural. For example, swimwear and bikini relate to each other structurally: swimwear is a more general concept and bikini is a more specific one. Bikini and beach are associated with each other, although structurally they are very different. The concepts and their relations reflect how we humans see the world of fashion, and thus the knowledge graph gives the computer a way to either recognise these concepts or talk back to us using them. Many different kinds of applications, such as a search bar, voice commands, or navigational elements, can make use of this knowledge to create a user experience that seemingly understands the human way of thinking.
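
To make the two kinds of relation concrete, here is a minimal Python sketch. It is an illustration only, not Zalando's implementation: the `ConceptGraph` class, its method names, and the extra concept "clothing" are assumptions introduced for the example. Structural links point from a specific concept to a more general one, while associative links are stored in both directions.

```python
from collections import defaultdict


class ConceptGraph:
    """Toy concept graph with structural and associative relations (hypothetical)."""

    def __init__(self):
        self.broader = defaultdict(set)  # concept -> directly more general concepts
        self.related = defaultdict(set)  # concept -> associated concepts (symmetric)

    def add_structural(self, narrower, broader):
        # Structural link: `narrower` is a more specific kind of `broader`.
        self.broader[narrower].add(broader)

    def add_associative(self, a, b):
        # Associative link: stored in both directions.
        self.related[a].add(b)
        self.related[b].add(a)

    def ancestors(self, concept):
        # Follow structural links upwards to collect every more general concept.
        seen = set()
        stack = [concept]
        while stack:
            current = stack.pop()
            for parent in self.broader.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


graph = ConceptGraph()
graph.add_structural("bikini", "swimwear")
graph.add_structural("swimwear", "clothing")  # "clothing" is an assumed extra concept
graph.add_associative("bikini", "beach")

general_concepts = graph.ancestors("bikini")    # structural neighbours, walked upwards
associated_concepts = graph.related["bikini"]   # associative neighbours
```

A search bar could, for instance, use `ancestors` to broaden a query from "bikini" to "swimwear", and the associative links to suggest beach-related items.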


What role does Open Data play in this Knowledge Graph (KG)?

Open Data, and more specifically open ontologies, have the potential to grow the KG by filling gaps in its knowledge. For example, Wikidata has a comprehensive collection of concepts for fabric that could be incorporated into the knowledge graph. We are doing this manually for now, but I have already written a few idea papers on how this could also be done automatically, in real time, whenever we find that the knowledge graph has gaps.


Zalando is a commercial business. Can and does it give back information to the open data community?

Zalando acknowledges the strength of an Open Source community as the best way to collaborate. We even have our own Open Source website where we share our projects. These are pieces of stand-alone software that our engineers have built to improve our work and that our Open Source guild approves for release. We also publish tutorials for certain technologies and share them in public meet-ups, some of which we organise ourselves.


There are multiple graph databases out there. Why did you decide to go with the W3C RDF stack?

Currently, I see that there are two main graph forms (and multiple database formats that adhere to one or the other): triple stores, which follow the W3C RDF stack, and property graphs. The power of the RDF stack is that one can define a schema and reason with it, thus creating more knowledge than was originally written into the graph. The power of the property graph, as I see it, is traversing the graph and making certain calculations with it more efficiently than with RDF. Roughly, one could say that RDF is good for facts and property graphs are good for large data sets. But to be fair, I would like to spend more time comparing the strengths of both to understand this better. Our team is mainly using the RDF stack, but we have also tested property graphs once for one application. I hope that in the future we can learn more about the strengths and differences of the two!
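
The idea of "creating more knowledge than was originally written into the graph" can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not a real RDF reasoner and not Zalando's setup: the `infer` function, the `ex:` identifiers, and the item `ex:item42` are hypothetical, and only two RDFS-style rules are applied (subclass transitivity, and propagating an instance's type up the class hierarchy).

```python
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Apply two RDFS-style rules repeatedly until no new triples appear."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p != SUBCLASS_OF:
                continue
            for s2, p2, o2 in facts:
                # Rule 1: subclass transitivity (s ⊆ o and o ⊆ o2 gives s ⊆ o2).
                if p2 == SUBCLASS_OF and s2 == o:
                    new.add((s, SUBCLASS_OF, o2))
                # Rule 2: an instance of s is also an instance of its superclass o.
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))
        if new <= facts:  # fixed point reached: nothing new was derived
            return facts
        facts |= new


# Stated facts: a tiny schema plus one (hypothetical) product.
stated = {
    ("ex:Bikini", SUBCLASS_OF, "ex:Swimwear"),
    ("ex:Swimwear", SUBCLASS_OF, "ex:Clothing"),
    ("ex:item42", TYPE, "ex:Bikini"),
}

all_facts = infer(stated)
derived = all_facts - stated  # knowledge that was never written down explicitly
```

Here `derived` holds facts such as `ex:item42` being swimwear, which follow from the schema even though nobody entered them. In a real RDF stack this work would be done by a triple store or reasoner rather than hand-written loops.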