This streaming app consumes data from the `wikidata-geo` Kafka topic, normalizes the messages to a common schema, and analyses them. The app decouples the production of the stream from the normalizing and analysing processes.
## Normalizer
The Readable stream has two Writable streams subscribed to it: one for analysing the data and building links, and one for normalising the data to a common schema. The Normaliser sends the data to the topic `wikidata-small`.
## Analyser
The analyser extracts specific Wikidata properties and sends a concordance of links to the `linker` topic.
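The property extraction could look like the sketch below. The watched property IDs, the link-record shape, and the sample entity are assumptions for illustration; the real concordance format may differ. (P31 "instance of" and P625 "coordinate location" are standard Wikidata property IDs.)

```javascript
// Properties the analyser watches for; an assumption for this sketch.
const WATCHED_PROPERTIES = ['P31', 'P625'];

// Build one link record per watched property the entity carries;
// the real app would send these records to the `linker` topic.
function buildLinks(entity) {
  return Object.keys(entity.claims)
    .filter((p) => WATCHED_PROPERTIES.includes(p))
    .map((p) => ({ source: entity.id, property: p, value: entity.claims[p] }));
}

const links = buildLinks({
  id: 'Q64', // Berlin
  claims: { P31: 'Q515', P625: '52.52,13.40', P17: 'Q183' },
});
console.log(links); // P17 (country) is not watched, so it is dropped
```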
## Docker
To build the image, use the following command. The image fetches data from a Wikidata topic and streams the result back into Kafka. The container is based on Alpine Linux.
```bash
...
...
```
We have a build pipeline in GitLab, so manually building the image is no longer necessary.
We execute a job on Kubernetes to stream the dump into Kafka.
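Such a one-shot load could be expressed as a Kubernetes Job manifest along these lines; the image name, broker address, and environment variable names are placeholders, not the project's real values.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: wikidata-dump-loader
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: dump-loader
          image: example-registry/wikidata-dump-loader:latest  # placeholder image
          env:
            - name: KAFKA_BROKER   # placeholder broker address
              value: "kafka:9092"
            - name: KAFKA_TOPIC    # placeholder target topic
              value: "wikidata-geo"
```

A Job (rather than a Deployment) fits here because streaming the dump is a finite task that should run to completion and then stop.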