Today was heavily focused on thesaurle. I wanted to start getting some stats about the dataset and ended up fighting with a number of different libraries. I looked at nxalg but ran into several problems with it. First, the docker container I was using didn’t seem to have the right dependencies, and getting the configuration exactly right was very difficult. I’m not sure which configuration actually ended up working; it involved importing some modules which were then mounted in the docker container as a volume. I ended up abandoning that approach anyway, because our graph has almost 20K nodes, so finding all paths between all nodes was going to take quite some time.
In the end, I put together a script which iterates through all nodes and, using the built-in breadth-first search algorithm, finds 20 possible pairs for each node, all within 5-10 steps of it. This script was pretty fast and gives us a good dataset which we can use to seed the game.
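For reference, the idea looks roughly like the sketch below. This is not the actual script: it swaps the built-in breadth-first search for a hand-rolled one in plain Python, and it assumes the graph is available as an adjacency dict of words. The names `bfs_distances` and `seed_pairs`, the random sampling of targets, and the toy chain graph are all made up for illustration.

```python
from collections import deque
import random


def bfs_distances(graph, source, max_depth):
    """Return {node: hop count} for every node within max_depth hops of source."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        depth = distances[node]
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = depth + 1
                queue.append(neighbor)
    return distances


def seed_pairs(graph, min_steps=5, max_steps=10, per_node=20, seed=0):
    """For each node, pick up to per_node targets that are min_steps..max_steps away.

    Assumption: targets are sampled at random when more than per_node qualify.
    Node keys must be sortable (e.g. strings) so the sampling is reproducible.
    """
    rng = random.Random(seed)
    pairs = []
    for source in graph:
        distances = bfs_distances(graph, source, max_steps)
        candidates = sorted(
            target for target, d in distances.items() if min_steps <= d <= max_steps
        )
        if len(candidates) > per_node:
            candidates = rng.sample(candidates, per_node)
        pairs.extend((source, target, distances[target]) for target in candidates)
    return pairs


# Toy stand-in for the real graph: a chain w0 - w1 - ... - w11.
chain = {f"w{i}": [] for i in range(12)}
for i in range(11):
    chain[f"w{i}"].append(f"w{i + 1}")
    chain[f"w{i + 1}"].append(f"w{i}")

pairs = seed_pairs(chain)
```

Because each search stops expanding at 10 steps, the cost per node depends on the size of its neighborhood rather than on the whole graph, which is why this is so much cheaper than enumerating all paths between all nodes.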
I also figured out how to generate a dump of the data from the database so that the docker container can be loaded from a snapshot, rather than having to run the migration script every time. Since this data doesn’t change much (or ever?), having the static data file should be helpful.
I opened a PR with all the changes. I think up next is probably making the game experience a bit nicer and maybe figuring out how to deploy it somewhere!