See my previous post for an explanation of the three food problem.
I decided to solve this problem using Python, as it is the language I’m most familiar with and it has several useful libraries. It took a day to get to a working program, and might have taken less if I hadn’t decided to learn vim at the same time.
The program is divided into five parts:
- The recipe counter that checks how many recipes on allrecipes.com feature certain ingredients.
- The word generator that gets words that are types of food from WordNet.
- The food generator that generates a file containing words from the word generator that feature in at least one recipe on allrecipes.com. This means the program can avoid checking things like “low fat diet” that WordNet classifies as foods, but aren’t, and also obscure foods that don’t appear in any recipes.
- The sampler, which picks combinations of three foods to be tested.
- The tester, which checks that each pair of foods in a triad appears together in at least one recipe on allrecipes.com, and that all three never appear together in a single recipe.
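The tester's rule is easy to state in code. What follows is a minimal sketch of that rule, not the program's actual code: `is_valid_triad`, `count_recipes` and the `RECIPES` list are hypothetical names, and the in-memory recipe list stands in for the real searches on allrecipes.com.

```python
from itertools import combinations

# Toy stand-in for the recipe site (assumption: the real program
# gets these counts from search results instead).
RECIPES = [
    {"chocolate", "banana"},
    {"banana", "bacon"},
    {"bacon", "chocolate"},
    {"tomato", "basil", "mozzarella"},
]


def count_recipes(ingredients):
    """Count the toy recipes that contain every given ingredient."""
    wanted = set(ingredients)
    return sum(1 for recipe in RECIPES if wanted <= recipe)


def is_valid_triad(foods, count_recipes):
    """True if every pair shares a recipe but the full triad shares none."""
    pairs_ok = all(count_recipes(pair) > 0 for pair in combinations(foods, 2))
    return pairs_ok and count_recipes(foods) == 0


print(is_valid_triad(("chocolate", "banana", "bacon"), count_recipes))
print(is_valid_triad(("tomato", "basil", "mozzarella"), count_recipes))
```

Passing the counting function in as an argument keeps the rule separate from the network code, so the same check can be run against a fake recipe list like this one or against a real search.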
I used three libraries for these tasks: requests to make the searches, BeautifulSoup to find the recipe count on the results page, and NLTK to search WordNet for types of food. This was the first time I’d used either BeautifulSoup or NLTK. I found them both easy to use and well documented, although NLTK lacked a built-in function for something that seemed like core functionality to me: getting the plain name of a synset, e.g. `Synset("maple_syrup.n.01")` -> `"maple syrup"`. It wasn’t difficult to write some code to do this, but I was somewhat surprised that I had to.
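For illustration, here is one way that conversion could be written. It is a sketch rather than the code from the program, and `synset_display_name` is a made-up name. It works on the synset's name string, so it runs without NLTK installed.

```python
def synset_display_name(synset_name):
    """Turn a synset name such as 'maple_syrup.n.01' into 'maple syrup'."""
    # Synset names look like '<lemma>.<pos>.<sense number>'. Splitting from
    # the right leaves any dots inside the lemma itself untouched.
    lemma = synset_name.rsplit(".", 2)[0]
    # WordNet joins multi-word lemmas with underscores.
    return lemma.replace("_", " ")


print(synset_display_name("maple_syrup.n.01"))  # maple syrup
```

With a recent version of NLTK, the input string would come from calling `name()` on a synset object.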
Development was fairly straightforward. The only time I had to rewrite much code was when I realised the program would be much faster if I filtered the list of foods to include only those that appeared in at least one recipe. This reduced the number of foods from around 2300 to around 1200, cutting the total number of triads to be checked by 1725515020 (or several hundred years of searching).
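As a rough check on that saving, the number of triads is just the number of ways to choose three foods from the list. The snippet below uses the rounded counts of 2300 and 1200, so it only approximates the exact figure above, which presumably comes from the unrounded counts.

```python
from math import comb  # available in Python 3.8 and later

# Rounded food counts from the text; the real counts differ slightly.
foods_before = 2300
foods_after = 1200

triads_before = comb(foods_before, 3)  # 2,025,189,100
triads_after = comb(foods_after, 3)    # 287,280,400

print(triads_before - triads_after)    # 1737908700
```

The filter removes about half of the foods but roughly 86% of the triads, because the number of triads grows with the cube of the list size.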