[ ] Refactor website table to generic document table (maybe using URN instead of URL?) [ ] Define tokens table FKed to document table [ ] Refactor index.py to tokenize input into tokens table [ ] Define N-Grams table [ ] Add N-Gram generation to index.py