Commit graph

  • bbba459480 Clean up site map scanning. Return all results instead of 10 main rmgr 2024-06-09 21:53:57 +09:30
  • 2a99a61dbe Add site map crawl option rmgr 2024-06-08 20:43:05 +09:30
  • e3c67b64e6 Make excluded file types more robust rmgr 2024-06-08 20:24:21 +09:30
  • 98efe9d1a2 Fix temp table being randomly dropped due to name collision. Fix multi-word non-phrase search rmgr 2024-05-05 19:06:56 +09:30
  • bdb4064acc Rework ngram generation. Greatly improve performance of indexer. Commit horrendous sql sins rmgr 2024-05-04 21:10:46 +09:30
  • 9f0e7e6b29 Indexer and query optimisations rmgr 2024-04-06 19:34:59 +10:30
  • 9d57f66cd7 Add beginnings of ngram search capability rmgr 2024-04-05 21:36:15 +10:30
  • 343410e62f Add first-pass YouTube subtitle indexer rmgr 2024-04-05 06:22:56 +10:30
  • 7ee9d978b2 Tidy up crawling and implement boolean search rmgr 2024-04-04 20:46:34 +10:30
  • d4bb3fb8dc Tidy up index.py rmgr 2024-03-07 21:12:19 +10:30
  • 20d198e559 Refactor to use PostgreSQL end to end rmgr 2024-03-07 20:44:34 +10:30
  • 8605ee6b2c Add todo file rmgr 2024-03-02 19:58:10 +10:30
  • aed568d11e Remove beehave.txt note rmgr 2024-03-02 19:54:53 +10:30
  • 8903f7a3e5 Merge postgres changes rmgr 2024-03-02 19:53:58 +10:30
  • 24ee04c0ff Begin adding Postgresql support instead of filesystem flat files postgres rmgr 2024-03-01 21:12:40 +10:30
  • efe6dea1f5 Fix crawling. Add initial linksfile crawling. Still need to remove records as they are processed. rmgr 2024-01-01 20:52:12 +10:30
  • f4ea8ad1d7 Respect robots.txt rmgr 2024-01-01 19:53:22 +10:30
  • b43343e0ee Fix recursive crawling. rmgr 2023-12-12 17:35:15 +10:30
  • 21961fced6 Add notes to index.py rmgr 2023-12-06 08:31:54 +10:30
  • 9fc2e1af53 Implement recursive page crawling rmgr 2023-12-06 08:29:39 +10:30
  • 3d7b72e5ef Join counts for multiple words rmgr 2023-11-30 17:26:59 +10:30
  • d30397cefa Add count of times word appears on a site to index. rmgr 2023-11-30 08:03:43 +10:30
  • f36ab2fbfb Initial commit rmgr 2023-11-28 20:51:54 +10:30