Commit graph

16 commits

Author SHA1 Message Date
rmgr 343410e62f Add first pass youtube subtitle indexer 2024-04-05 06:22:56 +10:30
rmgr 7ee9d978b2 Tidy up crawling and implement boolean search 2024-04-04 20:46:34 +10:30
rmgr d4bb3fb8dc Tidy up index.py 2024-03-07 21:12:19 +10:30
rmgr 20d198e559 Refactor to use postgresql end to end 2024-03-07 21:00:11 +10:30
rmgr 8605ee6b2c Add todo file 2024-03-02 19:58:10 +10:30
rmgr aed568d11e Remove beehave.txt note 2024-03-02 19:54:53 +10:30
rmgr 8903f7a3e5 Merge postgres chagnes 2024-03-02 19:53:58 +10:30
rmgr 24ee04c0ff Begin adding Postgresql support instead of filesystem flat files 2024-03-01 21:12:40 +10:30
rmgr efe6dea1f5 Fix crawling. Add initial linksfile crawling. Still need to remove records as they are processed. 2024-01-01 20:52:12 +10:30
rmgr f4ea8ad1d7 Respect robots.txt 2024-01-01 19:53:22 +10:30
rmgr b43343e0ee Fix recursive crawling. 2023-12-12 17:35:15 +10:30
rmgr 21961fced6 Add notes to index.py 2023-12-06 08:31:54 +10:30
rmgr 9fc2e1af53 Implement recursive page crawling 2023-12-06 08:29:39 +10:30
rmgr 3d7b72e5ef Join counts for multiple words 2023-11-30 17:26:59 +10:30
rmgr d30397cefa Add count of times word appears on a site to index. 2023-11-30 08:03:43 +10:30
rmgr f36ab2fbfb Initial commit 2023-11-28 20:51:54 +10:30