Clean up site map scanning. Return all results instead of 10

rmgr 2024-06-09 21:53:57 +09:30
parent 2a99a61dbe
commit bbba459480
6 changed files with 140 additions and 57 deletions

todo

@@ -6,6 +6,6 @@
 [x] Add clustered index to document_ngrams table model
 [x] Add clustered index to document_tokens table model
 [ ] Add ddl command to create partition tables
-[ ] Investigate whether or not robots.txt is as aggressive as I'm making it out to be
-[ ] Instead of starting from a random page on the site, go to root and find site map and crawl that
+[x] Investigate whether or not robots.txt is as aggressive as I'm making it out to be
+[x] Instead of starting from a random page on the site, go to root and find site map and crawl that
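
The completed todo item (go to the site root, find the site map, crawl that) could look roughly like the sketch below. It is not the code from this commit: the function names `sitemap_urls_from_robots` and `page_urls_from_sitemap` are hypothetical, fetching is left out, and the fallback to `/sitemap.xml` is an assumption about a common convention. It returns every `<loc>` entry rather than capping the list at 10, in line with the commit message.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def sitemap_urls_from_robots(robots_txt: str, root_url: str) -> list[str]:
    """Collect the URLs named on 'Sitemap:' lines of a robots.txt body.

    Falls back to <root>/sitemap.xml (an assumed convention) when
    robots.txt declares no site map.
    """
    urls = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL's own "https:" survives.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls or [urljoin(root_url, "/sitemap.xml")]


def page_urls_from_sitemap(sitemap_xml: str) -> list[str]:
    """Return every <loc> value in a site map document, with no result cap."""
    root = ET.fromstring(sitemap_xml)
    return [
        element.text.strip()
        for element in root.iter()
        # Tags look like "{namespace}loc"; compare the local name only.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text
    ]
```

A crawler would fetch `robots.txt` from the root, pass its body to the first function, then fetch each returned site map and feed it to the second. A site map index file also uses `<loc>` elements, so its entries come back as further site map URLs to fetch.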