Commit graph

334 commits

Author SHA1 Message Date
nanos
d0982ab835 some changes 2024-07-16 09:24:00 +01:00
Coehill
75185fd046 Improve Readme grammar and clarity 2024-07-15 10:46:14 -04:00
Michael
70795fc04f
Merge pull request #145 from nanos/dependabot/pip/certifi-2024.7.4
Bump certifi from 2023.7.22 to 2024.7.4
2024-07-06 07:55:58 +01:00
dependabot[bot]
158826f053
Bump certifi from 2023.7.22 to 2024.7.4
Bumps [certifi](https://github.com/certifi/python-certifi) from 2023.7.22 to 2024.7.4.
- [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-06 01:19:58 +00:00
Michael
7333eb4d9f Update README.md 2024-07-04 12:46:19 +01:00
Michael
68ab1306ed Update README.md 2024-07-04 12:43:39 +01:00
Michael
2a8cc3f244 Update README.md 2024-07-04 12:42:56 +01:00
Michael
f5df3424f6
Merge pull request #142 from nanos/peertube
Rudimentary peertube support
2024-07-03 07:16:35 +01:00
nanos
2e99ce0199 Fix for missing entry in seen hosts 2024-07-02 10:29:08 +01:00
nanos
c5af476348 Rudimentary peertube support 2024-07-02 10:20:07 +01:00
nanos
0e2178cc5a bug fix 2024-07-02 09:43:39 +01:00
Michael
5f290b5682
Merge pull request #140 from nanos/backfill-list
backfil mentioned users in list timelines
2024-07-02 07:58:46 +01:00
Michael
12f29b8ed2
Merge pull request #141 from nanos/logging-format
allow specification of a custom log format
2024-07-02 07:53:19 +01:00
nanos
5b247aa22a allow specification of a custom log format 2024-07-02 07:52:37 +01:00
nanos
f4873e7c8e backfil mentioned users in list timelines 2024-07-02 07:34:15 +01:00
Michael
d863b58513
Merge pull request #139 from AndrewKvalheim/skip-private
Skip fetching context of private posts
2024-07-02 06:54:16 +01:00
Michael
5f92da7178
Merge pull request #138 from nanos/cache-file-names
use sha hashes to cache file names
2024-07-02 06:52:51 +01:00
Andrew Kvalheim
5ed751a8c6 Skip fetching context of private posts
Context fetching is performed without authentication, so it is only
possible for public and unlisted posts.
2024-07-01 18:21:57 -07:00
nanos
c58c5b5af0 use sha hashes to cache file names 2024-07-01 20:06:51 +01:00
nanos
e85384a5a6 name collision (fixes #134) 2024-06-28 12:46:12 +01:00
nanos
6c1ec2f1c5 fix logs 2024-06-28 09:08:59 +01:00
nanos
3639878df0 Update version 2024-06-28 08:50:32 +01:00
nanos
80c8937a88 Improve docs 2024-06-28 08:49:54 +01:00
Michael
5e6aa2bd66
Merge pull request #133 from nanos/lists
Support for fetching lists
2024-06-28 08:47:42 +01:00
nanos
42774c5195 documentation update 2024-06-28 08:47:21 +01:00
nanos
fd615cad15 Support for fetching lists 2024-06-28 08:22:34 +01:00
nanos
e7da9a1f61 Fix bug 2024-06-27 17:14:41 +01:00
Michael
e0faafb37a
Merge pull request #130 from nanos/cache-robots-on-disk
Cache robots.txt for 24 hours on disk to reduce load on servers
2024-06-27 16:46:01 +01:00
Michael
009fbe54b4
Merge pull request #131 from nanos/no-bot
Do not backfill users that have opted out
2024-06-27 09:18:20 +01:00
nanos
d2a14f687a log what's happening 2024-06-27 09:18:06 +01:00
nanos
aa589670eb Do not backfill users that have opted out of indexing 2024-06-27 09:16:59 +01:00
nanos
40b624aaff update version 2024-06-27 07:56:50 +01:00
nanos
90988872b7 fix 2024-06-26 16:45:30 +01:00
nanos
7e8ca17640 Cache robots.txt for 24 hours on disk to reduce load on servers 2024-06-26 16:41:51 +01:00
nanos
3651d028a6 update version number 2024-06-25 16:36:01 +01:00
nanos
01a2719918 shorten http timeout for robots.txt fetch 2024-06-25 16:32:47 +01:00
Michael
dec718db76
Merge pull request #129 from nanos/cache-robots
Cache robots.txt for each run of the script, to reduce load on the server
2024-06-25 16:25:26 +01:00
nanos
7b9896b5c0 Cache robots.txt 2024-06-25 16:24:37 +01:00
Michael
ac8044db83
Merge pull request #128 from nanos/user-agent
User FediFetcher as User Agent to fetch the robots.txt
2024-06-25 16:16:49 +01:00
nanos
dd468d5956 User FediFetcher as User Agent to fetch the robots.txt 2024-06-25 16:15:43 +01:00
nanos
e40d61d291 update version 2024-06-25 11:00:24 +01:00
nanos
ac2b648e05 change timeout periods to never allow more than once per minute 2024-06-25 10:54:00 +01:00
Michael
de656d1e0d
Merge pull request #125 from nanos/robots
respect robots.txt
2024-06-25 10:46:22 +01:00
nanos
885b84d598 ensure callbacks aren't blocked by robtos 2024-06-25 10:38:47 +01:00
nanos
1b4c135f8f respect robots.txt 2024-06-25 10:24:45 +01:00
nanos
ed5f0ba3b4 update gitignore 2024-06-25 10:01:37 +01:00
nanos
1c7023819e try again 2024-06-25 09:03:59 +01:00
nanos
e6fd9c6b00 bug fix 2024-06-25 09:01:07 +01:00
Michael
721d2fc5bb
Merge pull request #124 from nanos/rate-limits
Rate limit fetching of context
2024-06-25 08:53:16 +01:00
nanos
120008ced0 shorten storage time 2024-06-25 08:50:27 +01:00