100 Commits

Author SHA1 Message Date
Andrew Dolgov 528b387563 update individual feed in a separate process to prevent PHP fatal errors
(for example, OOM) from stopping the entire batch
this should also slightly increase memory budget for update processes
2020-09-27 15:58:13 +03:00
Andrew Dolgov 05744bb474 fix updater never scheduling feeds for update if they have never been updated before while having default update interval set 2020-09-22 20:33:51 +03:00
Andrew Dolgov 6811d0bde2 use self:: in some places to invoke static methods from the same class 2020-09-22 14:54:15 +03:00
Andrew Dolgov 74568df4ff remove a lot of stuff from global context (functions.php), add a few helper classes instead 2020-09-22 09:04:33 +03:00
Andrew Dolgov 3dd4169b5f clarify some URL validation-related error messages 2020-09-21 20:35:24 +03:00
Andrew Dolgov 4785f21316 update_rss_feed: log effective URL after fetching
validate_url: treat scheme as case-insensitive
2020-09-21 20:26:57 +03:00
Andrew Dolgov a4525d31b2 replace FALSE with false so that static analyzer shuts up about it 2020-09-17 19:02:27 +03:00
Andrew Dolgov afa0023c51 don't try to update manually disabled feeds even if they haven't been updated before or are marked for a manual update 2020-09-17 15:40:50 +03:00
Andrew Dolgov c352e872e9 core: pass found enclosures to HOOK_ARTICLE_FILTER
af_redditimgur: remove enclosures if we found something to embed because it's going to be a low-res thumbnail
2020-06-24 22:54:14 +03:00
Andrew Dolgov 6eb94f1e13 better support for image srcset attributes as discussed in https://community.tt-rss.org/t/problem-with-img-srcset/3519 2020-06-15 11:58:59 +03:00
Andrew Dolgov 06d2c65193 calculate_article_hash: don't die() on previous, woops 2020-05-17 17:44:32 +03:00
Andrew Dolgov 3a142cbf58 calculate_article_hash: ignore some useless or read-only fields (i.e. GUID) when calculating hash 2020-05-17 17:42:37 +03:00
Andrew Dolgov cd1f3cb8cc * store UID in article hashed GUID separately so it could be migrated cleanly to a different instance
* store resulting GUID as a JSON object so it could be extended easier if needed
2020-05-17 14:01:16 +03:00
Andrew Dolgov 3a4b9249a9 DiskCache: properly deal with srcset attributes 2020-04-29 19:29:36 +03:00
Andrew Dolgov 4a00f96733 remove unneeded var_dump() 2020-04-29 11:35:02 +03:00
Andrew Dolgov 6573541873 * add HOOK_ENCLOSURE_IMPORTED
* pass feed id to HOOK_FEED_PARSED
2020-04-29 11:33:39 +03:00
lllusion3418 ec1b0befc7 add support for video[@src] in media cache
it's a valid alternative to a source[@src] child element:
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/video
2020-03-12 11:08:39 +01:00
lllusion3418 cdde23b4dc actually download <video> posters to media cache
video[@poster] is already supported in the rewriting logic but never
actually downloaded
2020-03-12 11:08:33 +01:00
Andrew Dolgov f24ece85a6 add validationtextarea control, use it for filter match editor 2020-02-28 13:53:45 +03:00
Andrew Dolgov 6080cca9ca scrap counter cache system; rework counters to sum() booleans instead 2020-01-24 14:25:31 +03:00
Andrew Dolgov e5b7b145e5 cache media: set referrer to source URL when fetching images 2019-11-25 09:48:24 +03:00
Andrew Dolgov 304d3a0b88 tag-related fixes
1. move tag sanitization to feedparser common item class
2. enforce length limit on tags when parsing
3. support multiple tags passed via one dc:subject and other such elements, parse them as a comma-separated list
4. sort resulting tag list to prevent different order between feed updates
5. remove some duplicate code related to tag validation
6. allow + symbol in tags
2019-11-20 18:56:34 +03:00
Andrew Dolgov 8c3efd51ec reset domain hit quota on feed update start 2019-11-17 13:17:21 +03:00
Andrew Dolgov 0d7b10469b update_rss_feed: add specific logging for HOOK_FETCH_FEED, HOOK_FEED_FETCHED, HOOK_FEED_PARSED handlers 2019-11-14 06:39:45 +03:00
Andrew Dolgov 5bb8dad631 is_gzipped: don't try to strpos() over entire buffer 2019-11-12 07:11:10 +03:00
Andrew Dolgov 647c7c45eb allow article filters to modify num_comments 2019-10-25 14:37:00 +03:00
Andrew Dolgov 4e05008aac update_rss_feed: force cast initial timestamp value to integer 2019-09-30 11:41:07 +03:00
Andrew Dolgov b0d67cd3d0 rework previous to pass unformatted timestamp to plugin, and deal with formatting later
also, move timestamp-related debugging output after plugin handler
2019-09-11 14:04:59 +03:00
Andrew Dolgov 94a12b9674 pass formatted entry timestamp to article filters and allow them to modify it 2019-09-11 11:43:40 +03:00
Andrew Dolgov 6914ad1f74 retire MIN_CACHE_FILE_SIZE 2019-08-14 12:44:50 +03:00
Andrew Dolgov 84974c60a7 RSSUtils::cache_media, cache_enclosures: use DiskCache 2019-08-14 12:15:56 +03:00
Andrew Dolgov fdb6066bf6 * HOOK_ENCLOSURE_ENTRY: pass article_id to handler
* DiskCache: multiple fixes; support isWritable() for cache entries, set content-disposition for send()
* public/cached_url: allow selecting files from sub-caches other than images
* plugins/Cache_Starred_Images: rework to use DiskCache, can be enabled per-user, properly handles article enclosures, etc
2019-08-13 16:40:21 +03:00
Andrew Dolgov 19b9b27662 expire_cached_files to DiskCache::expire() 2019-08-13 14:13:42 +03:00
Andrew Dolgov 088fcf8131 move more globals to more appropriate places
set libxml to always use internal errors
2019-06-20 08:40:02 +03:00
Andrew Dolgov 4fa9aee4e7 move several more global functions to more appropriate classes 2019-06-20 08:14:06 +03:00
Andrew Dolgov 9423d72f6c parser: force libxml error messages to valid utf8 2019-05-12 10:13:22 +03:00
Andrew Dolgov c936cc3a1f use DEFAULT_SEARCH_LANGUAGE to generate tsvector index if per-feed language is not specified, also use it as default value on search form for convenience 2019-04-10 13:03:26 +03:00
Andrew Dolgov 671f4cee65 domdocument: remove old meta charset unicode hacks, replace with shorter xml preamble utf8 hack (on loadhtml where it makes sense)
af_readability: better (?) charset hack for non-unicode pages
2019-03-21 21:08:02 +03:00
Andrew Dolgov 33a2d5f8e4 update_rss_feed: set basic feed info if site_url is blank 2019-03-15 14:00:09 +03:00
Andrew Dolgov 69a691f4e1 cleanup old feed browser cache 2019-03-06 20:12:44 +03:00
Andrew Dolgov 0b74db5ad7 remove feedbrowser (other feeds) 2019-03-06 20:02:06 +03:00
Andrew Dolgov 38e01270d8 archived feeds: expire old entries (schema bump) 2019-03-06 19:06:05 +03:00
Andrew Dolgov 13e7e775a3 update_rss_feed: mark_unread_on_update should take into account catchup filter action and entry_force_catchup 2019-02-06 22:56:14 +03:00
Andrew Dolgov 949bfa3457 add minor clean()-ing on some rss feed values 2018-12-26 09:58:28 +03:00
Andrew Dolgov eedd402807 rssutils: don't gzdecode() stuff 2018-12-21 17:52:41 +03:00
Andrew Dolgov a5517fe857 fetch_file_contents: decompress gzipped data
af_readability: remove utf8 preamble hack
2018-12-21 17:50:16 +03:00
Andrew Dolgov 958fbfedb6 rssutils: check if returned data is in gzip format before trying to decode it 2018-12-14 14:55:36 +03:00
JustAMacUser 4b2f3039d2 Properly report filter plugin time (re-fixes PR 98). 2018-12-12 21:30:16 -05:00
JustAMacUser 53602096b9 Fixed misplaced bracket. 2018-12-12 11:47:36 -05:00
Andrew Dolgov f3737c0b24 update_rss_feed: add log message if article is filtered out
combine filters: fix crash on missing global function
2018-12-08 17:01:30 +03:00