213 Commits

Author SHA1 Message Date
Andrew Dolgov 5a450b8760 add workaround for languagedetect idiotic shit of some kind 2013-11-13 20:36:15 +04:00
Andrew Dolgov d8179cb9d9 pubsubhubbub: use atom rel=self link (when available) when subscribing
to push-enabled feeds
2013-11-11 22:52:15 +04:00
Andrew Dolgov 4ad04ee227 report all libxml errors in updater debug output
force utf8 encoding if devforceupdate is on
parser: try to convert non-unicode feeds with specified encoding to utf8
before trying to remove dangling utf8 characters in case of utf8-related
libxml errors because doing so produces garbage content
2013-10-29 12:15:26 +04:00
Andrew Dolgov 88edaa9344 daemon: cache parser object while processing a batch of feeds with same url on first success 2013-10-25 14:42:43 +04:00
Andrew Dolgov 5ddd2705ca make language detection optional (closes ) 2013-09-27 13:45:21 +04:00
Andrew Dolgov 1357a263be include title when detecting article language 2013-09-17 12:28:31 +04:00
Andrew Dolgov 4f71d7431c replace suppress debugging kludge with a more flexible function (fixes
logging with update.php --feeds being stopped after first feed)
2013-09-15 23:02:21 +04:00
Andrew Dolgov a33558a61e pass logfile to child tasks if locking is possible, lock logfile before writing, add kludge to prevent update_rss_feed unneeded debugging go into master logfile 2013-09-02 12:33:59 +04:00
Andrew Dolgov f73e03e000 pass feed information to article filters 2013-09-02 10:03:04 +04:00
Andrew Dolgov 5c54e68388 support media:description for media: enclosures 2013-08-05 12:26:09 +04:00
Andrew Dolgov edba269b6f fix entries not inserted properly when no languages are detected 2013-08-02 16:03:13 +04:00
Andrew Dolgov 00f22824d7 rss: force language to 2 characters; run house keeping hooks properly 2013-08-02 14:47:34 +04:00
Andrew Dolgov 8e47022036 add hook_house_keeping 2013-08-02 14:06:18 +04:00
Andrew Dolgov 2fc4d981d1 remove unused old-style image rewriting 2013-08-02 14:04:00 +04:00
Andrew Dolgov 6b4617970f add text_languagedetect to guess article language for better hyphenation
(bump schema)
2013-07-31 10:30:17 +04:00
Andrew Dolgov 0997c2bd62 Revert "add temporary hack to store original unhashed guid into cached_content for debugging"
This reverts commit 8096e309a5.
2013-07-14 21:48:14 +04:00
Andrew Dolgov 8096e309a5 add temporary hack to store original unhashed guid into cached_content for debugging 2013-07-11 21:40:26 +04:00
Andrew Dolgov c052e25a8b remove unused cached_content 2013-07-10 16:55:55 +04:00
Andrew Dolgov 420940fa90 do not catchup newly subscribed feeds 2013-06-25 10:01:41 +04:00
Andrew Dolgov 6791af0cfd pass feed id to feed_fetched and fetch_feed hooks 2013-05-20 15:28:56 +04:00
Andrew Dolgov ee65bef405 add HOOK_FETCH_FEED 2013-05-20 15:26:53 +04:00
Andrew Dolgov 0ad2013bd2 update_rss_feed: remove unused override_url parameter 2013-05-20 15:20:14 +04:00
Andrew Dolgov 47673e6611 add fetch_url and owner_uid to HOOK_FEED_FETCHED 2013-05-18 09:22:06 +04:00
Andrew Dolgov d1f3fa9791 try to force-convert feed data to utf8 2013-05-17 20:09:43 +04:00
Andrew Dolgov fd687300bf Revert "subscribe_to_feed: use already fetched data when updating initially"
This reverts commit 23923fb29b.
2013-05-08 19:22:33 +04:00
Andrew Dolgov 23923fb29b subscribe_to_feed: use already fetched data when updating initially 2013-05-07 15:34:20 +04:00
Rasmus Lerdorf 6f7798b643 Fixing bugs found by static analysis 2013-05-07 00:35:10 -07:00
Andrew Dolgov f4ae0f053b update: remove debugging block 2013-05-02 10:26:32 +04:00
Andrew Dolgov 566417c4e7 restore updstart threshold 2013-05-02 02:31:32 +04:00
Andrew Dolgov 5d3e5a1bb2 simplify feed cache age handling (reduce caching to sequential updates) 2013-05-02 02:30:53 +04:00
Andrew Dolgov 5de4010487 disable marking for the time being 2013-05-02 02:21:11 +04:00
Andrew Dolgov 5ef8409700 move the precautionary timestamp bumping 2013-05-02 02:20:34 +04:00
Andrew Dolgov 5d0d3887af add _DISABLE_HTTP_304 2013-05-02 02:11:11 +04:00
Andrew Dolgov 15c762beda updater: show owner_uid for checked feeds 2013-05-02 02:02:49 +04:00
Andrew Dolgov 52637d3b30 remove cache valid bailout clause 2013-05-02 01:36:17 +04:00
Andrew Dolgov 865a3ed6a0 change feed cache file extension 2013-05-02 01:33:02 +04:00
Andrew Dolgov d4992d6b48 add support for dc:subject and slash:comments 2013-05-01 20:55:08 +04:00
Andrew Dolgov ee78f81ccd update: better tag-related debugging info 2013-05-01 20:33:59 +04:00
Andrew Dolgov 852d4ac890 support RDF-XML feeds 2013-05-01 20:30:52 +04:00
Andrew Dolgov fd0daa9b55 remove simplepie 2013-05-01 19:14:48 +04:00
Andrew Dolgov 431e27851b actually save feed xml in the cache 2013-05-01 18:10:27 +04:00
Andrew Dolgov 99429e57e4 remove simplepie entity decode hacks 2013-05-01 18:07:05 +04:00
Andrew Dolgov b8f316dc28 change caching to save xml data, remove RDF init section 2013-05-01 17:56:21 +04:00
Andrew Dolgov 04d2f9c831 add basic rss support 2013-05-01 17:38:16 +04:00
Andrew Dolgov cd07592c29 add basic tinyparser/atom 2013-05-01 17:04:57 +04:00
Andrew Dolgov 65c8d5e76d update: set last_updated to now() when process starts 2013-05-01 15:43:56 +04:00
Andrew Dolgov 39ede9862f experimental: decode numerical utf entities on import in entry title 2013-04-29 16:59:36 +04:00
Andrew Dolgov efe46a3b53 Merge pull request from KonishchevDmitry/pr-allow-slash-in-filter-regex
Allow slashes in filter regular expressions
2013-04-27 02:10:27 -07:00
Dmitry Konishchev 7b80b5e160 Match each tag separately against user filter regular expression
Each article's tag should be matched against the user's filter regular
expression separately. The current matching is confusing when you want to
match an exact tag: you would expect to write "^tag$", but now have to write
"(^|,)tag(,|$)", which is very inconvenient and requires knowledge of
how the matching is processed.
2013-04-26 16:30:25 +04:00
Dmitry Konishchev ffa1bd7b19 Allow slashes in filter regular expressions
User's regular expressions need escaping before passing them to
preg_match()
2013-04-26 15:46:48 +04:00