Andrew Dolgov
4ad04ee227
report all libxml errors in updater debug output
force utf8 encoding if devforceupdate is on
parser: try to convert non-unicode feeds with specified encoding to utf8
before trying to remove dangling utf8 characters in case of utf8-related
libxml errors because doing so produces garbage content
2013-10-29 12:15:26 +04:00
Andrew Dolgov
88edaa9344
daemon: cache parser object while processing a batch of feeds with same url on first success
2013-10-25 14:42:43 +04:00
Andrew Dolgov
5ddd2705ca
make language detection optional (closes #779)
2013-09-27 13:45:21 +04:00
Andrew Dolgov
1357a263be
include title when detecting article language
2013-09-17 12:28:31 +04:00
Andrew Dolgov
4f71d7431c
replace suppress debugging kludge with a more flexible function (fixes logging with update.php --feeds being stopped after first feed)
2013-09-15 23:02:21 +04:00
Andrew Dolgov
a33558a61e
pass logfile to child tasks if locking is possible, lock logfile before writing, add kludge to keep unneeded update_rss_feed debugging out of the master logfile
2013-09-02 12:33:59 +04:00
Andrew Dolgov
f73e03e000
pass feed information to article filters
2013-09-02 10:03:04 +04:00
Andrew Dolgov
5c54e68388
support media:description for media: enclosures
2013-08-05 12:26:09 +04:00
Andrew Dolgov
edba269b6f
fix entries not inserted properly when no languages are detected
2013-08-02 16:03:13 +04:00
Andrew Dolgov
00f22824d7
rss: force language to 2 characters; run house keeping hooks properly
2013-08-02 14:47:34 +04:00
Andrew Dolgov
8e47022036
add hook_house_keeping
2013-08-02 14:06:18 +04:00
Andrew Dolgov
2fc4d981d1
remove unused old-style image rewriting
2013-08-02 14:04:00 +04:00
Andrew Dolgov
6b4617970f
add text_languagedetect to guess article language for better hyphenation
(bump schema)
2013-07-31 10:30:17 +04:00
Andrew Dolgov
0997c2bd62
Revert "add temporary hack to store original unhashed guid into cached_content for debugging"
This reverts commit 8096e309a5.
2013-07-14 21:48:14 +04:00
Andrew Dolgov
8096e309a5
add temporary hack to store original unhashed guid into cached_content for debugging
2013-07-11 21:40:26 +04:00
Andrew Dolgov
c052e25a8b
remove unused cached_content
2013-07-10 16:55:55 +04:00
Andrew Dolgov
420940fa90
do not catchup newly subscribed feeds
2013-06-25 10:01:41 +04:00
Andrew Dolgov
6791af0cfd
pass feed id to feed_fetched and fetch_feed hooks
2013-05-20 15:28:56 +04:00
Andrew Dolgov
ee65bef405
add HOOK_FETCH_FEED
2013-05-20 15:26:53 +04:00
Andrew Dolgov
0ad2013bd2
update_rss_feed: remove unused override_url parameter
2013-05-20 15:20:14 +04:00
Andrew Dolgov
47673e6611
add fetch_url and owner_uid to HOOK_FEED_FETCHED
2013-05-18 09:22:06 +04:00
Andrew Dolgov
d1f3fa9791
try to force-convert feed data to utf8
2013-05-17 20:09:43 +04:00
Andrew Dolgov
fd687300bf
Revert "subscribe_to_feed: use already fetched data when updating initially"
This reverts commit 23923fb29b.
2013-05-08 19:22:33 +04:00
Andrew Dolgov
23923fb29b
subscribe_to_feed: use already fetched data when updating initially
2013-05-07 15:34:20 +04:00
Rasmus Lerdorf
6f7798b643
Fixing bugs found by static analysis
2013-05-07 00:35:10 -07:00
Andrew Dolgov
f4ae0f053b
update: remove debugging block
2013-05-02 10:26:32 +04:00
Andrew Dolgov
566417c4e7
restore updstart threshold
2013-05-02 02:31:32 +04:00
Andrew Dolgov
5d3e5a1bb2
simplify feed cache age handling (reduce caching to sequential updates)
2013-05-02 02:30:53 +04:00
Andrew Dolgov
5de4010487
disable marking for the time being
2013-05-02 02:21:11 +04:00
Andrew Dolgov
5ef8409700
move the precautionary timestamp bumping
2013-05-02 02:20:34 +04:00
Andrew Dolgov
5d0d3887af
add _DISABLE_HTTP_304
2013-05-02 02:11:11 +04:00
Andrew Dolgov
15c762beda
updater: show owner_uid for checked feeds
2013-05-02 02:02:49 +04:00
Andrew Dolgov
52637d3b30
remove cache valid bailout clause
2013-05-02 01:36:17 +04:00
Andrew Dolgov
865a3ed6a0
change feed cache file extension
2013-05-02 01:33:02 +04:00
Andrew Dolgov
d4992d6b48
add support for dc:subject and slash:comments
2013-05-01 20:55:08 +04:00
Andrew Dolgov
ee78f81ccd
update: better tag-related debugging info
2013-05-01 20:33:59 +04:00
Andrew Dolgov
852d4ac890
support RDF-XML feeds
2013-05-01 20:30:52 +04:00
Andrew Dolgov
fd0daa9b55
remove simplepie
2013-05-01 19:14:48 +04:00
Andrew Dolgov
431e27851b
actually save feed xml in the cache
2013-05-01 18:10:27 +04:00
Andrew Dolgov
99429e57e4
remove simplepie entity decode hacks
2013-05-01 18:07:05 +04:00
Andrew Dolgov
b8f316dc28
change caching to save xml data, remove RDF init section
2013-05-01 17:56:21 +04:00
Andrew Dolgov
04d2f9c831
add basic rss support
2013-05-01 17:38:16 +04:00
Andrew Dolgov
cd07592c29
add basic tinyparser/atom
2013-05-01 17:04:57 +04:00
Andrew Dolgov
65c8d5e76d
update: set last_updated to now() when process starts
2013-05-01 15:43:56 +04:00
Andrew Dolgov
39ede9862f
experimental: decode numerical utf entities on import in entry title
2013-04-29 16:59:36 +04:00
Andrew Dolgov
efe46a3b53
Merge pull request #167 from KonishchevDmitry/pr-allow-slash-in-filter-regex
Allow slashes in filter regular expressions
2013-04-27 02:10:27 -07:00
Dmitry Konishchev
7b80b5e160
Match each tag separately against user filter regular expression
Each article's tag should be matched against the user's filter regular
expression separately. The current matching is confusing when you want to
match an exact tag: you would expect to write "^tag$", but now have to write
"(^|,)tag(,|$)", which is very inconvenient and requires knowledge of
how the matching is processed.
2013-04-26 16:30:25 +04:00
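A minimal sketch of the difference described in the commit message above, written in Python for illustration only (the project itself is PHP). It assumes the old behaviour matched the filter against a single comma-joined tag string, which is inferred from the "(^|,)tag(,|$)" workaround. The function names `joined_match` and `per_tag_match` are hypothetical and do not come from the codebase.

```python
import re

def joined_match(pattern, tags):
    # Assumed old behaviour: one match against all tags joined with commas.
    return re.search(pattern, ",".join(tags)) is not None

def per_tag_match(pattern, tags):
    # Behaviour described by the commit: test each tag on its own,
    # so anchors like ^ and $ apply to a single tag.
    return any(re.search(pattern, tag) for tag in tags)

tags = ["linux", "php", "news"]

# An exact-tag filter misses when matched against the joined string
# but matches once each tag is tested separately.
exact_on_joined = joined_match(r"^php$", tags)
exact_per_tag = per_tag_match(r"^php$", tags)

# The workaround pattern is what the joined form required.
workaround_on_joined = joined_match(r"(^|,)php(,|$)", tags)
```

With per-tag matching, a user can write the anchored pattern they would naturally expect, without knowing how the tags are stored or joined.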
Dmitry Konishchev
ffa1bd7b19
Allow slashes in filter regular expressions
User's regular expressions need escaping before passing them to
preg_match()
2013-04-26 15:46:48 +04:00
Andrew Dolgov
90e5f4f1de
base if-modified-since on last received article, not feed update timestamp
2013-04-25 18:42:48 +04:00
Andrew Dolgov
23283f11a3
fetch: better checking for 1970- date
2013-04-25 16:12:49 +04:00