Tweeper 1.4.1 released, fixed twitter to RSS conversion

Tweeper is a web scraper which converts Twitter and other social media websites to RSS.

On September 7th, 2019, version 1.4.1 of Tweeper was released. The tweeper Debian package has also been updated.

For PHP Composer users there is also the ao2/tweeper package on Packagist, so tweeper can be installed and run with these commands:

$ composer global require ao2/tweeper
$ ~/.config/composer/vendor/bin/tweeper

Version 1.4.1 is a maintenance release with a couple of fixes.

Here are the NEWS file entries:

News for v1.4.1:
================

   * Enable cookie handling in cURL to fix scraping twitter.com
   * Update User-Agent version to fix scraping hashtag pages on twitter.com
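
As an illustration of the first entry, here is a minimal sketch of what "enabling cookie handling" means for an HTTP client: cookies set by one response are stored and sent back on later requests. It uses Python's standard library rather than the PHP cURL code Tweeper is built on, so it is an analogy under that assumption and not Tweeper's implementation; the helper name build_cookie_opener is hypothetical.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return an opener that stores and replays cookies, plus its cookie jar.

    Hypothetical helper for illustration; not part of Tweeper.
    """
    # In-memory jar: cookies received through this opener are kept here
    # and attached to later requests made through the same opener.
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar
```

A page would then be fetched with opener.open(url) instead of a plain urllib.request.urlopen(url), so that every request in the session shares the same jar.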

Comments

Did twitter break tweeper

Aaron

Did twitter break tweeper recently?

$ php ../git/tweeper/tweeper.php https://twitter.com/business | tee business
PHP Warning:  Error 801: Tag svg invalid in Entity, line 138 in /home/aas/git/tweeper/src/Tweeper.php on line 257
PHP Warning:  Error 801: Tag g invalid in Entity, line 138 in /home/aas/git/tweeper/src/Tweeper.php on line 257
PHP Warning:  Error 801: Tag path invalid in Entity, line 138 in /home/aas/git/tweeper/src/Tweeper.php on line 257
<?xml version="1.0"?>
<rss version="2.0" xml:base="https://twitter.com">
  <channel>
    <generator>Tweeper</generator>
    <title>Twitter / </title>
    <link>https://twitter.com/business</link>
    <description/>
  </channel>
</rss>

ao2

Yes, it looks like twitter.com now serves content as JSON to modern browsers, which then parse and render it on the client side.

Adjusting the User-Agent header to mimic an older browser seems to fix this and does not break the other sites supported by Tweeper; see https://git.ao2.it/tweeper.git/commitdiff/da4568f5a2d24e0933d44b16b5ef180095c42dab.
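
For readers who want to try the same idea outside of Tweeper, a request with an overridden User-Agent can be sketched as follows. This uses Python's standard library, not Tweeper's PHP code. The User-Agent string and the build_request helper are assumptions made up for this example; the value Tweeper actually sends is the one in the commit linked above.

```python
import urllib.request

# Hypothetical User-Agent of an older desktop browser, for illustration only.
LEGACY_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0"
)


def build_request(url, user_agent=LEGACY_USER_AGENT):
    """Build (but do not send) a request that announces the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

Passing the result to urllib.request.urlopen() would send the request; whether a given site answers with server-rendered HTML for that User-Agent depends on the site and may change over time.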
