Roland's homepage

My random knot in the Web

Reading an RSS feed

Based on a question from Bill Seymour, this is what I came up with.

curl -s https://freethoughtblogs.com/feed/ | \
grep -E '<title>|<link>' | sed -E 's/<[^>]+>//g'

In parts:

First, curl gets the data.
Then grep strains the interesting parts (title and link) out of the XML soup.
Finally, sed strips off the XML tags.

Since each <title> and <link> element in this feed sits on its own line, grep is sufficient to pick them out.
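To make the grep and sed stages concrete, here is a minimal sketch that mirrors them in Python on a small inline sample. The sample feed and the function name titles_and_links are made up for illustration; nothing here is fetched from the real feed.

```python
import re

# Hypothetical sample standing in for what curl would download.
SAMPLE_FEED = """\
<rss version="2.0">
<channel>
    <title>Example blog</title>
    <link>https://example.org/</link>
    <item>
        <title>First post</title>
        <link>https://example.org/first</link>
        <description>Not wanted</description>
    </item>
</channel>
</rss>
"""


def titles_and_links(feed_text):
    """Mimic grep -E '<title>|<link>' piped into sed -E 's/<[^>]+>//g'."""
    result = []
    for line in feed_text.splitlines():
        # grep stage: keep only lines that contain <title> or <link>.
        if re.search(r"<title>|<link>", line):
            # sed stage: delete everything that looks like a tag.
            result.append(re.sub(r"<[^>]+>", "", line))
    return result


for line in titles_and_links(SAMPLE_FEED):
    print(line)
```

Like sed, this leaves the leading indentation of each line in place. It also only works because every element in the sample occupies a single line.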

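If that were not the case, I would first try inserting xmllint --format - after curl (the trailing dash makes xmllint read from standard input) to put each element on its own line. If that were not sufficient either, I would probably write the thing in Python.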
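A Python version might look roughly like the sketch below. It uses the standard library's xml.etree.ElementTree instead of regular expressions, so it does not care how the elements are spread over lines. The sample feed and the function name feed_entries are assumptions for illustration, and unlike the shell pipeline it reports only the <item> entries, not the channel's own title and link.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed where elements share lines and one title spans two,
# which is exactly the layout that would defeat the grep approach.
SAMPLE_FEED = """\
<rss version="2.0"><channel><title>Example blog</title>
<link>https://example.org/</link><item><title>First
post</title><link>https://example.org/first</link></item>
<item><title>Second post</title><link>https://example.org/second</link></item>
</channel></rss>
"""


def feed_entries(feed_text):
    """Return a list of (title, link) tuples, one per <item> element."""
    root = ET.fromstring(feed_text)
    entries = []
    for item in root.iter("item"):
        # Collapse any internal line breaks or runs of spaces in the title.
        title = " ".join(item.findtext("title", default="").split())
        link = item.findtext("link", default="").strip()
        entries.append((title, link))
    return entries


for title, link in feed_entries(SAMPLE_FEED):
    print(title)
    print(link)
```

For a real feed, the text could be obtained with urllib.request.urlopen, or the script could read the output of curl from standard input.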

For comments, please send me an e-mail.
