Reading an RSS feed

Based on a question from Bill Seymour, this is what I came up with.

curl -s | \
grep -E '<title>|<link>' | sed -E 's/<[^>]+>//g'

In parts:

  • First, curl gets the data.
  • Then grep strains the interesting parts (title and link) out of the XML soup.
  • Finally sed strips off the XML tags.
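
To see the grep and sed stages in isolation, here is a minimal sketch that swaps curl for a small made-up feed held in a shell variable. The feed contents and the function name extract_titles_links are assumptions for illustration only; the grep and sed commands are the ones from the pipeline above.

```shell
# Hypothetical sample feed standing in for the output of curl.
feed='<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>Example feed</title>
<link>https://example.com/</link>
<description>Made-up data</description>
<item>
<title>First post</title>
<link>https://example.com/first</link>
</item>
</channel>
</rss>'

# Keep only the title and link lines, then strip the XML tags from them.
extract_titles_links() {
    grep -E '<title>|<link>' | sed -E 's/<[^>]+>//g'
}

printf '%s\n' "$feed" | extract_titles_links
```

With this sample the description line is dropped by grep, and the remaining four lines come out as bare text: each title followed by its link.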

Since each <title> and <link> element sits on its own line, grep is sufficient.

If that were not the case, I would first try inserting xmllint --format - between curl and grep (the trailing - tells xmllint to read from standard input). If that were not sufficient, I would probably write the thing in Python.
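
A rough sketch of the xmllint variant, assuming xmllint is installed and using a made-up feed in which every element shares one line. The feed and the function name titles_and_links are hypothetical. Because xmllint indents its output, the sed stage here also strips leading whitespace, which the original pipeline did not need to do.

```shell
# Hypothetical one-line feed: every element shares a single line.
feed='<rss version="2.0"><channel><title>Example feed</title><link>https://example.com/</link><item><title>First post</title><link>https://example.com/first</link></item></channel></rss>'

# xmllint --format reads stdin when given "-" and re-indents the XML,
# which puts each title and link element on a line of its own.
# sed then strips the tags and the indentation that xmllint added.
titles_and_links() {
    xmllint --format - |
        grep -E '<title>|<link>' |
        sed -E 's/<[^>]+>//g; s/^[[:space:]]+//'
}

if command -v xmllint >/dev/null 2>&1; then
    printf '%s\n' "$feed" | titles_and_links
else
    echo 'xmllint not found; skipping' >&2
fi
```

Without the xmllint stage, grep would match the single long line as a whole and sed would glue all the text together, which is why the reformatting step comes first.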

