Improve RSS importer with canonical_link and extract_tags option (#489)
Merge pull request 489
sumanmaity112 authored Nov 28, 2022
1 parent 098b02d commit f9b6653
Showing 2 changed files with 34 additions and 8 deletions.
7 changes: 7 additions & 0 deletions docs/_importers/rss.md
@@ -17,3 +17,10 @@ $ ruby -r rubygems -e 'require "jekyll-import";
 {% endhighlight %}
 
 The `source` field is required and can be either a local file or a remote one.
+Other optional fields are as follows:
+* `canonical_link` – copy original link as `canonical_url` to post. (default: `false`)
+* `render_audio` – render `<audio>` element in posts for the enclosure URLs (default: `false`)
+* `tag` – add a specific tag to all posts
+* `extract_tags` – copies tags from the given subfield on the RSS `<item>`
+
+__Note:__ `tag` and `extract_tags` are exclusive option, both can not be provided together.
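
The mutual-exclusion rule in the note above can be sketched in plain Ruby. This is only an illustration: `check_rss_options` is a hypothetical helper, not part of jekyll-import, and it raises `ArgumentError` where the importer itself calls `abort`.

```ruby
# Hypothetical helper mirroring the documented rule; not part of jekyll-import.
def check_rss_options(options)
  # "source" is the only required field.
  raise ArgumentError, "Missing mandatory option: source" if options["source"].nil?

  # "tag" and "extract_tags" must not be combined.
  if options["tag"] && options["extract_tags"]
    raise ArgumentError, "Provide either tag or extract_tags, not both"
  end

  options
end

# Accepted: only one of the two tagging options is present.
check_rss_options({ "source" => "feed.xml", "extract_tags" => "category" })

# Rejected: both tagging options are present.
begin
  check_rss_options({ "source" => "feed.xml", "tag" => "news", "extract_tags" => "category" })
rescue ArgumentError => e
  puts e.message
end
```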
35 changes: 27 additions & 8 deletions lib/jekyll-import/importers/rss.rb
@@ -5,12 +5,15 @@ module Importers
     class RSS < Importer
       def self.specify_options(c)
         c.option "source", "--source NAME", "The RSS file or URL to import"
-        c.option "tag", "--tag NAME", "Add a tag to posts"
-        c.option "render_audio", "--render_audio", "Render <audio> element as necessary"
+        c.option "tag", "--tag NAME", "Add a specific tag to all posts"
+        c.option "extract_tags", "--extract_tags KEY", "Copies tags from the given subfield on the RSS <item>"
+        c.option "render_audio", "--render_audio", "Render <audio> element in posts for the enclosure URLs (default: false)"
+        c.option "canonical_link", "--canonical_link", "Copy original link as canonical_url to post. (default: false)"
       end
 
       def self.validate(options)
         abort "Missing mandatory option --source." if options["source"].nil?
+        abort "Provide either --tag or --extract_tags option." if options["extract_tags"] && options["tag"]
       end
 
       def self.require_deps
@@ -33,7 +36,7 @@ def self.process(options)
         source = options.fetch("source")
 
         content = ""
-        open(source) { |s| content = s.read }
+        URI.open(source) { |s| content = s.read }
         rss = ::RSS::Parser.parse(content, false)
 
         raise "There doesn't appear to be any RSS items at the source (#{source}) provided." unless rss
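
The switch to `URI.open` in this hunk still lets `source` be either a local path or a URL, because open-uri falls back to the ordinary file `open` when the string is not a URL. A minimal sketch using only a temporary local file (the helper name `read_source` is hypothetical, and no network access is involved):

```ruby
require "open-uri"
require "tempfile"

# Hypothetical helper: read the whole source, whether it is a path or a URL.
def read_source(source)
  content = ""
  URI.open(source) { |s| content = s.read }
  content
end

# Exercise the local-file branch with a throwaway file.
Tempfile.create(["feed", ".xml"]) do |file|
  file.write("<rss version=\"2.0\"></rss>")
  file.flush
  puts read_source(file.path)
end
```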
@@ -52,13 +55,14 @@ def self.write_rss_item(item, options)
         post_name = Jekyll::Utils.slugify(item.title, :mode => "latin")
         name = "#{formatted_date}-#{post_name}"
         audio = render_audio && item.enclosure.url
+        canonical_link = options.fetch("canonical_link", false)
 
         header = {
-          "layout" => "post",
-          "title" => item.title,
-        }
-
-        header["tag"] = options["tag"] unless options["tag"].nil? || options["tag"].empty?
+          "layout" => "post",
+          "title" => item.title,
+          "canonical_url" => (canonical_link ? item.link : nil),
+          "tag" => get_tags(item, options),
+        }.compact
 
         frontmatter.each do |value|
           header[value] = item.send(value)
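
The new header relies on `Hash#compact` to drop keys whose value is `nil`, so `canonical_url` and `tag` only reach the front matter when they are set. A small stand-alone sketch of that pattern (`build_header` and its arguments are illustrative names, not importer code):

```ruby
# Illustrative stand-in for the header construction in the hunk above.
def build_header(title, link, canonical_link: false, tag: nil)
  {
    "layout" => "post",
    "title" => title,
    "canonical_url" => (canonical_link ? link : nil),
    "tag" => tag,
  }.compact # removes every key whose value is nil
end

p build_header("Hello", "https://example.com/hello")
# => {"layout"=>"post", "title"=>"Hello"}
p build_header("Hello", "https://example.com/hello", canonical_link: true).keys
# => ["layout", "title", "canonical_url"]
```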
@@ -91,6 +95,21 @@ def self.write_rss_item(item, options)
           f.puts output
         end
       end
+
+      def self.get_tags(item, options)
+        explicit_tag = options["tag"]
+        return explicit_tag unless explicit_tag.nil? || explicit_tag.empty?
+
+        tags_reference = options["extract_tags"]
+        return unless tags_reference
+
+        tags_from_feed = item.instance_variable_get("@#{tags_reference}")
+        return unless tags_from_feed.is_a?(Array)
+
+        tags = tags_from_feed.map { |feed_tag| feed_tag.content.downcase }
+        tags.empty? ? nil : tags.tap(&:uniq!)
+      end
+      private_class_method :get_tags
     end
   end
 end
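
To see how the tag lookup in `get_tags` behaves, here is a rough stand-alone sketch that follows the same steps with stand-in objects. `Category`, `SketchItem` and `pick_tags` are hypothetical names, and the field name `category` is an assumption about how a parsed feed item stores its categories; the real importer receives items from Ruby's RSS parser.

```ruby
# Stand-ins for parsed feed objects (assumed shape, not the RSS library's classes).
Category = Struct.new(:content)

class SketchItem
  def initialize(categories)
    @category = categories
  end
end

# Same decision order as get_tags: explicit tag first, then extracted tags.
def pick_tags(item, options)
  explicit = options["tag"]
  return explicit unless explicit.nil? || explicit.empty?

  field = options["extract_tags"]
  return nil unless field

  raw = item.instance_variable_get("@#{field}")
  return nil unless raw.is_a?(Array)

  # Lowercase each tag and drop duplicates.
  tags = raw.map { |entry| entry.content.downcase }.uniq
  tags.empty? ? nil : tags
end

item = SketchItem.new([Category.new("Ruby"), Category.new("ruby"), Category.new("Jekyll")])
p pick_tags(item, { "extract_tags" => "category" }) # => ["ruby", "jekyll"]
p pick_tags(item, { "tag" => "imported" })          # => "imported"
p pick_tags(item, {})                               # => nil
```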
