#ratt
RSS all the things!
ratt is a tool for converting websites to RSS/Atom feeds. It uses config files which define the extraction of the feed data by using css selectors or a Lua script.
Config files are in yaml format:

    #for automatic extraction, ratt tests all config files and matches the regex
    regex: https://videoportal.joj.sk/.*
    selectors:
      #settings for all http requests for the website
      httpsettings:
        cookie: {}
        header: {}
        useragent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.72 Safari/537.36
      #css selectors to get the feed data
      feed:
        title: .title.my-2
        description: .description
        authorname:
        authoremail:
      #css selectors to get item data
      item:
        #the item container
        container: article.b-article.title-xs.article-lp
        #all subsequent attributes of the item are selected from the subtree of the item container
        title: div.content > h3
        link: a
        linkattr: href
        created: .date
        createdformat: 2.1.2006
        description: div.col > .date
        image: img.img-fluid
        imageattr: data-original

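The `createdformat` value `2.1.2006` looks like a Go reference-time layout (day.month.year, without zero padding). Assuming ratt passes that string to Go's `time.Parse`, which this README does not confirm, a date picked up by the `created` selector would be parsed roughly as sketched here. The helper name `parseCreated` and the sample date are made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// parseCreated is a hypothetical helper (not part of ratt): it parses the
// text selected by `created` using the layout given in `createdformat`.
func parseCreated(layout, value string) (time.Time, error) {
	return time.Parse(layout, value)
}

func main() {
	// "2.1.2006" reads as day.month.year in Go's reference-time notation.
	created, err := parseCreated("2.1.2006", "17.4.2021")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Print the parsed date in ISO form.
	fmt.Println(created.Format("2006-01-02"))
}
```

If the site prints dates in another shape, only the layout string would need to change under this assumption, for example `2006-01-02` for ISO dates.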
#Configs
Config files are yaml files. ratt has some configs embedded. When calling, e.g., `ratt auto https://1337x.to/top-100`, ratt will try to find the config for the website URL. It searches the embedded config files, the current directory, and `~/.config/ratt/*.yml`.
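The lookup described above can be pictured as matching the URL against each config's `regex` key. The sketch below is only an illustration of that idea, not ratt's actual code: the `config` type, the `findConfig` function, the file names, and the "first match wins" rule are all assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// config is a hypothetical, trimmed-down stand-in for a ratt config file:
// only a file name and the `regex` key are modelled here.
type config struct {
	Name  string
	Regex string
}

// findConfig returns the first config whose regex matches the URL.
// Picking the first match is an assumption made for this sketch.
func findConfig(url string, configs []config) (config, bool) {
	for _, c := range configs {
		re, err := regexp.Compile(c.Regex)
		if err != nil {
			continue // skip configs with an invalid regex
		}
		if re.MatchString(url) {
			return c, true
		}
	}
	return config{}, false
}

func main() {
	// File names are invented; the regexes are the ones shown in this README.
	configs := []config{
		{Name: "joj.yml", Regex: `https://videoportal.joj.sk/.*`},
		{Name: "1337x.yml", Regex: `https://1337x.to/.*`},
	}
	if c, ok := findConfig("https://1337x.to/top-100", configs); ok {
		fmt.Println("using", c.Name)
	}
}
```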
#Installation
Install the latest version with go:

    go install git.sr.ht/~ghost08/ratt/cmd/ratt@latest

Install on Arch Linux from the AUR with your favorite helper:

    yay -S ratt-git

#Issues
File bugs and TODOs through the issue tracker or send an email
to ~ghost08/ratt@todo.sr.ht. For general discussion, use the
mailing list: ~ghost08/ratt@lists.sr.ht.
#Usage
ratt has three commands:
- `auto`: automatically searches for the config that will be used.
- `extract`: with other arguments, ratt will scrape the website to generate the RSS/Atom feed.
- `save`: if you have the right css selectors/Lua scripts, save the config to a yaml file.

    ratt save --feed-title=".featured-heading strong" --item-container=".table-list-wrap tbody tr" --item-title="a:nth-child(2)" --item-link='a = sel:find("a:nth-child(2)")
    itemURL = "https://1337x.to" .. a:attr("href")
    doc, err = goquery.newDocFromURL(itemURL)
    if err ~= nil then
      error(err)
    end
    link = doc:find("ul li a[onclick]"):first():attr("href")
    link = link:gsub("%s+", "")
    print(link)' --item-created=".coll-date" --item-created-format="" "https://1337x.to/.*" 1337x.yml

#How to read the feed?
That is a really good question. I'm glad you asked 🙂
You can pipe the feed directly to photon, which is a modern RSS/Atom reader. photon will play the media from your feed. It uses mpv and youtube-dl to automatically play videos, download torrents, view images and much more 🙂
So try this out:

    ratt auto https://1337x.to/top-100 | photon -

#Lua
If a css selector is not enough to get the wanted data, every feed and item attribute can also be written as a multiline value, and ratt will interpret it as a Lua script.
The Lua script gets some global variables to help with the extraction:
- `goquery`: a module imported by default; it is a subset of the well-known goquery library.
- `sel`: the selection object of the feed/item container, on which the selectors can be queried.
- `gojq`: a module imported by default; it is the gojq library.
- `setGlobal`: sets a global variable that will be visible in other Lua scripts. E.g., if `setGlobal("myvar", 1)` is called in the feed title, then in every subsequent item title, item link, ..., item image the variable will be visible and can be used: `print(myvar)`. Warning: note that setting a global variable in an item may lead to race conditions, as items are processed in parallel.
- `index`: the index of the item being processed.

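The warning about `setGlobal` inside item scripts can be illustrated with a small simulation. This is not ratt's implementation; it only assumes, as stated above, that items are processed in parallel. The function name `processItems` is hypothetical. Each simulated item writes its own index into one shared variable, and which write survives depends on scheduling.

```go
package main

import (
	"fmt"
	"sync"
)

// processItems simulates n items being processed in parallel, each one
// "setting a global" to its own index. The mutex keeps the program free of
// data races, but which item writes last still depends on scheduling.
func processItems(n int) int {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		global = -1
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(index int) {
			defer wg.Done()
			mu.Lock()
			global = index // last writer wins
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return global
}

func main() {
	// The printed index may differ from run to run.
	fmt.Println("last writer was item", processItems(8))
}
```

Setting globals once in a feed attribute and only reading them in item scripts avoids this kind of ordering dependence.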
ratt takes the stdout of the Lua script and inserts it as the data of the feed/item. When an error has occurred, just use the `error` function.
#examples
Calling another link, parsing it to a goquery.Doc and querying the new doc:

    item:
      #get the item container html element
      container: .table-list-wrap tbody tr
      #get the title element in the item container
      title: a:nth-child(2)
      #lua script
      link: |-
        --sel is the item container element, find the anchor
        a = sel:find("a:nth-child(2)")
        --get the href attribute of the anchor and build an item url from it
        itemURL = "https://1337x.to" .. a:attr("href")
        --request the item url and parse the doc
        doc, err = goquery.newDocFromURL(itemURL)
        if err ~= nil then
          --return error if the request was unsuccessful
          error(err)
        end
        --find the item link you want
        link = doc:find("ul li a[onclick]"):first():attr("href")
        --clean space characters
        link = link:gsub("%s+", "")
        --and finally print the link out so ratt can include it in the item.link
        print(link)

You can also parse and query json data, with the help of the awesome gojq library:

    feed:
      title: .title
      description: |-
        --find the