Show HN: Ratt – RSS All the Things

#ratt

RSS all the things!

ratt is a tool for converting websites to RSS/Atom feeds. It uses config files that define how the feed data is extracted, using CSS selectors or Lua scripts.

Config files are in YAML format:

#for automatic extraction, ratt tests all config files and matches the regex
regex:  https://videoportal.joj.sk/.*
selectors: 
    #settings for all http requests for the website
    httpsettings: 
        cookie:  {}
        header:  {}
        useragent:  Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.72 Safari/537.36
    #css selectors to get the feed data
    feed: 
        title:  .title.my-2
        description:  .description
        authorname: 
        authoremail: 
    #css selectors to get item data
    item: 
        #the item container
        container:  article.b-article.title-xs.article-lp
        #all subsequent attributes of the item are selected from the subtree of the item container
        title:  div.content > h3
        link:  a
        linkattr:  href
        created:  .date
        createdformat:  2.1.2006
        description:  div.col > .date
        image:  img.img-fluid
        imageattr:  data-original
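
The createdformat value 2.1.2006 looks like a Go-style reference date layout (day.month.year, no zero padding). That reading is an inference from the value itself, not something the text above states. Assuming it holds, a minimal Python sketch of the equivalent parse could look like this (CREATED_FORMAT and parse_created are hypothetical names, not part of ratt):

```python
from datetime import datetime

# Assumption: "2.1.2006" means day.month.year without zero padding.
CREATED_FORMAT = "%d.%m.%Y"  # hypothetical strptime equivalent of createdformat


def parse_created(text):
    """Parse the text of the `created` element into a datetime."""
    return datetime.strptime(text.strip(), CREATED_FORMAT)


print(parse_created("5.3.2021").date())  # prints 2021-03-05
```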

#Configs

Config files are YAML files. ratt has some configs embedded. When calling e.g. ratt auto https://1337x.to/top-100, ratt tries to find the config for the website URL: it searches the embedded config files, the current directory, and ~/.config/ratt/*.yml.
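
The lookup described above can be sketched roughly as follows. This is not ratt's implementation, only an illustration of the idea in Python. The function names, the (name, regex) pairs standing in for parsed YAML, the *.yml glob for the current directory, the start-anchored match, and the sample URL are all assumptions.

```python
import re
from pathlib import Path


def on_disk_candidates(cwd, home):
    """Config files on disk, in the order the text lists them."""
    # Assumption: the current directory is also scanned for *.yml files.
    files = sorted(Path(cwd).glob("*.yml"))
    files += sorted((Path(home) / ".config" / "ratt").glob("*.yml"))
    return files


def pick_config(url, configs):
    """Return the name of the first config whose regex matches the URL.

    `configs` is a list of (name, regex) pairs, a stand-in for parsed YAML.
    """
    for name, pattern in configs:
        # Assumption: the regex is matched from the start of the URL.
        if re.match(pattern, url):
            return name
    return None


configs = [
    ("joj", r"https://videoportal.joj.sk/.*"),
    ("1337x", r"https://1337x.to/.*"),
]
print(pick_config("https://videoportal.joj.sk/some-show", configs))  # prints joj
```

Embedded configs would simply come first in the `configs` list, so they win when several patterns match.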

#Installation

Install the latest version with go:

go install git.sr.ht/~ghost08/ratt/cmd/ratt@latest

Install on Arch Linux from the AUR with your favorite helper:

yay -S ratt-git

#Issues

File bugs and TODOs via the issue tracker or send an email
to ~ghost08/ratt@todo.sr.ht. For general discussion, use the
mailing list: ~ghost08/ratt@lists.sr.ht.

#Usage

ratt has three commands:

auto – automatically searches for the config that will be used.

extract – with other arguments, ratt will scrape the website to generate the RSS/Atom feed.

save – if you have the right CSS selectors/Lua scripts, save the config to a YAML file

ratt save --feed-title=".featured-heading strong" --item-container=".table-list-wrap tbody tr" --item-title="a:nth-child(2)" --item-link='a = sel:find("a:nth-child(2)")
itemURL = "https://1337x.to" .. a:attr("href")
doc, err = goquery.newDocFromURL(itemURL)
if err ~= nil then
    error(err)
end
link = doc:find("ul li a[onclick]"):first():attr("href")
link = link:gsub("%s+", "")
print(link)' --item-created=".coll-date" --item-created-format="" "https://1337x.to/.*" 1337x.yml
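
To make the item-link script above easier to follow, here is a rough Python rendering of its steps: build the item URL from the base and the href, load that page, take the href of the first link matching ul li a[onclick], and strip all whitespace from it. This is only an illustration. The class and function names and the sample HTML are made up, the network fetch is left out, and the standard-library parser only approximates the CSS selector.

```python
import re
from html.parser import HTMLParser


class FirstOnclickLink(HTMLParser):
    """Record the href of the first <a onclick=...> seen inside a <ul> and <li>."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # rough stack of currently open tags
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Approximation of "ul li a[onclick]": both ancestors must be open.
        inside_list = "ul" in self.open_tags and "li" in self.open_tags
        if tag == "a" and self.href is None and inside_list and "onclick" in attrs:
            self.href = attrs.get("href")
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags in between.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def item_url(base, href):
    """Same string concatenation as the script: base followed by the relative href."""
    return base + href


def extract_link(detail_html):
    """Return the first matching href with all whitespace removed."""
    parser = FirstOnclickLink()
    parser.feed(detail_html)
    if parser.href is None:
        raise ValueError("no matching link found")
    return re.sub(r"\s+", "", parser.href)


# Made-up detail page, only for illustration.
sample = '<ul><li><a href="/skip">x</a></li><li><a onclick="go()" href=" magnet:?xt=abc ">get</a></li></ul>'
print(extract_link(sample))
```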

That is a really good question. I'm glad you asked 🙂

You can feed the feed directly to photon, which is a modern RSS/Atom reader. photon will play the media from your feed. It uses mpv and youtube-dl to automatically play videos, download torrents, view images and much more 🙂

So check this out:

ratt auto https://1337x.to/top-100 | photon -

photon 1337x screenshot

#Lua

If a CSS selector is not enough to get the desired data, every feed and item attribute can also be written as a multiline value, and ratt will interpret it as a Lua script.

The Lua script gets some global variables to help with the extraction:

goquery is a module imported by default; it is a subset of the well-known goquery library

sel is the selection object of the feed/item container, on which selectors can be queried

gojq is a module imported by default; it is the gojq library

setGlobal sets a global variable that will be visible in other Lua scripts. E.g. if setGlobal("myvar", 1) is called in the feed title, then in every subsequent item title, item link, …, item image the variable will be visible and can be used: print(myvar). Warning: please note that setting a global variable in an item may end up in race conditions, as items are processed in parallel.

index is the number of the item being processed
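
The warning about globals suggests a simple pattern: write shared values once while the feed attributes are evaluated, and only read them from item scripts. The sketch below illustrates that pattern in Python with a thread pool. It is an analogy, not ratt's code. The names, the dictionary standing in for Lua globals, and the zero-based index are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

shared = {}  # stand-in for globals set via setGlobal


def feed_title():
    # Written exactly once, before any item is processed.
    shared["prefix"] = "joj"
    return "My feed"


def item_title(index):
    # Items only read the shared value, so parallel execution is safe.
    return "{}-{}".format(shared["prefix"], index)


feed_title()
with ThreadPoolExecutor(max_workers=4) as pool:
    # pool.map returns results in input order, whatever order the items finish in.
    titles = list(pool.map(item_title, range(3)))

print(titles)  # prints ['joj-0', 'joj-1', 'joj-2']
```

If item scripts wrote to the shared value instead, the result would depend on which item happened to run last.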

ratt takes the stdout of the Lua script and inserts it as the data of the feed/item. When an error has occurred, just use the error function.
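
The stdout convention can be pictured with a small Python sketch: whatever the script prints becomes the attribute value, and a raised error aborts the extraction. Python callables stand in for Lua snippets here. run_attribute_script and the sample scripts are hypothetical, and stripping the trailing newline is an assumption about how the output is cleaned up.

```python
import contextlib
import io


def run_attribute_script(script):
    """Run `script` and return whatever it printed; errors propagate to the caller.

    `script` is a Python callable standing in for a Lua snippet.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        script()  # an exception here plays the role of Lua's error()
    # Assumption: surrounding whitespace such as the trailing newline is dropped.
    return buffer.getvalue().strip()


def good_script():
    print("https://example.com/item/1")


def failing_script():
    raise RuntimeError("request failed")


print(run_attribute_script(good_script))
try:
    run_attribute_script(failing_script)
except RuntimeError as exc:
    print("extraction failed:", exc)
```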

#examples

Calling another link, parsing it to a goquery.Doc and querying the new doc:

You can also parse and query JSON data with the help of the awesome gojq library:

feed: 
    title:  .title
    description:  |-
        --find the 

Written by Charlie Layers