A simple HTML scraping script to query for magnet links on BitTorrent sites.
Searching for BitTorrent content
Before you can download something with BitTorrent you need to find the .torrent
file or magnet link for the content.
This search is usually done through one of the many BitTorrent hosting sites. The problem with these sites is that they often require JavaScript to be enabled, and I would not trust them to be safe.
Rust command-line script
I found some JavaScript scripts on GitHub that can search these sites, but they would be difficult to install and run as command-line programs.
I have written a simple single-file Rust script that can be run with my denim scripting crate.
#!/usr/bin/env denim
/* Cargo.toml
[package]
name = "search-torrent"
version = "0.1.0"
authors = ["Anonymous"]
edition = "2018"
[dependencies]
cotton = "0.0.7"
structopt = "0.3.2"
reqwest = { version = "0.10.8", features = ["blocking"] }
url = "2.1.1"
scraper = "0.12.0"
*/
use cotton::prelude::*;
use url::Url;
use scraper::{Html, Selector};
/// Searches for torrent magnet links given a search term.
#[derive(Debug, StructOpt)]
struct Cli {
    #[structopt(flatten)]
    logging: LoggingOpt,

    #[structopt()]
    query: String,
}
fn main() -> FinalResult {
    let args = Cli::from_args();
    init_logger(&args.logging, vec![module_path!()]);

    // https://github.com/JimmyLaurent/torrent-search-api/blob/master/lib/providers/1337x.js
    let base = Url::parse("https://www.1337x.to/")?;
    let search = base.join("search/")?;

    info!("Searching for {:?} on {}", args.query, base);

    // CSS selectors for a result row and the fields inside it.
    let item = Selector::parse("tbody > tr").unwrap();
    let title = Selector::parse("a:nth-child(2)").unwrap();
    let link = Selector::parse("a:nth-child(2)").unwrap();
    let time = Selector::parse(".coll-date").unwrap();
    let seeds = Selector::parse(".seeds").unwrap();
    let peers = Selector::parse(".leeches").unwrap();
    let size = Selector::parse(".size").unwrap();

    // Walk the result pages until one comes back without any rows.
    for page in 1.. {
        let url = search
            .join(&format!("{}/", &args.query))?
            .join(&format!("{}/", page))?;
        let resp = reqwest::blocking::get(url)?;
        let body = resp.text()?;
        debug!("{}", body);

        let html = Html::parse_document(&body);
        let items = html.select(&item).collect_vec();
        if items.is_empty() {
            break;
        }

        for item in items {
            debug!("{}", item.inner_html());

            let link = item.select(&link).next().ok_or("no link found")?.value().attr("href").ok_or("no href found")?;
            let title = item.select(&title).next().ok_or("no title found")?.inner_html();
            let time = item.select(&time).next().ok_or("no time found")?.inner_html();
            let seeds = item.select(&seeds).next().ok_or("no seeds found")?.inner_html();
            let peers = item.select(&peers).next().ok_or("no peers found")?.inner_html();
            let size = item.select(&size).next().ok_or("no size found")?.inner_html();
            // Keep only the text before the first nested tag in the size cell.
            let size = size.splitn(2, "<").next().unwrap();

            info!("[{}, {}, seeds: {}, peers: {}] {}", time, size, seeds, peers, title);

            // Fetch the description page and print every unique magnet link on it.
            let desc = base.join(link)?;
            let resp = reqwest::blocking::get(desc)?;
            let body = resp.text()?;
            let html = Html::parse_document(&body);

            let links = Selector::parse("a").unwrap();
            let magnets = html
                .select(&links)
                .filter_map(|a| a.value().attr("href"))
                .filter(|href| href.starts_with("magnet:"))
                .sorted()
                .dedup();

            for magnet in magnets {
                println!("{}", magnet);
            }
        }
    }

    Ok(())
}
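The last step of the script (keep only magnet: links, sort them, drop duplicates) can be tried on its own without any network access. Below is a minimal sketch of that step using only the standard library. The helper name unique_magnets and the sample hrefs are made up for illustration and are not part of the script above, and a BTreeSet stands in for the sorted().dedup() chain.

```rust
use std::collections::BTreeSet;

/// Keeps only `magnet:` links, sorted and without duplicates.
/// Hypothetical helper for illustration; the script does this inline.
fn unique_magnets<'a>(hrefs: &[&'a str]) -> Vec<&'a str> {
    hrefs
        .iter()
        .copied()
        .filter(|href| href.starts_with("magnet:"))
        // A BTreeSet orders its elements and ignores repeated inserts.
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect()
}

fn main() {
    // Made-up href values such as a description page might contain.
    let hrefs = [
        "/home/",
        "magnet:?xt=urn:btih:bbbb",
        "magnet:?xt=urn:btih:aaaa",
        "magnet:?xt=urn:btih:bbbb",
    ];
    for magnet in unique_magnets(&hrefs) {
        println!("{}", magnet);
    }
}
```

Description pages tend to repeat the same magnet link in several places, which is why the duplicates are dropped before printing.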
Installation and usage
I am assuming you have Rust installed.
Install denim from cargo:
cargo install denim
Copy the content of the script to a file named search-torrent
and make it executable:
vim search-torrent
chmod +x search-torrent
Now you can run the script as-is, but if you want to see the build progress you can run it via denim with:
denim exec search-torrent -- -h
Once the build is complete you can perform your searches:
./search-torrent -v "Debian"
You should see a list of Debian ISO files along with their magnet links.
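If you want to post-process the printed magnet links, for example to compare two results, the info hash is the useful part. Here is a rough sketch that pulls it out of the xt=urn:btih: parameter using only the standard library. The function name info_hash and the sample link are hypothetical, and the sketch assumes the common magnet:?key=value&key=value layout without doing any percent-decoding.

```rust
/// Returns the BitTorrent info hash from a magnet link, if one is present.
/// Hypothetical helper; assumes the `xt=urn:btih:<hash>` form.
fn info_hash(magnet: &str) -> Option<&str> {
    let query = magnet.strip_prefix("magnet:?")?;
    query
        .split('&')
        // Split each `key=value` pair; pairs without `=` are skipped.
        .filter_map(|pair| pair.split_once('='))
        .find_map(|(key, value)| {
            if key == "xt" {
                value.strip_prefix("urn:btih:")
            } else {
                None
            }
        })
}

fn main() {
    // Made-up link; a real info hash is much longer.
    let magnet = "magnet:?xt=urn:btih:abc123&dn=debian.iso";
    match info_hash(magnet) {
        Some(hash) => println!("info hash: {}", hash),
        None => println!("no info hash found"),
    }
}
```

Two magnet links with the same info hash point at the same content even if their names or tracker lists differ.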