# Introduction to Implementing the Project in Rust: ProtonVPN Integration & Session Management

## Table of Contents

1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Dependencies](#dependencies)
4. [Core Modules](#core-modules)
5. [Implementation Steps](#implementation-steps)
6. [Configuration](#configuration)
7. [Error Handling & Best Practices](#error-handling--best-practices)

---

## Overview

This document is a detailed guide to implementing a **session management system** in which each session uses a different external IP address. The system is implemented in Rust and uses the **ProtonVPN Chrome extension** for IP rotation.

### Goals

- ✅ Manage sessions with distinct external IP addresses
- ✅ ChromeDriver pool with a configurable pool size
- ✅ Automate the ProtonVPN extension (connect/disconnect)
- ✅ Rotate IPs between sessions
- ✅ Route only browser traffic through the VPN (not system-wide)
- ✅ Flexible configuration via `config.rs`

### Why this approach?

| Aspect | Rationale |
|--------|-----------|
| **ProtonVPN extension** | Routes only browser traffic through the VPN; no system-wide VPN is needed |
| **thirtyfour/fantoccini** | Selenium-like browser automation in Rust |
| **Automated extension** | Allows VPN connections to be controlled programmatically |
| **Pool management** | One group of ChromeDriver instances per session, so all instances share the same IP within a session |
| **Flexible rotation** | Configurable: after X tasks or between phases (economic/corporate) |

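
The two rotation modes from the last table row can be captured in a small type. The following is a minimal sketch, assuming a hypothetical `RotationPolicy` enum that is not part of the modules described in this guide; it only mirrors the convention that `tasks_per_vpn_session = 0` means "rotate between phases".

```rust
/// When the VPN session should be rotated (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RotationPolicy {
    /// Rotate once this many tasks have run in the current session.
    AfterTasks(usize),
    /// Rotate only when the pipeline moves from one phase to the next.
    BetweenPhases,
}

impl RotationPolicy {
    /// Maps the `tasks_per_vpn_session` setting to a policy; 0 means "between phases".
    pub fn from_tasks_per_session(tasks_per_session: usize) -> Self {
        if tasks_per_session == 0 {
            RotationPolicy::BetweenPhases
        } else {
            RotationPolicy::AfterTasks(tasks_per_session)
        }
    }

    /// Decides whether to rotate, given the tasks completed in the current
    /// session and whether a phase boundary was just reached.
    pub fn should_rotate(self, tasks_done: usize, phase_changed: bool) -> bool {
        match self {
            RotationPolicy::AfterTasks(limit) => tasks_done >= limit,
            RotationPolicy::BetweenPhases => phase_changed,
        }
    }
}
```

Keeping the decision in one place means the task loop only has to report how many tasks ran and whether a phase just ended.
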
### Limitations & Assumptions

**Limitations:**
- ProtonVPN servers use load balancing, so selecting the same server does not guarantee exactly the same IP
- A quick reconnect typically yields the same or a similar IP, however
- For a strict IP guarantee, consider alternative proxy services (not covered in this guide)

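
Because a reconnect may or may not change the exit IP, it can help to compare the address before and after a rotation. The sketch below assumes a hypothetical `check_rotation` helper and uses only the standard library; the addresses in any usage are placeholders.

```rust
use std::net::IpAddr;

/// Outcome of comparing the exit IP before and after a rotation.
#[derive(Debug, PartialEq, Eq)]
pub enum RotationCheck {
    Changed,
    Unchanged,
    /// At least one of the inputs was not a valid IP address.
    Invalid,
}

/// Compares two textual IP addresses, ignoring surrounding whitespace.
pub fn check_rotation(previous: &str, current: &str) -> RotationCheck {
    match (
        previous.trim().parse::<IpAddr>(),
        current.trim().parse::<IpAddr>(),
    ) {
        (Ok(a), Ok(b)) if a == b => RotationCheck::Unchanged,
        (Ok(_), Ok(_)) => RotationCheck::Changed,
        _ => RotationCheck::Invalid,
    }
}
```

An `Unchanged` result could then trigger another reconnect or a switch to the next server in the list.
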
**Assumptions:**
- ✓ A ProtonVPN account exists (free or paid)
- ✓ A Rust toolchain is installed (Cargo, rustup)
- ✓ The Chrome and ChromeDriver versions are compatible
- ✓ Cross-platform (Windows/Linux/macOS); extension automation works best on desktop
- ✓ No packages beyond commonly used crates

---

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                           main.rs                           │
│      (load config, initialize sessions, manage tasks)       │
└──────────────────────────────┬──────────────────────────────┘
                               │
                ┌──────────────┴──────────────┐
                │                             │
     ┌──────────▼──────────┐     ┌────────────▼────────────┐
     │   Session Manager   │     │    ChromeDriver Pool    │
     │    (IP rotation)    │     │  (WebDriver instances)  │
     └──────────┬──────────┘     └────────────┬────────────┘
                │                             │
     ┌──────────▼──────────┐     ┌────────────▼────────────┐
     │   ProtonVPN Ext.    │     │   Browser Automation    │
     │      Automater      │     │ (fantoccini/thirtyfour) │
     └──────────┬──────────┘     └────────────┬────────────┘
                │                             │
     ┌──────────▼─────────────────────────────▼────────────┐
     │       Chrome with ProtonVPN extension loaded        │
     │              (browser traffic via VPN)              │
     └─────────────────────────────────────────────────────┘
```

### Components

1. **Config Manager** (`config.rs`)
   - Loads settings from `.env`
   - Defines `max_parallel_tasks`, `tasks_per_vpn_session`, `vpn_servers`

2. **Session Manager** (new: `scraper/vpn_session.rs`)
   - Manages VPN sessions
   - Rotates servers/IPs between sessions
   - Tracks the current IP, session start, and task counter

3. **ProtonVPN Automater** (new: `scraper/protonvpn_extension.rs`)
   - Interacts with the ProtonVPN extension in the browser
   - Connects and disconnects
   - Verifies the IP via `whatismyipaddress.com` or a similar service

4. **ChromeDriver Pool** (extended: `scraper/webdriver.rs`)
   - Manages the pool instances
   - Creates sessions with the ProtonVPN extension

5. **Task Manager** (extended: `main.rs`)
   - Coordinates sessions and tasks
   - Triggers IP rotation when needed

---

## Dependencies

Check your `Cargo.toml`. The following crates are required:

```toml
[dependencies]
tokio = { version = "1.38", features = ["full"] }
fantoccini = { version = "0.20", features = ["rustls-tls"] }  # WebDriver client
reqwest = { version = "0.12", features = ["json", "gzip", "brotli", "deflate", "blocking"] }
scraper = "0.19"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
anyhow = "1.0"
chrono = { version = "0.4", features = ["serde"] }
dotenvy = "0.15"
toml = "0.9.8"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["fmt", "env-filter"] }
futures = "0.3"
regex = "1"                                    # used by get_current_ip
uuid = { version = "1", features = ["v4"] }    # used for session IDs in Step 3
```

**No packages beyond these are required.** Note that `regex` and `uuid` must be listed because the modules below call `regex::Regex` and `uuid::Uuid`.

---

## Core Modules

### 1. **config.rs** (extensions)

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Config {
    pub enable_vpn_rotation: bool,
    pub vpn_servers: String,           // "US,JP,DE" or "server1,server2,server3"
    pub tasks_per_vpn_session: usize,  // tasks per session (0 = rotate between phases)
    pub max_tasks_per_instance: usize, // tasks per ChromeDriver instance
    pub max_parallel_tasks: usize,
    // ... further fields
}

impl Config {
    /// Splits the comma-separated `vpn_servers` string into trimmed, non-empty names.
    pub fn get_vpn_server_list(&self) -> Vec<String> {
        self.vpn_servers
            .split(',')
            .map(|s| s.trim().to_string())
            .filter(|s| !s.is_empty())
            .collect()
    }
}
```

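
To make the parsing behaviour of `get_vpn_server_list` concrete, here is the same logic as a free function. `parse_server_list` is a hypothetical name used only for this illustration; it shows that surrounding whitespace and empty entries are dropped.

```rust
/// Splits a comma-separated server string into trimmed, non-empty names.
pub fn parse_server_list(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_string)
        .collect()
}
```

For example, an input of `" US, JP ,,DE "` yields the three entries `US`, `JP`, and `DE`, and an empty string yields an empty list.
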
### 2. **scraper/vpn_session.rs** (NEW)

Manages VPN sessions and IP rotation:

```rust
use chrono::{DateTime, Utc};
use std::sync::Arc;
use tokio::sync::Mutex;

#[derive(Debug, Clone)]
pub struct VpnSessionConfig {
    pub server: String,
    pub session_id: String,
    pub created_at: DateTime<Utc>,
    pub current_ip: Option<String>,
    pub task_count: usize,
    pub max_tasks: usize,
}

pub struct VpnSessionManager {
    pub current_session: Arc<Mutex<Option<VpnSessionConfig>>>,
    pub servers: Vec<String>,
    pub server_index: Arc<Mutex<usize>>,
    pub tasks_per_session: usize,
}

impl VpnSessionManager {
    pub fn new(servers: Vec<String>, tasks_per_session: usize) -> Self {
        Self {
            current_session: Arc::new(Mutex::new(None)),
            servers,
            server_index: Arc::new(Mutex::new(0)),
            tasks_per_session,
        }
    }

    /// Creates a new VPN session on the next server (round-robin).
    pub async fn create_new_session(&self) -> anyhow::Result<String> {
        // Guard against an empty list: `% 0` below would panic.
        anyhow::ensure!(!self.servers.is_empty(), "no VPN servers configured");

        let mut index = self.server_index.lock().await;
        let server = self.servers[*index % self.servers.len()].clone();
        *index += 1;

        let session_id = format!("session_{}_{}", server, Utc::now().timestamp_millis());

        let session = VpnSessionConfig {
            server,
            session_id: session_id.clone(),
            created_at: Utc::now(),
            current_ip: None,
            task_count: 0,
            max_tasks: self.tasks_per_session,
        };

        *self.current_session.lock().await = Some(session);
        Ok(session_id)
    }

    /// Increments the task counter; returns `true` when a new session is due.
    pub async fn increment_task_count(&self) -> bool {
        let mut session = self.current_session.lock().await;

        if let Some(s) = session.as_mut() {
            s.task_count += 1;
            return self.tasks_per_session > 0 && s.task_count >= self.tasks_per_session;
        }
        false
    }

    pub async fn get_current_session(&self) -> Option<VpnSessionConfig> {
        self.current_session.lock().await.clone()
    }

    pub async fn set_current_ip(&self, ip: String) {
        if let Some(session) = self.current_session.lock().await.as_mut() {
            session.current_ip = Some(ip);
        }
    }
}
```

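
The round-robin selection inside `create_new_session` can be looked at in isolation. This is a simplified, single-threaded sketch without the `Mutex`; `ServerRotation` is a hypothetical name and not one of the guide's types.

```rust
/// Round-robin cursor over a fixed server list (illustration only).
pub struct ServerRotation {
    servers: Vec<String>,
    next: usize,
}

impl ServerRotation {
    pub fn new(servers: Vec<String>) -> Self {
        Self { servers, next: 0 }
    }

    /// Returns the next server, wrapping around at the end of the list.
    /// Returns `None` for an empty list instead of panicking on `% 0`.
    pub fn next_server(&mut self) -> Option<&str> {
        if self.servers.is_empty() {
            return None;
        }
        let current = self.next;
        // Keep the cursor bounded so it can never overflow.
        self.next = (current + 1) % self.servers.len();
        Some(self.servers[current].as_str())
    }
}
```

With the servers `US` and `JP`, successive calls return `US`, `JP`, `US`, and so on.
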
### 3. **scraper/protonvpn_extension.rs** (NEW)

Automates the ProtonVPN extension in the browser:

```rust
use anyhow::{anyhow, Context, Result};
use fantoccini::{Client, Locator};
use tokio::time::{sleep, Duration};

pub struct ProtonVpnAutomater {
    extension_id: String, // ProtonVPN extension ID
}

impl ProtonVpnAutomater {
    /// Creates the ProtonVPN automater.
    /// `extension_id`: the Chrome extension ID (e.g. "abcdef123456...")
    pub fn new(extension_id: String) -> Self {
        Self { extension_id }
    }

    /// Disconnects from ProtonVPN.
    pub async fn disconnect(&self, client: &Client) -> Result<()> {
        // Open the extension page
        let extension_url = format!("chrome-extension://{}/popup.html", self.extension_id);
        client
            .goto(&extension_url)
            .await
            .context("Failed to navigate to ProtonVPN extension")?;

        sleep(Duration::from_millis(500)).await;

        // Find and click the "Disconnect" button.
        // The selectors depend on the extension version.
        let disconnect_btn = client
            .find(Locator::XPath(
                "//button[contains(text(), 'Disconnect')] | //button[@data-action='disconnect']",
            ))
            .await;

        match disconnect_btn {
            Ok(elem) => {
                elem.click().await.context("Failed to click Disconnect button")?;
                sleep(Duration::from_secs(2)).await; // wait for the disconnect
                Ok(())
            }
            Err(_) => {
                // Possibly already disconnected
                tracing::warn!("Disconnect button not found, may be already disconnected");
                Ok(())
            }
        }
    }

    /// Connects to the given ProtonVPN server.
    pub async fn connect_to_server(&self, client: &Client, server: &str) -> Result<()> {
        let extension_url = format!("chrome-extension://{}/popup.html", self.extension_id);
        client
            .goto(&extension_url)
            .await
            .context("Failed to navigate to ProtonVPN extension")?;

        sleep(Duration::from_millis(500)).await;

        // Open the server list (depends on the extension UI)
        let server_list = client
            .find(Locator::XPath(
                "//button[@data-action='select-servers'] | //div[@class='server-list']",
            ))
            .await;

        if server_list.is_ok() {
            // Click the entry for `server`
            let server_option = client
                .find(Locator::XPath(&format!(
                    "//div[@data-server='{}'] | //button[contains(text(), '{}')]",
                    server, server
                )))
                .await;

            if let Ok(elem) = server_option {
                elem.click().await.context("Failed to click server option")?;
                sleep(Duration::from_millis(500)).await;
            }
        }

        // Click the Connect button
        let connect_btn = client
            .find(Locator::XPath(
                "//button[contains(text(), 'Connect')] | //button[@data-action='connect']",
            ))
            .await
            .context("Failed to find Connect button")?;

        connect_btn.click().await.context("Failed to click Connect button")?;

        // Poll until the connection is up (20 polls, roughly 10 s plus page loads)
        for _ in 0..20 {
            sleep(Duration::from_millis(500)).await;
            if self.is_connected(client).await.unwrap_or(false) {
                return Ok(());
            }
        }

        Err(anyhow!("Failed to connect to ProtonVPN server: {}", server))
    }

    /// Checks whether the VPN is connected.
    pub async fn is_connected(&self, client: &Client) -> Result<bool> {
        let extension_url = format!("chrome-extension://{}/popup.html", self.extension_id);
        client
            .goto(&extension_url)
            .await
            .context("Failed to navigate to extension")?;

        sleep(Duration::from_millis(200)).await;

        let status = client
            .find(Locator::XPath(
                "//*[contains(text(), 'Connected')] | //*[@data-status='connected']",
            ))
            .await;

        Ok(status.is_ok())
    }

    /// Fetches the current external IP address.
    pub async fn get_current_ip(&self, client: &Client) -> Result<String> {
        // Navigate to the IP check page
        client
            .goto("https://whatismyipaddress.com/")
            .await
            .context("Failed to navigate to IP check site")?;

        sleep(Duration::from_secs(1)).await;

        // Extract the IP address from the HTML
        let body = client.source().await.context("Failed to get page source")?;

        // Simple IPv4 pattern (requires the `regex` crate); it returns the
        // first dotted quad on the page and does not validate octet ranges.
        let re = regex::Regex::new(r"\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b")?;

        re.captures(&body)
            .and_then(|caps| caps.get(1))
            .map(|ip| ip.as_str().to_string())
            .ok_or_else(|| anyhow!("Failed to extract IP from page"))
    }
}
```

**Note:** This code uses `fantoccini`. If you prefer `thirtyfour`, replace the WebDriver calls accordingly.

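
A plain `\d{1,3}` pattern also matches strings such as `999.1.2.3` or version numbers embedded in the page. As an alternative sketch that relies only on the standard library, candidate tokens can be validated with `std::net::Ipv4Addr`; the helper name `first_ipv4` is an assumption made for this example.

```rust
use std::net::Ipv4Addr;

/// Returns the first token in `text` that parses as a valid IPv4 address.
/// Tokens are maximal runs of ASCII digits and dots.
pub fn first_ipv4(text: &str) -> Option<Ipv4Addr> {
    text.split(|c: char| !(c.is_ascii_digit() || c == '.'))
        .filter(|token| !token.is_empty())
        // Strip dots left over from sentence punctuation, then validate.
        .find_map(|token| token.trim_matches('.').parse::<Ipv4Addr>().ok())
}
```

Because `Ipv4Addr` rejects octets above 255, a token like `999.1.2.3` is skipped and the search continues with the next candidate.
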
### 4. **scraper/webdriver.rs** (extensions)

Extend `ChromeDriverPool` with support for the ProtonVPN extension:

```rust
use anyhow::Result;
use std::process::Stdio;
use std::sync::Arc;
// Assumption: the existing pool code spawns ChromeDriver with tokio's process API.
use tokio::process::{Child, Command};
use tokio::sync::{Mutex, Semaphore};

use crate::scraper::protonvpn_extension::ProtonVpnAutomater;

pub struct ChromeDriverPool {
    instances: Vec<Arc<Mutex<ChromeInstance>>>,
    semaphore: Arc<Semaphore>,
    protonvpn_automater: Option<ProtonVpnAutomater>,
    enable_vpn: bool,
}

impl ChromeDriverPool {
    pub async fn new_with_vpn(
        pool_size: usize,
        enable_vpn: bool,
        extension_id: Option<String>,
    ) -> Result<Self> {
        // ... existing code ...

        // Only build the automater when VPN support is switched on.
        let protonvpn_automater = if enable_vpn {
            extension_id.map(ProtonVpnAutomater::new)
        } else {
            None
        };

        Ok(Self {
            instances,
            semaphore: Arc::new(Semaphore::new(pool_size)),
            protonvpn_automater,
            enable_vpn,
        })
    }

    pub fn get_protonvpn_automater(&self) -> Option<&ProtonVpnAutomater> {
        self.protonvpn_automater.as_ref()
    }
}

pub struct ChromeInstance {
    process: Child,
    base_url: String,
    extension_path: Option<String>, // path to the ProtonVPN extension
}

impl ChromeInstance {
    pub async fn new_with_extension(extension_path: Option<String>) -> Result<Self> {
        let mut command = Command::new("chromedriver-win64/chromedriver.exe");
        command
            .arg("--port=0")
            .stdout(Stdio::piped())
            .stderr(Stdio::piped());

        // The extension path is not a ChromeDriver argument. It is stored here
        // and applied later, when Chrome itself is started with --load-extension
        // through the session capabilities.

        // ... rest of initialization ...

        Ok(Self {
            process,
            base_url,
            extension_path,
        })
    }
}
```

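
Since `--load-extension` is a Chrome switch, it has to reach Chrome through the `args` array of the `goog:chromeOptions` capability when the WebDriver session is created. The sketch below only builds that argument list; `chrome_args` is a hypothetical helper, and recent branded Chrome builds may restrict loading extensions from the command line, so treat it as a starting point to verify against your Chrome version.

```rust
/// Builds the Chrome command-line arguments for a session (hypothetical helper).
/// The result belongs in `goog:chromeOptions.args`, not on the chromedriver
/// command line.
pub fn chrome_args(extension_path: Option<&str>) -> Vec<String> {
    let mut args = vec!["--no-first-run".to_string()];
    if let Some(path) = extension_path {
        // Disable every extension except the one loaded explicitly.
        args.push(format!("--disable-extensions-except={path}"));
        args.push(format!("--load-extension={path}"));
    }
    args
}
```

With `serde_json`, the list could then be embedded as `{"goog:chromeOptions": {"args": [...]}}` in the capabilities passed to the WebDriver client.
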
---

## Implementation Steps

### Step 1: Configure the environment

**Create or extend the `.env` file:**

```env
# Existing
ECONOMIC_START_DATE=2007-02-13
CORPORATE_START_DATE=2010-01-01
ECONOMIC_LOOKAHEAD_MONTHS=3
MAX_PARALLEL_TASKS=3

# VPN configuration (new)
ENABLE_VPN_ROTATION=true
# Quoted because the names contain '#', which some .env parsers treat as a comment start
VPN_SERVERS="US-Free#1,US-Free#2,UK-Free#1,JP-Free#1"
# Tasks per session (0 = rotate between phases)
TASKS_PER_VPN_SESSION=5
# Extension ID as used in this guide; verify it under chrome://extensions
PROTONVPN_EXTENSION_ID=ghmbeldphafepmbegfdlkpapadhbakde
```

**Or, alternatively, in `config.toml`:**

```toml
enable_vpn_rotation = true
vpn_servers = "US-Free#1,US-Free#2,UK-Free#1"
tasks_per_vpn_session = 5
protonvpn_extension_id = "ghmbeldphafepmbegfdlkpapadhbakde"
```

### Step 2: Extend the config structure

Update `src/config.rs`:

```rust
use anyhow::{Context, Result};
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Config {
    // ... existing fields ...

    pub enable_vpn_rotation: bool,
    pub vpn_servers: String,
    pub tasks_per_vpn_session: usize,
    pub protonvpn_extension_id: String,
}

impl Config {
    pub fn load() -> Result<Self> {
        // ... existing code ...

        // Missing variables fall back to a default; malformed values are errors.
        let enable_vpn_rotation = dotenvy::var("ENABLE_VPN_ROTATION")
            .unwrap_or_else(|_| "false".to_string())
            .parse::<bool>()
            .context("Failed to parse ENABLE_VPN_ROTATION")?;

        let vpn_servers = dotenvy::var("VPN_SERVERS").unwrap_or_default();

        let tasks_per_vpn_session: usize = dotenvy::var("TASKS_PER_VPN_SESSION")
            .unwrap_or_else(|_| "0".to_string())
            .parse()
            .context("Failed to parse TASKS_PER_VPN_SESSION")?;

        let protonvpn_extension_id = dotenvy::var("PROTONVPN_EXTENSION_ID")
            .unwrap_or_else(|_| "ghmbeldphafepmbegfdlkpapadhbakde".to_string());

        Ok(Self {
            // ... other fields ...
            enable_vpn_rotation,
            vpn_servers,
            tasks_per_vpn_session,
            protonvpn_extension_id,
        })
    }

    /// Splits the comma-separated `vpn_servers` string into trimmed, non-empty names.
    pub fn get_vpn_server_list(&self) -> Vec<String> {
        self.vpn_servers
            .split(',')
            .map(|s| s.trim().to_string())
            .filter(|s| !s.is_empty())
            .collect()
    }
}
```

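
The "read variable, fall back to a default, fail on malformed input" pattern repeats for each setting and could be factored out. The following is a sketch under the assumption of a hypothetical `parse_or_default` helper; unlike the code above, it also treats a blank value as "not set".

```rust
use std::str::FromStr;

/// Parses an optional raw setting. Returns `default` when the value is
/// missing or blank, and a descriptive error when it is malformed.
pub fn parse_or_default<T: FromStr>(
    name: &str,
    raw: Option<&str>,
    default: T,
) -> Result<T, String> {
    match raw.map(str::trim) {
        None | Some("") => Ok(default),
        Some(value) => value
            .parse::<T>()
            .map_err(|_| format!("Failed to parse {name}: {value:?}")),
    }
}
```

In `Config::load`, the raw value could come from `dotenvy::var("TASKS_PER_VPN_SESSION").ok()`, passed on with `.as_deref()`.
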
### Step 3: Create the VPN session module

**`src/scraper/vpn_session.rs`:**

```rust
use chrono::{DateTime, Utc};
use std::sync::Arc;
use tokio::sync::Mutex;
use uuid::Uuid;

#[derive(Debug, Clone)]
pub struct VpnSessionConfig {
    pub server: String,
    pub session_id: String,
    pub created_at: DateTime<Utc>,
    pub current_ip: Option<String>,
    pub task_count: usize,
    pub max_tasks: usize,
}

pub struct VpnSessionManager {
    pub current_session: Arc<Mutex<Option<VpnSessionConfig>>>,
    pub servers: Vec<String>,
    pub server_index: Arc<Mutex<usize>>,
    pub tasks_per_session: usize,
}

impl VpnSessionManager {
    pub fn new(servers: Vec<String>, tasks_per_session: usize) -> Self {
        Self {
            current_session: Arc::new(Mutex::new(None)),
            servers,
            server_index: Arc::new(Mutex::new(0)),
            tasks_per_session,
        }
    }

    /// Creates a new session on the next server (round-robin).
    pub async fn create_new_session(&self) -> anyhow::Result<String> {
        // Guard against an empty list: `% 0` below would panic.
        anyhow::ensure!(!self.servers.is_empty(), "no VPN servers configured");

        let mut index = self.server_index.lock().await;
        let server = self.servers[*index % self.servers.len()].clone();
        *index += 1;

        let session_id = Uuid::new_v4().to_string();

        let session = VpnSessionConfig {
            server,
            session_id: session_id.clone(),
            created_at: Utc::now(),
            current_ip: None,
            task_count: 0,
            max_tasks: self.tasks_per_session,
        };

        *self.current_session.lock().await = Some(session);
        tracing::info!("Created new VPN session: {}", session_id);
        Ok(session_id)
    }

    /// Returns `true` once the current session has reached its task limit.
    /// A limit of 0 disables task-based rotation.
    pub async fn should_rotate(&self) -> bool {
        let session = self.current_session.lock().await;

        if let Some(s) = session.as_ref() {
            if self.tasks_per_session > 0 && s.task_count >= self.tasks_per_session {
                return true;
            }
        }
        false
    }

    pub async fn increment_task_count(&self) {
        if let Some(session) = self.current_session.lock().await.as_mut() {
            session.task_count += 1;
        }
    }

    pub async fn get_current_session(&self) -> Option<VpnSessionConfig> {
        self.current_session.lock().await.clone()
    }

    pub async fn set_current_ip(&self, ip: String) {
        if let Some(session) = self.current_session.lock().await.as_mut() {
            // Log before `ip` is moved into the session.
            tracing::info!("Session {} IP set to: {}", session.session_id, ip);
            session.current_ip = Some(ip);
        }
    }
}
```

### Step 4: Create the ProtonVPN extension automater

**`src/scraper/protonvpn_extension.rs`:**

```rust
use anyhow::{anyhow, Context, Result};
use fantoccini::{Client, Locator};
use tokio::time::{sleep, Duration};
use tracing::{info, warn};

pub struct ProtonVpnAutomater {
    extension_id: String,
}

impl ProtonVpnAutomater {
    pub fn new(extension_id: String) -> Self {
        Self { extension_id }
    }

    pub async fn disconnect(&self, client: &Client) -> Result<()> {
        info!("Disconnecting from ProtonVPN");

        let extension_url = format!("chrome-extension://{}/popup.html", self.extension_id);
        client
            .goto(&extension_url)
            .await
            .context("Failed to navigate to ProtonVPN extension")?;

        sleep(Duration::from_millis(500)).await;

        // Try to find and click the Disconnect button
        match self.find_and_click_button(client, "disconnect").await {
            Ok(_) => {
                sleep(Duration::from_secs(2)).await;
                info!("Successfully disconnected from ProtonVPN");
                Ok(())
            }
            Err(e) => {
                warn!("Disconnect button not found: {}", e);
                Ok(()) // Continue anyway; the VPN may already be disconnected
            }
        }
    }

    pub async fn connect_to_server(&self, client: &Client, server: &str) -> Result<()> {
        info!("Connecting to ProtonVPN server: {}", server);

        let extension_url = format!("chrome-extension://{}/popup.html", self.extension_id);
        client
            .goto(&extension_url)
            .await
            .context("Failed to navigate to extension")?;

        sleep(Duration::from_millis(500)).await;

        // Open the server list (best effort)
        self.find_and_click_button(client, "server").await.ok();
        sleep(Duration::from_millis(300)).await;

        // Click the specific server (best effort)
        self.find_and_click_button(client, server).await.ok();
        sleep(Duration::from_millis(300)).await;

        // Click the Connect button
        self.find_and_click_button(client, "connect").await?;

        // Poll until connected (30 polls, roughly 15 s plus page loads)
        for attempt in 0..30 {
            sleep(Duration::from_millis(500)).await;

            if self.is_connected(client).await.unwrap_or(false) {
                info!("Successfully connected to ProtonVPN after ~{} ms", (attempt + 1) * 500);
                return Ok(());
            }
        }

        Err(anyhow!("Failed to connect to server: {}", server))
    }

    pub async fn is_connected(&self, client: &Client) -> Result<bool> {
        let extension_url = format!("chrome-extension://{}/popup.html", self.extension_id);
        client
            .goto(&extension_url)
            .await
            .context("Failed to navigate to extension")?;

        sleep(Duration::from_millis(200)).await;

        let page_source = client.source().await?.to_lowercase();

        // Heuristic: look for "connected", but rule out "disconnected",
        // which contains the same substring.
        Ok(page_source.contains("connected") && !page_source.contains("disconnected"))
    }

    pub async fn get_current_ip(&self, client: &Client) -> Result<String> {
        info!("Checking current IP address");

        client
            .goto("https://whatismyipaddress.com/")
            .await
            .context("Failed to navigate to IP check site")?;

        sleep(Duration::from_secs(2)).await;

        let page_source = client.source().await?;

        // Look for a dotted quad shortly after the "IPv4:" label. A regex
        // avoids the out-of-bounds panic that fixed-width slicing can cause.
        let re = regex::Regex::new(r"IPv4:\D{0,50}?(\d{1,3}(?:\.\d{1,3}){3})")?;
        if let Some(caps) = re.captures(&page_source) {
            let ip = caps[1].to_string();
            info!("Current IP: {}", ip);
            return Ok(ip);
        }

        Err(anyhow!("Failed to extract IP from whatismyipaddress.com"))
    }

    /// Clicks the first button whose lower-cased label starts with `text`, or
    /// any element with a matching `data-action` attribute. `starts-with`
    /// (instead of `contains`) keeps "connect" from matching "Disconnect".
    async fn find_and_click_button(&self, client: &Client, text: &str) -> Result<()> {
        let needle = text.to_lowercase();
        let xpath = format!(
            "//button[starts-with(translate(normalize-space(.), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '{needle}')] | \
             //*[@data-action='{needle}']"
        );

        let element = client
            .find(Locator::XPath(&xpath))
            .await
            .with_context(|| format!("Button '{}' not found", text))?;

        element
            .click()
            .await
            .with_context(|| format!("Failed to click button '{}'", text))?;
        Ok(())
    }
}
```

### Step 5: Update `scraper/mod.rs`

```rust
pub mod webdriver;
pub mod protonvpn_extension;
pub mod vpn_session;
```

### Step 6: Extend `main.rs`

```rust
// src/main.rs
mod config;
mod corporate;
mod economic;
mod scraper;
mod util;

use anyhow::Result;
use std::sync::Arc;
use tracing_subscriber::EnvFilter;

use crate::config::Config;
// The `crate::` prefix avoids ambiguity with the `scraper` crate dependency.
use crate::scraper::vpn_session::VpnSessionManager;
use crate::scraper::webdriver::ChromeDriverPool;

#[tokio::main]
async fn main() -> Result<()> {
    // Initialize logging; honors RUST_LOG and defaults to "info"
    tracing_subscriber::fmt()
        .with_env_filter(
            EnvFilter::try_from_default_env().unwrap_or_else(|_| EnvFilter::new("info")),
        )
        .init();

    let config = Config::load().map_err(|err| {
        eprintln!("Failed to load Config .env: {}", err);
        err
    })?;

    // Create the VPN session manager if enabled
    let vpn_session_manager = if config.enable_vpn_rotation {
        let servers = config.get_vpn_server_list();
        if servers.is_empty() {
            anyhow::bail!("VPN rotation enabled but no servers configured");
        }
        Some(Arc::new(VpnSessionManager::new(
            servers,
            config.tasks_per_vpn_session,
        )))
    } else {
        None
    };

    // Initialize the pool with VPN support
    let pool_size = config.max_parallel_tasks;
    let pool = Arc::new(
        ChromeDriverPool::new_with_vpn(
            pool_size,
            config.enable_vpn_rotation,
            if config.enable_vpn_rotation {
                Some(config.protonvpn_extension_id.clone())
            } else {
                None
            },
        )
        .await?,
    );

    // If VPN is enabled: create the first session
    if let Some(vpn_mgr) = &vpn_session_manager {
        vpn_mgr.create_new_session().await?;

        // Optional: verify the IP after connecting
        if let Some(_automater) = pool.get_protonvpn_automater() {
            // ... perform the IP check ...
        }
    }

    // Run updates
    economic::run_full_update(&config, &pool).await?;
    corporate::run_full_update(&config, &pool).await?;

    println!("✓ All updates completed successfully");
    Ok(())
}
```

---

## Configuration

### `.env` example

```env
# Existing configuration
ECONOMIC_START_DATE=2007-02-13
CORPORATE_START_DATE=2010-01-01
ECONOMIC_LOOKAHEAD_MONTHS=3
MAX_PARALLEL_TASKS=3
MAX_TASKS_PER_INSTANCE=0

# VPN configuration (new)
ENABLE_VPN_ROTATION=true
# Quoted because the names contain '#', which some .env parsers treat as a comment start
VPN_SERVERS="US-Free#1,US-Free#2,UK-Free#1,JP-Free#1,NL-Free#1"
TASKS_PER_VPN_SESSION=5
PROTONVPN_EXTENSION_ID=ghmbeldphafepmbegfdlkpapadhbakde
```

### Configuration options

| Variable | Type | Description | Default |
|----------|------|-------------|---------|
| `ENABLE_VPN_ROTATION` | bool | Enable VPN rotation | `false` |
| `VPN_SERVERS` | String | Comma-separated server list | (empty) |
| `TASKS_PER_VPN_SESSION` | usize | Tasks per session before rotation (0 = rotate between phases) | `0` |
| `PROTONVPN_EXTENSION_ID` | String | Chrome extension ID | `ghmbeldphafepmbegfdlkpapadhbakde` |
| `MAX_PARALLEL_TASKS` | usize | Number of parallel ChromeDriver instances | `10` |
| `MAX_TASKS_PER_INSTANCE` | usize | Tasks per instance (0 = unlimited) | `0` |

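### Installing the Chrome extension

Chrome does not download the ProtonVPN extension on its own when it is started through ChromeDriver. Either install the extension into the profile that the automation uses, or load an unpacked copy when the browser starts. Note that `--load-extension` is a Chrome flag: it must be passed to Chrome (for example through the `args` list of the `goog:chromeOptions` capability), not to the `chromedriver` executable.

```rust
// In ChromeInstance::new()
let mut command = Command::new("chromedriver-win64/chromedriver.exe");

// Optional (for testing): load an unpacked extension.
// Pass this flag to Chrome via the session capabilities
// ("goog:chromeOptions" -> "args"), not via `command.arg(...)`:
//   --load-extension=/path/to/protonvpn-extension
```

**Or manually:**
1. Open Chrome → `chrome://extensions/`
2. Find and install ProtonVPN by Proton Technologies AG (from the Chrome Web Store)
3. Copy the extension ID: click "Details" → `ghmbeldphafepmbegfdlkpapadhbakde`

Always verify the ID shown in your own browser. An unpacked extension loaded from a directory gets an ID derived from its path unless its manifest contains a `key`, so it can differ from the store ID.

---
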
## Error Handling & Best Practices

### Error Handling

```rust
use anyhow::{anyhow, Result};
use fantoccini::Client;
use std::time::Duration;

use crate::scraper::protonvpn_extension::ProtonVpnAutomater;

// Example: VPN connection with retry and linear backoff
async fn connect_with_retry(
    automater: &ProtonVpnAutomater,
    client: &Client,
    server: &str,
    max_retries: u32,
) -> Result<()> {
    for attempt in 1..=max_retries {
        match automater.connect_to_server(client, server).await {
            Ok(()) => return Ok(()),
            Err(e) if attempt < max_retries => {
                tracing::warn!("Connection attempt {} failed: {}, retrying...", attempt, e);
                // Wait 2 s, 4 s, 6 s, ... before the next attempt
                tokio::time::sleep(Duration::from_secs(2 * u64::from(attempt))).await;
            }
            Err(e) => {
                return Err(e.context(format!(
                    "Failed to connect after {} attempts",
                    max_retries
                )));
            }
        }
    }
    // Only reached when `max_retries` is 0: report a failure instead of `Ok(())`.
    Err(anyhow!("max_retries must be at least 1"))
}
```

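
The waiting pattern of the retry loop is easy to check on its own. Below is a small sketch, assuming a hypothetical `backoff_delays` helper, that lists the pauses such a loop would take: `base * attempt` after every failed attempt except the last, which returns the error instead of sleeping.

```rust
use std::time::Duration;

/// Delays slept between attempts of a linear-backoff retry loop.
/// With `max_retries` attempts there are at most `max_retries - 1` pauses.
pub fn backoff_delays(max_retries: u32, base: Duration) -> Vec<Duration> {
    (1..max_retries).map(|attempt| base * attempt).collect()
}
```

For four attempts with a 2-second base this gives pauses of 2 s, 4 s, and 6 s, so the loop waits up to 12 s in total on top of the time spent in the attempts themselves.
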
### Best Practices

1. **Timeout management**
   ```rust
   tokio::time::timeout(Duration::from_secs(30), async_operation).await?
   ```

2. **Logging**
   ```rust
   tracing::info!("Session created with IP: {}", ip);
   tracing::warn!("Connection unstable, retrying...");
   tracing::debug!("Extension UI loaded");
   ```

3. **Resource cleanup**
   ```rust
   // Values are dropped at the end of their scope; drop explicitly to release early.
   // Note: dropping a child-process handle does not necessarily terminate the
   // process, so kill ChromeDriver explicitly if it must not outlive the program.
   drop(client);
   drop(process);
   ```

4. **Session tracking**
   - Store session IDs for logging and debugging
   - Log the IP address of each session
   - Track the task counter per session

5. **Extension reliability**
   - Prefer explicit waits over fixed sleep durations where possible
   - Fall back to alternative IP check services
   - Handle extension updates (selectors may change)

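
An explicit wait, as recommended in the last item, polls a condition until a deadline instead of sleeping for a fixed time. The following is a blocking sketch using only the standard library; `poll_until` is a hypothetical helper, and in the async code of this guide the same loop would use `tokio::time::sleep` instead of `std::thread::sleep`.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `check` every `interval` until it returns `true` or `timeout` elapses.
/// Returns whether the condition was met.
pub fn poll_until(
    timeout: Duration,
    interval: Duration,
    mut check: impl FnMut() -> bool,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if check() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}
```

The condition is always checked at least once, and the call returns as soon as it holds, so a fast connection is not penalized by a long fixed sleep.
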
---

## Troubleshooting

### Problem: Extension buttons not found

**Solution:** The extension's UI selectors can change between versions. Update the XPath expressions in `protonvpn_extension.rs`.

```
# Check the Chrome extension ID by opening this URL in Chrome
chrome://extensions/
```

### Problem: VPN does not connect

**Solution:**
1. Make sure the ProtonVPN account is active
2. Increase the timeout values in `connect_to_server()`
3. Enable debug logging: `RUST_LOG=debug cargo run` (this requires the tracing subscriber to read `RUST_LOG`, for example via `EnvFilter`)

### Problem: IP check fails

**Solution:** Use an alternative IP check service:
- `https://icanhazip.com/` (returns only the IP)
- `https://ifconfig.me/`
- `https://checkip.amazonaws.com/`

```rust
pub async fn get_current_ip_alt(&self, client: &Client) -> Result<String> {
    client.goto("https://icanhazip.com/").await?;
    // Chrome wraps plain-text responses in HTML, so read the body text
    // rather than the raw page source.
    let body = client.find(fantoccini::Locator::Css("body")).await?;
    Ok(body.text().await?.trim().to_string())
}
```

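
Falling back across several services amounts to "take the first lookup that succeeds". A minimal sketch, assuming a hypothetical `first_success` helper: it accepts any iterator of results, so when the iterator is built lazily (for example with `map`), later services are only contacted if earlier ones fail.

```rust
/// Returns the first successful result, or all collected errors if every
/// attempt failed. Attempts are consumed lazily, in order.
pub fn first_success<I>(attempts: I) -> Result<String, Vec<String>>
where
    I: IntoIterator<Item = Result<String, String>>,
{
    let mut errors = Vec::new();
    for attempt in attempts {
        match attempt {
            Ok(ip) => return Ok(ip),
            Err(e) => errors.push(e),
        }
    }
    Err(errors)
}
```

Returning every error message on total failure makes it easier to see in the logs whether all services were unreachable or only the page parsing failed.
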
---

## Deployment Checklist

- [ ] `.env` file with the VPN configuration created
- [ ] ProtonVPN extension ID entered correctly
- [ ] `Cargo.toml` dependencies checked
- [ ] VPN session module implemented
- [ ] ProtonVPN automater integrated
- [ ] ChromeDriver pool extended with extension support
- [ ] `main.rs` updated with the session manager
- [ ] Tests run with `ENABLE_VPN_ROTATION=false`
- [ ] Tests run with a small pool (`MAX_PARALLEL_TASKS=1`)
- [ ] Logging enabled: `RUST_LOG=info cargo run`
- [ ] ProtonVPN account tested (login successful)
- [ ] Chrome and ChromeDriver compatibility verified

---

## Summary

This guide provides a complete framework for:

- ✅ **Session management** with VPN rotation
- ✅ **Automated control** of the ProtonVPN extension
- ✅ **IP rotation** between sessions
- ✅ A **ChromeDriver pool** with a configurable size
- ✅ **Flexible configuration** via `.env`
- ✅ **Error handling** and logging

The system is modular, extensible, and cross-platform. Follow the implementation steps in order and test after each step.

**For questions about the ProtonVPN extension:**
- Official extension: https://chrome.google.com/webstore/detail/protonvpn/ghmbeldphafepmbegfdlkpapadhbakde
- ProtonVPN documentation: https://protonvpn.com/support