Using GrantProxy proxies in scripts
Intro
This page gives examples of how to use GrantProxy to create hard-to-detect web bots in various languages. Because the requests your script makes are proxied through a real web browser, all of their metadata (headers, cookies, TLS connections) reflects that of a real user.
Regardless of language, an HTTP library (or software tool) must have the following to work with a GrantProxy proxy:
- Support for HTTP proxies that require username/password authentication.
- Support for setting a custom certificate authority (CA).
- Alternatively, the ability to disable TLS/SSL validation. This is less ideal because you lose the benefits of HTTPS for secure requests.
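As a rough illustration of how these requirements can map onto one HTTP library, the sketch below builds keyword arguments for Python's requests library. The helper name `build_proxy_settings` is hypothetical and not part of GrantProxy. It relies on the `http://user:password@host:port` proxy URL form that requests documents for proxy authentication, and on the `verify` argument, which accepts either a path to a CA file or `False` to disable validation.

```python
# Python 3 import; on Python 2 use `from urllib import quote` instead.
from urllib.parse import quote


def build_proxy_settings(username, password, host, port, ca_bundle=None):
    """Hypothetical helper: return keyword arguments for requests calls."""
    # Percent-encode the credentials so characters such as "@" or ":"
    # cannot break the structure of the proxy URL.
    proxy_url = "http://{}:{}@{}:{}".format(
        quote(username, safe=""),
        quote(password, safe=""),
        host,
        port,
    )
    return {
        # The same authenticated proxy handles both HTTP and HTTPS targets.
        "proxies": {"http": proxy_url, "https": proxy_url},
        # A CA file path enables validation against that CA. With no CA
        # file, validation is disabled, which is the less safe fallback.
        "verify": ca_bundle if ca_bundle is not None else False,
    }


# Placeholder credentials and CA file name, as used elsewhere on this page.
settings = build_proxy_settings(
    "grantuserxxxxxxxxx",
    "xxxxxxxxxxxxxxxxxx",
    "proxy.grantproxy.com",
    8080,
    ca_bundle="GrantProxyCA.crt",
)
print(settings["proxies"]["https"])
```

The resulting dictionary could then be passed along as `requests.get(url, **settings)`. Passing a CA file is preferable whenever the library supports it, since it keeps certificate validation switched on.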
Python (2 & 3)
The following Python script uses the requests and Beautiful Soup (bs4) libraries.
The script pulls the Reddit homepage using GrantProxy and extracts the CSRF token. This token is required to make other useful requests, such as the action taken later in the script: sending a Reddit DM.
Reddit is used as an example because it has a notoriously trigger-happy anti-bot system. Note that with GrantProxy, no cookies are specified and no special code beyond the proxy setup is required to hide the script from bot detection. Since each request is proxied through a real browser session, it blends in with completely regular traffic, which makes it much harder to detect as a bot.
import os
import requests
from bs4 import BeautifulSoup
from requests.auth import HTTPProxyAuth
# Set the GrantProxy CA bundle so the SSL certificate can be verified.
os.environ['REQUESTS_CA_BUNDLE'] = 'GrantProxyCA.crt'
# Use GrantProxy proxy
proxies = {
    "http": "proxy.grantproxy.com:8080",
    "https": "proxy.grantproxy.com:8080",
}
# GrantProxy proxy credentials
auth = HTTPProxyAuth(
    "grantuserxxxxxxxxx",
    "xxxxxxxxxxxxxxxxxx"
)
# Fetch the Reddit homepage, whose body contains the CSRF token
response = requests.get(
    "https://www.reddit.com/",
    proxies=proxies,
    auth=auth,
)
soup = BeautifulSoup(
    response.text,
    'html.parser',
)
# Parse CSRF token from body
input_elem = soup.find("input", {"name": "uh"})
csrf_token = input_elem.get("value")
# Note: No cookies needed, the browser's cookies
# are automatically used if the proxy is configured
# to use the browser's sessions.
response = requests.post(
    "https://www.reddit.com/api/compose",
    proxies=proxies,
    auth=auth,
    # verify=True (the default) validates SSL certificates against
    # the GrantProxy CA bundle set in REQUESTS_CA_BUNDLE above.
    verify=True,
    data={
        # CSRF token parsed from the homepage above
        "uh": csrf_token,
        "from_sr": "",
        "to": "mandatoryprogrammer",
        "subject": "Automated message",
        "thing_id": "",
        "text": "This is a reddit DM sent via GrantProxy",
        "source": "compose",
        "id": "#compose-message",
        "renderstyle": "html"
    }
)
print(response.text)
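The script above reads the token with `input_elem.get("value")` directly. If the page has no `uh` input, `soup.find` returns `None` and the script stops with an `AttributeError`. This could happen if the proxied browser is not logged in or the page layout has changed, though both causes are assumptions. The sketch below shows one way to guard that step. It swaps Beautiful Soup for the standard library's `html.parser` so that it runs without extra dependencies, and the names `extract_csrf_token` and `_InputValueFinder` are hypothetical.

```python
# Python 3 import; on Python 2 the module is named `HTMLParser`.
from html.parser import HTMLParser


class _InputValueFinder(HTMLParser):
    """Records the value of the first <input> with a given name."""

    def __init__(self, field_name):
        HTMLParser.__init__(self)
        self.field_name = field_name
        self.value = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_match = tag == "input" and attributes.get("name") == self.field_name
        # Keep only the first match so later duplicates are ignored.
        if is_match and self.value is None:
            self.value = attributes.get("value")


def extract_csrf_token(html, field_name="uh"):
    """Return the token from the HTML, or raise ValueError if it is absent."""
    finder = _InputValueFinder(field_name)
    finder.feed(html)
    if not finder.value:
        raise ValueError(
            "No <input name=%r> with a value was found in the page" % field_name
        )
    return finder.value


# Usage with a small inline page in place of a live response body.
sample_page = '<form><input type="hidden" name="uh" value="abc123"/></form>'
print(extract_csrf_token(sample_page))
```

In the script above, `extract_csrf_token(response.text)` would take the place of the two parsing lines, so a missing token fails with a descriptive error before the POST request is made.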