Writing a simple SOCKS server in Python
This article explains how to write a tiny and basic SOCKS 5 server in Python 3.6. I am assuming that you already have a basic understanding of proxy servers.
Introduction
SOCKS is a generic proxy protocol that relays TCP connections from one point to another through an intermediate connection (the SOCKS server). Originally, SOCKS proxies were mostly used as circuit-level gateways, that is, as a firewall between local and external resources (the internet). Nowadays, however, they are also popular for censorship circumvention and web scraping.
Throughout the article, I will refer to RFC 1928, the specification that describes the SOCKS protocol.
Before reading this article, I recommend cloning a completed version of the implementation so you can see the full picture.
TCP session handling
The SOCKS protocol is implemented on top of the TCP stack, in such a way that the client must establish a separate TCP connection with the SOCKS server for each remote server it wants to exchange data with.
So, first of all, we need to create a regular TCP session handler. Python has a built-in socketserver module, which simplifies the task of writing network servers.
from socketserver import ThreadingMixIn, TCPServer, StreamRequestHandler

class ThreadingTCPServer(ThreadingMixIn, TCPServer):
    pass

class SocksProxy(StreamRequestHandler):
    def handle(self):
        # Our main logic will be here
        pass

if __name__ == '__main__':
    with ThreadingTCPServer(('127.0.0.1', 9011), SocksProxy) as server:
        server.serve_forever()
Here ThreadingTCPServer is a threaded version of the TCP server that listens for incoming connections on the specified address and port. Every time there is a new incoming TCP connection (session), the server spawns a new thread with a SocksProxy instance running inside it. This gives us an easy way to handle concurrent connections.
ThreadingMixIn can be replaced by ForkingMixIn (or the whole class by the built-in ForkingTCPServer), which uses a forking approach, that is, it spawns a new process for each TCP session.
Connection establishment and negotiation
When a client establishes a TCP session to the SOCKS server, it must send a greeting message.
The message consists of 3 fields:
version | nmethods | methods        |
1 byte  | 1 byte   | 1 to 255 bytes |
Here the version field represents the version of the protocol, which equals 5 in our case. The nmethods field contains the number of authentication methods supported by the client. The methods field is the sequence of methods supported by the client, one byte per method. Thus the nmethods field indicates the length of the methods sequence.
According to RFC 1928, the supported values of the methods field are defined as follows:
- '00' NO AUTHENTICATION REQUIRED
- '01' GSSAPI
- '02' USERNAME/PASSWORD
- '03' to '7F' IANA ASSIGNED
- '80' to 'FE' RESERVED FOR PRIVATE METHODS
- 'FF' NO ACCEPTABLE METHODS
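To make the layout concrete, here is a minimal sketch that parses a greeting from raw bytes. The function name parse_greeting is hypothetical and is not part of the server built in this article; it only illustrates how nmethods determines how many method bytes follow.

```python
import struct


def parse_greeting(data: bytes):
    """Split a SOCKS 5 greeting into (version, methods)."""
    # The first two bytes are version and nmethods
    version, nmethods = struct.unpack("!BB", data[:2])
    # nmethods tells us how many one-byte method codes follow
    methods = list(data[2:2 + nmethods])
    if len(methods) != nmethods:
        raise ValueError("truncated greeting")
    return version, methods


# A client offering NO AUTHENTICATION ('00') and USERNAME/PASSWORD ('02')
print(parse_greeting(b"\x05\x02\x00\x02"))  # (5, [0, 2])
```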
When the SOCKS server receives such a message, it should choose an appropriate method and answer back. Let's pretend we only support the USERNAME/PASSWORD method.
The format of the answer looks as follows:
version | method |
1 byte | 1 byte |
Here is how the whole process looks in Python:
import struct

SOCKS_VERSION = 5

class SocksProxy(StreamRequestHandler):
    def handle(self):
        # Greeting header
        # read and unpack 2 bytes from a client
        header = self.connection.recv(2)
        version, nmethods = struct.unpack("!BB", header)

        # socks 5
        assert version == SOCKS_VERSION
        assert nmethods > 0

        # Get available methods
        methods = self.get_available_methods(nmethods)

        # accept only USERNAME/PASSWORD auth
        if 2 not in set(methods):
            # reply with 'FF' (NO ACCEPTABLE METHODS) and close connection
            self.connection.sendall(struct.pack("!BB", SOCKS_VERSION, 255))
            self.server.close_request(self.request)
            return

        # Send server choice
        self.connection.sendall(struct.pack("!BB", SOCKS_VERSION, 2))

    def get_available_methods(self, n):
        methods = []
        for i in range(n):
            # read one method code per byte
            methods.append(ord(self.connection.recv(1)))
        return methods
Here recv(n) reads up to n bytes from the client, and the struct module helps to pack and unpack binary data using the specified format.
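Because recv may return fewer bytes than requested, a stricter server might want a helper that keeps reading until the whole field has arrived. The sketch below is one way to do that; recv_exact is a hypothetical name and is not used by the code in this article.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from sock or raise ConnectionError."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # an empty read means the peer closed the connection
            raise ConnectionError("socket closed before %d bytes arrived" % n)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With such a helper, a call like recv_exact(self.connection, 2) could stand in for self.connection.recv(2) wherever a fixed-size field is expected.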
Once the client has received the server choice, it responds with username and password credentials.
version | ulen | uname | plen | passwd |
1 byte | 1 byte | 1 to 255 bytes | 1 byte | 1 to 255 bytes |
The version field represents the version of the authentication sub-negotiation, which equals 1 in our case. The ulen and plen fields represent the lengths of the text fields, so the server knows how much data it should read from the client.
The server response should look as follows:
version | status |
1 byte | 1 byte |
A status field of 0 indicates successful authorization, while any other value is treated as a failure.
The Python version of the authorization step looks as follows:
    def verify_credentials(self):
        version = ord(self.connection.recv(1))
        assert version == 1

        username_len = ord(self.connection.recv(1))
        username = self.connection.recv(username_len).decode('utf-8')

        password_len = ord(self.connection.recv(1))
        password = self.connection.recv(password_len).decode('utf-8')

        if username == self.username and password == self.password:
            # Success, status = 0
            response = struct.pack("!BB", version, 0)
            self.connection.sendall(response)
            return True

        # Failure, status != 0
        response = struct.pack("!BB", version, 0xFF)
        self.connection.sendall(response)
        self.server.close_request(self.request)
        return False
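To exercise this step without a full client, it can help to build the credentials message by hand. The sketch below shows the client side of the exchange; build_auth_request is a hypothetical helper, and the credentials are placeholders.

```python
import struct


def build_auth_request(username: str, password: str) -> bytes:
    """Encode a username/password message (authentication version 1)."""
    uname = username.encode("utf-8")
    passwd = password.encode("utf-8")
    if len(uname) > 255 or len(passwd) > 255:
        raise ValueError("username and password must fit in 255 bytes each")
    # version | ulen | uname | plen | passwd
    return (struct.pack("!BB", 1, len(uname)) + uname
            + struct.pack("!B", len(passwd)) + passwd)


print(build_auth_request("username", "password"))
# b'\x01\x08username\x08password'
```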
Once the authorization has completed, the client can send the request details.
version | cmd | rsv | atyp | dst.addr | dst.port |
1 byte | 1 byte | 1 byte | 1 byte | 4 to 255 bytes | 2 bytes |
Where:
- VERSION protocol version: '05'
- CMD
  - CONNECT '01'
  - BIND '02'
  - UDP ASSOCIATE '03'
- RSV RESERVED
- ATYP address type of following address
  - IP V4 address: '01'
  - DOMAINNAME: '03'
  - IP V6 address: '04'
- DST.ADDR desired destination address
- DST.PORT desired destination port in network octet order
The cmd field indicates the type of connection. This article is limited to the CONNECT command only, which is used for TCP connections. For more details, please read the SOCKS RFC.
If a client sends a domain name, it should be resolved by the DNS on the server side. Thus a client has no need for a working DNS server when working with SOCKS.
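The three address types differ only in how dst.addr is encoded. As a rough sketch, a decoder for all of them could look like this; parse_address is a hypothetical helper, and the IPv6 branch goes beyond what the server in this article handles.

```python
import socket


def parse_address(atyp: int, payload: bytes) -> str:
    """Decode DST.ADDR from the bytes that follow the ATYP field."""
    if atyp == 1:    # IPv4: 4 raw bytes
        return socket.inet_ntoa(payload[:4])
    if atyp == 3:    # domain name: 1 length byte, then the name itself
        length = payload[0]
        return payload[1:1 + length].decode("ascii")
    if atyp == 4:    # IPv6: 16 raw bytes
        return socket.inet_ntop(socket.AF_INET6, payload[:16])
    raise ValueError("unsupported address type: %d" % atyp)


print(parse_address(1, bytes([127, 0, 0, 1])))  # 127.0.0.1
print(parse_address(3, b"\x0bexample.com"))     # example.com
```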
As soon as the server establishes a connection to the desired destination, it should reply with a status and its bound address.
version | rep | rsv | atyp | bnd.addr | bnd.port |
1 byte | 1 byte | 1 byte | 1 byte | 4 to 255 bytes | 2 bytes |
Where:
- VER protocol version: '05'
- REP Reply field:
  - '00' succeeded
  - '01' general SOCKS server failure
  - '02' connection not allowed by ruleset
  - '03' Network unreachable
  - '04' Host unreachable
  - '05' Connection refused
  - '06' TTL expired
  - '07' Command not supported
  - '08' Address type not supported
  - '09' to 'FF' unassigned
- RSV RESERVED
- ATYP address type of following address
  - IP V4 address: '01'
  - DOMAINNAME: '03'
  - IP V6 address: '04'
- BND.ADDR server bound address
- BND.PORT server bound port in network octet order
Here is how it looks in Python:
        # (continuation of handle(); requires `import socket` at module level)
        # client request
        version, cmd, _, address_type = struct.unpack("!BBBB", self.connection.recv(4))
        assert version == SOCKS_VERSION

        if address_type == 1:  # ipv4
            address = socket.inet_ntoa(self.connection.recv(4))
        elif address_type == 3:  # domain
            # indexing bytes already yields an int in Python 3, so no ord() is needed
            domain_length = self.connection.recv(1)[0]
            address = self.connection.recv(domain_length).decode('ascii')
        # read the port from the same socket object as the other fields
        port = struct.unpack('!H', self.connection.recv(2))[0]

        # server reply
        try:
            if cmd == 1:  # CONNECT
                remote = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                remote.connect((address, port))
                bind_address = remote.getsockname()
            else:
                # only CONNECT is supported
                self.server.close_request(self.request)
                return

            addr = struct.unpack("!I", socket.inet_aton(bind_address[0]))[0]
            port = bind_address[1]
            # the bound address is packed as IPv4, so the reply's atyp is always 1
            reply = struct.pack("!BBBBIH", SOCKS_VERSION, 0, 0, 1, addr, port)
        except Exception as err:
            # return Connection refused error
            reply = self.generate_failed_reply(address_type, 5)

        self.connection.sendall(reply)

        # Establish data exchange
        if reply[1] == 0 and cmd == 1:
            self.exchange_loop(self.connection, remote)

        self.server.close_request(self.request)
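The snippet above calls generate_failed_reply, which is not shown in this article. A minimal sketch of what such a helper might look like is given below as a plain function. It is an assumption and not necessarily how the completed implementation does it: it keeps the two-argument call shape but always reports a zeroed IPv4 bound address, so that the reply stays well formed whatever address type the client sent.

```python
import struct

SOCKS_VERSION = 5


def generate_failed_reply(address_type: int, error_number: int) -> bytes:
    """Build a failure reply carrying the given REP code."""
    # address_type is accepted to match the call shown above, but the reply
    # always uses atyp 1 (IPv4) because bnd.addr is packed as 4 zero bytes.
    # version | rep | rsv | atyp | bnd.addr | bnd.port
    return struct.pack("!BBBBIH", SOCKS_VERSION, error_number, 0, 1, 0, 0)


print(generate_failed_reply(3, 5))
# b'\x05\x05\x00\x01\x00\x00\x00\x00\x00\x00'
```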
If the server's reply indicates success, the client may now start passing data. To work with both the client and the remote host concurrently, we can use the select module, which supports the select and poll Unix interfaces.
Here is how we can read and resend data from both the client and the remote host in one loop:
    def exchange_loop(self, client, remote):
        while True:
            # wait until client or remote is available for read
            r, w, e = select.select([client, remote], [], [])

            if client in r:
                data = client.recv(4096)
                # an empty read means the client closed the connection
                if not data:
                    break
                remote.sendall(data)

            if remote in r:
                data = remote.recv(4096)
                # an empty read means the remote host closed the connection
                if not data:
                    break
                client.sendall(data)
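The same relay idea can also be written with the standard selectors module, which picks the most efficient mechanism available on the platform. The function below is only a sketch of that alternative; relay is a hypothetical name, and it is written as a standalone function so it can be tried outside the handler class.

```python
import selectors
import socket


def relay(client: socket.socket, remote: socket.socket) -> None:
    """Copy bytes in both directions until either side closes."""
    # map each socket to the peer that should receive its data
    peers = {client: remote, remote: client}
    with selectors.DefaultSelector() as sel:
        for sock in peers:
            sel.register(sock, selectors.EVENT_READ)
        while True:
            # block until at least one socket has data to read
            for key, _ in sel.select():
                data = key.fileobj.recv(4096)
                if not data:  # an empty read means the peer closed
                    return
                peers[key.fileobj].sendall(data)
```

A quick way to try it is to connect two socket.socketpair() pairs, run relay in a thread on the inner ends, and send data through the outer ends.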
That's it! We have got a working SOCKS 5 proxy.
Now we can test it using curl:
curl -v --socks5 127.0.0.1:9011 -U username:password https://github.com