# gnh1201/caterpillar
Caterpillar: a simple, parasitic web proxy with a spam filter (formerly php-httpproxy).

![title image](title.jfif)

## How it works
### Basic structure
```
You <-> Proxy client (Python) <-> Parasitized proxy server (Optional, PHP) <-> On the Web
```
For example, you can build a simple web debugging proxy on a shared server.
### Spam filtering strategy
* [K-Anonymity](https://en.wikipedia.org/wiki/K-anonymity) test - Estimates whether the characters were arranged by a human. (uses [Have I Been Pwned](https://haveibeenpwned.com/Passwords))
* Not CAPTCHA - Detects image spam containing characters that look very similar to a [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA). (uses [TrueCaptcha](https://truecaptcha.org/))
* VowelRatio10 - Strings arranged by humans contain a high frequency of [vowels](https://en.wikipedia.org/wiki/Vowel) (aeiou) and [semivowels](https://en.wikipedia.org/wiki/Semivowel) (wy). Applied to strings that are 10 characters long.
* Palindrome5 - Detects [palindromes](https://en.wikipedia.org/wiki/Palindrome) of 5 or more characters.
* KnownWords5 - Detects [well-known words](https://github.com/dwyl/english-words) of 5 or more characters.
* SearchEngine3 - Checks whether the given string yields more than 2 results in a public search engine. (uses [LibreY](https://github.com/Ahwxorg/librey))
* SSL decryption (MITM) - Applied when relaying to federated servers.

These strategies were implemented in response to [the Fediverse spam attacks that started on 15 February 2024](https://github.com/Mastodon-DE/blocklists/blob/main/spam%2F2024-02-15%2F2024-02-15-spam-mute-list.md).
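
A minimal sketch of three of the string checks listed above, for illustration only. The function names (`vowel_ratio`, `has_palindrome`, `hibp_range_parts`) are hypothetical and are not taken from the Caterpillar source, and the thresholds Caterpillar actually applies are not stated here, so none are assumed. The last function only shows how a string is split for the Pwned Passwords range lookup (`https://api.pwnedpasswords.com/range/<prefix>`); it makes no network call.

```python
import hashlib
from typing import Tuple

VOWELS = set("aeiou")
SEMIVOWELS = set("wy")


def vowel_ratio(text: str) -> float:
    """Fraction of alphabetic characters that are vowels or semivowels."""
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return 0.0
    hits = sum(1 for c in letters if c in VOWELS or c in SEMIVOWELS)
    return hits / len(letters)


def has_palindrome(text: str, min_len: int = 5) -> bool:
    """True if any substring of at least min_len characters is a palindrome."""
    s = text.lower()
    n = len(s)
    for start in range(n - min_len + 1):
        for end in range(start + min_len, n + 1):
            chunk = s[start:end]
            if chunk == chunk[::-1]:
                return True
    return False


def hibp_range_parts(text: str) -> Tuple[str, str]:
    """Split the SHA-1 hex digest of text into a 5-character prefix
    (the only part sent to the range API) and the 35-character suffix
    (compared locally against the returned list)."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]
```

For example, `vowel_ratio("aeiouwybcd")` returns `0.7`, and `has_palindrome("xxlevelyy")` is `True` because of `level`. How these scores are combined into a spam decision is up to the filter.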
## (Optional) Before use
If you have a server that ***will be parasitized*** and you want to proxy through it, install the `index.php` file on it.
## How to use
1. Create a `.env` (Linux) or `settings.ini` (Windows) file like this:
```
[settings]
PORT=5555
SERVER_URL=http://example.org
CA_KEY=ca.key
CA_CERT=ca.crt
CERT_KEY=cert.key
CERT_DIR=certs/
OPENSSL_BINPATH=openssl
CLIENT_ENCODING=utf-8
LOCAL_DOMAIN=example.org
PROXY_PASS=http://127.0.0.1:3000
MASTODON_SERVER=
MASTODON_USER_TOKEN=
TRUECAPTCHA_USERID=
TRUECAPTCHA_APIKEY=
LIBREY_APIURL=
```
- (Optional) Install the root CA for SSL decryption ([Download CA Certificate](ca.crt))
```bash
sudo apt-get install -y ca-certificates
sudo cp ca.crt /usr/local/share/ca-certificates/caterpillar-ca.crt
sudo update-ca-certificates
```
2. Run `python3 server.py` and set the HTTP(S) proxy in your web browser (e.g. Firefox).
3. Test a [100MB](http://speed.hetzner.de/100MB.bin)/[SSL](https://speed.hetzner.de/100MB.bin), [1GB](http://speed.hetzner.de/1GB.bin)/[SSL](https://speed.hetzner.de/1GB.bin), or [10GB](http://speed.hetzner.de/10GB.bin)/[SSL](https://speed.hetzner.de/10GB.bin) download and check the speed (e.g. https://speed.hetzner.de/1GB.bin).
4. Enjoy it.
5. (Optional) With [Cloudflare](https://cloudflare.com), you can expect about a 4x speedup and fewer network stalls.
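
Instead of a browser, the proxy can also be tried from a script. The sketch below reads `PORT` from a `[settings]` block like the one in step 1 and builds a `urllib` opener that routes requests through the local proxy. The helper names (`load_port`, `build_proxy_opener`, `fetch_status`) are hypothetical, and Caterpillar itself may load its settings differently; this only assumes the INI layout shown above.

```python
import configparser
import urllib.request

# Trimmed copy of the sample settings from step 1.
SAMPLE_SETTINGS = """
[settings]
PORT=5555
SERVER_URL=http://example.org
CLIENT_ENCODING=utf-8
"""


def load_port(ini_text: str, default: int = 5555) -> int:
    """Return PORT from the [settings] section, or default if it is missing."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getint("settings", "PORT", fallback=default)


def build_proxy_opener(port: int, host: str = "localhost") -> urllib.request.OpenerDirector:
    """Build an opener that sends HTTP and HTTPS requests through the proxy."""
    proxy_url = f"http://{host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_status(opener: urllib.request.OpenerDirector, url: str) -> int:
    """Fetch url through the opener and return the HTTP status code.
    Requires a running proxy; not called automatically."""
    with opener.open(url, timeout=10) as response:
        return response.status
```

With `server.py` running, `fetch_status(build_proxy_opener(load_port(SAMPLE_SETTINGS)), "http://example.org")` sends one request through the proxy.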
## (Optional) For Mastodon users
### In [Caterpillar installed directory]/settings.ini or .env
1. Set the `SERVER_URL` variable to `localhost` (e.g. `SERVER_URL=localhost`)
2. Set the `PROXY_PASS` variable to the Mastodon backend URI (e.g. `http://127.0.0.1:3000`)
3. If you want to use notifications, set the `MASTODON_SERVER` (server domain) and `MASTODON_USER_TOKEN` (access token) variables
### In [Mastodon installed directory]/.env.production
1. Set the `http_proxy` variable to `http://localhost:5555` (e.g. `http_proxy=http://localhost:5555`)
### In NGINX configuration
1. Check the port number Caterpillar listens on (default: 5555)
2. In the NGINX configuration (e.g. `/etc/nginx/conf.d/mastodon.conf`), change the `proxy_pass` directive to `proxy_pass http://localhost:5555`
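
Before pointing NGINX at Caterpillar, it can help to confirm that something is actually listening on the configured port. A minimal sketch, assuming the default port 5555; `is_port_open` is a hypothetical helper, not part of Caterpillar:

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Call `is_port_open("localhost", 5555)` after starting `server.py`. A successful connection only shows that a process is listening on that port, not that it is Caterpillar.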
## References
* https://github.com/anapeksha/python-proxy-server
* https://github.com/inaz2/proxy2
## Contact
* ActivityPub [@gnh1201@catswords.social](https://catswords.social/@gnh1201)
* abuse@catswords.net